Data Visualization, Instagram Optimization & Python News
Matplotlib 2.0 and Data Visualization
The episode kicks off with a discussion of the release of Matplotlib 2.0, highlighting a focused tutorial that uses Pandas, NumPy, and Seaborn. The tutorial covers essential data visualization techniques, including:
• Creating financial charts with shaded areas for specific events like recessions.
• Customizing labels, legends, and using horizontal lines to delineate data points.
• Adding advanced elements like annotations with arrows and watermarks.
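The techniques above can be sketched in a few lines of Matplotlib. This is a minimal illustration with synthetic data, not the tutorial's actual code; the series values and the shaded date range are invented stand-ins.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic "price" series standing in for real financial data
dates = pd.date_range("2007-01-01", periods=48, freq="MS")
rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(prices.index, prices.values, label="Index")

# Shaded area for a specific event (illustrative recession window)
ax.axvspan(pd.Timestamp("2008-01-01"), pd.Timestamp("2009-06-30"),
           color="grey", alpha=0.3, label="Recession")

# Horizontal line delineating a reference level
ax.axhline(100, color="red", linestyle="--", linewidth=1)

# Annotation with an arrow pointing at the series start
ax.annotate("Start", xy=(dates[0], prices.iloc[0]),
            xytext=(dates[6], prices.iloc[0] + 5),
            arrowprops=dict(arrowstyle="->"))

# Watermark drawn in figure coordinates
fig.text(0.5, 0.5, "DRAFT", fontsize=40, alpha=0.15,
         ha="center", va="center")

ax.legend()
fig.savefig("chart.png")
```

Each element (span, line, annotation, watermark) is an independent artist, so the pieces can be mixed and matched on any Axes.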
High-Performance Python at Instagram
Next, the hosts dive into how Instagram handles its massive scale through aggressive tuning of the Python runtime.
"They successfully raised the shared memory from 140 megs to 225... And they were able to drop the memory usage per server by 8 gigs."
Key takeaways from their engineering approach include:
• Using a multi-process Django deployment with shared memory.
• The bold decision to disable Python garbage collection in specific high-load scenarios, resulting in a 25% RAM savings and a 10% speed increase due to improved CPU cache alignment.
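The copy-on-write idea behind that second point can be sketched in a few lines. This is a generic illustration of the technique, not Instagram's actual code: in a pre-fork server, the master process stops the cyclic collector so that collection passes in child workers don't touch object headers and dirty shared memory pages.

```python
import gc

# Run one full collection before forking so existing garbage is gone.
gc.collect()

# Setting the generation-0 threshold to 0 disables automatic cyclic
# collection. This is harder for third-party code to silently undo
# than gc.disable(), which any library can flip back on.
gc.set_threshold(0)

# Python 3.7+ offers gc.freeze(), which moves all current objects into
# a permanent generation so they are never scanned again; it was added
# with exactly this pre-fork, copy-on-write use case in mind.
if hasattr(gc, "freeze"):
    gc.freeze()
```

After this point the process would fork its workers; reference counting still reclaims acyclic garbage, only the cycle collector is off.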
Type Hints and Static Analysis
The hosts take a practical look at type hints and the mypy type checker. While initially skeptical, they note that:
• Type hints act as enhanced documentation that can be verified by tools.
• They serve as a powerful aid when bridging the gap between legacy Python 2 and Python 3.
• Using typing.Any provides a useful escape hatch for dynamic code.
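A small sketch of those three points (hypothetical function names, Python 3.9+ syntax for `list[float]`): annotations that a checker like mypy can verify, alongside `typing.Any` as the escape hatch for still-dynamic code.

```python
from typing import Any

def mean(values: list[float]) -> float:
    """The annotations double as documentation, and mypy can verify
    that every caller actually passes a list of floats."""
    return sum(values) / len(values)

def legacy_hook(payload: Any) -> Any:
    # typing.Any opts this function out of checking: mypy accepts any
    # operation on `payload`, which is useful while gradually typing
    # dynamic or legacy code (e.g. mid Python 2 -> 3 migration).
    return payload

print(mean([1.0, 2.0, 3.0]))  # → 2.0
```

Running `mypy` over a file like this flags type errors statically while leaving runtime behavior untouched, since annotations are not enforced at runtime.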
The Special Role of Underscores
The hosts break down the idiomatic uses of the underscore (_) in Python, which can confuse developers coming from other languages. Key meanings include:
• REPL usage: Storing the result of the last executed expression.
• 'I don't care' variable: Used in loops or tuple unpacking to satisfy linters.
• Naming conventions: Indicating protected or private members, and dealing with keyword conflicts.
• Formatting: Digit grouping in Python 3.6 (e.g., 1_000_000).
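The non-REPL conventions above fit in one small snippet (the REPL's "last result" `_` only exists in interactive sessions, so it can't be shown in a script):

```python
# "I don't care" variable in tuple unpacking and loops
first, _, last = ("Ada", "unused", "Lovelace")
total = sum(1 for _ in range(5))  # loop variable we never read

# Trailing underscore avoids a clash with a reserved word
class_ = "Mammalia"

# Leading underscore marks a member as internal/protected by convention
_internal_counter = 0

# Digit grouping (Python 3.6+): underscores in literals are ignored
population = 1_000_000
```

None of these are enforced by the interpreter; they are conventions that linters and readers rely on, except the digit grouping, which is real syntax.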
PyPI Governance and Data Rescue
The podcast concludes by covering PEP 541, a proposal for managing stagnant package names on PyPI, and a fascinating story about programmers using Beautiful Soup and Scrapy to archive U.S. government climate data during a presidential transition.
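The archiving story revolved around Beautiful Soup and Scrapy; as a rough illustration of the core idea, harvesting dataset links from a page, here is a sketch using only the standard library's HTML parser, run against an invented stand-in snippet rather than a real government page.

```python
from html.parser import HTMLParser

class LinkHarvester(HTMLParser):
    """Collects every href from <a> tags, the first step in mirroring
    a site's downloadable datasets."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Stand-in HTML; a real archiver would fetch pages over HTTP and queue
# each discovered link for download.
page = """<html><body>
<a href="/datasets/temperature.csv">Temperature data</a>
<a href="/datasets/sea-level.csv">Sea-level data</a>
</body></html>"""

harvester = LinkHarvester()
harvester.feed(page)
print(harvester.links)
```

Beautiful Soup wraps this kind of parsing in a friendlier API, and Scrapy adds the crawling, scheduling, and politeness machinery around it.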