Python Development: Testing, Packaging, and Parallelism
Testing and Code Quality
TDD for Algorithmic Problems
Michael and Brian discuss an article by Adam Johnson on using pytest to solve coding challenges such as those on Project Euler. They highlight how treating each problem as a Test-Driven Development (TDD) exercise helps translate its specification into small, manageable tests.
"It's a cool way to get some experience working with PyTest... these types of problems are pretty good entry-level PyTest-type problems."
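As a rough illustration of the approach, here is a minimal sketch based on the first Project Euler problem (sum the multiples of 3 or 5 below a limit). The file name and the function name sum_of_multiples are assumptions made for this example and do not come from the article; the tests are written first from the problem statement, then the function is filled in until they pass.

```python
# test_euler1.py (hypothetical file name); run with: pytest test_euler1.py


def sum_of_multiples(limit):
    """Sum the natural numbers below `limit` that are multiples of 3 or 5."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def test_example_from_problem_statement():
    # The problem statement gives this case: 3 + 5 + 6 + 9 = 23.
    assert sum_of_multiples(10) == 23


def test_nothing_to_sum_below_three():
    # No positive multiple of 3 or 5 exists below 3.
    assert sum_of_multiples(3) == 0


def test_limit_is_exclusive():
    # 15 is not "below 15", so only 3, 5, 6, 9, 10 and 12 count.
    assert sum_of_multiples(15) == 45
```

Each test pins down one sentence of the specification, which is what makes these puzzles a gentle way into pytest.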
The Dangers of Import Star
Reflecting on a piece by Mike Croucher, the hosts voice their strong dislike of wildcard imports (from module import *). They emphasize the importance of:
• Explicit Namespaces: Making code readable by showing exactly where each function originates.
• Naming Conflicts: Avoiding confusion when multiple libraries (like NumPy, SciPy, or the standard math module) contain identically named functions.
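The two points above can be sketched with the standard library alone. This example uses math and cmath as stand-ins for the libraries named above (an assumption made so the sketch runs without third-party packages); both define sqrt. The wildcard imports are run inside a scratch namespace via exec so that the shadowing can be inspected safely.

```python
import cmath
import math

# Explicit namespaces: each call site says which sqrt is meant.
real_root = math.sqrt(4)        # float result
complex_root = cmath.sqrt(-4)   # complex result

# Simulate two wildcard imports in a scratch namespace to see which name wins.
scratch = {}
exec("from math import *\nfrom cmath import *", scratch)

# The later import silently replaced math.sqrt with cmath.sqrt.
shadowed = scratch["sqrt"] is cmath.sqrt
print(real_root, complex_root, shadowed)
```

With explicit imports the reader never has to guess which sqrt is in play; with wildcard imports the answer depends on import order.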
Developer Tools and Tooling
Managing Dependency Hell with DepHell
DepHell is introduced as a meta-tool for Python packaging.
• It converts between dependency formats such as setup.py and pyproject.toml, and it can also manage virtual environments.
• It is built on asyncio, which the hosts cite as a reason for its speed.
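To show why an asyncio design can help a tool that spends most of its time waiting on the network, here is a minimal sketch. It is not DepHell's code: fetch_metadata is a hypothetical stand-in that sleeps instead of contacting a package index, and the package names are only sample inputs.

```python
import asyncio


async def fetch_metadata(package, delay=0.1):
    """Hypothetical stand-in for a network request; sleeps instead of calling an index."""
    await asyncio.sleep(delay)
    return package, "metadata for " + package


async def fetch_all(packages):
    # gather() runs the coroutines concurrently on a single event loop,
    # so the waits overlap instead of adding up.
    pairs = await asyncio.gather(*(fetch_metadata(p) for p in packages))
    return dict(pairs)


packages = ["requests", "attrs", "click"]
results = asyncio.run(fetch_all(packages))
print(results)
```

Because the three simulated requests wait at the same time, the whole batch takes roughly one delay rather than three.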
Scaling with Dask
They cover Dask, a library for parallel computing in Python. It is described as a powerful way to scale NumPy and pandas workloads with few changes to existing code.
• It enables processing data that exceeds local RAM.
• It can start a small local cluster on one machine or run against a distributed cluster.
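A minimal sketch of the NumPy-style interface, assuming dask and numpy are installed. The array shape and chunk size are arbitrary values chosen for illustration; the point is that the array is split into chunks and the work is only carried out when compute() is called.

```python
import dask.array as da

# A 4000 x 4000 array of ones, split into sixteen 1000 x 1000 chunks.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Building the expression is lazy; nothing has been computed yet.
lazy_total = (x + x.T).sum()

# compute() executes the task graph chunk by chunk.
total = lazy_total.compute()
print(total)
```

Because only a few chunks need to be in memory at once, the same pattern extends to arrays larger than RAM. For cluster use, dask.distributed provides a Client class; calling Client() with no arguments starts a local cluster.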
Visualization and Advanced Features
Animations in Matplotlib
Building animations in Matplotlib is presented as a practical way to visualize data transformations or simulations, such as raindrop patterns.
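A minimal sketch of a raindrop-style animation using matplotlib.animation.FuncAnimation, assuming matplotlib and numpy are installed. The drop count, growth rate, and colors are assumptions chosen for illustration, not values from the episode.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation

N_DROPS = 30  # hypothetical number of rings on screen at once
rng = np.random.default_rng(seed=0)

fig, ax = plt.subplots(figsize=(5, 5))
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

# Each drop is an unfilled circle with a position and a marker size.
positions = rng.uniform(0, 1, size=(N_DROPS, 2))
sizes = rng.uniform(5, 200, size=N_DROPS)
scat = ax.scatter(positions[:, 0], positions[:, 1], s=sizes,
                  facecolors="none", edgecolors="tab:blue")


def update(frame):
    """Grow every ring, then restart one drop at a new random spot."""
    sizes[:] = sizes + 10
    i = frame % N_DROPS
    positions[i] = rng.uniform(0, 1, size=2)
    sizes[i] = 5
    scat.set_offsets(positions)
    scat.set_sizes(sizes)
    return (scat,)


# FuncAnimation calls update(frame) once per frame; keep a reference to it.
anim = FuncAnimation(fig, update, frames=200, interval=30, blit=True)
# Call plt.show() to view the animation in an interactive session.
```

The pattern is the same for other simulations: keep the state in arrays, mutate it in update, and push the new values into the existing artist rather than redrawing the figure.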
PEP 554: Multiple Sub-interpreters
They discuss PEP 554, which proposes exposing CPython's existing sub-interpreters to Python code through a standard library module.
• Increased Parallelism: It could let code work around Global Interpreter Lock (GIL) limitations by running isolated interpreters.
• Isolation: It offers a way to run untrusted code or manage dependency version conflicts without firing up separate heavy processes.