Mastering Python Programming: Insights and Best Practices

This section contains my writings on Python programming, sharing insights from years of professional software development. Topics include best practices, advanced techniques, performance optimization, and practical applications across various domains.

Stargazers CLI Update: Nested Commands, Account Trends, and Plotting!

May 20, 2025

Tags:stargazers, github, cli, open source, python

Announcing the latest stargazers CLI update: all commands now under 'stargazers', plus new account-trend analysis and plotting features.

Read more →

Mutation Testing with mutmut for Pygeohash

May 19, 2025

Tags:python

Mutation testing checks if your tests catch real bugs by making small code changes. Learn how it works and why it matters.

Read more →

Digging into Code Churn with GitPandas

May 16, 2025

Tags:code churn, data-analysis, git, gitpandas, python

Quantify code churn in your Git repositories with the gitpandas Python library. Analyze file change rates and spot areas of high activity or instability.

Read more →

Refactoring Library Interfaces

May 15, 2025

Tags:api-design, best-practices, library-development, programming, python, refactoring

Discover techniques for improving library interfaces through thoughtful refactoring, using real-world examples while maintaining backward compatibility.

Read more →

Context-Aware Library Design: Build for Your Users

May 14, 2025

Tags:api-design, best-practices, library-development, programming, python, user-experience

Learn to design Python libraries that adapt to various user needs and experience levels, ensuring simplicity and effectiveness for all users.

Read more →

Who Holds the Keys? Calculating Bus Factor with GitPandas

May 13, 2025

Tags:bus factor, data-analysis, git, gitpandas, python, risk management

Explore the concept of bus factor and how gitpandas can help you quantify knowledge distribution and risk in your software project.

Read more →

The ECF Rating System: The British Approach to Chess Ratings

May 5, 2025

Tags:algorithms, elote, open-source, python, rating-systems

Explore the English Chess Federation (ECF) rating system, its unique performance-based approach, and its distinct history from Elo.

Read more →

Writing Tools MCP: A Toolkit for Better Writing

May 5, 2025

Tags:mcp, writing, tools, open-source, python

Boost your writing workflow with custom tools for Markdown and content analysis. Discover scripts and automations for faster, better publishing.

Read more →

Glicko-2: Adding Volatility to the Rating Equation

April 30, 2025

Tags:algorithms, elote, glicko, open-source, python, rating-systems

Explore Glicko-2, which enhances Glicko-1 by incorporating player volatility for more accurate ratings.

Read more →

The Glicko Rating System: When Confidence Matters

April 29, 2025

Tags:algorithms, elote, glicko, open-source, python, rating-systems

Explore the Glicko rating system, an Elo enhancement adding Rating Deviation (RD) to measure confidence. Learn its mechanics and implementation with Elote.

Read more →

Handling Deprecation: Gracefully Retiring Features

April 28, 2025

Tags:api-design, best-practices, development, library, maintenance, python

Learn to deprecate Python library features gracefully with warnings, clear communication, and migration paths that minimize disruption to your users.

Read more →

McCabe Complexity: The Python Metric You Should Care About

April 24, 2025

Tags:best-practices, code-quality, development, python, tools

Learn about McCabe Complexity, a key metric for code complexity. Understand it, measure it with tools like Ruff, and manage complexity in Python projects.

Read more →

Python Logging Best Practices for Library Developers

April 20, 2025

Tags:best-practices, development, library, python

A comprehensive guide to implementing logging in Python libraries, from basic setup to advanced patterns and common pitfalls to avoid.

Read more →

Introducing 'stargazers': A Tool to Understand Your GitHub Audience

April 16, 2025

Tags:cli, community, github, python

Announcing 'stargazers': A CLI tool to fetch, analyze & summarize GitHub stargazers/forkers for any public repo. Inspired by Cockroach Labs' analysis.

Read more →

HashingEncoder: Tackling Extreme Cardinality with the Hashing Trick

April 15, 2025

Tags:category-encoders, feature-engineering, machine-learning, python

Explore HashingEncoder: learn how it handles extreme cardinality, its mechanism, ideal use cases, and implementation with the category_encoders library.

Read more →

BinaryEncoder: The Space-Efficient Alternative to One-Hot Encoding

April 13, 2025

Tags:category-encoders, feature-engineering, machine-learning, python

Explore the BinaryEncoder from category_encoders: a space-efficient alternative to one-hot encoding. Learn how it works, when to use it, and how to implement it.

Read more →

OrdinalEncoder: When Order Matters in Categorical Data

April 10, 2025

Tags:category-encoders, feature-engineering, machine-learning, python

Explore OrdinalEncoder for categorical data where order matters. Learn how it works, its benefits, and implementation using the category_encoders library.

Read more →

Makefiles: The Unsung Hero of Python Development

April 8, 2025

Tags:automation, development, python, tools

Explore why Makefiles, though old-school, are invaluable for Python projects. Learn how they provide consistent, simple shortcuts for common dev tasks.

Read more →

Modern Python Package Publishing: PyGeoHash's New CI/CD Pipeline

April 6, 2025

Tags:development, github-actions, pygeohash, python

PyGeoHash uses GitHub Actions, cibuildwheel, and PyPI's Trusted Publisher for automated, secure multi-platform Python package builds and publishing.

Read more →

PyGeoHash Gets Type Hints: A Journey into Modern Python

April 3, 2025

Tags:development, open-source, pygeohash, python, type-hints

Enhance PyGeoHash with comprehensive type hints and a new types module, improving developer experience and code quality for modern Python.

Read more →

Optimal Bankroll Management with Keeks: The Kelly Criterion

April 1, 2025

Tags:betting, finance, keeks, kelly-criterion, open-source, python

Dive into the Kelly Criterion, the foundation of optimal betting and bankroll management. Learn its theory, pros, cons, and implementation with Keeks.

Read more →

Documenting Your Library's API: Best Practices

March 30, 2025

Tags:autodoc, best-practices, development, docstrings, documentation, library, python, sphinx

Build a clear, comprehensive API reference with Sphinx & autodoc. Learn best practices for structure, content, and cross-referencing in your Python library docs.

Read more →

OneHotEncoder: The Workhorse of Categorical Encoding

March 27, 2025

Tags:category-encoders, feature-engineering, machine-learning, python

A comprehensive guide to OneHotEncoder in category_encoders, exploring its core functionality, advantages, and practical limitations in machine learning.

Read more →

Elo Rating System: The Grandfather of Competitive Rankings

March 25, 2025

Tags:algorithms, elo, elote, open-source, python, rating-systems

Deep dive into the Elo rating system: history, the math behind it (K-factor, expected score), and practical implementation with the Elote Python library.

Read more →

Automating Docs Deployment with GitHub Actions and Pages

March 23, 2025

Tags:automation, ci-cd, development, documentation, github-actions, library, python, sphinx

Keep docs current automatically! Learn to set up GitHub Actions to build Sphinx docs & deploy to GitHub Pages, ensuring sync with your code changes.

Read more →

Crafting Code Examples: From Snippets to Real-World Scenarios

March 22, 2025

Tags:development, documentation, library, python, sphinx

Master the art of crafting clear, runnable code examples for documentation using doctest, tutorials, and scripts to enhance user understanding.

Read more →

Keeks 0.1.0 Release: Optimal Bankroll Management Made Simple

March 18, 2025

Tags:betting, finance, keeks, kelly-criterion, open-source, python

Keeks 0.1.0 is here! Python library for optimal bankroll management & betting strategies (Kelly Criterion). Includes simulation and visualization tools.

Read more →

Getting Started with Sphinx for Python Project Documentation

March 15, 2025

Tags:development, docstrings, documentation, library, python, restructuredtext, sphinx

Learn how to use Sphinx to generate professional, cross-referenced documentation for your Python projects, covering installation, configuration, and autodoc.

Read more →

Elote 1.0.0 Release: Rating Systems Made Simple

March 13, 2025

Tags:elo, elote, open-source, python, rating-systems

Announcing Elote 1.0.0, a Python library that simplifies the implementation of rating systems like Elo, Glicko, and TrueSkill for competitive ranking.

Read more →

PyGeoHash v3.0.0: Faster, Freer, and More Pythonic

March 11, 2025

Tags:geospatial, licensing, open-source, performance, pygeohash, python

Deep dive into PyGeoHash v3.0.0: a major release with a pure CPython rewrite, MIT relicensing, and dramatic performance gains. Faster & freer geohashing!

Read more →

Using Cursor for Open Source Library Maintenance

March 9, 2025

Tags:ai, cursor, elote, keeks, open-source, pygeohash, python

How AI tools like Cursor simplify open source library maintenance by assisting with code understanding, environment setup, housekeeping, and project standards.

Read more →

Effective Docstrings: Google vs. NumPy vs. reStructuredText Styles

March 6, 2025

Tags:autodoc, development, docstrings, documentation, library, python, restructuredtext, sphinx

Learn to write clear docstrings Sphinx understands. Compares Google, NumPy, and reStructuredText formats for effective Python library documentation.

Read more →

PyGeoHash 2.1.0: Modernizing a Geospatial Python Library

March 4, 2025

Tags:cursor, geohash, geospatial, open-source, python

A look at the latest updates to PyGeoHash, a lightweight Python library for working with geohashes, and how modern AI tools helped revitalize the project.

Read more →

Geohash: When Clever Isn't Always Smart

March 2, 2025

Tags:algorithms, data-engineering, geohash, python

A deep dive into the limitations of the Geohash algorithm, including boundary issues, non-uniform cell sizes, and common implementation mistakes to avoid.

Read more →

Where Did All the RAM Go? Memory Profiling with Memray

March 1, 2025

Tags:development, library, optimization, performance, profiling, python, testing

High CPU isn't the only performance issue. Learn how Memray helps track memory leaks and excessive allocation in your Python library to optimize usage.

Read more →

Finding the Slowdown: Profiling Python Code with Pyinstrument

February 25, 2025

Tags:development, library, optimization, performance, profiling, python, testing

Your benchmark says a function is slow, but why? Profilers like Pyinstrument help you pinpoint exactly where your Python code is spending its time.

Read more →

How Fast Is It? Benchmarking Your Code with Pytest-Benchmark

February 22, 2025

Tags:development, library, performance, pytest, python, testing

Performance matters! Easily measure Python library speed with pytest-benchmark. Track performance, find regressions, and optimize effectively with benchmarks.

Read more →

Silos to Shared Libraries: Guide to Inner Source Adoption

February 18, 2025

Tags:best-practices, development-practices, inner-source, library-development, python, security

Guide for transitioning from team-specific code to shared libraries, covering governance models, security, and standardized development practices.

Read more →

Mastering Mocking in Python with pytest-mock

February 16, 2025

Tags:pytest, python, testing

A practical guide to mocking in Python testing, from basic concepts to advanced techniques with pytest-mock and other helpful libraries.

Read more →

Building Your Internal Library Developer Community

February 15, 2025

Tags:best-practices, corporate-culture, development-practices, inner-source, library-development, python

Explore strategies for building a thriving community of library developers in your organization through effective incentives, recognition, and collaboration.

Read more →

Will It Blend? Testing Across Environments with Tox

February 13, 2025

Tags:ci-cd, development, library, pytest, python, testing

Works on your machine? Great, but what about Python 3.9 or 3.12? Tox ensures library compatibility across different Python versions and dependency sets easily.

Read more →

Inner Source: Bringing Open Source Culture Inside Your Organization

February 11, 2025

Tags:best-practices, corporate-culture, development-practices, inner-source, library-development, open-source, python

Learn how to harness the power of open source development practices within your organization through inner source principles and practices.

Read more →

Are Your Tests Enough? Measuring Coverage with Coverage.py

February 9, 2025

Tags:code-quality, development, library, pytest, python, testing

Writing tests is step one. Step two is knowing what parts of your library code those tests actually exercise. Enter Coverage.py.

Read more →

Designing for Developer Joy: Python Library Ergonomics

February 6, 2025

Tags:api-design, best-practices, developer-experience, library-development, programming, python

What makes Python libraries joyful? Explore API ergonomics, from naming conventions & sensible defaults to helpful error messages that guide users to solutions.

Read more →

Why Your Library Needs Pytest (And How to Get Started)

February 4, 2025

Tags:code-quality, development, library, pytest, python, testing

Testing is vital for Python libraries. Explore why it's crucial and how Pytest simplifies writing powerful tests with less boilerplate and better assertions.

Read more →

The Art of API Design: Making the Right Things Easy

February 3, 2025

Tags:api-design, best-practices, developer-experience, library-development, programming, python

Learn principles of intuitive Python API design that make common operations simple and guide users toward best practices, while preserving advanced features.

Read more →

Secure Coding Practices for Python Library Developers

February 2, 2025

Tags:best-practices, development, input validation, library, python, secure-coding, security

Beyond tools, what principles guide secure Python library development? Explore essential practices: input validation, least privilege, error handling, and more.

Read more →

Taming the Python Chaos: Linting & Formatting with Ruff

January 30, 2025

Tags:ci-cd, code-quality, development, github-actions, python, ruff

What linting and formatting actually are, why they matter (a lot!), and how the speedy tool Ruff can save your Python project (and your sanity).

Read more →

Handling Sensitive Data Securely Within Your Python Library

January 29, 2025

Tags:development, library, python, secure-coding, security

Handle sensitive data in Python libraries securely. Learn best practices for managing API keys, passwords, PII, and other secrets without exposing them in code.

Read more →

Decoding Library Updates: Understanding Semantic Versioning (SemVer)

January 28, 2025

Tags:dependencies, development, library, packaging, pip, python

Guide to Semantic Versioning (SemVer) for Python library authors. Understand MAJOR.MINOR.PATCH rules to communicate changes and manage dependencies.

Read more →

Dependency Security: Managing Vulnerabilities with pip-audit

January 27, 2025

Tags:dependencies, development, library, python, security, vulnerabilities

Your library relies on packages. Learn how to use pip-audit to scan your dependencies for known security vulnerabilities and keep your users safe.

Read more →

The Center of Your Python Project: Understanding pyproject.toml

January 26, 2025

Tags:development, library, packaging, pytest, python, ruff

From setup.py chaos to pyproject.toml clarity. Learn why it exists, how it standardizes Python packaging/tool config via PEPs (518, 517, 621), and its anatomy.

Read more →

Bandit Security Rules: Finding Common Python Security Issues

January 25, 2025

Tags:development, library, python, ruff, secure-coding, security, vulnerabilities

Learn how to use Ruff's Bandit integration to automatically scan your Python code for common security pitfalls through static analysis.

Read more →

Don't Forget the Fine Print: Licensing Your Python Library

January 24, 2025

Tags:compliance, dependencies, development, library, licensing, open-source, python

Choosing an open-source license is crucial. Understand common options (MIT, Apache, GPL), why compatibility matters, and how to comply with obligations.

Read more →

Building and Engaging a Community Around Your Open Source Library

January 22, 2025

Tags:community, development, github, library, maintenance, open-source, python

Attract users, encourage contributions, and build a welcoming environment for your open source library. Learn practical steps for community engagement.

Read more →

The Library Author's Dilemma: Managing Python Dependencies

January 21, 2025

Tags:best-practices, dependencies, development, library, packaging, pip, python

Python library dependency management balances features vs user pain. Explore best practices for choosing, versioning (~= compatible release), and maintenance.

Read more →

Avoiding Common Pitfalls: Injection Flaws in Python Libraries

January 18, 2025

Tags:development, input validation, library, python, secure-coding, security

Injection flaws aren't just for web apps. See how SQL & command injection affect Python libraries via input handling, and learn crucial prevention techniques.

Read more →

The Art of Saying No: Defining Your Python Library's Scope

January 17, 2025

Tags:design, development, library, python

Why keeping your Python library focused is harder than it looks, and how saying 'no' can be your most powerful design tool.

Read more →

SDLC in the Age of AI

January 12, 2025

Tags:ai, data-science, programming, software-development

Exploring how AI reshapes software development: natural language programming, documentation as prompt engineering, and the role of `.cursorrules` files.

Read more →

From Weekend Hack to Core Tool: The category_encoders Journey

December 27, 2022

Tags:data-science, machine-learning, open-source, python

Explore category_encoders' journey from a weekend Python experiment to a widely used data science library, now part of scikit-learn-contrib.

Read more →

Category Encoders v1.2.8 Release

June 4, 2018

Tags:category-encoders, data-science, open-source, python

Announcing Category Encoders v1.2.8! This release includes important bugfixes and introduces new features like optional category names in output columns.

Read more →

Category Encoders published in JOSS

January 26, 2018

Tags:category-encoders, data-science, open-source, python

Announcing the category_encoders Python package publication in JOSS. Discover the peer-review process and citation details.

Read more →

Revisiting Python support in Apache Flink

January 11, 2018

Tags:big-data, data-science, python

Early 2018 look at Apache Flink's Python support. Checking compatibility, batch vs streaming capabilities, & future developments like Streaming API support.

Read more →

On taking things too seriously: holiday edition

December 9, 2017

Tags:data-science, python, rating-systems, sports

Building a CFB bowl game prediction system with Python packages elote, keeks, & keeks-elote. Combines rating, betting strategies, and backtesting for analysis.

Read more →

Elote: a python package of rating systems

December 6, 2017

Tags:data-science, machine-learning, python, rating-systems

Introducing Elote, a Python package implementing various rating systems like Elo and Glicko. Learn its core concepts and see how to use it for ranking.

Read more →

Ripyr: sampled metrics on datasets using python's asyncio

November 28, 2017

Tags:data-science, python, type-hints

An introduction to ripyr, a Python library for streaming through large datasets and parsing basic metrics using asyncio and type hinting.

Read more →

Category Encoders v1.2.5 Release

November 22, 2017

Tags:category-encoders, data-science, machine-learning, open-source, python

Category Encoders v1.2.5 brings community updates including stable binary/BaseN encoding, new leave-one-out encoding, and pandas compatibility fixes.

Read more →

git-pandas Caching: Faster Analysis

July 25, 2017

Tags:data-analysis, git, pandas, performance, python

Boost git repository analysis speed! Learn how git-pandas now uses caching to dramatically improve performance for repeated queries on large codebases.

Read more →

Category Encoders v1.2.4 Release

July 12, 2017

Tags:category-encoders, data-science, machine-learning, python

Category Encoders v1.2.4 is out! Includes pandas categorical type support, improved missing value handling, better error messages, BaseN fixes, and docs.

Read more →

BaseN Encoding Grid Search in Category Encoders

December 18, 2016

Tags:category-encoders, data-science, machine-learning, python

Explore category_encoders' BaseN encoder for representing categorical data. Learn how to use scikit-learn's grid search to find the optimal encoding base.

Read more →

Category Encoders accepted into scikit-learn-contrib

November 20, 2016

Tags:category-encoders, data-science, open-source, python

Category Encoders, a Python library for encoding categorical variables, has been accepted into the scikit-learn-contrib ecosystem. A project milestone!

Read more →

Category Encoders now on conda-forge

September 17, 2016

Tags:category-encoders, data-science, open-source, python

The category_encoders Python package is now available on conda-forge, making installation easier for Conda users. Learn about the package and feedstock.

Read more →

Introducing unified glob-syntax in git-pandas

June 15, 2016

Tags:data-science, git-pandas, python

Explore the unified glob syntax (`include_globs`, `ignore_globs`) for git-pandas v2.0, offering flexible file pattern specification and usability.

Read more →

Parallelizing cumulative blame in git-pandas with joblib

June 12, 2016

Tags:data-science, git-pandas, performance, python

Boost git-pandas cumulative blame analysis performance with joblib. Parallel processing via multithreading speeds up this costly operation.

Read more →

When do I work on what?

April 30, 2016

Tags:data-science, dataviz, git-pandas, open-source, python

Use git-pandas to analyze and visualize work patterns across open source vs. closed source projects. Compare commit times with punchcard plots. Learn the code.

Read more →

Estimating the time spent on a project with git-pandas

April 16, 2016

Tags:git, git-pandas, github, open-source, python

Learn how to estimate project development time using commit history with git-pandas. Compares to git_time_extractor, git-hours, and glass.

Read more →

Automating documentation workflow with sphinx and github pages

February 29, 2016

Tags:documentation, github, open-source, python, sphinx

Explore a comprehensive guide on automating the deployment of Sphinx documentation to GitHub Pages, streamlining your workflow with efficient practices.

Read more →

Pypi-publisher: a simple cli for publishing python libraries

February 24, 2016

Tags:cli, deployment, open-source, packaging, python

Introducing pypi-publisher (ppp): a CLI tool simplifying Python library publishing. Handles .pypirc updates, linting, git tags, and PyPI sdist uploads.

Read more →

Using survival analysis and git-pandas to estimate code quality

February 21, 2016

Tags:data-analysis, data-science, dataviz, git, git-pandas, github, open-source, python

Apply survival analysis with git-pandas to measure code quality in Git repositories by analyzing code longevity and contributor patterns over time.

Read more →

Git-pandas v1.0.0, or how to check for a stable release

February 2, 2016

Tags:data-science, dataviz, git, git-pandas, github, open-source, python

Explore git-pandas v1.0.0, focusing on interface consistency, parameter naming, and API simplification for improved data analysis workflows.

Read more →

Github.com cumulative blame in 5 lines of python

January 31, 2016

Tags:data-science, dataviz, git, git-pandas, github, pandas, python

Visualize your GitHub repository growth over time using the git-pandas Python library and GitHubProfile class—all in just a few lines of code.

Read more →

Data-driven engineering team management with gitnoc and git-pandas

January 19, 2016

Tags:data visualization, dataviz, git, git-pandas, github, open-source, python

Leverage git-pandas and gitnoc for data-driven engineering management. Visualize git data for insights on bus factor, risk, project growth, and team oversight.

Read more →

Create organization-wide punchcards with git-pandas

January 17, 2016

Tags:data-analysis, git, git-pandas, github, open-source, python

Learn how git-pandas enables creating organization-wide punchcard visualizations, aggregating commit activity across multiple repositories for a unified view.

Read more →

How to Write Comprehensions and Alienate People

January 8, 2016

Tags:best-practices, data-science, programming, python

A tongue-in-cheek guide to writing Python comprehensions that will make your colleagues question their life choices and your sanity.

Read more →

Gitpandas v0.0.6: python 2.7, fileowners, file-wise blame and examples

January 7, 2016

Tags:data-analysis, data-science, git, git-pandas, github, open-source, projects, python

Overview of git-pandas v0.0.6 release, highlighting new features like Python 2.7 support, file-wise blame, file owner determination, and other improvements.

Read more →

Git-Pandas v0.0.5: coverage.py, risk, and more

December 25, 2015

Tags:data-analysis, git, git-pandas, github, open-source, pandas, projects, python

Git-pandas v0.0.5 is out! Adds coverage.py support, file change rate metrics for risk analysis, API updates, and time-based filtering for commits.

Read more →

Visualize all of your git repositories with gitnoc and git-pandas

December 13, 2015

Tags:data visualization, dataviz, git, git-pandas, pandas, python

Visualize git repositories at scale using GitNOC & git-pandas. Create profiles to analyze cumulative blame & file change rates across multiple projects.

Read more →

Analyzing GitPython and Pandas With GitPandas

November 19, 2015

Tags:data-analysis, git, git-pandas, github, open-source, pandas, projects, python

Analyze GitPython and Pandas repositories with git-pandas! Explore LOC, contributors, and bus factor using this Python library for git analysis.

Read more →

Create a pip-installable python package in 2 minutes

November 12, 2015

Tags:open-source, packaging, pip, python

Rapidly create and publish Python packages. Learn the steps from cookiecutter-pipproject template setup to pushing your first release to PyPI in minutes.

Read more →

Blame the world with git-pandas

November 10, 2015

Tags:dataviz, git, git-pandas, github, pandas, python

Introducing git-pandas, a Python library offering a pandas interface for git analysis. Easily aggregate git blame across projects.

Read more →

Miscellaneous MATLAB

June 10, 2012

Tags:engineering, programming

Useful MATLAB tips learned the hard way: logical indexing, read-only parameters, pre-allocation, OOP, user input, and publishing for cleaner, faster code.

Read more →