Open Source Projects
For many years I’ve been building and maintaining open source software that helps developers work more efficiently. My projects span machine learning tools, data analysis libraries, development tooling, and geospatial utilities. Working mostly in Python, I’ve contributed to the scientific computing ecosystem through projects like Category Encoders (now part of scikit-learn-contrib) and built developer tools like Git Pandas for repository analysis.
Here are some of the open source projects I maintain:
Keeks
A Python library for bankroll management and bet sizing, including an implementation of the Kelly Criterion.
Related Posts
- Keeks 0.1.0 Release: Optimal Bankroll Management Made Simple - First major release announcement and features
- Kelly Criterion Implementation - Deep dive into the Kelly Criterion and its implementation in Keeks
- Taking Things Seriously: Holiday Edition - Early development and integration with other tools
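As a rough illustration of the idea behind the Kelly Criterion, here is a minimal sketch in plain Python. It does not use Keeks and is not its API; the function name `kelly_fraction` and its parameters are hypothetical, and it assumes a simple binary bet with a known win probability and fixed net odds.

```python
def kelly_fraction(win_probability: float, net_odds: float) -> float:
    """Fraction of the bankroll to stake on a simple binary bet.

    Hypothetical helper for illustration only (not part of Keeks).
    win_probability: chance of winning, between 0 and 1.
    net_odds: profit per unit staked on a win (1.0 means even money).
    """
    if not 0.0 <= win_probability <= 1.0:
        raise ValueError("win_probability must be between 0 and 1")
    if net_odds <= 0.0:
        raise ValueError("net_odds must be positive")

    # Classic Kelly formula: f* = p - (1 - p) / b
    fraction = win_probability - (1.0 - win_probability) / net_odds

    # A negative result means the bet has no edge, so stake nothing.
    return max(fraction, 0.0)


if __name__ == "__main__":
    # Even-money bet with a 60% chance of winning.
    print(kelly_fraction(0.6, 1.0))
```

Clamping at zero is a design choice in this sketch: it treats a bet with no edge as one to skip rather than one to take the other side of.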
Cookiecutter PIP Project
A modern, production-ready cookiecutter template for Python packages that follows best practices. This template sets up a complete development environment with all the tools needed to build, test, and publish professional Python libraries.
Git Pandas
A Python library that wraps GitPython to produce pandas dataframes for Git repository analysis. Enables data-driven analysis of Git repositories with features for commit history, blame information, and project-level insights.
Related Posts
- git-pandas Caching: Faster Analysis - Performance improvements through caching implementation
- Year’s End: Looking Back 2017 - Updates on Redis-based caching and professional usage
- Create organization-wide punchcards with git-pandas - New punchcard visualization features
- Gitpandas v0.0.6: python 2.7, fileowners, file-wise blame and examples - Release notes and new features
- Git-Pandas v0.0.5: coverage.py, risk, and more - Release notes and risk analysis features
- Visualize all of your git repositories with gitnoc and git-pandas - Introduction to GitNOC visualization tool
- Analyzing GitPython and Pandas With GitPandas - Early analysis and examples
Elote
A Python library implementing various rating and ranking algorithms.
Related Posts
- Elote 1.0.0 Release: Rating Systems Made Simple - Major release announcement and new features
- The Elo Rating System - Deep dive into Elo ratings with Elote
- Elote: A Python Package of Rating Systems - Original introduction to the library
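For readers new to rating systems, the sketch below shows the standard Elo update in plain Python. It is not Elote's API; the function names and the default K-factor of 32 are assumptions chosen for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k_factor: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one game.

    Hypothetical helper for illustration only (not part of Elote).
    score_a is 1.0 if A won, 0.5 for a draw, and 0.0 if A lost.
    """
    expected_a = elo_expected_score(rating_a, rating_b)
    delta = k_factor * (score_a - expected_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated players; A wins.
    print(elo_update(1500.0, 1500.0, 1.0))
```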
Stargazers
A modern CLI tool to fetch, analyze, and summarize the stargazers or forkers of any public GitHub repository.
Related Posts
- Introducing ‘stargazers’: A Tool to Understand Your GitHub Audience - Announcing the tool and its features
PyGeoHash
A Python library for working with geohashes, providing encoding, decoding, and distance calculations.
Related Posts
- PyGeoHash Gets Type Hints: A Journey into Modern Python - Adding comprehensive type hints and a new types module
- PyGeoHash v3.0.0: Faster, Freer, and More Pythonic - Latest major release with complete rewrite
- PyGeoHash 2.1.0: Modernizing a Geospatial Python Library - Modernization and new features
- Using Cursor for Library Maintenance - How AI tools helped revitalize PyGeoHash
Category Encoders
A scikit-learn-contrib library providing encoders for categorical variables as part of machine learning workflows. I authored this project but am no longer the day-to-day maintainer.
Related Posts
- OneHotEncoder: The Workhorse of Categorical Encoding - Deep dive into one-hot encoding implementation and best practices
- The Category Encoders Journey - Reflecting on the project’s growth and impact
- Category Encoders Published in JOSS - Academic publication milestone
- Category Encoders Accepted into scikit-learn-contrib - Major ecosystem integration
- Beyond One Hot: An Exploration of Categorical Variables - Original motivation and exploration