Artificial Intelligence & Machine Learning
This section collects my writing on artificial intelligence and machine learning, drawing on years of experience leading AI initiatives across various organizations. Topics range from practical implementation advice to theoretical discussions and industry trends.
Elote 1.0.0 Release: Rating Systems Made Simple
March 13, 2025
Tags:python, open-source, elote, rating-systems, elo
After what feels like forever (and honestly, it kind of has been), I’m thrilled to announce that Elote 1.0.0 has been released.
For those who haven’t been following along, Elote is my little Python library for implementing and comparing rating systems like the Elo system used in chess.
PyGeoHash v3.0.0: Faster, Freer, and More Pythonic
March 11, 2025
Tags:python, open-source, geospatial, pygeohash, performance, licensing
A deep dive into the latest major release of PyGeoHash, featuring a complete rewrite in pure CPython, MIT relicensing, and dramatic performance improvements
Using Cursor for Open Source Library Maintenance
March 9, 2025
Tags:cursor, open-source, python, ai, library-maintenance, pygeohash, keeks, elote
How AI-powered coding tools like Cursor can ease the maintenance burden of open source projects, with real-world examples from PyGeoHash, Keeks, and Elote
PyGeoHash 2.1.0: Modernizing a Geospatial Python Library
March 4, 2025
Tags:python, geospatial, open-source, geohash, cursor, claude
A look at the latest updates to PyGeoHash, a lightweight Python library for working with geohashes, and how modern AI tools helped revitalize the project.
Claude 3.7 and new Cursor: first impressions
February 26, 2025
Tags:data-science, ai, tools
Anthropic finally released Claude 3.7, and Cursor has a new version with some interesting features. Let’s look at what’s new, along with some first impressions.
Data Science Things Roundup #13
February 10, 2025
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #12
January 20, 2025
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
SDLC in the Age of AI
January 12, 2025
Tags:ai, software-development, programming, prompt-engineering, data science
Exploring how AI is reshaping software development practices and the emerging role of prompt engineering
From Weekend Hack to Core Tool: The category_encoders Journey
December 27, 2022
Tags:open-source, python, data science, machine-learning
Reflecting on how a simple Python experiment grew into a fundamental data science library with millions of downloads
Investment Review: Seer.ai
October 18, 2022
Tags:investing, startups, angel-investing, deal-review, ai, machine-learning, analytics
A review of my angel investment in Seer.ai, exploring how they align with my investment thesis and their unique value proposition in AI-powered analytics.
Category Encoders v1.2.8 Release
June 4, 2018
Tags:python, data science, category-encoders, open-source
Release announcement for Category Encoders v1.2.8 with bugfixes and new features
TechEmergence Podcast and Atlanta AI Article
June 3, 2018
Tags:artificial-intelligence, podcast, atlanta, predictive-maintenance, data science
Discussing predictive maintenance and the Atlanta AI ecosystem on TechEmergence
Data Engineering Podcast
February 20, 2018
Tags:data-engineering, podcast, predictive-maintenance, industrial-iot, data science
Discussing data engineering, predictive maintenance, and industrial IoT on the Data Engineering Podcast
Category Encoders published in JOSS
January 26, 2018
Tags:python, data-science, category-encoders, open-source, academic, publication
Category Encoders package gets published in the Journal of Open Source Software (JOSS)
The Problem with Industrial IoT
January 16, 2018
Tags:iot, industry, technology, data science
A discussion of the challenges facing industrial IoT adoption and implementation
Revisiting Python support in Apache Flink
January 11, 2018
Tags:python, apache-flink, big-data, streaming, data science
A follow-up on the state of Python support in Apache Flink after a couple of years
Tendencies of Data Engineers and Scientists
January 9, 2018
Tags:data-engineering, data-science, team-organization, engineering-management, data science
Exploring the relationship dynamics and organizational challenges between data engineering and data science teams
I Made a Model, Now What?
January 4, 2018
Tags:data-science, machine-learning, production, pydata, data science, atlanta
Insights from a PyData Atlanta talk about successfully deploying and maintaining machine learning models in production
Elote: a Python package of rating systems
December 6, 2017
Tags:python, data-science, machine-learning, rating-systems
Introducing Elote, a Python package for implementing various rating systems
Category Encoders v1.2.5 Release
November 22, 2017
Tags:python, category-encoders, open-source, data-science, machine-learning
Release notes for Category Encoders v1.2.5, highlighting community contributions and improvements
Data Science Things Roundup #11
September 23, 2017
Tags:data-science, roundup, visualization, bayesian, finance
A collection of interesting data science articles and projects, including SEC keynotes, Bayesian inference, and visualization tools
git-pandas Caching: Faster Analysis
July 25, 2017
Tags:python, git, pandas, data-analysis, performance
Improving performance in git-pandas with caching
Category Encoders v1.2.4 Release
July 12, 2017
Tags:python, data-science, machine-learning, category-encoders, release
New release of category_encoders with improved functionality and bug fixes
Data Science Things Roundup #10
April 19, 2017
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #9
March 12, 2017
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #8
January 25, 2017
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
BaseN Encoding Grid Search in Category Encoders
December 18, 2016
Tags:python, data-science, machine-learning, category-encoders
Exploring the BaseN encoder and grid search capabilities in category_encoders
Category Encoders accepted into scikit-learn-contrib
November 20, 2016
Tags:python, data-science, open-source, scikit-learn, category-encoders
Category Encoders joins the scikit-learn-contrib ecosystem
Data Science Things Roundup #7
November 10, 2016
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Category Encoders now on conda-forge
September 17, 2016
Tags:python, data-science, open-source, conda, category-encoders
Announcing the availability of category_encoders on conda-forge
Data Science Things Roundup #6
July 20, 2016
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #5
March 15, 2016
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
How to Write Comprehensions and Alienate People
January 8, 2016
Tags:python, programming, humor, best-practices, data science
A tongue-in-cheek guide to writing Python comprehensions that will make your colleagues question their life choices
Common Data Pitfalls for Recurring Machine Learning Systems
December 20, 2015
Tags:data science, machine learning, analytics, data engineering
A reference guide to common data problems and limitations encountered when building recurring machine learning systems, supplementing the comprehensive guide to bad data.
CyberLaunch: An Accelerator for Machine Learning Companies
December 8, 2015
Tags:startups, atlanta, machine learning, accelerators, data science
A look at CyberLaunch, Atlanta's accelerator focused on machine learning and information security companies, and its role in the startup ecosystem.
Data Science Things Roundup #4
December 5, 2015
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Beyond One-Hot: An Exploration of Categorical Variables
November 29, 2015
Tags:machine learning, data science, categorical variables, feature engineering
A deep dive into different methods for encoding categorical variables in machine learning, exploring their benefits and trade-offs
Data Science vs. Data Engineering
October 31, 2015
Tags:data science, data engineering, career, technology, big data
Understanding the fundamental differences between data science and data engineering through the lens of methodology rather than tools
Data Science Things Roundup #3
September 10, 2015
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #2
May 20, 2015
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources
Data Science Things Roundup #1
February 15, 2015
Tags:data-science, machine-learning, roundup, resources
A collection of interesting data science articles and resources