Artificial Intelligence & Machine Learning

This section collects my writing on artificial intelligence and machine learning, drawing on years of experience leading AI initiatives across various organizations. Topics range from practical implementation advice to theoretical discussions and industry trends.

Elote 1.0.0 Release: Rating Systems Made Simple

March 13, 2025

Tags:python, open-source, elote, rating-systems, elo

After what feels like forever (and honestly, it kind of has been), I’m thrilled to announce that Elote 1.0.0 has been released.

For those who haven’t been following along, Elote is my little Python library for implementing and comparing rating systems like the Elo system used in chess.

PyGeoHash v3.0.0: Faster, Freer, and More Pythonic

March 11, 2025

Tags:python, open-source, geospatial, pygeohash, performance, licensing

A deep dive into the latest major release of PyGeoHash, featuring a complete rewrite in pure CPython, MIT relicensing, and dramatic performance improvements

Using Cursor for Open Source Library Maintenance

March 9, 2025

Tags:cursor, open-source, python, ai, library-maintenance, pygeohash, keeks, elote

How AI-powered coding tools like Cursor can simplify the maintenance burden of open source projects, featuring real-world examples from PyGeoHash, Keeks, and Elote

PyGeoHash 2.1.0: Modernizing a Geospatial Python Library

March 4, 2025

Tags:python, geospatial, open-source, geohash, cursor, claude

A look at the latest updates to PyGeoHash, a lightweight Python library for working with geohashes, and how modern AI tools helped revitalize the project.

Claude 3.7 and new Cursor: first impressions

February 26, 2025

Tags:data-science, ai, tools

Anthropic finally released Claude 3.7, and Cursor has a new version with some interesting features. Let’s look at what’s new, along with some first impressions.

Data Science Things Roundup #13

February 10, 2025

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #12

January 20, 2025

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

SDLC in the Age of AI

January 12, 2025

Tags:ai, software-development, programming, prompt-engineering, data science

Exploring how AI is reshaping software development practices and the emerging role of prompt engineering

From Weekend Hack to Core Tool: The category_encoders Journey

December 27, 2022

Tags:open-source, python, data science, machine-learning

Reflecting on how a simple Python experiment grew into a fundamental data science library with millions of downloads

Investment Review: Seer.ai

October 18, 2022

Tags:investing, startups, angel-investing, deal-review, ai, machine-learning, analytics

A review of my angel investment in Seer.ai, exploring how they align with my investment thesis and their unique value proposition in AI-powered analytics.

Category Encoders v1.2.8 Release

June 4, 2018

Tags:python, data science, category-encoders, open-source

Release announcement for Category Encoders v1.2.8 with bugfixes and new features

TechEmergence Podcast and Atlanta AI Article

June 3, 2018

Tags:artificial-intelligence, podcast, atlanta, predictive-maintenance, data science

Discussing predictive maintenance and the Atlanta AI ecosystem on TechEmergence

Data Engineering Podcast

February 20, 2018

Tags:data-engineering, podcast, predictive-maintenance, industrial-iot, data science

Discussing data engineering, predictive maintenance, and industrial IoT on the Data Engineering Podcast

Category Encoders published in JOSS

January 26, 2018

Tags:python, data-science, category-encoders, open-source, academic, publication

Category Encoders package gets published in the Journal of Open Source Software (JOSS)

The Problem with Industrial IoT

January 16, 2018

Tags:iot, industry, technology, data science

A discussion of the challenges facing industrial IoT adoption and implementation

Revisiting Python support in Apache Flink

January 11, 2018

Tags:python, apache-flink, big-data, streaming, data science

A follow-up on the state of Python support in Apache Flink after a couple of years

Tendencies of Data Engineers and Scientists

January 9, 2018

Tags:data-engineering, data-science, team-organization, engineering-management, data science

Exploring the relationship dynamics and organizational challenges between data engineering and data science teams

I Made a Model, Now What?

January 4, 2018

Tags:data-science, machine-learning, production, pydata, data science, atlanta

Insights from a PyData Atlanta talk about successfully deploying and maintaining machine learning models in production

Elote: a Python package of rating systems

December 6, 2017

Tags:python, data-science, machine-learning, rating-systems

Introducing Elote, a Python package for implementing various rating systems

Category Encoders v1.2.5 Release

November 22, 2017

Tags:python, category-encoders, open-source, data-science, machine-learning

Release notes for Category Encoders v1.2.5, highlighting community contributions and improvements

Data Science Things Roundup #11

September 23, 2017

Tags:data-science, roundup, visualization, bayesian, finance

A collection of interesting data science articles and projects, including SEC keynotes, Bayesian inference, and visualization tools

git-pandas Caching: Faster Analysis

July 25, 2017

Tags:python, git, pandas, data-analysis, performance

Improving performance in git-pandas with caching

Category Encoders v1.2.4 Release

July 12, 2017

Tags:python, data-science, machine-learning, category-encoders, release

New release of category_encoders with improved functionality and bug fixes

Data Science Things Roundup #10

April 19, 2017

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #9

March 12, 2017

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #8

January 25, 2017

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

BaseN Encoding Grid Search in Category Encoders

December 18, 2016

Tags:python, data-science, machine-learning, category-encoders

Exploring the BaseN encoder and grid search capabilities in category_encoders

Category Encoders accepted into scikit-learn-contrib

November 20, 2016

Tags:python, data-science, open-source, scikit-learn, category-encoders

Category Encoders joins the scikit-learn-contrib ecosystem

Data Science Things Roundup #7

November 10, 2016

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Category Encoders now on conda-forge

September 17, 2016

Tags:python, data-science, open-source, conda, category-encoders

Announcing the availability of category_encoders on conda-forge

Data Science Things Roundup #6

July 20, 2016

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #5

March 15, 2016

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

How to Write Comprehensions and Alienate People

January 8, 2016

Tags:python, programming, humor, best-practices, data science

A tongue-in-cheek guide to writing Python comprehensions that will make your colleagues question their life choices

Common Data Pitfalls for Recurring Machine Learning Systems

December 20, 2015

Tags:data science, machine learning, analytics, data engineering

A reference guide to common data problems and limitations encountered when building recurring machine learning systems, supplementing the comprehensive guide to bad data with problems specific to recurring systems.

CyberLaunch: An Accelerator for Machine Learning Companies

December 8, 2015

Tags:startups, atlanta, machine learning, accelerators, data science

A look at CyberLaunch, Atlanta's accelerator focused on machine learning and information security companies, and its role in the startup ecosystem.

Data Science Things Roundup #4

December 5, 2015

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Beyond One-Hot: An Exploration of Categorical Variables

November 29, 2015

Tags:machine learning, data science, categorical variables, feature engineering

A deep dive into different methods for encoding categorical variables in machine learning, exploring their benefits and trade-offs

Data Science vs. Data Engineering

October 31, 2015

Tags:data science, data engineering, career, technology, big data

Understanding the fundamental differences between data science and data engineering through the lens of methodology rather than tools

Data Science Things Roundup #3

September 10, 2015

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #2

May 20, 2015

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources

Data Science Things Roundup #1

February 15, 2015

Tags:data-science, machine-learning, roundup, resources

A collection of interesting data science articles and resources
