Journal
Posts and long-form series on AI, startups, venture capital, and more.
All Posts
Standing Peachtree Park
Visit Standing Peachtree Park, an Atlanta site at the confluence of Peachtree Creek and the Chattahoochee River, marking one of the city's earliest settlements.
Data Science Things Roundup #11
A collection of interesting data science articles and projects, including SEC keynotes, Bayesian inference, and visualization tools.
Modernizing Pedalwrencher: Whatever That Means
Migrating the Pedalwrencher Flask app from Heroku plus an EC2 cron box to a Dockerized AWS ECS deployment with RDS, Route 53, and CloudWatch alerts.
git-pandas Caching: Faster Analysis
Boost git repository analysis speed! Learn how git-pandas now uses caching to dramatically improve performance for repeated queries on large codebases.
Category Encoders v1.2.4 Release
Category Encoders v1.2.4 is out! Includes pandas categorical type support, improved missing value handling, better error messages, BaseN fixes, and docs.
Data Science Things Roundup #10
A curated collection of data science articles and tools exploring network analysis, StashPy for log processing, and Bayesian survival analysis techniques.
Data Science Things Roundup #9
Data Science Things Roundup #9: Highlighting Pedro Domingos' ML paper, Spyre (Shiny for Python), and BetaGo, an AlphaGo-inspired Go bot framework.
Data Science Things Roundup #8
Data Science Things Roundup #8: Dive into LIME for model interpretation, sklearn-expertsys for interpretable classifiers, and the value of nearest neighbors.
BaseN Encoding Grid Search in Category Encoders
Explore category_encoders' BaseN encoder for representing categorical data. Learn how to use scikit-learn's grid search to find the optimal encoding base.
Category Encoders accepted into scikit-learn-contrib
Category Encoders, a Python library for encoding categorical variables, has been accepted into the scikit-learn-contrib ecosystem. A project milestone!