Exploring Ideas: A Blog on Technology, Startups, Food, and More

Welcome to my blog, where I share thoughts and insights on technology, startups, and life in Atlanta. Browse through the articles below or explore by topic.

On taking things too seriously: holiday edition

December 9, 2017

data-science python rating-systems sports

For some reason Atlanta got a pretty significant amount of snow yesterday, and because of that I’ve been mostly stuck at home. With that kind of time on my hands, I sometimes spend too much of it on things that don’t really matter all that much. Recently, I’ve been fascinated with rating systems (see a post on Elote here), so that was at the front of my mind this week. Every year, around thi...

Elote: a python package of rating systems

December 6, 2017

data-science machine-learning python rating-systems

Recently I’ve been interested in rating systems. Around here, the application most front of mind for those is college football rankings. In general, imagine any case where you have a large population of things you want to rank, and only a limited set of head-to-head matchups between those things to use for building your ratings. Without a direct comparison point between each possible pair, you’ve got to...
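As a rough illustration of rating from a limited set of head-to-head matchups, here is a minimal sketch of a classic Elo-style update. It does not use Elote's API; the function names, the starting rating of 1500, and the K-factor of 32 are all assumptions chosen for the example.

```python
from typing import Dict, Iterable, Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> Tuple[float, float]:
    """Return the new (A, B) ratings after one matchup.

    k is an assumed K-factor; larger values react faster to each result.
    """
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool is conserved.
    return rating_a + delta, rating_b - delta


def rate_matchups(matchups: Iterable[Tuple[str, str]],
                  initial: float = 1500.0,
                  k: float = 32.0) -> Dict[str, float]:
    """Rate competitors from (winner, loser) pairs, processed in order.

    Competitors never seen before start at the assumed initial rating.
    """
    ratings: Dict[str, float] = {}
    for winner, loser in matchups:
        r_w = ratings.get(winner, initial)
        r_l = ratings.get(loser, initial)
        ratings[winner], ratings[loser] = update_elo(r_w, r_l, True, k)
    return ratings
```

Calling `rate_matchups([("A", "B"), ("B", "C")])` ranks A above C even though the two never played each other, which is the indirect-comparison problem the post describes.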

Ripyr: sampled metrics on datasets using python's asyncio

November 28, 2017

data-science python type-hints

Today I’d like to introduce a little python library I’ve toyed around with here and there for the past year or so, ripyr. Originally it was written just as an excuse to try out some newer features in modern python: asyncio and type hinting. The whole package is type hinted, which turned out to be a pretty low level of effort to implement, and the asyncio ended up being pretty speedy. But first the...

Category Encoders v1.2.5 Release

November 22, 2017

category-encoders data-science machine-learning open-source python

This release was actually cut a couple of weeks ago, but I forgot to put a post here. It’s been a release of mainly incremental changes, but also one of increased contributions from the community, so while not a huge feature-packed release, it’s one I’m particularly proud of. Here’s to more like this. It was around 4 months since the last release, which I think is a pretty decent cadence, consider...

Standing Peachtree Park

November 20, 2017

atlanta history

Standing Peachtree Park is one of Atlanta’s lesser-known historical sites, yet it holds significant importance in the city’s history. Located at the confluence of Peachtree Creek and the Chattahoochee River, this site marks one of the earliest settlements in what would become Atlanta. Historical Significance: The site was originally home to a Creek Indian village called Standing Peachtree, which se...

Data Science Things Roundup #11

September 23, 2017

data-science finance roundup

Note: This post has been migrated from my previous blog. Some links to previous roundups or external resources may no longer be available. Once again time for the data science things roundup, a few links of articles or projects I’ve stumbled across and found interesting. This is the 11th one in the extremely irregular series, so if you think it’s cool, check out some of the others. This time we’v...

git-pandas Caching: Faster Analysis

July 25, 2017

data-analysis git pandas performance python

The git-pandas library has been around for a while now, providing tools to analyze git repositories using pandas DataFrames. One of the common pieces of feedback has been about performance - analyzing large repositories can be slow, especially when running multiple analyses on the same data. To address this, I’ve added caching to the repository objects. This means that when you run an analysis, th...
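The general idea of caching repeated analyses can be sketched as below. This is not git-pandas' actual implementation or API: the decorator `cached_analysis` and the stand-in function `slow_analysis` are hypothetical names, and the in-memory dictionary is just one assumed way to store results.

```python
import functools
from typing import Any, Callable, Dict, Tuple

# Counts how many times the expensive body actually runs (for illustration).
call_count = {"n": 0}


def cached_analysis(func: Callable[..., Any]) -> Callable[..., Any]:
    """Memoize a function on its positional arguments (hypothetical helper)."""
    cache: Dict[Tuple[Any, ...], Any] = {}

    @functools.wraps(func)
    def wrapper(*args: Any) -> Any:
        # Only compute on a cache miss; repeat calls reuse the stored result.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


@cached_analysis
def slow_analysis(repo_path: str, branch: str) -> int:
    """Stand-in for an expensive repository analysis (assumed, not real)."""
    call_count["n"] += 1
    return len(repo_path) + len(branch)
```

Calling `slow_analysis("my-repo", "master")` twice runs the expensive body only once; the second call is served from the dictionary. A real cache would also need an invalidation strategy for when the repository changes.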

Category Encoders v1.2.4 Release

July 12, 2017

category-encoders data-science machine-learning python

I’m happy to announce the release of category_encoders v1.2.4! This release includes several improvements and bug fixes to make the library more robust and easier to use. Key updates in this version: added support for pandas categorical types; improved handling of missing values across all encoders; better error messages when input validation fails; fixed several edge cases in the BaseN encoder; Updat...

Data Science Things Roundup #10

April 19, 2017

data-science machine-learning resources roundup

Hey all, I haven’t done one of these in quite a while, but thought I’d share a few more articles I’ve found interesting recently. An analysis of Twitter influencers in the field of data science & big data: This is a pretty in-depth Medium article that goes through some of the concepts in network analysis, through the lens of Twitter data. It’s not an area I know a ton about, but I found it approach...

Data Science Things Roundup #9

March 12, 2017

data-science machine-learning resources roundup

Things got a bit busy and I fell off the wagon on posting, but here we are, back for the ninth edition of the data science things roundup. If you haven’t seen previous editions, it’s basically just 3 data science or Python-related articles or packages that I’ve stumbled across recently and thought were interesting. This time we have a great paper and 2 Python packages, so dig in. A few useful things t...
