Exploring Ideas: A Blog on Technology, Startups, Food, and More

Welcome to my blog, where I share thoughts and insights on technology, startups, and life in Atlanta. Browse through the articles below or explore by topic.

Introducing unified glob-syntax in git-pandas

June 15, 2016

In an effort to improve the git-pandas user interface, I’m introducing a new way of specifying which files in a repository you care about. It will become the sole way of selecting files in version 2.0.0. Currently, for any given function, you can specify a list of extensions you’d like to include and a list of directories you’d like to exclude. A toy example would be: df = rep...

Read more →

Parallelizing cumulative blame in git-pandas with joblib

June 12, 2016

It’s been a little while since I’ve posted anything about git-pandas, as I’ve been working on getting a sister project, twitter-pandas, up and running. Work has continued, though, and today I’d like to show a currently experimental feature: parallelized cumulative blame. Cumulative blame is one of the more popular features used in git-pandas, as it can be used to easily create a pretty interesting p...

Read more →

Exit Interviews in Startups

May 5, 2016

Exit interviews are a critical but often overlooked practice in startups. While larger companies have formalized processes for gathering feedback when employees leave, startups often skip this valuable opportunity for insight, either due to the emotional nature of departures or the pressure to focus on immediate operational needs. Why Exit Interviews Matter: In startups, every departure represents ...

Read more →

When do I work on what?

April 30, 2016

In past posts, I’ve shown that it’s pretty easy to create organization-wide punchcards with git-pandas. Today, I put together a little twist on that particular visualization, splitting my projects into two cohorts: open and closed source. My work at Predikto is, as work tends to be, mostly closed source (though we try to contribute to projects when we can). The work I do outside of Predikto is, in ...

Read more →

Building an Engineering Team Around Ownership

April 28, 2016

Small, talented teams have an inherent challenge: the individuals that make them up are talented. In a small, talented engineering team, the engineers understand architecture, the architects understand engineering, and the product managers understand the technical side of things. While this cross-functional knowledge seems beneficial on the surface - enabling empathy and better questions - it come...

Read more →

Estimating the time spent on a project with git-pandas

April 16, 2016

I recently stumbled across a conversation on the Tech404 Slack channel (a pretty good public Slack group for Atlanta-area software folks) that was mostly about taxes, but nestled in the middle was this project: git_time_extractor. In the past I’ve noticed a kind of weird concentration of git-related open source projects among Atlanta developers; I’m not sure if that says more about Atlanta or git’s abstrus...

Read more →

Data Science Things Roundup #5

March 15, 2016

Time again for the 5th edition of the data science things roundup, named suspiciously similarly to the much more established Data Science Roundup by RJ Metrics (but we won’t worry about that this week). In previous weeks we’ve seen some pretty cool ML and data science libraries, mostly in Python; this week we branch out a little bit into more engineering-level projects (databases and deployment). Sp...

Read more →

Automating documentation workflow with sphinx and github pages

February 29, 2016

I’ve got a large and growing list of open source projects I try to keep up to date. Some are better documented than others, but I’m working on making them all very simple to get started with. In light of that, I’ve done some hacking around on how to get the workflow of github pages and sphinx nailed down. For a while I was manually building the docs in the master branch, copying the html out to a ...

Read more →

Pypi-publisher: a simple cli for publishing python libraries

February 24, 2016

I’ve just finished the first release of pypi-publisher (ppp), a library for simple command line publishing of python libraries. You can grab the source or an install here: https://github.com/wdm0006/pypi-publisher In previous posts, I’ve shown how with a cookiecutter framework and some fast typing, you can release a package to pypi in pretty much no time at all. In its current state, ppp can cut ...

Read more →

Using survival analysis and git-pandas to estimate code quality

February 21, 2016

Survival analysis is a statistical technique for estimating how likely an event is to occur over a span of time. It has its roots in the medical and actuarial professions, where it would answer questions like: given this set of conditions, how likely is a person to survive X years? In previous posts, we’ve seen that we can tap into a huge amount of data in git repositories with git-panda...

Read more →
