Exploring Ideas: A Blog on Technology, Startups, Food, and More

Welcome to my blog, where I share thoughts and insights on technology, startups, and life in Atlanta. Browse through the articles below or explore by topic.

Estimating the time spent on a project with git-pandas

April 16, 2016

I recently stumbled across a conversation on the Tech404 slack channel (a pretty good public slack group for Atlanta-area software folks) that was mostly about taxes, but nestled in the middle was this project: git_time_extractor. In the past I’ve noticed a kind of weird concentration of git-related open source projects among Atlanta developers. I’m not sure if that says more about Atlanta or git’s abstrus...

Read more →

Data Science Things Roundup #5

March 15, 2016

Time again for the 5th edition of the data science things roundup, named suspiciously similarly to the much more established Data Science Roundup by RJ Metrics (but we won’t worry about that this week). In previous weeks we’ve seen some pretty cool ML and Data Science libraries, mostly in python. This week we branch out a little into more engineering-level projects (databases and deployment). Sp...

Read more →

Automating documentation workflow with sphinx and github pages

February 29, 2016

I’ve got a large and growing list of open source projects I try to keep up to date. Some are better documented than others, but I’m working on making them all very simple to get started with. In light of that, I’ve done some hacking around on how to get the workflow of github pages and sphinx nailed down. For a while I was manually building the docs in the master branch, copying the html out to a ...

Read more →

Pypi-publisher: a simple cli for publishing python libraries

February 24, 2016

I’ve just finished the first release of pypi-publisher (ppp), a library for simple command line publishing of python libraries. You can grab the source or an install here: https://github.com/wdm0006/pypi-publisher. In previous posts, I’ve shown how, with a cookiecutter framework and some fast typing, you can release a package to pypi in pretty much no time at all. In its current state, ppp can cut ...

Read more →

Using survival analysis and git-pandas to estimate code quality

February 21, 2016

Survival analysis is a statistical technique for estimating how likely events are to happen over time. It has its roots in the medical and actuarial professions, where it would answer questions like: given this set of conditions, how likely is a person to survive X years? In previous posts, we’ve seen that we can tap into a huge amount of data in git repositories with git-panda...

Read more →
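
To make the idea in the excerpt above concrete, here is a minimal sketch of one standard survival analysis tool, the Kaplan-Meier estimator, in plain Python. It is an illustration only and not necessarily the method used in the post; the function name kaplan_meier and the sample lifetimes are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Estimate survival probability at each time where an event occurred.

    durations: time until the event or until observation stopped
    observed:  True if the event happened, False if the sample was censored
    Returns a list of (time, survival_probability) pairs.
    """
    samples = sorted(zip(durations, observed))
    at_risk = len(samples)   # samples still being followed
    survival = 1.0
    curve = []
    i = 0
    while i < len(samples):
        t = samples[i][0]
        events = 0
        leaving = 0
        # Group every sample that shares this time value.
        while i < len(samples) and samples[i][0] == t:
            events += int(samples[i][1])
            leaving += 1
            i += 1
        if events:
            # Multiply by the fraction of at-risk samples that survived time t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical lifetimes (in days); False marks a sample that is still alive
# at the end of the observation window.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
for time, probability in curve:
    print(time, round(probability, 3))
```

Censored samples count toward the at-risk total until the time they drop out, which is what separates this estimator from a simple fraction of samples that survived.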

Journalism and the Perfect Pitch Deck

February 15, 2016

Having been pretty immersed in the VC funded startup experience at Predikto for a couple of years now, I have (counter to what I would have ever thought) developed an intellectual curiosity about pitch decks, including studying great examples like LinkedIn’s fantastically annotated deck from their Series B. Despite having pretty much always been in some sort of engineering, I don’t come from a family of...

Read more →

Git-pandas v1.0.0, or how to check for a stable release

February 2, 2016

In the process of making the v1.0.0 release of git-pandas, I had one primary goal: to simplify and solidify the interface to git-pandas objects (the ProjectDirectory and the Repository). At the end of the day, the usefulness of a project like git-pandas over one-off analysis or rolling your own interface lies in consistent and predictable interfaces to commonly used functions. So with that in mind, I...

Read more →

Github.com cumulative blame in 5 lines of python

January 31, 2016

Git-pandas has gotten to be pretty capable. Currently in the master branch and soon to be in the v1.0.0 release, we’ve included a github.com interface to git-pandas via the GitHubProfile class. With this, in just a few lines of code, you can see how your profile has grown over time:
from gitpandas.utilities.plotting import plot_cumulative_blame
from gitpandas import GitHubProfile
g = GitHubProfile...

Read more →

Decision Strategies: Beyond Expected Value

January 28, 2016

Oftentimes when making some kind of uncertain decision, the decision maker will use a measure such as expected value to make that decision. Imagine the case of a single coin flip where the bettor pays 5 dollars to play, and gets 2 dollars for heads and 10 dollars for tails. The expected payout of this game is 6 dollars (half of 2 plus half of 10) against the 5 dollar entry fee, giving an expected one dollar profit. But th...

Read more →
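
The coin-flip arithmetic in the excerpt above can be checked in a few lines of Python. This is a small sketch that assumes a fair coin (the excerpt’s numbers imply it); the helper name expected_value is made up for illustration.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum payoff * probability over (payoff, probability) pairs."""
    return sum(payoff * probability for payoff, probability in outcomes)


# The game from the excerpt: pay 5 dollars, get 2 on heads or 10 on tails.
# Assumption: the coin is fair, so each side has probability 1/2.
entry_fee = 5
payouts = [(2, Fraction(1, 2)), (10, Fraction(1, 2))]

gross = expected_value(payouts)  # expected payout before the entry fee
net = gross - entry_fee          # expected profit per play

print(gross, net)
```

Exact fractions are used so the result is not blurred by floating point rounding.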

Data-driven engineering team management with gitnoc and git-pandas

January 19, 2016

The management of engineering projects is very different when it scales beyond one developer, and even more so when it scales beyond one repository. Devops and source control aside, as the number of contributors and repos increases, the difficulty for a manager of keeping track of each developer and repository relative to each other increases in kind. Data, especially incomplete data, canno...

Read more →
