Exploring Ideas: A Blog on Technology, Startups, Food, and More

Welcome to my blog where I share thoughts and insights on technology, startups, and life in Atlanta. Browse through the articles below or explore by topic.

Create organization-wide punchcards with git-pandas

January 17, 2016

In preparation for the v1.0.0 release of git-pandas, I’m going through the library and trying to stabilize the inputs and outputs across all functions. The idea is that there should be a common language for interacting with these objects, whether they be single repositories or project directories, and regardless of which function you are interacting with. I’m also trying to lower the barrier to en...

Read more →

How to Write Comprehensions and Alienate People

January 8, 2016

Python list comprehensions are a powerful feature that can make code more concise and readable. But like any powerful tool, they can also be misused to create code that’s virtually incomprehensible. Here’s a guide to writing comprehensions that will make your colleagues wonder if you’re actually writing Python or inventing a new language. The Basics of Alienation: Start with simple list comprehensi...
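As a rough illustration of the contrast the teaser describes (a sketch written for this listing, not code from the full post), here is a readable comprehension next to a denser one. The variable names are made up for the example.

```python
# Hypothetical sample data, used only for this illustration.
numbers = [1, 2, 3, 4, 5, 6]
matrix = [[1, 2, 3], [4, 5, 6]]

# Readable: squares of the even numbers.
even_squares = [n * n for n in numbers if n % 2 == 0]

# Denser: flatten the nested list, keep odd values, and double them,
# all in one expression. Still valid Python, but harder to scan.
flattened_odd_doubles = [x * 2 for row in matrix for x in row if x % 2 == 1]

print(even_squares)
print(flattened_odd_doubles)
```

Both lines use the same syntax; the second simply packs more steps into a single expression, which is where readability tends to drop off.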

Read more →

Gitpandas v0.0.6: python 2.7, fileowners, file-wise blame and examples

January 7, 2016

After the last release about two weeks ago, I’ve focused the development in git-pandas around testing, documentation, and bugfixes in preparation for the first major release, v1.0.0. In this release, we’ve added: support for Python 2.7; testing and integration with Travis-CI and Coveralls; revision name was added to the file change history dataframe; blame can be done file-wise throughout the project...

Read more →

Market-Product fit vs Product-Market fit

December 28, 2015

Marc Andreessen originally coined the phrase product/market fit in the statement: “Product/market fit means being in a good market with a product that can satisfy that market.” Since then, many prominent entrepreneurs and investors have built up process and theory around this idea to formalize its definition and generalize how to find it. The gist of it is that a startup should quickly build a min...

Read more →

Git-Pandas v0.0.5: coverage.py, risk, and more

December 25, 2015

The fifth release of git-pandas just hit PyPI; install it with: pip install -U git-pandas. It includes a few neat features: basic support for .coverage files, including checking whether one exists in the first place; a file change rate table that includes time rates of addition, change and other metrics aimed at identifying files that may be at a higher than normal risk of bu...

Read more →

Common Data Pitfalls for Recurring Machine Learning Systems

December 20, 2015

Many data science and machine learning resources exist for the analysis of static datasets, but often in the real world, solutions and systems need to work in an automated fashion on recurring data. This presents a unique challenge, because when the system is being designed and built, you know that you don’t have all of the data. You will, by design, be getting new data every day, week or however ...

Read more →

Visualize all of your git repositories with gitnoc and git-pandas

December 13, 2015

In a few past posts, I’ve shown you some of the functionality of one of my projects: git-pandas. You can do aggregate analysis of all of those, or you can even do a cumulative blame across them all. But with a little bit of extra code, you can start to see where aggregated, higher level analysis of git repositories can be useful for a team. GitNOC is a flask/d3/redis based app that, with git-panda...

Read more →

CyberLaunch: An Accelerator for Machine Learning Companies

December 8, 2015

Atlanta has a new accelerator focused on machine learning and information security companies, called CyberLaunch. It is a rebranding of a previous accelerator called SparkLabs, and is run by the same team. The accelerator is focused on companies that are building products in the machine learning and information security spaces, and is looking to invest in companies that are building products that ...

Read more →

Data Science Things Roundup #4

December 5, 2015

Time for another edition of the data science things roundup, where I round up some data science things for y’all. Today’s collection is uncharacteristically R heavy. It’s usually pretty Python and machine learning heavy, so if you find something you like here, be sure to check out previous editions as well. Without further ado: Scikit-Learn Groups Scikit-learn groups (skl-groups) is a python li...

Read more →

Beyond One-Hot: An Exploration of Categorical Variables

November 29, 2015

In machine learning, data are king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out. With that in mind, let’s look at a little subset of those input data: categorical variables. Categorical variables (wiki) are those that represent a fixed number of possible values, rather than a con...
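For readers unfamiliar with the one-hot encoding named in the title, here is a minimal sketch using only the standard library. It is an illustration written for this listing, not code from the full post; the function name one_hot and the sample colors are assumptions.

```python
def one_hot(values):
    """Encode a list of category labels as one-hot rows.

    Returns the sorted category list and one row per input value,
    with a 1 in the column of that value's category and 0 elsewhere.
    """
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical sample data, used only for this illustration.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colors)
print(categories)
for row in encoded:
    print(row)
```

Each distinct category becomes its own column, so the number of columns grows with the number of categories.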

Read more →
