Exploring Ideas: A Blog on Technology, Startups, Food, and More
Welcome to my blog where I share thoughts and insights on technology, startups, and life in Atlanta. Browse through the articles below or explore by topic.
Create organization-wide punchcards with git-pandas
January 17, 2016
In preparation for the v1.0.0 release of git-pandas, I’m going through the library and trying to stabilize the inputs and outputs across all functions. The idea is that there should be a common language for interacting with these objects, whether they be single repositories or project directories, and regardless of which function you are interacting with. I’m also trying to lower the barrier to en...
How to Write Comprehensions and Alienate People
January 8, 2016
Python list comprehensions are a powerful feature that can make code more concise and readable. But like any powerful tool, they can also be misused to create code that’s virtually incomprehensible. Here’s a guide to writing comprehensions that will make your colleagues wonder if you’re actually writing Python or inventing a new language. The Basics of Alienation Start with simple list comprehensi...
Gitpandas v0.0.6: python 2.7, fileowners, file-wise blame and examples
January 7, 2016
After the last release about two weeks ago, I’ve focused the development in git-pandas around testing, documentation, and bugfixes in preparation for the first major release, v1.0.0. In this release, we’ve added: support for Python 2.7; testing and integration with Travis-CI and Coveralls; revision name was added to the file change history dataframe; blame can be done file-wise throughout the project...
Market-Product fit vs Product-Market fit
December 28, 2015
Marc Andreessen originally coined the phrase product/market fit in the statement: “Product/market fit means being in a good market with a product that can satisfy that market.” Since then, many prominent entrepreneurs and investors have built up process and theory around this idea to formalize its definition and generalize how to find it. The gist of it is that a startup should quickly build a min...
Git-Pandas v0.0.5: coverage.py, risk, and more
December 25, 2015
The fifth release of git-pandas just hit PyPI (install with: pip install -U git-pandas). It includes a few neat features: a basic introduction of support for .coverage files, including checking whether such a file exists in the first place; a file change rate table that includes time rates of addition, change and other metrics aimed at identifying files that may be at a higher than normal risk of bu...
Common Data Pitfalls for Recurring Machine Learning Systems
December 20, 2015
Many data science and machine learning resources exist for the analysis of static datasets, but often in the real world, solutions and systems need to work in an automated fashion on recurring data. This presents a unique challenge, because when the system is being designed and built, you know that you don’t have all of the data. You will, by design, be getting new data every day, week or however ...
Visualize all of your git repositories with gitnoc and git-pandas
December 13, 2015
In a few past posts, I’ve shown you some of the functionality of one of my projects: git-pandas. You can do aggregate analysis of all of those, or you can even do a cumulative blame across them all. But with a little bit of extra code, you can start to see where aggregated, higher-level analysis of git repositories can be useful for a team. GitNOC is a Flask/D3/Redis-based app that, with git-panda...
CyberLaunch: An Accelerator for Machine Learning Companies
December 8, 2015
Atlanta has a new accelerator focused on machine learning and information security companies, called CyberLaunch. It is a rebranding of a previous accelerator called SparkLabs, and is run by the same team. The accelerator is focused on companies that are building products in the machine learning and information security spaces, and is looking to invest in companies that are building products that ...
Data Science Things Roundup #4
December 5, 2015
Time for another edition of the data science things roundup, where I round up some data science things for y’all. Today’s collection is uncharacteristically R heavy. It’s usually pretty Python and machine learning heavy, so if you find something you like here, be sure to check out previous editions as well. Without further ado: Scikit-Learn Groups Scikit-learn groups (skl-groups) is a python li...
Beyond One-Hot: An Exploration of Categorical Variables
November 29, 2015
In machine learning, data are king. The algorithms and models used to make predictions with the data are important, and very interesting, but ML is still subject to the idea of garbage-in-garbage-out. With that in mind, let’s look at a little subset of those input data: categorical variables. Categorical variables (wiki) are those that represent a fixed number of possible values, rather than a con...