Create organization-wide punchcards with git-pandas
In preparation for the v1.0.0 release of git-pandas, I’m going through the library and trying to stabilize the inputs and outputs across all functions. The idea is that there should be a common language for interacting with git-pandas objects, whether they are single repositories or project directories, and regardless of which function you are calling.
I’m also trying to lower the barrier to entry by providing some higher level functions that deliver value quickly. In light of that, I’ve added a utilities directory with canned plotting functions for painless visualization.
In the past, I’ve posted about two of these higher level functions:
- Cumulative Blame
- File Change Rates
This weekend, I added a new one: punchcards. You can grab it from the master branch on GitHub until the v1.0.0 release. GitHub does a great job with visualizations and metrics for single repositories, and one of its popular visualizations is the punchcard. Here is the GitHub punchcard for git-pandas:
[Image: git-pandas github punchcard]
Part of the founding premise of git-pandas, however, is that projects and development teams are often split across many repositories, so a rolled-up view is often more useful. Now, on the git-pandas master branch, you can:
from gitpandas.utilities.plotting import plot_punchcard
from gitpandas import ProjectDirectory

# Treat several remote repositories as a single project
repo = ProjectDirectory(working_dir=[
    'git://github.com/wdm0006/git-pandas.git',
    'git://github.com/wdm0006/categorical_encoding.git',
    'git://github.com/wdm0006/sklearn-extensions.git',
    'git://github.com/wdm0006/pygeohash.git',
    'git://github.com/wdm0006/petersburg.git',
    'git://github.com/wdm0006/incomprehensible.git',
], verbose=True)

# by=None rolls every repository up into one aggregate punchcard
by = None
punchcard = repo.punchcard(branch='master', extensions=['py'], by=by, normalize=2500)
plot_punchcard(punchcard, metric='lines', title='punchcard', by=by)
And get a matplotlib plot:
[Image: punchcard]
By changing the “by” parameter, you can get this data aggregated by committer or repository, but the default is None, which rolls it all up into one aggregate punchcard. Portions of the plotting code were inspired by or taken from hgpunchcard.
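To make the “by” parameter a little more concrete, here is a minimal sketch of the kind of aggregation a punchcard involves, written in plain Python. It is not the git-pandas implementation: the build_punchcard function, the commit dictionaries, and their keys are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_punchcard(commits, by=None):
    """Sum lines changed per (weekday, hour) cell.

    commits: iterable of dicts with the hypothetical keys
             'date', 'lines', 'committer' and 'repository'.
    by:      None for a single aggregate card, or the name of a key
             ('committer' or 'repository') to split the card on.
    """
    cells = defaultdict(int)
    for commit in commits:
        # weekday() is 0 for Monday through 6 for Sunday
        key = (commit['date'].weekday(), commit['date'].hour)
        if by is not None:
            # Prefix the cell with the grouping value, e.g. the committer name
            key = (commit[by],) + key
        cells[key] += commit['lines']
    return dict(cells)


# Made-up sample data, not taken from any real repository
sample_commits = [
    {'date': datetime(2016, 1, 4, 10, 30, tzinfo=timezone.utc),
     'committer': 'alice', 'repository': 'repo-a', 'lines': 120},
    {'date': datetime(2016, 1, 11, 10, 5, tzinfo=timezone.utc),
     'committer': 'bob', 'repository': 'repo-b', 'lines': 30},
    {'date': datetime(2016, 1, 9, 22, 0, tzinfo=timezone.utc),
     'committer': 'alice', 'repository': 'repo-b', 'lines': 50},
]

aggregate = build_punchcard(sample_commits)
by_committer = build_punchcard(sample_commits, by='committer')

print(aggregate)     # {(0, 10): 150, (5, 22): 50}
print(by_committer)  # {('alice', 0, 10): 120, ('bob', 0, 10): 30, ('alice', 5, 22): 50}
```

With by=None, both Monday-morning commits land in the same cell. Splitting by committer keeps them apart, which is the difference between the rolled-up card and the per-committer view.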
A helper function for plotting the cumulative blame stacked area charts is also in the utilities folder, and an update to gitnoc, the Flask/d3 engineering team management app powered by git-pandas, is in the works. So head to GitHub, grab the master branch, and try it out!