Data-Science
52 posts
Category Encoders now on conda-forge
The category_encoders Python package is now available on conda-forge, making installation easier for Conda users. Learn about the package and feedstock.
Data Science Things Roundup #6
This roundup focuses on calendar visualizations with the D3.js Calendar Heatmap and Bostock's Calendar View, plus insights on coding interviews.
Introducing unified glob-syntax in git-pandas
Explore the unified glob syntax (`include_globs`, `ignore_globs`) in git-pandas v2.0, which offers flexible file pattern specification and improved usability.
Parallelizing cumulative blame in git-pandas with joblib
Boost git-pandas cumulative blame analysis performance with joblib. Parallel processing via multithreading speeds up this costly operation.
When do I work on what?
Use git-pandas to analyze and visualize work patterns across open source vs. closed source projects. Compare commit times with punchcard plots. Learn the code.
Data Science Things Roundup #5
Explore Deep Q-Learning for Space Invaders, insights from Elasticsearch in production, and improved Python package management strategies.
Using survival analysis and git-pandas to estimate code quality
Apply survival analysis with git-pandas to measure code quality in Git repositories by analyzing code longevity and contributor patterns over time.
Git-pandas v1.0.0, or how to check for a stable release
Explore git-pandas v1.0.0, focusing on interface consistency, parameter naming, and API simplification for improved data analysis workflows.
Github.com cumulative blame in 5 lines of python
Visualize your GitHub repository growth over time using the git-pandas Python library and GitHubProfile class, all in just a few lines of code.
How to Write Comprehensions and Alienate People
A tongue-in-cheek guide to writing Python comprehensions that will make your colleagues question their life choices and your sanity.