Data-Science
50 posts
Ripyr: sampled metrics on datasets using Python's asyncio
An introduction to ripyr, a Python library for streaming through large datasets and parsing basic metrics using asyncio and type hinting.
Category Encoders v1.2.5 Release
Category Encoders v1.2.5 brings community updates including stable binary/BaseN encoding, new leave-one-out encoding, and pandas compatibility fixes.
Data Science Things Roundup #11
A collection of interesting data science articles and projects, including SEC keynotes, Bayesian inference, and visualization tools.
Category Encoders v1.2.4 Release
Category Encoders v1.2.4 is out! Includes pandas categorical type support, improved missing value handling, better error messages, BaseN fixes, and docs.
Data Science Things Roundup #10
A curated collection of data science articles and tools exploring network analysis, StashPy for log processing, and Bayesian survival analysis techniques.
Data Science Things Roundup #9
Data Science Things Roundup #9: Highlighting Pedro Domingos' ML paper, Spyre (Shiny for Python), and BetaGo, an AlphaGo-inspired Go bot framework.
Data Science Things Roundup #8
Data Science Things Roundup #8: Dive into LIME for model interpretation, sklearn-expertsys for interpretable classifiers, and the value of nearest neighbors.
BaseN Encoding Grid Search in Category Encoders
Explore category_encoders' BaseN encoder for representing categorical data. Learn how to use scikit-learn's grid search to find the optimal encoding base.
Category Encoders accepted into scikit-learn-contrib
Category Encoders, a Python library for encoding categorical variables, has been accepted into the scikit-learn-contrib ecosystem. A project milestone!
Data Science Things Roundup #7
Data Science Things Roundup #7: Python-focused edition featuring Intel's Python Distribution, Go-Python for extensions, and PyFilesystem for unified access.