Data Science Things Roundup #4
Time for another edition of the data science things roundup, where I round up some data science things for y’all. Today’s collection is uncharacteristically R heavy. It’s usually pretty Python and machine learning heavy, so if you find something you like here, be sure to check out previous editions as well. Without further ado:
Scikit-Learn Groups
Scikit-learn groups (skl-groups) is a Python library for operating on sets of features (aka “groups”). It can be particularly useful for less-structured data and for locally interesting subsets like “galaxy clusters that are made up of individual galaxies” or “a set of tweets from a given area and time”. Rather than constructing one long feature vector for the set, you can use a bag of feature vectors, one per member of the group. Super interesting library. Check it out here.
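To make the "bag of features" idea a bit more concrete, here is a minimal sketch in plain NumPy. It does not use the skl-groups API at all; the function names, the RBF kernel choice, and the toy data are my own assumptions. Each group is just an array with one row per member, and two groups are compared by averaging a kernel over every pair of members (a mean map kernel), so groups of different sizes are no problem.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF kernel between rows of a (n, d) and rows of b (m, d)."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mean_map_kernel(bag_a, bag_b, gamma=1.0):
    """Similarity between two bags: the average kernel value over all member pairs."""
    return rbf_kernel(bag_a, bag_b, gamma).mean()


rng = np.random.default_rng(0)
# Three hypothetical groups with different numbers of members, 2 features each.
bags = [
    rng.normal(loc=0.0, size=(30, 2)),
    rng.normal(loc=0.0, size=(45, 2)),
    rng.normal(loc=3.0, size=(20, 2)),
]

# Group-by-group similarity matrix, usable by any kernel method.
gram = np.array([[mean_map_kernel(a, b) for b in bags] for a in bags])
print(np.round(gram, 3))
```

The first two bags are drawn from the same distribution, so they should come out more similar to each other than either is to the third. The resulting matrix could then be handed to something like an SVM with a precomputed kernel.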
Detecting Events with Markov Modulated Poisson Processes
This tutorial goes through the usage of Markov-modulated Poisson processes for unsupervised anomaly detection in noisy time series data. It’s got some nice-looking plots and seems to perform well on noisy data with rare anomalies, which is promising. It also comes with the corresponding R code. Check it out here.
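If you want a feel for the idea before diving into the tutorial, here is a rough Python sketch (not the tutorial's R code). It assumes a stripped-down, discrete-time version of the model: a hidden two-state Markov chain ("normal" and "event"), a constant Poisson rate per state, and parameters that are already known. The rates, transition matrix, and function names are all made up for illustration; a real analysis would have to estimate them from the data.

```python
import numpy as np


def simulate_mmpp(n_steps, rates, transition, rng):
    """Simulate hidden states and Poisson counts for a discrete-time MMPP."""
    states = np.zeros(n_steps, dtype=int)  # the chain starts in state 0 ("normal")
    for t in range(1, n_steps):
        states[t] = rng.choice(len(rates), p=transition[states[t - 1]])
    counts = rng.poisson(rates[states])
    return states, counts


def posterior_states(counts, rates, transition, initial):
    """Scaled forward-backward pass; returns P(state_t | all counts)."""
    # Poisson log-likelihood per state. The log(k!) term is the same for
    # every state at a given time step, so it cancels when normalizing.
    log_emit = counts[:, None] * np.log(rates)[None, :] - rates[None, :]
    emit = np.exp(log_emit - log_emit.max(axis=1, keepdims=True))
    n_steps, n_states = emit.shape
    alpha = np.zeros((n_steps, n_states))
    beta = np.ones((n_steps, n_states))
    alpha[0] = initial * emit[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, n_steps):
        alpha[t] = (alpha[t - 1] @ transition) * emit[t]
        alpha[t] /= alpha[t].sum()
    for t in range(n_steps - 2, -1, -1):
        beta[t] = transition @ (emit[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    posterior = alpha * beta
    return posterior / posterior.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
rates = np.array([2.0, 12.0])  # assumed counts per step: normal vs. event
transition = np.array([[0.98, 0.02], [0.20, 0.80]])  # events are rare and short
initial = np.array([0.9, 0.1])

states, counts = simulate_mmpp(500, rates, transition, rng)
posterior = posterior_states(counts, rates, transition, initial)
flagged = posterior[:, 1] > 0.5  # time steps the model thinks are events
accuracy = float(np.mean(flagged == (states == 1)))
print(f"steps flagged: {flagged.sum()}, agreement with truth: {accuracy:.3f}")
```

The nice part of the Markov structure is that a single noisy count is not enough to flag an event; neighboring time steps have to agree, which is what makes this kind of model attractive for rare anomalies in noisy counts.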
Stochastic Dummy Boosting
DBoost is an implementation of stochastic dummy boosting, wherein the weak learner encodes categorical variables by hash as it learns, which adds an extra layer of randomness to the process. It’s a pretty short piece of code and is well commented, so even if you don’t have an immediate use for it, it’s worth a look. Check it out here.
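Here is one way to read that description, as a toy Python sketch. To be clear, this is my own interpretation and not DBoost's code: every name, the squared-error loss, and the per-bucket-mean weak learner are assumptions. In each boosting round, the categorical values are re-hashed into a handful of buckets with a fresh random salt (the "dummies"), and the weak learner is just the mean residual in each bucket.

```python
import hashlib

import numpy as np


def hash_bucket(value, salt, n_buckets):
    """Map a categorical value to a bucket; a new salt gives a new random encoding."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def fit_hashed_boosting(categories, y, n_rounds=100, n_buckets=4,
                        learning_rate=0.3, seed=0):
    """Boost on squared error; each round uses a freshly salted hash encoding."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    base = float(y.mean())
    pred = np.full(len(y), base)
    rounds = []
    for _ in range(n_rounds):
        salt = int(rng.integers(0, 2**31 - 1))
        buckets = np.array([hash_bucket(c, salt, n_buckets) for c in categories])
        residual = y - pred
        # Weak learner: the mean residual within each hash bucket.
        sums = np.bincount(buckets, weights=residual, minlength=n_buckets)
        sizes = np.bincount(buckets, minlength=n_buckets)
        step = learning_rate * sums / np.maximum(sizes, 1)
        pred += step[buckets]
        rounds.append((salt, step))
    return {"base": base, "n_buckets": n_buckets, "rounds": rounds}


def predict_hashed_boosting(model, categories):
    """Replay every round's hash encoding and add up the bucket steps."""
    pred = np.full(len(categories), model["base"])
    for salt, step in model["rounds"]:
        buckets = np.array(
            [hash_bucket(c, salt, model["n_buckets"]) for c in categories]
        )
        pred += step[buckets]
    return pred


rng = np.random.default_rng(1)
levels = np.array(list("abcdefgh"))  # a toy categorical variable with 8 levels
codes = rng.integers(0, len(levels), size=400)
categories = levels[codes]
y = codes.astype(float) + rng.normal(scale=0.1, size=400)

model = fit_hashed_boosting(categories, y)
pred = predict_hashed_boosting(model, categories)
train_mse = float(np.mean((y - pred) ** 2))
baseline_mse = float(np.var(y))  # error of always predicting the mean
print(f"baseline MSE: {baseline_mse:.3f}, boosted MSE: {train_mse:.3f}")
```

No single round can tell all eight levels apart with only four buckets, but because the collisions change every round, the ensemble can still separate them over time. A salted cryptographic hash is used here rather than Python's built-in `hash` so that the encoding is reproducible across runs.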
Did you enjoy this? Check out the previous editions.