Data Science Things Roundup #8

Time again for the Tuesday-regular data science things roundup. I've definitely settled into a repeatable routine here (three new links every Tuesday at 10:00 EST), but I may play with the format some in the future, so if you have any feedback (good, bad, or indifferent), leave a comment below. In previous editions we talked about lower-level tooling and dataviz, but this week we swing back towards machine learning and algorithms a little bit.

LIME - Local Interpretable Model-Agnostic Explanations

LIME is a very good idea. Rather than being a model itself, it perturbs the input around a single prediction and fits a simple, interpretable model (typically a sparse linear regression) to the black box's outputs, weighted by proximity to the original instance. The result is a locally faithful explanation of an otherwise opaque, possibly highly non-linear classifier. Marco Tulio Ribeiro does a really good job of explaining it in simple terms, and I highly recommend giving his overview a read. Check it out here.
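To give a concrete feel for it, here's a minimal sketch of using LIME on tabular data. It assumes the `lime` Python package and a scikit-learn classifier; the dataset and parameter choices are just illustrative, and the API may differ slightly across versions.

```python
# Minimal LIME sketch: explain one prediction from a black-box classifier.
# Assumes the `lime` package (pip install lime); API may vary by version.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_iris()
clf = RandomForestClassifier(n_estimators=100).fit(data.data, data.target)

# The explainer learns feature statistics from the training data so it can
# generate realistic perturbations around the instance being explained.
explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
)

# LIME perturbs the instance, queries the black box, and fits a locally
# weighted linear model -- the weights of that model are the explanation.
exp = explainer.explain_instance(data.data[0], clf.predict_proba, num_features=2)
print(exp.as_list())  # [(feature condition, local weight), ...]
```

The nice part is that nothing about the classifier has to change: LIME only needs a `predict_proba`-style function to query.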

Sklearn-Expertsys

In a similar vein, Tamas Madl has a library up on GitHub called sklearn-expertsys. The idea is to build fully interpretable classifiers with familiar sklearn syntax using the Bayesian Rule List classifier. The key is to get not only a valid prediction, but a human-readable (and, more importantly, human-understandable) description of what the model has actually learned. Check it out here.
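Usage looks roughly like the sketch below, based on my reading of the repo's README. Treat the class and argument names (`RuleListClassifier`, `class1label`, `feature_labels`) as assumptions on my part; check the repo for the current interface.

```python
# Rough sketch of the sklearn-style interface from sklearn-expertsys.
# Names here are assumptions based on the repo's README and may not match
# the current state of the library.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from RuleListClassifier import RuleListClassifier  # from sklearn-expertsys

data = load_breast_cancer()
Xtrain, Xtest, ytrain, ytest = train_test_split(data.data, data.target)

# Fit a Bayesian Rule List; feature labels make the learned rules readable.
clf = RuleListClassifier(class1label="benign", verbose=False)
clf.fit(Xtrain, ytrain, feature_labels=list(data.feature_names))

print(clf.score(Xtest, ytest))
# Printing the fitted model shows the learned if/then rule list --
# the human-readable artifact that is the whole point.
print(clf)
```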

Have you tried using ‘Nearest Neighbor Search’?

It’s pretty easy for nerds (myself included) to get swayed by the allure of complex and intricate solutions. In this post, Gregory J. Stein relays the advice of a former mentor and professor who advocated trying the simpler methods first. In a world where someone (maybe you) has to maintain the systems you are building in the lab, this advice is pretty golden. Check it out here.
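In that spirit, the simple baseline is often just a few lines. Here's a minimal sketch of a nearest-neighbor baseline with scikit-learn; the dataset and `n_neighbors` value are just illustrative.

```python
# Minimal nearest-neighbor baseline: worth trying before anything fancy.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# k-NN has no real training step: predictions just look up the k closest
# points in the training set and take a vote.
baseline = KNeighborsClassifier(n_neighbors=5)
print(cross_val_score(baseline, X, y, cv=5).mean())
```

If the intricate model can't clearly beat a baseline like this, the maintenance cost probably isn't worth it.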


Did you enjoy this? Check out previous editions: