Data Science Things Roundup #3

Time again for the 3rd edition of the data science things roundup, where I share a few data science things I’ve come across recently. Check out previous editions here and here.

Self Organizing Maps with TensorFlow

Google’s open sourcing of TensorFlow late last year made a pretty big splash in the machine learning and data science communities, and since then a ton of tutorials, examples and projects have popped up around it. One such example from soon after its release was Sachin Joglekar’s tutorial on creating a self organizing map (SOM). SOMs are interesting as one of the relatively few unsupervised neural network applications, and are a refreshing respite from image classification. Check it out here.
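To give a feel for what a SOM does, here is a minimal sketch in plain NumPy. It is not the TensorFlow code from the tutorial. The function names (train_som, best_matching_unit), the linear decay schedules and the default parameters are all assumptions chosen for illustration.

```python
import numpy as np


def train_som(data, rows=4, cols=4, n_iter=200, lr0=0.5, sigma0=None, seed=0):
    """Train a small self organizing map (illustrative sketch, not tuned)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    dim = data.shape[1]
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0

    # One weight vector per grid node, initialised uniformly in [0, 1).
    weights = rng.random((rows, cols, dim))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )

    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius, floored
        for x in data[rng.permutation(len(data))]:
            # Best matching unit: the node whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood around the BMU, measured on the grid.
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull each node toward x, scaled by its closeness to the BMU.
            weights += lr * h[..., None] * (x - weights)
    return weights


def best_matching_unit(weights, x):
    """Return the (row, col) of the node closest to x."""
    dists = np.linalg.norm(weights - np.asarray(x, dtype=float), axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)


# Toy data: a handful of RGB colours (made up for this example).
colors = np.array([
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0],
])
som = train_som(colors)
print(best_matching_unit(som, [1.0, 0.0, 0.0]))
```

After training, similar inputs should tend to land on nearby grid nodes, which is what makes the map useful for visualising high-dimensional data.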

Density Based Clustering Toolbox

DeBaCl is a Python library for density-based clustering using level-set trees. This is particularly useful for datasets whose clustering behavior differs at different scales. Included in the repo is an in-progress set of example Jupyter notebooks. Check it out here.
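The core idea behind level sets can be sketched in a few lines of NumPy. This is not DeBaCl's API. The helper names (knn_density, level_set_clusters), the k-nearest-neighbor density estimate and the toy data are assumptions made for illustration. The sketch estimates a density at each point, keeps only points above a chosen density level, and treats the connected components of what remains as clusters. A level-set tree records how those components appear and split as the level rises.

```python
import numpy as np


def knn_density(points, k):
    """k-nearest-neighbour density estimate, up to a constant factor.

    Assumes no more than k duplicate points, so the k-th radius is non-zero.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    order = np.argsort(dist, axis=1)
    neighbors = order[:, 1:k + 1]          # column 0 is the point itself
    radius = dist[np.arange(len(points)), neighbors[:, -1]]
    density = k / (len(points) * radius ** points.shape[1])
    return density, neighbors


def level_set_clusters(density, neighbors, level):
    """Label connected components of the kNN graph among points with
    density >= level. Points below the level get the label -1."""
    n = len(density)
    keep = density >= level
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in np.flatnonzero(keep):
        for j in neighbors[i]:
            if keep[j]:
                ri, rj = find(int(i)), find(int(j))
                if ri != rj:
                    parent[rj] = ri

    labels = np.full(n, -1)
    roots = {}
    for i in np.flatnonzero(keep):
        labels[i] = roots.setdefault(find(int(i)), len(roots))
    return labels


# Toy data: two well separated Gaussian blobs (made up for this example).
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(100, 2)),
])
density, neighbors = knn_density(points, k=10)

# Sweep a few density levels and count the clusters found at each one.
for q in (0.1, 0.5, 0.9):
    level = np.quantile(density, q)
    found = level_set_clusters(density, neighbors, level)
    print(q, len(set(found.tolist()) - {-1}))

labels = level_set_clusters(density, neighbors, np.quantile(density, 0.5))
```

Sweeping the level like this is a crude stand-in for building the full tree, but it shows why one fixed threshold can miss structure that only appears at other density scales.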

Protecting a Python Codebase

Python is taking over the data science and machine learning worlds, but once a project is done and it comes time to package, commercialize or distribute Python code, things can get messy. When SaaS doesn’t work out and you have to physically ship code to a customer, it can be hard to protect IP. In this blog post, Mattias Aguirre walks through some of the options available to Python programmers and the pros and cons of each. Check it out here.


