Machine-Learning
30 posts
Joining Scale Venture Partners as Chief Data Scientist
Joining Scale Venture Partners as Chief Data Scientist to build AI infrastructure augmenting investment decisions in venture capital.
Perplexity 101: How Language Models Measure Surprise
A deep dive into perplexity: the metric that tells us how 'surprised' a language model is by text and why it matters for AI detection.
HashingEncoder: Tackling Extreme Cardinality with the Hashing Trick
Explore HashingEncoder: learn how it handles extreme cardinality, its mechanism, ideal use cases, and implementation with the category_encoders library.
BinaryEncoder: The Space-Efficient Alternative to One-Hot Encoding
Explore the BinaryEncoder from category_encoders: a space-efficient alternative to one-hot encoding. Learn how it works, when to use it, and how to implement it.
OrdinalEncoder: When Order Matters in Categorical Data
Explore OrdinalEncoder for categorical data where order matters. Learn how it works, its benefits, and implementation using the category_encoders library.
OneHotEncoder: The Workhorse of Categorical Encoding
A comprehensive guide to OneHotEncoder in category_encoders, exploring its core functionality, advantages, and practical limitations in machine learning.
Data Science Things Roundup #13
Data Science Roundup #13: Exploring IBM Granite for enterprise AI, Mistral AI's Le Chat for Europe, and Open Deep Research for DIY AI tools.
Data Science Things Roundup #12
Data Science Things Roundup #12: Discover hidden gems in data science, featuring ModernBERT, ReaderLM v2, and Cohere's Rerank 3.5. Catch up on what's new!
From Weekend Hack to Core Tool: The category_encoders Journey
Explore category_encoders' journey from a weekend Python experiment to a widely used data science library, now part of scikit-learn-contrib.
Investment Review: Seer.ai
A review of my angel investment in Seer.ai, exploring how they align with my investment thesis and their unique value proposition in AI-powered analytics.