Data-Science
50 posts
Gitpandas v0.0.6: python 2.7, fileowners, file-wise blame and examples
Overview of git-pandas v0.0.6 release, highlighting new features like Python 2.7 support, file-wise blame, file owner determination, and other improvements.
Common Data Pitfalls for Recurring Machine Learning Systems
Explore common data pitfalls in recurring machine learning systems, including new categories, data format changes, sending issues, deduplication, and updates.
CyberLaunch: An Accelerator for Machine Learning Companies
Explore CyberLaunch, Atlanta's accelerator for machine learning and info security startups, its program details, and its impact on the local startup ecosystem.
Data Science Things Roundup #4
Data Science Things Roundup #4: Featuring Scikit-learn groups for feature sets, Markov Modulated Poisson Processes for event detection, and DBoost for boosting.
Beyond One-Hot: An Exploration of Categorical Variables
A deep dive into different methods for encoding categorical variables in machine learning, exploring their benefits and trade-offs.
Data Science vs. Data Engineering
Understanding the fundamental differences between data science and data engineering through the lens of methodology rather than tools.
Data Science Things Roundup #3
Data Science Things Roundup #3: TensorFlow Self-Organizing Maps, DeBaCl for density-based clustering, and options for protecting Python codebases.
Data Science Things Roundup #2
Data Science Things Roundup #2: Featuring Lifelines for survival analysis, Patsy-learn for R-style syntax in scikit-learn, and the HDBSCAN clustering library.
Data Science Things Roundup #1
Data Science Things Roundup #1: Explore Kaggle past solutions, Gooey for simple Python CLIs, and Metric-Learn for optimal distance metrics. First edition!
Artificial Intelligence & Machine Learning
Technical insights and practical experiences in AI/ML from industry roles at JPMC, RTX/UTC, and Predikto, covering theory and real-world applications.