Data Science Things Roundup #13
In this edition of Data Science Things Roundup, I’m sharing three interesting developments from the world of data science that caught my attention recently.
IBM Granite: Enterprise AI with Clear IP Rights
IBM has launched the third generation of its Granite AI language models, with a strong focus on responsible development and clear licensing terms. What sets the release apart is IBM’s approach to IP protection and data rights:
- Clean Data: Implements strict content filtering and quality checks for training data
- Clear IP Terms: Provides standard contractual IP indemnification
- Open Source: Available under Apache 2.0 license, from sub-billion to 34B parameters
This comes at a crucial time, as the industry grapples with training data controversies such as the recent allegations that Meta trained its models on copyrighted books.
Le Chat: European Open Source AI
Mistral AI’s Le Chat emerges as Europe’s answer to US-dominated AI chatbots. Built in Paris, it demonstrates that high-performance AI can thrive outside Silicon Valley while adhering to European values:
- European-First: Built with EU principles around AI and privacy
- Open Source: Transparent development and community involvement
- High Performance: Competitive speed and capabilities
Open Deep Research: DIY Deep Research Tool
Open Deep Research offers an open-source alternative to proprietary deep research tools from OpenAI and Google. Instead of expensive fine-tuned models, it cleverly combines:
- Firecrawl: For web data extraction and search
- Flexible Models: Support for various AI providers including Deepseek
- Modern Stack: Built on Next.js with extensible architecture
This project shows how the open-source community can build sophisticated research tools that rival proprietary alternatives.
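To make the "combine existing pieces" idea concrete, here is a minimal Python sketch of a search, extract, and summarize pipeline with pluggable providers. It is not the project's actual code (the project itself is built on Next.js). The function `deep_research`, the `search`/`extract`/`model` callables, and the stub providers are all hypothetical names invented for illustration. In a real setup, the search and extract callables would wrap a service such as Firecrawl, and the model callable would wrap whichever AI provider you choose.

```python
from typing import Callable, List

# Hypothetical provider interfaces (assumptions, not a real library API):
SearchFn = Callable[[str], List[str]]   # query -> list of result URLs
ExtractFn = Callable[[str], str]        # URL -> extracted page text
ModelFn = Callable[[str], str]          # prompt -> model completion


def deep_research(question: str, search: SearchFn, extract: ExtractFn,
                  model: ModelFn, max_pages: int = 3) -> str:
    """Search the web, extract the top pages, and ask a model to answer."""
    # 1. Find candidate sources and keep only the first few.
    urls = search(question)[:max_pages]
    # 2. Pull the text out of each source and label it with its URL.
    context = "\n\n".join(f"Source: {url}\n{extract(url)}" for url in urls)
    # 3. Hand the question plus the gathered context to the model.
    prompt = f"Question: {question}\n\n{context}\n\nAnswer with citations."
    return model(prompt)


# Stub providers so the sketch runs without network access or API keys.
def fake_search(query: str) -> List[str]:
    return ["https://example.com/a", "https://example.com/b"]


def fake_extract(url: str) -> str:
    return f"content of {url}"


def fake_model(prompt: str) -> str:
    return f"answer based on {prompt.count('Source:')} sources"


report = deep_research("What is Granite?", fake_search, fake_extract, fake_model)
print(report)  # prints "answer based on 2 sources"
```

Because each provider is just a function argument, swapping one model or search backend for another does not touch the pipeline itself, which is roughly the flexibility the bullet list above describes.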