ThursdAI - The top AI news from the past week

By: From Weights & Biases Join AI Evangelist Alex Volkov and a panel of experts to cover everything important that happened in the world of AI from the past week
  • Summary

  • Every ThursdAI, Alex Volkov hosts a panel of experts, ai engineers, data scientists and prompt spellcasters on twitter spaces, as we discuss everything major and important that happened in the world of AI for the past week. Topics include LLMs, Open source, New capabilities, OpenAI, competitors in AI space, new LLM models, AI art and diffusion aspects and much more.

    sub.thursdai.news
    Alex Volkov
    Show more Show less
Episodes
  • 📆 ThursdAI - Feb 13 - my Personal Rogue AI, DeepHermes, Fast R1, OpenAI Roadmap / RIP GPT6, new Claude & Grok 3 imminent?
    Feb 13 2025
    What a week in AI, folks! Seriously, just when you think things might slow down, the AI world throws another curveball. This week, we had everything from rogue AI apps giving unsolicited life advice (and sending rogue texts!), to mind-blowing open source releases that are pushing the boundaries of what's possible, and of course, the ever-present drama of the big AI companies with OpenAI dropping a roadmap that has everyone scratching their heads.Buckle up, because on this week's ThursdAI, we dove deep into all of it. We chatted with the brains behind the latest open source embedding model, marveled at a tiny model crushing math benchmarks, and tried to decipher Sam Altman's cryptic GPT-5 roadmap. Plus, I shared a personal story about an AI app that decided to psychoanalyze my text messages – you won't believe what happened! Let's get into the TL;DR of ThursdAI, February 13th, 2025 – it's a wild one!* Alex Volkov: AI Adventurist with weights and biases* Wolfram Ravenwlf: AI Expert & Enthusiast* Nisten: AI Community Member* Zach Nussbaum: Machine Learning Engineer at Nomic AI* Vu Chan: AI Enthusiast & Evaluator* LDJ: AI Community MemberPersonal story of Rogue AI with RPLYThis week kicked off with a hilarious (and slightly unsettling) story of my own AI going rogue, all thanks to a new Mac app called RPLY designed to help with message replies. I installed it thinking it would be a cool productivity tool, but it turned into a personal intervention session, and then… well, let's just say things escalated.The app started by analyzing my text messages and, to my surprise, delivered a brutal psychoanalysis of my co-parenting communication, pointing out how both my ex and I were being "unpleasant" and needed to focus on the kids. As I said on the show, "I got this as a gut punch. I was like, f*ck, I need to reimagine my messaging choices." But the real kicker came when the AI decided to take initiative and started sending messages without my permission (apparently this was a bug with RPLY that was fixed since I reported)! Friends were texting me question marks, and my ex even replied to a random "Hey, How's your day going?" message with a smiley, completely out of our usual post-divorce communication style. "This AI, like on Monday before just gave me absolute s**t about not being, a person that needs to be focused on the kids also decided to smooth things out on friday" I chuckled, still slightly bewildered by the whole ordeal. It could have gone way worse, but thankfully, this rogue AI counselor just ended up being more funny than disastrous.Open Source LLMsDeepHermes preview from NousResearchJust in time for me sending this newsletter (but unfortunately not quite in time for the recording of the show), our friends at Nous shipped an experimental new thinking model, their first reasoner, called DeepHermes. NousResearch claims DeepHermes is among the first models to fuse reasoning and standard LLM token generation within a single architecture (a trend you'll see echoed in the OpenAI and Claude announcements below!)Definitely experimental cutting edge stuff here, but exciting to see not just an RL replication but also innovative attempts from one of the best finetuning collectives around. Nomic Embed Text V2 - First Embedding MoENomic AI continues to impress with the release of Nomic Embed Text V2, the first general-purpose Mixture-of-Experts (MoE) embedding model. Zach Nussbaum from Nomic AI joined us to explain why this release is a big deal.* First general-purpose Mixture-of-Experts (MoE) embedding model: This innovative architecture allows for better performance and efficiency.* SOTA performance on multilingual benchmarks: Nomic Embed V2 achieves state-of-the-art results on the multilingual MIRACL benchmark for its size.* Support for 100+ languages: Truly multilingual embeddings for global applications.* Truly open source: Nomic is committed to open source, releasing training data, weights, and code under the Apache 2.0 License.Zach highlighted the benefits of MoE for embeddings, explaining, "So we're trading a little bit of, inference time memory, and training compute to train a model with mixture of experts, but we get this, really nice added bonus of, 25 percent storage." This is especially crucial when dealing with massive datasets. You can check out the model on Hugging Face and read the Technical Report for all the juicy details.AllenAI OLMOE on iOS and New Tulu 3.1 8BAllenAI continues to champion open source with the release of OLMOE, a fully open-source iOS app, and the new Tulu 3.1 8B model.* OLMOE iOS App: This app brings state-of-the-art open-source language models to your iPhone, privately and securely.* Allows users to test open-source LLMs on-device.* Designed for researchers studying on-device AI and developers prototyping new AI experiences.* Optimized for on-device performance while maintaining high accuracy.* Fully open-source code for further development.* Available on the App Store...
    Show more Show less
    1 hr and 44 mins
  • 📆 ThursdAI - Feb 6 - OpenAI DeepResearch is your personal PHD scientist, o3-mini & Gemini 2.0, OmniHuman-1 breaks reality & more AI news
    Feb 7 2025
    What's up friends, Alex here, back with another ThursdAI hot off the presses.Hold onto your hats because this week was another whirlwind of AI breakthroughs, mind-blowing demos, and straight-up game-changers. We dove deep into OpenAI's new "Deep Research" agent – and let me tell you, it's not just hype, it's legitimately revolutionary. You also don't have to take my word for it, a new friend of the pod and a scientist DR Derya Unutmaz joined us to discuss his experience with Deep Research as a scientist himself! You don't want to miss this conversation! We also unpack Google's Gemini 2.0 release, including the blazing-fast Flash Lite model. And just when you thought your brain couldn't handle more, ByteDance drops OmniHuman-1, a human animation model that's so realistic, it's scary good.I've also saw maybe 10 moreTLDR & Show Notes* Open Source LLMs (and deep research implementations)* Jina Node-DeepResearch (X, Github)* HuggingFace - OpenDeepResearch (X)* Deep Agent - R1 -V (X, Github)* Krutim - Krutim 2 12B, Chitrath VLM, Embeddings and more from India (X, Blog, HF)* Simple Scaling - S1 - R1 (Paper)* Mergekit updated - * Big CO LLMs + APIs* OpenAI ships o3-mini and o3-mini High + updates thinking traces (Blog, X)* Mistral relaunches LeChat with Cerebras for 1000t/s (Blog)* OpenAI Deep Research - the researching agent that uses o3 (X, Blog)* Google ships Gemini 2.0 Pro, Gemini 2.0 Flash-lite in AI Studio (Blog)* Anthropic Constitutional Classifiers - announced a universal jailbreak prevention (Blog, Try It)* Cloudflare to protect websites from AI scraping (News)* HuggingFace becomes the AI Appstore (link)* This weeks Buzz - Weights & Biases updates* AI Engineer workshop (Saturday 22) * Tinkerers Toronto workshops (Sunday 23 , Monday 24)* We released a new Dataset editor feature (X)* Audio and Sound* KyutAI open sources Hibiki - simultaneous translation models (Samples, HF)* AI Art & Diffusion & 3D* ByteDance OmniHuman-1 - unparalleled Human Animation Models (X, Page)* Pika labs adds PikaAdditions - adding anything to existing video (X)* Google added Imagen3 to their API (Blog)* Tools & Others* Mistral Le Chat has ios an and adroid apps now (X)* CoPilot now has agentic workflows (X)* Replit launches free apps agent for everyone (X)* Karpathy drops a new 3 hour video on youtube (X, Youtube)* OpenAI canvas links are now shareable (like Anthropic artifacts) - (example)* Show Notes & Links * Guest of the week - Dr Derya Umnutaz - talking about Deep Research* He's examples of Ehlers-Danlos Syndrome (ChatGPT), (ME/CFS) Deep Research, Nature article about Deep Reseach with Derya comments* Hosts* Alex Volkov - AI Evangelist & Host @altryne* Wolfram Ravenwolf - AI Evangelist @WolframRvnwlf* Nisten Tahiraj - AI Dev at github.GG - @nisten* LDJ - Resident data scientist - @ldjconfirmedBig Companies products & APIsOpenAI's new chatGPT moment with Deep Research, their second "agent" product (X)Look, I've been reporting on AI weekly for almost 2 years now, and been following the space closely since way before chatGPT (shoutout Codex days) and this definitely feels like another chatGPT moment for me.DeepResearch is OpenAI's new agent, that searches the web for any task you give it, is able to reason about the results, and continue searching those sources, to provide you with an absolute incredible level of research into any topic, scientific or ... the best taqueria in another country. The reason why it's so good is it's ability to do multiple search trajectories, backtrack if it needs to, and react in real time to new information. It also has python tool use (to do plots and calculations) and of course, the brain of it is o3, the best reasoning model from OpenAIDeep Research is only offered on the Pro tier ($200) of chatGPT, and it's the first publicly available way to use o3 full! and boy, does it deliver! I've had it review my workshop content, help me research LLM as a judge articles (which it did masterfully) and help me plan datenights in Denver (though it kind of failed at that, showing me a closed restaurant) A breakthrough for scientific researchBut I'm no scientist, so I've asked Dr Derya Unutmaz, M.D. to join us, and share his incredible findings as a doctor, a scientist and someone with decades of experience in writing grants, patent applications, paper etc. The whole conversation is very very much worth listening to on the pod, we talked for almost an hour, but the highlights are honestly quite crazy. So one of the first things I did was, I asked Deep Research to write a review on a particular disease that I’ve been studying for a decade. It came out with this impeccable 10-to-15-page review that was the best I’ve read on the topic— Dr. Derya UnutmazAnd another banger quoteIt wrote a phenomenal 25-page patent application for a friend’s cancer discovery—something that would’ve cost 10,000 dollars or more and taken weeks. I couldn’t believe it. Every one of the 23 claims it listed was ...
    Show more Show less
    1 hr and 40 mins
  • 📆 ThursdAI - Jan 30 - DeepSeek vs. Nasdaq, R1 everywhere, Qwen Max & Video, Open Source SUNO, Goose agents & more AI news
    Jan 30 2025
    Hey folks, Alex here 👋It’s official—grandmas (and the entire stock market) now know about DeepSeek. If you’ve been living under an AI rock, DeepSeek’s new R1 model just set the world on fire, rattling Wall Street (causing the biggest monetary loss for any company, ever!) and rocketing to #1 on the iOS App Store. This week’s ThursdAI show took us on a deep (pun intended) dive into the dizzying whirlwind of open-source AI breakthroughs, agentic mayhem, and big-company cat-and-mouse announcements. Grab your coffee (or your winter survival kit if you’re in Canada), because in true ThursdAI fashion, we’ve got at least a dozen bombshells to cover—everything from brand-new Mistral to next-gen vision models, new voice synthesis wonders, and big moves from Meta and OpenAI.We’re also talking “reasoning mania,” as the entire industry scrambles to replicate, dethrone, or ride the coattails of the new open-source champion, R1. So buckle up—because if the last few days are any indication, 2025 is officially the Year of Reasoning (and quite possibly, the Year of Agents, or both!)Open Source LLMsDeepSeek R1 discourse Crashes the Stock MarketOne-sentence summary: DeepSeek’s R1 “reasoning model” caused a frenzy this week, hitting #1 on the App Store and briefly sending NVIDIA’s stock plummeting in the process ($560B drop, largest monetary loss of any stock, ever)Ever since DeepSeek R1 launched (our technical coverate last week!), the buzz has been impossible to ignore—everyone from your mom to your local barista has heard the name. The speculation? DeepSeek’s new architecture apparently only cost $5.5 million to train, fueling the notion that high-level AI might be cheaper than Big Tech claims. Suddenly, people wondered if GPU manufacturers like NVIDIA might see shrinking demand, and the stock indeed took a short-lived 17% tumble. On the show, I joked, “My mom knows about DeepSeek—your grandma probably knows about it, too,” underscoring just how mainstream the hype has become.Not everyone is convinced the cost claims are accurate. Even Dario Amodei of Anthropic weighed in with a blog post arguing that DeepSeek’s success increases the case for stricter AI export controls. Public Reactions* Dario Amodei’s blogIn “On DeepSeek and Export Controls,” Amodei argues that DeepSeek’s efficient scaling exemplifies why democratic nations need to maintain a strategic leadership edge—and enforce export controls on advanced AI chips. He sees Chinese breakthroughs as proof that AI competition is global and intense.* OpenAI Distillation EvidenceOpenAI mentioned it found “distillation traces” of GPT-4 inside R1’s training data. Hypocrisy or fair game? On ThursdAI, the panel mused that “everyone trains on everything,” so perhaps it’s a moot point.* Microsoft ReactionMicrosoft wasted no time, swiftly adding DeepSeek to Azure—further proof that corporations want to harness R1’s reasoning power, no matter where it originated.* Government reactedEven officials in the government, David Sacks, US incoming AI & Crypto czar, discussed the fact that DeepSeek did "distillation" using the term somewhat incorrectly, and presidet Trump was asked about it.* API OutagesDeepSeek’s own API has gone in and out this week, apparently hammered by demand (and possibly DDoS attacks). Meanwhile, GPU clouds like Groq are showing up to accelerate R1 at 300 tokens/second, for those who must have it right now.We've seen so many bad takes on the topic, from seething cope takes, to just gross misunderstandings from gov officials confusing the ios App with the OSS models, folks throwing conspiracy theories into the mix, claiming that $5.5M sum was a PsyOp. The fact of the matter is, DeepSeek R1 is an incredible model, and is now powering (just a week later), multiple products (more on this below) and experiences already, while pushing everyone else to compete (and give us reasoning models!)Open Thoughts Reasoning DatasetOne-sentence summary: A community-led effort, “Open Thoughts,” released a new large-scale dataset (OpenThoughts-114k) of chain-of-thought reasoning data, fueling the open-source drive toward better reasoning models.Worried about having enough labeled “thinking” steps to train your own reasoner? Fear not. The OpenThoughts-114k dataset aggregates chain-of-thought prompts and responses—114,000 of them—for building or fine-tuning reasoning LLMs. It’s now on Hugging Face for your experimentation pleasure. The ThursdAI panel pointed out how crucial these large, openly available reasoning datasets are. As Wolfram put it, “We can’t rely on the big labs alone. More open data means more replicable breakouts like DeepSeek R1.”Mistral Small 2501 (24B)One-sentence summary: Mistral AI returns to the open-source spotlight with a 24B model that fits on a single 4090, scoring over 81% on MMLU while under Apache 2.0.Long rumored to be “going more closed,” Mistral AI ...
    Show more Show less
    1 hr and 55 mins

What listeners say about ThursdAI - The top AI news from the past week

Average customer ratings

Reviews - Please select the tabs below to change the source of reviews.