AI/LLM Misconceptions Reading List
For Radio Show: "Emergent AI Technologies"
Created: 2026-02-09
1. Tokenization (The "Strawberry" Problem)
- Why LLMs Can't Count the R's in 'Strawberry' - Arbisoft - Clear explainer on how tokenization breaks words into chunks like "st", "raw", "berry" (see the sketch after this list)
- Can modern LLMs count the b's in "blueberry"? - Max Woolf - Shows 2025-2026 models are overcoming this limitation
- Signs of Tokenization Awareness in LLMs - Ekaterina Kornilitsina, Medium (Jan 2026) - Modern LLMs developing tokenization awareness
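You can see the effect directly with OpenAI's open-source tiktoken tokenizer. A minimal sketch; the exact splits vary by tokenizer and model, so treat the output as illustrative:

```python
# Why letter-counting is hard: the model receives token IDs, not characters.
# Requires: pip install tiktoken. Splits vary by tokenizer; output is illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-era BPE vocabulary

for word in ["strawberry", "blueberry"]:
    pieces = [enc.decode([t]) for t in enc.encode(word)]
    # The model sees these chunks as opaque IDs, so "count the r's" demands
    # reasoning about characters it was never directly shown.
    print(f"{word!r} -> {pieces} (actual r-count: {word.count('r')})")
```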
2. Math/Computation Limitations
- Why LLMs Are Bad at Math - Reach Capital - LLMs predict plausible text rather than compute answers; they lack the working memory for multi-step calculations
- Why AI Struggles with Basic Math - AEI - How "87439" gets tokenized inconsistently, breaking place value
- Why LLMs Fail at Math & The Neuro-Symbolic AI Solution - Arsturn - Proposes integrating symbolic computing systems (toy sketch after this list)
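The neuro-symbolic idea is easier to see in code. A toy sketch, not any vendor's implementation: the language model would extract the arithmetic expression, and a small symbolic evaluator (here, a restricted AST walker) computes the exact answer instead of predicting it token by token.

```python
# Toy neuro-symbolic routing: a symbolic evaluator does the arithmetic an
# LLM would otherwise "predict". Hypothetical simplification, not a product API.
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Exactly evaluate a +-*/ expression, e.g. one extracted from a prompt."""
    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("87439 * 12 + 7"))  # 1049275, computed, not predicted
```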
3. Hallucination (Confidently Wrong)
- Why language models hallucinate - OpenAI - Trained to guess, penalized for saying "I don't know"
- AI hallucinates because it's trained to fake answers - Science (AAAS) - Models use 34% more confident language when WRONG
- It's 2026. Why Are LLMs Still Hallucinating? - Duke University - "Sounding good far more important than being correct"
- AI Hallucination Report 2026 - AllAboutAI - Comprehensive stats on hallucination rates across models
4. Real-World Failures (Great Radio Stories)
- California fines lawyer over ChatGPT fabrications - $10K fine; 21 of 23 cited cases were fake; 486 documented cases worldwide
- As more lawyers fall for AI hallucinations - Cronkite/PBS - Judges issued hundreds of decisions addressing AI hallucinations in 2025
- The Biggest AI Fails of 2025 - Taco Bell AI ordering 18,000 cups of water, Tesla FSD crashes, $440K Australian report with hallucinated sources
- 26 Biggest AI Controversies - xAI exposing 300K private Grok conversations, McDonald's McHire with password "123456"
5. Anthropomorphism ("AI is Thinking")
- Anthropomorphic conversational agents - PNAS - 2/3 of Americans think ChatGPT might be conscious; anthropomorphic attributions up 34% in 2025
- Thinking beyond the anthropomorphic paradigm - ArXiv (Feb 2026) - Anthropomorphism hinders accurate understanding
- Stop Talking about AI Like It Is Human - EPIC - Why anthropomorphic language is misleading and dangerous
6. The Stochastic Parrot Debate
- From Stochastic Parrots to Digital Intelligence - Wiley - Evolution of how we view LLMs, recognizing emergent capabilities
- LLMs still lag ~40% behind humans on physical concepts - ArXiv (Feb 2026) - Supporting the "just pattern matching" view
- LLMs are Not Stochastic Parrots - Counter-argument: GPT-4 scoring 90th percentile on Bar Exam, 93% on MATH Olympiad
7. Emergent Abilities
- Emergent Abilities in LLMs: A Survey - ArXiv (Mar 2026) - Capabilities arising suddenly and unpredictably at scale
- Breaking Myths in LLM scaling - ScienceDirect - Some "emergent" behaviors may be measurement artifacts
- Examining Emergent Abilities - Stanford HAI - Smoother metrics show gradual improvements, not sudden leaps (toy illustration after this list)
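The measurement-artifact argument can be reproduced with one line of probability. Suppose a model gets each digit of a 5-digit answer right independently with probability p, and p improves smoothly with scale; an all-or-nothing exact-match metric still looks like a sudden jump. A toy model in the spirit of the mirage argument, not the paper's actual data:

```python
# Toy model: per-digit accuracy p improves smoothly, but exact-match (p^5)
# stays near zero until p is high, which reads as sudden "emergence".
DIGITS = 5

for p in [0.3, 0.5, 0.7, 0.9, 0.99]:
    exact = p ** DIGITS   # all-or-nothing metric: looks like a sharp jump
    per_digit = p         # partial-credit metric: smooth, gradual line
    print(f"p={p:.2f}  per-digit={per_digit:.2f}  exact-match={exact:.3f}")
```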
8. Context Windows & Memory
- Your 1M+ Context Window LLM Is Less Powerful Than You Think - Models can track only 5-10 variables before degrading to random guessing
- Understanding LLM performance degradation - Why models "forget" what was said at the beginning of long conversations
- LLM Chat History Summarization Guide - Mem0 - Practical solutions to memory limitations (generic sketch after this list)
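The practical pattern the Mem0 guide covers can be sketched generically: keep the most recent turns verbatim and fold everything older into a running summary. The summarize() stub below is a hypothetical stand-in for an LLM call, not Mem0's actual API:

```python
# Generic summarize-and-truncate pattern for long chat histories.
# summarize() is a hypothetical placeholder for an LLM summarization call.
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}

def summarize(messages: List[Message]) -> str:
    # Placeholder: a real system would ask a model to condense these turns.
    return " | ".join(m["content"][:40] for m in messages)

def compact_history(history: List[Message], keep_recent: int = 6) -> List[Message]:
    """Replace old turns with one summary message; keep recent turns verbatim."""
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    summary = {"role": "system",
               "content": "Summary of earlier conversation: " + summarize(older)}
    return [summary] + recent
```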
9. Prompt Engineering (Why "Think Step by Step" Works)
- Understanding Reasoning LLMs - Sebastian Raschka, PhD - Chain-of-thought unlocks latent capabilities
- The Ultimate Guide to LLM Reasoning - CoT more than doubles performance on math problems
- Chain-of-Thought Prompting - Only works reliably with ~100B+ parameter models; smaller models often produce worse results with CoT (prompt sketch after this list)
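The trigger itself is just text. A sketch of the two prompt styles; the phrase "Let's think step by step" is the zero-shot chain-of-thought cue from Kojima et al. (2022), and everything else here is illustrative:

```python
# Direct prompting vs. zero-shot chain-of-thought. Prompt strings only;
# the model call itself is omitted.
question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")

direct_prompt = question  # models often pattern-match the wrong "$0.10"

cot_prompt = question + "\nLet's think step by step."  # elicits intermediate reasoning
```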
10. Energy/Environmental Costs
- Generative AI's Environmental Impact - MIT - AI data centers projected to rank 5th globally in energy (between Japan and Russia)
- We did the math on AI's energy footprint - MIT Tech Review - 60% from fossil fuels; shocking water usage stats
- AI Environment Statistics 2026 - AllAboutAI - AI draining 731-1,125M cubic meters of water annually
11. Agents vs. Chatbots (The 2026 Shift)
- 2025 Was Chatbots. 2026 Is Agents. - "Chatbots talk to you, agents do work for you"
- AI Agents vs Chatbots: The 2026 Guide - Generative AI is "read-only", agentic AI is "read-write" (loop sketch after this list)
- Agentic AI Explained - Agent market growing at a 45% CAGR vs. 23% for chatbots
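The read-only/read-write split is clearest as a loop: a chatbot returns one reply, while an agent alternates model calls with tool calls that act on the world. A minimal hypothetical sketch; llm() and the tool registry are stand-ins, not a real SDK:

```python
# Minimal agent loop: model proposes, tools act, observations feed back.
# llm() and TOOLS are hypothetical stand-ins for a real model and real tools.
def llm(context: str) -> dict:
    # A real model would choose between a final answer and a tool request
    # such as {"tool": "search", "args": {"query": "..."}}. Stubbed here.
    return {"answer": f"(reply based on: {context[:40]}...)"}

TOOLS = {
    "search": lambda query: f"top results for {query!r}",  # read-only tool
}

def run_agent(task: str, max_steps: int = 5) -> str:
    context = task
    for _ in range(max_steps):
        step = llm(context)
        if "answer" in step:                               # chatbot behavior: reply, stop
            return step["answer"]
        observation = TOOLS[step["tool"]](**step["args"])  # agent behavior: act
        context += f"\n{step['tool']} -> {observation}"    # observe, then loop
    return "stopped: step limit reached"

print(run_agent("Find the 2026 agent market CAGR"))
```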
12. Multimodal AI
- Visual cognition in multimodal LLMs - Nature - Scaling improves perception but not reasoning; even advanced models fail at simple counting
- Will multimodal LLMs achieve deep understanding? - Frontiers - Remain detached from interactive learning
- Compare Multimodal AI Models on Visual Reasoning - AIMultiple 2026 - Fall short on causal reasoning and intuitive psychology
13. Training vs. Learning
- 5 huge AI misconceptions to drop in 2026 - Tom's Guide - Bias, accuracy, data privacy myths
- AI models collapse when trained on AI-generated data - Nature - "Model collapse" where rare patterns disappear (toy simulation after this list)
- The State of LLMs 2025 - Sebastian Raschka - "LLMs stopped getting smarter by training and started getting smarter by thinking"
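The Nature result has a few-line toy version: fit a Gaussian to data, sample new "training data" from the fit, refit, repeat. With small samples the fitted spread tends to drift downward generation by generation, so rare tail values vanish first. A toy illustration of the mechanism, not the paper's experiment:

```python
# Toy "model collapse": each generation is trained only on samples from the
# previous generation's model. Variance tends to drift down, killing the tails.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(0.0, 1.0, size=30)  # generation 0: "real" data

for gen in range(16):
    mu, sigma = data.mean(), data.std()
    if gen % 3 == 0:
        print(f"gen {gen:2d}: mean={mu:+.2f}  std={sigma:.2f}")
    data = rng.normal(mu, sigma, size=30)  # next model sees only synthetic data
```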
14. How Researchers Study LLMs
- Treating LLMs like an alien autopsy - MIT Tech Review (Jan 2026) - "So vast and complicated that nobody quite understands what they are"
- Mechanistic Interpretability: Breakthrough Tech 2026 - Anthropic's work opening the black box
- 2025: The year in LLMs - Simon Willison - "Trained to produce statistically likely answers, not to assess their own confidence"
15. Podcast Resources
- Latent Space Podcast - Swyx & Alessio Fanelli - Deep technical coverage
- Practical AI - Accessible to general audiences; good "What mattered in 2025" episode
- TWIML AI Podcast - Researcher interviews since 2016
Top Radio Hooks (Best Audience Engagement)
- Taco Bell AI ordering 18,000 cups of water - Funny, relatable failure
- Lawyers citing 21 fake court cases - Serious real-world consequences
- 34% more confident language when wrong - Counterintuitive and alarming
- AI data centers rank 5th globally in energy (between Japan and Russia) - Shocking scale
- 2/3 of Americans think ChatGPT might be conscious - Audience self-reflection moment
- "Strawberry" has how many R's? - Interactive audience participation
- Million-token context but only tracks 5-10 variables - "Bigger isn't always better" angle