# AI/LLM Misconceptions Reading List
## For Radio Show: "Emergent AI Technologies"
**Created:** 2026-02-09

---
## 1. Tokenization (The "Strawberry" Problem)
- **[Why LLMs Can't Count the R's in 'Strawberry'](https://arbisoft.com/blogs/why-ll-ms-can-t-count-the-r-s-in-strawberry-and-what-it-teaches-us)** - Arbisoft - Clear explainer on how tokenization breaks words into chunks like "st", "raw", "berry"
- **[Can modern LLMs count the b's in "blueberry"?](https://minimaxir.com/2025/08/llm-blueberry/)** - Max Woolf - Shows 2025-2026 models are overcoming this limitation
- **[Signs of Tokenization Awareness in LLMs](https://medium.com/@solidgoldmagikarp/a-breakthrough-feature-signs-of-tokenization-awareness-in-llms-058fe880ef9f)** - Ekaterina Kornilitsina, Medium (Jan 2026) - Modern LLMs developing tokenization awareness
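The mechanism behind the strawberry problem fits in a few lines. The token split below is hand-picked for illustration; real BPE vocabularies vary by model:

```python
# A model reasons over tokens, not characters. This split is a
# hand-picked illustration, not a real BPE vocabulary.
tokens = ["st", "raw", "berry"]
word = "".join(tokens)

# Character-level counting is trivial for code but invisible to the model:
assert word.count("r") == 3

# Per-token view: no single token "sees" the character total.
per_token_r = [t.count("r") for t in tokens]
print(per_token_r)  # [0, 1, 2]
```

Good for a live radio demo: the audience counts characters, the model only gets the chunks.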
## 2. Math/Computation Limitations
- **[Why LLMs Are Bad at Math](https://www.reachcapital.com/resources/thought-leadership/why-llms-are-bad-at-math-and-how-they-can-be-better/)** - Reach Capital - LLMs predict plausible text, not compute answers; lack working memory for multi-step calculations
- **[Why AI Struggles with Basic Math](https://www.aei.org/technology-and-innovation/why-ai-struggles-with-basic-math-and-how-thats-changing/)** - AEI - How "87439" gets tokenized inconsistently, breaking positional value
- **[Why LLMs Fail at Math & The Neuro-Symbolic AI Solution](https://www.arsturn.com/blog/why-your-llm-is-bad-at-math-and-how-to-fix-it-with-a-clip-on-symbolic-brain)** - Arsturn - Proposes integrating symbolic computing systems
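The positional-value point in the AEI piece can be shown with a toy splitter. The fixed chunk size is an assumption; real tokenizers learn splits from data, which makes boundaries even less predictable:

```python
def chunk(s: str, size: int = 3) -> list[str]:
    """Greedy left-to-right split, standing in for a learned tokenizer."""
    return [s[i:i + size] for i in range(0, len(s), size)]

# One extra leading digit moves every boundary, so the same trailing
# digits land in different tokens carrying different place values:
print(chunk("87439"))    # ['874', '39']
print(chunk("887439"))   # ['887', '439']
```

The model has to infer place value statistically instead of reading it off, which is exactly where multi-digit arithmetic goes wrong.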
## 3. Hallucination (Confidently Wrong)
- **[Why language models hallucinate](https://openai.com/index/why-language-models-hallucinate/)** - OpenAI - Trained to guess, penalized for saying "I don't know"
- **[AI hallucinates because it's trained to fake answers](https://www.science.org/content/article/ai-hallucinates-because-it-s-trained-fake-answers-it-doesn-t-know)** - Science (AAAS) - Models use 34% more confident language when WRONG
- **[It's 2026. Why Are LLMs Still Hallucinating?](https://blogs.library.duke.edu/blog/2026/01/05/its-2026-why-are-llms-still-hallucinating/)** - Duke University - "Sounding good far more important than being correct"
- **[AI Hallucination Report 2026](https://www.allaboutai.com/resources/ai-statistics/ai-hallucinations/)** - AllAboutAI - Comprehensive stats on hallucination rates across models
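The incentive OpenAI describes reduces to a one-line expected-value calculation. The scoring rubric below is a simplified assumption about how benchmarks grade answers:

```python
def expected_score(p_correct: float, guess: bool,
                   penalty_wrong: float = 0.0) -> float:
    """Expected benchmark score: 1 point if right, minus a penalty if
    wrong, 0 for abstaining ("I don't know")."""
    if not guess:
        return 0.0
    return p_correct * 1.0 - (1 - p_correct) * penalty_wrong

# With no penalty for wrong answers, even a 10%-confident guess
# strictly beats admitting ignorance:
print(expected_score(0.10, guess=True))   # 0.1
print(expected_score(0.10, guess=False))  # 0.0

# Penalize wrong answers and the incentive flips: abstaining wins.
print(expected_score(0.10, guess=True, penalty_wrong=1.0))  # negative
```

Under the usual rubric, confident guessing is the score-maximizing strategy, which is the paper's point in miniature.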
## 4. Real-World Failures (Great Radio Stories)
- **[California fines lawyer over ChatGPT fabrications](https://calmatters.org/economy/technology/2025/09/chatgpt-lawyer-fine-ai-regulation/)** - $10K fine; 21 of 23 cited cases were fake; 486 documented cases worldwide
- **[As more lawyers fall for AI hallucinations](https://cronkitenews.azpbs.org/2025/10/28/lawyers-ai-hallucinations-chatgpt/)** - Cronkite/PBS - Judges issued hundreds of decisions addressing AI hallucinations in 2025
- **[The Biggest AI Fails of 2025](https://www.ninetwothree.co/blog/ai-fails)** - Taco Bell AI ordering 18,000 cups of water, Tesla FSD crashes, $440K Australian report with hallucinated sources
- **[26 Biggest AI Controversies](https://www.crescendo.ai/blog/ai-controversies)** - xAI exposing 300K private Grok conversations, McDonald's McHire with password "123456"
## 5. Anthropomorphism ("AI is Thinking")
- **[Anthropomorphic conversational agents](https://www.pnas.org/doi/10.1073/pnas.2415898122)** - PNAS - 2/3 of Americans think ChatGPT might be conscious; anthropomorphic attributions up 34% in 2025
- **[Thinking beyond the anthropomorphic paradigm](https://arxiv.org/html/2502.09192v1)** - ArXiv (Feb 2025) - Anthropomorphism hinders accurate understanding
- **[Stop Talking about AI Like It Is Human](https://epic.org/a-new-years-resolution-for-everyone-stop-talking-about-generative-ai-like-it-is-human/)** - EPIC - Why anthropomorphic language is misleading and dangerous
## 6. The Stochastic Parrot Debate
- **[From Stochastic Parrots to Digital Intelligence](https://wires.onlinelibrary.wiley.com/doi/10.1002/wics.70035)** - Wiley - Evolution of how we view LLMs, recognizing emergent capabilities
- **[LLMs still lag ~40% behind humans on physical concepts](https://arxiv.org/abs/2502.08946)** - ArXiv (Feb 2025) - Supporting the "just pattern matching" view
- **[LLMs are Not Stochastic Parrots](https://medium.com/@freddyayala/llms-are-not-stochastic-parrots-how-large-language-models-actually-work-16c000588b70)** - Counter-argument: GPT-4 scoring 90th percentile on Bar Exam, 93% on MATH Olympiad
## 7. Emergent Abilities
- **[Emergent Abilities in LLMs: A Survey](https://arxiv.org/abs/2503.05788)** - ArXiv (Mar 2025) - Capabilities arising suddenly and unpredictably at scale
- **[Breaking Myths in LLM scaling](https://www.sciencedirect.com/science/article/pii/S092523122503214X)** - ScienceDirect - Some "emergent" behaviors may be measurement artifacts
- **[Examining Emergent Abilities](https://hai.stanford.edu/news/examining-emergent-abilities-large-language-models)** - Stanford HAI - Smoother metrics show gradual improvements, not sudden leaps
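The measurement-artifact argument has a simple arithmetic core: an all-or-nothing metric over an n-token answer turns smooth per-token gains into an apparent jump. The numbers below are illustrative, not taken from any of the papers:

```python
# Smooth per-token accuracy p vs. all-or-nothing exact match (p**n).
n = 10  # answer length in tokens
for p in [0.5, 0.7, 0.9, 0.95, 0.99]:
    print(f"per-token {p:.2f} -> exact match {p**n:.3f}")
# Per-token accuracy climbs steadily (0.5 -> 0.7 -> 0.9), but exact
# match sits near zero until late, then shoots up: 0.9**10 is ~0.35.
```

Same underlying capability curve, two very different-looking graphs, which is why the choice of metric matters in the emergence debate.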
## 8. Context Windows & Memory
- **[Your 1M+ Context Window LLM Is Less Powerful Than You Think](https://towardsdatascience.com/your-1m-context-window-llm-is-less-powerful-than-you-think/)** - Can only track 5-10 variables before degrading to random guessing
- **[Understanding LLM performance degradation](https://demiliani.com/2025/11/02/understanding-llm-performance-degradation-a-deep-dive-into-context-window-limits/)** - Why models "forget" what was said at the beginning of long conversations
- **[LLM Chat History Summarization Guide](https://mem0.ai/blog/llm-chat-history-summarization-guide-2025)** - Mem0 - Practical solutions to memory limitations
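A probe in the spirit of the variable-tracking claim above is easy to generate. This is an illustrative sketch, not the Towards Data Science article's exact benchmark:

```python
import random

def make_probe(n_vars: int, seed: int = 1) -> str:
    """Chain n variable assignments, then ask for the final value.
    Answering requires tracking every link in the chain."""
    rng = random.Random(seed)
    names = [f"v{i}" for i in range(n_vars)]
    lines = [f"{names[0]} = {rng.randint(0, 9)}"]
    lines += [f"{cur} = {prev}" for prev, cur in zip(names, names[1:])]
    return "\n".join(lines) + f"\nWhat is {names[-1]}?"

print(make_probe(5))
```

Scale `n_vars` up and paste the result into any chat model: per the article's claim, accuracy falls off well before the context window is anywhere near full.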
## 9. Prompt Engineering (Why "Think Step by Step" Works)
- **[Understanding Reasoning LLMs](https://magazine.sebastianraschka.com/p/understanding-reasoning-llms)** - Sebastian Raschka, PhD - Chain-of-thought unlocks latent capabilities
- **[The Ultimate Guide to LLM Reasoning](https://kili-technology.com/large-language-models-llms/llm-reasoning-guide)** - CoT more than doubles performance on math problems
- **[Chain-of-Thought Prompting](https://www.promptingguide.ai/techniques/cot)** - Only works with ~100B+ parameter models; smaller models produce worse results
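The technique itself is nothing more than prompt text. Any `ask_llm` call mentioned below is a hypothetical stand-in for whatever client you use; only the prompt shape is the point:

```python
def cot_prompt(question: str) -> str:
    """Zero-shot chain-of-thought: append the trigger phrase."""
    return f"{question}\nLet's think step by step."

def plain_prompt(question: str) -> str:
    return question

q = ("A bat and a ball cost $1.10 total. The bat costs $1.00 more "
     "than the ball. How much is the ball?")
print(cot_prompt(q))
# A real call would be something like: ask_llm(cot_prompt(q))
```

The classic bat-and-ball question is a good on-air example: plain prompting tends to elicit the intuitive-but-wrong "$0.10", while the step-by-step version surfaces the correct "$0.05".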
## 10. Energy/Environmental Costs
- **[Generative AI's Environmental Impact](https://news.mit.edu/2025/explained-generative-ai-environmental-impact-0117)** - MIT - AI data centers projected to rank 5th globally in energy (between Japan and Russia)
- **[We did the math on AI's energy footprint](https://www.technologyreview.com/2025/05/20/1116327/ai-energy-usage-climate-footprint-big-tech/)** - MIT Tech Review - 60% from fossil fuels; shocking water usage stats
- **[AI Environment Statistics 2026](https://www.allaboutai.com/resources/ai-statistics/ai-environment/)** - AllAboutAI - AI draining 731-1,125M cubic meters of water annually
## 11. Agents vs. Chatbots (The 2026 Shift)
- **[2025 Was Chatbots. 2026 Is Agents.](https://dev.to/inboryn_99399f96579fcd705/2025-was-about-chatbots-2026-is-about-agents-heres-the-difference-426f)** - "Chatbots talk to you, agents do work for you"
- **[AI Agents vs Chatbots: The 2026 Guide](https://technosysblogs.com/ai-agents-vs-chatbots/)** - Generative AI is "read-only", agentic AI is "read-write"
- **[Agentic AI Explained](https://www.synergylabs.co/blog/agentic-ai-explained-from-chatbots-to-autonomous-ai-agents-in-2026)** - Agent market at 45% CAGR vs 23% for chatbots
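The "read-only vs read-write" distinction can be sketched in a dozen lines. Everything here is hypothetical scaffolding: `model` and the `send_email` tool stand in for a real LLM client and a real side-effecting action:

```python
def chatbot(model, user_msg: str) -> str:
    return model(user_msg)                        # read-only: returns text

def agent(model, user_msg: str, tools: dict) -> str:
    plan = model(f"Pick a tool for: {user_msg}")  # assume "tool: argument" replies
    name, arg = plan.split(":", 1)
    return tools[name.strip()](arg.strip())       # read-write: performs an action

# Stub usage; a real agent would loop, observe results, and re-plan:
stub_model = lambda prompt: "send_email: status report to alice@example.com"
tools = {"send_email": lambda arg: f"SENT({arg})"}
print(agent(stub_model, "email Alice the status report", tools))
```

The chatbot's worst failure is a wrong sentence; the agent's worst failure is a wrong action, which is why the 2026 shift raises the stakes.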
## 12. Multimodal AI
- **[Visual cognition in multimodal LLMs](https://www.nature.com/articles/s42256-024-00963-y)** - Nature - Scaling improves perception but not reasoning; even advanced models fail at simple counting
- **[Will multimodal LLMs achieve deep understanding?](https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2025.1683133/full)** - Frontiers - Remain detached from interactive learning
- **[Compare Multimodal AI Models on Visual Reasoning](https://research.aimultiple.com/visual-reasoning/)** - AIMultiple 2026 - Fall short on causal reasoning and intuitive psychology
## 13. Training vs. Learning
- **[5 huge AI misconceptions to drop in 2026](https://www.tomsguide.com/ai/5-huge-ai-misconceptions-to-drop-now-heres-what-you-need-to-know-in-2026)** - Tom's Guide - Bias, accuracy, data privacy myths
- **[AI models collapse when trained on AI-generated data](https://www.nature.com/articles/s41586-024-07566-y)** - Nature - "Model collapse" where rare patterns disappear
- **[The State of LLMs 2025](https://magazine.sebastianraschka.com/p/state-of-llms-2025)** - Sebastian Raschka - "LLMs stopped getting smarter by training and started getting smarter by thinking"
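The core mechanism of the Nature result (rare patterns vanish when each generation learns from the previous one's output) fits in a few lines. This is a deterministic sketch, not the paper's setup: taking `int()` of expected counts stands in for the finite-sample loss of anything expected to occur less than once:

```python
def next_generation(probs: list[float], k: int = 50) -> list[float]:
    """Refit category frequencies from a k-sample synthetic corpus.
    int() of the expected counts models the worst case: any pattern
    expected to appear less than once contributes zero samples."""
    counts = [int(p * k) for p in probs]
    total = sum(counts)
    return [c / total for c in counts]

probs = [0.90, 0.09, 0.01]  # common / uncommon / rare patterns
for gen in range(3):
    probs = next_generation(probs)
    print(gen, [round(p, 3) for p in probs])
# The rare pattern's probability mass hits exactly 0 in one
# generation and never comes back.
```

One lossy generation is enough: once the rare category's mass reaches zero, no amount of further training on synthetic data can recover it.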
## 14. How Researchers Study LLMs
- **[Treating LLMs like an alien autopsy](https://www.technologyreview.com/2026/01/12/1129782/ai-large-language-models-biology-alien-autopsy/)** - MIT Tech Review (Jan 2026) - "So vast and complicated that nobody quite understands what they are"
- **[Mechanistic Interpretability: Breakthrough Tech 2026](https://www.technologyreview.com/2026/01/12/1130003/mechanistic-interpretability-ai-research-models-2026-breakthrough-technologies/)** - Anthropic's work opening the black box
- **[2025: The year in LLMs](https://simonwillison.net/2025/Dec/31/the-year-in-llms/)** - Simon Willison - "Trained to produce statistically likely answers, not to assess their own confidence"
## 15. Podcast Resources
- **[Latent Space Podcast](https://podcasts.apple.com/us/podcast/large-language-model-llm-talk/id1790576136)** - Swyx & Alessio Fanelli - Deep technical coverage
- **[Practical AI](https://podcasts.apple.com/us/podcast/practical-ai-machine-learning-data-science-llm/id1406537385)** - Accessible to general audiences; good "What mattered in 2025" episode
- **[TWIML AI Podcast](https://podcasts.apple.com/us/podcast/the-twiml-ai-podcast-formerly-this-week-in-machine/id1116303051)** - Researcher interviews since 2016
---
## Top Radio Hooks (Best Audience Engagement)
1. **Taco Bell AI ordering 18,000 cups of water** - Funny, relatable failure
2. **Lawyers citing 21 fake court cases** - Serious real-world consequences
3. **34% more confident language when wrong** - Counterintuitive and alarming
4. **AI data centers rank 5th globally in energy** (between Japan and Russia) - Shocking scale
5. **2/3 of Americans think ChatGPT might be conscious** - Audience self-reflection moment
6. **"Strawberry" has how many R's?** - Interactive audience participation
7. **Million-token context but only tracks 5-10 variables** - "Bigger isn't always better" angle