# AI/LLM Misconceptions Reading List

## For Radio Show: "Emergent AI Technologies"

**Created:** 2026-02-09

---

## 1. Tokenization (The "Strawberry" Problem)

- **[Why LLMs Can't Count the R's in 'Strawberry'](https://arbisoft.com/blogs/why-ll-ms-can-t-count-the-r-s-in-strawberry-and-what-it-teaches-us)** - Arbisoft - Clear explainer on how tokenization breaks words into chunks like "st", "raw", "berry" (the sketch after this list shows the effect directly)
- **[Can modern LLMs count the b's in "blueberry"?](https://minimaxir.com/2025/08/llm-blueberry/)** - Max Woolf - Shows 2025-2026 models are overcoming this limitation
- **[Signs of Tokenization Awareness in LLMs](https://medium.com/@solidgoldmagikarp/a-breakthrough-feature-signs-of-tokenization-awareness-in-llms-058fe880ef9f)** - Ekaterina Kornilitsina, Medium (Jan 2026) - Modern LLMs developing tokenization awareness
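
A minimal sketch of the counting problem, assuming the `tiktoken` library and its `cl100k_base` vocabulary (exact splits vary by model): the model receives token IDs, never individual letters, so "how many r's" asks about characters it cannot see.

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["strawberry", "blueberry"]:
    token_ids = enc.encode(word)
    # Decode each token ID separately to show the chunks the model sees.
    pieces = [enc.decode([tid]) for tid in token_ids]
    print(f"{word}: {len(token_ids)} tokens {pieces}, "
          f"but {word.count('r')} letter r's")
```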

## 2. Math/Computation Limitations

- **[Why LLMs Are Bad at Math](https://www.reachcapital.com/resources/thought-leadership/why-llms-are-bad-at-math-and-how-they-can-be-better/)** - Reach Capital - LLMs predict plausible-looking text rather than computing answers, and lack working memory for multi-step calculations
- **[Why AI Struggles with Basic Math](https://www.aei.org/technology-and-innovation/why-ai-struggles-with-basic-math-and-how-thats-changing/)** - AEI - How "87439" gets tokenized inconsistently, breaking positional value
- **[Why LLMs Fail at Math & The Neuro-Symbolic AI Solution](https://www.arsturn.com/blog/why-your-llm-is-bad-at-math-and-how-to-fix-it-with-a-clip-on-symbolic-brain)** - Arsturn - Proposes integrating symbolic computing systems (see the routing sketch after this list)
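
A minimal sketch of the "clip-on symbolic brain" idea, with one big assumption: `call_llm` is a hypothetical stand-in for whatever model API you use. Anything that parses as plain arithmetic gets computed exactly instead of being left to next-token prediction.

```python
import ast
import operator

# Exact handlers for the arithmetic operators we allow.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(node: ast.AST) -> float:
    """Recursively evaluate a parsed tree of numbers and + - * /."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("not plain arithmetic")

def answer(question: str) -> str:
    try:
        return str(evaluate(ast.parse(question, mode="eval").body))
    except (SyntaxError, ValueError):
        return call_llm(question)  # hypothetical LLM fallback, not a real API

print(answer("87439 * 12"))  # computed symbolically: 1049268
```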

## 3. Hallucination (Confidently Wrong)

- **[Why language models hallucinate](https://openai.com/index/why-language-models-hallucinate/)** - OpenAI - Trained to guess, penalized for saying "I don't know" (see the scoring sketch after this list)
- **[AI hallucinates because it's trained to fake answers](https://www.science.org/content/article/ai-hallucinates-because-it-s-trained-fake-answers-it-doesn-t-know)** - Science (AAAS) - Models use 34% more confident language when **wrong**
- **[It's 2026. Why Are LLMs Still Hallucinating?](https://blogs.library.duke.edu/blog/2026/01/05/its-2026-why-are-llms-still-hallucinating/)** - Duke University - "Sounding good far more important than being correct"
- **[AI Hallucination Report 2026](https://www.allaboutai.com/resources/ai-statistics/ai-hallucinations/)** - AllAboutAI - Comprehensive stats on hallucination rates across models
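
A toy expected-value calculation (illustrative numbers, not from the OpenAI post) showing why accuracy-only grading teaches models to guess: abstaining scores zero, so guessing wins whenever there is any chance of being right, and only a penalty for wrong answers changes that.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of guessing when a wrong answer costs wrong_penalty."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

# Abstaining ("I don't know") always scores 0.0 under both schemes.
for penalty in (0.0, 1.0):  # 0.0 = accuracy-only grading
    for p in (0.1, 0.3, 0.5):
        print(f"penalty={penalty}: p_correct={p} -> guess scores "
              f"{expected_score(p, penalty):+.2f} vs abstain +0.00")
```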

## 4. Real-World Failures (Great Radio Stories)

- **[California fines lawyer over ChatGPT fabrications](https://calmatters.org/economy/technology/2025/09/chatgpt-lawyer-fine-ai-regulation/)** - $10K fine; 21 of 23 cited cases were fake; 486 documented cases of AI fabrications in court filings worldwide
- **[As more lawyers fall for AI hallucinations](https://cronkitenews.azpbs.org/2025/10/28/lawyers-ai-hallucinations-chatgpt/)** - Cronkite/PBS - Judges issued hundreds of decisions addressing AI hallucinations in 2025
- **[The Biggest AI Fails of 2025](https://www.ninetwothree.co/blog/ai-fails)** - Taco Bell AI ordering 18,000 cups of water, Tesla FSD crashes, $440K Australian report with hallucinated sources
- **[26 Biggest AI Controversies](https://www.crescendo.ai/blog/ai-controversies)** - xAI exposing 300K private Grok conversations, McDonald's McHire with password "123456"

## 5. Anthropomorphism ("AI is Thinking")

- **[Anthropomorphic conversational agents](https://www.pnas.org/doi/10.1073/pnas.2415898122)** - PNAS - 2/3 of Americans think ChatGPT might be conscious; anthropomorphic attributions up 34% in 2025
- **[Thinking beyond the anthropomorphic paradigm](https://arxiv.org/html/2502.09192v1)** - ArXiv (Feb 2026) - Anthropomorphism hinders accurate understanding
- **[Stop Talking about AI Like It Is Human](https://epic.org/a-new-years-resolution-for-everyone-stop-talking-about-generative-ai-like-it-is-human/)** - EPIC - Why anthropomorphic language is misleading and dangerous

## 6. The Stochastic Parrot Debate

- **[From Stochastic Parrots to Digital Intelligence](https://wires.onlinelibrary.wiley.com/doi/10.1002/wics.70035)** - Wiley - Evolution of how we view LLMs, recognizing emergent capabilities (a toy "parrot" is sketched after this list)
- **[LLMs still lag ~40% behind humans on physical concepts](https://arxiv.org/abs/2502.08946)** - ArXiv (Feb 2026) - Supporting the "just pattern matching" view
- **[LLMs are Not Stochastic Parrots](https://medium.com/@freddyayala/llms-are-not-stochastic-parrots-how-large-language-models-actually-work-16c000588b70)** - Counter-argument: GPT-4 scoring 90th percentile on Bar Exam, 93% on MATH Olympiad
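
To make the term concrete, here is the tiniest possible "stochastic parrot": a bigram model that emits the next word purely from co-occurrence counts in its training text. Real LLMs are enormously richer, and the debate above is precisely about whether scale changes this picture in kind or only in degree.

```python
import random
from collections import defaultdict

# Count which word follows which in a toy corpus.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

# "Haphazardly stitch together" words by sampled co-occurrence alone.
word, output = "the", ["the"]
for _ in range(8):
    word = random.choice(followers.get(word, corpus))
    output.append(word)
print(" ".join(output))  # fluent-ish output with no model of cats or rugs
```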

## 7. Emergent Abilities

- **[Emergent Abilities in LLMs: A Survey](https://arxiv.org/abs/2503.05788)** - ArXiv (Mar 2026) - Capabilities arising suddenly and unpredictably at scale
- **[Breaking Myths in LLM scaling](https://www.sciencedirect.com/science/article/pii/S092523122503214X)** - ScienceDirect - Some "emergent" behaviors may be measurement artifacts
- **[Examining Emergent Abilities](https://hai.stanford.edu/news/examining-emergent-abilities-large-language-models)** - Stanford HAI - Smoother metrics show gradual improvements, not sudden leaps (the arithmetic after this list shows how the artifact arises)
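
A one-liner's worth of arithmetic behind the "measurement artifact" argument (numbers illustrative): if per-token accuracy improves smoothly with scale, an all-or-nothing metric that requires every token of a k-token answer to be right still looks like a sudden jump.

```python
# Exact-match on a k-token answer is p**k when per-token accuracy is p.
# A smooth climb in p produces a cliff in p**k -- "emergence" from the
# choice of metric rather than from the model.
k = 10  # answer length in tokens (illustrative)
for p in (0.50, 0.70, 0.90, 0.95, 0.99):
    print(f"per-token accuracy {p:.2f} -> exact-match rate {p**k:.4f}")
```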

## 8. Context Windows & Memory

- **[Your 1M+ Context Window LLM Is Less Powerful Than You Think](https://towardsdatascience.com/your-1m-context-window-llm-is-less-powerful-than-you-think/)** - Can only track 5-10 variables before degrading to random guessing (a probe generator is sketched after this list)
- **[Understanding LLM performance degradation](https://demiliani.com/2025/11/02/understanding-llm-performance-degradation-a-deep-dive-into-context-window-limits/)** - Why models "forget" what was said at the beginning of long conversations
- **[LLM Chat History Summarization Guide](https://mem0.ai/blog/llm-chat-history-summarization-guide-2025)** - Mem0 - Practical solutions to memory limitations
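
A sketch of the kind of variable-tracking probe behind the 5-10 variable claim (the format is illustrative, not the article's exact benchmark): the correct answer is always the first value, so scaling `n_vars` up shows where a model stops tracking the chain.

```python
import random

def make_probe(n_vars: int) -> tuple[str, int]:
    """Build a chained-assignment probe and return (prompt, correct answer)."""
    value = random.randint(1, 9)
    lines = [f"x0 = {value}"]
    for i in range(1, n_vars):
        lines.append(f"x{i} = x{i - 1}")  # each variable copies the previous
    prompt = "\n".join(lines) + f"\nWhat is the value of x{n_vars - 1}?"
    return prompt, value

prompt, answer = make_probe(n_vars=8)
print(prompt)  # feed this to a model; compare its reply against `answer`
```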

## 9. Prompt Engineering (Why "Think Step by Step" Works)

- **[Understanding Reasoning LLMs](https://magazine.sebastianraschka.com/p/understanding-reasoning-llms)** - Sebastian Raschka, PhD - Chain-of-thought unlocks latent capabilities
- **[The Ultimate Guide to LLM Reasoning](https://kili-technology.com/large-language-models-llms/llm-reasoning-guide)** - CoT more than doubles performance on math problems (the prompt change itself is tiny; see the sketch after this list)
- **[Chain-of-Thought Prompting](https://www.promptingguide.ai/techniques/cot)** - Only works with ~100B+ parameter models; smaller models often do worse with CoT than without
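
The entire intervention is one line of prompt text; the exact wording varies across the guides above. A minimal sketch:

```python
def direct_prompt(question: str) -> str:
    return f"{question}\nAnswer with only the final number."

def cot_prompt(question: str) -> str:
    # The chain-of-thought trick: ask the model to externalize its
    # intermediate steps, which it then conditions on token by token.
    return f"{question}\nLet's think step by step, then state the final number."

q = "A train leaves at 2:15pm and the trip takes 3 hours 50 minutes. When does it arrive?"
print(direct_prompt(q))
print()
print(cot_prompt(q))
```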

## 10. Energy/Environmental Costs

- **[Generative AI's Environmental Impact](https://news.mit.edu/2025/explained-generative-ai-environmental-impact-0117)** - MIT - AI data centers projected to rank 5th in global electricity consumption if counted as a country (between Japan and Russia)
- **[We did the math on AI's energy footprint](https://www.technologyreview.com/2025/05/20/1116327/ai-energy-usage-climate-footprint-big-tech/)** - MIT Tech Review - 60% from fossil fuels; shocking water usage stats
- **[AI Environment Statistics 2026](https://www.allaboutai.com/resources/ai-statistics/ai-environment/)** - AllAboutAI - AI draining 731M-1,125M cubic meters of water annually (converted to pools in the sketch after this list)
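
A back-of-envelope conversion of the cited water range into a radio-friendly unit, assuming the standard 50m x 25m x 2m Olympic pool (2,500 cubic meters):

```python
OLYMPIC_POOL_M3 = 50 * 25 * 2  # standard competition pool, 2,500 cubic meters

low_m3, high_m3 = 731e6, 1_125e6  # cited annual range in cubic meters
print(f"{low_m3 / OLYMPIC_POOL_M3:,.0f} to {high_m3 / OLYMPIC_POOL_M3:,.0f} "
      "Olympic pools of water per year")
# -> roughly 292,400 to 450,000 pools
```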

## 11. Agents vs. Chatbots (The 2026 Shift)

- **[2025 Was Chatbots. 2026 Is Agents.](https://dev.to/inboryn_99399f96579fcd705/2025-was-about-chatbots-2026-is-about-agents-heres-the-difference-426f)** - "Chatbots talk to you, agents do work for you" (see the loop sketch after this list)
- **[AI Agents vs Chatbots: The 2026 Guide](https://technosysblogs.com/ai-agents-vs-chatbots/)** - Generative AI is "read-only", agentic AI is "read-write"
- **[Agentic AI Explained](https://www.synergylabs.co/blog/agentic-ai-explained-from-chatbots-to-autonomous-ai-agents-in-2026)** - Agent market at 45% CAGR vs 23% for chatbots
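
A minimal sketch of what makes an agent "read-write": a loop in which the model's output selects tools that act, rather than just producing text for a human. `call_model` and the tool names are hypothetical stand-ins, not any specific framework's API.

```python
# Hypothetical tools the agent may invoke; real agents wire these to
# search APIs, file systems, calendars, and so on.
TOOLS = {
    "search": lambda query: f"(results for {query!r})",
    "save_note": lambda text: "saved",
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        action = call_model(history)  # hypothetical LLM call returning a dict
        if action["type"] == "final_answer":
            return action["text"]  # a chatbot stops here on turn one
        result = TOOLS[action["tool"]](action["input"])  # the "write" part
        history.append(f"{action['tool']} -> {result}")
    return "step budget exhausted"
```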

## 12. Multimodal AI

- **[Visual cognition in multimodal LLMs](https://www.nature.com/articles/s42256-024-00963-y)** - Nature - Scaling improves perception but not reasoning; even advanced models fail at simple counting
- **[Will multimodal LLMs achieve deep understanding?](https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2025.1683133/full)** - Frontiers - Remain detached from interactive learning
- **[Compare Multimodal AI Models on Visual Reasoning](https://research.aimultiple.com/visual-reasoning/)** - AIMultiple 2026 - Fall short on causal reasoning and intuitive psychology

## 13. Training vs. Learning

- **[5 huge AI misconceptions to drop in 2026](https://www.tomsguide.com/ai/5-huge-ai-misconceptions-to-drop-now-heres-what-you-need-to-know-in-2026)** - Tom's Guide - Bias, accuracy, data privacy myths
- **[AI models collapse when trained on AI-generated data](https://www.nature.com/articles/s41586-024-07566-y)** - Nature - "Model collapse" where rare patterns disappear (a toy simulation follows this list)
- **[The State of LLMs 2025](https://magazine.sebastianraschka.com/p/state-of-llms-2025)** - Sebastian Raschka - "LLMs stopped getting smarter by training and started getting smarter by thinking"
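
A toy analogue of model collapse, far simpler than the Nature paper's setup: repeatedly fit a Gaussian to samples drawn from the previous generation's fit. The fitted spread tends to drift downward across generations (it is a stochastic demo, so rerun it a few times), which is the mechanism by which rare tail values disappear.

```python
import random
import statistics

mu, sigma = 0.0, 1.0  # the "real" data distribution
for generation in range(20):
    # Train on data sampled from the PREVIOUS generation's model.
    samples = [random.gauss(mu, sigma) for _ in range(25)]
    mu, sigma = statistics.fmean(samples), statistics.stdev(samples)
    if generation % 5 == 0:
        print(f"generation {generation:2d}: fitted sigma = {sigma:.3f}")
# Tail values beyond a few sigma stop being generated long before
# the center of the distribution moves much.
```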

## 14. How Researchers Study LLMs

- **[Treating LLMs like an alien autopsy](https://www.technologyreview.com/2026/01/12/1129782/ai-large-language-models-biology-alien-autopsy/)** - MIT Tech Review (Jan 2026) - "So vast and complicated that nobody quite understands what they are"
- **[Mechanistic Interpretability: Breakthrough Tech 2026](https://www.technologyreview.com/2026/01/12/1130003/mechanistic-interpretability-ai-research-models-2026-breakthrough-technologies/)** - Anthropic's work opening the black box
- **[2025: The year in LLMs](https://simonwillison.net/2025/Dec/31/the-year-in-llms/)** - Simon Willison - "Trained to produce statistically likely answers, not to assess their own confidence"

## 15. Podcast Resources

- **[Latent Space Podcast](https://podcasts.apple.com/us/podcast/large-language-model-llm-talk/id1790576136)** - Swyx & Alessio Fanelli - Deep technical coverage
- **[Practical AI](https://podcasts.apple.com/us/podcast/practical-ai-machine-learning-data-science-llm/id1406537385)** - Accessible to general audiences; good "What mattered in 2025" episode
- **[TWIML AI Podcast](https://podcasts.apple.com/us/podcast/the-twiml-ai-podcast-formerly-this-week-in-machine/id1116303051)** - Researcher interviews since 2016

---

## Top Radio Hooks (Best Audience Engagement)

1. **Taco Bell AI ordering 18,000 cups of water** - Funny, relatable failure
2. **Lawyers citing 21 fake court cases** - Serious real-world consequences
3. **34% more confident language when wrong** - Counterintuitive and alarming
4. **AI data centers rank 5th globally in energy** (between Japan and Russia) - Shocking scale
5. **2/3 of Americans think ChatGPT might be conscious** - Audience self-reflection moment
6. **"Strawberry" has how many R's?** - Interactive audience participation
7. **Million-token context but only tracks 5-10 variables** - "Bigger isn't always better" angle
|