diff --git a/ai-misconceptions-radio-segments.md b/ai-misconceptions-radio-segments.md new file mode 100644 index 0000000..59928ed --- /dev/null +++ b/ai-misconceptions-radio-segments.md @@ -0,0 +1,201 @@ +# AI Misconceptions - Radio Segment Scripts +## "Emergent AI Technologies" Episode +**Created:** 2026-02-09 +**Format:** Each segment is 3-5 minutes at conversational pace (~150 words/minute) + +--- + +## Segment 1: "Strawberry Has How Many R's?" (~4 min) +**Theme:** Tokenization -- AI doesn't see words the way you do + +Here's a fun one to start with. Ask ChatGPT -- or any AI chatbot -- "How many R's are in the word strawberry?" Until very recently, most of them would confidently tell you: two. The answer is three. So why does a system trained on essentially the entire internet get this wrong? + +It comes down to something called tokenization. When you type a word into an AI, it doesn't see individual letters the way you do. It breaks text into chunks called "tokens" -- pieces it learned to recognize during training. The word "strawberry" might get split into "st," "raw," and "berry." The AI never sees the full word laid out letter by letter. It's like trying to count how many times a letter appears in a sentence when all you've been handed is a list of puzzle-piece codes -- you know which chunks make up the sentence, but you never see the letters printed on them. + +This isn't a bug -- it's how the system was built. AI processes language as patterns of chunks, not as strings of characters. It's optimized for meaning and flow, not spelling. Think of it like someone who's amazing at understanding conversations in a foreign language but couldn't tell you how to spell half the words they're using. + +The good news: newer models released in 2025 and 2026 are starting to overcome this. Researchers are finding signs of "tokenization awareness" -- models learning to work around their own blind spots. But it's a great reminder that AI doesn't process information the way a human brain does, even when the output looks human. + +**Key takeaway for listeners:** AI doesn't read letters. It reads chunks. That's why it can write you a poem but can't count letters in a word. + +--- + +## Segment 2: "Your Calculator is Smarter Than ChatGPT" (~4 min) +**Theme:** AI doesn't actually do math -- it guesses what math looks like + +Here's something that surprises people: AI chatbots don't actually calculate anything. When you ask ChatGPT "What's 4,738 times 291?" it's not doing multiplication. It's predicting what a correct-looking answer would be, based on patterns it learned from training data. Sometimes it gets it right. Sometimes it's wildly off. Your five-dollar pocket calculator will beat it every time on raw arithmetic. + +Why? Because of that same tokenization problem. The number 87,439 might get broken up as "874" and "39" in one context, or "87" and "439" in another. The AI has no consistent concept of place value -- ones, tens, hundreds. It's like trying to do long division when the digits on your paper have been grouped into arbitrary clumps that change from one problem to the next. + +The deeper issue is that AI is a language system, not a logic system. It's trained to produce text that sounds right, not to follow mathematical rules. It doesn't have working memory the way you do when you carry the one in long addition. Each step of a calculation is essentially a fresh guess at what the next plausible piece of text should be. + +This is why researchers are now building hybrid systems -- AI for the language part, with traditional computing bolted on for the math.
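+A producer's note (not for air): the hand-off works roughly like the sketch below. The regex here is a toy stand-in for the language model's job of figuring out what was asked -- real assistants use the model itself for that step -- but the division of labor is the same: language understanding in the model, arithmetic in ordinary code.
+
+```python
+import re
+
+def extract_math(question: str):
+    """Toy stand-in for the LLM: pull two operands and an operator out of English."""
+    match = re.search(r"([\d,]+)\s*(times|plus|minus)\s*([\d,]+)", question)
+    if not match:
+        return None
+    a = int(match.group(1).replace(",", ""))
+    b = int(match.group(3).replace(",", ""))
+    op = {"times": "*", "plus": "+", "minus": "-"}[match.group(2)]
+    return a, op, b
+
+def answer(question: str) -> str:
+    parsed = extract_math(question)
+    if parsed is None:
+        return "No arithmetic found."
+    a, op, b = parsed
+    result = {"*": a * b, "+": a + b, "-": a - b}[op]  # computed exactly, not predicted
+    return f"{a:,} {op} {b:,} = {result:,}"
+
+print(answer("What's 4,738 times 291?"))  # -> 4,738 * 291 = 1,378,758
+```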
When your phone's AI assistant does a calculation correctly, there's often a real calculator running behind the scenes. The AI figures out what you're asking, hands the numbers to a proper math engine, then presents the answer in natural language. + +**Key takeaway for listeners:** AI predicts what a math answer looks like. It doesn't compute. If accuracy matters, verify the numbers yourself. + +--- + +## Segment 3: "Confidently Wrong" (~5 min) +**Theme:** Hallucination -- why AI makes things up and sounds sure about it + +This one has real consequences. AI systems regularly state completely false information with total confidence. Researchers call this "hallucination," and it's not a glitch -- it's baked into how these systems are built. + +Here's why: during training, AI is essentially taking a never-ending multiple choice test. It learns to always pick an answer. There's no "I don't know" option. Saying something plausible is always rewarded over staying silent. So the system becomes an expert at producing confident-sounding text, whether or not that text is true. + +A study published in Science found something remarkable: AI models actually use 34% more confident language -- words like "definitely" and "certainly" -- when they're generating incorrect information compared to when they're right. The less the system actually "knows" about something, the harder it tries to sound convincing. Think about that for a second. The AI is at its most persuasive when it's at its most wrong. + +This has hit the legal profession hard. A California attorney was fined $10,000 after filing a court appeal where 21 out of 23 cited legal cases were completely fabricated by ChatGPT. They looked real -- proper case names, citations, even plausible legal reasoning. But the cases never existed. And this isn't an isolated incident. Researchers have documented 486 cases worldwide of lawyers submitting AI-hallucinated citations. In 2025 alone, judges issued hundreds of rulings specifically addressing this problem. + +Then there's the Australian government, which spent $440,000 on a report that turned out to contain hallucinated sources. And a Taco Bell drive-through AI that processed an order for 18,000 cups of water because it couldn't distinguish a joke from a real order. + +OpenAI themselves admit the problem: their training process rewards guessing over acknowledging uncertainty. Duke University researchers put it bluntly -- for these systems, "sounding good is far more important than being correct." + +**Key takeaway for listeners:** AI doesn't know what it doesn't know. It will never say "I'm not sure." Treat every factual claim from AI the way you'd treat a tip from a confident stranger -- verify before you trust. + +--- + +## Segment 4: "Does AI Actually Think?" (~4 min) +**Theme:** We talk about AI like it's alive -- and that's a problem + +Two-thirds of American adults believe ChatGPT is possibly conscious. Let that sink in. A peer-reviewed study published in the Proceedings of the National Academy of Sciences found that people increasingly attribute human qualities to AI -- and that trend grew by 34% in 2025 alone. + +We say AI "thinks," "understands," "learns," and "knows." Even the companies building these systems use that language. But here's what's actually happening under the hood: the system is calculating which word is most statistically likely to come next, given everything that came before it. That's it. There's no understanding. There's no inner experience. 
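+A producer's note (not for air): the core loop really is that simple in shape. Here's a deliberately crude sketch with a bigram count table standing in for the billion-parameter network -- a real model scores tens of thousands of token options with learned weights instead of raw counts, but it's the same pick-the-statistically-likely-next-chunk loop.
+
+```python
+import random
+from collections import Counter, defaultdict
+
+corpus = "the cat sat on the mat because the cat was tired".split()
+
+# Count which word follows which -- this table is our entire "training run".
+table = defaultdict(Counter)
+for prev, nxt in zip(corpus, corpus[1:]):
+    table[prev][nxt] += 1
+
+# Generation: repeatedly sample a likely next word. No understanding anywhere.
+word, output = "the", ["the"]
+for _ in range(6):
+    followers = table[word]
+    if not followers:          # dead end: no observed continuation
+        break
+    word = random.choices(list(followers), weights=followers.values())[0]
+    output.append(word)
+print(" ".join(output))        # e.g. "the cat sat on the mat because"
+```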
It's a very sophisticated autocomplete. + +Researchers call this the "stochastic parrot" debate. One camp says these systems are just parroting patterns from their training data at an incredible scale -- like a parrot that's memorized every book ever written. The other camp points out that GPT-4 scored in the 90th percentile on the Bar Exam and solves 93% of Math Olympiad problems -- can something that performs that well really be "just" pattern matching? + +The honest answer is: we don't fully know. MIT Technology Review ran a fascinating piece in January 2026 about researchers who now treat AI models like alien organisms -- performing what they call "digital autopsies" to understand what's happening inside. The systems have become so complex that even their creators can't fully explain how they arrive at their answers. + +But here's why the language matters: when we say AI "thinks," we lower our guard. We trust it more. We assume it has judgment, common sense, and intention. It doesn't. And that mismatch between perception and reality is where people get hurt -- trusting AI with legal filings, medical questions, or financial decisions without verification. + +**Key takeaway for listeners:** AI doesn't think. It predicts. The words we use to describe it shape how much we trust it -- and right now, we're over-trusting. + +--- + +## Segment 5: "The World's Most Forgetful Genius" (~3 min) +**Theme:** AI has no memory and shorter attention than you think + +Companies love to advertise massive "context windows" -- the amount of text an AI can consider at once. Some models now claim they can handle a million tokens, equivalent to several novels. Sounds impressive. But research shows these systems can only reliably track about 5 to 10 pieces of information before performance degrades to essentially random guessing. + +Think about that. A system that can "read" an entire book can't reliably keep track of more than a handful of facts from it. It's like hiring someone with photographic memory who can only remember 5 things at a time. The information goes in, but the system loses the thread. + +And here's something most people don't realize: AI has zero memory between conversations. When you close a chat window and open a new one, the AI has absolutely no recollection of your previous conversation. It doesn't know who you are, what you discussed, or what you decided. Every conversation starts completely fresh. Some products build memory features on top -- saving notes about you that get fed back in -- but the underlying AI itself remembers nothing. + +Even within a single long conversation, models "forget" what was said at the beginning. If you've ever noticed an AI contradicting something it said twenty messages ago, this is why. The earlier parts of the conversation fade as new text pushes in. + +**Key takeaway for listeners:** AI isn't building a relationship with you. Every conversation is day one. And even within a conversation, its attention span is shorter than you'd think. + +--- + +## Segment 6: "Just Say 'Think Step by Step'" (~3 min) +**Theme:** The weird magic of prompt engineering + +Here's one of the strangest discoveries in AI: if you add the words "think step by step" to your question, the AI performs dramatically better. On math problems, this simple phrase more than doubles accuracy. It sounds like a magic spell, and honestly, it kind of is. + +It works because of how these systems generate text. 
Normally, an AI tries to jump straight to an answer -- predicting the most likely response in one shot. But when you tell it to think step by step, it generates intermediate reasoning first. Each step becomes context for the next step. It's like the difference between trying to do complex multiplication in your head versus writing out the long-form work on paper. + +Researchers call this "chain-of-thought prompting," and it reveals something fascinating about AI: the knowledge is often already in there, locked up. The right prompt is the key that unlocks it. The system was trained on millions of examples of step-by-step reasoning, so when you explicitly ask for that format, it activates those patterns. + +But there's a catch -- this only works on large models, roughly 100 billion parameters or more. On smaller models, asking for step-by-step reasoning actually makes performance worse. The smaller system generates plausible-looking steps that are logically nonsensical, then confidently arrives at a wrong answer. It's like asking someone to show their work when they don't actually understand the subject -- you just get confident-looking nonsense. + +**Key takeaway for listeners:** The way you phrase your question to AI matters enormously. "Think step by step" is the single most useful trick you can learn. But remember -- it's not actually thinking. It's generating text that looks like thinking. + +--- + +## Segment 7: "AI is Thirsty" (~4 min) +**Theme:** The environmental cost nobody talks about + +Here's a number that stops people in their tracks: if AI data centers were a country, they'd rank fifth in the world for energy consumption -- right between Japan and Russia. By the end of 2026, they're projected to consume over 1,000 terawatt-hours of electricity. That's more than most nations on Earth. + +Every time you ask ChatGPT a question, a server somewhere draws power. Not a lot for one question -- but multiply that by hundreds of millions of users, billions of queries per day, and it adds up fast. And it's not just electricity. AI is incredibly thirsty. Training and running these models requires massive amounts of water for cooling the data centers. We're talking 731 million to over a billion cubic meters of water annually -- equivalent to the household water usage of 6 to 10 million Americans. + +Here's the part that really stings: MIT Technology Review found that 60% of the increased electricity demand from AI data centers is being met by fossil fuels. So despite all the talk about clean energy, the AI boom is adding an estimated 220 million tons of carbon emissions. The irony of using AI to help solve climate change while simultaneously accelerating it isn't lost on researchers. + +A single query to a large language model uses roughly 10 times the energy of a standard Google search. Training a single large model from scratch can emit as much carbon as five cars do over their entire lifetimes, including the emissions from manufacturing them. + +None of this means we should stop using AI. But most people have no idea that there's a physical cost to every conversation, every generated image, every AI-powered feature. The cloud isn't actually a cloud -- it's warehouses full of GPUs running 24/7, drinking water and burning fuel. + +**Key takeaway for listeners:** AI has a physical footprint. Every question you ask has an energy cost. It's worth knowing that "free" AI tools aren't free -- someone's paying the electric bill, and the planet's paying too.
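+A producer's note (not for air): a back-of-envelope check on the water comparison, using only the figures quoted above. The 1,125 million cubic meter ceiling is the top of the range from the AllAboutAI stat in the reading list; "over a billion" rounds from it.
+
+```python
+# Sanity check: does 731M-1,125M cubic meters/year really equal the
+# household water use of 6-10 million Americans? Convert to liters/person/day.
+low_end = 731e6 * 1000 / 6e6 / 365       # low water estimate over 6M people
+high_end = 1.125e9 * 1000 / 10e6 / 365   # high water estimate over 10M people
+print(f"low end:  {low_end:.0f} liters per person per day")   # ~334
+print(f"high end: {high_end:.0f} liters per person per day")  # ~308
+```
+
+Both ends land near typical US household water use of roughly 300 liters per person per day, so the on-air comparison holds up.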
+ +--- + +## Segment 8: "Chatbots Are Old News" (~3 min) +**Theme:** The shift from chatbots to AI agents + +If 2025 was the year of the chatbot, 2026 is the year of the agent. And the difference matters. + +A chatbot talks to you. You ask a question, it gives an answer. It's reactive -- like a really smart FAQ page. An AI agent does work for you. You give it a goal, and it figures out the steps, uses tools, and executes. It can browse the web, write and run code, send emails, manage files, and chain together multiple actions to accomplish something complex. + +Here's the simplest way to think about it: a chatbot is read-only. It can create text, suggest ideas, answer questions. An agent is read-write. It doesn't just suggest you should send a follow-up email -- it writes the email, sends it, tracks whether you got a response, and follows up if you didn't. + +The market reflects this shift. The AI agent market is growing at 45% per year, nearly double the 23% growth rate for chatbots. Companies are building agents that can handle entire workflows autonomously -- scheduling meetings, managing customer service tickets, writing and deploying code, analyzing data and producing reports. + +This is where AI gets both more useful and more risky. A chatbot that hallucinates gives you bad information. An agent that hallucinates takes bad action. When an AI can actually do things in the real world -- send messages, modify files, make purchases -- the stakes of getting it wrong go way up. + +**Key takeaway for listeners:** The next wave of AI doesn't just talk -- it acts. That's powerful, but it also means the consequences of AI mistakes move from "bad advice" to "bad actions." + +--- + +## Segment 9: "AI Eats Itself" (~3 min) +**Theme:** Model collapse -- what happens when AI trains on AI + +Here's a problem nobody saw coming. As the internet fills up with AI-generated content -- articles, images, code, social media posts -- the next generation of AI models inevitably trains on that AI-generated material. And when AI trains on AI output, something strange happens: it gets worse. Researchers call it "model collapse." + +A study published in Nature showed that when models train on recursively generated data -- AI output fed back into AI training -- rare and unusual patterns gradually disappear. The output drifts toward bland, generic averages. Think of it like making a photocopy of a photocopy of a photocopy. Each generation loses detail and nuance until you're left with a blurry, indistinct mess. + +This matters because AI models need diverse, high-quality data to perform well. The best AI systems were trained on the raw, messy, varied output of billions of real humans -- with all our creativity, weirdness, and unpredictability. If future models train primarily on the sanitized, pattern-averaged output of current AI, they'll lose the very diversity that made them capable in the first place. + +Some researchers describe it as an "AI inbreeding" problem. There's now a premium on verified human-generated content for training purposes. The irony is real: the more successful AI becomes at generating content, the harder it becomes to train the next generation of AI. + +**Key takeaway for listeners:** AI needs human creativity to function. If we flood the internet with AI-generated content, we risk making future AI systems blander and less capable. Human originality isn't just nice to have -- it's the raw material AI depends on. 
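+A producer's note (not for air): a ten-line toy illustration of the mechanism -- a sketch of the statistical effect, not the Nature paper's actual experiment. Each "generation" fits a simple model to samples drawn from the previous generation's model, and compounding sampling error erodes the spread; the rare tail values vanish first.
+
+```python
+import random
+import statistics
+
+mean, stdev = 0.0, 1.0                      # generation 0: the "human" data
+for generation in range(1, 31):
+    samples = [random.gauss(mean, stdev) for _ in range(20)]
+    mean = statistics.fmean(samples)        # refit on synthetic output...
+    stdev = statistics.stdev(samples)       # ...then sample from the refit
+    if generation % 5 == 0:
+        print(f"gen {generation:2d}: stdev = {stdev:.3f}")
+# The spread drifts toward zero across generations -- each copy of a copy
+# loses a little of the original diversity (run it a few times to see).
+```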
+ +--- + +## Segment 10: "Nobody Knows How It Works" (~4 min) +**Theme:** Even the people who build AI don't fully understand it + +Here's maybe the most unsettling fact about modern AI: the people who build these systems don't fully understand how they work. That's not an exaggeration -- it's the honest assessment from the researchers themselves. + +MIT Technology Review published a piece in January 2026 about a new field of AI research that treats language models like alien organisms. Scientists are essentially performing digital autopsies -- probing, dissecting, and mapping the internal pathways of these systems to figure out what they're actually doing. The article describes them as "machines so vast and complicated that nobody quite understands what they are or how they work." + +A company called Anthropic -- the makers of the Claude AI -- has made breakthroughs in what's called "mechanistic interpretability." They've developed tools that can identify specific features and pathways inside a model, mapping the route from a question to an answer. MIT Technology Review named it one of the top 10 breakthrough technologies of 2026. But even with these tools, we're still in the early stages of understanding. + +Here's the thing that's hard to wrap your head around: nobody programmed these systems to do what they do. Engineers designed the architecture and the training process, but the actual capabilities -- writing poetry, solving math, generating code, having conversations -- emerged on their own as the models grew larger. Some abilities appeared suddenly and unexpectedly at certain scales, which researchers call "emergent abilities." Though even that's debated -- Stanford researchers found that some of these supposed sudden leaps might just be artifacts of how we measure performance. + +Simon Willison, a prominent AI researcher, summarized the state of things at the end of 2025: these systems are "trained to produce the most statistically likely answer, not to assess their own confidence." They don't know what they know. They can't tell you when they're guessing. And we can't always tell from the outside either. + +**Key takeaway for listeners:** AI isn't like traditional software where engineers write rules and the computer follows them. Modern AI is more like a system that organized itself, and we're still figuring out what it built. That should make us both fascinated and cautious. + +--- + +## Segment 11: "AI Can See But Can't Understand" (~3 min) +**Theme:** Multimodal AI -- vision isn't the same as comprehension + +The latest AI models don't just read text -- they can look at images, listen to audio, and watch video. These are called multimodal models, and they seem almost magical when you first use them. Upload a photo and the AI describes it. Show it a chart and it explains the data. Point a camera at a math problem and it solves it. + +But research from Meta, published in Nature, tested 60 of these vision-language models and found a crucial gap: scaling up these models improves their ability to perceive -- to identify objects, read text, recognize faces -- but it doesn't improve their ability to reason about what they see. Even the most advanced models fail at tasks that are trivial for humans, like counting objects in an image or understanding basic physical relationships. + +Show one of these models a photo of a ball on a table near the edge and ask "will the ball fall?" and it struggles. 
Not because it can't see the ball or the table, but because it doesn't understand gravity, momentum, or cause and effect. It can describe what's in the picture. It can't tell you what's going to happen next. + +Researchers describe this as the "symbol grounding problem" -- the AI can match images to words, but those words aren't grounded in real-world experience. A child who's dropped a ball understands what happens when a ball is near an edge. The AI has only seen pictures of balls and read descriptions of falling. + +**Key takeaway for listeners:** AI can see what's in a photo, but it doesn't understand the world the photo represents. Perception and comprehension are very different things. + +--- + +## Suggested Episode Flow + +For a cohesive episode, consider this order: + +1. **Segment 1** (Strawberry) - Fun, accessible opener that hooks the audience +2. **Segment 2** (Math) - Builds on tokenization, deepens understanding +3. **Segment 3** (Hallucination) - The big one; real-world stakes with great stories +4. **Segment 4** (Does AI Think?) - Philosophical turn, audience reflection +5. **Segment 6** (Think Step by Step) - Practical, empowering -- gives listeners something actionable +6. **Segment 5** (Memory) - Quick, surprising facts +7. **Segment 11** (Vision) - Brief palate cleanser +8. **Segment 9** (AI Eats Itself) - Unexpected twist the audience won't see coming +9. **Segment 8** (Agents) - Forward-looking, what's next +10. **Segment 7** (Energy) - The uncomfortable truth to close on +11. **Segment 10** (Nobody Knows) - Perfect closer; leaves audience thinking + +**Estimated total runtime:** 40-45 minutes of content (before intros, outros, and transitions) diff --git a/ai-misconceptions-reading-list.md b/ai-misconceptions-reading-list.md new file mode 100644 index 0000000..b9c74c6 --- /dev/null +++ b/ai-misconceptions-reading-list.md @@ -0,0 +1,94 @@ +# AI/LLM Misconceptions Reading List +## For Radio Show: "Emergent AI Technologies" +**Created:** 2026-02-09 + +--- + +## 1. Tokenization (The "Strawberry" Problem) +- **[Why LLMs Can't Count the R's in 'Strawberry'](https://arbisoft.com/blogs/why-ll-ms-can-t-count-the-r-s-in-strawberry-and-what-it-teaches-us)** - Arbisoft - Clear explainer on how tokenization breaks words into chunks like "st", "raw", "berry" +- **[Can modern LLMs count the b's in "blueberry"?](https://minimaxir.com/2025/08/llm-blueberry/)** - Max Woolf - Shows 2025-2026 models are overcoming this limitation +- **[Signs of Tokenization Awareness in LLMs](https://medium.com/@solidgoldmagikarp/a-breakthrough-feature-signs-of-tokenization-awareness-in-llms-058fe880ef9f)** - Ekaterina Kornilitsina, Medium (Jan 2026) - Modern LLMs developing tokenization awareness + +## 2. Math/Computation Limitations +- **[Why LLMs Are Bad at Math](https://www.reachcapital.com/resources/thought-leadership/why-llms-are-bad-at-math-and-how-they-can-be-better/)** - Reach Capital - LLMs predict plausible text, not compute answers; lack working memory for multi-step calculations +- **[Why AI Struggles with Basic Math](https://www.aei.org/technology-and-innovation/why-ai-struggles-with-basic-math-and-how-thats-changing/)** - AEI - How "87439" gets tokenized inconsistently, breaking positional value +- **[Why LLMs Fail at Math & The Neuro-Symbolic AI Solution](https://www.arsturn.com/blog/why-your-llm-is-bad-at-math-and-how-to-fix-it-with-a-clip-on-symbolic-brain)** - Arsturn - Proposes integrating symbolic computing systems + +## 3. 
Hallucination (Confidently Wrong) +- **[Why language models hallucinate](https://openai.com/index/why-language-models-hallucinate/)** - OpenAI - Trained to guess, penalized for saying "I don't know" +- **[AI hallucinates because it's trained to fake answers](https://www.science.org/content/article/ai-hallucinates-because-it-s-trained-fake-answers-it-doesn-t-know)** - Science (AAAS) - Models use 34% more confident language when WRONG +- **[It's 2026. Why Are LLMs Still Hallucinating?](https://blogs.library.duke.edu/blog/2026/01/05/its-2026-why-are-llms-still-hallucinating/)** - Duke University - "Sounding good far more important than being correct" +- **[AI Hallucination Report 2026](https://www.allaboutai.com/resources/ai-statistics/ai-hallucinations/)** - AllAboutAI - Comprehensive stats on hallucination rates across models + +## 4. Real-World Failures (Great Radio Stories) +- **[California fines lawyer over ChatGPT fabrications](https://calmatters.org/economy/technology/2025/09/chatgpt-lawyer-fine-ai-regulation/)** - $10K fine; 21 of 23 cited cases were fake; 486 documented cases worldwide +- **[As more lawyers fall for AI hallucinations](https://cronkitenews.azpbs.org/2025/10/28/lawyers-ai-hallucinations-chatgpt/)** - Cronkite/PBS - Judges issued hundreds of decisions addressing AI hallucinations in 2025 +- **[The Biggest AI Fails of 2025](https://www.ninetwothree.co/blog/ai-fails)** - Taco Bell AI ordering 18,000 cups of water, Tesla FSD crashes, $440K Australian report with hallucinated sources +- **[26 Biggest AI Controversies](https://www.crescendo.ai/blog/ai-controversies)** - xAI exposing 300K private Grok conversations, McDonald's McHire with password "123456" + +## 5. Anthropomorphism ("AI is Thinking") +- **[Anthropomorphic conversational agents](https://www.pnas.org/doi/10.1073/pnas.2415898122)** - PNAS - 2/3 of Americans think ChatGPT might be conscious; anthropomorphic attributions up 34% in 2025 +- **[Thinking beyond the anthropomorphic paradigm](https://arxiv.org/html/2502.09192v1)** - ArXiv (Feb 2026) - Anthropomorphism hinders accurate understanding +- **[Stop Talking about AI Like It Is Human](https://epic.org/a-new-years-resolution-for-everyone-stop-talking-about-generative-ai-like-it-is-human/)** - EPIC - Why anthropomorphic language is misleading and dangerous + +## 6. The Stochastic Parrot Debate +- **[From Stochastic Parrots to Digital Intelligence](https://wires.onlinelibrary.wiley.com/doi/10.1002/wics.70035)** - Wiley - Evolution of how we view LLMs, recognizing emergent capabilities +- **[LLMs still lag ~40% behind humans on physical concepts](https://arxiv.org/abs/2502.08946)** - ArXiv (Feb 2026) - Supporting the "just pattern matching" view +- **[LLMs are Not Stochastic Parrots](https://medium.com/@freddyayala/llms-are-not-stochastic-parrots-how-large-language-models-actually-work-16c000588b70)** - Counter-argument: GPT-4 scoring 90th percentile on Bar Exam, 93% on MATH Olympiad + +## 7. Emergent Abilities +- **[Emergent Abilities in LLMs: A Survey](https://arxiv.org/abs/2503.05788)** - ArXiv (Mar 2026) - Capabilities arising suddenly and unpredictably at scale +- **[Breaking Myths in LLM scaling](https://www.sciencedirect.com/science/article/pii/S092523122503214X)** - ScienceDirect - Some "emergent" behaviors may be measurement artifacts +- **[Examining Emergent Abilities](https://hai.stanford.edu/news/examining-emergent-abilities-large-language-models)** - Stanford HAI - Smoother metrics show gradual improvements, not sudden leaps + +## 8. 
Context Windows & Memory +- **[Your 1M+ Context Window LLM Is Less Powerful Than You Think](https://towardsdatascience.com/your-1m-context-window-llm-is-less-powerful-than-you-think/)** - Can only track 5-10 variables before degrading to random guessing +- **[Understanding LLM performance degradation](https://demiliani.com/2025/11/02/understanding-llm-performance-degradation-a-deep-dive-into-context-window-limits/)** - Why models "forget" what was said at the beginning of long conversations +- **[LLM Chat History Summarization Guide](https://mem0.ai/blog/llm-chat-history-summarization-guide-2025)** - Mem0 - Practical solutions to memory limitations + +## 9. Prompt Engineering (Why "Think Step by Step" Works) +- **[Understanding Reasoning LLMs](https://magazine.sebastianraschka.com/p/understanding-reasoning-llms)** - Sebastian Raschka, PhD - Chain-of-thought unlocks latent capabilities +- **[The Ultimate Guide to LLM Reasoning](https://kili-technology.com/large-language-models-llms/llm-reasoning-guide)** - CoT more than doubles performance on math problems +- **[Chain-of-Thought Prompting](https://www.promptingguide.ai/techniques/cot)** - Only works with ~100B+ parameter models; smaller models produce worse results + +## 10. Energy/Environmental Costs +- **[Generative AI's Environmental Impact](https://news.mit.edu/2025/explained-generative-ai-environmental-impact-0117)** - MIT - AI data centers projected to rank 5th globally in energy (between Japan and Russia) +- **[We did the math on AI's energy footprint](https://www.technologyreview.com/2025/05/20/1116327/ai-energy-usage-climate-footprint-big-tech/)** - MIT Tech Review - 60% from fossil fuels; shocking water usage stats +- **[AI Environment Statistics 2026](https://www.allaboutai.com/resources/ai-statistics/ai-environment/)** - AllAboutAI - AI draining 731-1,125M cubic meters of water annually + +## 11. Agents vs. Chatbots (The 2026 Shift) +- **[2025 Was Chatbots. 2026 Is Agents.](https://dev.to/inboryn_99399f96579fcd705/2025-was-about-chatbots-2026-is-about-agents-heres-the-difference-426f)** - "Chatbots talk to you, agents do work for you" +- **[AI Agents vs Chatbots: The 2026 Guide](https://technosysblogs.com/ai-agents-vs-chatbots/)** - Generative AI is "read-only", agentic AI is "read-write" +- **[Agentic AI Explained](https://www.synergylabs.co/blog/agentic-ai-explained-from-chatbots-to-autonomous-ai-agents-in-2026)** - Agent market at 45% CAGR vs 23% for chatbots + +## 12. Multimodal AI +- **[Visual cognition in multimodal LLMs](https://www.nature.com/articles/s42256-024-00963-y)** - Nature - Scaling improves perception but not reasoning; even advanced models fail at simple counting +- **[Will multimodal LLMs achieve deep understanding?](https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2025.1683133/full)** - Frontiers - Remain detached from interactive learning +- **[Compare Multimodal AI Models on Visual Reasoning](https://research.aimultiple.com/visual-reasoning/)** - AIMultiple 2026 - Fall short on causal reasoning and intuitive psychology + +## 13. Training vs. 
Learning +- **[5 huge AI misconceptions to drop in 2026](https://www.tomsguide.com/ai/5-huge-ai-misconceptions-to-drop-now-heres-what-you-need-to-know-in-2026)** - Tom's Guide - Bias, accuracy, data privacy myths +- **[AI models collapse when trained on AI-generated data](https://www.nature.com/articles/s41586-024-07566-y)** - Nature - "Model collapse" where rare patterns disappear +- **[The State of LLMs 2025](https://magazine.sebastianraschka.com/p/state-of-llms-2025)** - Sebastian Raschka - "LLMs stopped getting smarter by training and started getting smarter by thinking" + +## 14. How Researchers Study LLMs +- **[Treating LLMs like an alien autopsy](https://www.technologyreview.com/2026/01/12/1129782/ai-large-language-models-biology-alien-autopsy/)** - MIT Tech Review (Jan 2026) - "So vast and complicated that nobody quite understands what they are" +- **[Mechanistic Interpretability: Breakthrough Tech 2026](https://www.technologyreview.com/2026/01/12/1130003/mechanistic-interpretability-ai-research-models-2026-breakthrough-technologies/)** - Anthropic's work opening the black box +- **[2025: The year in LLMs](https://simonwillison.net/2025/Dec/31/the-year-in-llms/)** - Simon Willison - "Trained to produce statistically likely answers, not to assess their own confidence" + +## 15. Podcast Resources +- **[Latent Space Podcast](https://podcasts.apple.com/us/podcast/large-language-model-llm-talk/id1790576136)** - Swyx & Alessio Fanelli - Deep technical coverage +- **[Practical AI](https://podcasts.apple.com/us/podcast/practical-ai-machine-learning-data-science-llm/id1406537385)** - Accessible to general audiences; good "What mattered in 2025" episode +- **[TWIML AI Podcast](https://podcasts.apple.com/us/podcast/the-twiml-ai-podcast-formerly-this-week-in-machine/id1116303051)** - Researcher interviews since 2016 + +--- + +## Top Radio Hooks (Best Audience Engagement) + +1. **Taco Bell AI ordering 18,000 cups of water** - Funny, relatable failure +2. **Lawyers citing 21 fake court cases** - Serious real-world consequences +3. **34% more confident language when wrong** - Counterintuitive and alarming +4. **AI data centers rank 5th globally in energy** (between Japan and Russia) - Shocking scale +5. **2/3 of Americans think ChatGPT might be conscious** - Audience self-reflection moment +6. **"Strawberry" has how many R's?** - Interactive audience participation +7. 
**Million-token context but only tracks 5-10 variables** - "Bigger isn't always better" angle diff --git a/extract_license_plate.py b/extract_license_plate.py new file mode 100644 index 0000000..281794b --- /dev/null +++ b/extract_license_plate.py @@ -0,0 +1,237 @@ +""" +Extract and enhance license plate from Tesla dash cam video +Target: Pickup truck at 25-30 seconds +""" + +import cv2 +import numpy as np +from pathlib import Path +from PIL import Image, ImageEnhance, ImageFilter +import os + +def extract_frames_from_range(video_path, start_time, end_time, fps=10): + """Extract frames from specific time range at given fps""" + cap = cv2.VideoCapture(str(video_path)) + video_fps = cap.get(cv2.CAP_PROP_FPS) + + frames = [] + timestamps = [] + + # Calculate frame numbers for the time range + start_frame = int(start_time * video_fps) + end_frame = int(end_time * video_fps) + frame_interval = int(video_fps / fps) + + print(f"[INFO] Video FPS: {video_fps}") + print(f"[INFO] Extracting frames {start_frame} to {end_frame} every {frame_interval} frames") + + cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame) + current_frame = start_frame + + while current_frame <= end_frame: + ret, frame = cap.read() + if not ret: + break + + if (current_frame - start_frame) % frame_interval == 0: + timestamp = current_frame / video_fps + frames.append(frame) + timestamps.append(timestamp) + print(f"[OK] Extracted frame at {timestamp:.2f}s (frame {current_frame})") + + current_frame += 1 + + cap.release() + return frames, timestamps + +def detect_license_plates(frame): + """Detect potential license plate regions using multiple methods""" + gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) + + # Method 1: Edge detection + contours + edges = cv2.Canny(gray, 50, 200) + contours, _ = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) + + plate_candidates = [] + + for contour in contours: + x, y, w, h = cv2.boundingRect(contour) + aspect_ratio = w / float(h) if h > 0 else 0 + area = w * h + + # License plate characteristics: aspect ratio ~2-5, reasonable size + if 1.5 < aspect_ratio < 6 and 1000 < area < 50000: + plate_candidates.append({ + 'bbox': (x, y, w, h), + 'aspect_ratio': aspect_ratio, + 'area': area, + 'score': area * aspect_ratio # Simple scoring + }) + + # Sort by score and return top candidates + plate_candidates.sort(key=lambda x: x['score'], reverse=True) + return plate_candidates[:10] # Return top 10 candidates + +def enhance_license_plate(plate_img, upscale_factor=6): + """Apply multiple enhancement techniques to license plate image""" + enhanced_versions = [] + + # Convert to PIL for some operations + plate_pil = Image.fromarray(cv2.cvtColor(plate_img, cv2.COLOR_BGR2RGB)) + + # 1. Upscale first + new_size = (plate_pil.width * upscale_factor, plate_pil.height * upscale_factor) + upscaled = plate_pil.resize(new_size, Image.Resampling.LANCZOS) + enhanced_versions.append(("upscaled", upscaled)) + + # 2. Sharpen heavily + sharpened = upscaled.filter(ImageFilter.SHARPEN) + sharpened = sharpened.filter(ImageFilter.SHARPEN) + enhanced_versions.append(("sharpened", sharpened)) + + # 3. High contrast + contrast = ImageEnhance.Contrast(sharpened) + high_contrast = contrast.enhance(2.5) + enhanced_versions.append(("high_contrast", high_contrast)) + + # 4. Brightness adjustment + brightness = ImageEnhance.Brightness(high_contrast) + bright = brightness.enhance(1.3) + enhanced_versions.append(("bright_contrast", bright)) + + # 5. 
Adaptive thresholding (OpenCV) + gray_cv = cv2.cvtColor(np.array(upscaled), cv2.COLOR_RGB2GRAY) + adaptive = cv2.adaptiveThreshold(gray_cv, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, + cv2.THRESH_BINARY, 11, 2) + enhanced_versions.append(("adaptive_thresh", Image.fromarray(adaptive))) + + # 6. Bilateral filter + sharpen + bilateral = cv2.bilateralFilter(np.array(upscaled), 9, 75, 75) + bilateral_pil = Image.fromarray(bilateral) + bilateral_sharp = bilateral_pil.filter(ImageFilter.SHARPEN) + enhanced_versions.append(("bilateral_sharp", bilateral_sharp)) + + # 7. Unsharp mask + unsharp = upscaled.filter(ImageFilter.UnsharpMask(radius=2, percent=200, threshold=3)) + enhanced_versions.append(("unsharp_mask", unsharp)) + + # 8. Extreme sharpening + extreme_sharp = sharpened.filter(ImageFilter.SHARPEN) + extreme_sharp = extreme_sharp.filter(ImageFilter.UnsharpMask(radius=3, percent=250, threshold=2)) + enhanced_versions.append(("extreme_sharp", extreme_sharp)) + + return enhanced_versions + +def main(): + video_path = Path("E:/TeslaCam/SavedClips/2026-02-03_19-48-23/2026-02-03_19-42-36-front.mp4") + output_dir = Path("D:/Scratchpad/pickup_truck_25-30s") + output_dir.mkdir(parents=True, exist_ok=True) + + print(f"[INFO] Processing video: {video_path}") + print(f"[INFO] Output directory: {output_dir}") + + # Extract frames from 25-30 second range at 10 fps + start_time = 25.0 + end_time = 30.0 + target_fps = 10 + + frames, timestamps = extract_frames_from_range(video_path, start_time, end_time, target_fps) + print(f"[OK] Extracted {len(frames)} frames") + + # Process each frame + all_plates = [] + + for idx, (frame, timestamp) in enumerate(zip(frames, timestamps)): + frame_name = f"frame_{timestamp:.2f}s" + + # Save original frame + frame_path = output_dir / f"{frame_name}_original.jpg" + cv2.imwrite(str(frame_path), frame) + + # Detect license plates + plate_candidates = detect_license_plates(frame) + print(f"[INFO] Frame {timestamp:.2f}s: Found {len(plate_candidates)} plate candidates") + + # Process each candidate + for plate_idx, candidate in enumerate(plate_candidates[:5]): # Top 5 candidates + x, y, w, h = candidate['bbox'] + + # Extract plate region with some padding + padding = 10 + x1 = max(0, x - padding) + y1 = max(0, y - padding) + x2 = min(frame.shape[1], x + w + padding) + y2 = min(frame.shape[0], y + h + padding) + + plate_crop = frame[y1:y2, x1:x2] + + if plate_crop.size == 0: + continue + + # Draw bounding box on original frame + frame_with_box = frame.copy() + cv2.rectangle(frame_with_box, (x, y), (x+w, y+h), (0, 255, 0), 2) + cv2.putText(frame_with_box, f"Candidate {plate_idx+1}", (x, y-10), + cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2) + + # Save frame with detection box + detection_path = output_dir / f"{frame_name}_detection_{plate_idx+1}.jpg" + cv2.imwrite(str(detection_path), frame_with_box) + + # Save raw crop + crop_path = output_dir / f"{frame_name}_plate_{plate_idx+1}_raw.jpg" + cv2.imwrite(str(crop_path), plate_crop) + + # Enhance plate + enhanced_versions = enhance_license_plate(plate_crop, upscale_factor=6) + + for enhance_name, enhanced_img in enhanced_versions: + enhance_path = output_dir / f"{frame_name}_plate_{plate_idx+1}_{enhance_name}.jpg" + enhanced_img.save(str(enhance_path)) + + all_plates.append({ + 'timestamp': timestamp, + 'candidate_idx': plate_idx, + 'bbox': (x, y, w, h), + 'aspect_ratio': candidate['aspect_ratio'], + 'area': candidate['area'] + }) + + print(f"[OK] Saved candidate {plate_idx+1} from {timestamp:.2f}s (AR: {candidate['aspect_ratio']:.2f}, 
Area: {candidate['area']})") + + # Create summary + summary_path = output_dir / "summary.txt" + with open(summary_path, 'w') as f: + f.write("License Plate Extraction Summary\n") + f.write("=" * 60 + "\n\n") + f.write(f"Video: {video_path}\n") + f.write(f"Time Range: {start_time}-{end_time} seconds\n") + f.write(f"Frames Extracted: {len(frames)}\n") + f.write(f"Total Plate Candidates: {len(all_plates)}\n\n") + + f.write("Candidates by Frame:\n") + f.write("-" * 60 + "\n") + for plate in all_plates: + f.write(f"Time: {plate['timestamp']:.2f}s | ") + f.write(f"Candidate #{plate['candidate_idx']+1} | ") + f.write(f"Aspect Ratio: {plate['aspect_ratio']:.2f} | ") + f.write(f"Area: {plate['area']}\n") + + f.write("\n" + "=" * 60 + "\n") + f.write("Enhancement Techniques Applied:\n") + f.write("- Upscaled 6x (LANCZOS)\n") + f.write("- Heavy sharpening\n") + f.write("- High contrast boost\n") + f.write("- Brightness adjustment\n") + f.write("- Adaptive thresholding\n") + f.write("- Bilateral filtering\n") + f.write("- Unsharp masking\n") + f.write("- Extreme sharpening\n") + + print(f"\n[SUCCESS] Processing complete!") + print(f"[INFO] Output directory: {output_dir}") + print(f"[INFO] Total plate candidates processed: {len(all_plates)}") + print(f"[INFO] Summary saved to: {summary_path}") + +if __name__ == "__main__": + main() diff --git a/review_best_plates.py b/review_best_plates.py new file mode 100644 index 0000000..2913aff --- /dev/null +++ b/review_best_plates.py @@ -0,0 +1,145 @@ +""" +Identify the best license plate candidates from extraction results +Filter by ideal aspect ratio (2-5) and larger area +""" + +import re +from pathlib import Path + +def parse_summary(summary_path): + """Parse summary.txt to find best candidates""" + candidates = [] + + with open(summary_path, 'r') as f: + content = f.read() + + # Parse each candidate line + pattern = r'Time: ([\d.]+)s \| Candidate #(\d+) \| Aspect Ratio: ([\d.]+) \| Area: (\d+)' + + for match in re.finditer(pattern, content): + timestamp = float(match.group(1)) + candidate_num = int(match.group(2)) + aspect_ratio = float(match.group(3)) + area = int(match.group(4)) + + # Score candidates based on ideal license plate characteristics + # Ideal aspect ratio: 3-4.5 (most US license plates) + # Prefer larger areas (closer to camera) + ar_score = 0 + if 2.5 <= aspect_ratio <= 5.0: + # Best score for aspect ratio between 3-4.5 + if 3.0 <= aspect_ratio <= 4.5: + ar_score = 100 + else: + ar_score = 50 + + # Area score (normalize to 0-100) + area_score = min(area / 500, 100) # Scale area + + # Combined score + total_score = (ar_score * 0.6) + (area_score * 0.4) + + candidates.append({ + 'timestamp': timestamp, + 'candidate': candidate_num, + 'aspect_ratio': aspect_ratio, + 'area': area, + 'score': total_score + }) + + return candidates + +def main(): + summary_path = Path("D:/Scratchpad/pickup_truck_25-30s/summary.txt") + output_dir = Path("D:/Scratchpad/pickup_truck_25-30s") + + print("[INFO] Analyzing license plate candidates...") + candidates = parse_summary(summary_path) + + # Sort by score + candidates.sort(key=lambda x: x['score'], reverse=True) + + # Show top 20 candidates + print("\n" + "=" * 80) + print("TOP 20 LICENSE PLATE CANDIDATES") + print("=" * 80) + print(f"{'Rank':<6} {'Time':<10} {'Cand':<6} {'AR':<8} {'Area':<10} {'Score':<8} {'Files'}") + print("-" * 80) + + for idx, candidate in enumerate(candidates[:20], 1): + timestamp = candidate['timestamp'] + cand_num = candidate['candidate'] + ar = candidate['aspect_ratio'] + area = 
candidate['area'] + score = candidate['score'] + + # Check which files exist for this candidate + frame_name = f"frame_{timestamp:.2f}s" + base_pattern = f"{frame_name}_plate_{cand_num}_" + + # Count enhancement files + enhancement_files = list(output_dir.glob(f"{base_pattern}*.jpg")) + enhancement_count = len([f for f in enhancement_files if '_raw' not in f.name]) + + print(f"{idx:<6} {timestamp:<10.2f} {cand_num:<6} {ar:<8.2f} {area:<10} {score:<8.1f} {enhancement_count} enhanced") + + # Create recommendation file + recommendation_path = output_dir / "RECOMMENDATIONS.txt" + with open(recommendation_path, 'w') as f: + f.write("LICENSE PLATE EXTRACTION - TOP CANDIDATES\n") + f.write("=" * 80 + "\n\n") + f.write("These are the top 20 most likely license plate candidates based on:\n") + f.write("- Aspect ratio (ideal: 3.0-4.5 for US plates)\n") + f.write("- Area size (larger = closer to camera)\n\n") + f.write("REVIEW THESE FILES FIRST:\n") + f.write("-" * 80 + "\n\n") + + for idx, candidate in enumerate(candidates[:20], 1): + timestamp = candidate['timestamp'] + cand_num = candidate['candidate'] + ar = candidate['aspect_ratio'] + area = candidate['area'] + score = candidate['score'] + + f.write(f"RANK {idx}: Time {timestamp:.2f}s - Candidate #{cand_num}\n") + f.write(f" Aspect Ratio: {ar:.2f} | Area: {area} | Score: {score:.1f}\n") + f.write(f" Files to review:\n") + + frame_name = f"frame_{timestamp:.2f}s" + + # List specific enhancement files to check + enhancements = [ + f"{frame_name}_detection_{cand_num}.jpg (shows detection box on frame)", + f"{frame_name}_plate_{cand_num}_high_contrast.jpg (best for dark plates)", + f"{frame_name}_plate_{cand_num}_extreme_sharp.jpg (best for clarity)", + f"{frame_name}_plate_{cand_num}_adaptive_thresh.jpg (best for OCR)", + f"{frame_name}_plate_{cand_num}_bilateral_sharp.jpg (balanced enhancement)", + ] + + for enhancement in enhancements: + f.write(f" - {enhancement}\n") + + f.write("\n") + + f.write("\n" + "=" * 80 + "\n") + f.write("ENHANCEMENT TYPES EXPLAINED:\n") + f.write("-" * 80 + "\n") + f.write("- detection_X.jpg: Shows where the plate was detected on the frame\n") + f.write("- high_contrast.jpg: Best for dark/low-contrast plates\n") + f.write("- extreme_sharp.jpg: Best for overall clarity and readability\n") + f.write("- adaptive_thresh.jpg: Black/white threshold - best for OCR\n") + f.write("- bilateral_sharp.jpg: Noise reduction + sharpening\n") + f.write("- unsharp_mask.jpg: Professional-grade sharpening\n") + f.write("- bright_contrast.jpg: Brightness + contrast boost\n") + + print("\n[SUCCESS] Analysis complete!") + print(f"[INFO] Recommendations saved to: {recommendation_path}") + print("\n[NEXT STEPS]") + print("1. Open the output directory in File Explorer:") + print(f" {output_dir}") + print("2. Read RECOMMENDATIONS.txt for the best candidates") + print("3. Start with Rank 1, review the enhancement files listed") + print("4. The 'extreme_sharp' and 'adaptive_thresh' versions usually work best") + +if __name__ == "__main__": + main()