Open-weight models, including Nvidia’s Nemotron and Alibaba’s Qwen, showed strong results comparable to Anthropic’s best models. GPT-5.4—the best-performing model from […]
Category: LLMs
LLMs believe false statements even after explicit warnings that they’re false
Do Androids dream of Ed Sheeran winning gold? Credit: Mayne et al […]
Anthropic’s Claude Managed Agents can now “dream,” sort of
SAN FRANCISCO—At its Code with Claude developers’ conference, Anthropic has introduced what it calls “dreaming” to Claude Managed Agents. Dreaming, […]
To teach in the time of ChatGPT is to know pain
Haven’t stopped worrying, don’t love the bomb. LLM use is the most demoralizing problem I’ve faced as a college instructor. […]
AI models are terrible at betting on soccer—especially xAI Grok
“Every frontier model we evaluated lost money over the season and many experienced ruin,” the authors of the paper concluded, […]
Kagi Translate’s AI answers the question “What would horny Margaret Thatcher say?”
If you’ve been using the Internet for any length of time, you’ve probably used a tool like Google Translate to […]
LLMs can unmask pseudonymous users at scale with surprising accuracy
Recall at various precision thresholds. In a third experiment, the researchers took 5,000 users from […]
Microsoft removes guide on how to train LLMs on pirated Harry Potter books
Wizarding world of AI slop. The now-deleted Harry Potter data set was “mistakenly” marked public domain. Following backlash in a […]
Attackers prompted Gemini over 100,000 times while trying to clone it, Google says
Adventures in copy protection. Distillation technique lets copycats mimic Gemini at a fraction of the development cost. […]
Overrun with AI slop, cURL scraps bug bounties to ensure “intact mental health”
The project developer for one of the Internet’s most popular networking tools is scrapping its vulnerability reward program after being […]
