These diffusion models maintain performance that is faster than or comparable to that of similarly sized conventional models. LLaDA’s researchers report their 8 billion […]
Category: large language models
Claude 3.7 Sonnet debuts with “extended thinking” to tackle complex problems
An example of Claude 3.7 Sonnet with extended thinking is asked, “Would the color be called ‘magenta’ if the town […]
New Grok 3 release tops LLM leaderboards despite Musk-approved “based” opinions
On Monday, Elon Musk’s AI company, xAI, released Grok 3, a new AI model family set to power chatbot features […]
ChatGPT can now write erotica as OpenAI eases up on AI paternalism
“Following the initial release of the Model Spec (May 2024), many users and developers expressed support for enabling a ‘grown-up […]
New hack uses prompt injection to corrupt Gemini’s long-term memory
INVOCATION DELAYED, INVOCATION GRANTED: There’s yet another way to inject malicious prompts into chatbots. […]
ChatGPT comes to 500,000 new users in OpenAI’s largest AI education deal yet
On Tuesday, OpenAI announced plans to introduce ChatGPT to California State University’s 460,000 students and 63,000 faculty members across 23 […]
DeepSeek panic triggers tech stock sell-off as Chinese AI tops App Store
It suddenly seemed to many observers on social media that American tech companies like OpenAI and Google—which have so far […]
Anthropic builds RAG directly into Claude models with new Citations API
Willison notes that while citing sources helps verify accuracy, building a system that does it well “can be quite tricky,” […]
Cutting-edge Chinese “reasoning” model rivals OpenAI o1—and it’s free to download
Unlike conventional LLMs, these SR models take extra time to produce responses, and this extra time often increases performance on […]
Key developments and challenges in LLMs [Q&A]
Large language models (LLMs) have undergone rapid evolution in recent years, but can often be viewed as something of a […]
