Thursday, March 19, 2026
The TechBriefs

Category: Audio Language Model

IBM AI Releases Granite 4.0 1B Speech as a Compact Multilingual Speech Model for Edge AI and Translation Pipelines
  • agentic AI
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Staff
  • Technology
  • TTS
  • Voice AI

IBM has released Granite 4.0 1B Speech, a compact speech-language model designed for multilingual automatic speech recognition (ASR) and bidirectional […]

Beyond Simple API Requests: How OpenAI’s WebSocket Mode Changes the Game for Low Latency Voice Powered AI Experiences
  • agentic AI
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Staff
  • Technology
  • Voice AI

In the world of Generative AI, latency is the ultimate killer of immersion. Until recently, building a voice-enabled AI agent […]

Google DeepMind Releases Lyria 3: An Advanced Music Generation AI Model that Turns Photos and Text into Custom Tracks with Included Lyrics and Vocals
  • agentic AI
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Staff
  • Technology
  • Text to Audio
  • TTS
  • Voice AI

Google DeepMind is pushing the boundaries of generative AI again. This time, the focus is not on text or images. […]

Cohere Releases Tiny Aya: A 3B-Parameter Small Language Model that Supports 70 Languages and Runs Locally Even on a Phone
  • agentic AI
  • AI
  • AI Agents
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Staff
  • Technology
  • TTS
  • Voice AI

Cohere AI Labs has released Tiny Aya, a family of small language models (SLMs) that redefines multilingual performance. While many […]

Meet ‘Kani-TTS-2’: A 400M Param Open Source Text-to-Speech Model that Runs in 3GB VRAM with Voice Cloning Support
  • agentic AI
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Staff
  • Tech News
  • Technology
  • TTS
  • Voice AI

The landscape of generative audio is shifting toward efficiency. A new open-source contender, Kani-TTS-2, has been released by the team […]

Kyutai Releases Hibiki-Zero: A 3B Parameter Simultaneous Speech-to-Speech Translation Model Using GRPO Reinforcement Learning Without Any Word-Level Aligned Data
  • agentic AI
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Open Source
  • Staff
  • Technology
  • Voice AI

Kyutai has released Hibiki-Zero, a new model for simultaneous speech-to-speech translation (S2ST) and speech-to-text translation (S2TT). The system translates source […]

Mistral AI Launches Voxtral Transcribe 2: Pairing Batch Diarization And Open Realtime ASR For Multilingual Production Workloads At Scale
  • AI
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Staff
  • Technology
  • TTS
  • Voice AI

Automatic speech recognition (ASR) is becoming a core building block for AI products, from meeting tools to voice agents. Mistral’s […]

Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control
  • AI
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • Large Language Model
  • New Releases
  • Sound
  • Staff
  • Tech News
  • Technology
  • TTS
  • Voice AI

Alibaba Cloud’s Qwen team has open-sourced Qwen3-TTS, a family of multilingual text-to-speech models that target three core tasks in one […]

Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass
  • agentic AI
  • AI
  • AI Agents
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • New Releases
  • Sound
  • Staff
  • Technology
  • Voice AI

Microsoft has released VibeVoice-ASR as part of the VibeVoice family of open source frontier voice AI models. VibeVoice-ASR is described […]

FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning
  • agentic AI
  • AI
  • AI Agents
  • AI Paper Summary
  • AI Shorts
  • Applications
  • Artificial Intelligence
  • Audio Language Model
  • Editors Pick
  • Language Model
  • Large Language Model
  • Machine Learning
  • New Releases
  • Open Source
  • Software Engineering
  • Sound
  • Staff
  • Tech News
  • Technology

Chroma 1.0 is a real time speech to speech dialogue model that takes audio as input and returns audio as […]
