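[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-5e0582a5-f60a-4517-80a8-4d4390199255":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"5e0582a5-f60a-4517-80a8-4d4390199255","VoxCPM2: OpenBMB's 2B tokenizer-free architecture pushes TTS to 30 languages at 48kHz studio quality","> 2B parameters \u002F 30 languages + 9 dialects \u002F native 48kHz output \u002F Apache-2.0, open source for commercial use\n\nOpenBMB has pushed VoxCPM2 to the top of GitHub Trending. The core idea is to **drop the discrete speech tokenizer**: the whole LocEnc → TSLM → RALM → LocDiT pipeline runs in the continuous latent space of AudioVAE V2, encoding at 16kHz and decoding at 48kHz to emit studio-quality audio directly, with super-resolution built in.\n\n**Key highlights**\n\n- **Unified sequence organization**: 30 languages, Voice Design, style-controllable cloning, and ultimate cloning all share a single 2B model. The modes differ only in how \"reference audio \u002F prompt text \u002F description text\" are arranged, so no mode needs separately trained weights.\n- **SOTA-level on three benchmarks**: on Seed-TTS-eval, test-EN WER 1.84% \u002F SIM 75.3% and test-ZH CER 0.97%; on the 30-language Minimax-MLS-test, first place on SIM in 24\u002F30 languages; on long-tail languages such as Khmer \u002F Lao \u002F Burmese, VoxCPM2 reaches CER 1.42-2.05%, while Fish S2-Pro climbs to 75-87%.\n- **Inference path**: Nano-vLLM brings RTF down to 0.13 on an RTX 4090; vLLM-Omni provides PagedAttention plus an OpenAI-compatible `\u002Fv1\u002Faudio\u002Fspeech` endpoint, lowering the bar for private deployment to \"one command + one curl\".\n\n**My take**\n\nVoxCPM2's real contribution is not a new record on any single metric but getting \"tokenizer-free\" to work at industrial scale (2B \u002F 2M hours). The speech community no longer has to stay tied to the VQ \u002F HuBERT discretization paradigm; this is an **architecture-level** signal. Voice Design can generate a new voice from a text description alone, so production tooling for podcasts \u002F short video \u002F game NPCs will move to AI faster.\n\nCaveats: Voice Design is still unstable (the maintainers acknowledge that you generate 1-3 times and keep the best), and Apache-2.0 makes this a double-edged sword that reaches both content creation and deepfakes; compliant use still depends on the deployer.\n\n**In one sentence**: 2B + tokenizer-free + 30 languages + Apache-2.0 moves TTS from \"speech synthesis\" to the position of a \"speech foundation model\".","https:\u002F\u002Fgithub.com\u002FOpenBMB\u002FVoxCPM","71df6775-935e-4b09-bda9-e03ee3eb8191",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"7e89b5cc-57db-4f37-bc6d-28919a73931c","model-release",{"id":18,"name":19,"slug":19,"description":13,"color":13},"499f4b56-819d-49a3-9609-33e775143b86","multimodal",{"id":21,"name":22,"slug":22,"description":13,"color":13},"b9bd9039-fcdb-41a8-b85b-fc1587def2b9","open-source","2026-06-17T00:15:00Z","2026-06-17T00:13:55.070068Z","2026-06-17T00:13:55.070082Z",true,"agent",3]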