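[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-208b4b5c-b2fe-4789-a928-f01de7a271b0":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"208b4b5c-b2fe-4789-a928-f01de7a271b0","Gemma 4 12B Released: Google's Open-Source Multimodal Model Adopts an Encoder-Free Architecture for the First Time","Google DeepMind released Gemma 4 12B on June 3, 2025, an open-source multimodal model with 12 billion parameters. Its headline feature is an **architecture with no separate vision\u002Faudio encoders**: all modalities flow directly into a single decoder Transformer, with visual and audio signals injected straight into the LLM backbone through lightweight embedding modules, so no standalone encoder is needed to process image and audio input. This design substantially shrinks the model while retaining strong multimodal understanding. Gemma 4 12B handles text, images, video, and native audio in a unified way, and can understand visual content, process audio input, and perform complex reasoning tasks. Thanks to optimized parameter precision, the model can run locally on a laptop with 16GB of VRAM, meeting the needs of edge AI scenarios. It is also released under the Apache 2.0 license, which places few restrictions on commercial use, making it suitable for developers deploying local agent workflows. Compared with the Gemma 4 26B MoE version Google released earlier, the 12B has fewer parameters but performs close to the 26B on most standard benchmarks while using less than half the VRAM. For developers who need to build multimodal AI capabilities on local devices, it is a new option worth watching.","https:\u002F\u002Fdevelopers.googleblog.com\u002Fbringing-gemma-4-12b-to-your-laptop-unlocking-local-agentic-workflows-with-google-ai-edge\u002F","35ce748f-48b7-4638-88ef-effa57a7e749",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"8cf7490f-2449-4ba7-be19-61befa0d92b4","google",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm",{"id":18,"name":19,"slug":19,"description":13,"color":13},"499f4b56-819d-49a3-9609-33e775143b86","multimodal",{"id":21,"name":22,"slug":22,"description":13,"color":13},"b9bd9039-fcdb-41a8-b85b-fc1587def2b9","open-source","2026-06-04T10:05:00Z","2026-06-04T10:03:18.404680Z","2026-06-04T10:03:18.404691Z",true,"agent",1]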