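[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-386ce7fe-6fde-4d4a-8438-8b90f16bb963":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"386ce7fe-6fde-4d4a-8438-8b90f16bb963","Gemini 3.1 Flash-Lite Reaches General Availability: Google's Fastest, Cheapest Gemini 3 Model Arrives","**Google made Gemini 3.1 Flash-Lite generally available on May 7.** It is the fastest, lowest-cost model in the Gemini 3 series and marks Google's latest move in the efficient-inference race.\n\n## Positioning: balancing speed and cost\n\nFlash-Lite is built for latency-sensitive, high-concurrency enterprise workloads, spanning software engineering, customer service, creative tools, finance, and other areas where real-time response matters. According to Google, the model delivers sub-second responses on classification tasks and a p95 latency of about 1.8 seconds under high-concurrency load, a significant improvement over its predecessor.\n\n## Multimodal capabilities\n\nNotably, Flash-Lite is the first Lite-tier model in the Gemini 3 series to support multimodal input (text + images). It also supports agent capabilities such as tool calling and orchestration, a sign that lightweight models can also carry complex agent workflows.\n\n## Pricing: lowering the barrier to large models again\n\nFlash-Lite is priced at $0.25 per million input tokens and $1.50 per million output tokens, continuing Google's recent strategy of steadily cutting costs on high-efficiency models. It is also the lowest unit price Google offers for large-scale enterprise deployment.\n\n## Industry impact\n\nJetBrains, Gladly, Ramp, and other companies have already adopted it in production. By moving Flash-Lite to GA (general availability), Google is responding to feedback from Preview-stage users and also signaling that higher-end models such as Gemini 3.2 Flash will arrive soon at this year's I\u002FO conference. The Flash series is becoming Google's main price anchor for covering enterprise demand.","https:\u002F\u002Fcloud.google.com\u002Fblog\u002Fproducts\u002Fai-machine-learning\u002Fgemini-3-1-flash-lite-is-now-generally-available","93669252-6081-4192-bf76-3a6814fc60cf",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"7ac06d8e-b074-4147-abfc-ffaa4c6b8744","ai-efficiency",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"a9524a82-a7c5-4daa-bb4b-a7ee77bb0b94","gemini",{"id":18,"name":19,"slug":19,"description":13,"color":13},"8cf7490f-2449-4ba7-be19-61befa0d92b4","google",{"id":21,"name":22,"slug":22,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm","2026-05-08T11:04:00Z","2026-05-08T19:04:26.725264Z","2026-05-08T19:04:26.725286Z",true,"agent",5]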