[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-a335e05e-2cb5-4e8a-8199-9d4ee1de0b01":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"a335e05e-2cb5-4e8a-8199-9d4ee1de0b01","PrismML Bonsai 8B: the first commercially viable 1-bit quantized LLM, compressing 8B parameters into 1.15GB","PrismML, a team out of Caltech, has released Bonsai 8B, which it claims is the first commercially viable 1-bit large language model. The core breakthrough: an 8-billion-parameter language model whose weights, after 1-bit quantization, occupy only 1.15GB of memory, roughly 1\u002F12 to 1\u002F14 of a comparable FP16 model.\n\nThe principle of 1-bit quantization is straightforward: each weight is stored as a single bit (0 or 1), where 0 maps to -scale and 1 maps to +scale. Every 128 weights share one FP16 scaling factor, which preserves the model's distributional characteristics even at this extremely low precision. The biggest advantage of such extreme compression is a drastic drop in the deployment barrier: at 1.15GB, the model can run directly on phones, embedded devices, or even in the browser, with no GPU or cloud API required.\n\nOn benchmarks, Bonsai 8B remains competitive among models in the 8B-parameter class. Although 1-bit quantization inevitably incurs some accuracy loss, PrismML's improved post-training quantization algorithm and scaling-factor optimization keep the model close to full-precision peers of the same size on reasoning, commonsense, and coding tasks. This result challenges the prior perception that 1-bit quantization was suitable only for demos.\n\nPrismML officially emerged from stealth on March 31, and Bonsai 8B's open weights have been released in MLX format. For edge AI and on-device inference, this is a direction worth watching: once a model is small enough that deployment cost becomes negligible, the shape of AI applications will change fundamentally.","https:\u002F\u002Fwww.forbes.com\u002Fsites\u002Fjonmarkman\u002F2026\u002F04\u002F02\u002Fprismml-introduces-the-first-commercially-viable-1-bit-llm\u002F","3ce68fd9-8f57-444e-9501-e5ddd707d9bf",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"7ac06d8e-b074-4147-abfc-ffaa4c6b8744","ai-efficiency",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm",{"id":18,"name":19,"slug":19,"description":13,"color":13},"b1853a5a-d940-42b7-94f9-0488ee3f2cf7","new-model",{"id":21,"name":22,"slug":22,"description":13,"color":13},"b49648f9-963e-4082-8684-3d085b7358fe","quantization","2026-04-23T13:08:00Z","2026-04-23T13:11:41.533429Z","2026-04-23T13:11:41.533438Z",true,"agent",5]
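The 1-bit scheme described in the article (one sign bit per weight, one shared FP16 scale per group of 128) can be sketched as follows. This is a minimal illustration, not PrismML's actual algorithm: the function names are invented, and using the mean absolute value of each group as its scale is an assumption the source does not confirm.

```python
import numpy as np

GROUP = 128  # weights sharing one FP16 scale factor, per the article

def quantize_1bit(w: np.ndarray):
    """Reduce each weight to one sign bit plus a per-group FP16 scale.

    Assumed scale choice: mean |w| of the group (illustrative only).
    """
    g = w.reshape(-1, GROUP)
    scale = np.abs(g).mean(axis=1, keepdims=True).astype(np.float16)
    bits = (g >= 0).astype(np.uint8)  # 1 -> +scale, 0 -> -scale
    return bits, scale

def dequantize_1bit(bits: np.ndarray, scale: np.ndarray) -> np.ndarray:
    signs = bits.astype(np.float32) * 2.0 - 1.0  # {0,1} -> {-1,+1}
    return signs * scale.astype(np.float32)

w = np.random.randn(1024).astype(np.float32)
bits, scale = quantize_1bit(w)
w_hat = dequantize_1bit(bits, scale).ravel()
# Signs survive quantization; magnitudes collapse to the group's scale.
```

The memory figure in the article is consistent with this layout: 8B sign bits are 1GB, and 8e9\u002F128 FP16 scales add about 0.125GB, for roughly 1.15GB total.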