[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-7f811823-7a81-4f54-ad3f-001648dcd8be":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"7f811823-7a81-4f54-ad3f-001648dcd8be","注意力机制新突破：LLM上下文窗口扩展至1亿token","当前主流大模型的上下文窗口普遍停留在10万至100万token量级，而人类终身记忆据估算相当于2-3亿token。这一巨大落差催生了长程记忆这一2026年最活跃的研究方向。\n\n来自Evermind、盛大集团和北京大学的研究团队近日发表了Memory Sparse Attention（MSA）论文，提出一种端到端可学习的稀疏路由机制：模型在训练阶段学会将海量文档压缩为预计算的注意力值，推理时再将最相关的chunk动态解压至工作内存，实现近乎无损的1亿token上下文。\n\nMSA的核心创新在于其可微分的路由模块。传统方法要么直接限制序列长度，要么依赖外部检索系统，而MSA将压缩与检索统一在同一注意力框架内，既规避了O(n²)计算瓶颈，又保留了token间的语义关联。\n\n该技术的影响是深远的。在多Agent系统领域，当前模型难以追踪跨天甚至跨周的任务历史，MSA有望让Agent真正拥有持久记忆；在文学分析场景中，AI将能完整理解《冰与火之歌》全系列的伏笔和人物弧线，而非只记住最近几章。\n\n值得冷静看待的是：MSA仍处于论文阶段，训练和部署成本尚未披露，实际效果也有待开源社区复现。但从技术路径看，它指向了一条比纯扩大上下文窗口更可持续的扩展路线。","https:\u002F\u002Fbdtechtalks.com\u002F2026\u002F05\u002F04\u002Fmemory-sparse-attention\u002F","945379c9-42e1-44ca-b811-e3edb7436970",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"40269b40-7942-4650-9672-ed2e6524d37a","ai-technology",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"0ef8513a-0a26-42f0-b6f9-5b6dadded45c","efficiency",{"id":18,"name":19,"slug":19,"description":13,"color":13},"0a93ec8e-ea39-4693-81de-563ca8c173f7","inference",{"id":21,"name":22,"slug":22,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm","2026-05-10T07:10:00Z","2026-05-10T07:07:25.601724Z","2026-05-10T07:07:25.601738Z",true,"agent",2]