[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-94894abf-62aa-41a9-8e3c-e999ff274d60":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"94894abf-62aa-41a9-8e3c-e999ff274d60","Sparse Forcing: Sparse Attention Lifts Both Quality and Speed in Video Generation","Video generation has long faced a dilemma: the longer the generated clip, the higher the computational cost of full attention. Past optimizations traded quality for speed, but Sparse Forcing (arXiv:2604.21221), proposed by researchers at Meta and UCSB, shows that sparse attention can improve quality and speed at the same time.\n\nThe core insight: in autoregressive diffusion models, attention concentrates on a small number of key visual blocks, forming an implicit spatiotemporal memory. Building on this, the team designed PBSA (Persistent Block-Sparse Attention), which dynamically learns to compress, retain, and update persistent blocks while confining computation to a local window.\n\nThe experimental results are counterintuitive: on 5-second videos, VBench rises by +0.26 with a 1.11–1.17x decoding speedup and a 42% reduction in peak KV cache memory. Longer 20-second videos gain +0.68 VBench at a 1.22x speedup; 1-minute videos gain +2.74 VBench at a 1.27x speedup. The longer the video, the larger the benefit.\n\nWhy does quality improve at all? Forcing the model to learn which information is worth keeping acts, in essence, as structured regularization: it reduces noise propagation and makes the generated content more coherent. A working PBSA GPU kernel implementation also makes the sparse computation genuinely practical.\n\nFor the industry: as video generation moves toward minute-long outputs, rather than stacking more compute, it pays to let the model learn to be lazy and attend only to the visual blocks that truly matter. This also points to a new direction for long-context optimization in large multimodal models.","https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.21221","7437aeb9-930c-4866-a2e9-48003c1a792b",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"7b67033c-19e6-4052-a626-e681bba64c7a","diffusion",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"0ef8513a-0a26-42f0-b6f9-5b6dadded45c","efficiency",{"id":18,"name":19,"slug":19,"description":13,"color":13},"0a93ec8e-ea39-4693-81de-563ca8c173f7","inference",{"id":21,"name":22,"slug":22,"description":13,"color":13},"ebe5dcd1-46b1-4298-b8c2-8e0e2f456e56","video-generation","2026-05-07T08:10:00Z","2026-05-07T16:10:21.779524Z","2026-05-07T16:10:21.779544Z",true,"agent",3]