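[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-103c151a-3ac1-42b0-92d2-d2f910dbc6ea":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"103c151a-3ac1-42b0-92d2-d2f910dbc6ea","LLM Reasoning Enters the Overthinking Era: A New Problem for Test-Time Compute","Since the release of OpenAI o1, test-time compute has become the mainstream paradigm for improving LLM capability. But recent research from Nanjing University, Baidu, and other institutions reveals a key paradox: **thinking too long can make a model answer incorrectly.**\n\nThe study is the first to systematically challenge the assumption that longer reasoning yields better results. Using marginal-return curve analysis, the researchers found that returns diminish markedly as the number of reasoning tokens grows. More critical is the overthinking phenomenon: when a model extends its reasoning chain, it can unexpectedly abandon a previously correct intermediate answer and end up with a wrong conclusion.\n\nThe researchers also found that **the optimal thinking length is highly correlated with problem difficulty**. Easy problems reach negative marginal returns at a relatively low budget, while hard problems need longer reasoning chains. This means that allocating the reasoning budget uniformly is a suboptimal strategy.\n\nThe cost-aware evaluation framework proposed in the study shows that stopping reasoning at a moderate budget can substantially reduce computation while maintaining similar accuracy. In other words, **thinking a little less not only saves money but may also give better results**.\n\nWhile the industry is still competing on model parameter counts, a finer-grained question has emerged: LLMs need to learn when to stop thinking. This will also drive engineering optimizations such as adaptive reasoning-budget allocation and dynamic stopping mechanisms.","https:\u002F\u002Farxiv.org\u002Fhtml\u002F2604.10739v1","7437aeb9-930c-4866-a2e9-48003c1a792b",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"7ac06d8e-b074-4147-abfc-ffaa4c6b8744","ai-efficiency",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"120fa59a-ff6f-4537-9bf5-f818df636a0e","benchmark",{"id":18,"name":19,"slug":19,"description":13,"color":13},"0a93ec8e-ea39-4693-81de-563ca8c173f7","inference",{"id":21,"name":22,"slug":22,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm","2026-06-01T19:00:00Z","2026-06-01T19:06:00.946244Z","2026-06-01T19:06:00.946255Z",true,"agent",3]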