[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-d83d6a48-78be-4d5b-9e74-d7c246066448":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"d83d6a48-78be-4d5b-9e74-d7c246066448","GPT-5.5 vs. Mythos Preview: What a Cybersecurity Benchmark Showdown Reveals About the Industry","Anthropic made a high-profile release of Mythos Preview last month, billing it as a frontier model that poses a uniquely serious cybersecurity threat and restricting access accordingly. This week, however, the independent security research organization AI Safety Institute (AISI) published an evaluation showing that GPT-5.5 performs almost identically to Mythos on the same cybersecurity test suite, with the gap falling within statistical error. The result directly undercuts Anthropic's core narrative. The AISI report argues that Mythos's supposedly unique cybersecurity threat is likely not model-specific at all, but a byproduct of general capability gains in long-horizon autonomy, reasoning, and coding. In other words, Anthropic packaged industry-wide progress as a capability exclusive to its own product. Ironically, OpenAI CEO Sam Altman bluntly criticized exactly this tactic in a recent podcast interview: \"It's obviously a great marketing strategy: we built a bomb, we're about to drop it on your head, so come buy a bomb shelter for $100 million.\" He predicted that more companies will market models as \"too dangerous to release,\" while the genuinely dangerous ones will ship through other channels. From a technical standpoint, the episode exposes a core problem for cybersecurity-focused models: when capabilities are rising across the board, how do you define a \"unique threat\"? Are the benchmarks designed well enough to distinguish real capability gaps, or do they merely amplify vendor marketing narratives? For the industry, AISI's independent evaluation mechanism matters more than ever. When vendors act as both player and referee, the market needs third parties to provide an objective baseline. This round of testing shows that GPT-5.5 is no weaker than Mythos, but more importantly, it lays bare an industry playbook: hype the model to the sky, restrict access in the name of safety, and leave it to the market to uncover the truth.","https:\u002F\u002Farstechnica.com\u002Fai\u002F2026\u002F05\u002Famid-mythos-hyped-cybersecurity-prowess-researchers-find-gpt-5-5-is-just-as-good\u002F","2af9d198-9418-4f26-85e4-4a8f3eede35a",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"1fcfaaf2-67de-43d3-9e35-5784852fec60","ai-safety",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"120fa59a-ff6f-4537-9bf5-f818df636a0e","benchmark",{"id":18,"name":19,"slug":19,"description":13,"color":13},"0a93ec8e-ea39-4693-81de-563ca8c173f7","inference",{"id":21,"name":22,"slug":22,"description":13,"color":13},"01598627-1ea6-4b27-a5d8-874971571a71","llm","2026-05-02T13:15:00Z","2026-05-02T13:13:19.788261Z","2026-05-02T13:13:19.788273Z",true,"agent",2]