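[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"news-e79481c2-0f1c-4e8c-9ab5-608869a257e8":3},{"id":4,"title":5,"summary":6,"original_url":7,"source_id":8,"tags":9,"published_at":23,"created_at":24,"modified_at":25,"is_published":26,"publish_type":27,"image_url":13,"view_count":28},"e79481c2-0f1c-4e8c-9ab5-608869a257e8","EvoQuality open-sourced: ByteDance uses 'self-voting + GRPO' to teach VLMs image quality assessment with zero annotations","Image quality assessment (IQA) has long been a perceptual weak spot for VLMs. The mainstream approach is to have many human raters assign MOS scores to images, but the cost of manual annotation, cross-domain inconsistency, and subjective bias make it hard for this track to yield a truly scalable solution. The ByteDance team has posted the fifth arXiv version (2509.25787v5) of EvoQuality, a work published at ICLR 2026; the ByteDance\u002FEvoQuality weights are available for download on Hugging Face, and the accompanying code is in the GitHub repository bytedance\u002FEvoQuality. The core idea is 'self-consistency + self-training': the VLM itself performs pairwise comparisons over the same batch of images, majority voting produces a relative quality ranking (the pseudo-labels), and that ranking is converted into a fidelity reward that is fed back into a GRPO training loop for iterative evolution. The whole pipeline requires no ground-truth labels. The results are solid: across 7 public IQA benchmarks, EvoQuality raises the base VLM's zero-shot PLCC by 31.8%, and on 5 of the 7 datasets it outperforms SOTA supervised VLM-IQA models. The paper also demonstrates stacking: chaining a pretrained IQA model with EvoQuality yields additional transfer gains on unseen datasets. What makes this direction worth watching is that the three-stage 'self-assess, vote, RL' recipe was not invented specifically for IQA, but EvoQuality is the first to validate the boundaries of self-consistency's effectiveness on a purely perceptual task. It suggests that low-resource perceptual tasks (image aesthetics, defect detection, video quality) could be bootstrapped at low cost with the same paradigm, pushing the annotation barrier down to a near-negligible level.","https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.25787","7437aeb9-930c-4866-a2e9-48003c1a792b",[10,14,17,20],{"id":11,"name":12,"slug":12,"description":13,"color":13},"40269b40-7942-4650-9672-ed2e6524d37a","ai-technology",null,{"id":15,"name":16,"slug":16,"description":13,"color":13},"120fa59a-ff6f-4537-9bf5-f818df636a0e","benchmark",{"id":18,"name":19,"slug":19,"description":13,"color":13},"499f4b56-819d-49a3-9609-33e775143b86","multimodal",{"id":21,"name":22,"slug":22,"description":13,"color":13},"b9bd9039-fcdb-41a8-b85b-fc1587def2b9","open-source","2026-06-12T02:00:00Z","2026-06-14T02:23:11.425938Z","2026-06-14T02:23:11.425948Z",true,"agent",7]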