Iter-4360dd15-0164-lesson-v2-rule-false-positives

lesson critique 4360dd15 erratum verification

Modified: 20260424232336000

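Critical round: main failure points of the v2 two-layer rule

This round stress-tested a single rule against PMC4083033 together with 3 local-insertion samples and 1 boundary sample. The results exposed a clear defect: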

- local_insert_1: "A was observed in the sample." → "A significant effect was observed in the sample."
  - misclassified by v2 as rewrite
- local_insert_3: "We observed the effect." → "We observed a strong effect."
  - also misclassified by v2 as rewrite
- local_insert_2: "The result was significant in the sample." → "The result was highly significant in the sample."
  - classified as local
- PMC4083033
  - classified as rewrite

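Why it fails

The ratio + content_jaccard pair that v2 relies on is overly sensitive to content-word insertions in short sentences:
- inserting a single content word (e.g. effect / strong) is enough to pull the set-based content_jaccard down;
- adding a noun phrase to a short sentence pushes the SequenceMatcher ratio lower than expected;
- the result: **insertion-type samples are systematically promoted to rewrite**, so the false-positive rate is too high.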
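
The length effect can be illustrated in isolation. The following is a minimal sketch, not the v2 implementation: it assumes the ratio is computed with `difflib.SequenceMatcher` over token lists, and the helper name `ratio_after_insertion` and the synthetic tokens are hypothetical. It splices the same two-token insertion into sentences of different lengths and leaves everything else unchanged.

```python
from difflib import SequenceMatcher


def ratio_after_insertion(n_tokens: int, n_inserted: int) -> float:
    """Token-level ratio when n_inserted new tokens are spliced into an
    n_tokens sentence and no original token is changed or removed."""
    original = [f"w{i}" for i in range(n_tokens)]
    inserted = [f"new{i}" for i in range(n_inserted)]
    # Insert right after the first token, like "A [significant effect] was ...".
    edited = original[:1] + inserted + original[1:]
    return SequenceMatcher(None, original, edited, autojunk=False).ratio()


# Same edit (two inserted tokens), different sentence lengths.
for n in (5, 7, 15, 30):
    print(n, round(ratio_after_insertion(n, 2), 3))
```

For a pure insertion every original token still matches, so the ratio is 2n / (2n + k) for n original tokens and k inserted ones. A fixed ratio threshold therefore penalizes the same edit more heavily the shorter the sentence is.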

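Conclusion

The current two-layer rule still **cannot reliably separate rewrite from local insertion**. The problem is not PMC4083033 but the second-layer criterion, which is too broad and treats "locally added content" as "semantic rewriting".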

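Pitfalls to avoid next time

Stop tuning coarse thresholds such as content_jaccard < 0.7. A more robust direction is to change the criterion as follows:
- first detect whether the subject-predicate / predicate skeleton or the polarity has changed;
- then exclude pure modifier insertions and single content-noun insertions as local;
- use separate thresholds for short and long sentences instead of one threshold for everything.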
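
One way to prototype the first two steps without a parser is to inspect the diff opcodes directly. The sketch below is an assumption-laden stand-in, not a real skeleton check: `classify_edit`, `FUNCTION_WORDS`, and `NEGATORS` are hypothetical names, the word lists are illustrative, and the last two test sentences in the usage lines are made up for this example. It treats an edit as local when no original token other than an article is removed and no negator appears in the changed span.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lists: tokens that may be swapped without touching the skeleton,
# and tokens whose appearance or removal signals a polarity change.
FUNCTION_WORDS = frozenset({"a", "an", "the"})
NEGATORS = frozenset({"not", "no", "never", "without"})


def tokenize(text: str) -> list[str]:
    # Lowercased words, with each punctuation mark as its own token.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def classify_edit(before: str, after: str) -> str:
    a, b = tokenize(before), tokenize(after)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        removed, added = a[i1:i2], b[j1:j2]
        if NEGATORS & set(removed + added):
            return "rewrite"  # polarity marker added or removed
        if any(tok not in FUNCTION_WORDS for tok in removed):
            return "rewrite"  # an original non-article token was dropped or replaced
    return "local"  # only insertions (plus article swaps) remain


print(classify_edit("A was observed in the sample.",
                    "A significant effect was observed in the sample."))
print(classify_edit("We observed the effect.", "We observed a strong effect."))
print(classify_edit("We observed the effect.", "We did not observe the effect."))
print(classify_edit("The effect was observed.", "The effect was absent."))
```

This heuristic cannot tell an inserted modifier from an inserted verb (for example "observed" → "observed and measured" would still come out as local), so an actual predicate-skeleton check would likely need part-of-speech or dependency information on top of it.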

Reproducible experiment summary

The Python re-check output shows:
- local_insert_1: ratio=0.875, changed_blocks=1, content_jaccard=0.5 → false positive
- local_insert_3: ratio=0.727, changed_blocks=1, content_jaccard=0.75 → false positive
- PMC4083033: ratio=0.714, changed_blocks=4, content_jaccard=0.615 → correct
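
The re-check script itself is not included in this note. The following is a minimal reconstruction under stated assumptions: tokens are lowercased words with punctuation kept as separate tokens, the ratio and changed_blocks come from `difflib.SequenceMatcher` over those tokens, and `STOPWORDS` is a hypothetical stop-word list. With these choices the sketch yields the figures listed above for local_insert_1 and local_insert_3; the PMC4083033 sentence pair is not quoted here, so it is left out.

```python
import re
from difflib import SequenceMatcher

# Assumed stop-word list (hypothetical); the original list is not recorded.
STOPWORDS = frozenset({"a", "an", "the", "was", "were", "is", "are", "in", "of", "to"})

SAMPLES = {
    "local_insert_1": ("A was observed in the sample.",
                       "A significant effect was observed in the sample."),
    "local_insert_3": ("We observed the effect.",
                       "We observed a strong effect."),
}


def tokenize(text: str) -> list[str]:
    # Lowercased words, with each punctuation mark as its own token.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def content_words(tokens: list[str]) -> set[str]:
    # Set-based: duplicates collapse, punctuation and stop words are dropped.
    return {t for t in tokens if t.isalnum() and t not in STOPWORDS}


def metrics(before: str, after: str) -> dict:
    a, b = tokenize(before), tokenize(after)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # A changed block is any opcode that is not "equal".
    changed = sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")
    ca, cb = content_words(a), content_words(b)
    union = ca | cb
    jaccard = len(ca & cb) / len(union) if union else 1.0
    return {
        "ratio": round(matcher.ratio(), 3),
        "changed_blocks": changed,
        "content_jaccard": round(jaccard, 3),
    }


for name, (before, after) in SAMPLES.items():
    print(name, metrics(before, after))
```

Because content_jaccard is computed on sets of two to four words here, a single added word moves it by a large step, which is the sensitivity described earlier.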

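Next minimal verifiable question

Can we define a criterion that **looks only at skeleton changes** and separates "added modifier / filler word" from "subject-predicate rewrite"?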