
Commit 97ceaa9

Add sample-driven low AIGC workflow
1 parent d958420 commit 97ceaa9

17 files changed

Lines changed: 849 additions & 62 deletions

README.md

Lines changed: 9 additions & 4 deletions
@@ -33,6 +33,7 @@ Use $software-thesis-docx to turn my project repo into a structured thesis manif
 Use $software-thesis-docx to read my school Word template, extract a custom style preset, and build the thesis in that format.
 Use $software-thesis-docx to generate Mermaid architecture and sequence diagrams for my thesis from the repository structure.
 Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only rewrite the flagged single-run paragraphs if I approve it.
+Use $software-thesis-docx to lower AIGC for my thesis and switch to explicit_low_aigc mode only for the paragraphs I authorize.
 ```

 ## What It Adds
@@ -42,7 +43,7 @@ Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only r
 - Manifest-driven DOCX build with optional `formatting` config
 - Mermaid request contracts for architecture, sequence, ER, state, gantt, class, and mind-map diagrams
 - Optional rigorous-writing subagent mode for complex thesis tasks, default off
-- Local AIGC risk checking and conservative paragraph-level reduction workflow, default off
+- Sample-driven AIGC risk checking and dual-profile paragraph-level reduction workflow, default off and defaulting to `academic_safe`

 ## Compatibility

@@ -54,14 +55,16 @@ Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only r

 - Root-level distribution files: `README`, release notes, and one-click installers
 - The actual reusable skill at `skills/software-thesis-docx/`
-- Five reusable DOCX-oriented scripts:
+- Six user-facing scripts and one shared helper:
   - `build_docx_from_manifest.py`
   - `extract_docx_style_preset.py`
   - `replace_images_by_caption.py`
   - `rewrite_paragraphs.py`
   - `check_aigc_risk.py`
+  - `rewrite_low_aigc_docx.py`
+  - `aigc_utils.py`
 - Public examples for manifests, workflow options, Mermaid requests, image maps, and paragraph rewrites
-- Reference docs for formatting presets, Mermaid planning, option intake, AIGC review, and repo-to-thesis workflow
+- Reference docs for formatting presets, Mermaid planning, option intake, AIGC review, a low-AIGC playbook, and repo-to-thesis workflow

 ## Repository Layout

@@ -95,6 +98,7 @@ Inside that folder you will find:
 ## Documentation

 - [Codex quick start](docs/codex-quickstart.md)
+- [v0.4.0 release notes](docs/releases/v0.4.0.md)
 - [v0.3.0 release notes](docs/releases/v0.3.0.md)
 - [v0.2.0 release notes](docs/releases/v0.2.0.md)
 - [v0.1.0 release notes](docs/releases/v0.1.0.md)
@@ -107,13 +111,14 @@ Inside that folder you will find:
 - Prefer caption-based figure replacement over position-based image replacement
 - Prefer exact paragraph rewrites when Word layout fidelity matters
 - Keep Mermaid generation and AIGC review opt-in, not forced defaults
+- Keep `explicit_low_aigc` behind an explicit user request; otherwise stay in `academic_safe`

 ## Limits

 - Mermaid support currently generates code and file outputs, not rendered images or automatic DOCX insertion
 - The paragraph rewrite tool only handles exact full-paragraph matches
 - Mixed-format, multi-run paragraphs should be inspected before automated edits
-- The AIGC checker is a local heuristic pass, not a guarantee of any third-party score
+- The AIGC checker is a sample-driven local heuristic pass, not a guarantee of any third-party score
 - Institution-specific formatting rules may still require manifest or preset extensions

 ## Roadmap

README.zh-CN.md

Lines changed: 9 additions & 4 deletions
@@ -33,6 +33,7 @@ Use $software-thesis-docx to turn my project repo into a structured thesis manif
 Use $software-thesis-docx to read my school Word template, extract a custom style preset, and build the thesis in that format.
 Use $software-thesis-docx to generate Mermaid architecture and sequence diagrams for my thesis from the repository structure.
 Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only rewrite the flagged single-run paragraphs if I approve it.
+Use $software-thesis-docx to lower AIGC for my thesis and switch to explicit_low_aigc mode only for the authorized paragraphs.
 ```

 ## What This Release Adds
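@@ -42,7 +43,7 @@ Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only r
 - Manifest-driven DOCX build with `formatting` config support
 - Mermaid request contracts for thesis figures, covering architecture, sequence, ER, state, Gantt, class, and mind-map diagrams
 - Optional rigorous-writing subagent mode, off by default
-- Local AIGC risk checking and a conservative, targeted reduction workflow, off by default
+- AIGC risk checking abstracted from real samples plus a dual-profile reduction workflow, off by default and using `academic_safe` by default

 ## Compatibility
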
@@ -54,14 +55,16 @@ Use $software-thesis-docx to run an AIGC risk check on my thesis DOCX and only r

 - Root-level distribution files: `README`, release notes, one-click install scripts
 - The actual reusable skill: `skills/software-thesis-docx/`
-- 5 reusable scripts
+- 6 scripts that users can run directly and 1 shared helper
   - `build_docx_from_manifest.py`
   - `extract_docx_style_preset.py`
   - `replace_images_by_caption.py`
   - `rewrite_paragraphs.py`
   - `check_aigc_risk.py`
+  - `rewrite_low_aigc_docx.py`
+  - `aigc_utils.py`
 - Public examples: manifest, workflow options, Mermaid requests, image maps, paragraph rewrites
-- Methodology docs: formatting presets, Mermaid, interactive intake, AIGC assessment, repo-to-thesis workflow
+- Methodology docs: formatting presets, Mermaid, interactive intake, AIGC assessment, low-AIGC playbook, repo-to-thesis workflow

 ## Repository Layout

@@ -95,6 +98,7 @@ https://github.com/Jonnys-Li/software-thesis-docx-skill/tree/main/skills/softwar
 ## Documentation

 - [Codex quick start](docs/codex-quickstart.zh-CN.md)
+- [v0.4.0 release notes](docs/releases/v0.4.0.md)
 - [v0.3.0 release notes](docs/releases/v0.3.0.md)
 - [v0.2.0 release notes](docs/releases/v0.2.0.md)
 - [v0.1.0 release notes](docs/releases/v0.1.0.md)
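@@ -107,13 +111,14 @@ https://github.com/Jonnys-Li/software-thesis-docx-skill/tree/main/skills/softwar
 - Locate images for replacement by caption, not by fragile "the Nth figure" logic
 - Prefer exact whole-paragraph matching for text rewrites to limit damage to Word formatting
 - Mermaid and AIGC assessment are both explicit switches, never forced on by default
+- By default only the conservative AIGC check runs; `explicit_low_aigc` is enabled only when the user explicitly asks to "lower AIGC"

 ## Limits

 - Mermaid currently only generates code or `.mmd` files; it does not render images or insert them into the DOCX automatically
 - The paragraph rewrite script only suits cases where the entire paragraph text matches exactly
 - Multi-`run`, mixed-format, or heavily revision-tracked paragraphs need manual inspection first
-- AIGC detection is a local heuristic assessment and does not promise to match third-party platform scores
+- AIGC detection is a sample-driven local heuristic assessment and does not promise to match third-party platform scores
 - More unusual school formatting requirements usually require extending the manifest or preset logic

 ## Next Steps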

docs/codex-quickstart.md

Lines changed: 14 additions & 1 deletion
@@ -47,7 +47,7 @@ The important part is that `SKILL.md` stays at the root of the installed skill.
 - custom style preset extraction from `.docx` templates
 - Mermaid planning contracts
 - optional subagent rigorous-writing mode
-- optional AIGC risk checking and conservative reduction workflow
+- optional AIGC risk checking and dual-profile reduction workflow

 ## 4. Use The Skill In Codex

@@ -63,6 +63,7 @@ Typical prompts:
 - `Use $software-thesis-docx to read my Word template, extract a style preset, and build the thesis in that format.`
 - `Use $software-thesis-docx to generate Mermaid flowchart and sequenceDiagram code for my thesis based on the repo architecture.`
 - `Use $software-thesis-docx to run an AIGC risk review on my thesis DOCX and only rewrite the flagged single-run paragraphs after showing me the report.`
+- `Use $software-thesis-docx to lower AIGC for my thesis, keep academic_safe by default, and only switch to explicit_low_aigc if I explicitly ask for it.`

 ## 5. Optional Dependency Step

@@ -109,6 +110,18 @@ python3 "$HOME/.codex/skills/software-thesis-docx/scripts/check_aigc_risk.py" \
   --output /tmp/aigc-risk-report.json
 ```

+Rewrite authorized paragraphs for lower AIGC risk:
+
+```bash
+python3 "$HOME/.codex/skills/software-thesis-docx/scripts/rewrite_low_aigc_docx.py" \
+  --input thesis.docx \
+  --report /tmp/aigc-risk-report.json \
+  --output /tmp/thesis-low-aigc.docx \
+  --pending-output /tmp/aigc-pending-review.json \
+  --profile academic_safe \
+  --normalize-typography
+```
+
 Replace images by caption:

 ```bash

docs/codex-quickstart.zh-CN.md

Lines changed: 14 additions & 1 deletion
@@ -47,7 +47,7 @@ irm https://raw.githubusercontent.com/Jonnys-Li/software-thesis-docx-skill/main/
 - custom style preset extraction from `.docx` templates
 - Mermaid planning contracts
 - optional rigorous-writing subagent mode
-- optional AIGC risk checking and conservative reduction workflow
+- optional AIGC risk checking and dual-profile reduction workflow

 ## 4. Use The Skill In Codex

@@ -63,6 +63,7 @@ Use $software-thesis-docx to build my thesis DOCX workflow from my software repo
 - `Use $software-thesis-docx to read my Word template, extract a style preset, and build the thesis in that format.`
 - `Use $software-thesis-docx to generate Mermaid flowchart and sequenceDiagram code for my thesis based on the repo architecture.`
 - `Use $software-thesis-docx to run an AIGC risk review on my thesis DOCX and only rewrite the flagged single-run paragraphs after showing me the report.`
+- `Use $software-thesis-docx to lower AIGC for my thesis, keep academic_safe by default, and only switch to explicit_low_aigc if I explicitly ask for it.`

 ## 5. Optional Dependency Installation

@@ -109,6 +110,18 @@ python3 "$HOME/.codex/skills/software-thesis-docx/scripts/check_aigc_risk.py" \
   --output /tmp/aigc-risk-report.json
 ```

+Write back a low-AIGC version based on the risk report:
+
+```bash
+python3 "$HOME/.codex/skills/software-thesis-docx/scripts/rewrite_low_aigc_docx.py" \
+  --input thesis.docx \
+  --report /tmp/aigc-risk-report.json \
+  --output /tmp/thesis-low-aigc.docx \
+  --pending-output /tmp/aigc-pending-review.json \
+  --profile academic_safe \
+  --normalize-typography
+```
+
 Replace images by caption:

 ```bash

docs/releases/v0.4.0.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+# v0.4.0
+
+## Summary
+
+`v0.4.0` upgrades the AIGC capability from a basic heuristic checker into a sample-driven low-AIGC workflow.
+
+This release adds a reusable low-AIGC playbook, a stronger checker aligned to real before/after thesis samples, and a new rewrite script that can auto-apply authorized paragraph rewrites for single-run paragraphs.
+
+## Highlights
+
+- Sample-driven AIGC methodology in `references/low-aigc-playbook.md`
+- Stronger `check_aigc_risk.py` output:
+  - `rewrite_recipe`
+  - `typography_flags`
+  - denser academic-template detectors
+- New rewrite script:
+  - `rewrite_low_aigc_docx.py`
+- Shared AIGC utilities:
+  - `aigc_utils.py`
+- Dual rewrite profiles:
+  - `academic_safe`
+  - `explicit_low_aigc`
+- Updated workflow options:
+  - `aigc.rewrite_profile`
+  - `aigc.normalize_typography`
+
+## Scope Notes
+
+- `academic_safe` remains the default.
+- `explicit_low_aigc` is only intended for cases where the user explicitly asks to lower AIGC.
+- The checker remains local and heuristic. It is not presented as a guarantee of any third-party detector percentage.
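
The scope notes above amount to a small gating rule. A minimal sketch of that rule follows, assuming a hypothetical helper `resolve_rewrite_profile` and a boolean `user_explicitly_asked` flag; neither name comes from the commit, and the shipped scripts may enforce the rule differently.

```python
from typing import Optional

ACADEMIC_SAFE = "academic_safe"
EXPLICIT_LOW_AIGC = "explicit_low_aigc"


def resolve_rewrite_profile(requested: Optional[str], user_explicitly_asked: bool) -> str:
    """Pick a rewrite profile (hypothetical helper, not part of the commit).

    `explicit_low_aigc` is returned only when it was requested AND the user
    explicitly asked to lower AIGC; every other combination falls back to
    the default `academic_safe`.
    """
    if requested == EXPLICIT_LOW_AIGC and user_explicitly_asked:
        return EXPLICIT_LOW_AIGC
    return ACADEMIC_SAFE


if __name__ == "__main__":
    # A configured profile without an explicit user request stays conservative.
    print(resolve_rewrite_profile("explicit_low_aigc", user_explicitly_asked=False))
```

Falling back to the default, rather than raising on an unauthorized request, keeps the conservative path usable even when the options are misconfigured.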

skills/software-thesis-docx/SKILL.md

Lines changed: 24 additions & 4 deletions
@@ -21,7 +21,7 @@ This skill is especially useful when the user needs one or more of the following
 - Mermaid code for architecture, flow, sequence, ER, state, schedule, or mind-map diagrams
 - caption-based figure replacement without breaking Word layout
 - citation cleanup or terminology cleanup in an existing `docx`
-- optional AIGC risk review before final delivery
+- optional AIGC risk review or low-AIGC rewrite before final delivery

 ## 1. Ground The Repository Or The DOCX

@@ -66,6 +66,11 @@ Default values:
 - `subagents.enabled = false`
 - `aigc.enabled = false`

+For AIGC:
+
+- keep `rewrite_profile = academic_safe` by default
+- only switch to `explicit_low_aigc` when the user explicitly asks to lower AIGC
+
 If the runtime supports structured input collection such as `request_user_input`, use it.
 If not, ask the same questions in plain conversation.

@@ -105,6 +110,7 @@ Supporting references:
 - [references/workflow.md](references/workflow.md)
 - [references/source-conventions.md](references/source-conventions.md)
 - [references/migration-notes.md](references/migration-notes.md)
+- [references/low-aigc-playbook.md](references/low-aigc-playbook.md)

 ## 6. Run The Right Script

@@ -159,6 +165,18 @@ python3 scripts/check_aigc_risk.py \
   --output /tmp/aigc-risk-report.json
 ```

+Rewrite authorized paragraphs for lower AIGC risk:
+
+```bash
+python3 scripts/rewrite_low_aigc_docx.py \
+  --input thesis.docx \
+  --report /tmp/aigc-risk-report.json \
+  --output thesis-low-aigc.docx \
+  --pending-output /tmp/aigc-pending-review.json \
+  --profile academic_safe \
+  --normalize-typography
+```
+
 ## 7. Mermaid Rules

 Mermaid generation is an orchestration capability, not a required rendering pipeline.
@@ -204,11 +222,13 @@ Safe behavior:

 - scan first
 - rewrite only authorized paragraphs
+- keep `academic_safe` as the default rewrite profile
+- use `explicit_low_aigc` only when the user explicitly asks to lower AIGC
 - prefer more concrete, evidence-linked language
-- reuse `rewrite_paragraphs.py` only for single-run exact-match paragraphs
-- stop for manual confirmation on multi-run or mixed-format paragraphs
+- use `rewrite_low_aigc_docx.py` for authorized single-run paragraphs
+- stop for manual confirmation on multi-run or mixed-format paragraphs and inspect the pending JSON output

-Read [references/aigc.md](references/aigc.md) when the user asks for detection or auto-reduction.
+Read [references/aigc.md](references/aigc.md) and [references/low-aigc-playbook.md](references/low-aigc-playbook.md) when the user asks for detection or low-AIGC rewriting.

 ## 10. Quality Gates

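The safe-behavior rules in this hunk (rewrite only authorized paragraphs, auto-apply only to single-run paragraphs, send multi-run paragraphs to the pending file) can be sketched as a routing step. The `FlaggedParagraph` shape and the `route_paragraphs` function are illustrative assumptions; the actual report schema and the internals of `rewrite_low_aigc_docx.py` are not shown in this diff.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass
class FlaggedParagraph:
    """Hypothetical view of one flagged paragraph from the risk report."""
    index: int      # assumed paragraph position in the DOCX
    run_count: int  # number of Word runs that make up the paragraph
    text: str


def route_paragraphs(
    flagged: Iterable[FlaggedParagraph],
    authorized: Set[int],
) -> Tuple[List[FlaggedParagraph], List[FlaggedParagraph]]:
    """Split flagged paragraphs into (auto_apply, pending_review)."""
    auto_apply: List[FlaggedParagraph] = []
    pending_review: List[FlaggedParagraph] = []
    for paragraph in flagged:
        if paragraph.index not in authorized:
            continue  # never touch paragraphs the user did not authorize
        if paragraph.run_count == 1:
            auto_apply.append(paragraph)      # single run: safe to rewrite in place
        else:
            pending_review.append(paragraph)  # multi-run: needs manual confirmation
    return auto_apply, pending_review


if __name__ == "__main__":
    sample = [
        FlaggedParagraph(index=3, run_count=1, text="Single-run paragraph."),
        FlaggedParagraph(index=7, run_count=4, text="Mixed-format paragraph."),
        FlaggedParagraph(index=9, run_count=1, text="Not authorized."),
    ]
    auto, pending = route_paragraphs(sample, authorized={3, 7})
    print([p.index for p in auto], [p.index for p in pending])
```

Unauthorized paragraphs are dropped from both lists in this sketch, so nothing reaches the rewrite step or the pending file without the user having selected it.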
Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
 interface:
   display_name: "Software Thesis DOCX"
-  short_description: "Turn a software repository into a graduation thesis DOCX, with support for formatting presets, Mermaid, AIGC, and rigorous-writing orchestration"
-  default_prompt: "Use $software-thesis-docx to build or refine a software thesis DOCX workflow from my project repository, with optional style preset extraction, Mermaid diagrams, subagent writing review, and AIGC risk checks."
+  short_description: "Turn a software repository into a graduation thesis DOCX, with support for formatting presets, Mermaid, and sample-driven AIGC checking and reduction orchestration"
+  default_prompt: "Use $software-thesis-docx to build or refine a software thesis DOCX workflow from my project repository, with optional style preset extraction, Mermaid diagrams, subagent writing review, and sample-driven AIGC risk checks or low-AIGC rewriting."

skills/software-thesis-docx/assets/examples/thesis_workflow_options.example.json

Lines changed: 3 additions & 1 deletion
@@ -32,6 +32,8 @@
     "check_only": true,
     "auto_reduce": false,
     "threshold": 0.58,
-    "target_scope": "generated_content"
+    "target_scope": "generated_content",
+    "rewrite_profile": "academic_safe",
+    "normalize_typography": true
   }
 }
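
A consumer of these options might merge the `aigc` block over the documented safe defaults before acting on it. The sketch below assumes the block sits under a top-level `aigc` key (as the `aigc.rewrite_profile` naming in the release notes suggests); `load_aigc_options` is a hypothetical helper, not a function from the commit.

```python
import json

# Safe defaults as listed in references/aigc.md.
AIGC_SAFE_DEFAULTS = {
    "enabled": False,
    "check_only": True,
    "auto_reduce": False,
    "rewrite_profile": "academic_safe",
    "normalize_typography": True,
}

ALLOWED_PROFILES = {"academic_safe", "explicit_low_aigc"}


def load_aigc_options(raw_json: str) -> dict:
    """Parse workflow options and overlay the `aigc` block on the safe defaults."""
    options = json.loads(raw_json)
    aigc = dict(AIGC_SAFE_DEFAULTS)
    aigc.update(options.get("aigc", {}))  # assumption: options live under "aigc"
    if aigc["rewrite_profile"] not in ALLOWED_PROFILES:
        raise ValueError(f"unknown rewrite_profile: {aigc['rewrite_profile']!r}")
    return aigc


if __name__ == "__main__":
    example = '{"aigc": {"enabled": true, "threshold": 0.58}}'
    print(load_aigc_options(example))
```

Options files written before this change omit `rewrite_profile` and `normalize_typography`; with the defaults overlay they resolve to `academic_safe` and `true`.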

skills/software-thesis-docx/references/aigc.md

Lines changed: 34 additions & 6 deletions
@@ -6,6 +6,11 @@ Use AIGC risk checks when the user wants a local, heuristic pass over thesis pro

 This is a local risk score, not a claim that it matches any institution's external detector.

+The skill now supports two rewrite profiles:
+
+- `academic_safe`: default; keep academic tone and only absorb low-risk anti-template rewrites
+- `explicit_low_aigc`: only for users who explicitly ask to lower AIGC; allows stronger sentence expansion and naturalization
+
 ## Checker Script

 ```bash
@@ -20,34 +25,57 @@ The report includes:
 - excerpt
 - risk score
 - triggered signals
+- rewrite recipe
+- typography flags
 - whether rewrite is recommended

+## Rewrite Script
+
+```bash
+python3 scripts/rewrite_low_aigc_docx.py \
+  --input thesis.docx \
+  --report report.json \
+  --output thesis-low-aigc.docx \
+  --pending-output aigc-pending-review.json \
+  --profile academic_safe \
+  --normalize-typography
+```
+
 ## Current Signals

 - frequent transition cliches
 - low-information summary phrases
-- overly uniform sentence-length patterns
-- repeated tri-grams
+- compressed academic enumerations
+- colon/semicolon-driven template paragraphs
+- dense operation chains
+- term stacks without enough explanation
 - claims without concrete evidence markers
-- low lexical diversity
+- typography issues in Chinese thesis paragraphs

 ## Recommended Workflow

 1. Run the checker and inspect high-risk paragraphs first.
 2. Only rewrite paragraphs the user has authorized.
-3. Prefer concrete, evidence-linked rewrites over cosmetic synonym swaps.
-4. If a paragraph is a single run, reuse `rewrite_paragraphs.py` for deterministic updates.
-5. If a paragraph has mixed formatting or multiple runs, stop and ask for manual confirmation before rewriting.
+3. Use `academic_safe` unless the user explicitly asked to lower AIGC.
+4. Prefer concrete, evidence-linked rewrites over cosmetic synonym swaps.
+5. Normalize punctuation and spacing while rewriting.
+6. If a paragraph is a single run, `rewrite_low_aigc_docx.py` can auto-apply the rewrite.
+7. If a paragraph has mixed formatting or multiple runs, stop and review the pending JSON output before manual confirmation.
+
+See [low-aigc-playbook.md](low-aigc-playbook.md) for the sample-driven methodology.

 ## Safe Defaults

 - `enabled`: false
 - `check_only`: true
 - `auto_reduce`: false
 - `target_scope`: generated content or user-selected paragraphs only
+- `rewrite_profile`: `academic_safe`
+- `normalize_typography`: true

 ## What Not To Do

 - Do not silently rewrite an entire thesis.
 - Do not imply guaranteed compliance with any third-party AIGC score.
 - Do not treat the risk report as a substitute for human academic review.
+- Do not reduce citation density, weaken facts, or replace technical terms inaccurately just to chase a lower score.
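
The last rule above, about not reducing citation density, lends itself to a mechanical pre-check before a rewrite is accepted. The sketch below is one possible guard, assuming numeric bracket citations such as `[1]`, `[2, 3]`, or `[4-6]` written with ASCII brackets; the function names are illustrative and nothing here is taken from `aigc_utils.py`.

```python
import re

# Assumption: citations look like [1], [2, 3] or [4-6] with ASCII brackets.
CITATION_PATTERN = re.compile(r"\[\d+(?:\s*[-,]\s*\d+)*\]")


def count_citations(text: str) -> int:
    """Count bracketed numeric citation markers in a paragraph."""
    return len(CITATION_PATTERN.findall(text))


def preserves_citations(original: str, rewritten: str) -> bool:
    """Return True when the rewrite keeps at least as many citation markers."""
    return count_citations(rewritten) >= count_citations(original)


if __name__ == "__main__":
    before = "The gateway follows the layered design in [1] and the cache policy in [2, 3]."
    after = "The gateway follows a layered design and a standard cache policy."
    print(preserves_citations(before, after))  # the rewrite dropped both markers
```

A count-based check only catches dropped markers. It cannot tell whether a fact was weakened or a technical term replaced inaccurately, so those still need human review.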
