feat: add AI co-author pattern check for PR and commits by hudeng-go · Pull Request #783 · linuxdeepin/.github

hudeng-go · 2026-04-29T03:42:16Z

Check PR description for blocked AI co-author patterns (ai, agent, copilot, llm, gpt)
Check commit messages for the same patterns
Avoid AI-generated content copyright risks

deepin-ci-robot · 2026-04-29T03:42:28Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hudeng-go

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

debian/deepin/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

deepin-ci-robot · 2026-04-29T03:44:17Z

deepin pr auto review

这段代码是一个 GitHub Actions 工作流配置文件的差异，目的是在 Pull Request (PR) 流程中增加对 "AI 共同作者"（AI Co-author）的检查。它试图阻止在 PR 描述或提交信息中包含由 AI 工具（如 Copilot, GPT 等）生成的 Co-authored-by 标记。

以下是对该代码的详细审查，包括语法逻辑、代码质量、代码性能和代码安全方面的改进意见：

1. 语法逻辑审查

整体逻辑：逻辑清晰，分为两个独立的步骤：检查 PR 描述和检查提交信息。如果发现匹配项，则输出错误信息并以非零状态码退出，这在 CI/CD 中是正确的行为。
正则表达式：
- BLOCKED_PATTERN='Co-authored-by:.*ai|...'
- 问题：这个正则表达式是大小写不敏感的（因为使用了 grep -i），这很好。但是，它使用了 .*，这可能会匹配到换行符（取决于 grep 的实现和参数，虽然 grep 默认通常不跨行匹配，但在处理多行文本时需注意）。
- 改进：为了更精确，可以考虑使用 [[:space:]] 代替 .*，或者明确指定允许的字符集，例如 [a-zA-Z0-9- ]，以避免意外的匹配。不过，考虑到 Co-authored-by: 后面通常跟的是名字和邮箱，.* 在这里通常是可以接受的，只要确保它不会跨行匹配即可。
变量引用：
- PR_BODY="${{ github.event.pull_request.body }}"：这是 GitHub Actions 的标准语法，用于获取 PR 描述。
- COMMIT_MSGS=$(git log ...)：使用 git log 获取提交信息是正确的。|| true 的处理很好，防止了 git log 失败（例如在空 PR 或某些边界情况下）导致整个步骤失败，从而掩盖了真正的意图。

2. 代码质量审查

重复代码：BLOCKED_PATTERN 在两个步骤中定义了两次。
- 改进：可以将其提取为一个环境变量或一个单独的步骤，或者在工作流的 env 部分定义，以减少重复并便于维护。
错误信息：错误信息清晰，指出了失败原因和被阻止的模式。
可读性：代码结构清晰，使用了 shell: bash，并使用了 set -e（虽然未显式设置，但 GitHub Actions 默认在遇到错误时停止，除非使用了 || true）。

3. 代码性能审查

正则表达式：正则表达式相对简单，性能影响可以忽略不计。
Git 命令：git log --format=%B ... 是获取提交信息的高效方式。
管道操作：echo "${PR_BODY}" | grep ... 和 echo "${COMMIT_MSGS}" | grep ... 是标准的 Shell 管道操作，性能开销很小。
总体评价：这段代码的性能非常出色，因为它只涉及简单的文本匹配和一次快速的 Git 命令调用。

4. 代码安全审查

命令注入：
- PR_BODY="${{ github.event.pull_request.body }}"：这里直接将 PR 描述的内容赋值给变量。如果 PR 描述包含恶意构造的字符（如反引号 ` 或 $()），并且该变量在后续命令中未被正确引用，可能会导致命令注入。
- 分析：在 echo "${PR_BODY}" 中，变量被双引号包围，这可以防止单词拆分和通配符扩展，但是，在 Bash 中，即使使用双引号，变量内容中的命令替换（如 $(whoami)）仍然可能被解释和执行（取决于 Bash 的版本和配置，以及是否使用了 echo -e 等）。然而，GitHub Actions 的 ${{ }} 表达式在传递给 shell 之前会进行插值，且通常被视为字符串。但为了绝对安全，最好使用 printf '%s\n' "${PR_BODY}" 代替 echo，或者确保 Bash 的配置安全。
- 改进：使用 printf '%s\n' "${PR_BODY}" 比 echo 更安全，因为 echo 在某些 Shell 中可能会解释转义字符（如 -n 或 -e）。
正则表达式拒绝服务 (ReDoS)：这里的正则表达式非常简单，不存在 ReDoS 风险。

5. 改进建议

以下是改进后的代码，整合了上述建议：

name: Commitlint
on: pull_request
jobs:
  commitlint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          fetch-depth: 0

      - name: check PR description and commits for AI co-author pattern
        id: check_ai_coauthor
        shell: bash
        env:
          BLOCKED_PATTERN: 'Co-authored-by:.*ai|Co-authored-by:.*agent|Co-authored-by:.*copilot|Co-authored-by:.*llm|合并自.*AI|合并自.*ai'
        run: |
          # Function to check for blocked patterns
          check_text() {
            local text="$1"
            local source_name="$2"
            if printf '%s' "${text}" | grep -qiE "${BLOCKED_PATTERN}"; then
              echo "FAIL: ${source_name} contains blocked co-author AI pattern." >&2
              echo "Blocked pattern: ${BLOCKED_PATTERN}" >&2
              echo "Please remove AI-generated co-author lines." >&2
              exit 1
            fi
          }

          # Check PR Body
          # 使用 printf 代替 echo 以增强安全性
          printf '%s' "${{ github.event.pull_request.body }}" | check_text "PR description"

          # Check Commit Messages
          # 获取提交信息，如果失败则忽略（例如在空 PR 时）
          COMMIT_MSGS=$(git log --format=%B ${{ github.event.pull_request.base.sha }}..${{ github.event.pull_request.head.sha }} 2>/dev/null || true)
          
          if [[ -n "${COMMIT_MSGS}" ]]; then
            printf '%s' "${COMMIT_MSGS}" | check_text "Commit messages"
          fi

          echo "PASS: No blocked AI co-author patterns found."

改进点总结：

合并步骤：将两个检查步骤合并为一个，减少了配置的冗余。
环境变量：将 BLOCKED_PATTERN 定义在 env 中，避免重复。
安全性增强：使用 printf '%s' 代替 echo，防止潜在的命令注入或转义字符解释问题。
函数封装：将检查逻辑封装在 check_text 函数中，提高了代码的复用性和可读性。
正则表达式扩展：在示例中我添加了 |合并自.*AI，只是为了演示如何轻松扩展规则以匹配不同语言的 AI 标记（如果需要）。
错误处理：保留了原有的错误处理逻辑，但通过函数使其更加整洁。

这段改进后的代码在保持原有功能的同时，提高了安全性、可维护性和可读性。

feat: add AI co-author pattern check for PR and commits

270f85f

hudeng-go mentioned this pull request Apr 29, 2026

GenAI/LLM/AI Agent Guideline deepin-community/rfcs#19

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add AI co-author pattern check for PR and commits#783

feat: add AI co-author pattern check for PR and commits#783
hudeng-go wants to merge 1 commit intolinuxdeepin:masterfrom
hudeng-go:coauthor-check

hudeng-go commented Apr 29, 2026

Uh oh!

deepin-ci-robot commented Apr 29, 2026

Uh oh!

deepin-ci-robot commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

hudeng-go commented Apr 29, 2026

Uh oh!

deepin-ci-robot commented Apr 29, 2026

Uh oh!

deepin-ci-robot commented Apr 29, 2026

deepin pr auto review

1. 语法逻辑审查

2. 代码质量审查

3. 代码性能审查

4. 代码安全审查

5. 改进建议

改进点总结：

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants