Write Your Own Checker
An automated checker generation framework for mainstream static analysis tools
AutoChecker is a tool that automatically generates code checkers for mainstream static analysis tools based on user-defined rule requirements.
In day-to-day development, we often need to check whether code follows specific rules, such as whether there are null pointer risks, whether resources are released correctly, or whether naming conventions are followed. Although static analysis tools provide some built-in checkers, those built-in rules often do not fully cover real-world, customized requirements. Writing a new checker manually is both time-consuming and error-prone.
AutoChecker offers a new solution for this scenario: users only need to provide a rule description and examples, and AutoChecker can automatically generate a usable checker for the target static analysis tool. For the same rule, AutoChecker can also generate checkers for multiple analysis tools, allowing users to choose the one that best fits their workflow.
- Automatically generate checkers from rule descriptions and test cases: users provide a rule description together with positive and negative test cases, and the tool outputs checker code that can be used directly in the target static analysis tool.
- Write once, reuse across multiple tools: the same rule specification and test cases can be reused for different analysis tools, reducing duplicated effort and improving portability.
- Extensible to multiple static analysis tools: the current documentation covers PMD, clang-tidy, and CodeQL, with support for more tools planned in the future.
You can now try AutoChecker directly in the browser:
AutoChecker currently provides two installation methods:
- Manual installation: install dependencies, configure environment variables, and build manually. This mode can be used to generate checkers for PMD, clang-tidy, CodeQL, and more.
- Docker deployment: use Docker to complete environment setup and toolchain preparation in one step. This mode currently supports generating clang-tidy and CodeQL checkers.
| Item | Requirement |
|---|---|
| Disk Space | At least 64 GB |
| Memory | At least 16 GB |
| CPU | At least 4 cores |
| Operating System | Ubuntu 22.04 (recommended) |
| LLM API Key | An API key for a large language model is required, such as DeepSeek or OpenAI |
Clone the repository:
git clone https://github.com/SQUARE-RG/AutoChecker.git

Create a virtual environment and install dependencies:
# Create a virtual environment
conda create -n autochecker python=3.10
conda activate autochecker
# Enter the project root directory
cd AutoChecker
pip install -r requirements.txt

Install the required static analysis engines according to your needs:
Create a .env file in the project root directory and fill in your LLM API information:
API_KEY=your_api_key
MODEL=model_name (e.g. deepseek, gpt-4)
BASE_URL=api_endpoint (e.g. https://api.deepseek.com)

After the configuration is complete, you can move on to preparing rules and test cases.
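Before starting a generation run, it can help to confirm that the three keys shown above are actually set. The following is a minimal sketch and not part of AutoChecker: `parse_env` and `missing_keys` are hypothetical helper names, and the parser assumes plain `KEY=VALUE` lines without quoting or variable expansion.

```python
from pathlib import Path

# The three keys listed in the .env example above.
REQUIRED_KEYS = ("API_KEY", "MODEL", "BASE_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URLs and tokens containing "=" survive.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("missing keys:", missing if missing else "none")
```

Run it from the project root; it only reads the file and never prints the key values.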
Create a rule.json file in the project root directory:
{
"data": {
"ucassaat": [
{
"main_title": "use-uncheck-pointer-after-malloc",
"description": "The rule requires that any pointer obtained through dynamic memory allocation functions (such as malloc, calloc, or realloc) must be checked for non-null before its first use. This check must occur before the pointer is used; performing the check after use is considered a violation. Acceptable check methods include explicit or implicit null pointer comparisons like if (ptr != NULL), if (ptr), or if (!ptr). If a dynamically allocated pointer is never used, it does not violate this rule. If a pointer is reallocated, it must be checked again before any subsequent use. This rule applies equally to global and local variables. Only one warning should be reported per violating pointer variable.",
"rule_test_path": "/root/code_check/experiment/gjb8114/codeql_test_case/use_uncheck_pointer_after_malloc"
}
]
}
}

Notes:
- rule_test_path must be an absolute path pointing to the directory of the test suite.
- For violating test cases, use CHECK-MESSAGES comments in the code to mark the expected results.
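A quick structural check of rule.json can catch a relative rule_test_path before a run starts. This is a sketch under assumptions: it follows the nested layout of the example above (a `data` object whose values are lists of rules), treats `main_title`, `description`, and `rule_test_path` as required, and uses the hypothetical helper name `check_rule_file`. AutoChecker's own validation may differ.

```python
import json
from pathlib import PurePosixPath

# Fields that appear in every rule of the example above (assumed to be required).
REQUIRED_FIELDS = ("main_title", "description", "rule_test_path")


def check_rule_file(doc: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for group, rules in doc.get("data", {}).items():
        for index, rule in enumerate(rules):
            where = f"{group}[{index}]"
            for field in REQUIRED_FIELDS:
                if not rule.get(field):
                    problems.append(f"{where}: missing {field}")
            path = rule.get("rule_test_path", "")
            # The notes above require an absolute path to the test suite directory.
            if path and not PurePosixPath(path).is_absolute():
                problems.append(f"{where}: rule_test_path is not absolute")
    return problems


if __name__ == "__main__":
    # Illustrative input with a relative path; replace with the contents of rule.json.
    sample = json.loads(
        '{"data": {"demo": [{"main_title": "r", "description": "d", '
        '"rule_test_path": "tests/r"}]}}'
    )
    for problem in check_rule_file(sample):
        print(problem)
```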
python src/main.py --rule_file rule.json --language cpp --analyzer clang-tidy

The generated results are saved to the result-generation directory by default.
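Because the same rule file can be reused across tools, the command above can be repeated once per analyzer. The sketch below only assembles and prints the argument lists. The three flags come from the command above; `build_command` is a hypothetical helper, and the analyzer name `codeql` is an assumed value that should be checked against the tool's help output.

```python
import shlex
import sys


def build_command(rule_file: str, language: str, analyzer: str) -> list[str]:
    """Assemble the argument list for one src/main.py invocation."""
    return [
        sys.executable, "src/main.py",
        "--rule_file", rule_file,
        "--language", language,
        "--analyzer", analyzer,
    ]


if __name__ == "__main__":
    # "clang-tidy" appears in the command above; "codeql" is an assumed value.
    for analyzer in ("clang-tidy", "codeql"):
        cmd = build_command("rule.json", "cpp", analyzer)
        print(shlex.join(cmd))
        # To actually launch the run from the project root:
        # subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the rule file path contains spaces.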
| Dependency | Description |
|---|---|
| Docker | Version 28.1.1 or later is recommended |
| Operating System | Ubuntu 22.04 is recommended, though other Linux distributions should also work |
| LLM API Key | An API key for a large language model is required, such as DeepSeek or OpenAI |
git clone https://github.com/SQUARE-RG/AutoChecker.git
cd AutoChecker
git clone https://github.com/llvm/llvm-project.git --branch release/17.x --depth 1

The build process automatically installs the Python runtime, configures the conda virtual environment, downloads embedding models, and builds the related static analysis toolchains. The entire process typically takes around 10 minutes.
docker build -t autochecker:1.0 .

The build includes dependency installation and compilation, so it can take a while. When you see Successfully tagged autochecker:1.0, the build has completed successfully.
docker run -it --name autochecker-container autochecker:1.0 /bin/bash

After execution, you will enter the container's interactive shell, with the default working directory set to the AutoChecker root directory.
Inside the container, create a .env file in the project root directory and fill in your LLM API information:
API_KEY=your_api_key
MODEL=model_name (e.g. deepseek, gpt-4)
BASE_URL=api_endpoint (e.g. https://api.deepseek.com)

After the configuration is complete, you can move on to preparing rules and test cases.
Create a rule.json file in the project root directory and fill in your rule definition and test case path:
{
"main_title": "your_rule_name",
"title": "short_rule_summary (optional)",
"description": "Describe in detail what this rule is intended to detect and in what scenarios it applies.",
"rule_test_path": "/absolute/path/to/test/case/directory/",
"category": "rule_category (optional)"
}

Test case requirements:
- Use the file extension corresponding to the target language, such as .cpp, .c, or .java.
- Each test file should be independently compilable.
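The requirements above can be checked with a short pre-flight pass over the test directory. This sketch rests on two assumptions: the `EXTENSIONS` mapping from language names to file extensions is illustrative, and a file counts as a violating case when it contains a CHECK-MESSAGES comment. `classify_tests` is a hypothetical helper, and the script does not verify that files compile.

```python
import tempfile
from pathlib import Path

# Assumed mapping from language names to accepted test file extensions.
EXTENSIONS = {"cpp": {".cpp", ".c"}, "java": {".java"}}


def classify_tests(test_dir: Path, language: str) -> dict[str, list[str]]:
    """Group test files by whether they carry CHECK-MESSAGES markers."""
    result = {"violating": [], "compliant": [], "wrong_extension": []}
    for path in sorted(test_dir.iterdir()):
        if not path.is_file():
            continue
        if path.suffix not in EXTENSIONS[language]:
            result["wrong_extension"].append(path.name)
        elif "CHECK-MESSAGES" in path.read_text(encoding="utf-8", errors="replace"):
            result["violating"].append(path.name)
        else:
            result["compliant"].append(path.name)
    return result


if __name__ == "__main__":
    # Demonstration on a throwaway directory; point test_dir at rule_test_path instead.
    with tempfile.TemporaryDirectory() as tmp:
        test_dir = Path(tmp)
        (test_dir / "bad.c").write_text("/* CHECK-MESSAGES: placeholder */\n")
        (test_dir / "good.c").write_text("int main(void) { return 0; }\n")
        print(classify_tests(test_dir, "cpp"))
```

A suite with an empty "violating" or "compliant" list gives the generator only one kind of example, which is worth noticing before a run.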
python src/main.py --rule_file rule.json --language cpp --analyzer clang-tidy

The program prints progress information during execution. After generation finishes, the results are saved to the result-generation/ directory by default, including:
- final_checker/: the final generated checker code, such as header and implementation files.
- checker_generation_result.json: the performance report of the checker on the test suite, including metrics such as accuracy, time cost, and usage cost.
The generated checker code can be placed directly into the checker directory of the target static analysis tool and used after recompilation.
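To look at a finished run without opening files by hand, the outputs can be gathered with a few lines of Python. This sketch assumes that final_checker/ and checker_generation_result.json sit directly inside the result directory; the real layout may include an extra per-rule level. It does not assume any field names inside the report, and `summarize_results` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def summarize_results(result_dir: Path) -> dict:
    """Collect generated checker file names and the parsed report, if present."""
    checker_dir = result_dir / "final_checker"
    report_path = result_dir / "checker_generation_result.json"
    files = sorted(p.name for p in checker_dir.iterdir()) if checker_dir.is_dir() else []
    report = None
    if report_path.is_file():
        report = json.loads(report_path.read_text(encoding="utf-8"))
    return {"checker_files": files, "report": report}


if __name__ == "__main__":
    # Demonstration with placeholder files; pass Path("result-generation") for a real run.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "final_checker").mkdir()
        (root / "final_checker" / "ExampleCheck.cpp").write_text("// placeholder\n")
        (root / "checker_generation_result.json").write_text('{"example_metric": 1}')
        print(summarize_results(root))
```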
| Tool | Supported Languages |
|---|---|
| PMD | Java |
| Clang-tidy | C/C++ |
| CodeQL | Multiple languages |
Planned support:
- Semgrep
- Clang Static Analyzer
The config.json file in the project root directory can be adjusted as needed:
| Parameter | Description | Default Value |
|---|---|---|
| max_round | Maximum number of iteration rounds for each test case | 2 |
| max_compiler_trys | Maximum number of attempts to fix compilation failures | 5 |
| top_key | Number of most relevant code snippets retrieved | 2 |
| result_dir | Output directory for generated results | result-generation/ |
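One way to reason about these parameters is as overrides on top of the documented defaults. The sketch below illustrates that idea only: the default values are copied from the table above, `load_config` is a hypothetical helper, and whether AutoChecker itself falls back to defaults for keys missing from config.json is an assumption.

```python
import json

# Default values as listed in the table above.
DEFAULTS = {
    "max_round": 2,
    "max_compiler_trys": 5,
    "top_key": 2,
    "result_dir": "result-generation/",
}


def load_config(text: str) -> dict:
    """Overlay user-supplied JSON settings on the documented defaults."""
    user = json.loads(text) if text.strip() else {}
    # Keys present in the user config win; everything else keeps its default.
    return {**DEFAULTS, **user}


if __name__ == "__main__":
    # Example: allow more repair attempts for compilation failures.
    config = load_config('{"max_compiler_trys": 8}')
    for key, value in config.items():
        print(f"{key} = {value}")
```

Raising max_round or max_compiler_trys gives the generator more chances to repair a checker, at the price of more LLM calls per rule.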
If you use our work in your research or project, please consider citing:
- Jun Liu, Yuanyuan Xie, Jiwei Yan, Jinhao Huang, Jun Yan, Jian Zhang. Write Your Own CodeChecker: An Automated Test-Driven Checker Development Approach with LLMs. ICSE 2026.
@inproceedings{AutoChecker,
title={Write Your Own CodeChecker: An Automated Test-Driven Checker Development Approach with LLMs},
author={Jun Liu and Yuanyuan Xie and Jiwei Yan and Jinhao Huang and Jun Yan and Jian Zhang},
booktitle={Proceedings of the International Conference on Software Engineering (ICSE)},
year={2026}
}

AutoChecker is actively developed and maintained by members of the SQUARE Research Group:
- Jun Liu (@nonsense-j)
- Yuanyuan Xie (@xyyusr)
- Liqiang Ji (@carlson-jlq)
- Jinhao Huang (@jinhao-huang)
- Yuyang Xie (@sisifuCha)
- Xianglong Qi (@Meiosis-Poor)
- Jiwei Yan (@hanada31)
AutoChecker is an open source project, and contributions from the community are welcome. For more details, please refer to the Developer Guide.
