Paper Knowledge Workflow

paper-knowledge-workflow is a public Codex skill for building a traceable research-paper workspace from sources, notes, and drafts.

It is designed for one paper project at a time. It creates a lightweight knowledge base, keeps a source manifest, maps paragraph-level claims to evidence, and runs structure and evidence gates before manuscript revision or export.

Status

This repository is suitable for public beta use. The deterministic gates and project initializer are tested, but the workflow is intentionally not a one-click paper generator.

The scripts automate:

project initialization
PDF registration and processed text extraction
source wiki note generation and matrix row creation
structure validation
manifest field validation
source wiki coverage and local wikilink validation
paragraph-level claim evidence validation
maintenance and backlog reporting

The agent or human researcher still owns:

source ingestion decisions
literature synthesis
research-question refinement
wiki note quality and human-checked summaries
manuscript argument quality
scholarly review and final judgment

Key Features

First-use guided intake: starts with four plain questions, then infers whether the project is in preflight, materials, or draft state.
Project-level schema: every paper workspace gets its own SCHEMA.md so the file structure and evidence rules are explicit.
Direct source tracking: separates durable raw materials from processed reading aids.
PDF processing adapter: registers PDFs, extracts processed text with pdftotext or optional pypdf, and records parser status in the manifest.
Manifest-first evidence registry: uses index/source_manifest.json as the source of truth for access, reading status, evidence level, and citation metadata.
Source wiki generation: creates 10_Wiki/Sources/<source_id>.md notes from manifest sources and processed text excerpts.
Literature matrix: creates 30_Matrix/literature_matrix.md for comparison and synthesis instead of leaving notes scattered.
Paragraph-level claim map: maps manuscript claims to source IDs in 40_Paper/claim_evidence_map.md.
Hard gates: ships standard-library Python scripts for structure validation, wiki/link validation, evidence validation, and maintenance reporting.
Soft academic review: keeps scholarly judgment as a review report, not a fake automatic pass.
Optional ARS routing: can use academic-research-suite as a specialist paper-writing companion when it is installed, without vendoring it.
Public-safe examples: includes a small synthetic example project and purity tests to avoid private paths or project-specific material.

What It Is For

Starting a paper project from a topic, existing materials, or an existing draft.
Turning PDFs, notes, web sources, and bibliographic metadata into a traceable writing workspace.
Maintaining source_manifest.json, wiki pages, literature matrices, paper drafts, and claim-evidence maps.
Checking whether wiki notes cover manifest sources and local wikilinks resolve.
Checking whether manuscript claims are backed by usable, read sources.
Routing paper-specific thinking to academic-research-suite when that skill is available.

What It Is Not For

Grant proposal writing.
Cross-project knowledge-base governance.
One-off copyediting without source tracking.
A replacement for human scholarly judgment.
A full document-ingestion engine for every source format. PDF support is a practical adapter, not a universal parser for scanned or protected documents.

Install

Clone the repository, then copy the skill contents into your Codex skills directory.

git clone https://github.com/hideaway007/paper-knowledge-workflow.git
cd paper-knowledge-workflow
mkdir -p ~/.codex/skills/paper-knowledge-workflow
rsync -a skill/ ~/.codex/skills/paper-knowledge-workflow/

Restart or refresh Codex so it can discover the new skill.

If you have already cloned the repository and only want to install the skill, run the following from the repository root:

mkdir -p ~/.codex/skills/paper-knowledge-workflow
rsync -a skill/ ~/.codex/skills/paper-knowledge-workflow/

First Use

Ask Codex:

Use $paper-knowledge-workflow to start a paper project.

The skill should ask four short questions:

What is the paper topic or working title?
What discipline or research type is this?
What materials do you already have?
What do you want from this session?

The skill then infers the entry state:

preflight: topic or direction only.
materials: sources are available.
draft: a manuscript draft already exists.

The skill should not draft a full paper directly from a vague topic. It first creates or checks the research question checkpoint.
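
The entry-state inference can be pictured roughly as follows. This is a minimal sketch, not the skill's actual logic: the function name `infer_entry_state` and its two boolean inputs are hypothetical, and the rule that a draft takes priority over loose materials is an assumption. The real skill works from the free-text answers to the four questions.

```python
def infer_entry_state(has_sources: bool, has_draft: bool) -> str:
    """Return one of the three entry states (hypothetical helper)."""
    if has_draft:
        return "draft"       # a manuscript draft already exists
    if has_sources:
        return "materials"   # sources are available, but no draft yet
    return "preflight"       # topic or direction only


print(infer_entry_state(has_sources=False, has_draft=False))  # preflight
print(infer_entry_state(has_sources=True, has_draft=False))   # materials
print(infer_entry_state(has_sources=True, has_draft=True))    # draft
```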

CLI Fallback

The initialization script can create the project structure directly:

python3 skill/scripts/init_paper_project.py \
  --root ./my-paper-project \
  --title "Working Paper Title" \
  --discipline "urban studies" \
  --materials "PDFs and notes" \
  --goal "literature review"

Then process the inbox PDFs, generate wiki notes, and run the hard gates:

python3 skill/scripts/process_pdfs.py --root ./my-paper-project --source-dir ./my-paper-project/00_Inbox
python3 skill/scripts/generate_wiki.py --root ./my-paper-project
python3 skill/scripts/validate_structure.py --root ./my-paper-project
python3 skill/scripts/validate_wiki.py --root ./my-paper-project
python3 skill/scripts/validate_evidence.py --root ./my-paper-project
python3 skill/scripts/generate_maintenance_report.py --root ./my-paper-project
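
If you prefer to run the sequence from one place, a small driver along these lines may help. It is a sketch under assumptions: the script names and flags are copied from the commands above, but the helper names `build_commands` and `run_all` are hypothetical, and it assumes each script signals failure with a non-zero exit code.

```python
import subprocess
import sys

# Script order copied from the commands above.
STEPS = [
    "process_pdfs.py",
    "generate_wiki.py",
    "validate_structure.py",
    "validate_wiki.py",
    "validate_evidence.py",
    "generate_maintenance_report.py",
]


def build_commands(root: str, scripts_dir: str = "skill/scripts") -> list:
    """Build one argv list per step (hypothetical helper)."""
    commands = []
    for name in STEPS:
        cmd = [sys.executable, f"{scripts_dir}/{name}", "--root", root]
        if name == "process_pdfs.py":
            # Only the PDF step takes a source directory.
            cmd += ["--source-dir", f"{root}/00_Inbox"]
        commands.append(cmd)
    return commands


def run_all(root: str) -> int:
    """Run each step in order and stop at the first non-zero exit code.

    Assumption: the scripts report failure through their exit code.
    """
    for cmd in build_commands(root):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_all("./my-paper-project")` from the repository root would then mirror the six commands above.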

Hard Gates

validate_structure.py checks that the paper workspace matches the schema contract and that manifest entries are well-formed.
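
As an illustration of what a "well-formed" manifest check might involve, here is a minimal sketch. The key names (`sources`, `source_id`, `access`, `reading_status`, `evidence_level`) and the helper `find_manifest_errors` are assumptions made for this example; the authoritative field list is whatever the project's SCHEMA.md and validate_structure.py define.

```python
# Assumed key names; check SCHEMA.md for the real contract.
REQUIRED_FIELDS = ("source_id", "access", "reading_status", "evidence_level")


def find_manifest_errors(manifest: dict) -> list:
    """Return human-readable problems found in a manifest dict."""
    errors = []
    seen_ids = set()
    for i, entry in enumerate(manifest.get("sources", [])):
        # Every required field must be present and non-empty.
        for field in REQUIRED_FIELDS:
            if not entry.get(field):
                errors.append(f"sources[{i}]: missing {field}")
        # Source IDs must be unique so claims can cite them unambiguously.
        sid = entry.get("source_id")
        if sid:
            if sid in seen_ids:
                errors.append(f"sources[{i}]: duplicate source_id {sid}")
            seen_ids.add(sid)
    return errors
```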

process_pdfs.py registers PDF files, copies them into 20_Sources/raw/, extracts text into 20_Sources/processed/, and records parser metadata under each source's custom field. It prefers system pdftotext and can fall back to optional pypdf.
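
The "prefer pdftotext, fall back to pypdf" behaviour could look roughly like the sketch below. This is not the code in process_pdfs.py: the function names `choose_parser` and `extract_text` are hypothetical, and returning the parser name is simply one way the parser status might be captured for the manifest.

```python
import importlib.util
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def choose_parser() -> Optional[str]:
    """Pick the first available parser, preferring the system binary."""
    if shutil.which("pdftotext"):
        return "pdftotext"
    if importlib.util.find_spec("pypdf") is not None:
        return "pypdf"
    return None  # no parser available; the caller can record that status


def extract_text(pdf_path: Path, out_path: Path) -> Optional[str]:
    """Write extracted text to out_path and return the parser used."""
    parser = choose_parser()
    if parser == "pdftotext":
        # pdftotext takes the input PDF and the output text file.
        subprocess.run(["pdftotext", str(pdf_path), str(out_path)], check=True)
    elif parser == "pypdf":
        from pypdf import PdfReader  # optional dependency, imported lazily

        reader = PdfReader(str(pdf_path))
        text = "\n".join((page.extract_text() or "") for page in reader.pages)
        out_path.write_text(text, encoding="utf-8")
    return parser
```

Neither parser handles scanned or protected documents reliably, which is why the extracted text is treated as a reading aid only.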

generate_wiki.py creates source wiki notes in 10_Wiki/Sources/ and appends missing source rows to 30_Matrix/literature_matrix.md.
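
Appending only the missing rows keeps the matrix step safe to re-run. The sketch below shows the idea on a Markdown table held in a string. The three-column row layout and the helper name `append_missing_rows` are assumptions for illustration; the real matrix columns are defined by the project template.

```python
def append_missing_rows(matrix_text: str, source_ids: list) -> str:
    """Append an empty row for each source ID not already in the table."""
    existing = set()
    for line in matrix_text.splitlines():
        # Treat the first cell of each pipe-delimited line as the source ID.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if cells and cells[0]:
            existing.add(cells[0])
    new_rows = [f"| {sid} |  |  |" for sid in source_ids if sid not in existing]
    if not new_rows:
        return matrix_text  # nothing to add, so the file stays untouched
    return matrix_text.rstrip("\n") + "\n" + "\n".join(new_rows) + "\n"
```

Running it twice with the same source IDs returns the same text the second time, so existing hand-written rows are never duplicated or overwritten.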

validate_wiki.py checks that manifest sources have source wiki notes, generated notes link raw and processed artifacts, source wiki IDs still exist in the manifest, and local Obsidian-style wikilinks resolve.
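
The wikilink part of this check can be sketched as below. It is a simplified illustration, not the logic in validate_wiki.py: the helper name `unresolved_wikilinks` is hypothetical, and it assumes a link resolves when any note in the wiki directory has a matching file stem, ignoring headings and aliases.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target#Heading]], and [[Target|Alias]];
# group 1 captures the target only.
WIKILINK = re.compile(r"\[\[([^\]\|#]+)(?:#[^\]\|]*)?(?:\|[^\]]*)?\]\]")


def unresolved_wikilinks(wiki_dir: Path) -> list:
    """Return (note file name, target) pairs for links with no matching note."""
    notes = sorted(wiki_dir.rglob("*.md"))
    known = {p.stem for p in notes}
    missing = []
    for note in notes:
        text = note.read_text(encoding="utf-8")
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            # Assumption: compare on the last path component only.
            name = target.split("/")[-1]
            if name not in known:
                missing.append((note.name, target))
    return missing
```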

validate_evidence.py checks that paragraph-level claims cite manifest sources and that supported claims do not rely on unread, rejected, pending, or metadata-only sources.
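
A rough sketch of that rule follows. The data shapes are assumptions made for the example: each claim is a dict with `id`, `status`, and `sources`, and each manifest source is reduced to a single status string. In the real manifest, the unread, rejected, pending, and metadata-only conditions may live in separate fields, and `check_claims` is a hypothetical name.

```python
# Status labels taken from the description above; exact strings are assumed.
BLOCKED_STATUSES = {"unread", "rejected", "pending", "metadata-only"}


def check_claims(claims: list, source_status: dict) -> list:
    """Return problems for claims that fail the evidence rule."""
    problems = []
    for claim in claims:
        if not claim["sources"]:
            problems.append(f"{claim['id']}: no source cited")
        for sid in claim["sources"]:
            if sid not in source_status:
                problems.append(f"{claim['id']}: {sid} is not in the manifest")
            elif claim["status"] == "supported" and source_status[sid] in BLOCKED_STATUSES:
                problems.append(
                    f"{claim['id']}: supported claim relies on {source_status[sid]} source {sid}"
                )
    return problems


claims = [
    {"id": "P1-C1", "status": "supported", "sources": ["smith2020"]},
    {"id": "P1-C2", "status": "supported", "sources": ["lee2021"]},
]
status = {"smith2020": "read", "lee2021": "unread"}
for problem in check_claims(claims, status):
    print(problem)
```

Here only the second claim is reported, because it is marked supported while its only source is still unread.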

generate_maintenance_report.py summarizes source access, reading status, claim status, and backlog items.

Template examples inside fenced Markdown code blocks are ignored by the claim parser. A freshly initialized project can pass the evidence gate because it contains no real claims yet; it still needs the research question checkpoint and real claim mapping before drafting or export. Parsed PDFs and generated wiki notes are reading aids; they do not mark a source as read or verified.
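
Skipping fenced blocks before parsing claims can be done with a small line filter. The version below is a sketch of the general technique, not the parser shipped in the scripts; `strip_fenced_blocks` is a hypothetical name, and it only handles backtick fences.

```python
# Built this way so the example itself can sit inside a fenced block.
FENCE = "`" * 3


def strip_fenced_blocks(markdown: str) -> str:
    """Drop every line inside a fenced code block, including the fences."""
    kept = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # an opening or closing fence toggles state
            continue
        if not in_fence:
            kept.append(line)
    return "\n".join(kept)
```

Only the text that survives this filter would be scanned for claims, which is why template rows in a fresh project do not count as real claims.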

Project Layout

00_Inbox/
20_Sources/
  raw/
  processed/
10_Wiki/
  Sources/
30_Matrix/
40_Paper/
50_Review/
index/
SCHEMA.md
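
A quick way to confirm that a workspace has this layout is sketched below. The directory and file names come from the listing above; the helper `missing_paths` is hypothetical and checks presence only, whereas validate_structure.py enforces the full schema contract.

```python
from pathlib import Path

# Paths taken from the layout listing above.
REQUIRED_DIRS = [
    "00_Inbox",
    "10_Wiki/Sources",
    "20_Sources/raw",
    "20_Sources/processed",
    "30_Matrix",
    "40_Paper",
    "50_Review",
    "index",
]
REQUIRED_FILES = ["SCHEMA.md"]


def missing_paths(root: Path) -> list:
    """Return the required directories and files that do not exist under root."""
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing
```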

Relationship to Academic Research Suite

This skill is the project owner: it controls the schema, source registry, wiki/matrix artifacts, claim-evidence map, validation gates, and maintenance reports.

academic-research-suite is treated as an optional specialist companion for research-question refinement, literature synthesis, outlines, drafting, citation checks, revision coaching, and reviewer simulation.

This repository does not vendor or copy academic-research-suite. If it is not installed, the workflow still works with the local scripts and references.

Test

The repository uses Python standard-library tests:

python3 -m unittest discover -s tests
python3 tests/smoke_public_workflow.py

The smoke workflow initializes a temporary project, processes a synthetic PDF, generates source wiki notes, validates wiki links, validates a synthetic (non-template) source/claim pair, and checks that an unread source fails the evidence gate.

Public-Use Boundary

This repository intentionally avoids private research cases, local machine paths, grant-writing assumptions, and project-specific terminology. Keep examples small, synthetic, and neutral.

References and Credits

See NOTICE.md for the reference model, inspirations, and attribution boundaries, including academic-research-skills, awesome-ai-research-writing, Karpathy's llm-wiki.md, Agent Skills, and Obsidian.
