Commit f94a0f8

feat: add dependency topology scanner
Static analysis tool for Lua codebase architectural layering.

- scan_topology.py: CLI entry (scan / diff subcommands)
- scan_analysis.py: core analysis, group matching, policy violation detection
- graph_utils.py: pure graph algorithms (Tarjan SCC, back-edges, degree)
- html_renderer.py: interactive dagre-d3 HTML visualization with cluster expand/collapse, violation highlighting, SCC marking
- topology.jsonc: 5-layer group definitions with English comments explaining each module placement and REVIEW notes for debatable calls

scan --json produces agent-friendly output:

- health summary (cycles/violations/ungrouped with verdicts)
- cycles with severity, members_by_layer, example_cycle path, back_edges
- violations grouped by rule with full edge lists
- group_coverage confirming 0 ungrouped modules
1 parent 8edc19a commit f94a0f8

8 files changed

Lines changed: 1676 additions & 0 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -22,3 +22,7 @@ deps/
 
 # Local Claude settings (keep out of repo)
 .claude/
+
+# Dependency topology tool artifacts
+scripts/dependency-topology/__pycache__/
+*.html

AGENTS.md

Lines changed: 10 additions & 0 deletions
@@ -11,3 +11,13 @@
 - **Comments:** Avoid obvious comments that merely restate what the code does. Only add comments when necessary to explain _why_ something is done, not _what_ is being done. Prefer self-explanatory code.
 - **Config:** Centralize in `config.lua`. Use deep merge for user overrides.
 - **Types:** Use Lua annotations (`---@class`, `---@field`, etc.) for public APIs/config.
+
+## Dependency Topology Tool
+
+Use `scripts/dependency-topology/scan_topology.py` to inspect and track architectural layering.
+
+- Use `python3 scripts/dependency-topology/scan_topology.py scan` to inspect current-state vs target-policy gap
+- Use `diff` to inspect change direction (improved/regressed/neutral) between snapshots
+- Pass `--snapshot <git-ref>` for historical snapshots
+- Pass `--json` when feeding outputs into scripts or agents
+- Keep architecture cleanup discussions anchored on scanner output instead of ad-hoc grep chains
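
Reviewer note: the usage rules above could be driven from a script roughly as follows. This is a minimal sketch, not part of the commit. `build_scan_command` and `run_scan` are hypothetical helper names, and two things are assumptions rather than facts from the diff: that `--snapshot` is accepted after the `scan` subcommand, and that `--json` writes a single JSON document to stdout.

```python
import json
import subprocess
from typing import Any, Dict, List, Optional

# Script path as given in AGENTS.md.
SCANNER = "scripts/dependency-topology/scan_topology.py"


def build_scan_command(snapshot: Optional[str] = None, as_json: bool = True) -> List[str]:
    """Build the argv for a scan run (hypothetical helper)."""
    cmd = ["python3", SCANNER, "scan"]
    if snapshot is not None:
        # Assumption: --snapshot is placed after the `scan` subcommand.
        cmd += ["--snapshot", snapshot]
    if as_json:
        cmd.append("--json")
    return cmd


def run_scan(repo_root: str, snapshot: Optional[str] = None) -> Dict[str, Any]:
    """Run the scanner from the repository root and parse its output.

    Assumption: with --json, stdout holds exactly one JSON document.
    """
    proc = subprocess.run(
        build_scan_command(snapshot),
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)
```

Keeping the argv construction separate from the subprocess call makes the command easy to log or test without a checkout at hand.
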
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
# Dependency Topology Scanner

Static analysis tool for Lua codebase dependency architecture.

## File Structure

```
scripts/dependency-topology/
├── scan_topology.py   # CLI entry: scan / diff subcommands
├── scan_analysis.py   # Core analysis: groups, edge rules, payload builders
├── graph_utils.py     # Pure graph algorithms (Tarjan SCC, back edges, degree)
├── html_renderer.py   # Interactive dagre-d3 + d3v5 HTML visualization
└── topology.jsonc     # Group definitions + review comments (strategy file)
```

## Quick Start

```bash
# Scan current HEAD → generate interactive HTML
python3 scripts/dependency-topology/scan_topology.py scan

# Output to specific path
python3 scripts/dependency-topology/scan_topology.py scan -o /tmp/deps.html

# JSON output (for scripts/agents)
python3 scripts/dependency-topology/scan_topology.py scan --json

# Compare HEAD vs working tree (default)
python3 scripts/dependency-topology/scan_topology.py diff

# Compare specific refs
python3 scripts/dependency-topology/scan_topology.py diff --from main --to HEAD
```

## Snapshot References

- `worktree` — current working tree (uncommitted changes)
- `HEAD` — latest commit
- Any git ref — branch name, tag, commit SHA

**diff defaults:** `--from HEAD --to worktree`

## Output

**scan:** One-line summary + HTML file path
```
4 cycles, 20 violations, violations=20 → /path/to/dependency-graph.html
```

**diff:** Change direction summary
```
HEAD → worktree: +2/-1 edges, improved=1, regressed=0
```

## JSON Output Signals

When using `--json`:

- `health` — one-glance status for cycles / violations / ungrouped coverage
- `cycles` — SCC details with severity, members_by_layer, example_cycle, back_edges_in_scc
- `violations` — policy violations grouped by rule with full edge lists
- `group_coverage` — module counts per layer (including ungrouped)
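
Reviewer note: a consumer of the `--json` output might first check that the four signals listed above are present before reading any of them. The sketch below is illustrative only. The four key names come from the README; the helper name `missing_signals` and the sample payload are hypothetical, and the values inside each key are placeholders because the diff shown here does not define their shape.

```python
import json
from typing import Any, Dict, List

# Top-level keys named in the README's "JSON Output Signals" list.
DOCUMENTED_KEYS = ("health", "cycles", "violations", "group_coverage")


def missing_signals(payload: Dict[str, Any]) -> List[str]:
    """Return the documented top-level keys that the payload lacks."""
    return [key for key in DOCUMENTED_KEYS if key not in payload]


# Hypothetical payload: only the key names are taken from the README,
# the empty values are placeholders and not the scanner's real schema.
SAMPLE = '{"health": {}, "cycles": [], "violations": {}}'

payload = json.loads(SAMPLE)
MISSING = missing_signals(payload)
print("missing signals:", MISSING)
```

A check like this lets a script fail early if the payload layout changes, instead of hitting a `KeyError` deep inside later processing.
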
Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
#!/usr/bin/env python3
"""Repository-local static Lua dependency graph helpers.

Mechanism only:
- Parse `require('opencode.*')` edges from `lua/opencode/**/*.lua`
- Build snapshot graph from worktree or git ref
- Provide SCC / back-edge utilities
"""

from __future__ import annotations

from collections import Counter, defaultdict
from dataclasses import dataclass
from pathlib import Path
import re
import subprocess
from typing import Dict, Iterable, List, Optional, Sequence, Set, Tuple


# Call form require('opencode.x') and bare-string form require 'opencode.x'.
REQUIRE_PATTERNS = [
    re.compile(r"require\s*\(\s*['\"](opencode(?:\.[^'\"]+)?)['\"]\s*\)"),
    re.compile(r"require\s+['\"](opencode(?:\.[^'\"]+)?)['\"]"),
]


@dataclass
class SnapshotGraph:
    snapshot: str
    files: int
    nodes: Dict[str, str]  # module -> relative file path
    edges: Set[Tuple[str, str]]


def module_from_relpath(relpath: str) -> Optional[str]:
    if not relpath.startswith("lua/opencode/") or not relpath.endswith(".lua"):
        return None
    mod = relpath[len("lua/") : -len(".lua")]
    if mod.endswith("/init"):
        mod = mod[: -len("/init")]
    return mod.replace("/", ".")


def _worktree_files(repo: Path) -> List[Tuple[str, str]]:
    out: List[Tuple[str, str]] = []
    base = repo / "lua" / "opencode"
    for fp in base.rglob("*.lua"):
        rel = fp.relative_to(repo).as_posix()
        text = fp.read_text(encoding="utf-8", errors="ignore")
        out.append((rel, text))
    return out


def _git_files(repo: Path, ref: str) -> List[Tuple[str, str]]:
    cmd = ["git", "ls-tree", "-r", "--name-only", ref, "lua/opencode"]
    ls = subprocess.check_output(cmd, cwd=repo, text=True)

    out: List[Tuple[str, str]] = []
    for rel in ls.splitlines():
        if not rel.endswith(".lua"):
            continue
        show_cmd = ["git", "show", f"{ref}:{rel}"]
        try:
            # Decode leniently so undecodable bytes are dropped, as in
            # _worktree_files, instead of raising UnicodeDecodeError.
            text = subprocess.check_output(
                show_cmd,
                cwd=repo,
                encoding="utf-8",
                errors="ignore",
                stderr=subprocess.DEVNULL,
            )
        except subprocess.CalledProcessError:
            continue
        out.append((rel, text))
    return out


def load_snapshot_graph(repo: Path, snapshot: str) -> SnapshotGraph:
    files = _worktree_files(repo) if snapshot == "worktree" else _git_files(repo, snapshot)

    nodes: Dict[str, str] = {}
    for rel, _ in files:
        module = module_from_relpath(rel)
        if module:
            nodes[module] = rel

    edges: Set[Tuple[str, str]] = set()
    for rel, content in files:
        src = module_from_relpath(rel)
        if not src:
            continue

        deps: Set[str] = set()
        for pat in REQUIRE_PATTERNS:
            deps.update(m.group(1) for m in pat.finditer(content))

        # Keep only edges whose target exists in this snapshot.
        for dep in deps:
            if dep in nodes:
                edges.add((src, dep))

    return SnapshotGraph(snapshot=snapshot, files=len(files), nodes=nodes, edges=edges)


def tarjan_scc(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> List[List[str]]:
    graph: Dict[str, List[str]] = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)

    index = 0
    stack: List[str] = []
    on_stack: Set[str] = set()
    indices: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    result: List[List[str]] = []

    def strongconnect(v: str) -> None:
        nonlocal index
        indices[v] = index
        lowlink[v] = index
        index += 1
        stack.append(v)
        on_stack.add(v)

        for w in graph[v]:
            if w not in indices:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], indices[w])

        # v is the root of an SCC: pop the whole component off the stack.
        if lowlink[v] == indices[v]:
            comp: List[str] = []
            while True:
                w = stack.pop()
                on_stack.remove(w)
                comp.append(w)
                if w == v:
                    break
            result.append(comp)

    for n in sorted(set(nodes)):
        if n not in indices:
            strongconnect(n)

    return result


def back_edges(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    graph: Dict[str, List[str]] = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
    for n in graph:
        graph[n] = sorted(set(graph[n]))

    white, gray, black = 0, 1, 2
    color: Dict[str, int] = {n: white for n in set(nodes)}
    backs: Set[Tuple[str, str]] = set()

    def dfs(v: str) -> None:
        color[v] = gray
        for w in graph[v]:
            c = color.get(w, white)
            if c == white:
                dfs(w)
            elif c == gray:
                # w is still on the DFS path, so (v, w) closes a cycle.
                backs.add((v, w))
        color[v] = black

    for n in sorted(color.keys()):
        if color[n] == white:
            dfs(n)

    return backs


def degree(edges: Iterable[Tuple[str, str]]) -> Tuple[Counter, Counter]:
    indeg: Counter = Counter()
    outdeg: Counter = Counter()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return indeg, outdeg


def find_cycle_in_scc(members: List[str], edges: Iterable[Tuple[str, str]]) -> List[str]:
    """Return one concrete cycle path within an SCC, e.g. [a, b, c, a].

    Uses DFS from the first member; backtracks until a back-edge is found.
    Returns [] if no cycle is found (shouldn't happen for a real SCC > 1).
    """
    # Guard: sorted(members)[0] below would raise IndexError on an empty list.
    if not members:
        return []

    member_set = set(members)
    graph: Dict[str, List[str]] = defaultdict(list)
    for a, b in edges:
        if a in member_set and b in member_set:
            graph[a].append(b)
    for n in graph:
        graph[n] = sorted(set(graph[n]))

    path: List[str] = []
    on_path: Dict[str, int] = {}  # node -> index in path
    visited: Set[str] = set()

    def dfs(v: str) -> List[str]:
        path.append(v)
        on_path[v] = len(path) - 1
        for w in graph[v]:
            if w in on_path:
                # Found cycle: extract from w's position to end, close it
                return path[on_path[w]:] + [w]
            if w not in visited:
                visited.add(w)
                result = dfs(w)
                if result:
                    return result
        path.pop()
        del on_path[v]
        return []

    start = sorted(members)[0]
    visited.add(start)
    return dfs(start)


def largest_scc_size(comps: Sequence[Sequence[str]]) -> int:
    nontrivial = [c for c in comps if len(c) > 1]
    return max((len(c) for c in nontrivial), default=0)
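
Reviewer note: to see what the edge extraction in `load_snapshot_graph` does without a checkout, the sketch below runs the same two regexes and the same path-to-module rule over a small in-memory snapshot. The file paths and Lua contents in `FILES` and the helper name `extract_edges` are made up for illustration; only `REQUIRE_PATTERNS` and `module_from_relpath` mirror the file above.

```python
import re
from typing import Dict, Optional, Set, Tuple

# Same two patterns as in the file above: call form and bare-string form.
REQUIRE_PATTERNS = [
    re.compile(r"require\s*\(\s*['\"](opencode(?:\.[^'\"]+)?)['\"]\s*\)"),
    re.compile(r"require\s+['\"](opencode(?:\.[^'\"]+)?)['\"]"),
]


def module_from_relpath(relpath: str) -> Optional[str]:
    """Map lua/opencode/foo/init.lua to opencode.foo (same rule as above)."""
    if not relpath.startswith("lua/opencode/") or not relpath.endswith(".lua"):
        return None
    mod = relpath[len("lua/"):-len(".lua")]
    if mod.endswith("/init"):
        mod = mod[:-len("/init")]
    return mod.replace("/", ".")


# Hypothetical in-memory snapshot; these paths and contents are invented.
FILES: Dict[str, str] = {
    "lua/opencode/init.lua": "local config = require('opencode.config')\n",
    "lua/opencode/config.lua": 'local util = require "opencode.util"\n',
    "lua/opencode/util/init.lua": (
        "local job = require('plenary.job')\n"           # not opencode.*: no match
        "local gone = require('opencode.missing')\n"     # no such file: dropped
    ),
}


def extract_edges(files: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Collect (source, dependency) pairs whose target exists in the snapshot."""
    nodes = {m for m in map(module_from_relpath, files) if m}
    edges: Set[Tuple[str, str]] = set()
    for rel, text in files.items():
        src = module_from_relpath(rel)
        if not src:
            continue
        for pat in REQUIRE_PATTERNS:
            for match in pat.finditer(text):
                if match.group(1) in nodes:
                    edges.add((src, match.group(1)))
    return edges


EDGES = extract_edges(FILES)
print(sorted(EDGES))
```

Because the patterns run over raw file text, a `require` inside a Lua comment or string literal would also be counted, and a dynamically built module name would not. That trade-off seems acceptable for a layering report but is worth keeping in mind when reading cycle counts.
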
