codedump (the Code Dumper plugin) lifts a function and everything it touches out of IDA Pro
and into a single, self-contained artifact: decompiled code, call graphs, annotated assembly, or types.
It does not just paste pseudocode. It walks the call graph up and down from your cursor, decompiles the
whole neighbourhood, resolves the types those functions reference, traces where every argument comes from,
and writes it all to one file (or your clipboard). It can also carry your names, prototypes, comments,
local variables, and types from one build of a binary to the next, verified function-by-function.
Open a function in the Hex-Rays pseudocode view, right-click, and pick something from Dump code/:
- Dump code (.c) - the function plus its callers and callees, all decompiled, with the structs/unions/enums they use pasted in at the top, cross-references annotated, and dataflow traced between them.
- Dump call graph (.dot) - a Graphviz call graph of the same neighbourhood, edges colour-coded by how the calls are made.
- Dump assembly (.asm) - the same neighbourhood as annotated disassembly.
- Dump PTN (.ptn) - just the provenance/dataflow facts, in a compact text notation.
- Export metadata / Apply metadata - save all your reverse-engineering work for a function tree, then re-apply it onto the same code in a different IDB.
Right-click a type in the Local Types view instead, and you get Dump type/ - render a type and its dependencies to a header file, to the clipboard, or as a dependency graph.
Everything writes to a file by default, or to the clipboard if you tick the box.
Sharing a piece of a binary is annoying. You select the pseudocode, copy it, and immediately lose half the context: the structs are gone, the helper functions it calls are gone, you can't tell where a parameter actually came from, and the reader has no idea which other functions reach this one. So you copy those too, and the structs, and now you're hand-assembling a context dump in a scratch file.
codedump does that assembly for you. Point it at one function and it produces the whole readable unit (code, types, cross-references, and dataflow) in a form you can drop into a ticket, a chat, a diff, or an LLM prompt without further editing.
And there's a second problem it solves. When you've spent a day naming variables and writing prototypes in one build, and then a new build lands with the same code at different addresses, you don't want to redo the work. codedump's metadata transfer fingerprints each function structurally (ignoring absolute addresses) and replays your names, comments, prototypes, locals, and types onto the matching functions in the new database, skipping anything that doesn't actually match.
A .c dump of a single function with default settings reads like this (abridged):
// Decompiled code dump generated by CodeDumper
// --------
#PTN v1
// @PTN LEGEND
// Nodes: L(name)=local; P(name)=param; G(name|addr)=global; C(val)=constant; R(func)=return val.
// Slices: @[off:len] or .field_name; '&' = address-of; '*' = deref.
// A: alias/assignment => A: dst := src
// I: inbound (caller->param) => I: origin -> P(name)
// E: outbound (arg->callee) => E: origin -> A(callee, arg_idx)
// R: return flow => R: callee() -> L(var)
// G: global access => G: F(func) -> G(addr) (write) or G(addr) -> F(func) (read)
// --------
// Start Function: 0x401000 (parse_request)
// Caller Depth: 2
// Callee/Ref Depth: 2
// Total Functions Found: 7
// Included Functions (7):
// - parse_request (0x401000)
// - validate (0x401100)
// - handle_conn (0x401200)
// ...
// Removed Functions: None
// ------------------------------------------------------------
// ============================================================
// Referenced types (3 ordinals, recursively resolved)
// ============================================================
struct Request {
int id;
char verb[8];
char *body;
};
// ... every type these functions touch, in dependency order ...
// ============================================================
// Incoming xrefs for parse_request (0x401000): handle_conn (0x401200) [direct_call]
// Outgoing xrefs for parse_request (0x401000): validate (0x401100) [direct_call], memcpy (0x4010F0) [direct_call]
// @PTN D:F3=0x401000,parse_request
// @PTN I:P(buf){conf=high,cs=0x401210,caller=F5} -> P(buf)
// @PTN E:L(len)@[0x0:?] -> A(memcpy,2) {conf=med,cs=0x401055}
// --- Function: parse_request (0x401000) ---
// incoming: rdi, rsi
// outgoing: rax, rcx, rdx
// offset: RVA 0x1000 EA 0x401000
__int64 __fastcall parse_request(char *buf, int len)
{
// ... Hex-Rays pseudocode ...
}
// --- End Function: parse_request (0x401000) ---
Every block above is something codedump computed so you didn't have to. The header lists the whole function set and the depths used. The type section is the recursively-resolved set of structs the code references. The per-function preamble gives you incoming/outgoing cross-references, the provenance (@PTN) lines, an optional register liveness summary, and both the RVA and absolute address so a named function still carries its location when you copy just the body out.
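Because each body is bracketed by matching comment markers, a dump is easy to split back into per-function pieces. The following is a minimal sketch, assuming the marker lines look exactly like the abridged sample above; `split_functions` is a hypothetical helper, not part of codedump.

```python
import re

# Marker shapes copied from the abridged sample; real dumps may differ.
START = re.compile(r"^// --- Function: (?P<name>.+) \((?P<ea>0x[0-9A-Fa-f]+)\) ---$")
END = re.compile(r"^// --- End Function: ")

def split_functions(dump_text):
    """Return {name: (address, body_text)} for each bracketed function block."""
    blocks = {}
    current = None
    lines = []
    for line in dump_text.splitlines():
        match = START.match(line)
        if match:
            current = (match.group("name"), int(match.group("ea"), 16))
            lines = []
        elif END.match(line) and current is not None:
            blocks[current[0]] = (current[1], "\n".join(lines))
            current = None
        elif current is not None:
            lines.append(line)
    return blocks

sample = """// --- Function: parse_request (0x401000) ---
// offset: RVA 0x1000 EA 0x401000
__int64 __fastcall parse_request(char *buf, int len)
{
}
// --- End Function: parse_request (0x401000) ---"""

blocks = split_functions(sample)
```

Here `blocks["parse_request"]` holds the address `0x401000` and the text between the two markers, which is handy when only one body from a large dump needs to be quoted.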
codedump runs a fixed five-pass pipeline against the function under your cursor. A cancellable wait box reports live progress through each pass.
| Pass | What happens |
|---|---|
| 1 - Callers | Breadth-first walk up the call graph, up to Caller Depth layers. |
| 2 - Callees | Breadth-first walk down the call graph (and references), up to Callee Depth layers. |
| 3 - Bodies | Decompile each discovered function's ctree and extract a provenance summary (arguments, aliases, global accesses). Optionally compute a register-liveness summary. |
| 4 - Decompile | Produce the pseudocode for each function and collect every type it references. Optionally capture disassembly. |
| 5 - Output | Build PTN annotations, resolve the type tree, render the chosen format, and write it to a file or the clipboard. |
The graph walk is what makes the output useful rather than a single-function paste. It doesn't just follow direct calls; it understands seven distinct kinds of edge, each of which you can switch on or off:
| Edge | How codedump finds it |
|---|---|
| direct_call | A call/jmp to a known function start. |
| indirect_call | A call reg/call [mem] whose target codedump can recover from a nearby register load or displacement. |
| data_ref | A data reference into a function (e.g. a function pointer stored in a table). |
| immediate_ref | A function address used as an immediate operand. |
| tail_call_push_ret | A tail jmp, or the push <addr>; ret trampoline pattern. |
| virtual_call | An indirect call resolved through a scanned vtable, matched by slot offset. |
| jump_table | Targets recovered from a switch/jump table. |
The graph nodes and these edges are exactly what the .dot output draws, so a call-graph dump and a code dump always describe the same neighbourhood.
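The two graph passes can be pictured as two depth-limited breadth-first walks over a filtered edge list. This is only a sketch under stated assumptions: the edge list, the function names `log_line` and `install_handlers`, and the helpers `bfs` and `neighbourhood` are hypothetical, and the real plugin discovers edges from the database rather than from a prepared list.

```python
from collections import deque

def bfs(seeds, neighbours, max_depth):
    """Breadth-first walk from `seeds`, at most `max_depth` layers out."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

def neighbourhood(start, edges, caller_depth, callee_depth, enabled_kinds):
    """edges: list of (src, dst, kind). Returns the set of functions to dump."""
    live = [(s, d) for s, d, kind in edges if kind in enabled_kinds]
    callers_of = lambda n: [s for s, d in live if d == n]
    callees_of = lambda n: [d for s, d in live if s == n]
    up = bfs([start], callers_of, caller_depth)          # pass 1: callers
    return bfs(sorted(up), callees_of, callee_depth)     # pass 2: callees of everything found

ALL_KINDS = {"direct_call", "indirect_call", "data_ref", "immediate_ref",
             "tail_call_push_ret", "virtual_call", "jump_table"}

# Hypothetical edges for illustration only.
edges = [
    ("handle_conn", "parse_request", "direct_call"),
    ("handle_conn", "log_line", "direct_call"),
    ("install_handlers", "parse_request", "data_ref"),
    ("parse_request", "validate", "direct_call"),
    ("parse_request", "memcpy", "direct_call"),
]

full = neighbourhood("parse_request", edges, 1, 1, ALL_KINDS)
calls_only = neighbourhood("parse_request", edges, 1, 1, {"direct_call"})
```

Switching off `data_ref` removes `install_handlers` from the result, which is the same effect as unticking that edge kind before a dump.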
codedump is an IDA plugin built on idax. idax pulls in the IDA SDK and ida-cmake for you, so the only hard requirements on your side are a C++23 compiler and an IDA install with Hex-Rays.
git clone git@github.com:19h/ida-codedump.git
cd ida-codedump
# Build (Ninja if available, otherwise Make)
make
# Install into ~/.idapro/plugins (codesigned on macOS, which IDA requires)
make install
That drops codedump.so / codedump.dylib / codedump.dll into ~/.idapro/plugins/. Restart IDA, open a binary, and the plugin announces itself in the Output window:
[codedump] Plugin initialized (right-click in pseudocode or Local Types view)
The plugin is intentionally hidden: it has no menu entry and no default hotkey. You drive it entirely from the right-click popups in the pseudocode and Local Types views. (Running it from the plugins list directly just opens the code-dump dialog.)
Good to know codedump initialises the Hex-Rays decompiler at load time; if Hex-Rays is unavailable it prints a message and refuses to load, because every one of its features depends on the decompiler.
- CMake 3.27+
- A C++23 compiler (Clang 17+, GCC 13+, MSVC 2022) - it uses std::format and std::expected
- IDA Pro with the Hex-Rays decompiler
- IDASDK set, or let idax fetch the SDK itself
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
# then copy build/codedump.{so,dylib,dll} into your plugins dir;
# build/cdump or build/cdump.exe is the headless CLI
CI builds the plugin and cdump on macOS (x86_64 + arm64), Linux x86_64, and Windows x86_64 on every push; tagged v* pushes attach both binaries to a GitHub release.
For bulk dumps or CI jobs, cdump is the headless idalib command-line tool. It opens an IDB or binary and renders the same code, assembly, DOT, and PTN outputs as the plugin.
It is part of the default build:
make # builds the plugin and cdump
make cdump # explicitly build just the CLI target
The equivalent CMake build is:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target cdump
By default the executable links through the SDK's idasdk::idalib import target, so CI can build release artifacts without a full IDA install. At runtime, cdump still needs the real IDA libraries from your local IDA installation (libidalib and libida). Put the executable in the IDA library directory, or add that directory to your platform's loader path (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, or PATH).
For local development, you can instead link the executable directly against an installed IDA runtime:
IDADIR=/path/to/ida/bin cmake -S . -B build -DCODEDUMP_CLI_USE_IDADIR_RUNTIME=ON
cmake --build build --target cdump
Tagged releases upload the CLI as cdump_linux-x86_64, cdump_macos-x86_64, cdump_macos-arm64, or cdump_windows-x86_64.exe.
Usage:
cdump [options] <input.idb|binary>
-f, --functions <spec> Start functions by name or 0xADDR. Separate with comma or |
or repeat -f. Omit for all functions.
-o, --out <file> Output path. Use "-" for stdout. By default cdump writes
<input-stem>.<ext> next to the command-line input.
--format code|asm|dot|ptn Output kind. Default: code.
--caller-depth N Walk callers from the resolved start functions. Default: 0.
--callee-depth N Walk callees/refs from the resolved start functions. Default: 0.
--ptn, --no-ptn Enable or omit inline PTN in code/asm output. Default: off.
--regs, --register-in-out Add incoming/outgoing register summaries.
--offsets, --size-comments
Annotate types with member offsets, sizes, and sizeof.
--trim, --referenced-only Trim structs/unions to referenced fields.
--no-direct-calls ... Disable graph-walk xref kinds; one flag per xref kind.
--max-chars N For code output, drop smallest non-seed function blocks
until the rendered text fits. 0 means unlimited.
-q, --quiet Suppress progress output.
-v, --verbose Show extra progress detail.
Examples:
cdump mybin.i64 # every function; writes ./mybin.c by default
cdump -f 0x140001000,main -o out.c mybin
cdump -f parse --callee-depth 3 --ptn --regs --offsets --trim mybin
cdump -f main --callee-depth 2 --format dot -o graph.dot mybin.i64
Key properties:
- All-functions mode iterates ida::function::by_index directly and skips caller/callee graph discovery. Depths and xref filters apply only when at least one -f seed resolves.
- If every supplied -f spec fails to resolve, the current CLI falls back to all-functions mode.
- Type declarations are collected only from decompiled functions in the dump; unused Local Types entries are not emitted.
- Code and assembly outputs include resolved type declarations. DOT output uses discovered graph edges. Standalone PTN output is produced with --format ptn.
- The CLI runs in its own idalib process and does not require launching the IDA GUI.
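For bulk jobs it can help to build the cdump argument list in a script rather than by hand. The sketch below uses only flags that appear in the usage text above; `cdump_command` is a hypothetical helper, and it assumes `cdump` is on the loader and executable paths when the list is eventually run.

```python
def cdump_command(input_path, functions=(), fmt="code",
                  caller_depth=0, callee_depth=0, out=None, ptn=False):
    """Build an argv list for cdump from the options in the usage text."""
    args = ["cdump"]
    if functions:
        args += ["-f", ",".join(functions)]
        # Depths only apply when at least one -f seed resolves,
        # so they are omitted entirely in all-functions mode.
        if caller_depth:
            args += ["--caller-depth", str(caller_depth)]
        if callee_depth:
            args += ["--callee-depth", str(callee_depth)]
    if fmt != "code":
        args += ["--format", fmt]
    if ptn:
        args.append("--ptn")
    if out is not None:
        args += ["-o", out]
    args.append(input_path)
    return args

cmd = cdump_command("mybin.i64", functions=["main"], fmt="dot",
                    callee_depth=2, out="graph.dot")
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell quoting problems with function names.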
This is the headline use case: one .c file with the function, its neighbourhood, its types, its cross-references, and its dataflow.
- Open the function in the pseudocode view (press F5 on it).
- Right-click anywhere in the pseudocode → Dump code/ → Dump code (.c)….
- In the dialog, set Caller Depth and Callee Depth (how far up and down the call graph to walk), choose an Output File, and toggle any options you want.
- Click OK. A progress box shows the five passes; you can cancel at any time and it stops cleanly.
Good to know The output header lists every function included and the depths used, so nothing is ever silently dropped. If you set Max Characters, codedump removes the smallest non-root functions first and always keeps the function you started from.
Tip Tick Copy to clipboard to skip the file and push the whole dump straight to your clipboard, handy for pasting into a chat, ticket, or LLM prompt. codedump reports which clipboard backend it used and falls back to a select-all dialog if none is available.
Tip For a big struct you only touch a couple of fields of, tick Trim types to referenced fields only to collapse it to the accessed members (the rest is padded), keeping the dump readable.
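The Max Characters behaviour described above (drop the smallest non-root functions first, always keep the start function) can be sketched as a simple loop. This is an illustration under assumptions: `fit_to_budget` is a hypothetical name, and it counts only the function bodies, whereas the real cap is measured against the rendered output.

```python
def fit_to_budget(blocks, root, max_chars):
    """blocks: {name: text}. Drop the smallest non-root blocks until the total fits.

    Returns (kept_blocks, removed_names). A cap of 0 means unlimited.
    """
    kept = dict(blocks)
    removed = []
    if max_chars == 0:
        return kept, removed
    while sum(len(text) for text in kept.values()) > max_chars:
        candidates = [name for name in kept if name != root]
        if not candidates:
            break  # only the root is left, and it is always kept
        victim = min(candidates, key=lambda name: len(kept[name]))
        removed.append(victim)
        del kept[victim]
    return kept, removed

blocks = {
    "parse_request": "x" * 50,   # the start function
    "validate": "y" * 10,
    "handle_conn": "z" * 30,
}
kept, removed = fit_to_budget(blocks, "parse_request", 85)
```

With a cap of 85 only `validate` is removed; with a cap smaller than the root itself, everything but the root goes and the root is still emitted.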
- With the function open in pseudocode, right-click → Dump code/ → Dump call graph (.dot)….
- Use the same dialog; the depths and Xref Types checkboxes decide which functions and edges appear.
- Save the .dot and render it, e.g. dot -Tsvg out.dot -o out.svg.
Good to know Your start function is filled light blue. Edge colour encodes the call kind (direct = black, indirect = blue, data = green, immediate = orange, tail = red, virtual = purple, jump table = brown); indirect and virtual calls are dashed; the label lists every reference type joining two functions.
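To make the colour scheme concrete, here is a small sketch that emits Graphviz source using the colours and dash styles listed above. It is not codedump's renderer: `to_dot` is a hypothetical helper, and letting the first reference kind pick the colour when several kinds join the same two functions is an assumption.

```python
# Colour and line style per edge kind, taken from the description above.
EDGE_STYLE = {
    "direct_call": ("black", "solid"),
    "indirect_call": ("blue", "dashed"),
    "data_ref": ("green", "solid"),
    "immediate_ref": ("orange", "solid"),
    "tail_call_push_ret": ("red", "solid"),
    "virtual_call": ("purple", "dashed"),
    "jump_table": ("brown", "solid"),
}

def to_dot(start, edges):
    """edges: {(src, dst): [kind, ...]} -> Graphviz DOT source text."""
    lines = ["digraph callgraph {"]
    lines.append(f'  "{start}" [style=filled, fillcolor=lightblue];')
    for (src, dst), kinds in sorted(edges.items()):
        colour, style = EDGE_STYLE[kinds[0]]   # assumption: first kind picks the style
        label = ", ".join(kinds)               # the label lists every kind
        lines.append(
            f'  "{src}" -> "{dst}" [color={colour}, style={style}, label="{label}"];'
        )
    lines.append("}")
    return "\n".join(lines)

dot = to_dot("parse_request", {
    ("handle_conn", "parse_request"): ["direct_call"],
    ("parse_request", "dispatch"): ["virtual_call", "data_ref"],  # hypothetical edge
})
```

Writing `dot` to a file and running `dot -Tsvg` on it gives a picture in the same visual vocabulary as a real call-graph dump.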
- Right-click → Dump code/ → Dump PTN (.ptn)… for the dataflow facts on their own, or just read the inline // @PTN comments inside a .c dump.
- Use the legend at the top of any dump (and the PTN reference below) to decode the notation.
Good to know PTN forwards arguments across call boundaries up to Callee Depth levels, so a parameter passed through several wrapper functions is traced end to end. The confidence tag (high/med/low) drops honestly through lossy operations like a non-constant index or a conditional.
Tip If you only want clean code with no dataflow noise, tick Omit PTN annotations.
- Right-click → Dump code/ → Dump assembly (.asm)….
- Same dialog and same neighbourhood as the code dump.
Good to know PTN hints are attached to the exact instruction that sets up each argument, and the referenced type declarations are included as comments, so the single .asm file stands on its own:
401055: call memcpy ; @PTN E:L(len) -> A(memcpy,2)
- Switch to the Local Types view.
- Right-click a type → Dump type/ and choose:
  - Dump type… / Dump type recursively… to write a .h file,
  - Copy type / Copy type recursively… to send it to the clipboard,
  - Dump type graph (.dot)… / Copy type graph (.dot)… for a dependency graph.
- For the recursive variants, set the Recursion Depth: 0 = just this type, -1 = unlimited, N = follow dependencies N levels.
Good to know The type-graph dialog offers two layouts: Simple (one node per type) and Table (a record-style node where each edge leaves the specific field that references another type). You can include or exclude enums and typedefs.
Tip Tick size comments to annotate members with // off=N size=M and structs with // sizeof=0xN.
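The Recursion Depth setting (0, -1, or N) amounts to a depth-limited walk over type dependencies, followed by emitting dependencies before the types that use them. The sketch below illustrates that idea only; `collect_types` and the dependency map are hypothetical, and how codedump orders types internally is an assumption.

```python
from collections import deque

def collect_types(root, deps, depth):
    """Return type names in dependency order.

    depth: 0 = just the root, -1 = unlimited, N = follow dependencies N levels.
    """
    # Breadth-first pass: which types lie within `depth` levels of the root?
    level = {root: 0}
    queue = deque([root])
    while queue:
        name = queue.popleft()
        if depth != -1 and level[name] >= depth:
            continue
        for dep in deps.get(name, []):
            if dep not in level:
                level[dep] = level[name] + 1
                queue.append(dep)
    # Post-order pass: emit each dependency before the type that needs it.
    order, done = [], set()
    def emit(name):
        if name in done:
            return  # also guards against self-referential types
        done.add(name)
        for dep in deps.get(name, []):
            if dep in level:
                emit(dep)
        order.append(name)
    emit(root)
    return order

# Hypothetical dependency map: Request uses Header and Verb, Header uses Verb and Flags.
deps = {"Request": ["Header", "Verb"], "Header": ["Verb", "Flags"]}
everything = collect_types("Request", deps, -1)
```

A depth of 0 yields just `Request`, a depth of 1 adds `Header` and `Verb`, and -1 also pulls in `Flags`.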
Use this when a new build of the same binary lands at different addresses and you don't want to redo your naming and typing work.
In the build where you did the work:
- Open the top-level function of the tree you want to save, in pseudocode.
- Right-click → Dump code/ → Export metadata (functions+types)….
- Choose how deep to walk callees and an output path, then OK. codedump writes a .cdumpmeta file containing names, prototypes, comments, locals, referenced types, and a structural fingerprint for every function in the tree.
In the new build:
- Open the function that corresponds to the old root.
- Right-click → Dump code/ → Apply metadata… and point it at the .cdumpmeta file.
- Review the summary of matched / mismatched / applied counts when it finishes.
Good to know Every function is fingerprinted structurally (byte size, instruction count, basic-block count, and a hash that ignores absolute addresses), and the fingerprint is checked before anything is written. A function that actually changed won't match, so stale names never land on the wrong code.
Tip Leave Continue past fingerprint mismatches on to skip only the offending subtree instead of aborting the whole apply. Turn off Overwrite existing names if you'd rather keep names you've already set in the new database.
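The idea of a fingerprint that survives an address shift can be sketched as follows. This is not codedump's algorithm: the instruction representation, the use of SHA-256, and the rule "mask any hex operand that falls inside the image" are all assumptions made for illustration, and `fingerprint` is a hypothetical helper.

```python
import hashlib
import re

HEX = re.compile(r"0x[0-9A-Fa-f]+")

def fingerprint(insns, byte_size, bb_count, image_base, image_size):
    """insns: list of (mnemonic, [operand strings]).

    Returns (byte size, instruction count, basic-block count, hash), where the
    hash replaces operands that point into the image with a placeholder.
    """
    def mask(match):
        value = int(match.group(0), 16)
        inside = image_base <= value < image_base + image_size
        return "ADDR" if inside else match.group(0)

    digest = hashlib.sha256()
    for mnemonic, operands in insns:
        text = mnemonic + " " + ",".join(HEX.sub(mask, op) for op in operands)
        digest.update(text.encode() + b"\n")
    return (byte_size, len(insns), bb_count, digest.hexdigest()[:16])

old_build = [("mov", ["rdi", "rax"]), ("call", ["0x401100"]), ("add", ["rax", "0x8"]), ("ret", [])]
new_build = [("mov", ["rdi", "rax"]), ("call", ["0x501100"]), ("add", ["rax", "0x8"]), ("ret", [])]
patched   = [("mov", ["rdi", "rax"]), ("call", ["0x501100"]), ("add", ["rax", "0x10"]), ("ret", [])]

fp_old = fingerprint(old_build, 12, 1, 0x400000, 0x10000)
fp_new = fingerprint(new_build, 12, 1, 0x500000, 0x10000)
fp_patched = fingerprint(patched, 12, 1, 0x500000, 0x10000)
```

The relocated call target does not change the fingerprint, while the changed immediate does, so `fp_old == fp_new` but `fp_patched` differs.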
The two depths decide how much shows up. Bigger isn't better; past a point you get a wall of unrelated functions and a slow run.
| Goal | Caller / Callee | What you get |
|---|---|---|
| Read one function with its helpers | 0 / 2 | The function and two layers of what it calls - no callers. |
| Understand a function in full context | 1 / 2 | Adds one layer of callers, so you see how it's reached. |
| Map a small subsystem (the default) | 2 / 2 | A balanced neighbourhood in both directions. |
| Just this function, nothing else | 0 / 0 | The single function on its own. |
| Tight context for an LLM prompt | 1 / 1 + Max Characters | A small, capped slice. |
Good to know The callee walk starts from everything found so far (the start function and its callers), so adding caller depth also pulls in the callees of those callers. One extra caller layer can widen the dump more than you expect.
Good to know Each extra callee layer can multiply the function count, and every discovered function gets decompiled. Deep walks on a densely-connected binary are slow; the wait box shows live progress and cancels cleanly at any point. Use Max Characters to bound the output, and untick Xref Types you don't care about (data and immediate refs especially) to keep the graph focused.
Common goals as concrete option sets:
- Drop a function into a chat or LLM - Dump code (.c), Caller 1 / Callee 1, Max Characters ≈ 40000, Copy to clipboard. Add Trim types to referenced fields only if the structs are large.
- File a readable bug repro - Dump code (.c), default 2/2, Include size comments, written to a file next to the IDB.
- Sketch a subsystem - Dump call graph (.dot), Caller 1 / Callee 3, untick everything in Xref Types except the call edges, then dot -Tsvg.
- Audit dataflow into a sink - Dump PTN (.ptn) on the function that reaches the sink, Callee depth high enough to span the wrappers; read the E:/I: chains.
- Grab a struct and its dependencies - Local Types → Dump type recursively…, depth -1, size comments on.
- Move a day's work to a new build - Export metadata from the root at a generous callee depth, then Apply metadata onto the matching function in the new IDB.
| Menu item | Where | Output | Goes to |
|---|---|---|---|
| Dump code (.c) | Pseudocode | Decompiled neighbourhood + types + xrefs + PTN | File or clipboard |
| Dump call graph (.dot) | Pseudocode | Graphviz call graph | File or clipboard |
| Dump PTN (.ptn) | Pseudocode | Standalone provenance facts | File or clipboard |
| Dump assembly (.asm) | Pseudocode | Annotated disassembly + PTN hints | File or clipboard |
| Export metadata… | Pseudocode | .cdumpmeta (names, prototypes, comments, lvars, types, fingerprints) | File |
| Apply metadata… | Pseudocode | Writes saved metadata into the current database | In place |
| Dump type… / recursively… | Local Types | Type declaration(s) | .h file |
| Copy type / recursively… | Local Types | Type declaration(s) | Clipboard |
| Dump / Copy type graph (.dot) | Local Types | Graphviz dependency graph | File or clipboard |
Every output that can go to the clipboard falls back to a select-all text dialog when no clipboard backend is available.
| Field | Default | Meaning |
|---|---|---|
| Caller Depth | 2 | How many layers of callers to walk up. |
| Callee Depth | 2 | How many layers of callees/references to walk down. Also bounds PTN chain forwarding. |
| Max Characters | 0 | Output size cap; 0 = unlimited. Over the cap, smallest non-root functions are dropped first. |
| Output File | next to the input/IDB | Destination path. A directory is accepted; the default filename is appended. |
| Xref Types | all on | Which of the seven edge kinds the graph walk follows (direct, indirect, data, immediate, tail, virtual, jump table). |
| Omit PTN annotations | off | Skip all provenance output. |
| Include size comments | off | Annotate type members with offset/size and structs with sizeof. |
| Copy to clipboard | off | Skip the file; push the rendered output to the clipboard. |
| Include register summary | off | Add per-function incoming/outgoing register lines. |
| Trim types to referenced fields only | off | Reduce structs/unions to accessed members, padding the rest. |
PTN ("Provenance Tracking Notation") is the compact dataflow notation codedump emits inline in .c dumps, as a standalone .ptn file, and internally as JSON. codedump builds it by decompiling each function, walking its ctree, and normalising every expression to an origin (propagating taint through casts, pointer arithmetic, struct member access, array indexing, and shifts), then stitching the results across call boundaries with a breadth-first chain walk.
| Token | Meaning |
|---|---|
| L(name) | local variable |
| P(name) | parameter |
| G(name\|addr) | global |
| C(val) | constant |
| R(func) | a function's return value |
| A(callee,i) | argument slot i of a call to callee |
| F(func) | a function as a whole |
| @[off:len] / .field | a slice or struct member of the node |
| & / * | address-of / dereference applied to the node |
| :(type) | a cast that was applied |
| {conf=…,cs=…,caller=…} | metadata: confidence, call-site address, calling function id |
| Edge | Reads as |
|---|---|
| A: dst := src | an assignment/alias inside a function |
| I: origin -> P(name) | inbound: a caller fed origin into this parameter |
| E: origin -> A(callee,i) | outbound: this function passed origin as argument i of callee |
| R: callee() -> L(var) | a return value flowed into a local |
| G: F(func) -> G(addr) / G: G(addr) -> F(func) | a global write / read |
Three lines from a real dump:
A:L(v3):=P(buf)@[0x8:?] {conf=med}
E:L(v3) -> A(memcpy,2) {conf=med,cs=0x401055}
I:L(len){conf=high,cs=0x401210,caller=F5} -> P(len)
Read top to bottom:
- A: - inside this function, local v3 was assigned from parameter buf at byte offset 0x8 (the @[0x8:?] slice; length unknown).
- E: - v3 is then passed as argument index 2 (the third argument) to memcpy, at call site 0x401055. Chained with the line above, the third argument to memcpy ultimately comes from buf + 8.
- I: - a caller (function F5) supplied its own local len as this function's len parameter, at call site 0x401210, with high confidence.
The function id F5 resolves through the D: dictionary lines at the top of the dump; cs= is the exact call instruction; conf= tells you how much to trust the inference.
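PTN edge lines are regular enough to pick apart with a few string operations. The sketch below handles the edge shapes shown above (A, I, E, R, G) and an optional `// @PTN ` prefix; `parse_ptn` is a hypothetical helper written against these three example lines, not a complete grammar, and it does not interpret `D:` dictionary lines or the legend.

```python
import re

META = re.compile(r"\s*\{([^}]*)\}")

def parse_ptn(line):
    """Split one PTN edge line into kind, left side, right side, and metadata."""
    line = line.strip()
    if line.startswith("// @PTN "):
        line = line[len("// @PTN "):]
    kind, rest = line.split(":", 1)
    meta = {}
    for group in META.findall(rest):
        for pair in group.split(","):
            key, _, value = pair.partition("=")
            meta[key.strip()] = value.strip()
    rest = META.sub("", rest)                 # drop the {...} blocks
    sep = ":=" if kind == "A" else "->"       # A: dst := src, everything else uses ->
    left, _, right = rest.partition(sep)
    return {"kind": kind, "left": left.strip(), "right": right.strip(), "meta": meta}

alias = parse_ptn("A:L(v3):=P(buf)@[0x8:?] {conf=med}")
outbound = parse_ptn("E:L(v3) -> A(memcpy,2) {conf=med,cs=0x401055}")
inbound = parse_ptn("I:L(len){conf=high,cs=0x401210,caller=F5} -> P(len)")
```

For an `A:` line the left side is the destination and the right side the source; for the arrow forms the left side is the origin. Joining `alias["left"]` to `outbound["left"]` reproduces the chain from `buf` to the `memcpy` argument described above.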
When Include register summary is on, the // incoming: / // outgoing: lines come from a proper iterative liveness analysis over the function's flow chart, not a guess:
- incoming = registers read along some path before being written, i.e. the registers a caller must set up (the function's real inputs).
- outgoing = every register the function writes anywhere.
x86 sub-registers are normalised to their 64-bit parents (eax/ax/al → rax, r8d → r8, xmm/ymm/zmm collapse to xmm), and always-live/always-clobbered registers (rsp, rip, flags) are filtered out as noise.
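The incoming/outgoing definitions above correspond to a textbook backward liveness analysis iterated to a fixed point. The following sketch shows that technique on a tiny hand-written flow graph; the block layout, the abbreviated register map, and the `summarise` helper are assumptions for illustration, and it treats a sub-register write as a write of the whole parent register.

```python
# Abbreviated sub-register map; a real table would cover every register.
PARENT = {"eax": "rax", "ax": "rax", "al": "rax", "esi": "rsi", "edi": "rdi", "r8d": "r8"}
NOISE = {"rsp", "rip", "flags"}

def norm(reg):
    """Map a register name to its 64-bit parent (vector registers to xmm)."""
    if reg.startswith(("ymm", "zmm")):
        return "xmm" + reg[3:]
    return PARENT.get(reg, reg)

def summarise(blocks, entry):
    """blocks: {name: {"insns": [(reads, writes), ...], "succs": [names]}}.

    Returns (incoming, outgoing) as sorted register lists.
    """
    use, define = {}, {}
    for name, block in blocks.items():
        used, written = set(), set()
        for reads, writes in block["insns"]:
            used |= {norm(r) for r in reads} - written   # read before any write here
            written |= {norm(w) for w in writes}
        use[name], define[name] = used, written
    live_in = {name: set() for name in blocks}
    changed = True
    while changed:                                        # iterate to a fixed point
        changed = False
        for name, block in blocks.items():
            live_out = set().union(*(live_in[s] for s in block["succs"]))
            new_in = use[name] | (live_out - define[name])
            if new_in != live_in[name]:
                live_in[name], changed = new_in, True
    outgoing = set().union(*define.values())
    return sorted(live_in[entry] - NOISE), sorted(outgoing - NOISE)

blocks = {
    "entry": {"insns": [(["rdi"], ["eax"]), (["esi", "eax"], ["flags"])], "succs": ["body", "exit"]},
    "body": {"insns": [(["rdx"], ["rcx"]), (["rcx", "rax"], ["rax"])], "succs": ["exit"]},
    "exit": {"insns": [(["rax", "rsp"], ["rsp", "rip"])], "succs": []},
}
incoming, outgoing = summarise(blocks, "entry")
```

In this example `rdx` counts as incoming even though it is only read in a later block, because no path writes it first, while `rax` does not, because the entry block writes it before any read.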
A line-based text protocol with C-escaped quoted strings. One directive per line; functions are bracketed by func/end_func; types are one line each.
#cdump-meta v1
image_base 0x140000000
root_ea 0x140012340
arch "x86_64"
created "2026-05-31T12:00:00Z"
type 1 "MyStruct" "struct MyStruct {\n int x;\n};"
func 1
name "handle_req"
ea 0x140012340
rva 0x12340
fp_size 234
fp_insns 47
fp_bbs 5
fp_hash 0xABCDEF
proto "__int64 __fastcall handle_req(int)"
func_cmt "..."
func_cmt_rpt "..."
cmt 0x12 0 "..."
lvar reg 12 0x14 "argp" "char *"
lvar stk 16 0x20 "n" "int"
callee 0x18 2
end_func
cmt <rva> <repeatable> "text", lvar <reg|stk|none> <loc> <defea_rva> "name" "type", and callee <call_rva> <callee_id> are all relative to the function start so they survive an address shift between builds.
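Since the format is one directive per line with C-escaped quoted strings, a reader needs little more than a tokenizer. The sketch below parses the directives shown in the sample above and keeps only a few fields; `tokens` and `parse_meta` are hypothetical helpers, only a handful of escapes are decoded, and the field positions follow the directive shapes given in the preceding paragraph.

```python
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}

def tokens(line):
    """Split a directive line into bare words and unescaped quoted strings."""
    out, i = [], 0
    while i < len(line):
        if line[i].isspace():
            i += 1
        elif line[i] == '"':
            i += 1
            chars = []
            while i < len(line) and line[i] != '"':
                if line[i] == "\\" and i + 1 < len(line):
                    i += 1
                    chars.append(ESCAPES.get(line[i], line[i]))
                else:
                    chars.append(line[i])
                i += 1
            if i >= len(line):
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
            out.append("".join(chars))
        else:
            j = i
            while j < len(line) and not line[j].isspace():
                j += 1
            out.append(line[i:j])
            i = j
    return out

def parse_meta(text):
    """Return one dict per func/end_func block with its name, rva, and local names."""
    funcs, current = [], None
    for raw in text.splitlines():
        tok = tokens(raw)
        if not tok or tok[0].startswith("#"):
            continue
        if tok[0] == "func":
            current = {"id": int(tok[1]), "lvars": []}
        elif tok[0] == "end_func":
            funcs.append(current)
            current = None
        elif current is not None and tok[0] == "name":
            current["name"] = tok[1]
        elif current is not None and tok[0] == "rva":
            current["rva"] = int(tok[1], 16)
        elif current is not None and tok[0] == "lvar":
            current["lvars"].append(tok[4])  # lvar <kind> <loc> <defea_rva> "name" "type"
    return funcs

sample = r'''#cdump-meta v1
type 1 "MyStruct" "struct MyStruct {\n int x;\n};"
func 1
name "handle_req"
rva 0x12340
lvar reg 12 0x14 "argp" "char *"
lvar stk 16 0x20 "n" "int"
end_func'''
funcs = parse_meta(sample)
```

An unterminated quote raises `ValueError`, which mirrors the note elsewhere in this document that the format is line- and quote-sensitive.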
| Symptom | Cause | What to do |
|---|---|---|
| Plugin won't load; log says Hex-Rays decompiler not available | No decompiler for this binary's architecture | codedump requires Hex-Rays; there's nothing it can do without it. |
| A Dump action is greyed out | Wrong view | Code and metadata actions enable only in the pseudocode view; type actions only in Local Types, on a named type. |
| No function at cursor | The cursor isn't inside a recognised function | Put the cursor inside a function (define one with P if IDA hasn't). |
| Some bodies show // (decompilation failed) | Hex-Rays couldn't decompile that function | Expected for a few functions; the rest still dump. Pass 4's progress shows the ok/fail split. |
| The Referenced types section is missing | The walked functions reference no user-defined types | Nothing to emit; not an error. |
| Clipboard unavailable, a text dialog pops up | No clipboard backend in this environment (e.g. headless) | Select-all + copy from the dialog, or untick Copy to clipboard to write a file instead. |
| Output is huge, or the run drags | Depths too high on a densely-connected binary | Lower the depths, set Max Characters, narrow Xref Types, or cancel the wait box. |
| Apply metadata: lots of mismatched functions | The target genuinely differs, or you applied onto the wrong root | Confirm the cursor is on the function matching the dump's root; leave Continue past fingerprint mismatches on to apply whatever does match. |
| Apply metadata: failed to parse … at line N | The .cdumpmeta was hand-edited or truncated | Re-export it; the format is line- and quote-sensitive. |
codedump is a hybrid: all of the plugin scaffolding, user interaction, analysis, and output run through idax (a modern C++23 wrapper over the IDA SDK), with raw SDK types confined to the plugin ABI boundary. There is no loose ea_t/BADADDR/raw-tinfo_t vocabulary in the analysis code: the decompiler, type rendering, forms, wait box, clipboard, names, comments, instruction decoding, flow graphs, and xref traversal are all idax calls. (The parity audit lives in docs/IDAX_GAPS.md.)
CodeDumpPlugin (an ida::plugin::Plugin subclass) takes a scoped Hex-Rays session at init, registers the twelve actions, and subscribes to two popup events with RAII guards: the Hex-Rays populating popup event attaches the Dump code/ actions in the pseudocode view, and the UI popup ready event attaches the Dump type/ actions in the Local Types view. Everything is torn down cleanly on term(). Each action is gated by context: code actions only enable in pseudocode, type actions only when there's a named type under the cursor.
The repository is ida-codedump (git@github.com:19h/ida-codedump.git), the built binary and the source namespace are codedump, and the plugin shows up in IDA as Code Dumper. Saved metadata files carry the .cdumpmeta extension. They're all the same project.
MIT
