Skip to content

fix: pass function call result to execute_op query_id during rollback#87

Open
thehecktour wants to merge 1 commit into
PostHog:mainfrom
thehecktour:fix/pass-uuid4-correct
Open

fix: pass function call result to execute_op query_id during rollback#87
thehecktour wants to merge 1 commit into
PostHog:mainfrom
thehecktour:fix/pass-uuid4-correct

Conversation

@thehecktour

Copy link
Copy Markdown

What this fixes

In attempt_migration_rollback, every rollback operation was being tagged with the same static string as its query_id — the string representation of the uuid4 function object itself, not a generated UUID.

# Before — str() of the function reference, not a call
execute_op(op, query_id=str(uuid4))
# → query_id = "<built-in method uuid4 of type object at 0x10a3b2c40>"

# After — str() of the UUID generated by calling the function
execute_op(op, query_id=str(uuid4()))
# → query_id = "550e8400-e29b-41d4-a716-446655440000"

In Python, functions are first-class objects. Writing uuid4 without () is a valid expression — it evaluates to the function itself. str() on a function produces its memory address representation, which is constant for the lifetime of the process.


Why this matters

execute_op passes query_id as log_comment in the ClickHouse query settings:

# housewatch/async_migrations/async_migration_utils.py
def execute_op(sql: str, args=None, *, query_id: str, timeout_seconds: int = 600, settings=None):
    settings = settings if settings else {"max_execution_time": timeout_seconds, "log_comment": query_id}
    run_query(sql, args, settings=settings, use_cache=False)

log_comment is how ClickHouse tags a query in system.query_log and how you identify it for KILL QUERY. Without unique IDs per operation, the consequences during a rollback are:

  • Every rollback op shares the same identifier in system.query_log — impossible to distinguish which step ran, failed, or how long each took
  • Individual ops cannot be killed via KILL QUERY WHERE log_comment = '...' — they all have the same tag
  • Debugging a stuck rollback becomes guesswork, since the log entries are indistinguishable

Proof it's a typo, not intentional

The forward migration path in the same file already does this correctly:

# runner.py — forward operation (line 87) ✅
current_query_id = str(uuid4())
execute_op(op, query_id=current_query_id)

# runner.py — rollback operation (line 160) ❌ before this fix
execute_op(op, query_id=str(uuid4))

Same file — one path calls the function, the other doesn't. The inconsistency is the tell.

The bug was likely introduced when execute_op was refactored from its original PostHog signature (which passed a UUID string externally) to the current one (which generates it inline). The rollback call was updated to pass the value inline, but the () was dropped in the process, as evidenced by the commented-out original:

# async_migration_utils.py (commented-out original)
# def execute_op(op: AsyncMigrationOperation, uuid: str, rollback: bool = False):
#     op.rollback_fn(uuid) if rollback else op.fn(uuid)

Change

File: housewatch/async_migrations/runner.py

- execute_op(op, query_id=str(uuid4))
+ execute_op(op, query_id=str(uuid4()))

One character. Every rollback operation now gets a proper unique UUID as its query_id, consistent with how forward migration operations are already handled. 🔧

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant