feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion #5260

Open
zyratlo wants to merge 2 commits into apache:main from zyratlo:migration-tool-llm-client

zyratlo (Contributor) commented May 28, 2026

What changes were proposed in this PR?

Introduces the frontend LLM session class that converts a Jupyter notebook into a Texera workflow JSON plus a bidirectional cell-to-operator mapping, along with the prompt library it uses. The change adds two files under frontend/src/app/workspace/service/notebook-migration/, totalling ~700 lines (~410 of which are prompt text).

migration-llm.ts — defines NotebookMigrationLLM, an @Injectable class wrapping a Vercel AI SDK chat session against the LiteLLM proxy already exposed on main at /api/chat/completion.

  • initialize(modelType, apiKey) — builds an OpenAI-compatible chat client via createOpenAI({ baseURL: AppSettings.getApiEndpoint() }), seeds the message history with Texera documentation as system messages.
  • verifyConnection() — does a 10-token ping call to validate that the API key works against the configured model.
  • convertNotebookToWorkflow(notebook) — extracts code cells (each tagged with a UUID in metadata.uuid), sends WORKFLOW_PROMPT + the notebook to get a JSON of UDF operators / edges, then sends MAPPING_PROMPT to get the cell↔operator mapping. Assembles a complete Texera workflow JSON (PythonUDFV2 operators with stub input/output ports, links derived from the LLM's edge list, default settings) plus a bidirectional operator_to_cell / cell_to_operator mapping. Returns both as a JSON string.
  • close() — clears the message history and the model reference.
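
The cell extraction and bidirectional mapping described for convertNotebookToWorkflow can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the interface names, extractCodeCells, and invertMapping are hypothetical, and the idea that each operator maps to a list of cell UUIDs is an assumption. Only the metadata.uuid tag and the operator_to_cell / cell_to_operator names come from the description above.

```typescript
// Hypothetical shapes, loosely following the Jupyter notebook JSON layout.
interface NotebookCell {
  cell_type: string;
  source: string | string[]; // notebooks may store source as one string or a list of lines
  metadata: { uuid?: string };
}

interface Notebook {
  cells: NotebookCell[];
}

interface CodeCell {
  uuid: string;
  source: string;
}

// Keep only code cells that carry a UUID in metadata.uuid.
function extractCodeCells(notebook: Notebook): CodeCell[] {
  const result: CodeCell[] = [];
  for (const cell of notebook.cells) {
    const uuid = cell.metadata.uuid;
    if (cell.cell_type !== "code" || uuid === undefined) {
      continue;
    }
    const source = Array.isArray(cell.source) ? cell.source.join("") : cell.source;
    result.push({ uuid, source });
  }
  return result;
}

// Assumed mapping shape: id -> list of ids on the other side.
type Mapping = Record<string, string[]>;

// Derive cell_to_operator from operator_to_cell so the two directions stay consistent.
function invertMapping(operatorToCell: Mapping): Mapping {
  const cellToOperator: Mapping = {};
  for (const operatorId of Object.keys(operatorToCell)) {
    for (const cellId of operatorToCell[operatorId] ?? []) {
      const existing = cellToOperator[cellId] ?? [];
      existing.push(operatorId);
      cellToOperator[cellId] = existing;
    }
  }
  return cellToOperator;
}
```

Deriving one direction from the other, rather than asking the LLM for both, is one way to guarantee the two maps never disagree; whether the PR does this is not stated.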

migration-prompts.ts — string constants used by migration-llm.ts: TEXERA_OVERVIEW, TUPLE_DOCUMENTATION, TABLE_DOCUMENTATION, OPERATOR_DOCUMENTATION, UDF_INPUT_PORT_DOCUMENTATION, EXAMPLE_OF_GOOD_CONVERSION, VISUALIZER_DOCUMENTATION, EXAMPLE_OF_MULTIPLE_UDF_CONVERSION, WORKFLOW_PROMPT, MAPPING_PROMPT.
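
As a rough illustration of how these constants could feed the message history that initialize seeds, the sketch below turns a list of documentation strings into system messages and appends a prompt plus payload as a user turn. ChatMessage, seedHistory, and withUserTurn are hypothetical names, and the message shape is assumed to follow the common role/content chat format; the actual session code in migration-llm.ts may be structured differently.

```typescript
// Assumed message shape (role/content pairs, as in typical chat-completion APIs).
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Seed a fresh history with one system message per documentation string.
function seedHistory(docs: string[]): ChatMessage[] {
  return docs.map((content): ChatMessage => ({ role: "system", content }));
}

// Return a new history with the prompt and its payload appended as a user turn.
// The input array is left unmodified so earlier turns can be reused.
function withUserTurn(history: ChatMessage[], prompt: string, payload: string): ChatMessage[] {
  const userTurn: ChatMessage = { role: "user", content: `${prompt}\n\n${payload}` };
  return [...history, userTurn];
}
```

Under these assumptions, a call such as seedHistory([TEXERA_OVERVIEW, TUPLE_DOCUMENTATION]) followed by withUserTurn(history, WORKFLOW_PROMPT, notebookJson) would produce the message list for the first conversion request.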

Any related issues, documentation, discussions?

Closes #5259
Parent issue #4301

How was this PR tested?

No unit tests were included for these reasons:

  • A large portion of the change is prompt text, which can be reviewed by reading but not unit-tested. The prompt text can still be revised later to improve the LLM's output.
  • The remaining logic in migration-llm.ts mostly parses LLM responses, so testing it would require mocking a significant amount of logic that will be introduced in later PRs.

However, I am open to writing tests based on review feedback.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

github-actions bot added the frontend label (Changes related to the frontend GUI) May 28, 2026

codecov-commenter commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.95%. Comparing base (d8c254c) to head (78b9ef3).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5260      +/-   ##
============================================
- Coverage     48.95%   48.95%   -0.01%     
  Complexity     2377     2377              
============================================
  Files          1048     1048              
  Lines         40270    40270              
  Branches       4272     4272              
============================================
- Hits          19714    19713       -1     
  Misses        19402    19402              
- Partials       1154     1155       +1     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| access-control-service | 39.53% <ø> (ø) | Carriedforward from 95ceb37 |
| agent-service | 33.76% <ø> (ø) | Carriedforward from 95ceb37 |
| amber | 51.57% <ø> (ø) | Carriedforward from 95ceb37 |
| computing-unit-managing-service | 0.00% <ø> (ø) | Carriedforward from 95ceb37 |
| config-service | 0.00% <ø> (ø) | Carriedforward from 95ceb37 |
| file-service | 37.99% <ø> (ø) | Carriedforward from 95ceb37 |
| frontend | 40.64% <ø> (ø) | |
| python | 90.79% <ø> (ø) | Carriedforward from 95ceb37 |
| workflow-compiling-service | 56.81% <ø> (ø) | Carriedforward from 95ceb37 |

*This pull request uses carry forward flags.

Yicong-Huang changed the title from "feat(python-notebook-migration, frontend): add LLM client for notebook-to-workflow conversion" to "feat(python-notebook-migration): add LLM client for notebook-to-workflow conversion" May 28, 2026

Labels

frontend Changes related to the frontend GUI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Notebook Migration] Add LLM client for notebook-to-workflow conversion

2 participants