The HuggingFace inference operator (#5041) is being landed as a sequence of focused task-family PRs. The dispatcher + per-task codegen architecture was introduced in #5277 with text-generation as the first task family.
This issue covers adding the audio and media-generation task families to that architecture. The new tasks plug into the existing dispatcher by adding dedicated TaskCodegen implementations for audio and media generation, then registering their task strings in HuggingFaceInferenceOpDesc.
Concretely, landing this would enable:
Audio inference tasks:
automatic-speech-recognition
audio-classification
text-to-speech
Media-generation tasks:
text-to-image
text-to-video
A cleaner codegen structure where audio and media-generation Python payload / parse logic lives in separate files instead of expanding the operator descriptor.
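The plug-in pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual Texera code: the shapes of `TaskCodegen` and `CodegenContext`, the `tasks` and `describe` members, and the `DispatcherSketch` object are all hypothetical stand-ins, and `describe` only returns a label where the real implementations would emit Python.

```scala
// Hypothetical stand-ins: the real TaskCodegen / CodegenContext in Texera may look different.
final case class CodegenContext(task: String, modelId: String)

trait TaskCodegen {
  def tasks: Set[String]                    // task strings this family handles
  def describe(ctx: CodegenContext): String // placeholder for the real Python codegen
}

object AudioTaskCodegen extends TaskCodegen {
  val tasks: Set[String] =
    Set("automatic-speech-recognition", "audio-classification", "text-to-speech")
  def describe(ctx: CodegenContext): String = s"audio:${ctx.task}:${ctx.modelId}"
}

object MediaGenCodegen extends TaskCodegen {
  val tasks: Set[String] = Set("text-to-image", "text-to-video")
  def describe(ctx: CodegenContext): String = s"media:${ctx.task}:${ctx.modelId}"
}

object DispatcherSketch {
  private val families: Seq[TaskCodegen] = Seq(AudioTaskCodegen, MediaGenCodegen)

  // Task string -> owning codegen, built once from each family's declared tasks.
  val registry: Map[String, TaskCodegen] =
    families.flatMap(f => f.tasks.map(_ -> f)).toMap

  // Unknown task strings yield None instead of throwing.
  def dispatch(ctx: CodegenContext): Option[String] =
    registry.get(ctx.task).map(_.describe(ctx))
}
```

Under this assumed layout, adding a task family means adding one object and one entry in `families`; the descriptor itself does not grow.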
Proposed Solution or Design
Add new files under:
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/codegen/
AudioTaskCodegen.scala
MediaGenCodegen.scala
Modify:
HuggingFaceInferenceOpDesc.scala
TaskCodegen.scala
CodegenContext with audio input fields
PythonCodegenBase.scala
HuggingFaceInferenceOpDescSpec.scala
Design constraints:
Keep task-specific Python payload / parse logic in the TaskCodegen files.
Preserve EncodableString + pyb"..." safety for user-provided string fields.
Keep generatePythonCode total so arbitrary @JsonProperty values do not throw during code generation.
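The last two design constraints can be illustrated with a small sketch. Everything here is an assumption made for illustration: `pyLiteral` is a stand-in that swaps in Base64 encoding and is not the real `EncodableString` / `pyb"..."` mechanism, and the two-argument `generatePythonCode` is a simplified hypothetical signature rather than the operator's actual method.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

object TotalCodegenSketch {
  // Stand-in for EncodableString + pyb"...": emit a Python expression that decodes
  // the value at runtime, so quotes or newlines in user input cannot alter the script.
  def pyLiteral(userValue: String): String = {
    val b64 = Base64.getEncoder.encodeToString(userValue.getBytes(StandardCharsets.UTF_8))
    s"""base64.b64decode("$b64").decode("utf-8")"""
  }

  private val knownTasks: Set[String] = Set(
    "automatic-speech-recognition", "audio-classification", "text-to-speech",
    "text-to-image", "text-to-video"
  )

  // Total: every (task, prompt) pair, including nulls, yields Python source.
  // Unsupported tasks become a Python-side error rather than a Scala exception.
  def generatePythonCode(task: String, prompt: String): String = {
    val safeTask = Option(task).getOrElse("")
    val safePrompt = Option(prompt).getOrElse("")
    val header = "import base64\n"
    if (knownTasks.contains(safeTask))
      header + "task = " + pyLiteral(safeTask) + "\nprompt = " + pyLiteral(safePrompt) + "\n"
    else
      header + "raise ValueError(\"unsupported task: \" + " + pyLiteral(safeTask) + ")\n"
  }
}
```

In this sketch a bad task string surfaces when the generated Python runs, not while the Scala side is generating code, which is one way to read the "total" requirement.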
Impact / Priority
(P2) Medium — required for broader HuggingFace operator task coverage. Does not affect existing operators.
Affected Area
Workflow Engine (Amber) — HuggingFace operator descriptor and Python codegen.
Task Type
Testing / QA
Other