Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation)

### Feature Summary

The HuggingFace inference operator (#5041) needs to cover ~20 HF pipeline tasks (text-generation, image-classification, ASR, text-to-image, …). To land it cleanly and let the per-task work proceed in parallel, the operator is introduced via a dispatcher + per-task codegen architecture: a thin `HuggingFaceInferenceOpDesc` selects a `TaskCodegen` based on the configured task, and the selected codegen contributes the per-task Python payload + parse snippets. Shared infrastructure (provider fallback, HTTP loop, response-parsing framework) lives in `PythonCodegenBase`.

This issue covers shipping the dispatcher pattern + the first task family (text-generation) end-to-end. Subsequent child issues add the image, audio / media-generation, and QA / ranking task families by introducing new `*Codegen` objects and registering them in the dispatcher map. The architecture lets each task-family PR stay focused: a new task family means one new file plus one entry in the dispatcher map — no surgery on the shared infrastructure or other codegens.

Concretely, landing this would enable:

- A working HuggingFace operator on the workspace for text-generation tasks against HF Hub and any OpenAI-compatible third-party provider (Cerebras, Groq, Sambanova, Together, …).
- A clean extension point for the image / audio / QA task families to plug into via subsequent PRs without modifying the operator class or the shared Python infrastructure.

### Proposed Solution or Design

1. New files under `common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/huggingFace/`:
   - `HuggingFaceInferenceOpDesc.scala` — thin (~180-line) dispatcher holding the `@JsonProperty` fields and the `registeredCodegens` map.
   - `codegen/TaskCodegen.scala` — trait + `CodegenContext` case class; default `tasks: Set[String] = Set(task)` for single-task codegens, overridable by multi-task codegens.
   - `codegen/PythonCodegenBase.scala` — shared provider-fallback (HF router + OpenAI-compatible third-party providers), `process_table` loop, `_parse_response` framework, with two holes for the per-task payload + parse snippets.
   - `codegen/TextGenCodegen.scala` — text-generation's chat-completions payload and `body["choices"][0]["message"]["content"]` parse.
2. Register `HuggingFaceInferenceOpDesc` in `LogicalOp.scala`'s `@JsonSubTypes`.
3. Design constraints baked into the codegen:
   - **Safe codegen via `EncodableString` + `pyb"..."`:** user-input string fields are typed as `EncodableString` (`String @EncodableStringAnnotation`); the `pyb` macro emits them as `self.decode_python_template('<base64>')` runtime expressions instead of raw Python literals, so they never appear in the generated source as-is. This is what satisfies `PythonCodeRawInvalidTextSpec`'s leakage check.
   - **Constants in `open(self)`:** per-instance attributes (`self.MODEL_ID`, `self.PROMPT_COLUMN`, …) are assigned in the lifecycle method so `self` is in scope for the decode call.
   - **Codegen totality:** `generatePythonCode` never throws on arbitrary `@JsonProperty` values — unknown task strings fall back to `TextGenCodegen`, and the generated Python's `else` branch produces a generic `{"inputs": prompt_value}` payload, matching the original monolithic operator's behavior. Required by the regression test contract.
   - **Defensive `MODEL_ID` validation at runtime:** generated Python rejects malformed model IDs (path-traversal segments, query strings, fragments, control characters) with a clear `ValueError` before any HF URL is composed.

References:
- Parent issue: #5041
- Stacked on: #5124 (REST resource — issue #5134)
- HF Inference Providers API: https://huggingface.co/docs/inference-providers

### Impact / Priority

(P2) Medium — required for the HuggingFace inference operator (#5041) to function. Does not affect existing functionality.

### Affected Area

Workflow Engine (Amber) — operator descriptor + Python codegen.

### Task Type

- [ ] Refactor / Cleanup
- [ ] DevOps / Deployment / CI
- [ ] Testing / QA
- [ ] Documentation
- [ ] Performance
- [x] Other

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation) #5277

Feature Summary

Proposed Solution or Design

Impact / Priority

Affected Area

Task Type

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Add HuggingFaceInferenceOpDesc with dispatcher + per-task codegen architecture (text-generation) #5277

Description

Feature Summary

Proposed Solution or Design

Impact / Priority

Affected Area

Task Type

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions