
refactor(config): migrate from .env to YAML with OmegaConf #350

Draft
looksaw2 wants to merge 3 commits into apache:main from looksaw2:yaml-config-migration


@looksaw2


Migrated config storage from .env to config.yaml using OmegaConf.

  • YAML replaces .env — all settings live in one config.yaml under llm / hugegraph
    / admin / index sections with nested structure.
  • Hot reload — background file watcher picks up changes to config.yaml without
    restarting. Polling interval is configurable.
  • Auto migration — existing .env is automatically converted to config.yaml on
    first run, no manual setup needed.
  • Env vars take priority — OPENAI_API_KEY and similar secrets still come from
    environment variables, overriding YAML values (12-factor friendly).
  • No breaking changes — llm_settings.language and all other attribute access works
    exactly the same, 46 consumer files untouched.
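The precedence described above (environment variables over config.yaml over defaults) can be sketched as follows. This is a minimal illustration using plain dictionaries rather than OmegaConf, so it is not the PR's implementation; `deep_merge`, `resolve_settings`, and the dotted `env_map` keys are hypothetical names chosen for the example.

```python
import copy
from typing import Any, Dict, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Recursively overlay `override` on `base` without mutating either input."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_settings(
    defaults: Mapping[str, Any],
    yaml_values: Mapping[str, Any],
    env_map: Mapping[str, str],
    environ: Mapping[str, str],
) -> Dict[str, Any]:
    """Apply defaults, then YAML values, then non-empty environment variables."""
    merged = copy.deepcopy(deep_merge(defaults, yaml_values))
    for dotted_key, env_name in env_map.items():
        value = environ.get(env_name)
        if value:  # an unset or empty variable does not override YAML
            section, key = dotted_key.split(".", 1)
            merged.setdefault(section, {})[key] = value
    return merged


# Illustrative values only; the section and field names follow the PR description.
settings = resolve_settings(
    defaults={"llm": {"language": "EN", "openai_chat_api_key": ""}},
    yaml_values={"llm": {"language": "CN", "openai_chat_api_key": "from_yaml"}},
    env_map={"llm.openai_chat_api_key": "OPENAI_API_KEY"},
    environ={"OPENAI_API_KEY": "from_env"},
)
```

Here `settings["llm"]["language"]` comes from the YAML layer, while the API key comes from the environment, which is the ordering the description calls 12-factor friendly.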

looksaw2 and others added 3 commits May 30, 2026 22:57
Replace pydantic-settings + python-dotenv with OmegaConf-based
config.yaml. Supports nested YAML structure, environment variable
override (os.environ > config.yaml > defaults), auto-migration
from existing .env, and hot-reload via file watcher.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Update design.md to reflect the nested YAML structure (via flat↔nested
mapping), update tasks.md to mark all 10 task groups completed, and
document implementation differences from the original plan.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Translate requirements.md, design.md, and tasks.md from Chinese
to English to meet project contribution guidelines.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

@dosubot dosubot Bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.) and enhancement (New feature or request) labels May 30, 2026
@github-actions github-actions Bot added the llm label May 30, 2026

@imbajin imbajin left a comment

Member

Auto reviewed the config migration and found one blocking regression in environment-variable precedence.

# Load .env into os.environ for backward compatibility and priority override
if os.path.exists(self._env_path):
    for k, v in dotenv_values(self._env_path).items():
        os.environ[k] = v

Member

‼️ Preserve real environment-variable precedence

This loop copies every key from .env into os.environ before YAML/env override resolution. That reverses the intended deployment precedence when the process already has real environment variables: I reproduced this with .env containing OPENAI_API_KEY=from_dotenv while launching with OPENAI_API_KEY=from_real_env, and llm_settings.openai_chat_api_key became from_dotenv. Existing pydantic settings gave the real environment precedence over the dotenv file, so a stale local .env can override Docker/Kubernetes secrets after this migration. Please only fill missing keys from .env (for example os.environ.setdefault(...), skipping empty values) and add a regression test for env > .env/config.yaml precedence.

Suggested change
-        os.environ[k] = v
+    for k, v in dotenv_values(self._env_path).items():
+        if v:
+            os.environ.setdefault(k, v)
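The requested regression test could look roughly like the sketch below. It assumes the `.env` contents are already parsed into a mapping (as `dotenv_values` returns), and uses a plain dict in place of `os.environ` so the check has no side effects. `fill_missing_from_dotenv` and the `LANGUAGE` and `EMPTY` keys are hypothetical.

```python
import os
from typing import Mapping, MutableMapping, Optional


def fill_missing_from_dotenv(
    dotenv: Mapping[str, Optional[str]],
    environ: MutableMapping[str, str] = os.environ,
) -> None:
    """Copy .env values into `environ` only where no real variable is set."""
    for key, value in dotenv.items():
        if value:  # skip None and empty strings
            environ.setdefault(key, value)


# Mirrors the reproduction above: a real variable and a stale .env entry collide.
fake_environ = {"OPENAI_API_KEY": "from_real_env"}
fill_missing_from_dotenv(
    {"OPENAI_API_KEY": "from_dotenv", "LANGUAGE": "EN", "EMPTY": ""},
    fake_environ,
)
```

After the call, the real `OPENAI_API_KEY` is unchanged, the missing `LANGUAGE` key is filled from the `.env` mapping, and the empty value is not exported.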

@VGalaxies VGalaxies left a comment

Contributor

Review summary

  • Blocking: yes
  • Summary: The YAML migration introduces config persistence and compatibility regressions that can break existing deployments.
  • Evidence:
    • static review of git diff origin/main...HEAD
    • git diff --check origin/main...HEAD clean
    • no changed files under hugegraph-llm/src/tests/

current,
)
setattr(self, key, value)
cfg_mgr.update_section(self._config_section, self)

Contributor

High: Hot reload rewrites the watched file repeatedly

hugegraph-llm/src/hugegraph_llm/config/models/base_config.py:340

Evidence

  • The watcher records _last_mtime before reload() at lines 260-264, then reload() calls check_config(), which always saves config.yaml again at lines 340-341.

Impact

  • After startup or any external edit, the internal save changes mtime again, so the watcher can reload and rewrite the file every interval indefinitely, causing repeated disk writes and log noise.

Requested fix

  • Do not save from the hot-reload sync path, or update _last_mtime after internal saves and suppress self-triggered reloads.
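One way to read the second option (record the mtime after internal saves) is sketched below. `ConfigWatcher`, `save`, and `poll_once` are hypothetical names and not the PR's classes; only `_last_mtime` echoes the attribute named in the evidence. The reload itself is replaced by a counter.

```python
import os
import tempfile


class ConfigWatcher:
    """Hypothetical poller that ignores mtime changes caused by its own saves."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._last_mtime = os.stat(path).st_mtime_ns
        self.reload_count = 0

    def save(self, text: str) -> None:
        """Internal write: record the new mtime so the next poll skips it."""
        with open(self.path, "w", encoding="utf-8") as fh:
            fh.write(text)
        self._last_mtime = os.stat(self.path).st_mtime_ns

    def poll_once(self) -> bool:
        """Reload only when the file differs from the last recorded mtime."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self._last_mtime:
            return False
        self._last_mtime = mtime
        self.reload_count += 1  # stand-in for the real reload, which must not save
        return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.yaml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("llm: {}\n")
    watcher = ConfigWatcher(path)

    # Simulate an external edit by moving the file's mtime forward one second.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    external_edit_seen = watcher.poll_once()

    # An internal save must not trigger another reload on the next poll.
    watcher.save("llm: {language: EN}\n")
    self_triggered = watcher.poll_once()
```

The sketch polls synchronously to stay deterministic; a background thread would call `poll_once` on each interval and would need a lock around `save` and `poll_once`.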

}

_env_var_map: ClassVar[dict] = {
    "openai_chat_api_key": "OPENAI_API_KEY",

Contributor

Medium: Legacy provider-specific OpenAI environment variables are ignored

hugegraph-llm/src/hugegraph_llm/config/llm_config.py:75

Evidence

  • _env_var_map maps openai_chat_api_key, openai_extract_api_key, and openai_text2gql_api_key only to OPENAI_API_KEY; the override loop checks only that mapped name instead of also checking the field’s own uppercase name.

Impact

  • Existing deployments using OPENAI_CHAT_API_KEY, OPENAI_EXTRACT_API_KEY, OPENAI_TEXT2GQL_API_KEY, or matching per-provider base URL variables stop overriding config once config.yaml exists.

Requested fix

  • Preserve field-specific env names and use generic OPENAI_API_KEY / OPENAI_BASE_URL only as fallback, with tests covering both paths.
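A minimal sketch of the requested lookup order: the field's own uppercase name is checked first, and the generic name is used only as a fallback. `GENERIC_FALLBACK` and `resolve_env_override` are hypothetical; the three field names and their `OPENAI_API_KEY` mapping are taken from the evidence above.

```python
from typing import Mapping, Optional

# Generic fallback names, as described for _env_var_map in the evidence.
GENERIC_FALLBACK = {
    "openai_chat_api_key": "OPENAI_API_KEY",
    "openai_extract_api_key": "OPENAI_API_KEY",
    "openai_text2gql_api_key": "OPENAI_API_KEY",
}


def resolve_env_override(field: str, environ: Mapping[str, str]) -> Optional[str]:
    """Prefer the field-specific variable, then the generic fallback."""
    candidates = [field.upper()]
    generic = GENERIC_FALLBACK.get(field)
    if generic:
        candidates.append(generic)
    for name in candidates:
        value = environ.get(name)
        if value:  # treat unset and empty variables the same way
            return value
    return None


env = {"OPENAI_API_KEY": "generic", "OPENAI_EXTRACT_API_KEY": "specific"}
extract_key = resolve_env_override("openai_extract_api_key", env)
chat_key = resolve_env_override("openai_chat_api_key", env)
```

With this environment, the extract field picks up its own variable while the chat field falls back to the generic key, which covers both paths the review asks to test.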


dir_name = os.path.dirname
env_path = os.path.join(os.getcwd(), ".env") # Load .env from the current working directory
YAML_PATH = os.path.join(os.getcwd(), "config.yaml")

Contributor

Medium: Config files are resolved from the process CWD

hugegraph-llm/src/hugegraph_llm/config/models/base_config.py:32

Evidence

  • YAML_PATH and ENV_PATH are hardcoded to os.getcwd(), while the source setup in hugegraph-llm/README.md runs from the repository root and existing docs/Docker examples refer to hugegraph-llm/.env.

Impact

  • Launching from the repository root skips migration of an existing hugegraph-llm/.env and creates/uses a different root-level config.yaml, so users can silently run with defaults instead of their configured credentials and graph settings.

Requested fix

  • Resolve the default config path from the hugegraph-llm project root, or add an explicit config path override and update all launch/migration paths consistently.
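Both options in the requested fix can be combined, as in the sketch below: an explicit path wins, then an override variable, then a default under the project root. The variable name `HUGEGRAPH_LLM_CONFIG` and the function `resolve_config_path` are hypothetical, and how the project root is located (for example from the module's own file path) depends on the package layout, which is not shown here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CONFIG_ENV_VAR = "HUGEGRAPH_LLM_CONFIG"  # hypothetical override variable


def resolve_config_path(
    project_root: Path,
    explicit: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Path:
    """Explicit argument > override variable > <project_root>/config.yaml."""
    chosen = explicit or environ.get(CONFIG_ENV_VAR)
    if chosen:
        return Path(chosen).expanduser().resolve()
    # The default never consults os.getcwd(), so the launch directory is irrelevant.
    return (project_root / "config.yaml").resolve()


root = Path("/srv/hugegraph-llm")  # illustrative location only
default_path = resolve_config_path(root, environ={})
override_path = resolve_config_path(root, environ={CONFIG_ENV_VAR: "/etc/hg/config.yaml"})
```

The same resolver would need to be used by the `.env` migration path so that both files are looked up in the same directory.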

@looksaw2 looksaw2 marked this pull request as draft June 8, 2026 08:22

3 participants