fix(runtime): harden Flask validation Docker readiness and CI by donny-devops · Pull Request #35 · donny-devops/docker-flask-postgres-api

donny-devops · 2026-05-17T17:29:13Z

Summary

fix Marshmallow validator compatibility after dependency upgrades
add /ready endpoint with database connectivity check
add structured JSON error handlers for HTTP and unexpected errors
add database wait loop before running migrations in Docker entrypoint
update Compose healthcheck to use readiness, not only liveness
bind Postgres and pgAdmin to localhost by default
move pgAdmin behind an optional admin profile
make CI lint check-only with least-privilege permissions
add route tests for readiness, JSON 404 responses, blank update validation, and duplicate update conflicts
fill README architecture, project structure, and operational notes

Root causes / risks fixed

Marshmallow validator callbacks can receive metadata kwargs in newer versions; the previous validators accepted only value.
/health checked only app liveness, while Docker readiness needed database verification.
The entrypoint relied only on Compose health-based startup ordering; a container started directly could run migrations before Postgres was ready.
CI lint job had contents: write and auto-pushed formatting commits, which is noisy and over-privileged.
Compose exposed database/admin services more broadly than necessary for local development.
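The validator compatibility issue can be illustrated without Marshmallow itself. The sketch below is an assumption about the shape of the fix, not the PR's actual code: `validate_not_blank` and the local `ValidationError` class are hypothetical stand-ins, and the `data_key` keyword is only an example of metadata a caller might pass. The point is that a callback accepting `**kwargs` keeps working whether or not extra keyword arguments arrive.

```python
class ValidationError(ValueError):
    """Stand-in for a validation error type (hypothetical, stdlib only)."""


def validate_not_blank(value: str, **kwargs: object) -> None:
    """Reject empty or whitespace-only strings.

    **kwargs absorbs any extra keyword arguments a caller may pass,
    so the callback works with both the old and the new calling style.
    """
    if not value or not value.strip():
        raise ValidationError("Field must not be blank.")


# Old calling style: value only.
validate_not_blank("widget")

# Newer calling style: value plus metadata keyword arguments (name assumed).
validate_not_blank("widget", data_key="name")
```

A validator declared as `def validate_not_blank(value)` would raise `TypeError` on the second call, which is the failure mode described in the root cause above.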

Validation expected

ruff check .
ruff format --check .
pytest --cov=app --cov-report=xml --cov-fail-under=85 -v
Docker image build via existing CI path

qodo-code-review · 2026-05-17T17:29:17Z

Qodo reviews are paused for this user.

ecc-tools · 2026-05-17T17:29:22Z

Analyzing 200 commits...

ecc-tools · 2026-05-17T17:30:02Z

Analysis Complete

Generated ECC bundle from 8 commits | Confidence: 50%

View Pull Request #36

Repository Profile

Attribute	Value
Language	Python
Framework	Not detected
Commit Convention	conventional
Test Directory	`separate`

Changed Files (8)

Metric	Value
Files changed	8
Additions	250
Deletions	51

Top hotspots

Path	Status	+/-
`README.md`	modified	+72 / -12
`app/__init__.py`	modified	+61 / -3
`entrypoint.sh`	modified	+42 / -4
`.github/workflows/ci.yml`	modified	+13 / -20
`tests/test_routes.py`	modified	+30 / -1

Top directories

Directory	Files	Total changes
`.`	4	160
`app`	2	77
`.github/workflows`	1	33
`tests`	1	31

Analysis Depth Readiness (evidence-backed, 29%)

ECC Tools uses this to decide whether recommendations should stay at commit-history/setup guidance or expand into CI, security, harness, reference-set, AI-routing, and team backlog work.

Area	Status	Evidence / Next Step
Commit history	Ready	`8 commits sampled`
CI/CD signals	Ready	`.github/workflows/ci.yml`
Security evidence	Missing	Add AgentShield, audit, SARIF, SBOM, or security review evidence so recommendations can cover security posture.
Harness configuration	Missing	Add Claude, Codex, OpenCode, Zed, dmux, MCP, plugin, or cross-harness config evidence for harness-agnostic recommendations.
Reference/eval evidence	Missing	Add fixtures, golden traces, reference sets, or evaluator benchmarks so deeper recommendations have regression evidence.
AI routing and cost controls	Missing	Add model-routing, budget, usage, or cost-control files before relying on AI-heavy automation recommendations.
Team handoff and project tracking	Missing	Add roadmap, runbook, project, Linear, or follow-up tracking docs so generated work can land in a team queue.

Reference Set Readiness (0/7, 0%)

Area	Status	Evidence / Next Step
Deep analyzer corpus	Missing	Add analyzer fixture, golden, benchmark, or reference-set files that can catch analyzer regressions.
RAG/evaluator comparison	Missing	Add retrieval or evaluator reference-set comparison fixtures with expected ranking behavior.
PR salvage/review corpus	Missing	Add stale-PR, review-thread, reopen-flow, or salvage reference cases for queue cleanup automation.
Discussion triage corpus	Missing	Add public discussion triage fixtures, golden cases, or reference sets for informational, answered, and no-response classifications.
Harness compatibility	Missing	Add cross-harness, adapter-compliance, or harness-audit evidence for Claude, Codex, OpenCode, Zed, dmux, and agent surfaces.
Security evidence	Missing	Attach security evidence such as SBOMs, SARIF, audit reports, or AgentShield evidence packs.
CI failure-mode evidence	Missing	Add captured CI failure logs, dry-run fixtures, or troubleshooting docs for common workflow failure modes.

Likely Future Issues (2)

Severity	Signal	Why it may show up
MEDIUM	CI workflow changes may ship without failure-mode evidence	1 CI/test-runner paths changed; 0 CI failure-mode evidence artifacts changed
MEDIUM	Dependency or CI drift could surface after merge	CI/workflow files changed; no lockfile changes detected

CI workflow changes may ship without failure-mode evidence: The PR changes CI workflows or test-runner entrypoints without touching CI failure fixtures, captured logs, troubleshooting notes, or regression evidence.
Dependency or CI drift could surface after merge: Package or workflow changes landed without an accompanying lockfile update, which often turns into CI or release noise later.

Suggested Follow-up Work (2)

Type	Suggested title	Targets
PR	`ci: add failure-mode evidence for .github/workflows/ci.yml`	`.github/workflows/ci.yml`
PR	`chore: refresh lockfile and validate CI after dependency updates`	`.github/workflows/ci.yml`

ci: add failure-mode evidence for .github/workflows/ci.yml: Backfill CI failure-mode evidence before another workflow or test-runner change lands on the touched surface.
chore: refresh lockfile and validate CI after dependency updates: Package or workflow changes without a lockfile refresh tend to turn into noisy follow-up fixes after merge.

Copy-ready bodies

ci: add failure-mode evidence for .github/workflows/ci.yml

## Summary
- Add CI failure-mode evidence for the recently changed workflow or test-runner surface.

## Why
- Backfill CI failure-mode evidence before another workflow or test-runner change lands on the touched surface.

## Touched paths
- `.github/workflows/ci.yml`

## Validation
- Add or update a CI failure fixture, captured failing log, troubleshooting note, workflow dry-run evidence, or regression test for the changed CI/test-runner behavior.
- Run the affected workflow or test-runner entrypoint locally or in CI and record pass/fail evidence.

chore: refresh lockfile and validate CI after dependency updates

## Summary
- Refresh the lockfile and rerun CI after the dependency or workflow changes in this PR.

## Why
- Package or workflow changes without a lockfile refresh tend to turn into noisy follow-up fixes after merge.

## Touched paths
- `.github/workflows/ci.yml`

## Validation
- Refresh the lockfile in the same package manager used by the repo.
- Run the repo typecheck / test / CI entrypoints that depend on the updated package graph.

Detected Workflows (2)

Workflow	Description
feature-development-api-endpoint	Implements a new API endpoint, including code, tests, and documentation.
documentation-update	Updates documentation files to reflect new features, environment variables, or architectural changes.

Generated Instincts (22)

Domain	Count
git	5
code-style	9
testing	4
workflow	4

After merging, import with:

/instinct-import .claude/homunculus/instincts/inherited/docker-flask-postgres-api-instincts.yaml

Files

.claude/ecc-tools.json
.claude/skills/docker-flask-postgres-api/SKILL.md
.agents/skills/docker-flask-postgres-api/SKILL.md
.agents/skills/docker-flask-postgres-api/agents/openai.yaml
.claude/identity.json
.codex/config.toml
.codex/AGENTS.md
.codex/agents/explorer.toml
.codex/agents/reviewer.toml
.codex/agents/docs-researcher.toml
.claude/homunculus/instincts/inherited/docker-flask-postgres-api-instincts.yaml
.claude/commands/feature-development-api-endpoint.md
.claude/commands/documentation-update.md

gemini-code-assist

Code Review

This pull request enhances the application's operational readiness by introducing a database-aware readiness check, centralized JSON error handling, and a more robust container entrypoint that waits for database connectivity. The documentation and environment templates were also updated to reflect new configuration options for Gunicorn and database timeouts. Feedback focuses on improving the robustness of the HTTP error handler where exc.code might be None, optimizing the database wait loop in the entrypoint by moving engine creation outside the retry loop, and improving type safety by replacing noqa suppressions with explicit return type annotations.

gemini-code-assist · 2026-05-17T17:31:10Z

+    def handle_http_exception(exc: HTTPException):  # noqa: ANN202
+        """Return consistent JSON for Flask/Werkzeug HTTP errors."""
+        return (
+            jsonify(
+                {
+                    "error": exc.name,
+                    "message": exc.description,
+                    "status_code": exc.code,
+                }
+            ),
+            exc.code,
+        )


The exc.code attribute in HTTPException can be None for certain exception types. If it is None, Flask will fail to generate a response from the returned tuple. Providing a default status code (e.g., 500) ensures the error handler is robust. Additionally, providing the return type annotation allows for the removal of the noqa suppression.

Suggested change

-    def handle_http_exception(exc: HTTPException):  # noqa: ANN202
-        """Return consistent JSON for Flask/Werkzeug HTTP errors."""
-        return (
-            jsonify(
-                {
-                    "error": exc.name,
-                    "message": exc.description,
-                    "status_code": exc.code,
-                }
-            ),
-            exc.code,
-        )
+    @app.errorhandler(HTTPException)
+    def handle_http_exception(exc: HTTPException) -> tuple[Response, int]:
+        """Return consistent JSON for Flask/Werkzeug HTTP errors."""
+        code = exc.code or int(HTTPStatus.INTERNAL_SERVER_ERROR)
+        return (
+            jsonify(
+                {
+                    "error": exc.name,
+                    "message": exc.description,
+                    "status_code": code,
+                }
+            ),
+            code,
+        )
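The status-code fallback in this suggestion can be tried in isolation. The following is a minimal stdlib-only sketch, assuming a hypothetical helper named `error_payload` that is not part of the PR; it only shows how `code or 500` keeps the status from ever being `None`, and it returns a plain dict where the real handler would call `jsonify`.

```python
from http import HTTPStatus
from typing import Optional


def error_payload(name: str, description: str, code: Optional[int]) -> tuple[dict, int]:
    """Build an error body and a status code that is never None."""
    # Fall back to 500 when the exception carries no status code.
    status = code or int(HTTPStatus.INTERNAL_SERVER_ERROR)
    body = {"error": name, "message": description, "status_code": status}
    return body, status


print(error_payload("Not Found", "No such route.", 404))
print(error_payload("Unknown Error", "No code was set.", None))
```

Using `or` also maps a code of `0` to 500, which seems acceptable here because 0 is not a valid HTTP status.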

gemini-code-assist · 2026-05-17T17:31:10Z

+from flask import Flask, jsonify
+from sqlalchemy import text


Import Response from Flask and alias SQLAlchemy's exc module. This allows for proper type hinting and more specific error handling in the routes and error handlers below.

Suggested change

-from flask import Flask, jsonify
-from sqlalchemy import text
+from flask import Flask, Response, jsonify
+from sqlalchemy import exc as sa_exc, text

gemini-code-assist · 2026-05-17T17:31:10Z

        return {"status": "healthy", "service": "docker-flask-postgres-api"}

+    @app.get("/ready")
+    def ready():  # noqa: ANN202


Instead of suppressing the linting error with noqa, provide the explicit return type annotation. This improves code clarity and maintainability.

Suggested change

-    def ready():  # noqa: ANN202
+    def ready() -> Response | tuple[Response, int]:

gemini-code-assist · 2026-05-17T17:31:10Z

+        """Readiness check that verifies database connectivity."""
+        try:
+            db.session.execute(text("SELECT 1"))
+        except Exception as exc:  # noqa: BLE001


Catching a broad Exception is discouraged. It is better to catch sa_exc.SQLAlchemyError to specifically handle database connectivity issues, allowing other unexpected application errors to be caught by the global error handler.

Suggested change

-        except Exception as exc:  # noqa: BLE001
+        except sa_exc.SQLAlchemyError as exc:

gemini-code-assist · 2026-05-17T17:31:10Z

+        )
+
+    @app.errorhandler(Exception)
+    def handle_unexpected_exception(exc: Exception):  # noqa: ANN202


Provide the explicit return type annotation for the error handler to improve maintainability and remove the need for noqa.

Suggested change

-    def handle_unexpected_exception(exc: Exception):  # noqa: ANN202
+    def handle_unexpected_exception(exc: Exception) -> tuple[Response, int]:

gemini-code-assist · 2026-05-17T17:31:10Z

+: "${DATABASE_URL:?DATABASE_URL must be set}"
+: "${GUNICORN_WORKERS:=2}"
+: "${GUNICORN_THREADS:=4}"
+: "${DB_WAIT_TIMEOUT:=60}"


For consistency with other Gunicorn-related environment variables, GUNICORN_TIMEOUT should be initialized with a default value in the script's preamble.

Suggested change

-: "${DATABASE_URL:?DATABASE_URL must be set}"
-: "${GUNICORN_WORKERS:=2}"
-: "${GUNICORN_THREADS:=4}"
-: "${DB_WAIT_TIMEOUT:=60}"
+: "${DATABASE_URL:?DATABASE_URL must be set}"
+: "${GUNICORN_WORKERS:=2}"
+: "${GUNICORN_THREADS:=4}"
+: "${GUNICORN_TIMEOUT:=60}"
+: "${DB_WAIT_TIMEOUT:=60}"

gemini-code-assist · 2026-05-17T17:31:10Z

+while time.monotonic() < deadline:
+    try:
+        engine = create_engine(url, pool_pre_ping=True)
+        with engine.connect() as conn:
+            conn.execute(text("SELECT 1"))
+        print("[entrypoint] Database is reachable.")
+        sys.exit(0)
+    except SQLAlchemyError as exc:
+        last_error = str(exc)
+        time.sleep(2)


Creating a new SQLAlchemy engine inside the loop is inefficient as it sets up a new connection pool on every iteration. The engine should be created once outside the loop. Additionally, pool_pre_ping is unnecessary here because a fresh connection is requested immediately via engine.connect().

Suggested change

-while time.monotonic() < deadline:
-    try:
-        engine = create_engine(url, pool_pre_ping=True)
-        with engine.connect() as conn:
-            conn.execute(text("SELECT 1"))
-        print("[entrypoint] Database is reachable.")
-        sys.exit(0)
-    except SQLAlchemyError as exc:
-        last_error = str(exc)
-        time.sleep(2)
+engine = create_engine(url)
+while time.monotonic() < deadline:
+    try:
+        with engine.connect() as conn:
+            conn.execute(text("SELECT 1"))
+        print("[entrypoint] Database is reachable.")
+        sys.exit(0)
+    except SQLAlchemyError as exc:
+        last_error = str(exc)
+        time.sleep(2)
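The retry pattern discussed here (build the probe once, then poll it against a monotonic deadline) can be sketched with the standard library alone. Everything named below is an assumption for illustration: `wait_until_ready` and `flaky_probe` are hypothetical, and the probe stands in for opening a connection on an engine created once outside the loop.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    probe: Callable[[], None],
    timeout_s: float,
    interval_s: float = 2.0,
) -> Optional[str]:
    """Call probe() until it succeeds or the deadline passes.

    Returns None on success, or the last error message on timeout.
    """
    deadline = time.monotonic() + timeout_s
    last_error = "probe was never attempted"
    while time.monotonic() < deadline:
        try:
            probe()
            return None
        except Exception as exc:  # a real script would catch SQLAlchemyError
            last_error = str(exc)
            time.sleep(interval_s)
    return last_error


# Hypothetical probe that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_probe() -> None:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("database not accepting connections yet")


result = wait_until_ready(flaky_probe, timeout_s=5.0, interval_s=0.01)
print("ready" if result is None else f"timed out: {result}")
```

Returning the last error instead of exiting inside the loop leaves the exit code and log message to the caller, which may make the wait logic easier to test than the inline `sys.exit(0)` variant.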

gemini-code-assist · 2026-05-17T17:31:10Z

-    --timeout 60 \
+    --workers "${GUNICORN_WORKERS}" \
+    --threads "${GUNICORN_THREADS}" \
+    --timeout "${GUNICORN_TIMEOUT:-60}" \


Since GUNICORN_TIMEOUT is now defaulted in the preamble, the inline default here can be simplified.

Suggested change

-    --timeout "${GUNICORN_TIMEOUT:-60}" \
+    --timeout "${GUNICORN_TIMEOUT}" \

amazon-q-developer

Summary

This PR successfully addresses the stated goals of hardening Flask validation, Docker readiness, and CI configuration. The changes include important reliability improvements: Marshmallow validator compatibility fixes, database connectivity verification, entrypoint wait logic, and security-focused port bindings.

Key Improvements

✅ Fixed Marshmallow validator compatibility with newer versions by accepting metadata kwargs
✅ Added /ready endpoint with database connectivity verification
✅ Implemented database wait loop in entrypoint to prevent migration race conditions
✅ Improved CI least-privilege model by removing auto-format push permissions
✅ Enhanced security by binding Postgres and pgAdmin to localhost
✅ Added comprehensive test coverage for new functionality

Critical Feedback

The main concern is the database session handling in the /ready endpoint (see comment on app/__init__.py). Using db.engine.connect() instead of db.session.execute() prevents potential connection pool issues under load and follows best practices for health check endpoints that don't need ORM session features.

The validation tests cover all new features thoroughly, and the CI pipeline properly enforces code quality without over-privileged auto-formatting.

Recommendation

Address the session cleanup issue in the readiness endpoint, then this PR will be ready to merge. The other changes are well-implemented and align with production best practices.

amazon-q-developer · 2026-05-17T17:31:11Z

        condition: service_healthy
    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:5000/health"]
+      test: ["CMD", "curl", "-f", "http://localhost:5000/ready"]


🛑 Logic Error: The healthcheck creates a circular dependency. The API container depends on db being healthy (lines 40-41), but this healthcheck tests /ready, which verifies database connectivity. If the database becomes temporarily unavailable after startup, this healthcheck will fail and mark the API as unhealthy, even though the API service itself is running fine.

Change the healthcheck back to /health for liveness checks, or accept that the API healthcheck accurately reflects database dependency status. If keeping /ready, document that the API healthcheck intentionally fails when the database is unavailable.

amazon-q-developer · 2026-05-17T17:31:11Z

+    def ready():  # noqa: ANN202
+        """Readiness check that verifies database connectivity."""
+        try:
+            db.session.execute(text("SELECT 1"))


🛑 Crash Risk: The database session is not explicitly closed or rolled back after the readiness check query. While Flask-SQLAlchemy typically handles session cleanup, executing raw SQL with db.session.execute() in a health check endpoint without explicit cleanup can lead to connection pool exhaustion under high request load or when the database is slow to respond.

Add db.session.remove() or use a separate engine connection to ensure the session is properly cleaned up after each readiness check.

Suggested change

-            db.session.execute(text("SELECT 1"))
+        try:
+            with db.engine.connect() as conn:
+                conn.execute(text("SELECT 1"))
+        except Exception as exc:  # noqa: BLE001
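The idea behind this suggestion, running the probe query on a short-lived connection that is always released, can be sketched without Flask or SQLAlchemy. The example below swaps in the stdlib `sqlite3` module purely for illustration; `check_ready`, `broken_connect`, the response bodies, and the 503 status are assumptions, not the PR's code.

```python
import sqlite3
from contextlib import closing
from typing import Callable


def check_ready(connect: Callable[[], sqlite3.Connection]) -> tuple[dict, int]:
    """Run SELECT 1 on a short-lived connection and report readiness."""
    try:
        # closing() guarantees conn.close() runs even if the query raises.
        with closing(connect()) as conn:
            conn.execute("SELECT 1")
    except sqlite3.Error as exc:
        # Catch only database errors; anything else propagates to the caller.
        return {"status": "unavailable", "error": str(exc)}, 503
    return {"status": "ready"}, 200


def broken_connect() -> sqlite3.Connection:
    """Hypothetical connector that simulates an unreachable database."""
    raise sqlite3.OperationalError("could not connect")


print(check_ready(lambda: sqlite3.connect(":memory:")))
print(check_ready(broken_connect))
```

Catching the driver's own error type rather than `Exception` mirrors the earlier review comment about letting unexpected errors reach the global handler.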

amazon-q-developer · 2026-05-17T17:31:11Z

-      - "5432:5432"
+      - "127.0.0.1:5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-appuser} -d ${POSTGRES_DB:-appdb}"]


🛑 Logic Error: The pg_isready healthcheck includes specific user and database parameters but will fail if the environment variables are not properly interpolated. The ${POSTGRES_USER:-appuser} syntax works in the docker-compose context but the actual command string passed to the container may not expand these variables correctly depending on how PostgreSQL processes the healthcheck command.

Test this healthcheck thoroughly to ensure it properly validates with the interpolated username and database, or simplify to pg_isready without parameters if the connection variables are already configured via environment.

secure-code-warrior-for-github · 2026-05-17T17:31:16Z

Micro-Learning Topic: Race condition (Detected by phrase)

Matched on "race condition"

A race condition is a flaw that produces an unexpected result when the timing of actions impact other actions.

codacy-production · 2026-05-17T17:32:13Z

Not up to standards ⛔

🔴 Issues: 5 high

Alerts:
⚠ 5 issues (≤ 0 issues of at least minor severity)

Results:
5 new issues

Category	Results
Security	5 high

🟢 Metrics: 8 complexity · 0 duplication

Metric	Results
Complexity	8
Duplication	0

donny-devops added 8 commits May 17, 2026 13:25

fix(api): make Marshmallow validators compatible with newer callbacks

fbcef6d

feat(api): add readiness endpoint and structured error responses

da30d68

fix(docker): add database wait loop before migrations

94ddfd6

fix(docker): harden compose healthchecks and defaults

6a43d12

fix(ci): make lint check-only and least-privilege

9dd56c7

test(api): cover readiness validation and JSON error behavior

e860efd

docs(env): document runtime tuning variables

04e13f7

docs(readme): fill architecture and operational runbook gaps

08cbfb1
