Improve Dockerfile security, layering, and dev/prod parity #6882

Open
compwron wants to merge 2 commits into main from dockerfile-fixes-2026-04-23

@compwron
Collaborator

Summary

  • Non-root user: app now runs as a dedicated app user instead of root
  • Remove vim from production image (attack surface reduction)
  • Fix ARG ROOT: remove the no-op global declaration; each stage now carries its own default value
  • Layer cache improvement: package*.json is copied and npm ci runs before COPY . ., so source changes don't bust the npm install layer
  • npm symlink fix: replaced fragile ln -s into node_modules internals with COPY --from=node-source /usr/local/bin/npm
  • apk cache cleanup added to the build stage (it was missing there, although the final stage already had it)
  • Dev/prod Ruby parity: devcontainer updated from ruby:dev-3.3-bookworm to ruby:dev-4.0-bookworm to match production

Test plan

  • docker build . completes without errors
  • docker run starts Rails server and responds on port 3000
  • Verify whoami inside container returns app (not root)
  • Devcontainer opens successfully in VS Code with Ruby 4.0
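
The build and whoami items in the test plan could be scripted roughly as follows. This is only a sketch: the image tag is hypothetical (the PR does not name one), and it assumes the image ships a whoami binary, as Alpine's BusyBox does.

```shell
#!/bin/sh
# Hypothetical image tag; the PR does not name one.
IMAGE="${IMAGE:-app-under-test:local}"

# Build the image from the current directory (first test-plan item).
build_image() {
  docker build -t "$IMAGE" .
}

# Print the user the container runs as by default. The entrypoint is
# overridden so that no Rails process has to boot for this check.
container_user() {
  docker run --rm --entrypoint whoami "$IMAGE"
}

# Succeed only when the default user is the dedicated "app" user.
check_non_root() {
  [ "$(container_user)" = "app" ]
}
```

Running `build_image && check_non_root && echo "non-root OK"` from the repository root would cover both items. The port 3000 check is left out because it depends on how the database is provided.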

🤖 Generated with Claude Code

- Run app as non-root user (addgroup/adduser + USER app)
- Remove vim from production image
- Fix ARG ROOT: remove no-op global declaration, add default per-stage
- Copy package*.json before COPY . . to improve npm layer caching
- Replace fragile npm symlink with direct COPY from node-source
- Add apk cache cleanup to build stage
- Align devcontainer Ruby version with production (3.3 → 4.0)

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@github-actions (bot) added the dependencies label (Touches dependency files) on Apr 24, 2026
COPY resolves the file to /usr/local/bin/npm, so Node sets __dirname
to /usr/local/bin/ and require('../lib/cli.js') in npm-cli.js points
to /usr/local/lib/cli.js (missing). A symlink keeps the real file at
/usr/local/lib/node_modules/npm/bin/npm-cli.js so __dirname resolves
correctly and the relative require works.
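
The path arithmetic in this commit message can be reproduced without Node. The sketch below builds a scratch directory that stands in for /usr/local, then compares a copied entry point with a symlinked one. It assumes `readlink -f` is available (GNU coreutils, BusyBox, recent macOS) and relies on Node's default behaviour of resolving the main script's symlinks before it sets __dirname. The file names npm-copy and npm-link are made up for the demonstration.

```shell
#!/bin/sh
set -eu

# Scratch tree standing in for /usr/local; no real npm install is touched.
root="$(mktemp -d)"
mkdir -p "$root/bin" "$root/lib/node_modules/npm/bin" "$root/lib/node_modules/npm/lib"
echo '// stand-in for npm-cli.js' > "$root/lib/node_modules/npm/bin/npm-cli.js"
echo '// stand-in for cli.js' > "$root/lib/node_modules/npm/lib/cli.js"

# Case 1: a plain copy. The entry file now really lives in bin/.
cp "$root/lib/node_modules/npm/bin/npm-cli.js" "$root/bin/npm-copy"
# Case 2: a symlink. The real file stays under node_modules/npm/bin/.
ln -s ../lib/node_modules/npm/bin/npm-cli.js "$root/bin/npm-link"

# Directory of the real file behind an entry point (the __dirname analogue).
real_dir() { dirname "$(readlink -f "$1")"; }

# Where require('../lib/cli.js') would look in each case.
copy_target="$(real_dir "$root/bin/npm-copy")/../lib/cli.js"
link_target="$(real_dir "$root/bin/npm-link")/../lib/cli.js"

if [ -e "$copy_target" ]; then copy_result=found; else copy_result=missing; fi
if [ -e "$link_target" ]; then link_result=found; else link_result=missing; fi
echo "copy: $copy_result"
echo "symlink: $link_result"
rm -rf "$root"
```

The copy resolves `../lib/cli.js` relative to bin/, where no such file exists. The symlink resolves it relative to the real npm directory, where it does.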

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@compwron
Collaborator Author

compwron commented May 3, 2026

The bot says:

The failing job 72828420002 is being killed with exit code 137 (128 + 9, i.e. SIGKILL), which usually means the OS terminated the process because it ran out of memory (OOM) or, less commonly, that something else force-killed it. In your workflow, the “Test” step runs the entire RSpec suite inside the web container, after db:setup and assets:precompile, which can push the GitHub runner over its memory limit.

From the logs, the failure happens at the end of the job and doesn’t point to a specific spec failure; it’s consistent with the container being killed rather than tests failing.
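
For reference, the 137 in the diagnosis above is the shell's convention of reporting 128 plus the signal number when a process dies from a signal. A small sketch that reproduces it locally, with no Docker involved:

```shell
#!/bin/sh
# Start a child shell that sends itself SIGKILL, the signal the kernel
# OOM killer also uses. The parent shell may print "Killed" to stderr.
sh -c 'kill -KILL $$'
status=$?

# Exit statuses above 128 encode "terminated by signal (status - 128)".
signal_number=$((status - 128))
signal_name="$(kill -l "$signal_number")"

echo "exit status: $status"
echo "signal: $signal_number ($signal_name)"
```

The status alone does not say who sent the signal. An OOM kill and a manual `kill -9` look the same from here, which is why the comment hedges with "or, less commonly, a forced kill".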

Fix: reduce memory pressure in CI (recommended)

1) Don’t run db:setup in CI; use db:prepare instead

db:setup loads seeds and does extra work that’s not needed for CI and increases time/memory. Switch to db:prepare.

.github/workflows/docker.yml (ref d469d6d12d7b00a3b2a1cb7f1b7bbb2d46690bed)

      - name: db:setup
-       run: docker compose exec -T web rails db:setup
+       run: docker compose exec -T web rails db:prepare

2) Limit parallelism / memory usage for Ruby & RSpec inside the container

Constrain Ruby’s heap growth and reduce concurrency to avoid spikes.

 jobs:
   docker:
     runs-on: ubuntu-latest
     timeout-minutes: 20
     env:
       RAILS_ENV: test
       TEST_MAX_DURATION: 60
+      # reduce memory pressure on GitHub-hosted runners
+      RUBYOPT: "--disable-gems"
+      MALLOC_ARENA_MAX: "2"
+      RUBY_YJIT_ENABLE: "0"

And adjust the test command to run in a more memory-stable way (single process, fewer formatter extras):

      - name: Test
-       run: docker compose exec -T web bundle exec rspec spec --fail-fast
+       run: docker compose exec -T web bundle exec rspec spec --fail-fast --format progress

(If you’re using parallel test execution elsewhere, ensure it’s not enabled in CI for this job.)
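
One caveat about step 2: variables set under the job's env live on the runner. As far as I know, they do not reach a process started with `docker compose exec` unless the compose file forwards them or they are passed explicitly. A minimal sketch of the explicit route, using the `-e` flag of `docker compose exec`; the helper name `run_in_web` is hypothetical, and only MALLOC_ARENA_MAX is forwarded here as an example:

```shell
#!/bin/sh
# Hypothetical helper: run a command in the web service with the memory
# tuning variable passed into the container explicitly.
run_in_web() {
  docker compose exec -T -e MALLOC_ARENA_MAX=2 web "$@"
}
```

The Test step could then call `run_in_web bundle exec rspec spec --fail-fast --format progress`. Whether this setting actually helps would still need to be measured on the failing job.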

3) Skip assets precompile for this job (if tests don’t require it)

If your specs don’t depend on compiled assets, removing this step is the biggest easy win.

-     - name: compile assets
-       run: docker compose exec -T web bundle exec rails assets:precompile

If you do need assets for system specs, consider precompiling only when system specs are present, or moving system specs to a separate job.

Why this solves it

  • Exit code 137 indicates the runner is killing the containerized workload.
  • The workflow runs multiple heavy steps back-to-back (db:setup, assets:precompile, full RSpec), so the web container likely exceeds available memory.
  • Replacing db:setup, reducing memory growth, and/or skipping asset compilation keeps the job under runner limits and prevents the kill.

If you want, paste the docker-compose service definitions (especially the web service) and I can suggest a targeted memory cap/optimization there too (e.g., disabling unnecessary services during CI).
