Automation/blog pipeline#718

Open
nora-weisser wants to merge 9 commits into
Women-Coding-Community:mainfrom
nora-weisser:automation/blog-pipeline

@nora-weisser nora-weisser commented Jun 25, 2026

Contributor

A daily (and manually-triggerable) GitHub Action that turns reviewed blog submissions into draft PRs.
Silke opened the initial PR; this one builds on it with improvements.

What it does

  • Reads the Form Responses sheet and publishes rows marked isReviewedandApproved but not yet isPublished
  • Exports each Google Doc → sanitized HTML post, downloads the cover image (falls back to default), then opens a PR for human review
  • Writes isPublished=TRUE back to the sheet so nothing is published twice — the sheet is the single source of truth
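For reviewers, the row selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes each sheet row is read as a dict keyed by column name and that the flag cells hold the text `TRUE`/`FALSE`. The helper names `is_true` and `rows_to_publish` are hypothetical.

```python
from typing import Dict, List


def is_true(value: str) -> bool:
    """Interpret a sheet cell as a boolean (assumes 'TRUE'/'FALSE' text values)."""
    return str(value).strip().upper() == "TRUE"


def rows_to_publish(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep rows that are approved but not yet published."""
    return [
        row
        for row in rows
        if is_true(row.get("isReviewedandApproved", ""))
        and not is_true(row.get("isPublished", ""))
    ]
```

Rows with an empty or missing `isPublished` cell are treated as unpublished in this sketch, so a freshly submitted and approved row is picked up on the next run.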

Key changes

  • publish_reviewed_blogs.py — orchestrator; blog_exporter.py — Doc→post + image; blog_info_from_spreadsheet.py — sheet read/write
  • HTML sanitized against an allowlist (bleach); filename slugs and YAML front matter hardened
  • Robust to missing images / inaccessible Docs (skips cleanly)
  • Workflow runs only on the org repo; service-account key injected from a secret and never committed
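To illustrate what "hardened filename slugs" could mean in practice, here is a small standard-library sketch. It is an assumption about the approach, not a copy of the code in this PR; the function name `slugify` and the `max_length` default are hypothetical.

```python
import re
import unicodedata


def slugify(title: str, max_length: int = 80) -> str:
    """Turn an arbitrary title into a safe, lowercase, hyphen-separated slug."""
    # Strip accents and drop anything that cannot be represented in ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters (including '/', '.',
    # and whitespace) into a single hyphen, so path separators never survive.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Truncate, and fall back to a placeholder if nothing usable is left.
    return slug[:max_length].rstrip("-") or "untitled"
```

Because only `a-z`, `0-9`, and `-` can appear in the result, a title such as `../../etc/passwd` cannot escape the `_posts/` directory when the slug is used in a filename.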

I tested it locally. The workflow cannot be tested while it is still in a PR, only once it is merged.

Pull request checklist

Please check if your PR fulfills the following requirements:

  • I checked and followed the contributor guide
  • I have tested my changes locally.
  • I have added a screenshot from the website after I tested it locally

@nora-weisser nora-weisser marked this pull request as ready for review June 25, 2026 12:34
@nora-weisser nora-weisser requested a review from a team as a code owner June 25, 2026 12:34
@dricazenck dricazenck requested a review from silkenodwell June 28, 2026 20:07
with:
  token: ${{ secrets.GHA_ACTIONS_ALLOW_TOKEN }}
  commit-message: "Automated import of reviewed blog posts"
  branch: "automation/import-blog"

The branch name is fixed as automation/import-blog, which means all batch imports accumulate into the same PR. If a previous batch PR is still open when a new daily run triggers, reviewers might approve a larger bundle than expected.

Would it be worth using a date-stamped branch name (e.g. automation/import-blog-2026-06-28) so each run creates its own isolated PR, giving reviewers clearer control over what they're approving?
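One way the date-stamped name could be produced is sketched below, assuming the orchestrator (or a small step before the PR is opened) computes the branch name and passes it to the workflow. The function `import_branch_name` is hypothetical, and how the value reaches the `branch:` input is left open.

```python
from datetime import date
from typing import Optional


def import_branch_name(
    run_date: Optional[date] = None,
    prefix: str = "automation/import-blog",
) -> str:
    """Build a per-run branch name such as automation/import-blog-2026-06-28."""
    # Default to the runner's current date when no date is supplied.
    run_date = run_date or date.today()
    return f"{prefix}-{run_date.isoformat()}"
```

Two runs on the same day (for example a scheduled run plus a manual trigger) would still share a branch under this scheme, so a time or run-number suffix may be needed if that matters.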

- new posts under `_posts/`
- cover images under `assets/images/blog/`

The spreadsheet's `isPublished` column has already been set to TRUE for

The PR body already mentions that isPublished is set to TRUE before merging, which is good. One concern: if this PR is rejected or closed without merging, those rows stay marked as published in the sheet and won't be re-exported on the next run — they'd be silently lost.

Could we add a note here reminding reviewers that closing this PR without merging requires manually resetting isPublished to FALSE in the spreadsheet for the affected rows?
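If the orchestrator knows which rows it exported, the reminder could be generated along these lines. This is only a sketch: `reset_reminder` is a hypothetical helper, and wiring its output into the PR body is not shown.

```python
from typing import List


def reset_reminder(titles: List[str]) -> str:
    """Return a Markdown note listing rows to reset if the PR is not merged."""
    if not titles:
        # Nothing was exported, so there is nothing to remind reviewers about.
        return ""
    lines = [
        "**If this PR is closed without merging**, reset `isPublished` to FALSE "
        "in the spreadsheet for these rows, or they will not be re-exported:",
    ]
    lines += [f"- {title}" for title in titles]
    return "\n".join(lines)
```

Listing the affected titles explicitly saves the reviewer from having to work out which sheet rows belong to a given batch.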

team-reviewers: "Women-Coding-Community/leaders"
title: "Automated import of reviewed blog posts"
body: |
  This PR was created automatically by a GitHub Action.

The PR checklist includes "I have added a screenshot from the website after I tested it locally" but no screenshot appears in the PR body. Since this automation writes files and opens PRs, a sample of the generated post output (e.g. the front matter + first few lines of an exported .html file) would help reviewers validate the format before merging.

Use this ID in your scripts when exporting the document.

## Run Automation
## Export a single blog manually (for testing)

The README now correctly references blog_exporter.py as the entry point for manual testing, but doc_to_html_conversion.py is still present in the repo (visible in the file tree). Is this file still needed, or should it be removed/deprecated to avoid confusion about which script to use?

3 participants