🇹🇼
Focusing on how to become a good data engineer
  • TAIPEI

samwu4166/README.md

Hi 👋, I'm Sam

Senior Data Engineer · building agent-native data systems


🇹🇼 Based in Taipei, building production data platforms and increasingly agent-native engineering systems. 7+ years across ad-tech, fintech, and consumer-product data stacks.

🔭 Currently building

  • 🤖 Multi-agent infrastructure — a 4-agent Claude Code system with stigmergic file-based coordination and autonomous monitoring agents on an enterprise MCP platform
  • 📦 Agent-native dbt PR review harness — multi-phase impact pipeline + multi-model oracle review (118 PRs shipped in 22 days; composite review quality 3.9 → 7.6)
  • 🛰 Production BigQuery → Spark → ClickHouse ad-metrics pipeline — query latency ~10s → ~100ms (~100×), 4× throughput, atomic partition swap + circuit-breaker hardening

🛠 Stack

  • Data / Backend: Apache Airflow · dbt · ClickHouse · BigQuery · Spark · PostgreSQL · Python · FastAPI
  • Infra: Kubernetes (GKE) · Pulumi (TypeScript IaC) · Docker · GitHub Actions
  • Agentic AI: Claude Code · MCP · multi-agent orchestration patterns

✍️ Writing & speaking

Coming soon: technical posts on agent-native data engineering, multi-agent harness patterns, and dbt at scale. Open to conference / meetup invites (JCConf, AI Engineer Summit, COSCUP).

📫 Connect



Pinned

  1. pagination-prediction (Public)

    A repo containing an ML-based pagination predictor, served with FastAPI

    Jupyter Notebook 1 1

  2. house_pricing_collector (Public)

    A repo containing a multi-process pyppeteer scraper for 591 house records, served with FastAPI; it also provides auto-deploy to AWS EC2.

    Python

  3. ElectricMonitor (Public)

    A real-time electric monitor, built with Vue.js and Express.js

    JavaScript 1

  4. claude-swarm-kit (Public)

    Run a fleet of Claude Code agents on one machine, each with an isolated Telegram channel + shared audit/memory/spawn/report infra

    Python