Junior AI / Data Engineer · Python · RAG & Multi-Agent Systems · Data Pipelines (GCP)
I'm a software engineer who already ships production-grade AI and data systems: agentic workflows, RAG assistants, and Medallion data platforms. I work at the intersection of data engineering and applied AI, building reliable pipelines, integrating LLMs with real data and tools, and turning messy processes into observable, automated systems.
- Currently building multi-agent LLM systems and LLMOps evaluation pipelines in production.
- Strong on data engineering (Airflow, dbt, BigQuery, Medallion architecture) and applied AI (RAG, agents, function calling).
- Open to Junior roles in AI Engineering, Data Engineering and Data Analytics.
- Based in Rio Grande do Norte, Brazil, working remotely with teams in São Paulo.
| Project | What it is |
|---|---|
| insights-llm (Architecture Case Study) | A production agentic WhatsApp analytics assistant I built (private codebase); this repo is a public demonstration of its architecture. Event-driven (FastAPI + LangChain + Gemini), with read-only JWT integration to a .NET point-of-sale (PDV) system, three-level authorization, the outbox pattern, and a failure-to-defense map. |
| LIZ (Agentic RAG Assistant) | Production-style RAG assistant with strict document grounding, async analytics and an IAM-aware architecture. |
| Thor (LLMOps) | Autonomous LLM evaluation system. Compares prompts and models using a reproducible methodology and calibrated rubrics, built without external frameworks. |
| AI Document Extraction Pipeline | End-to-end pipeline: web crawling, PDF extraction, LLM structuring, validated JSON output. |
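
For readers unfamiliar with the outbox pattern mentioned in the insights-llm row, here is a minimal, generic sketch. It is not taken from the private codebase: the table names, function names, event shape and the use of SQLite are all assumptions made for illustration. The idea is that the business write and the event record commit in one transaction, and a separate relay step publishes pending events afterwards.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table plus an outbox table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox "
        "(id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )


def save_with_event(conn, body):
    # `with conn` wraps both inserts in a single transaction:
    # either the message and its event are both stored, or neither is.
    with conn:
        cur = conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
        message_id = cur.lastrowid
        payload = json.dumps({"type": "message_saved", "message_id": message_id})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
    return message_id


def relay_pending(conn, publish):
    # Publish unsent events in insertion order. If `publish` raises,
    # the row stays unsent and is picked up again on the next run,
    # so delivery is at-least-once.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
        delivered += 1
    return delivered
```

In a real deployment `publish` would hand the event to a broker or webhook; here it is any callable, which keeps the sketch testable. Because delivery is at-least-once, consumers would need to tolerate duplicate events.
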
| Project | What it is |
|---|---|
| Olist Data Platform | Enterprise-grade platform on the Olist dataset: Airflow, dbt, Docker, Great Expectations, ML drift monitoring and AI-powered diagnostics. |
| Data Engineering Foundations | A curated foundations trail: Python/pandas, real-API ingestion, SQL analytics (joins, customer segmentation) and production-style ETL challenges in JS. Built as a deliberate skills progression. |
AI Workflow Engineer, Estúdio Oggi (2026 to present): Built LLM-as-Judge evaluation infrastructure (calibrated rubrics, versioned golden sets) and multi-agent squads for workflow automation, and ran A/B tests across Anthropic, Google and OpenAI models to quantify cost, latency and quality trade-offs.
GenAI & Data Automation Intern, Maximiza IA (2026 to present): Built an LLM-powered crawler with semantic structuring, database schema and business-rule validation, and a Django back-end with APIs, CRUD operations and authentication.
Founder & AI Product Developer, Caatinga IA (2025 to present): Architected and deployed a municipal RAG assistant over WhatsApp, built GCP data pipelines and Looker dashboards, and handled product definition and MVP validation with stakeholders.
- AI & Agents: LLMs · RAG · Multi-agent orchestration · LangChain · Function calling · LLMOps (eval, monitoring, prompt/version control)
- Data: Python · SQL · Pandas · dbt · Airflow · Great Expectations
- Cloud (GCP): BigQuery · Vertex AI · Cloud Run · Pub/Sub · Looker
- Engineering: FastAPI · Docker · REST APIs · Event-driven architecture · MongoDB · PostgreSQL
- LinkedIn: hyego-maia
- Email: [email protected]