🧹 Automated Filtering of Undesirable Web Data to Update LLM Knowledge
Updated Sep 18, 2025 - Jupyter Notebook
Open-Source Machine Learning Platform
Data version control with Makefile and DVC for a regression task that estimates individuals' insurance costs.
mini project
🏷️ An AI-driven approach to labeling LLM training data
The runtime environment for the KOI-System. Train models, run instances, and collect samples.
The backend of the KOI-System.
🌱 Clairveillance Manifesto: for institutions that measure, doubt, and learn
End-to-end Financial MLOps pipeline for automated stock forecasting featuring PyTorch LSTM, Apache Airflow orchestration, MLflow tracking, MongoDB Atlas, and FastAPI containerized with Docker and GHCR.
This project integrates Airflow, EC2, MLflow, and MLOps principles to deploy a robust pipeline for classifying medical MNIST images. Streamlit enables user-friendly image uploads, MLflow handles model registry and inference, and Airflow automates data updates and model retraining on AWS EC2.
A concurrent training and generation pipeline that uses active learning to drive synthetic data rendering. By generating customized datasets alongside model training, it creates a real-time feedback loop that dynamically refines object detection models.
🌟 Explore how entropy and negentropy shape our world and learn to build a clearer future with insights from machine intelligence.