Welcome to the LLMOps Workshop! This repository contains the material you need for the workshop.
The slides that were presented are available on Moodle 🥸
- Understand the foundations of Large Language Models and their deployment challenges
- Master some practical techniques for building LLM applications
- Get an introduction to optimization, testing, evaluation, and DevOps
To successfully complete this workshop, it helps to have:
- Python programming experience (at least OOP concepts)
- Working knowledge of Git and GitHub
- Familiarity with API architectures
- Basic knowledge of Docker
This workshop consists of two practical exercises (TPs) and one lab project that will help you apply the concepts learned during lectures.
The assignments take you progressively through different development environments and paradigms, from interactive Jupyter notebooks to scripts to a full FastAPI implementation 😎
This notebook exercise will help you understand the ChromaDB and LangChain frameworks and their basic concepts. You will learn to:
- Connect to a Chroma database running in Docker
- Create collections and add documents
- Query vector collections
- Work with LangChain components for LLM applications
- Implement parsing for structured LLM outputs
- Interact with ChromaDB and LangChain
- Upload, process, and query PDF documents in a vector database
You just need to RTFM and play with the notebook 🤗
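To make the idea of "querying a vector collection" concrete before you open the notebook, here is a minimal standard-library sketch. It is not the Chroma API: `ToyCollection`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" is a stand-in assumption for a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyCollection:
    """Hypothetical in-memory stand-in for a vector collection; NOT the Chroma API."""

    def __init__(self) -> None:
        self._docs: dict[str, tuple[str, Counter]] = {}

    def add(self, doc_id: str, text: str) -> None:
        # Store the raw text next to its vector, keyed by document id.
        self._docs[doc_id] = (text, embed(text))

    def query(self, question: str, n_results: int = 1) -> list[str]:
        # Rank stored documents by similarity to the question, most similar first.
        q = embed(question)
        ranked = sorted(self._docs, key=lambda d: cosine(q, self._docs[d][1]), reverse=True)
        return ranked[:n_results]


collection = ToyCollection()
collection.add("doc1", "Chroma stores embeddings for similarity search")
collection.add("doc2", "FastAPI is a web framework for building APIs")
print(collection.query("which framework builds web APIs"))
```

The notebook follows the same add-then-query flow, except that a real embedding model produces the vectors and the Chroma server running in Docker does the storage and search.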
In this exercise, you will build a PDF Retrieval Augmented Generation (RAG) system using local models served through Ollama. It can answer questions about PDF documents without sending the PDFs to external servers.
This suits organizations that handle sensitive information, such as law firms 🕵️
Key components you'll implement:
- Document loading and processing
- Text splitting and chunking
- Vector embedding generation
- Vector database creation and querying
- Retrieval QA chain setup
- Interactive question answering
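The "text splitting and chunking" step can be sketched as follows. This is a simplified, character-based splitter written for illustration only; `split_text` and its default sizes are assumptions, and library splitters usually also try to respect separators such as paragraphs and sentences.

```python
def split_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window shares `overlap` characters with the previous one, so a
    sentence cut at a boundary still appears whole in at least one chunk
    (as long as it is shorter than the overlap).
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "Retrieval Augmented Generation grounds answers in retrieved passages. " * 2
for i, chunk in enumerate(split_text(sample)):
    print(i, repr(chunk))
```

Larger chunks give the model more context per retrieved passage, while smaller chunks make retrieval more precise; the overlap is a hedge against cutting relevant text in two.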
You'll use a simple shell-based interface to query the system about uploaded PDFs, with an option for structured output responses.
See more in the dedicated README in the TP2 folder.
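As an illustration of what a "structured output" option can involve, the sketch below parses a JSON answer out of raw model text using only the standard library. The `Answer` schema, the `parse_answer` helper, and the simulated model output are all hypothetical; the TP itself may rely on LangChain's own output parsers instead.

```python
import json
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical target schema for a structured answer."""
    answer: str
    sources: list[str]


def parse_answer(raw: str) -> Answer:
    """Parse the JSON object spanning the first '{' to the last '}' and validate its fields."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError subclass
    if not isinstance(data.get("answer"), str):
        raise ValueError("'answer' must be a string")
    sources = data.get("sources", [])
    if not isinstance(sources, list) or not all(isinstance(s, str) for s in sources):
        raise ValueError("'sources' must be a list of strings")
    return Answer(answer=data["answer"], sources=sources)


# Simulated model output: models often wrap JSON in extra prose.
raw_output = 'Sure! {"answer": "The contract ends in 2026.", "sources": ["page 3"]} Hope it helps.'
print(parse_answer(raw_output))
```

Validating the fields explicitly means a malformed response fails loudly at the parsing step rather than later in the application.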
In this lab project, you will build a unified API gateway that can handle the communication with various LLM providers.
The gateway will serve as a middleware layer that standardizes requests and responses across multiple LLM services including OpenAI, Anthropic, Groq, and locally hosted models via Ollama.
🛠️ Some key features of your mission 🛠️
- FastAPI app connecting multiple providers with extension capabilities
- Robust error handling and logging
- Rate limiting and request validation
- Caching mechanisms for optimization
- Fallback strategies
- Test-driven development strategy
- Docker local deployment
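Two of the features above, fallback strategies and caching, can be sketched without any web framework. Everything in this example is an assumption made for illustration: providers are modelled as plain functions, and `Gateway`, `flaky_provider`, and `local_provider` are hypothetical names. In the lab, the same logic would sit behind FastAPI routes and call real provider clients.

```python
from typing import Callable

# A provider is modelled as a function prompt -> completion (an assumption for this sketch).
Provider = Callable[[str], str]


class Gateway:
    """Hypothetical gateway core: tries providers in order and caches successful answers."""

    def __init__(self, providers: dict[str, Provider]) -> None:
        self.providers = providers  # insertion order is the fallback order
        self.cache: dict[str, tuple[str, str]] = {}

    def complete(self, prompt: str) -> tuple[str, str]:
        """Return (provider_name, completion), falling back when a provider fails."""
        if prompt in self.cache:
            return self.cache[prompt]
        errors = []
        for name, call in self.providers.items():
            try:
                result = (name, call(prompt))
            except Exception as exc:  # a real gateway would catch narrower error types
                errors.append(f"{name}: {exc}")
                continue
            self.cache[prompt] = result
            return result
        raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_provider(prompt: str) -> str:
    """Stand-in for a remote provider that is currently unavailable."""
    raise TimeoutError("simulated timeout")


def local_provider(prompt: str) -> str:
    """Stand-in for a locally hosted model."""
    return f"echo: {prompt}"


gateway = Gateway({"remote": flaky_provider, "local": local_provider})
print(gateway.complete("hello"))
```

Keeping the fallback order in one place makes it easy to add a provider later, and collecting the individual errors keeps the final failure message useful for logging.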
All code for the workshop (TPs and lab) should live in a PUBLIC repository on your personal GitHub account. Please follow the 4 steps below:
- Create a separate repository for this workshop (or one per activity, but mention this at step 4)
- Organize your code according to the provided guidelines as much as possible
- Write detailed documentation and README files for the hands-on section
- Send your repository link to my cs email address
Your repository should be clean and well organized, with documentation of each part of your implementation 😇
If you encounter any issues or have coding questions during the workshop, please follow this procedure:
- Open a GitHub issue in your repository with a clear description of your problem
- Include screenshots if applicable to help illustrate the issue
- Tag me in the issue and/or send an email notification
- Use descriptive titles for your issues (like "Error connecting to Chroma database in TP2" rather than "Not working... Help plz🥲")
Please check the existing issues before creating yours (you may not be the only one with your problem) 🥹
Your work will be evaluated based on the following criteria:
- Functionality and completeness of implementations (see more in the dedicated README in the LAB folder)
- Problem-solving approaches
- Testing and error handling
- Documentation quality
Create a conda environment and install the required dependencies with the commands below:
conda create -n llmops python=3.12 -y
conda activate llmops
Then install the dependencies:
pip install -r requirements.txt
Finally, register the environment as a kernel so you can use it inside your Jupyter server:
python -m ipykernel install --user --name=llmops
Happy coding 🤗