An intelligent data analytics system that allows users to query a MySQL database using natural language. The system converts user queries into SQL using a local LLM (Qwen via Ollama) and returns structured results.
Non-technical users struggle to write SQL queries.
👉 This system addresses that problem by enabling:
Natural Language → SQL → Insights
- Frontend: Streamlit
- Backend: Python
- Database: MySQL
- LLM: Qwen2.5:3B (Ollama)
User → Streamlit → LLM (Qwen) → SQL → MySQL → Result
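The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `backend.py`: the names `ask_ollama` and `answer_question`, the prompt wording, and the injectable `run_sql` callable are assumptions made for the example. The endpoint is Ollama's default local `/api/generate` route.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen2.5:3b"


def ask_ollama(prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the raw completion."""
    payload = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


def answer_question(question, ask_llm=ask_ollama, run_sql=None):
    """User question -> LLM -> SQL -> database -> rows (hypothetical helper)."""
    prompt = (
        "Translate this question into a single MySQL SELECT statement.\n"
        f"Question: {question}\nSQL:"
    )
    sql = ask_llm(prompt).strip()
    # run_sql is any callable that executes SQL against MySQL and returns rows.
    rows = run_sql(sql) if run_sql is not None else []
    return sql, rows
```

Passing the LLM call and the database call in as parameters keeps the pipeline testable without a running Ollama server or MySQL instance.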
ai-nl-to-sql-analytics-platform/
│
├── README.md
├── requirements.txt
├── .gitignore
│
├── src/
│   ├── app.py        # Streamlit UI
│   ├── backend.py    # Database logic, LLM integration, helper functions
│   └── prompt.py     # Prompt engineering
│
├── db/
│   ├── schema.sql    # CREATE TABLE statements
│   └── data/
│       ├── customers.csv
│       ├── orders.csv
│       ├── order_items.csv
│       ├── products.csv
│       ├── category.csv
│       └── departments.csv
│
└── images/
git clone <repo_url>
cd ai-nl-to-sql-analytics-platform
pip install -r requirements.txt
ollama serve
ollama run qwen2.5:3b
streamlit run src/app.py
- Natural Language → SQL conversion
- Multi-table JOIN support
- Aggregation (SUM, COUNT, etc.)
- Prompt-engineered accuracy
- MySQL integration
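As one possible illustration of the prompt-engineering step, the sketch below builds a schema-aware prompt and cleans the model's reply. The function names `build_prompt` and `extract_sql` and the prompt wording are assumptions, not the contents of `prompt.py`.

```python
import re

FENCE = "`" * 3  # Markdown code fence that models often wrap SQL in


def build_prompt(schema: str, question: str) -> str:
    """Combine the table definitions and the user's question into one prompt."""
    return (
        "You are a MySQL expert. Use only the tables and columns below.\n"
        f"Schema:\n{schema.strip()}\n\n"
        "Return exactly one SELECT statement and nothing else.\n"
        f"Question: {question.strip()}\n"
        "SQL:"
    )


def extract_sql(llm_output: str) -> str:
    """Return the SQL inside a Markdown fence if present, else the trimmed reply."""
    pattern = re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, llm_output, re.DOTALL | re.IGNORECASE)
    sql = fenced.group(1) if fenced else llm_output
    return sql.strip()
```

Restricting the model to the listed tables and asking for a single statement tends to reduce invented column names, though it does not eliminate them.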
| Natural Language | Generated SQL |
|---|---|
| Show all customers | SELECT * FROM customers; |
| Total sales | SELECT SUM(total_amount) FROM orders; |
| Top 5 customers | GROUP BY + ORDER BY + LIMIT |
| Sales by department | Multi-table JOIN |
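To make the "Top 5 customers" row concrete, here is a small runnable sketch of the GROUP BY + ORDER BY + LIMIT pattern. It uses an in-memory SQLite database as a stand-in for MySQL, and the `customer_id` and `name` columns plus the sample rows are hypothetical; only `orders.total_amount` appears in the table above.

```python
import sqlite3

# Query shape the LLM would be expected to produce for "Top 5 customers".
TOP_CUSTOMERS_SQL = """
SELECT c.name, SUM(o.total_amount) AS total_spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
LIMIT 5;
"""

# Hypothetical miniature schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1, 50.0), (2, 2, 20.0), (3, 1, 25.0), (4, 3, 60.0);
""")
top_customers = conn.execute(TOP_CUSTOMERS_SQL).fetchall()
conn.close()
```

With this sample data, `top_customers` lists customers from highest to lowest total spend.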
Blocks:
- DROP
- DELETE
- UPDATE
- TRUNCATE
Ensures:
- Read-only queries
- Controlled execution
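A read-only guard along these lines could be sketched as below. This is an assumed, deliberately conservative check (the name `is_read_only` is hypothetical), not the project's actual validation logic: it rejects stacked statements, anything that is not a SELECT, and any blocked keyword appearing as a whole word.

```python
import re

BLOCKED_KEYWORDS = ("DROP", "DELETE", "UPDATE", "TRUNCATE")
BLOCKED_PATTERN = re.compile(r"\b(" + "|".join(BLOCKED_KEYWORDS) + r")\b", re.IGNORECASE)


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT statement with no blocked keywords."""
    statement = sql.strip().rstrip(";").strip()
    # Reject stacked statements such as "SELECT 1; DROP TABLE x".
    if ";" in statement:
        return False
    # Only statements that start with SELECT are allowed through.
    if not re.match(r"select\b", statement, re.IGNORECASE):
        return False
    # Whole-word match, so a column like updated_at is not flagged.
    return BLOCKED_PATTERN.search(statement) is None
```

Because the check is textual, it also rejects harmless queries whose string literals contain a semicolon or a blocked word. Running the application under a MySQL account that has only SELECT privileges is a stronger complement to this kind of filter.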
- LLM may generate incorrect queries in edge cases
- Accuracy depends on prompt quality
- No RAG implementation yet to reduce hallucination
- No dynamic schema support yet
- Limited scalability (local setup)
- Dynamic schema detection
- RAG-based table retrieval
- Query validation & auto-correction
- Dashboard & charts
- Cloud deployment
- Prompt engineering for structured tasks
- LLM integration with backend systems
- SQL query generation & optimization
- Building end-to-end AI systems
This project demonstrates how AI can bridge the gap between users and data systems by enabling natural language querying over structured databases.
Mazhar Kakar

