A lightweight, interpreted query language for querying multiple data sources using a consistent syntax.
- Zero-Dependency Core: Written entirely in modern C++17 without heavy external data-processing libraries.
- Interactive REPL: Explore your data interactively via the built-in shell.
- Multiple Data Formats: Supports CSV, TSV, JSON, Parquet, SQLite, YAML, and XML out of the box with zero-copy stream parsing.
- Native Parquet Support: Fully native, zero-dependency C++ Parquet reader (via `tinyparquet`), completely replacing previous Python/Pandas dependencies!
- Map-Reduce Parallel Execution: Threaded `GROUP BY` logic and CSV adapter chunk splitting via `std::thread::hardware_concurrency()` for massive multicore throughput.
- Incremental Aggregation: `sum`, `count`, `min`, `max` process inline during streaming, completely eliminating row-caching memory footprints (10M row RAM usage is just 4.5 MB).
- Projection Pushdown: Columns not needed by the AST are skipped entirely at the parser level via read-masks, improving parse speeds dramatically (10M row queries in 3.7 seconds).
- Modular Engine Architecture: The monolithic `runtime.cpp` was cleanly decoupled into 5 focused component files (`evaluator.cpp`, `executor_streaming.cpp`, etc.).
- Lexer & Parser: Custom-written top-down recursive descent parser and custom tokenization engine.
- Advanced SQL Syntax: Full support for `JOIN`, `GROUP BY`, `ORDER BY`, `HAVING`, `LIKE`, subqueries, nested mathematical/string inline functions, and `WITH RECURSIVE` Common Table Expressions powered by native `UNION ALL` resolution loops.
- `LIKE` Filtering: Supports powerful wildcard regex pattern matching out of the box.
- Execution Limits: Strict memory-capping and row-processing timeouts protect the engine against recursive loops and adversarial file structures.
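
The map-reduce and incremental-aggregation bullets above can be illustrated with a minimal sketch. It is not QLE's actual code: the names `Agg`, `Row`, `Groups`, and `parallel_group_by` are hypothetical, and the input is an in-memory vector for brevity, whereas the engine is described as streaming rows from its adapters. Each worker folds its chunk into a private table of running aggregates (map), and the private tables are then merged (reduce), so no rows need to be cached per group.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical running aggregate: constant memory per group, no rows cached.
struct Agg {
    double sum = 0.0;
    std::size_t count = 0;
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();

    // Fold one value into the running totals.
    void add(double v) {
        sum += v;
        ++count;
        min = std::min(min, v);
        max = std::max(max, v);
    }
    // Combine with a partial aggregate computed by another worker.
    void merge(const Agg& o) {
        sum += o.sum;
        count += o.count;
        min = std::min(min, o.min);
        max = std::max(max, o.max);
    }
};

using Row = std::pair<std::string, double>;  // (group key, value)
using Groups = std::unordered_map<std::string, Agg>;

// Map phase: each worker aggregates its own chunk into a private table.
// Reduce phase: the private tables are merged on the calling thread.
Groups parallel_group_by(const std::vector<Row>& rows, unsigned workers = 0) {
    // hardware_concurrency() may return 0 when the count is unknown.
    if (workers == 0) workers = std::max(1u, std::thread::hardware_concurrency());
    std::vector<Groups> partial(workers);
    std::vector<std::thread> pool;
    const std::size_t chunk = (rows.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(rows.size(), w * chunk);
        const std::size_t end = std::min(rows.size(), begin + chunk);
        pool.emplace_back([&rows, &partial, w, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                partial[w][rows[i].first].add(rows[i].second);
        });
    }
    for (auto& t : pool) t.join();

    Groups result;
    for (const auto& p : partial)
        for (const auto& [key, agg] : p) result[key].merge(agg);
    return result;
}
```

Because every worker writes only to its own table, the map phase needs no locks; the cost is one extra merge pass proportional to the number of distinct groups per worker.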
- Usage Guide: Learn the QLE syntax, inline functions, and CLI limits.
- Architecture: Explore the engine's modular adapters and AST.
- Performance Guide: Deep dive into the zero-copy engine, memory streaming, and big-data optimization strategies.
- Comprehensive Benchmarks: Hardware metrics and execution times across all 16 features.
- Examples Library: Explore ready-to-run `.qle` files demonstrating joins, maths, and aggregations.
Requirements: CMake 3.14+, C++17 compiler.

    mkdir build
    cd build
    cmake ..
    make

Run an inline query:

    ./qle "from users.csv where age > 18 select name, email"

Run queries from files:

    ./qle run query1.qle query2.qle

- src/lexer: Tokenization and position tracking.
- src/parser: Syntax validation and AST generation.
- src/ast: Immutable Abstract Syntax Tree nodes.
- src/runtime: Evaluates the AST against data sources.
- src/adapters: Extensible interfaces for CSV, JSON, etc.
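
To make the adapter layer more concrete, here is a rough sketch of what an extensible adapter interface with a projection read-mask could look like. Everything in it is an assumption for illustration: `SourceAdapter`, `SimpleCsvAdapter`, and `scan` are hypothetical names, not QLE's real API, and the toy CSV splitter ignores quoting, escaping, and header rows.

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adapter interface; QLE's real class and method names may differ.
class SourceAdapter {
public:
    using Row = std::vector<std::string>;
    virtual ~SourceAdapter() = default;

    // Streams rows to `sink` one at a time. `read_mask[i]` says whether
    // column i is needed by the query; unneeded columns are left empty.
    virtual void scan(const std::vector<bool>& read_mask,
                      const std::function<void(const Row&)>& sink) = 0;
};

// Toy CSV adapter over an in-memory string (no quoting or escaping).
class SimpleCsvAdapter : public SourceAdapter {
public:
    explicit SimpleCsvAdapter(std::string text) : text_(std::move(text)) {}

    void scan(const std::vector<bool>& read_mask,
              const std::function<void(const Row&)>& sink) override {
        std::istringstream in(text_);
        std::string line;
        Row row;
        while (std::getline(in, line)) {
            row.clear();
            std::size_t start = 0;
            for (std::size_t col = 0;; ++col) {
                const std::size_t comma = line.find(',', start);
                const std::size_t end =
                    comma == std::string::npos ? line.size() : comma;
                // Only copy the field when the query actually needs it.
                const bool wanted = col < read_mask.size() && read_mask[col];
                row.push_back(wanted ? line.substr(start, end - start)
                                     : std::string());
                if (comma == std::string::npos) break;
                start = comma + 1;
            }
            sink(row);  // Hand the row downstream without caching it.
        }
    }

private:
    std::string text_;
};
```

A new format would plug in by subclassing the interface and implementing `scan`; the runtime would only ever see rows arriving through the sink callback.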
- If you encounter any bugs related to the core engine, SQL syntax parsing, execution performance, or native adapters (CSV, JSON, SQLite, XML, etc.), please open an issue here in the QLE repository.
- IMPORTANT: QLE's native Parquet reader is powered by an embedded version of the `tinyparquet` library. Any issues related directly to Parquet reading, missing encodings, unsupported compression codecs, or decoding failures must be reported directly to the `tinyparquet` repository.