IDS Evaluation Framework

A comprehensive, modular, and configurable framework for evaluating Machine Learning-based Intrusion Detection Systems (IDS).

[Figure: evaluation pipeline]

Features

  • Modular Plugin Architecture: Easily extend the framework with custom IDS models, metrics, and adversarial attacks
  • Flexible Data Pipeline: Load, preprocess, and split datasets with configurable preprocessing steps and feature selection
  • Multiple Evaluation Modes: Support for intra-dataset, cross-dataset, and k-fold cross-validation evaluation
  • Comprehensive Metrics: Built-in static metrics (accuracy, F1, precision, recall, ROC-AUC, etc.) and runtime metrics (CPU, RAM, training time)
  • Adversarial Robustness Testing: Evaluate model robustness against adversarial attacks (FGSM, noise perturbation, junk data injection)
  • Reproducible Results: Hash-based output organization ensures consistent experiment tracking
  • Flexible Deployment: Run natively with Python or via Docker

Installation

Prerequisites

  • Python 3.13+
  • uv (recommended) or pip

Python Installation

pip3 install ids-evaluation-framework

Native Installation

# Install dependencies (uv should be in your $PATH)
uv sync

# Verify installation
uv run ids-eval version

Docker Installation

# Configure environment variables
cp .env.example .env
# Edit .env to set your data paths

# Run via Docker Compose
docker compose run --rm ids-eval version

A pre-built Docker image is available on Docker Hub: niklassandhu/ids-eval-framework:latest

Quick Start

1. Create a Configuration File

Copy the example configuration and adjust it to your needs:

cp examples/run_config/example.config.yml examples/run_config/my_config.yml

2. Prepare Your Data

Run the data preparation pipeline:

uv run ids-eval dataset <run_config>

3. Run Evaluation

Execute the evaluation pipeline:

uv run ids-eval evaluate <run_config>

Usage

CLI Commands

The framework provides two main commands:

Command                          Description
ids-eval dataset <config.yml>    Run dataset pipeline
ids-eval evaluate <config.yml>   Run evaluation pipeline

Evaluation Flags

Flag                  Description
--train-only          Only train models, skip testing phase
--force-train         Force retraining, ignore saved models
--force-model         Load saved models without config hash validation
--clear-checkpoints   Clear evaluation checkpoints before running
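
When evaluation runs are launched from a script, it can help to assemble the command line in one place. The following is a minimal sketch, not part of the framework: the helper `build_evaluate_command` is hypothetical, and it assumes that flags may be combined and placed after the config path, which the text above does not confirm. Only the flag names are taken from the table.

```python
import shlex

# Flag names come from the table above; the mapping and helper are hypothetical.
EVALUATION_FLAGS = {
    "train_only": "--train-only",
    "force_train": "--force-train",
    "force_model": "--force-model",
    "clear_checkpoints": "--clear-checkpoints",
}


def build_evaluate_command(config_path, **options):
    """Return an argv list for `uv run ids-eval evaluate` with the chosen flags."""
    unknown = set(options) - set(EVALUATION_FLAGS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    command = ["uv", "run", "ids-eval", "evaluate", config_path]
    # Append flags in a fixed order so that logged commands stay comparable.
    for name, flag in EVALUATION_FLAGS.items():
        if options.get(name):
            command.append(flag)
    return command


command = build_evaluate_command(
    "examples/run_config/my_config.yml",
    force_train=True,
    clear_checkpoints=True,
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once `uv` and the framework are installed.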

Makefile Targets

make dataset CONFIG=<config.yml>          # Run dataset pipeline
make evaluate CONFIG=<config.yml>         # Run evaluation pipeline
make docker-dataset CONFIG=<config.yml>   # Run dataset pipeline via Docker
make docker-evaluate CONFIG=<config.yml>  # Run evaluation via Docker
make help                                 # Show all available targets

Configuration

The framework uses YAML configuration files. See examples/run_config/example.config.yml for a fully documented example.

Key Configuration Sections

  • general: Run name, paths, random seed
  • data_manager: Dataset loading, preprocessing, feature selection, train/test split
  • evaluation: IDS models, metrics, adversarial attacks

Output Structure

All outputs are organized in hash-based directories for reproducibility:

out/
├── processed_datasets/<hash>/    # Preprocessed datasets
├── saved_models/<hash>/          # Trained models
└── reports/<hash>/               # Evaluation reports
    ├── config.yaml               # Configuration used
    ├── dataset_report.yaml       # Dataset statistics
    ├── ids_report.yaml           # Detailed evaluation results
    └── evaluation_summary.yaml   # Aggregated summary

The configuration hash is displayed at startup:

Your config hash is: a1b2c3d4
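
The idea behind hash-based output directories can be sketched as follows. This is an assumption-laden illustration, not the framework's actual hashing scheme: the choice of SHA-256 over canonical JSON, the 8-character truncation, and the helpers `config_hash` and `output_dirs` are all hypothetical. Only the directory names come from the tree above.

```python
import hashlib
import json
from pathlib import Path


def config_hash(config, length=8):
    """Derive a short, order-independent digest from a configuration dict."""
    # Sorting keys makes the digest independent of key order in the file.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def output_dirs(config, root="out"):
    """Map each output category to its hash-based directory."""
    digest = config_hash(config)
    categories = ("processed_datasets", "saved_models", "reports")
    return {name: Path(root) / name / digest for name in categories}


# Hypothetical configurations that differ only in key order.
first = {"general": {"seed": 42, "name": "demo"}}
second = {"general": {"name": "demo", "seed": 42}}

print(config_hash(first) == config_hash(second))  # True
print(output_dirs(first)["reports"])
```

Because identical settings map to the same directory, a rerun with an unchanged configuration can find its earlier datasets and models, while any changed setting lands in a fresh directory.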

Plugin Development

The framework supports four types of plugins:

Plugin Type           Directory                Base Class
IDS Models            plugin_ids/              AbstractIDSConnector
Static Metrics        plugin_static_metric/    AbstractStaticMetric
Runtime Metrics       plugin_runtime_metric/   AbstractRuntimeMetric
Adversarial Attacks   plugin_adversarial/      AbstractAdversarialAttack

See the existing plugins in each directory for implementation examples.
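
The general shape of a metric plugin might look like the sketch below. It does not use the framework's real interface: `StaticMetricSketch` is a local stand-in for AbstractStaticMetric, and the `name` attribute and `compute(y_true, y_pred)` method are assumptions made for illustration. Check the existing plugins for the actual method names and signatures.

```python
from abc import ABC, abstractmethod


class StaticMetricSketch(ABC):
    """Hypothetical stand-in for AbstractStaticMetric; the real API may differ."""

    name: str

    @abstractmethod
    def compute(self, y_true, y_pred):
        """Return a single score from true and predicted binary labels."""


class FalsePositiveRate(StaticMetricSketch):
    """Share of benign samples (label 0) that were flagged as attacks (label 1)."""

    name = "false_positive_rate"

    def compute(self, y_true, y_pred):
        pairs = list(zip(y_true, y_pred))
        false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
        true_negatives = sum(1 for t, p in pairs if t == 0 and p == 0)
        negatives = false_positives + true_negatives
        # Avoid division by zero when there are no benign samples.
        return false_positives / negatives if negatives else 0.0


metric = FalsePositiveRate()
score = metric.compute([0, 0, 1, 1, 0], [0, 1, 1, 0, 0])
print(f"{metric.name}: {score:.3f}")
```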

Development

make setup      # Install dependencies
make test       # Run tests
make lint       # Check code style
make format     # Format code

Additional Information

BibTeX entry

Please cite this project using the following BibTeX entry:

@inproceedings{}

Creating Issues

If you find bugs, bad patterns, performance issues, or similar problems, do not hesitate to open an issue.
Any new feature proposed for the evaluation must be backed by peer-reviewed publications. The same applies to new examples: all existing examples reproduce published work, except for the baseline models.

License

See LICENSE for details.
