🩺 Hi-CliTr: Cognitive Radiology Report Generation

Python 3.8+ · PyTorch 2.0+ · License: MIT · Maintenance

A deep learning framework for automated chest X-ray report generation that implements cognitive simulation, inspired by the Hi-CliTr framework.


📖 Overview

Hi-CliTr (Hierarchical Cross-modal Cognitive Transformer) is designed to address "reader fatigue" in radiology by acting as an intelligent "Second Reader". Unlike standard image captioning models, Hi-CliTr simulates the cognitive workflow of a radiologist:

  1. Perceives anatomical structures at multiple scales (Organ → Region → Pixel).
  2. Reasons about potential pathologies using a knowledge graph.
  3. Verifies its findings against the image before generating the final report.

This project implements the core components: PRO-FA (Progressive Feature Alignment), MIX-MLP (Knowledge-Enhanced Classification), and RCTA (Triangular Cognitive Attention).


🌟 Key Features

PRO-FA implements hierarchical visual perception via a Swin Transformer backbone and aligns multi-scale features with the RadLex medical ontology:

  • Organ-level (4×4): Global anatomical awareness.
  • Region-level (7×7): Lobe/region specific features.
  • Pixel-level (7×7): Fine-grained lesion details.
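
To make the grid sizes concrete, the sketch below average-pools a single-channel feature map down to coarser grids in plain Python. The 28×28 input size and the function name `grid_pool` are assumptions for illustration only; the repository's encoder.py may derive its scales differently.

```python
def grid_pool(fmap, out_size):
    """Average-pool a square 2D feature map (list of lists) to out_size x out_size.

    Assumes the input side length is divisible by out_size.
    """
    n = len(fmap)
    if n % out_size != 0:
        raise ValueError("input size must be divisible by out_size")
    k = n // out_size  # side length of each pooling window
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            window = [fmap[i * k + di][j * k + dj]
                      for di in range(k) for dj in range(k)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Hypothetical 28x28 feature map whose value at (r, c) is r + c
fmap = [[float(r + c) for c in range(28)] for r in range(28)]

organ = grid_pool(fmap, 4)   # coarse 4x4 grid for organ-level context
region = grid_pool(fmap, 7)  # finer 7x7 grid for region-level features
print(len(organ), len(organ[0]), len(region), len(region[0]))
```

Coarser grids average over larger windows, so each cell summarizes a bigger part of the image at the cost of spatial detail.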

MIX-MLP is a dual-path, knowledge-enhanced architecture for disease classification:

  • Residual Path: Efficient feature flow for common cases.
  • Expansion Path: Captures complex disease patterns and co-occurrences.
  • CheXpert: Multi-label classification over the 14 CheXpert labels.
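
A multi-label head usually emits one logit per label, and each logit is thresholded independently. The sketch below shows that decoding step for the 14 CheXpert observations. The helper name `decode_labels`, the 0.5 threshold, and the example logits are assumptions, not values taken from classifier.py.

```python
import math

# The 14 observation labels defined by the CheXpert labeler
CHEXPERT_LABELS = [
    "No Finding", "Enlarged Cardiomediastinum", "Cardiomegaly",
    "Lung Opacity", "Lung Lesion", "Edema", "Consolidation",
    "Pneumonia", "Atelectasis", "Pneumothorax", "Pleural Effusion",
    "Pleural Other", "Fracture", "Support Devices",
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_labels(logits, threshold=0.5):
    """Return the labels whose sigmoid probability reaches the threshold."""
    if len(logits) != len(CHEXPERT_LABELS):
        raise ValueError("expected one logit per CheXpert label")
    return [name for name, z in zip(CHEXPERT_LABELS, logits)
            if sigmoid(z) >= threshold]


# Hypothetical logits: only two labels are pushed above the threshold
logits = [-3.0] * 14
logits[2] = 2.0
logits[10] = 0.7
predicted = decode_labels(logits)
print(predicted)
```

Because each label is decided on its own, several findings can be reported for the same image, which is what distinguishes multi-label from single-class prediction.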

RCTA is a 3-stage closed-loop verification system that mimics clinical reasoning:

  1. Image → Text: Creates context from visual features and clinical indication.
  2. Context → Labels: Formulates a diagnostic hypothesis.
  3. Labels → Image: Verifies the hypothesis against visual evidence.
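
One way to picture the three stages is as three chained attention steps. The sketch below uses single-query scaled dot-product attention in plain Python; it is an illustration of the idea under that assumption, not the implementation in decoder.py. The function names and the toy vectors are hypothetical.

```python
import math


def attend(query, keys, values):
    """Single-query scaled dot-product attention over a list of vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


def triangular_pass(image_feats, text_query, label_embeds):
    """Chain three attention steps: image->text, context->labels, labels->image."""
    # Stage 1: the clinical indication queries the image to build a context
    context, _ = attend(text_query, image_feats, image_feats)
    # Stage 2: the context queries label embeddings to form a hypothesis
    hypothesis, label_weights = attend(context, label_embeds, label_embeds)
    # Stage 3: the hypothesis queries the image again to check visual evidence
    verified, evidence_weights = attend(hypothesis, image_feats, image_feats)
    return verified, label_weights, evidence_weights


# Toy 2-dimensional inputs (all values are made up for illustration)
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_query = [0.5, 0.5]
label_embeds = [[1.0, 0.0], [0.0, 1.0]]

verified, label_w, evidence_w = triangular_pass(image_feats, text_query, label_embeds)
print(len(verified), round(sum(label_w), 6), round(sum(evidence_w), 6))
```

The third step is what closes the loop: the hypothesis is not passed on directly but is re-grounded in the image features first.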

The report generator uses a GPT-2 Medium backbone (355M params) to produce structured reports (Findings & Impression), conditioned on the cognitive states from RCTA.


🏗️ Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    COGNITIVE RADIOLOGY MODEL                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────────┐  │
│  │   CXR       │    │  Clinical   │    │                         │  │
│  │  Images     │    │ Indication  │    │     Generated Report    │  │
│  │  (PA/LAT)   │    │   Text      │    │  ┌──────────────────┐   │  │
│  └──────┬──────┘    └──────┬──────┘    │  │ FINDINGS:        │   │  │
│         │                  │           │  │ Heart size normal│   │  │
│         ▼                  │           │  │ Lungs are clear  │   │  │
│  ┌─────────────────────┐   │           │  ├──────────────────┤   │  │
│  │      PRO-FA         │   │           │  │ IMPRESSION:      │   │  │
│  │ ┌─────┬─────┬─────┐ │   │           │  │ No acute         │   │  │
│  │ │Organ│Regio│Pixel│ │   │           │  │ abnormality      │   │  │
│  │ │ 4x4 │ 7x7 │ 7x7 │ │   │           │  └──────────────────┘   │  │
│  │ └──┬──┴──┬──┴──┬──┘ │   │           │                         │  │
│  │    └─────┼─────┘    │   │           └─────────────────────────┘  │
│  │    RadLex Align     │   │                      ▲                 │
│  └──────────┬──────────┘   │                      │                 │
│             │              │           ┌──────────┴──────────┐      │
│             ▼              │           │   Report Generator  │      │
│  ┌─────────────────────┐   │           │      (GPT-2)        │      │
│  │      MIX-MLP        │   │           └──────────┬──────────┘      │
│  │ ┌─────────────────┐ │   │                      ▲                 │
│  │ │  Residual Path  │ │   │                      │                 │
│  │ ├─────────────────┤ │   │           ┌──────────┴──────────┐      │
│  │ │ Expansion Path  │ │   │           │       RCTA          │      │
│  │ └────────┬────────┘ │   │           │  ┌──────────────┐   │      │
│  │          ▼          │   │           │  │ Image→Text   │   │      │
│  │   14 CheXpert Labels│   └──────────►│  │ Text→Labels  │   │      │
│  │   (Multi-label F1)  │───────────────│  │ Labels→Image │   │      │
│  └─────────────────────┘               │  └──────────────┘   │      │
│                                        └─────────────────────┘      │
└─────────────────────────────────────────────────────────────────────┘

📁 Directory Structure

BrainDead-Solution/
├── data/                       # Data management
│   ├── __init__.py
│   ├── download_iu_xray.py     # Scripts for downloading & preprocessing IU-Xray
│   ├── dataset.py              # PyTorch Dataset definitions (MIMIC-CXR, IU-Xray)
│   └── sanity_check.py         # Data integrity verification script
├── models/                     # Core model components
│   ├── __init__.py
│   ├── encoder.py              # PRO-FA: Multi-scale ViT + RadLex Alignment
│   ├── classifier.py           # MIX-MLP: Knowledge-enhanced classifier
│   ├── decoder.py              # RCTA + GPT-2 Decoder
│   └── model.py                # Unified Hi-CliTr Model assembly
├── training/                   # Training logic
│   ├── __init__.py
│   └── trainer.py              # Training loop, validation, and saving
├── evaluation/                 # Metrics and evaluation
│   ├── __init__.py
│   └── metrics.py              # CheXpert F1, BLEU, CIDEr, RadGraph F1
├── notebooks/
│   └── inference_demo.ipynb    # Interactive Jupyter notebook for demo
├── static/                     # Web app static assets (CSS/JS)
├── templates/                  # Web app HTML templates
├── app.py                      # Flask Web Application entry point
├── config.py                   # Centralized configuration file
├── requirements.txt            # Python dependencies
├── problem_statement.md        # Original hackathon problem statement
└── README.md                   # Project documentation

🚀 Quick Start

1. Installation

Clone the repository and install dependencies:

# Clone repository
git clone https://github.com/your-username/braindead-solution.git
cd braindead-solution

# Create virtual environment (Recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Dataset Setup

This project uses the IU-Xray dataset for benchmarking and public usage. A helper script is provided to download and prepare it.

# Verify integrity of existing data or download
python data/download_iu_xray.py --verify

# Preprocess dataset (create splits and metadata)
python data/download_iu_xray.py --preprocess

# Run a sanity check to ensure everything is loadable
python data/sanity_check.py

Note: For MIMIC-CXR, you must have credentialed access via PhysioNet. Place the dataset in data/mimic_cxr if available.
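
The preprocessing step creates dataset splits. As a rough illustration of what a reproducible split can look like, the sketch below shuffles study identifiers with a fixed seed. The 70/10/20 ratio, the seed, the ID format, and the function name `split_ids` are all assumptions; the actual logic lives in data/download_iu_xray.py and may differ.

```python
import random


def split_ids(study_ids, val_frac=0.1, test_frac=0.2, seed=42):
    """Shuffle study IDs reproducibly and cut them into train/val/test lists."""
    ids = sorted(study_ids)            # sort first so input order does not matter
    random.Random(seed).shuffle(ids)   # local RNG keeps the global state untouched
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }


# Hypothetical study identifiers
studies = [f"study_{i:04d}" for i in range(100)]
splits = split_ids(studies)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting by study rather than by image keeps frontal and lateral views of the same patient visit in the same partition, which avoids leakage between train and test.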

3. Training

Train the model from scratch using the trainer.py script. You can configure hyperparameters in config.py or pass them as arguments.

# Standard training run
python training/trainer.py \
    --max_epochs 30 \
    --batch_size 8 \
    --learning_rate 1e-4

# Fast dev run (sanity check training loop)
python training/trainer.py --fast_dev_run
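
The flags shown in these commands could be wired to configuration defaults roughly as follows. This is a minimal sketch using the standard library's argparse; the `TrainingDefaults` class and `parse_args` function are hypothetical and are not taken from training/trainer.py.

```python
import argparse
from dataclasses import dataclass


@dataclass
class TrainingDefaults:
    # Values mirror the example command above; the class itself is hypothetical
    max_epochs: int = 30
    batch_size: int = 8
    learning_rate: float = 1e-4


def parse_args(argv=None):
    defaults = TrainingDefaults()
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("--max_epochs", type=int, default=defaults.max_epochs)
    parser.add_argument("--batch_size", type=int, default=defaults.batch_size)
    parser.add_argument("--learning_rate", type=float, default=defaults.learning_rate)
    parser.add_argument("--fast_dev_run", action="store_true",
                        help="run a single short pass to smoke-test the loop")
    return parser.parse_args(argv)


args = parse_args(["--batch_size", "16", "--fast_dev_run"])
print(args.max_epochs, args.batch_size, args.learning_rate, args.fast_dev_run)
```

Any flag left off the command line falls back to its default, so the configuration file stays the single source of truth for unchanged values.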

4. Web Application (Demo)

Launch the interactive web interface to generate reports for uploaded X-rays.

python app.py

Open http://localhost:5000 in your browser.

  • Upload a Chest X-ray image.
  • (Optional) Enter clinical indication (e.g., "Fever and cough").
  • View the generated Findings and Impression.

5. Inference (CLI / Notebook)

You can also run inference programmatically:

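import torch

from models.model import create_model

# Use the GPU when one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and restore the trained weights
model = create_model(pretrained=True, device=device)
checkpoint = torch.load("checkpoints/best.pt", map_location=device)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Generate a report
with torch.no_grad():
    result = model.generate_report(
        images="path/to/xray.png",
        indication="Patient with shortness of breath",
    )
print(result["reports"][0])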

See notebooks/inference_demo.ipynb for a complete walkthrough.
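
If the generated report arrives as a single string with FINDINGS: and IMPRESSION: headers, as in the architecture diagram, it can be split into sections for downstream use. The sketch below assumes that header format and uses a hypothetical helper named `split_report`; the model's actual output format may differ.

```python
import re


def split_report(text):
    """Split a generated report into its FINDINGS and IMPRESSION sections."""
    sections = {"findings": "", "impression": ""}
    # Capture each header and the text that follows it up to the next header
    pattern = re.compile(
        r"(FINDINGS|IMPRESSION):\s*(.*?)(?=(?:FINDINGS|IMPRESSION):|\Z)",
        re.IGNORECASE | re.DOTALL,
    )
    for header, body in pattern.findall(text):
        sections[header.lower()] = " ".join(body.split())
    return sections


report = (
    "FINDINGS: Heart size normal. Lungs are clear.\n"
    "IMPRESSION: No acute abnormality."
)
parsed = split_report(report)
print(parsed["findings"])
print(parsed["impression"])
```

A missing section simply stays an empty string, so callers can check for incomplete reports without handling exceptions.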


⚙️ Configuration

The config.py file controls all aspects of the model and training. Key sections:

  • DataConfig: Paths, image size (224x224), sequence lengths.
  • EncoderConfig: Swin Transformer settings, RadLex concept count.
  • ClassifierConfig: CheXpert labels, loss weights.
  • DecoderConfig: GPT-2 settings, beam search parameters (k=4).
  • TrainingConfig: Learning rate, batch size, mixed precision (AMP) settings.
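
A grouped configuration like this is often expressed with dataclasses. The sketch below covers three of the sections; the field names, the `Config` wrapper, and any value marked as assumed are illustrative and may not match config.py.

```python
from dataclasses import dataclass, field


@dataclass
class DataConfig:
    image_size: int = 224          # 224x224 inputs, as listed above
    max_report_length: int = 256   # assumed value, not taken from the repository


@dataclass
class DecoderConfig:
    model_name: str = "gpt2-medium"
    num_beams: int = 4             # beam search width k=4


@dataclass
class TrainingConfig:
    learning_rate: float = 1e-4
    batch_size: int = 8
    use_amp: bool = True           # mixed precision; the default here is assumed


@dataclass
class Config:
    data: DataConfig = field(default_factory=DataConfig)
    decoder: DecoderConfig = field(default_factory=DecoderConfig)
    training: TrainingConfig = field(default_factory=TrainingConfig)


config = Config()
config.training.batch_size = 4     # override one field for a smaller GPU
print(config.data.image_size, config.decoder.num_beams, config.training.batch_size)
```

Using `default_factory` gives every `Config` instance its own nested objects, so overriding a field in one run does not leak into another.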

📊 Performance Targets

| Metric | IU-Xray Test | Target | Description |
|---|---|---|---|
| CheXpert Micro F1 | TBD | > 0.500 | Clinical accuracy of disease detection |
| RadGraph F1 | TBD | > 0.500 | Semantic relation accuracy |
| CIDEr | TBD | > 0.400 | Text generation consensus metric |
| BLEU-4 | TBD | > 0.100 | N-gram overlap precision |
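
For reference, micro-averaged F1 pools true positives, false positives, and false negatives across every (study, label) pair before computing a single score. The sketch below is a dependency-free illustration with made-up labels; it is not the code in evaluation/metrics.py.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, label) pairs of binary matrices."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two hypothetical studies with three labels each
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
score = micro_f1(y_true, y_pred)
print(round(score, 3))
```

Because counts are pooled before averaging, frequent labels dominate the micro score, whereas a macro average would weight every label equally.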

🀝 Contributing

Contributions are welcome!

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📝 Citation

If you use this code for your research, please cite:

@inproceedings{braindead2026,
  title={Hi-CliTr: Cognitive Radiology Report Generation},
  author={Team BrainDead},
  booktitle={ML Hackathon 2026},
  year={2026}
}

📄 License

Distributed under the MIT License. See LICENSE for more information.


Made with 🧠 and ❤️ by Team BrainDead for ML Hackathon 2026
"Pushing the boundaries of Cognitive Simulation in Radiology"
