LivLilli/lupus-alberto

🐺 🇮🇹 🏥 Lupus Alberto

BERT-based fine-tuning for SLE information extraction from Italian clinical reports.

This repository contains the code used for the study Lupus Alberto: A Transformer-Based Approach for SLE Information Extraction from Italian Clinical Reports (Lilli et al., 2024). The fine-tuning experiments build on AlBERTo, the Italian BERT model introduced by Polignano et al. (2019).

How to use

  1. Install requirements
pip install -r requirements.txt
  2. Configure the experiments

Edit the JSON files in the configs folder before running the scripts:

  • configs/experiment_config.json controls the fine-tuning and evaluation loop: domain, target categories, base models, number of epochs, and checkpoint number used for evaluation.
  • configs/benchmark_config.json controls the benchmark aggregation: domain, target categories, model names, and metrics to extract from the result files.
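As a rough illustration of the kind of settings described above, the sketch below loads a JSON config and checks that the expected entries are present. The key names (`domain`, `target_categories`, `base_models`, `num_epochs`, `eval_checkpoint`) and the helper `load_experiment_config` are hypothetical: the README lists what the file controls, not how the keys are spelled, so check the actual file in the configs folder.

```python
import json
from pathlib import Path

# Hypothetical key names, chosen only to mirror the settings listed above.
# The real configs/experiment_config.json may spell them differently.
REQUIRED_KEYS = {
    "domain",
    "target_categories",
    "base_models",
    "num_epochs",
    "eval_checkpoint",
}


def load_experiment_config(path):
    """Load a JSON config and fail early if an expected key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise KeyError(f"missing config keys: {sorted(missing)}")
    return config
```

Validating the file before launching a run is a cheap way to catch a typo in the config before any fine-tuning time is spent.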
  3. Fine-tuning and evaluation
python experiments/run_experiments.py

The run_experiments.py file reads its settings from configs/experiment_config.json and runs the following sequence of scripts: finetune.py, evaluation.py, and evaluation_results.py.
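The sequence described above can be pictured as a small driver that launches each script in order and stops at the first failure. This is only a sketch of that idea, not the repository's actual run_experiments.py: the script names come from this README, while the `experiments` directory for the three scripts, the absence of command-line arguments, and the helpers `build_commands` and `run_pipeline` are assumptions.

```python
import subprocess
import sys

# Script names are taken from the README; their location and how they
# receive settings are assumptions made for this sketch.
PIPELINE = ["finetune.py", "evaluation.py", "evaluation_results.py"]


def build_commands(scripts, script_dir="experiments"):
    """Build one interpreter command per script, preserving the order."""
    return [[sys.executable, f"{script_dir}/{name}"] for name in scripts]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order; check=True aborts on a non-zero exit code."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_commands(PIPELINE))
```

Passing `check=True` means a failed fine-tuning step raises `subprocess.CalledProcessError`, so evaluation never runs against a missing or partial checkpoint.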

  4. Benchmark analysis
python benchmark_analysis/compare_results.py

The compare_results.py file reads its settings from configs/benchmark_config.json.
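To make the aggregation step more concrete, the sketch below selects a chosen set of metrics for a chosen set of models from already-loaded results. The format of the result files is not documented in this README, so the nested-dictionary layout, the function name `extract_metrics`, and the placeholder model and metric names are all assumptions for illustration, not the behavior of compare_results.py.

```python
def extract_metrics(results, model_names, metrics):
    """Pick the requested metrics for each requested model.

    `results` is assumed to map model name -> {metric name: value}.
    Metrics or models that are absent are reported as None, so a gap
    in the result files stays visible instead of being dropped.
    """
    table = {}
    for model in model_names:
        scores = results.get(model, {})
        table[model] = {metric: scores.get(metric) for metric in metrics}
    return table


if __name__ == "__main__":
    # Placeholder values only; these are not results from the study.
    toy_results = {"model_a": {"f1": 0.5, "precision": 0.4}}
    print(extract_metrics(toy_results, ["model_a", "model_b"], ["f1"]))
```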

Citation

If you use this repository, please cite:

@inproceedings{lilli-etal-2024-lupus,
    title = "Lupus Alberto: A Transformer-Based Approach for {SLE} Information Extraction from {I}talian Clinical Reports",
    author = "Lilli, Livia and Antenucci, Laura and Ortolan, Augusta and Bosello, Silvia Laura and D'agostino, Maria Antonietta and Patarnello, Stefano and Masciocchi, Carlotta and Lenkowicz, Jacopo",
    booktitle = "Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)",
    year = "2024",
    address = "Pisa, Italy",
    publisher = "CEUR Workshop Proceedings",
    pages = "510--516",
    url = "https://aclanthology.org/2024.clicit-1.60/"
}

Please also cite the original AlBERTo model:

@inproceedings{polignano-etal-2019-alberto,
    title = "{A}l{BER}To: {I}talian {BERT} Language Understanding Model for {NLP} Challenging Tasks Based on Tweets",
    author = "Polignano, Marco and Basile, Pierpaolo and de Gemmis, Marco and Semeraro, Giovanni and Basile, Valerio",
    booktitle = "Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)",
    year = "2019",
    address = "Bari, Italy",
    publisher = "CEUR Workshop Proceedings",
    pages = "312--317",
    url = "https://aclanthology.org/2019.clicit-1.47/"
}
