HealthNLP.org is home to natural language processing (NLP) projects related to health, such as mining the clinical narrative of the electronic medical record.
HealthNLP currently hosts more than 20 public repositories across several projects, including:
HNLP-TimeNorm : Provides models for finding natural language expressions of dates and times and converting them to a normalized form.
LGT-SACT : Extracts and normalizes temporal information from clinical notes using fine-tuned LLMs, specifically for Systemic Anti-Cancer Therapy (SACT) timelines.
chemoTimelines Docker : Dockerizable source code for the baseline system for the Chemotherapy Treatment Timelines Extraction from the Clinical Narrative shared task.
chemoTimelines Eval : Evaluation code for ChemoTimelines 2025.
rt-ctae-eval : Evaluation and annotation adjudication tool for the ACS-CTAE Label Studio project, using lseval as a backend.
lseval : Basic version of the core functionality we use with anaforatools, but for Label Studio annotations.
radiotherapy_end2end : An end-to-end natural language processing system for automatically extracting radiotherapy events from clinical texts.
ctae_pre_annotation : cTAKES module for generating Label Studio pre-annotation JSON from clinical text for the CTAE project.
acs-lung-cns-eda : Exploratory data analysis (EDA) and note counts for ACS project lung cancer patients with at least one CNS adverse event.
acs-lung-cardiac-eda : Exploratory data analysis (EDA) and note counts for ACS project lung cancer patients with at least one cardiac event.
bwrobitterman_label_studio_setup : Repository for managing Label Studio startup scripts and related files.
rt-signature-docker : DeepPhe RT Signature Docker, with fixes for running it and for Label Studio output.
Some of our projects are extensions to other public projects, such as Apache cTAKES and DeepPhe.