Clinical Bioinformatician
I build production pipelines for clinical informatics — variant calling, NGS QC, GWASes, and the infrastructure that ties them to electronic health records. Most of my work lives in private organizational repos; the summary below reflects aggregated activity across all of them.
Covers a rolling 12-month window across public repos and private organizational repos. Private figures are aggregated counts only: no repository names, code, or commit messages are exposed. Stats are broken down by organization below the summary table.
Last updated: 2026-05-11 · Rolling 12-month window
| | Public repos | Private org repos |
|---|---|---|
| Repos / repos touched | 47 | 28 |
| Commits | — | 658 |
| Lines added | — | 4,444,084 |
| Lines deleted | — | 644,479 |
| Net lines | — | 3,799,605 |
| Top language | Python | Python |
| Language | Share | |
|---|---|---|
| Python | 59.5% | ██████████████████░░░░░░░░░░░░ |
| Nextflow | 36.8% | ███████████░░░░░░░░░░░░░░░░░░░ |
| R | 2.8% | █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ |
| Dockerfile | 0.7% | ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ |
| JavaScript | 0.2% | ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ |
| Awk | 0.0% | ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ |
| Organization | Repos touched | Commits | Lines added | Lines deleted | Net lines |
|---|---|---|---|---|---|
| ALL | 28 | 658 | 4,444,084 | 644,479 | 3,799,605 |
| Setia-Verma-Lab | 1 | 6 | 111,309 | 115 | 111,194 |
| Verma-Lab | 4 | 66 | 1,364,019 | 40,773 | 1,323,246 |
| drivas-lab | 5 | 46 | 1,023,353 | 425,862 | 597,491 |
| PMBB-Informatics-and-Genomics | 18 | 540 | 1,945,403 | 177,729 | 1,767,674 |
Showing non-archived repositories with activity in the last 3 years, grouped by owner.
repo for my github profile Python · ★ 0 · ⑂ 0 · last push 2026-05-04
Run PrediXcan on PMBB V3 Unknown · ★ 0 · ⑂ 0 · last push 2025-07-16
testing for dnanexus nextflow pipelines Python · ★ 0 · ⑂ 0 · last push 2024-10-20
phenotyping pipeline Python · ★ 0 · ⑂ 0 · last push 2026-05-04
Association Test Manhattan Plotting Package Python · ★ 2 · ⑂ 1 · last push 2026-04-10
saige family toolkit for nextflow pipelines Nextflow · ★ 3 · ⑂ 4 · last push 2026-03-24
No description. Python · ★ 0 · ⑂ 0 · last push 2026-03-24
This repository serves as a landing page of our repositories for each of the modules we have developed within the PMBB-Informatics working group. Contents include: Links to repos, Example config files, and Set-up instructions Nextflow · ★ 5 · ⑂ 0 · last push 2026-03-24
No description. Nextflow · ★ 0 · ⑂ 0 · last push 2026-01-21
No description. Nextflow · ★ 1 · ⑂ 0 · last push 2026-01-21
No description. Nextflow · ★ 1 · ⑂ 1 · last push 2026-01-21
Plink Clump Nextflow · ★ 0 · ⑂ 0 · last push 2025-12-17
repository for exwas meta analysis Python · ★ 0 · ⑂ 0 · last push 2025-12-17
Run VEP using Nextflow Unknown · ★ 0 · ⑂ 0 · last push 2025-08-21
Workshop Materials for our PSB 2025 Workshop: Command Line to PipeLine Jupyter Notebook · ★ 0 · ⑂ 0 · last push 2025-01-05
No description. Python · ★ 0 · ⑂ 0 · last push 2024-07-19
No description. Jupyter Notebook · ★ 0 · ⑂ 0 · last push 2026-04-14
No description. Unknown · ★ 0 · ⑂ 0 · last push 2026-03-03
No description. Jupyter Notebook · ★ 0 · ⑂ 0 · last push 2025-11-19
This repository includes code used to generate data and figures in this publication R · ★ 0 · ⑂ 0 · last push 2025-11-14
Source code accompanying the PSB Submission titled "DRIVE-KG: Enhancing variant-phenotype association discovery in understudied complex diseases using heterogeneous knowledge graphs" Jupyter Notebook · ★ 0 · ⑂ 0 · last push 2025-10-01
Code accompanying the manuscript: "Enhancing genetic association power in endometriosis through unsupervised clustering of clinical subtypes identified from electronic health records". Topics: biobanks, genetic-association-study, genomics, precision-medicine, unsupervised-clustering. Jupyter Notebook · ★ 1 · ⑂ 0 · last push 2025-09-05
No description. Unknown · ★ 0 · ⑂ 0 · last push 2025-08-01
No description. Unknown · ★ 0 · ⑂ 0 · last push 2025-06-02
No description. Unknown · ★ 0 · ⑂ 0 · last push 2025-04-15
Source code for Endo_RuleBased_Phenotyping Python · ★ 0 · ⑂ 0 · last push 2024-12-11
This project aims to investigate genetic associations with hepatocellular carcinoma (HCC) using whole-exome sequencing (WES) and whole-genome sequencing (WGS) data from multiple cohorts. The study has two phases: I: Six-gene burden association analysis and II: Exome-wide association analysis Python · ★ 0 · ⑂ 0 · last push 2026-04-23
No description. Jupyter Notebook · ★ 0 · ⑂ 0 · last push 2026-03-11
Investigating sex-specific genetic effects on hepatic fat accumulation and liver health across menopausal status using CT-derived imaging phenotypes from Penn Medicine Biobank Python · ★ 0 · ⑂ 0 · last push 2026-03-03
No description. Python · ★ 0 · ⑂ 0 · last push 2025-08-29
Platlas Frontend Repo JavaScript · ★ 0 · ⑂ 0 · last push 2025-06-16
No description. JavaScript · ★ 2 · ⑂ 0 · last push 2025-05-30
platlas backend repo JavaScript · ★ 0 · ⑂ 0 · last push 2025-05-08
Penn Med LPC Scientific Brain Trust Workshop Materials Shell · ★ 0 · ⑂ 0 · last push 2025-03-19
No description. Python · ★ 0 · ⑂ 0 · last push 2025-02-14
No description. Rust · ★ 0 · ⑂ 1 · last push 2026-04-12
CiliaTGD is an R package wrapper for CiliaQ that automates high-throughput image segmentation, interactive editing, and downstream analysis of cilia using FIJI’s CiliaQ suite. R · ★ 0 · ⑂ 0 · last push 2025-12-16
No description. Unknown · ★ 0 · ⑂ 0 · last push 2025-07-09
The stats above are generated by a GitHub Action that runs every Monday.
It queries the GitHub API with a scoped PAT, aggregates commit and LOC counts, and commits a stats.json file.
A second step in the same job rewrites this README between the comment markers.
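
The aggregation and README-rewrite steps described above could look roughly like the sketch below. It is an illustration only, not the actual workflow code: the marker strings, the per-repo record fields (`org`, `commits`, `added`, `deleted`), and all function names are assumptions made for the example, and the GitHub API calls that would produce the per-repo records are left out.

```python
import re

# Hypothetical marker strings; the real README may use different comments.
START = "<!-- STATS:START -->"
END = "<!-- STATS:END -->"


def aggregate(per_repo):
    """Sum per-repo counts into per-organization totals plus an ALL row."""
    totals = {}
    for row in per_repo:
        # Each repo contributes to its own organization and to the ALL row.
        for key in ("ALL", row["org"]):
            t = totals.setdefault(
                key, {"repos": 0, "commits": 0, "added": 0, "deleted": 0}
            )
            t["repos"] += 1
            t["commits"] += row["commits"]
            t["added"] += row["added"]
            t["deleted"] += row["deleted"]
    for t in totals.values():
        t["net"] = t["added"] - t["deleted"]
    return totals


def render_table(totals):
    """Render the totals as a Markdown table with thousands separators."""
    lines = [
        "| Organization | Repos touched | Commits | Lines added | Lines deleted | Net lines |",
        "|---|---|---|---|---|---|",
    ]
    for org, t in totals.items():
        lines.append(
            f"| {org} | {t['repos']} | {t['commits']} | "
            f"{t['added']:,} | {t['deleted']:,} | {t['net']:,} |"
        )
    return "\n".join(lines)


def replace_between_markers(readme, block):
    """Swap whatever sits between START and END for the new block."""
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    replacement = f"{START}\n{block}\n{END}"
    # A function replacement avoids backslash interpretation in the block.
    new_text, count = pattern.subn(lambda _match: replacement, readme, count=1)
    if count == 0:
        raise ValueError("stats markers not found in README")
    return new_text
```

In this sketch, only aggregated counts ever reach the rendered table, which matches the privacy note at the top: repository names stay inside the per-repo records and are never written out. Raising an error when the markers are missing is a deliberate choice here, so a malformed README fails the job instead of being silently overwritten.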