Zafar-Lab/PHALCON

PHALCON

Description

PHALCON is a scalable single-cell variant caller designed for high-throughput sequencing data. It is robust to common single-cell sequencing (SCS) errors and enables accurate mutation detection across large numbers of cells within practical runtimes.

PHALCON Workflow

Usage

PHALCON takes as input a read count matrix $(sites \times cells)$ and, optionally, a genotype quality matrix. If your data is in a loom file instead, use the script loomToReadcount.py in the supplementary folder of the main directory; its output files can be fed as input to PHALCON.

Installation

Clone the repo and navigate to the main directory:

git clone https://github.com/Zafar-Lab/PHALCON
cd PHALCON

Create and activate the PHALCON conda environment:

conda env create -f environment.yml
conda activate phalcon

To verify that PHALCON and all dependencies have been installed successfully, run:

python src/phalcon.py --help

This should display the list of available command-line arguments.

Arguments

-i: Input read count file

-o: Output prefix

-r: Minimum read depth threshold (Default : 5)

-a: Alternate frequency threshold (Default : 0.2)

-v: Threshold for proportion of cells with insufficient read count information (Default : 0.5)

-m: Threshold for proportion of sites harboring a mutation (Default : 0.004)

-c: Clustering algorithm to use (Default : "spectral", Options: "spectral" or "leiden")

-s: Seed

Optional Arguments

-g: Input genotype quality file

-gq: Enable genotype quality filter (Default : 0; set to 1 to enable)

-q: Genotype quality threshold (Default : 30)
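To make the depth and frequency thresholds concrete, the sketch below applies the default values of -r, -a, and -v to a toy sites × cells matrix. It is an illustration only and not PHALCON's implementation: the (depth, alt) cell representation, the function names, and the exact comparison operators are all assumptions.

```python
# Illustrative only: names, data layout, and comparison operators are
# assumptions, not PHALCON's actual code.

MIN_DEPTH = 5        # -r default
MIN_ALT_FREQ = 0.2   # -a default
MAX_MISSING = 0.5    # -v default


def has_enough_reads(depth, min_depth=MIN_DEPTH):
    """A cell is informative at a site if its total depth meets the threshold."""
    return depth >= min_depth


def looks_mutated(depth, alt, min_depth=MIN_DEPTH, min_alt_freq=MIN_ALT_FREQ):
    """Candidate mutation: enough reads and a high enough alternate fraction."""
    return (has_enough_reads(depth, min_depth)
            and depth > 0
            and alt / depth >= min_alt_freq)


def keep_site(row, max_missing=MAX_MISSING):
    """Keep a site unless too many cells lack sufficient read depth."""
    missing = sum(not has_enough_reads(depth) for depth, _ in row)
    return missing / len(row) <= max_missing


# Toy matrix: rows are sites, columns are cells, entries are (depth, alt).
matrix = [
    [(10, 4), (8, 0), (2, 1), (12, 6)],   # 1 of 4 cells under-covered
    [(1, 0), (3, 1), (0, 0), (9, 2)],     # 3 of 4 cells under-covered
]

kept = [row for row in matrix if keep_site(row)]
calls = [[looks_mutated(d, a) for d, a in row] for row in kept]
print(len(kept), calls)
```

In this toy input the second site is dropped because three of its four cells fall below the depth threshold, while the first site retains two candidate mutated cells.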

Run PHALCON

The example below runs PHALCON using both a read count matrix (sample_read_count_file.tsv) and a genotype quality matrix (sample_geno_qual_file.tsv) while keeping all other parameters at their default values.

python src/phalcon.py -i ../sample_read_count_file.tsv -g ../sample_geno_qual_file.tsv -gq 1
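When PHALCON is launched from a pipeline rather than by hand, the same invocation can be assembled in Python. The following is a minimal sketch: the helper name build_phalcon_command and the default script path are hypothetical, and only flags that appear in this README (-i, -g, -gq, -o) are used.

```python
import subprocess
import sys


def build_phalcon_command(read_counts, geno_qual=None, out_prefix=None,
                          script="src/phalcon.py"):
    """Assemble the argument list for a PHALCON run (hypothetical helper)."""
    cmd = [sys.executable, script, "-i", str(read_counts)]
    if geno_qual is not None:
        # Pass the genotype quality matrix and switch the filter on.
        cmd += ["-g", str(geno_qual), "-gq", "1"]
    if out_prefix is not None:
        cmd += ["-o", str(out_prefix)]
    return cmd


cmd = build_phalcon_command("../sample_read_count_file.tsv",
                            geno_qual="../sample_geno_qual_file.tsv")
print(" ".join(cmd))

# Uncomment to actually launch PHALCON from the repository root:
# subprocess.run(cmd, check=True)
```

Building the command as a list, rather than a single shell string, avoids quoting problems when file paths contain spaces.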

PHALCON executable (optional)

You may also create a system-wide executable for PHALCON. To create the executable, download the src folder, unzip it on your system, and follow the steps below:

Step 1: Change to the src directory and make the script executable:

chmod +x phalcon.py

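Step 2: Make phalcon available as a system-wide command using one of the following two commands. The first moves the script into /usr/local/bin; the second is an alternative that creates a symlink there and leaves the script in the src folder. Run only one of them, since the symlink cannot be created once the script has been moved:

sudo mv phalcon.py /usr/local/bin/phalcon
sudo ln -s "$(pwd)/phalcon.py" /usr/local/bin/phalcon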

You can then run phalcon from the command line with the input arguments (use --help for the list of arguments).

The example below provides "sample_read_count_file.tsv" and "sample_geno_qual_file.tsv" as input and keeps all other parameters at their default values.

phalcon -i ../sample_read_count_file.tsv -g ../sample_geno_qual_file.tsv -gq 1

Use -gq 0 to disable the genotype quality filter. For a sample run, you can find the input files here:

Output

PHALCON's main outputs are the variant calls for each cell (.vcf format) and the reconstructed phylogeny (.gv format). Auxiliary files, such as the UMAP and cluster labels, are also written.
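The .gv extension denotes a Graphviz DOT file, so the phylogeny can be inspected with standard tooling. The sketch below extracts parent-child edges from DOT text using only the standard library. It assumes simple `a -> b` edge statements; the example graph is made up, and the node names and layout of PHALCON's actual .gv output may differ.

```python
import re

# Matches simple DOT edge statements such as:
#   a -> b;      "n 1" -> "n 2" [label="x"];
# Chained edges (a -> b -> c) are not handled by this sketch.
EDGE_RE = re.compile(r'("[^"]*"|[\w.]+)\s*->\s*("[^"]*"|[\w.]+)')


def read_edges(dot_text):
    """Return (parent, child) pairs for every 'a -> b' statement found."""
    return [(a.strip('"'), b.strip('"')) for a, b in EDGE_RE.findall(dot_text)]


# Toy example standing in for a PHALCON .gv file (contents are made up).
example = '''
digraph tree {
    root -> c1;
    root -> c2 [label="2 mutations"];
    "c2" -> "c3";
}
'''

edges = read_edges(example)

# Group children under each parent to get an adjacency view of the tree.
children = {}
for parent, child in edges:
    children.setdefault(parent, []).append(child)

print(children)
```

To render the tree as an image instead, the Graphviz `dot` command-line tool can be run directly on the .gv file.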

Sample read count and genotype quality files

Tutorials

The Read the Docs documentation for PHALCON is available here - PHALCON-read-the-docs

Tutorial on simulated datasets - PHALCON-on-simulated-data

AML tutorial - AML-67-001

TNBC tutorial - TN4

Help

Run phalcon --help for a description of the parameters, along with their default values.
