CellPathway: Cell type-specific enhancer informs pathway-based approaches for association analysis in whole-genome sequencing studies

CellPathway is a framework that directly tests associations between noncoding variants and cell type-specific pathways defined by enhancer activity.

Installation

Step 1: Clone the repository

git clone https://github.com/WGLab/CellPathway.git
cd CellPathway

Step 2: Create a conda environment

conda create -n cellpathway python=3.10 -y
conda activate cellpathway

Step 3: Install bedtools and pyBigWig via conda

conda install -c bioconda bedtools pybigwig -y

Step 4: Install remaining Python dependencies

pip install -r requirements.txt

Quick start

Step 5: Run enrichment analysis

python cellpathway_enrich.py \
    --enhancer-dir data/Atlas \
    --dnm-file example/autism_dnm.txt \
    --output-dir example \
    --cadd-threshold 10
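
To make the enrichment step more concrete, the sketch below counts how many DNMs passing a CADD cutoff fall inside enhancer intervals for one cell type. It is an illustration in plain Python, not the code in `cellpathway_enrich.py`. The function names (`merge_intervals`, `count_overlaps`), the toy coordinates, the BED-style half-open interval convention, and the inclusive `>=` threshold are all assumptions.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def count_overlaps(dnms, enhancers, cadd_threshold=10.0):
    """Count DNMs with CADD_PHRED >= threshold that overlap an enhancer.

    dnms:      iterable of (chrom, start, end, cadd_phred)
    enhancers: dict mapping chrom -> list of (start, end)
    Both use half-open coordinates here (an assumption of this sketch).
    """
    merged = {c: merge_intervals(iv) for c, iv in enhancers.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}
    hits = 0
    for chrom, start, end, cadd in dnms:
        if cadd < cadd_threshold or chrom not in merged:
            continue
        # Rightmost merged enhancer whose start lies before the DNM's end.
        i = bisect_right(starts[chrom], end - 1) - 1
        if i >= 0 and merged[chrom][i][1] > start:
            hits += 1
    return hits

# Hypothetical toy data, not taken from the repository.
enhancers = {"chr1": [(100, 200), (150, 300)], "chr2": [(1000, 1100)]}
dnms = [
    ("chr1", 120, 121, 15.2),   # inside an enhancer, passes the cutoff
    ("chr1", 250, 251, 8.0),    # inside an enhancer, below the cutoff
    ("chr1", 300, 301, 22.0),   # just past the enhancer end
    ("chr2", 1050, 1051, 10.0), # inside an enhancer, exactly at the cutoff
    ("chr3", 5, 6, 30.0),       # chromosome without enhancers
]
hits = count_overlaps(dnms, enhancers, cadd_threshold=10)
print(hits)
```

Merging the enhancers first keeps each DNM from being counted more than once when enhancer records overlap.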

Step 6: TAD annotation

After enrichment, run TAD annotation for a specific cell type using the overlap BED file generated in the previous step:

python cellpathway_tad.py \
    --overlap-bed example/dnm_enhc_overlap_cadd_10/Fetal_brain_dnm.bed \
    --tad-file data/tad_w_boundary_08.bed \
    --elements-bb data/genes_w_noncoding.bb \
    --gene-list example/SFARI_Gene.csv \
    --output example/tad_Fetal_brain_autism.csv

The --overlap-bed path comes from the enrichment output. Replace Fetal_brain with any cell type from your results.
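
When looping over several cell types, the overlap BED path and the TAD command can be assembled programmatically. The sketch below infers the naming pattern from the single example path shown above (`<output-dir>/dnm_enhc_overlap_cadd_<threshold>/<cell_type>_dnm.bed`), so treat the pattern as an assumption and check it against your own enrichment output. The helper names `overlap_bed_path` and `tad_command` are hypothetical; the command-line flags are the ones used in Step 6.

```python
from pathlib import Path

def overlap_bed_path(output_dir, cell_type, cadd_threshold):
    """Build the overlap BED path, assuming the pattern seen in the example."""
    # Note: a float threshold such as 10.0 would yield "cadd_10.0".
    folder = f"dnm_enhc_overlap_cadd_{cadd_threshold}"
    return Path(output_dir) / folder / f"{cell_type}_dnm.bed"

def tad_command(output_dir, cell_type, cadd_threshold):
    """Return the Step 6 command as an argument list for one cell type."""
    overlap = overlap_bed_path(output_dir, cell_type, cadd_threshold)
    output = Path(output_dir) / f"tad_{cell_type}_autism.csv"
    return [
        "python", "cellpathway_tad.py",
        "--overlap-bed", overlap.as_posix(),
        "--tad-file", "data/tad_w_boundary_08.bed",
        "--elements-bb", "data/genes_w_noncoding.bb",
        "--gene-list", "example/SFARI_Gene.csv",
        "--output", output.as_posix(),
    ]

cmd = tad_command("example", "Fetal_brain", 10)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.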

Input data

| File | Description |
| --- | --- |
| `example/autism_dnm.txt` | De novo mutations with CADD scores (tab-delimited: Chr, Start, End, CADD_PHRED) |
| `data/Atlas/` | Cell type-specific enhancer BED files (51 tissues, `*.hg38.bed`) |
| `data/Single_cell/` | Single-cell enhancer BED files (169 cell types, `*.hg38.bed`) |
| `data/tad_w_boundary_08.bed` | TAD regions and boundaries (hg38) |
| `data/genes_w_noncoding.bb` | Gene annotations in bigBed format (`.bb`) |
| `example/SFARI_Gene.csv` | SFARI autism-associated gene list |
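
Before running the pipeline on your own variants, it can help to check that the DNM file matches the four-column layout described above. The following is a minimal sketch, assuming the file has a header row with exactly the names `Chr`, `Start`, `End`, and `CADD_PHRED`; the README does not state whether a header is present, so adjust accordingly. The function name `read_dnms` and the toy rows are hypothetical.

```python
import csv
import io

REQUIRED = ("Chr", "Start", "End", "CADD_PHRED")

def read_dnms(handle):
    """Parse a tab-delimited DNM table into (chrom, start, end, cadd) tuples."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    # Data starts on line 2 because line 1 is the assumed header.
    for line_no, row in enumerate(reader, start=2):
        start, end = int(row["Start"]), int(row["End"])
        if end < start:
            raise ValueError(f"line {line_no}: End is smaller than Start")
        rows.append((row["Chr"], start, end, float(row["CADD_PHRED"])))
    return rows

# Hypothetical toy input standing in for a file such as example/autism_dnm.txt.
toy = (
    "Chr\tStart\tEnd\tCADD_PHRED\n"
    "chr1\t12345\t12345\t14.7\n"
    "chr2\t500\t500\t3.2\n"
)
dnms = read_dnms(io.StringIO(toy))
print(dnms)
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`.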

Output

- Enrichment table (`enrichment_FC_<cadd>.csv`): cell type, enhancer bp, DNM overlap count, fold enrichment, p-value, FDR-adjusted p-value
- TAD annotation (`tad_<cell>_autism.csv`): enhancer-to-TAD mapping with gene names and known disease gene overlaps
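
The enrichment table reports an FDR-adjusted p-value per cell type. The README does not say which correction is applied; the sketch below shows the Benjamini-Hochberg procedure, a common choice, purely as an assumption so that the column is easier to interpret. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four cell types.
pvals = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(pvals)
print(adj)
```

Each adjusted value is at least as large as the raw p-value, and the ordering of cell types by significance is preserved.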

Preprocessing

Before running CellPathway, noncoding de novo mutations must be prepared for all samples. We recommend using ANNOVAR to identify and filter rare noncoding variants and to annotate variant pathogenicity using Combined Annotation Dependent Depletion (CADD) scores.

ANNOVAR
CADD
