Home
Ilya Popov edited this page Jun 2, 2026 · 4 revisions
conda create -n krakenparser pip -y
conda activate krakenparser
pip install krakenparser
! KrakenParser -h
Usage: KrakenParser [OPTIONS] COMMAND [ARGS]...
KrakenParser: Convert Kraken2 Reports to CSV and analyze microbial diversity.
To execute the full pipeline automatically, just use the global options.
Alternatively, you can run specific parts of the pipeline manually in the
following order:
mpa → combine → split → process → csv → relabund → diversity
Each step behaves as an independent tool. Type 'krakenparser <command> --help'
to see options for a specific step.
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --input       -i           PATH     Directory containing Kraken2 report      │
│                                     files.                                   │
│ --output      -o           PATH     Output directory.                        │
│ --viruses     -viruses              Extract only VIRUSES domain taxa in      │
│                                     the pipeline.                            │
│ --bacteria    -bacteria             Extract only BACTERIA domain taxa in     │
│                                     the pipeline.                            │
│ --fungi       -fungi                Extract only FUNGI kingdom taxa in       │
│                                     the pipeline.                            │
│ --archaea     -archaea              Extract only ARCHAEA domain taxa in      │
│                                     the pipeline.                            │
│ --keep-human  -keep-human           Do not filter human-related taxa.        │
│ --version     -V                    Show version and exit.                   │
│ --depth       -d           INTEGER  Rarefaction depth for β-diversity.       │
│                                     [default: 1000]                          │
│ --seed        -s           INTEGER  Random seed for reproducible             │
│                                     rarefaction.                             │
│ --overwrite   -overwrite            Overwrite the output directory if it     │
│                                     already exists.                          │
│ --help        -h                    Show this message and exit.              │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Advanced (Step-by-step pipeline control) ───────────────────────────────────╮
│ mpa        Convert a Kraken2 report to MetaPhlAn (MPA) format.               │
│ combine    Combine MPA files into a single tab-delimited table.              │
│ split      Split a combined MPA table into per-rank TXT files.               │
│ process    Reads a source file, processes its first line, modifies taxa      │
│            names in a destination file, and updates it.                      │
│ csv        Reads a TXT file, reorganizes the data, and converts it into a    │
│            CSV file.                                                         │
│ relabund   Calculates taxa relative abundance and saves it to a CSV file.    │
│ diversity  Calculate α & β-diversities for microbial communities.            │
╰──────────────────────────────────────────────────────────────────────────────╯
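To give a feel for what the `--depth` and `--seed` options control, here is a minimal Python sketch of rarefaction followed by two β-diversity measures. It is an illustration under stated assumptions, not KrakenParser's own code: the function names, the choice of sampling without replacement, the presence/absence form of Jaccard, and the read counts are all hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=None):
    """Subsample `depth` reads without replacement from a taxon -> count mapping."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError(f"sample has {total} reads, fewer than depth {depth}")
    # Expand the counts into one label per read, then draw without replacement.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return dict(Counter(rng.sample(reads, depth)))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 1 - 2 * shared abundance / total abundance."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 1 - 2 * shared / total


def jaccard(a, b):
    """Jaccard distance on presence/absence: 1 - |A & B| / |A | B|."""
    present_a = {t for t, n in a.items() if n > 0}
    present_b = {t for t, n in b.items() if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical read counts, not taken from the demo data.
sample_1 = {"Escherichia coli": 1200, "Salmonella enterica": 800, "Bacteroides fragilis": 500}
sample_2 = {"Escherichia coli": 900, "Klebsiella pneumoniae": 1100}

r1 = rarefy(sample_1, depth=1000, seed=42)
r2 = rarefy(sample_2, depth=1000, seed=42)
print(bray_curtis(r1, r2), jaccard(r1, r2))
```

Rarefying both samples to the same depth before comparing them keeps differences in sequencing effort from inflating the distances, and reusing the same seed yields the same subsample on every run.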
First, download the demo data
! wget https://github.com/PopovIILab/KrakenParser/raw/refs/heads/dev/demo_data.zip && \
unzip demo_data.zip && rm -rf demo_data.zip
Then run KrakenParser
! KrakenParser -i demo_data/kreports/ -o results
The resulting output files:
results/
├─ counts/
│  ├─ counts_species.csv
│  ├─ counts_genus.csv
│  ├─ ...
│  └─ counts_phylum.csv
├─ rel_abund/
│  ├─ ra_species.csv
│  ├─ ra_genus.csv
│  ├─ ...
│  └─ ra_phylum.csv
├─ diversity/
│  ├─ alpha_div.csv
│  ├─ beta_div_bray.csv
│  └─ beta_div_jaccard.csv
├─ intermediate/
│  ├─ mpa/
│  │  ├─ {sample}.txt
│  │  └─ ...
│  ├─ COMBINED.txt
│  └─ txt/
│     ├─ counts_species.txt
│     ├─ counts_genus.txt
│     ├─ ...
│     └─ counts_phylum.txt
└─ krakenparser.log
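To check which tables a run actually produced, the output directory can be walked from Python. The helper below is a small sketch using only the standard library. `list_outputs` is a hypothetical name, and the `results` path simply matches the `-o results` argument used above.

```python
from pathlib import Path


def list_outputs(root, pattern="*.csv"):
    """Return files under `root` matching `pattern`, grouped by parent folder name."""
    root = Path(root)
    grouped = {}
    if not root.is_dir():
        # Nothing to report if the pipeline has not been run yet.
        return grouped
    for path in sorted(root.rglob(pattern)):
        grouped.setdefault(path.parent.name, []).append(path.name)
    return grouped


for folder, files in list_outputs("results").items():
    print(f"{folder}/: {', '.join(files)}")
```

Passing `pattern="*.txt"` would list the intermediate files instead.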
Then group low-abundance (<4.0%) taxa at the species level
! KrakenParser relabund -i results/counts/counts_species.csv -o results/rel_abund/ra_species_4.csv -O 4
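To make the effect of the threshold concrete, the following Python sketch pools every taxon whose relative abundance falls below a cutoff into a single "Other" row. It is an assumption about the general idea, not KrakenParser's implementation: `group_low_abundance`, the strict `<` comparison, the row ordering, and the counts are all hypothetical.

```python
def group_low_abundance(counts, threshold_pct):
    """Convert raw counts to percentages and pool taxa below `threshold_pct`."""
    total = sum(counts.values())
    if total == 0:
        return {}
    kept = {}
    other = 0.0
    for taxon, n in counts.items():
        pct = 100 * n / total
        if pct < threshold_pct:
            other += pct  # below the cutoff: add to the pooled row
        else:
            kept[taxon] = pct
    # Remaining taxa in descending order of relative abundance.
    result = dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))
    if other > 0:
        # Put the pooled row first.
        result = {f"Other (<{threshold_pct:.1f}%)": other, **result}
    return result


# Hypothetical counts for one sample, not taken from the demo data.
counts = {
    "Escherichia coli": 500,
    "Salmonella enterica": 300,
    "Taxon A": 150,
    "Taxon B": 30,
    "Taxon C": 20,
}
grouped = group_low_abundance(counts, threshold_pct=4)
for taxon, pct in grouped.items():
    print(f"{taxon},{pct}")
```

Because the pooled row absorbs everything below the cutoff, the percentages for a sample still sum to 100.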
The resulting results/rel_abund/ra_species_4.csv file:
Sample_id,taxon,rel_abund_perc
X1,Other (<4.0%),42.686249843331495
X1,Escherichia coli,12.75490546506176
X1,Haemophilus ducreyi,10.075525825933164
X1,Salmonella enterica,9.632211973651838
X1,Staphylococcus aureus,8.720517307808358
X1,Klebsiella pneumoniae,7.072132502100518
X1,Bacteroides fragilis,4.549653472470442
X1,Morganella morganii,4.508803609642424
X2,Other (<4.0%),46.04232622442552
X2,Morganella morganii,16.5117268346954
X2,Escherichia coli,15.349350872456869
X2,Salmonella enterica,10.345628717042564
X2,Klebsiella pneumoniae,7.063587503374482
X2,Haemophilus ducreyi,4.68737984800517
...
X8,Bartonella krasnovii,25.39596964453076
X8,Pediococcus pentosaceus,9.254716232772273
X8,Latilactobacillus sakei,8.688709539996534
X8,Staphylococcus aureus,7.8557850790470285
X8,Bacteroides fragilis,5.997643719867084
X8,Escherichia coli,5.454406567029118
X8,Haemophilus ducreyi,5.004744878322517
X9,Other (<4.0%),46.9787738378817
X9,Escherichia coli,18.22909641989718
X9,Salmonella enterica,10.944105961952845
X9,Klebsiella pneumoniae,9.848017408277853
X9,Staphylococcus aureus,7.364958937831675
X9,Haemophilus ducreyi,6.635047434158754
This file is used as the input throughout the visualization API documentation later on.
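Before plotting, it can help to sanity-check the long-format table. The sketch below parses the `Sample_id,taxon,rel_abund_perc` layout with Python's standard `csv` module and confirms that each sample sums to roughly 100%. The rows embedded here are the X1 and X2 rows from the excerpt above, and `load_rel_abund` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# X1 and X2 rows copied from the ra_species_4.csv excerpt on this page.
CSV_TEXT = """\
Sample_id,taxon,rel_abund_perc
X1,Other (<4.0%),42.686249843331495
X1,Escherichia coli,12.75490546506176
X1,Haemophilus ducreyi,10.075525825933164
X1,Salmonella enterica,9.632211973651838
X1,Staphylococcus aureus,8.720517307808358
X1,Klebsiella pneumoniae,7.072132502100518
X1,Bacteroides fragilis,4.549653472470442
X1,Morganella morganii,4.508803609642424
X2,Other (<4.0%),46.04232622442552
X2,Morganella morganii,16.5117268346954
X2,Escherichia coli,15.349350872456869
X2,Salmonella enterica,10.345628717042564
X2,Klebsiella pneumoniae,7.063587503374482
X2,Haemophilus ducreyi,4.68737984800517
"""


def load_rel_abund(handle):
    """Read the long-format table into {sample: {taxon: percentage}}."""
    table = defaultdict(dict)
    for row in csv.DictReader(handle):
        table[row["Sample_id"]][row["taxon"]] = float(row["rel_abund_perc"])
    return dict(table)


table = load_rel_abund(io.StringIO(CSV_TEXT))
for sample, taxa in table.items():
    # Most abundant named taxon, ignoring the pooled "Other" row.
    top_taxon = max((t for t in taxa if not t.startswith("Other")), key=taxa.get)
    print(sample, round(sum(taxa.values()), 6), top_taxon)
```

To run the same check on the real file, pass an open file handle instead, for example `open("results/rel_abund/ra_species_4.csv", newline="")`.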