Ilya Popov edited this page Jun 2, 2026 · 4 revisions

KrakenParser Usage Guide

INSTALLATION

conda create -n krakenparser pip -y
conda activate krakenparser
pip install krakenparser

HELP

! KrakenParser -h
Usage: KrakenParser [OPTIONS] COMMAND [ARGS]...                                
                                                                                
 KrakenParser: Convert Kraken2 Reports to CSV and analyze microbial diversity.  
                                                                                
 To execute the full pipeline automatically, just use the global options.       
                                                                                
 Alternatively, you can run specific parts of the pipeline manually in the      
 following order:                                                               
                                                                                
 mpa ➔ combine ➔ split ➔ process ➔ csv ➔ relabund ➔ diversity
                                                                                
 Each step behaves as an independent tool. Type 'krakenparser <command> --help' 
 to see options for a specific step.                                            
                                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --input       -i               PATH     Directory containing Kraken2 report  │
│                                         files.                               │
│ --output      -o               PATH     Output directory.                    │
│ --viruses     -viruses                  Extract only VIRUSES domain taxa in  │
│                                         the pipeline.                        │
│ --bacteria    -bacteria                 Extract only BACTERIA domain taxa in │
│                                         the pipeline.                        │
│ --fungi       -fungi                    Extract only FUNGI kingdom taxa in   │
│                                         the pipeline.                        │
│ --archaea     -archaea                  Extract only ARCHAEA domain taxa in  │
│                                         the pipeline.                        │
│ --keep-human  -keep-human               Do not filter human-related taxa.    │
│ --version     -V                        Show version and exit.               │
│ --depth       -d               INTEGER  Rarefaction depth for β-diversity.   │
│                                         [default: 1000]                      │
│ --seed        -s               INTEGER  Random seed for reproducible         │
│                                         rarefaction.                         │
│ --overwrite   -overwrite                Overwrite the output directory if it │
│                                         already exists.                      │
│ --help        -h                        Show this message and exit.          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Advanced (Step-by-step pipeline control) ───────────────────────────────────╮
│ mpa        Convert a Kraken2 report to MetaPhlAn (MPA) format.               │
│ combine    Combine MPA files into a single tab-delimited table.              │
│ split      Split a combined MPA table into per-rank TXT files.               │
│ process    Reads a source file, processes its first line, modifies taxa      │
│            names in a destination file, and updates it.                      │
│ csv        Reads a TXT file, reorganizes the data, and converts it into a    │
│            CSV file.                                                         │
│ relabund   Calculates taxa relative abundance and saves it to a CSV file.    │
│ diversity  Calculate α & β-diversities for microbial communities.            │
╰──────────────────────────────────────────────────────────────────────────────╯
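
The --depth and --seed options control rarefaction, that is, subsampling every sample to the same number of reads before β-diversity is computed. The Python sketch below illustrates the general idea only. It is not KrakenParser's implementation: the function name `rarefy`, the read counts, and the choice to raise an error for samples with fewer reads than the depth are all assumptions made for illustration.

```python
import random


def rarefy(counts, depth, seed=None):
    """Draw `depth` reads without replacement from a {taxon: read_count} dict.

    Hypothetical helper for illustration; KrakenParser may handle
    shallow samples differently (assumption: we raise an error here).
    """
    total = sum(counts.values())
    if total < depth:
        raise ValueError(f"sample has {total} reads, fewer than depth {depth}")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # One pool entry per read, then sample without replacement.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


# Made-up read counts for a single sample.
sample = {
    "Escherichia coli": 1500,
    "Salmonella enterica": 900,
    "Klebsiella pneumoniae": 600,
}
a = rarefy(sample, depth=1000, seed=42)
b = rarefy(sample, depth=1000, seed=42)
print(sum(a.values()), a == b)  # prints: 1000 True
```

Running the draw twice with the same seed gives identical counts, which is the reason to set --seed when results need to be reproducible.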

COMPLETE PIPELINE USAGE EXAMPLE

First, download the demo data

Input

! wget https://github.com/PopovIILab/KrakenParser/raw/refs/heads/dev/demo_data.zip && \
    unzip demo_data.zip && rm -rf demo_data.zip

Then run KrakenParser

Input

! KrakenParser -i demo_data/kreports/ -o results

The resulting output files:

results/
├─ counts/
│  ├─ counts_species.csv
│  ├─ counts_genus.csv
│  ├─ ...
│  └─ counts_phylum.csv
├─ rel_abund/
│  ├─ ra_species.csv
│  ├─ ra_genus.csv
│  ├─ ...
│  └─ ra_phylum.csv
├─ diversity/
│  ├─ alpha_div.csv
│  ├─ beta_div_bray.csv
│  └─ beta_div_jaccard.csv
├─ intermediate/
│  ├─ mpa/
│  │  ├─ {sample}.txt
│  │  ├─ ...
│  ├─ COMBINED.txt
│  └─ txt/
│     ├─ counts_species.txt
│     ├─ counts_genus.txt
│     ├─ ...
│     └─ counts_phylum.txt
└─ krakenparser.log
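
The two β-diversity file names suggest Bray-Curtis and Jaccard dissimilarities. For readers unfamiliar with these measures, here is a minimal Python sketch of their textbook definitions. The exact variants KrakenParser computes (for example, whether Jaccard is taken on presence/absence) are an assumption, and the function names and counts are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two {taxon: count} dicts."""
    taxa = set(u) | set(v)
    shared = sum(min(u.get(t, 0), v.get(t, 0)) for t in taxa)
    total = sum(u.values()) + sum(v.values())
    return 1 - 2 * shared / total  # undefined if both samples are empty


def jaccard(u, v):
    """Jaccard distance on presence/absence of taxa (assumed variant)."""
    present_u = {t for t, n in u.items() if n > 0}
    present_v = {t for t, n in v.items() if n > 0}
    return 1 - len(present_u & present_v) / len(present_u | present_v)


# Made-up counts for two samples sharing one taxon.
x = {"A": 6, "B": 4}
y = {"A": 2, "C": 8}
print(round(bray_curtis(x, y), 4))  # prints: 0.8
print(round(jaccard(x, y), 4))      # prints: 0.6667
```

Both values range from 0 (identical composition) to 1 (no taxa in common).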

Then group low-abundance (<4.0%) taxa at the species level

! KrakenParser relabund -i results/counts/counts_species.csv -o results/rel_abund/ra_species_4.csv -O 4
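
To make the effect of the -O threshold concrete, the following Python sketch converts counts to percentages and lumps every taxon below the threshold into a single "Other" row. It is an illustration under assumptions, not KrakenParser's code: the function name, the strict "below threshold" comparison, the descending sort, and the counts are all hypothetical.

```python
def relative_abundance(counts, other_threshold=4.0):
    """Return {taxon: percent}, grouping taxa below the threshold as 'Other'."""
    other_threshold = float(other_threshold)
    total = sum(counts.values())
    percents = {taxon: 100 * n / total for taxon, n in counts.items()}
    # Assumption: taxa strictly below the threshold are grouped.
    kept = {t: p for t, p in percents.items() if p >= other_threshold}
    other = sum(p for p in percents.values() if p < other_threshold)
    if other > 0:
        kept[f"Other (<{other_threshold}%)"] = other
    # Largest share first.
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))


# Made-up read counts for one sample (total = 100 reads).
counts = {
    "Escherichia coli": 60,
    "Salmonella enterica": 30,
    "Rare taxon A": 3,
    "Rare taxon B": 3,
    "Rare taxon C": 2,
    "Rare taxon D": 2,
}
result = relative_abundance(counts, other_threshold=4)
for taxon, percent in result.items():
    print(f"{taxon},{percent}")
```

The percentages of a sample still sum to 100 after grouping, because the "Other" row carries the combined share of the taxa it replaces.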

The resulting results/rel_abund/ra_species_4.csv file:

Sample_id,taxon,rel_abund_perc
X1,Other (<4.0%),42.686249843331495
X1,Escherichia coli,12.75490546506176
X1,Haemophilus ducreyi,10.075525825933164
X1,Salmonella enterica,9.632211973651838
X1,Staphylococcus aureus,8.720517307808358
X1,Klebsiella pneumoniae,7.072132502100518
X1,Bacteroides fragilis,4.549653472470442
X1,Morganella morganii,4.508803609642424
X2,Other (<4.0%),46.04232622442552
X2,Morganella morganii,16.5117268346954
X2,Escherichia coli,15.349350872456869
X2,Salmonella enterica,10.345628717042564
X2,Klebsiella pneumoniae,7.063587503374482
X2,Haemophilus ducreyi,4.68737984800517
...
X8,Bartonella krasnovii,25.39596964453076
X8,Pediococcus pentosaceus,9.254716232772273
X8,Latilactobacillus sakei,8.688709539996534
X8,Staphylococcus aureus,7.8557850790470285
X8,Bacteroides fragilis,5.997643719867084
X8,Escherichia coli,5.454406567029118
X8,Haemophilus ducreyi,5.004744878322517
X9,Other (<4.0%),46.9787738378817
X9,Escherichia coli,18.22909641989718
X9,Salmonella enterica,10.944105961952845
X9,Klebsiella pneumoniae,9.848017408277853
X9,Staphylococcus aureus,7.364958937831675
X9,Haemophilus ducreyi,6.635047434158754

This file is used as the input throughout the visualization API documentation.
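
Because the file is in long format (one row per sample and taxon), it can be read with the standard library alone. The sketch below is a hypothetical helper, not part of KrakenParser. It uses the X9 rows shown above as inline data, so it runs without the demo files. To read the real file, pass `open("results/rel_abund/ra_species_4.csv", newline="")` instead.

```python
import csv
import io
from collections import defaultdict


def load_rel_abund(handle):
    """Read the long-format CSV into {sample_id: {taxon: percent}}."""
    table = defaultdict(dict)
    for row in csv.DictReader(handle):
        table[row["Sample_id"]][row["taxon"]] = float(row["rel_abund_perc"])
    return dict(table)


# Inline copy of the X9 rows from the output above.
demo = io.StringIO(
    "Sample_id,taxon,rel_abund_perc\n"
    "X9,Other (<4.0%),46.9787738378817\n"
    "X9,Escherichia coli,18.22909641989718\n"
    "X9,Salmonella enterica,10.944105961952845\n"
    "X9,Klebsiella pneumoniae,9.848017408277853\n"
    "X9,Staphylococcus aureus,7.364958937831675\n"
    "X9,Haemophilus ducreyi,6.635047434158754\n"
)
table = load_rel_abund(demo)

# Most abundant named taxon per sample, ignoring the grouped "Other" row.
for sample_id, taxa in table.items():
    named = {t: p for t, p in taxa.items() if not t.startswith("Other")}
    top = max(named, key=named.get)
    print(sample_id, top, round(named[top], 2))
```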
