Bactopia Tool - `merlin`¶

MinmER assisted species-specific bactopia tool seLectIoN, or Merlin, uses distances based on the RefSeq sketch downloaded by bactopia datasets to automatically run species-specific tools.

Merlin currently knows 16 spells, which cover the following:

Genus/Species	Tools
Escherichia / Shigella	ECTyper, ShigaTyper, ShigEiFinder
Haemophilus	hicap, HpsuisSero
Klebsiella	Kleborate
Legionella	legsta
Listeria	LisSero
Mycobacterium	TBProfiler
Neisseria	meningotype, ngmaster
Pseudomonas	pasty
Salmonella	SeqSero2, SISTR
Staphylococcus	AgrVATE, spaTyper, staphopia-sccmec
Streptococcus	emmtyper, pbptyper, SsuisSero

Merlin is available as an independent Bactopia Tool, or in Bactopia with the --ask_merlin parameter. Even better, if you want to force Merlin to execute all species-specific tools (no matter the distance), you can use --full_merlin. Then all the spells will be unleashed!

Example Usage¶

bactopia --wf merlin \
  --bactopia /path/to/your/bactopia/results

Output Overview¶

Below is the default output structure for the merlin tool. Where possible, the file descriptions below were modified from a tool's description.

<BACTOPIA_DIR>
├── <SAMPLE_NAME>
│   └── tools
│       ├── agrvate
│       │   ├── <SAMPLE_NAME>-agr_gp.tab
│       │   ├── <SAMPLE_NAME>-blastn_log.txt
│       │   ├── <SAMPLE_NAME>-hmm-log.txt
│       │   ├── <SAMPLE_NAME>-hmm.tab
│       │   ├── <SAMPLE_NAME>-summary.tab
│       │   └── logs
│       │       ├── nf-agrvate.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── ectyper
│       │   ├── <SAMPLE_NAME>.tsv
│       │   ├── blast_output_alleles.txt
│       │   └── logs
│       │       ├── ectyper.log
│       │       ├── nf-ectyper.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── emmtyper
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-emmtyper.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── genotyphi
│       │   ├── <SAMPLE_NAME>.csv
│       │   ├── <SAMPLE_NAME>.json
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── genotyphi
│       │       │   ├── nf-genotyphi.{begin,err,log,out,run,sh,trace}
│       │       │   └── versions.yml
│       │       └── mykrobe
│       │           ├── nf-genotyphi.{begin,err,log,out,run,sh,trace}
│       │           └── versions.yml
│       ├── hicap
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-hicap.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── hpsuissero
│       │   ├── <SAMPLE_NAME>_serotyping_res.tsv
│       │   └── logs
│       │       ├── nf-hpsuissero.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── kleborate
│       │   ├── <SAMPLE_NAME>.results.txt
│       │   └── logs
│       │       ├── nf-kleborate.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── legsta
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-legsta.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── mashdist
│       │   └── merlin
│       │       ├── <SAMPLE_NAME>-dist.txt
│       │       └── logs
│       │           ├── nf-mashdist.{begin,err,log,out,run,sh,trace}
│       │           └── versions.yml
│       ├── meningotype
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-meningotype.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── ngmaster
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-ngmaster.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── pasty
│       │   ├── <SAMPLE_NAME>.blastn.tsv
│       │   ├── <SAMPLE_NAME>.details.tsv
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-pasty.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── pbptyper
│       │   ├── <SAMPLE_NAME>-1A.tblastn.tsv
│       │   ├── <SAMPLE_NAME>-2B.tblastn.tsv
│       │   ├── <SAMPLE_NAME>-2X.tblastn.tsv
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-pbptyper.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── seqsero2
│       │   ├── <SAMPLE_NAME>_log.txt
│       │   ├── <SAMPLE_NAME>_result.tsv
│       │   ├── <SAMPLE_NAME>_result.txt
│       │   └── logs
│       │       ├── nf-seqsero2.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── seroba
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-seroba.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── shigatyper
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-shigatyper.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── shigeifinder
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-shigeifinder.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── sistr
│       │   ├── <SAMPLE_NAME>-allele.fasta.gz
│       │   ├── <SAMPLE_NAME>-allele.json.gz
│       │   ├── <SAMPLE_NAME>-cgmlst.csv
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-sistr.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── spatyper
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-spatyper.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── ssuissero
│       │   ├── <SAMPLE_NAME>_serotyping_res.tsv
│       │   └── logs
│       │       ├── nf-ssuissero.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── staphopiasccmec
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-staphopiasccmec.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── stecfinder
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf-stecfinder.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       └── tbprofiler
│           ├── <SAMPLE_NAME>.results.csv
│           ├── <SAMPLE_NAME>.results.json
│           ├── <SAMPLE_NAME>.results.txt
│           ├── bam
│           │   └── <SAMPLE_NAME>.bam
│           ├── logs
│           │   ├── nf-tbprofiler.{begin,err,log,out,run,sh,trace}
│           │   └── versions.yml
│           └── vcf
│               └── <SAMPLE_NAME>.targets.csq.vcf.gz
└── bactopia-runs
    └── merlin-<TIMESTAMP>
        ├── merged-results
        │   ├── agrvate.tsv
        │   ├── ectyper.tsv
        │   ├── emmtyper.tsv
        │   ├── genotyphi.tsv
        │   ├── hicap.tsv
        │   ├── hpsuissero.tsv
        │   ├── kleborate.tsv
        │   ├── legsta.tsv
        │   ├── logs
        │   │   └── <BACTOPIA_TOOL>-concat
        │   │       ├── nf-merged-results.{begin,err,log,out,run,sh,trace}
        │   │       └── versions.yml
        │   ├── meningotype.tsv
        │   ├── ngmaster.tsv
        │   ├── pasty.tsv
        │   ├── pbptyper.tsv
        │   ├── seqsero2.tsv
        │   ├── seroba.tsv
        │   ├── shigatyper.tsv
        │   ├── shigeifinder.tsv
        │   ├── sistr.tsv
        │   ├── spatyper.tsv
        │   ├── ssuissero.tsv
        │   ├── staphopiasccmec.tsv
        │   └── stecfinder.tsv
        └── nf-reports
            ├── merlin-dag.dot
            ├── merlin-report.html
            ├── merlin-timeline.html
            └── merlin-trace.txt

Directory structure might be different

merlin is available as a standalone Bactopia Tool, as well as from the main Bactopia workflow (e.g. through Staphopia or Merlin). If executed from Bactopia, the merlin directory structure might be different, but the output descriptions below still apply.
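Because Merlin only runs the tools that match a sample's species, it can be handy to list which tool directories actually exist for each sample. The sketch below walks the `<SAMPLE_NAME>/tools/` layout shown in the tree above. It builds a small throwaway directory with made-up sample names (`sampleA`, `sampleB`) so it can run anywhere; in practice you would point `bactopia_dir` at your own results.

```shell
# Stand-in for <BACTOPIA_DIR>; the sample names here are hypothetical.
bactopia_dir="$(mktemp -d)"
mkdir -p "$bactopia_dir/sampleA/tools/agrvate/logs" \
         "$bactopia_dir/sampleA/tools/spatyper/logs" \
         "$bactopia_dir/sampleB/tools/kleborate/logs" \
         "$bactopia_dir/bactopia-runs/merlin-example/merged-results"

# Emit one "sample<TAB>tool" line per tool directory found under */tools/.
# bactopia-runs has no tools/ subdirectory, so the glob skips it.
tools_by_sample="$(
  for tool_dir in "$bactopia_dir"/*/tools/*/; do
    [ -d "$tool_dir" ] || continue
    tool="$(basename "$tool_dir")"
    sample="$(basename "$(dirname "$(dirname "$tool_dir")")")"
    printf '%s\t%s\n' "$sample" "$tool"
  done
)"
printf '%s\n' "$tools_by_sample"
```

Note that `mashdist` will also show up in such a listing, since Merlin's Mash results live under `tools/mashdist/merlin`.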

Results¶

Merged Results¶

Below are results that are concatenated into a single file.

Filename	Description
agrvate.tsv	A merged TSV file with `AgrVATE` results from all samples
clermontyping.csv	A merged CSV file with `ClermonTyping` results from all samples
ectyper.tsv	A merged TSV file with `ECTyper` results from all samples
emmtyper.tsv	A merged TSV file with `emmtyper` results from all samples
genotyphi.tsv	A merged TSV file with `genotyphi` results from all samples
hicap.tsv	A merged TSV file with `hicap` results from all samples
hpsuissero.tsv	A merged TSV file with `HpsuisSero` results from all samples
kleborate.tsv	A merged TSV file with `Kleborate` results from all samples
legsta.tsv	A merged TSV file with `legsta` results from all samples
lissero.tsv	A merged TSV file with `LisSero` results from all samples
meningotype.tsv	A merged TSV file with `meningotype` results from all samples
ngmaster.tsv	A merged TSV file with `ngmaster` results from all samples
pasty.tsv	A merged TSV file with `pasty` results from all samples
pbptyper.tsv	A merged TSV file with `pbptyper` results from all samples
seqsero2.tsv	A merged TSV file with `seqsero2` results from all samples
seroba.tsv	A merged TSV file with `seroba` results from all samples
shigapass.csv	A merged CSV file with `ShigaPass` results from all samples
shigatyper.tsv	A merged TSV file with `ShigaTyper` results from all samples
shigeifinder.tsv	A merged TSV file with `ShigEiFinder` results from all samples
sistr.tsv	A merged TSV file with `SISTR` results from all samples
spatyper.tsv	A merged TSV file with `spaTyper` results from all samples
ssuissero.tsv	A merged TSV file with `SsuisSero` results from all samples
staphopiasccmec.tsv	A merged TSV file with `staphopia-sccmec` results from all samples
stecfinder.tsv	A merged TSV file with `stecfinder` results from all samples
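A quick way to see how many samples each tool was run on is to count the rows in each merged file. The following is a minimal sketch that assumes every merged TSV starts with exactly one header line; the two files and their columns are invented stand-ins, not real tool output.

```shell
# Stand-in for bactopia-runs/merlin-<TIMESTAMP>/merged-results
merged_dir="$(mktemp -d)"
# Hypothetical contents: one header line followed by one row per sample.
printf 'sample\tvalue\nsampleA\tx\nsampleB\ty\n' > "$merged_dir/agrvate.tsv"
printf 'sample\tvalue\nsampleC\tz\n' > "$merged_dir/kleborate.tsv"

# Report "tool<TAB>rows" for each merged file, excluding the header line.
summary="$(
  for f in "$merged_dir"/*.tsv; do
    rows=$(( $(wc -l < "$f") - 1 ))
    printf '%s\t%s\n' "$(basename "$f" .tsv)" "$rows"
  done
)"
printf '%s\n' "$summary"
```

Merged files with a `.csv` extension would need the glob adjusted accordingly.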

AgrVATE¶

Below is a description of the per-sample results from AgrVATE.

Extension	Description
-agr_gp.tab	A detailed report for agr kmer matches
-blastn_log.txt	Log files from programs called by `AgrVATE`
-summary.tab	A final summary report for agr typing

ClermonTyping¶

Below is a description of the per-sample results from ClermonTyping.

Filename	Description
<SAMPLE_NAME>.blast.xml	A BLAST XML file with the results of the ClermonTyping analysis
<SAMPLE_NAME>.html	An HTML file with the results of the ClermonTyping analysis
<SAMPLE_NAME>.mash.tsv	A TSV file with the Mash distances
<SAMPLE_NAME>.phylogroups.txt	A TSV file with the final phylogroup assignments

ECTyper¶

Below is a description of the per-sample results from ECTyper.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `ECTyper` result, see ECTyper - Report format for details
blast_output_alleles.txt	Allele report generated from BLAST results

emmtyper¶

Below is a description of the per-sample results from emmtyper.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `emmtyper` result, see emmtyper - Result format for details

hicap¶

Below is a description of the per-sample results from hicap.

Filename	Description
<SAMPLE_NAME>.gbk	GenBank file and cap locus annotations
<SAMPLE_NAME>.svg	Visualization of annotated cap locus
<SAMPLE_NAME>.tsv	A tab-delimited file with `hicap` results

HpsuisSero¶

Below is a description of the per-sample results from HpsuisSero.

Filename	Description
<SAMPLE_NAME>_serotyping_res.tsv	A tab-delimited file with `HpsuisSero` result

GenoTyphi¶

Below is a description of the per-sample results from GenoTyphi. A full description of the GenoTyphi output is available at GenoTyphi - Output

Filename	Description
<SAMPLE_NAME>_predictResults.tsv	A tab-delimited file with `GenoTyphi` results
<SAMPLE_NAME>.csv	The output of `mykrobe predict` in comma-separated format
<SAMPLE_NAME>.json	The output of `mykrobe predict` in JSON format

Kleborate¶

Below is a description of the per-sample results from Kleborate.

Filename	Description
<SAMPLE_NAME>.results.txt	A tab-delimited file with `Kleborate` result, see Kleborate - Example output for more details.

legsta¶

Below is a description of the per-sample results from legsta.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `legsta` result, see legsta - Output for more details

LisSero¶

Below is a description of the per-sample results from LisSero.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `LisSero` results

Mash¶

Below is a description of the per-sample results from Mash.

Filename	Description
<SAMPLE_NAME>-dist.txt	A tab-delimited file with `mash dist` results
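To get a feel for which references fall within Merlin's distance cutoff, the distance file can be filtered with `awk`. This sketch assumes the standard five-column `mash dist` layout (reference, query, distance, p-value, shared hashes) with no header row; the two rows of data are invented, and the layout will differ if `--mash_table` is used.

```shell
# Hypothetical stand-in for <SAMPLE_NAME>-dist.txt
dist_file="$(mktemp)"
printf 'refA.fna\tsample.fna\t0.02\t0\t800/1000\n'      > "$dist_file"
printf 'refB.fna\tsample.fna\t0.25\t1e-10\t50/1000\n'  >> "$dist_file"

# Keep references whose distance (column 3) is at or below the cutoff.
# 0.1 mirrors the documented default of --merlin_dist.
max_dist=0.1
hits="$(awk -F '\t' -v max="$max_dist" '$3 + 0 <= max + 0 { print $1 }' "$dist_file")"
printf '%s\n' "$hits"

rm -f "$dist_file"
```

This only illustrates the thresholding idea; how Merlin maps matching references to species-specific tools is not shown here.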

meningotype¶

Below is a description of the per-sample results from meningotype.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `meningotype` result

ngmaster¶

Below is a description of the per-sample results from ngmaster.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `ngmaster` results

pasty¶

Below is a description of the per-sample results from pasty.

Extension	Description
.blastn.tsv	A tab-delimited file of all blast hits
.details.tsv	A tab-delimited file with details for each serogroup
.tsv	A tab-delimited file with the predicted serogroup

pbptyper¶

Below is a description of the per-sample results from pbptyper.

Extension	Description
.tblastn.tsv	A tab-delimited file of all blast hits
.tsv	A tab-delimited file with the predicted PBP type

SeqSero2¶

Below is a description of the per-sample results from SeqSero2.

Filename	Description
<SAMPLE_NAME>_result.tsv	A tab-delimited file with `SeqSero2` results
<SAMPLE_NAME>_result.txt	A text file with key-value pairs of `SeqSero2` results

Seroba¶

Below is a description of the per-sample results from Seroba. More details about the outputs are available from Seroba - Output.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with the predicted serotype
detailed_serogroup_info.txt	Detailed information about the predicted results

ShigaPass¶

Below is a description of the per-sample results from ShigaPass.

Filename	Description
<SAMPLE_NAME>.csv	A CSV file with the predicted Shigella or EIEC serotype

ShigaTyper¶

Below is a description of the per-sample results from ShigaTyper.

Filename	Description
<SAMPLE_NAME>-hits.tsv	Detailed statistics about each individual gene hit
<SAMPLE_NAME>.tsv	The final predicted serotype by `ShigaTyper`

ShigEiFinder¶

Below is a description of the per-sample results from ShigEiFinder.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with the predicted Shigella or EIEC serotype

SISTR¶

Below is a description of the per-sample results from SISTR.

Filename	Description
<SAMPLE_NAME>-allele.fasta.gz	A FASTA file of the cgMLST allele search results
<SAMPLE_NAME>-allele.json.gz	JSON-formatted cgMLST allele search results, see SISTR - cgMLST search results for more details
<SAMPLE_NAME>-cgmlst.csv	A comma-delimited summary of the cgMLST allele search results
<SAMPLE_NAME>.tsv	A tab-delimited file with `SISTR` results, see SISTR - Primary results for more details

spaTyper¶

Below is a description of the per-sample results from spaTyper.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `spaTyper` result

SsuisSero¶

Below is a description of the per-sample results from SsuisSero.

Filename	Description
<SAMPLE_NAME>_serotyping_res.tsv	A tab-delimited file with `SsuisSero` results

staphopia-sccmec¶

Below is a description of the per-sample results from staphopia-sccmec.

Filename	Description
<SAMPLE_NAME>.tsv	A tab-delimited file with `staphopia-sccmec` results

TBProfiler¶

Below is a description of the per-sample results from TBProfiler.

Filename	Description
<SAMPLE_NAME>.results.csv	A CSV-formatted `TBProfiler` result file of resistance and strain type
<SAMPLE_NAME>.results.json	A JSON-formatted `TBProfiler` result file of resistance and strain type
<SAMPLE_NAME>.results.txt	A text file with `TBProfiler` results
<SAMPLE_NAME>.bam	BAM file with alignment details
<SAMPLE_NAME>.targets.csq.vcf.gz	VCF with variant info against reference genomes

Audit Trail¶

Below are files that can assist you in understanding which parameters and program versions were used.

Logs¶

Each process that is executed will have a folder named logs. In this folder are helpful files for you to review if the need ever arises.

Extension	Description
.begin	An empty file used to designate the process started
.err	Contains STDERR outputs from the process
.log	Contains both STDERR and STDOUT outputs from the process
.out	Contains STDOUT outputs from the process
.run	The script Nextflow uses to stage/unstage files and queue processes based on given profile
.sh	The script executed by bash for the process
.trace	The Nextflow Trace report for the process
versions.yml	A YAML formatted file with program versions
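When something looks off, a reasonable first step is to find which processes wrote anything to STDERR. The sketch below searches the `logs` folders for non-empty `.err` files. It uses a throwaway directory with invented contents so it runs anywhere; a non-empty `.err` file does not necessarily mean the process failed, since many tools write progress messages to STDERR.

```shell
# Stand-in for <BACTOPIA_DIR> with two hypothetical process logs.
bactopia_dir="$(mktemp -d)"
mkdir -p "$bactopia_dir/sampleA/tools/agrvate/logs" \
         "$bactopia_dir/sampleA/tools/spatyper/logs"
: > "$bactopia_dir/sampleA/tools/spatyper/logs/nf-spatyper.err"    # empty
printf 'example warning\n' \
  > "$bactopia_dir/sampleA/tools/agrvate/logs/nf-agrvate.err"      # non-empty

# List only the .err files that contain at least one byte.
nonempty_err="$(find "$bactopia_dir" -type f -path '*/logs/*' -name 'nf-*.err' -size +0)"
printf '%s\n' "$nonempty_err"
```

The matching `.sh` and `.run` files in the same folder show what was executed for that process.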

Nextflow Reports¶

These Nextflow reports provide a great summary of your run. These can be used to optimize resource usage and estimate expected costs if using cloud platforms.

Filename	Description
merlin-dag.dot	The Nextflow DAG visualisation
merlin-report.html	The Nextflow Execution Report
merlin-timeline.html	The Nextflow Timeline Report
merlin-trace.txt	The Nextflow Trace report

Program Versions¶

At the end of each run, the versions.yml files are merged into the files below.

Filename	Description
software_versions.yml	A complete list of programs and versions used by each process
software_versions_mqc.yml	A complete list of programs and versions formatted for MultiQC

Parameters¶

Required Parameters¶

Define where the pipeline should find input data and save output data.

Parameter	Description
`--bactopia`	The path to bactopia results to use as inputs Type: `string`

Filtering Parameters¶

Use these parameters to specify which samples to include or exclude.

Parameter	Description
`--include`	A text file containing sample names (one per line) to include in the analysis Type: `string`
`--exclude`	A text file containing sample names (one per line) to exclude from the analysis Type: `string`
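As an illustration of the expected file format, the sketch below writes a two-sample include list and prints the command that would use it. The sample names are hypothetical, and the results path is the placeholder from the example usage.

```shell
# One sample name per line, as described for --include / --exclude.
include_file="$(mktemp)"
printf '%s\n' sampleA sampleB > "$include_file"

# Print the command for review rather than running it here.
include_cmd="bactopia --wf merlin --bactopia /path/to/your/bactopia/results --include $include_file"
printf '%s\n' "$include_cmd"
```

An `--exclude` file follows the same one-name-per-line layout.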

mashdist Parameters¶

Parameter	Description
`--mash_sketch`	The reference sequence as a Mash Sketch (.msh file) Type: `string`
`--mash_seed`	Seed to provide to the hash function Type: `integer`, Default: `42`
`--mash_table`	Table output (fields will be blank if they do not meet the p-value threshold) Type: `boolean`
`--mash_m`	Minimum copies of each k-mer required to pass noise filter for reads Type: `integer`, Default: `1`
`--mash_w`	Probability threshold for warning about low k-mer size. Type: `number`, Default: `0.01`
`--max_p`	Maximum p-value to report. Type: `number`, Default: `1.0`
`--max_dist`	Maximum distance to report. Type: `number`, Default: `1.0`
`--merlin_dist`	Maximum distance to report when using Merlin. Type: `number`, Default: `0.1`
`--full_merlin`	Go full Merlin and run all species-specific tools, no matter the Mash distance Type: `boolean`
`--use_fastqs`	Query with FASTQs instead of the assemblies Type: `boolean`
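The two Merlin-specific switches in the table above pull in opposite directions: `--merlin_dist` tightens or loosens the distance cutoff, while `--full_merlin` ignores it. A rough sketch of both invocations follows. The `0.05` value is an arbitrary illustration rather than a recommendation, and the commands are only printed so the snippet runs without Bactopia installed.

```shell
# Placeholder path from the example usage.
results=/path/to/your/bactopia/results

# Stricter distance cutoff than the documented default of 0.1.
strict_cmd="bactopia --wf merlin --bactopia $results --merlin_dist 0.05"

# Ignore the distance cutoff and run every species-specific tool.
full_cmd="bactopia --wf merlin --bactopia $results --full_merlin"

# Print both commands for review before running either one.
printf '%s\n' "$strict_cmd" "$full_cmd"
```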

AgrVATE Parameters¶

Parameter	Description
`--typing_only`	agr typing only. Skips agr operon extraction and frameshift detection Type: `boolean`

ClermonTyping Parameters¶

Parameter	Description
`--clermon_threshold`	Do not use contigs under this size Type: `number`

ECTyper Parameters¶

Parameter	Description
`--opid`	Percent identity required for an O antigen allele match Type: `integer`, Default: `90`
`--opcov`	Minimum percent coverage required for an O antigen allele match Type: `integer`, Default: `90`
`--hpid`	Percent identity required for an H antigen allele match Type: `integer`, Default: `95`
`--hpcov`	Minimum percent coverage required for an H antigen allele match Type: `integer`, Default: `50`
`--verify`	Enable E. coli species verification Type: `boolean`
`--print_alleles`	Prints the allele sequences if enabled as the final column Type: `boolean`

emmtyper Parameters¶

Parameter	Description
`--emmtyper_wf`	Workflow for emmtyper to use. Type: `string`, Default: `blast`
`--cluster_distance`	Distance between cluster of matches to consider as different clusters Type: `integer`, Default: `500`
`--percid`	Minimal percent identity of sequence Type: `integer`, Default: `95`
`--culling_limit`	Total hits to return in a position Type: `integer`, Default: `5`
`--mismatch`	Threshold for the number of mismatches to allow in a BLAST hit Type: `integer`, Default: `5`
`--align_diff`	Threshold for difference between alignment length and subject length in BLAST Type: `integer`, Default: `5`
`--gap`	Threshold gap to allow in BLAST hit Type: `integer`, Default: `2`
`--min_perfect`	Minimum size of perfect match at the 3' primer end Type: `integer`, Default: `15`
`--min_good`	Minimum size where there must be 2 matches for each mismatch Type: `integer`, Default: `15`
`--max_size`	Maximum size of PCR product Type: `integer`, Default: `2000`

hicap Parameters¶

Parameter	Description
`--database_dir`	Directory containing locus database Type: `string`
`--model_fp`	Path to prodigal model Type: `string`
`--full_sequence`	Write the full input sequence out to the genbank file rather than just the region surrounding and including the locus Type: `boolean`
`--hicap_debug`	hicap will print debug messages Type: `boolean`
`--gene_coverage`	Minimum percentage coverage to consider a single gene complete Type: `number`, Default: `0.8`
`--gene_identity`	Minimum percentage identity to consider a single gene complete Type: `number`, Default: `0.7`
`--broken_gene_length`	Minimum length to consider a broken gene Type: `integer`, Default: `60`
`--broken_gene_identity`	Minimum percentage identity to consider a broken gene Type: `number`, Default: `0.8`

GenoTyphi Parameters¶

Parameter	Description
`--kmer`	K-mer length Type: `integer`, Default: `21`
`--min_depth`	Minimum depth Type: `integer`, Default: `1`
`--model`	Genotype model used. Type: `string`, Default: `kmer_count`
`--report_all_calls`	Report all calls Type: `boolean`
`--mykrobe_opts`	Extra Mykrobe options in quotes Type: `string`

Kleborate Parameters¶

Parameter	Description
`--kleborate_preset`	Preset module to use for Kleborate Type: `string`, Default: `kpsc`
`--kleborate_opts`	Extra options in quotes for Kleborate Type: `string`

legsta Parameters¶

Parameter	Description
`--noheader`	Don't print header row Type: `boolean`

LisSero Parameters¶

Parameter	Description
`--min_id`	Minimum percent identity to accept a match Type: `number`, Default: `95.0`
`--min_cov`	Minimum coverage of the gene to accept a match Type: `number`, Default: `95.0`

meningotype Parameters¶

You can use these parameters to fine-tune your meningotype analysis

Parameter	Description
`--finetype`	perform porA and fetA fine typing Type: `boolean`
`--porB`	perform porB sequence typing (NEIS2020) Type: `boolean`
`--bast`	perform Bexsero antigen sequence typing (BAST) Type: `boolean`
`--mlst`	perform MLST Type: `boolean`
`--all`	perform MLST, porA, fetA, porB, BAST typing Type: `boolean`

ngmaster Parameters¶

Parameter	Description
`--csv`	output comma-separated format (CSV) rather than tab-separated Type: `boolean`

pasty Parameters¶

Parameter	Description
`--pasty_min_pident`	Minimum percent identity to count a hit Type: `integer`, Default: `95`
`--pasty_min_coverage`	Minimum percent coverage to count a hit Type: `integer`, Default: `95`

pbptyper Parameters¶

Parameter	Description
`--pbptyper_min_pident`	Minimum percent identity to count a hit Type: `integer`, Default: `95`
`--pbptyper_min_coverage`	Minimum percent coverage to count a hit Type: `integer`, Default: `95`

SeqSero2 Parameters¶

Parameter	Description
`--run_mode`	Workflow to run. 'a' allele mode, or 'k' k-mer mode Type: `string`, Default: `k`
`--input_type`	Input format to analyze. 'assembly' or 'fastq' Type: `string`, Default: `assembly`
`--bwa_mode`	Algorithms for bwa mapping for allele mode Type: `string`, Default: `mem`

SISTR Parameters¶

Parameter	Description
`--full_cgmlst`	Use the full set of cgMLST alleles which can include highly similar alleles Type: `boolean`

spaTyper Parameters¶

Parameter	Description
`--repeats`	List of spa repeats Type: `string`
`--repeat_order`	List spa types and order of repeats Type: `string`
`--do_enrich`	Do PCR product enrichment Type: `boolean`

staphopia-sccmec Parameters¶

Parameter	Description
`--hamming`	Report the results as hamming distances Type: `boolean`

TBProfiler Profile Parameters¶

Parameter	Description
`--call_whole_genome`	Call whole genome Type: `boolean`
`--mapper`	Mapping tool to use. If you are using nanopore data it will default to minimap2 Type: `string`, Default: `bwa`
`--caller`	Variant calling tool to use Type: `string`, Default: `freebayes`
`--calling_params`	Extra variant caller options in quotes Type: `string`
`--suspect`	Use the suspect suite of tools to add ML predictions Type: `boolean`
`--no_flagstat`	Don't collect flagstats Type: `boolean`
`--no_delly`	Don't run delly Type: `boolean`
`--tbprofiler_opts`	Extra options in quotes for TBProfiler Type: `string`

Optional Parameters¶

These optional parameters can be useful in certain settings.

Parameter	Description
`--outdir`	Base directory to write results to Type: `string`, Default: `bactopia`
`--skip_compression`	Output files will not be compressed Type: `boolean`
`--datasets`	The path to cache datasets to Type: `string`
`--keep_all_files`	Keeps all analysis files created Type: `boolean`

Max Job Request Parameters¶

Set the top limit for requested resources for any single job.

Parameter	Description
`--max_retry`	Maximum times to retry a process before allowing it to fail. Type: `integer`, Default: `3`
`--max_cpus`	Maximum number of CPUs that can be requested for any single job. Type: `integer`, Default: `4`
`--max_memory`	Maximum amount of memory that can be requested for any single job. Type: `string`, Default: `128.GB`
`--max_time`	Maximum amount of time that can be requested for any single job. Type: `string`, Default: `240.h`
`--max_downloads`	Maximum number of samples to download at a time Type: `integer`, Default: `3`

Nextflow Configuration Parameters¶

Parameters to fine-tune your Nextflow setup.

Parameter	Description
`--nfconfig`	A Nextflow compatible config file for custom profiles, loaded last and will overwrite existing variables if set. Type: `string`
`--publish_dir_mode`	Method used to save pipeline results to output directory. Type: `string`, Default: `copy`
`--infodir`	Directory to keep pipeline Nextflow logs and reports. Type: `string`, Default: `${params.outdir}/pipeline_info`
`--force`	Nextflow will overwrite existing output files. Type: `boolean`
`--cleanup_workdir`	After Bactopia is successfully executed, the `work` directory will be deleted. Type: `boolean`

Institutional config options¶

Parameters used to describe centralized config profiles. These should not be edited.

Parameter	Description
`--custom_config_version`	Git commit id for Institutional configs. Type: `string`, Default: `master`
`--custom_config_base`	Base directory for Institutional configs. Type: `string`, Default: `https://raw.githubusercontent.com/nf-core/configs/master`
`--config_profile_name`	Institutional config name. Type: `string`
`--config_profile_description`	Institutional config description. Type: `string`
`--config_profile_contact`	Institutional config contact information. Type: `string`
`--config_profile_url`	Institutional config URL link. Type: `string`

Nextflow Profile Parameters¶