Bactopia Tool - mobsuite
The mobsuite module uses MOB-suite to reconstruct and annotate plasmids in draft assemblies.
Example Usage
bactopia --wf mobsuite \
  --bactopia /path/to/your/bactopia/results \
  --include includes.txt
Output Overview
Below is the default output structure for the mobsuite tool. Where possible, the
file descriptions below were adapted from the tool's own description.
<BACTOPIA_DIR>
├── <SAMPLE_NAME>
│   └── tools
│       └── mobsuite
│           ├── <SAMPLE_NAME>-mobtyper.txt
│           ├── chromosome.fasta
│           ├── contig_report.txt
│           ├── logs
│           │   ├── nf-mobsuite.{begin,err,log,out,run,sh,trace}
│           │   └── versions.yml
│           └── plasmid_<PLASMID_NAME>.fasta
└── bactopia-runs
    └── mobsuite-<TIMESTAMP>
        ├── merged-results
        │   ├── logs
        │   │   └── mobsuite-concat
        │   │       ├── nf-merged-results.{begin,err,log,out,run,sh,trace}
        │   │       └── versions.yml
        │   └── mobsuite.tsv
        └── nf-reports
            ├── mobsuite-dag.dot
            ├── mobsuite-report.html
            ├── mobsuite-timeline.html
            └── mobsuite-trace.txt
Results
Merged Results
Below are results that are concatenated into a single file.
| Filename | Description | 
|---|---|
| mobsuite.tsv | A merged TSV file with mobsuite results from all samples | 
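The column layout of mobsuite.tsv is not listed on this page, so a reasonable first step is to read the header rather than assume field names. The following is a minimal sketch. The function names `list_columns` and `count_rows` are hypothetical helpers, and the sketch assumes the merged file is uncompressed, tab-delimited, and has a single header line.

```shell
# Hypothetical helper: print each column name of a TSV file on its own line.
list_columns() {
    head -n 1 "$1" | tr '\t' '\n'
}

# Hypothetical helper: count data rows (every line after the header).
# Assumption: the file has exactly one header line.
count_rows() {
    tail -n +2 "$1" | wc -l | tr -d ' '
}
```

For example, `list_columns <BACTOPIA_DIR>/bactopia-runs/mobsuite-<TIMESTAMP>/merged-results/mobsuite.tsv` would show which fields are available before any further filtering.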
MOB-suite
Below is a description of the per-sample results from MOB-suite.
| Filename | Description | 
|---|---|
| <SAMPLE_NAME>-mobtyper.txt | Aggregate MOB-typer report for all identified plasmids; see MOB-typer - report file for more details |
| chromosome.fasta | FASTA file of all contigs found to belong to the chromosome |
| contig_report.txt | Assignment of each contig to the chromosome or to a particular plasmid grouping; see MOB-recon - contig report for more details |
| plasmid_<PLASMID_NAME>.fasta | Each plasmid group is written to an individual FASTA | 
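Because each plasmid group is written to its own FASTA, counting header lines per file gives a quick view of how many contigs were assigned to each group. This is a sketch only. `count_plasmid_contigs` is a hypothetical helper, and it assumes the FASTA files are uncompressed and named as in the output tree above.

```shell
# Hypothetical helper: for every plasmid_*.fasta in a sample's mobsuite
# directory, print "<file name><TAB><number of sequences>".
# Assumption: files are plain (uncompressed) FASTA.
count_plasmid_contigs() {
    mobsuite_dir="$1"
    for fasta in "$mobsuite_dir"/plasmid_*.fasta; do
        # Skip the unexpanded pattern when no plasmid files exist.
        [ -e "$fasta" ] || continue
        # grep -c exits non-zero on zero matches but still prints 0.
        n=$(grep -c '^>' "$fasta" || true)
        printf '%s\t%s\n' "$(basename "$fasta")" "$n"
    done
}
```

A call such as `count_plasmid_contigs <BACTOPIA_DIR>/<SAMPLE_NAME>/tools/mobsuite` prints one line per plasmid group. A sample with no reconstructed plasmids produces no output.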
Audit Trail
Below are files that can assist you in understanding which parameters and program versions were used.
Logs
Each process that is executed will have a folder named logs. In this folder are helpful
files for you to review if the need ever arises.
| Extension | Description | 
|---|---|
| .begin | An empty file used to designate the process started | 
| .err | Contains STDERR outputs from the process | 
| .log | Contains both STDERR and STDOUT outputs from the process | 
| .out | Contains STDOUT outputs from the process | 
| .run | The script Nextflow uses to stage/unstage files and queue processes based on given profile | 
| .sh | The script executed by bash for the process | 
| .trace | The Nextflow Trace report for the process | 
| versions.yml | A YAML formatted file with program versions | 
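When reviewing many samples, it can help to find which ones wrote anything to STDERR. The sketch below uses a hypothetical helper, `list_nonempty_err`, and assumes the per-sample log location shown in the output tree above. A non-empty .err file does not by itself mean the process failed, since many programs write progress messages to STDERR.

```shell
# Hypothetical helper: print the path of every non-empty nf-mobsuite.err
# file found under a Bactopia results directory.
list_nonempty_err() {
    bactopia_dir="$1"
    for err in "$bactopia_dir"/*/tools/mobsuite/logs/nf-mobsuite.err; do
        # -s is true only for files that exist and have a size above zero.
        if [ -s "$err" ]; then
            printf '%s\n' "$err"
        fi
    done
}
```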
Nextflow Reports
These Nextflow reports provide a useful summary of your run. They can be used to optimize resource usage and to estimate expected costs when using cloud platforms.
| Filename | Description | 
|---|---|
| mobsuite-dag.dot | The Nextflow DAG visualisation | 
| mobsuite-report.html | The Nextflow Execution Report | 
| mobsuite-timeline.html | The Nextflow Timeline Report | 
| mobsuite-trace.txt | The Nextflow Trace report | 
Program Versions
At the end of each run, the versions.yml files are merged into the files below.
| Filename | Description | 
|---|---|
| software_versions.yml | A complete list of programs and versions used by each process | 
| software_versions_mqc.yml | A complete list of programs and versions formatted for MultiQC | 
Parameters
Required Parameters
Define where the pipeline should find input data and save output data.
| Parameter | Description | 
|---|---|
| --bactopia | The path to bactopia results to use as inputs. Type: string |
Filtering Parameters
Use these parameters to specify which samples to include or exclude.
| Parameter | Description | 
|---|---|
| --include | A text file containing sample names (one per line) to include in the analysis. Type: string |
| --exclude | A text file containing sample names (one per line) to exclude from the analysis. Type: string |
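Since both files expect one sample name per line, a starting list can be generated from the sample directories in an existing results folder and then edited by hand. The following is a sketch under the assumption that every top-level directory other than bactopia-runs is a sample, as in the output tree above. `make_include_list` is a hypothetical helper name.

```shell
# Hypothetical helper: print one sample name per line for every
# top-level directory in a Bactopia results folder.
# Assumption: every directory except "bactopia-runs" is a sample.
make_include_list() {
    bactopia_dir="$1"
    for d in "$bactopia_dir"/*/; do
        # Skip the unexpanded pattern when the folder has no subdirectories.
        [ -d "$d" ] || continue
        name=$(basename "$d")
        if [ "$name" = "bactopia-runs" ]; then
            continue
        fi
        printf '%s\n' "$name"
    done
}
```

Redirecting the output, for example `make_include_list /path/to/your/bactopia/results > includes.txt`, gives a file that can be trimmed to the samples of interest and passed to --include.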
MOB-suite Recon Parameters
| Parameter | Description | 
|---|---|
| --mb_max_contig_size | Maximum size of a contig to be considered a plasmid. Type: integer, Default: 310000 |
| --mb_min_contig_size | Minimum length of contigs to classify. Type: integer, Default: 1000 |
| --mb_max_plasmid_size | Maximum size of a reconstructed plasmid. Type: integer, Default: 350000 |
| --mobsuite_opts | Extra MOB-suite options in quotes. Example: '--min_mob_evalue 0.001'. Type: string |
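The parameters in this table can be combined on one command line. The sketch below wraps such a call in a hypothetical function, `run_mobsuite_custom`. The numeric values are illustrative only and are not recommendations, and the --mobsuite_opts value is the example given in the table.

```shell
# Hypothetical wrapper: run the mobsuite tool with non-default size limits.
# The numeric values below are illustrative only, not recommendations.
run_mobsuite_custom() {
    results_dir="$1"   # path to existing Bactopia results
    bactopia --wf mobsuite \
        --bactopia "$results_dir" \
        --mb_min_contig_size 2000 \
        --mb_max_contig_size 300000 \
        --mb_max_plasmid_size 300000 \
        --mobsuite_opts '--min_mob_evalue 0.001'
}
```

Calling `run_mobsuite_custom /path/to/your/bactopia/results` would then start the run. Any parameter left out keeps the default listed in the table.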
Optional Parameters
These optional parameters can be useful in certain settings.
| Parameter | Description | 
|---|---|
| --outdir | Base directory to write results to. Type: string, Default: ./ |
| --run_name | Name of the directory to hold results. Type: string, Default: bactopia |
| --skip_compression | Output files will not be compressed. Type: boolean |
| --datasets | The path to cache datasets to. Type: string |
| --keep_all_files | Keeps all analysis files created. Type: boolean |
Max Job Request Parameters
Set the top limit for requested resources for any single job.
| Parameter | Description | 
|---|---|
| --max_retry | Maximum times to retry a process before allowing it to fail. Type: integer, Default: 3 |
| --max_cpus | Maximum number of CPUs that can be requested for any single job. Type: integer, Default: 4 |
| --max_memory | Maximum amount of memory (in GB) that can be requested for any single job. Type: integer, Default: 32 |
| --max_time | Maximum amount of time (in minutes) that can be requested for any single job. Type: integer, Default: 120 |
| --max_downloads | Maximum number of samples to download at a time. Type: integer, Default: 3 |
Nextflow Configuration Parameters
Parameters to fine-tune your Nextflow setup.
| Parameter | Description | 
|---|---|
| --nfconfig | A Nextflow-compatible config file for custom profiles. It is loaded last and will overwrite existing variables if set. Type: string |
| --publish_dir_mode | Method used to save pipeline results to output directory. Type: string, Default: copy |
| --infodir | Directory to keep pipeline Nextflow logs and reports. Type: string, Default: ${params.outdir}/pipeline_info |
| --force | Nextflow will overwrite existing output files. Type: boolean |
| --cleanup_workdir | After Bactopia is successfully executed, the work directory will be deleted. Type: boolean |
Nextflow Profile Parameters
Parameters to fine-tune your Nextflow setup.
| Parameter | Description | 
|---|---|
| --condadir | Directory Nextflow should use for Conda environments. Type: string |
| --registry | Docker registry to pull containers from. Type: string, Default: dockerhub |
| --datasets_cache | Directory where downloaded datasets should be stored. Type: string, Default: <BACTOPIA_DIR>/data/datasets |
| --singularity_cache | Directory where remote Singularity images are stored. Type: string |
| --singularity_pull_docker_container | Instead of directly downloading Singularity images for use with Singularity, force the workflow to pull and convert Docker containers instead. Type: boolean |
| --force_rebuild | Force overwrite of existing pre-built environments. Type: boolean |
| --queue | Comma-separated name of the queue(s) to be used by a job scheduler (e.g. AWS Batch or SLURM). Type: string, Default: general,high-memory |
| --cluster_opts | Additional options to pass to the executor (e.g. SLURM: '--account=my_acct_name'). Type: string |
| --disable_scratch | All intermediate files created on worker nodes will be transferred to the head node. Type: boolean |
Helpful Parameters
Uncommonly used parameters that might be useful.
| Parameter | Description | 
|---|---|
| --monochrome_logs | Do not use coloured log outputs. Type: boolean |
| --nfdir | Print directory Nextflow has pulled Bactopia to. Type: boolean |
| --sleep_time | The amount of time (seconds) Nextflow will wait after setting up datasets before execution. Type: integer, Default: 5 |
| --validate_params | Boolean whether to validate parameters against the schema at runtime. Type: boolean, Default: True |
| --help | Display help text. Type: boolean |
| --wf | Specify which workflow or Bactopia Tool to execute. Type: string, Default: bactopia |
| --list_wfs | List the available workflows and Bactopia Tools to use with '--wf'. Type: boolean |
| --show_hidden_params | Show all params when using --help. Type: boolean |
| --help_all | An alias for --help --show_hidden_params. Type: boolean |
| --version | Display version text. Type: boolean |
Citations
If you use Bactopia and mobsuite in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- csvtk
  Shen, W. csvtk: A cross-platform, efficient and practical CSV/TSV toolkit in Golang. (GitHub)
- MOB-suite
  Robertson J, Nash JHE. MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies. Microbial Genomics 4(8) (2018)
- MOB-suite Database
  Robertson J, Bessonov K, Schonfeld J, Nash JHE. Universal whole-sequence-based plasmid typing and its utility to prediction of host range and epidemiological surveillance. Microbial Genomics 6(10) (2020)