merlin
Tags: species-specific automated mash minmer typing bactopia-tool
MinMER-assisted species-specific tool selection and execution.
This Bactopia Tool, Merlin, uses MinMER distances based on the RefSeq sketch to automatically run species-specific analysis tools. Merlin identifies the closest reference genomes and executes appropriate typing and analysis tools for each detected species.
Usage
Bactopia CLI:
bactopia --wf merlin \
--bactopia /path/to/your/bactopia/results
Nextflow:
nextflow run bactopia/bactopia/workflows/bactopia-tools/merlin/main.nf \
--bactopia /path/to/your/bactopia/results
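When scripting runs, it can help to assemble the options first and print the command before launching it. The sketch below is a dry run that only echoes the command. `BACTOPIA_DIR` is a placeholder, and the option values are illustrative rather than recommended; `--merlin_dist` and `--max_cpus` are taken from the parameter tables on this page.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble a merlin invocation and print it instead of running it.
# BACTOPIA_DIR is a placeholder; point it at your own Bactopia results.
BACTOPIA_DIR="${BACTOPIA_DIR:-/path/to/your/bactopia/results}"

merlin_args=(
  --wf merlin
  --bactopia "$BACTOPIA_DIR"
  --merlin_dist 0.05   # illustrative value, tighter than the documented default of 0.1
  --max_cpus 4
)

# Remove "echo" inside the substitution to actually launch the workflow.
merlin_cmd="$(echo bactopia "${merlin_args[@]}")"
echo "$merlin_cmd"
```

Keeping the options in an array preserves paths that contain spaces once the `echo` is removed.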
Outputs
Expected Output Files
<BACTOPIA_DIR>
├── <SAMPLE_NAME>
│   └── tools
│       ├── clermontyping
│       │   ├── <SAMPLE_NAME>.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       ├── <SAMPLE_NAME>.blast.xml
│       │       ├── <SAMPLE_NAME>.html
│       │       └── <SAMPLE_NAME>.mash.tsv
│       ├── ectyper
│       │   ├── <SAMPLE_NAME>.blast_alleles.txt
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── ectyper.log
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── kleborate
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── merlindist
│       │   └── merlin-<TIMESTAMP>
│       │       ├── <SAMPLE_NAME>-dist.txt
│       │       └── logs
│       │           ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │           └── versions.yml
│       ├── shigapass
│       │   ├── <SAMPLE_NAME>.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       └── ShigaPass_summary.csv
│       ├── shigatyper
│       │   ├── <SAMPLE_NAME>-hits.tsv
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── shigeifinder
│       │   ├── <SAMPLE_NAME>.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       └── stecfinder
│           ├── <SAMPLE_NAME>.tsv
│           └── logs
│               ├── nf.command.{begin,err,log,out,run,sh,trace}
│               └── versions.yml
├── <SAMPLE_NAME>SE
│   └── tools
│       ├── clermontyping
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       ├── <SAMPLE_NAME>SE.blast.xml
│       │       ├── <SAMPLE_NAME>SE.html
│       │       └── <SAMPLE_NAME>SE.mash.tsv
│       ├── ectyper
│       │   ├── <SAMPLE_NAME>SE.blast_alleles.txt
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   └── logs
│       │       ├── ectyper.log
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── kleborate
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── merlindist
│       │   └── merlin-<TIMESTAMP>
│       │       ├── <SAMPLE_NAME>SE-dist.txt
│       │       └── logs
│       │           ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │           └── versions.yml
│       ├── shigapass
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       └── ShigaPass_summary.csv
│       ├── shigatyper
│       │   ├── <SAMPLE_NAME>SE-hits.tsv
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── shigeifinder
│       │   ├── <SAMPLE_NAME>SE.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       └── stecfinder
│           ├── <SAMPLE_NAME>SE.tsv
│           └── logs
│               ├── nf.command.{begin,err,log,out,run,sh,trace}
│               └── versions.yml
├── SRR13039589
│   └── tools
│       ├── clermontyping
│       │   ├── SRR13039589.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       ├── SRR13039589.blast.xml
│       │       ├── SRR13039589.html
│       │       └── SRR13039589.mash.tsv
│       ├── ectyper
│       │   ├── SRR13039589.blast_alleles.txt
│       │   ├── SRR13039589.tsv
│       │   └── logs
│       │       ├── ectyper.log
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── kleborate
│       │   ├── SRR13039589.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── merlindist
│       │   └── merlin-<TIMESTAMP>
│       │       ├── SRR13039589-dist.txt
│       │       └── logs
│       │           ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │           └── versions.yml
│       ├── shigapass
│       │   ├── SRR13039589.tsv
│       │   ├── logs
│       │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │   │   └── versions.yml
│       │   └── supplemental
│       │       └── ShigaPass_summary.csv
│       ├── shigatyper
│       │   ├── SRR13039589-hits.tsv
│       │   ├── SRR13039589.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       ├── shigeifinder
│       │   ├── SRR13039589.tsv
│       │   └── logs
│       │       ├── nf.command.{begin,err,log,out,run,sh,trace}
│       │       └── versions.yml
│       └── stecfinder
│           ├── SRR13039589.tsv
│           └── logs
│               ├── nf.command.{begin,err,log,out,run,sh,trace}
│               └── versions.yml
└── bactopia-runs
    └── merlin-<TIMESTAMP>
        ├── merged-results
        │   ├── clermontyping.tsv
        │   ├── ectyper.tsv
        │   ├── kleborate.tsv
        │   ├── logs
        │   │   ├── clermontyping-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   ├── ectyper-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   ├── kleborate-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   ├── shigapass-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   ├── shigatyper-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   ├── shigeifinder-concat
        │   │   │   ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │   │   └── versions.yml
        │   │   └── stecfinder-concat
        │   │       ├── nf.command.{begin,err,log,out,run,sh,trace}
        │   │       └── versions.yml
        │   ├── shigapass.tsv
        │   ├── shigatyper.tsv
        │   ├── shigeifinder.tsv
        │   └── stecfinder.tsv
        └── nf-reports
            ├── merlin-dag.dot
            ├── merlin-report.html
            └── merlin-timeline.html
Species-Specific Analysis
Tools executed depend on detected species
| File | Description |
|---|---|
Analysis | results from all executed species-specific tools |
Merged Results
| File | Description |
|---|---|
merlin.tsv | Merged summary of all species-specific analyses |
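To see which merged tables a run actually produced, the output layout shown earlier can be walked with `find`. This is a minimal sketch that assumes the `bactopia-runs/merlin-<TIMESTAMP>/merged-results` layout from the tree above and that each TSV starts with a single header line; the function names are hypothetical helpers, not part of Bactopia.

```shell
#!/usr/bin/env bash
# List merged TSVs from every merlin run under a Bactopia directory.
# Usage: list_merged_results <BACTOPIA_DIR>
list_merged_results() {
  local bactopia_dir="$1"
  # Depth 3 below bactopia-runs: merlin-<TIMESTAMP>/merged-results/<tool>.tsv
  find "$bactopia_dir/bactopia-runs" -maxdepth 3 -type f \
    -path '*/merlin-*/merged-results/*.tsv' 2>/dev/null | sort
}

# Print each merged file with its number of data rows
# (assumes one header line per file).
# Usage: summarize_merged_results <BACTOPIA_DIR>
summarize_merged_results() {
  local f rows
  list_merged_results "$1" | while IFS= read -r f; do
    rows="$(awk 'END { print (NR > 0 ? NR - 1 : 0) }' "$f")"
    printf '%s\t%s\n' "$f" "$rows"
  done
}
```

For example, `summarize_merged_results /path/to/your/bactopia/results` prints one line per merged TSV with its path and row count.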
Audit Trail
Below are files that can assist you in understanding which parameters and program versions were used.
Logs
Each process that is executed will have a folder named logs. In this folder are helpful
files for you to review if the need ever arises.
| Extension | Description |
|---|---|
| .begin | An empty file used to designate the process started |
| .err | Contains STDERR outputs from the process |
| .log | Contains both STDERR and STDOUT outputs from the process |
| .out | Contains STDOUT outputs from the process |
| .run | The script Nextflow uses to stage/unstage files and queue processes based on given profile |
| .sh | The script executed by bash for the process |
| .trace | The Nextflow trace report for the process |
| versions.yml | A YAML formatted file with program versions |
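When reviewing a run, a quick first pass is to look for processes that wrote anything to STDERR. The sketch below assumes the `logs` folder layout shown in the output tree; `find_stderr_logs` is a hypothetical helper name. A non-empty `.err` file does not necessarily mean a process failed, since many tools write progress messages to STDERR.

```shell
#!/usr/bin/env bash
# Print every non-empty nf.command.err found in a "logs" folder
# below the given directory.
# Usage: find_stderr_logs <BACTOPIA_DIR>
find_stderr_logs() {
  # -size +0c keeps only files larger than zero bytes.
  find "$1" -type f -path '*/logs/*' -name 'nf.command.err' -size +0c | sort
}
```

The matching `nf.command.sh` in the same folder shows the exact command that produced the messages.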
Nextflow Reports
These Nextflow reports provide a great summary of your run. They can be used to optimize resource usage and estimate expected costs if using cloud platforms.
| Filename | Description |
|---|---|
| merlin-dag.dot | The Nextflow DAG visualization |
| merlin-report.html | The Nextflow Execution Report |
| merlin-timeline.html | The Nextflow Timeline Report |
| merlin-trace.txt | The Nextflow Trace report |
Parameters
Required Parameters
Define where the pipeline should find input data and save output data.
| Parameter | Type | Default | Description |
|---|---|---|---|
--bactopia | string | | The path to bactopia results to use as inputs |
mashdist Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--mash_sketch | string | | The reference sequence as a Mash Sketch (.msh file) |
--mash_seed | integer | 42 | Seed to provide to the hash function |
--mash_table | boolean | false | Table output (fields will be blank if they do not meet the p-value threshold) |
--mash_m | integer | 1 | Minimum copies of each k-mer required to pass noise filter for reads |
--mash_w | number | 0.01 | Probability threshold for warning about low k-mer size. |
--mash_max_p | number | 1.0 | Maximum p-value to report. |
--mash_max_dist | number | 1.0 | Maximum distance to report. |
--merlin_dist | number | 0.1 | Maximum distance to report when using Merlin. |
--full_merlin | boolean | false | Go full Merlin and run all species-specific tools, no matter the Mash distance |
--mash_use_fastqs | boolean | false | Query with FASTQs instead of the assemblies |
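To inspect which references fall within a distance cutoff for a sample, the `<SAMPLE_NAME>-dist.txt` file can be filtered with `awk`. This is only a sketch for manual inspection, not how Merlin itself selects tools. It assumes the file follows the standard tab-separated `mash dist` layout (reference, query, distance, p-value, shared hashes); check your file first, since that layout is an assumption here. `filter_by_distance` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Keep rows of a Mash distance file whose distance is at or below a cutoff.
# Assumed columns (tab-separated): reference, query, distance, p-value, shared-hashes
# Usage: filter_by_distance <dist-file> [max-distance]
filter_by_distance() {
  local dist_file="$1"
  local max_dist="${2:-0.1}"   # 0.1 mirrors the documented --merlin_dist default
  awk -F '\t' -v max="$max_dist" '
    # Skip header or malformed lines: the third field must look numeric.
    $3 ~ /^[0-9.eE+-]+$/ && ($3 + 0) <= (max + 0)
  ' "$dist_file"
}
```

For example, `filter_by_distance SRR13039589-dist.txt 0.05` prints only the rows with a distance of 0.05 or less.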
ClermonTyping Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--clermontyping_threshold | integer | 0 | Do not use contigs under this size |
csvtk concat Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--csvtk_concat_opts | string | | Extra csvtk concat options in quotes |
ECTyper Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--ectyper_opid | integer | 90 | Percent identity required for an O antigen allele match |
--ectyper_opcov | integer | 90 | Minimum percent coverage required for an O antigen allele match |
--ectyper_hpid | integer | 95 | Percent identity required for an H antigen allele match |
--ectyper_hpcov | integer | 50 | Minimum percent coverage required for an H antigen allele match |
--ectyper_verify | boolean | false | Enable E. coli species verification |
--ectyper_print_alleles | boolean | false | Prints the allele sequences if enabled as the final column |
emmtyper Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--emmtyper_wf | string | blast | Workflow for emmtyper to use. (choices: blast, pcr) |
--emmtyper_blastdb | string | | Path to custom EMM BLAST DB. |
--emmtyper_cluster_distance | integer | 500 | Distance between cluster of matches to consider as different clusters |
--emmtyper_percid | integer | 95 | Minimal percent identity of sequence |
--emmtyper_culling_limit | integer | 5 | Total hits to return in a position |
--emmtyper_mismatch | integer | 5 | Threshold for number of mismatch to allow in BLAST hit |
--emmtyper_align_diff | integer | 5 | Threshold for difference between alignment length and subject length in BLAST |
--emmtyper_gap | integer | 2 | Threshold gap to allow in BLAST hit |
--emmtyper_min_perfect | integer | 15 | Minimum size of perfect match at 3' primer end |
--emmtyper_min_good | integer | 15 | Minimum size where there must be 2 matches for each mismatch |
--emmtyper_max_size | integer | 2000 | Maximum size of PCR product |
hicap Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--hicap_database_dir | string | | Directory containing locus database |
--hicap_model_fp | string | | Path to prodigal model |
--hicap_full_sequence | boolean | false | Write the full input sequence out to the genbank file rather than just the region surrounding and including the locus |
--hicap_debug | boolean | false | hicap will print debug messages |
--hicap_gene_coverage | number | 0.8 | Minimum percentage coverage to consider a single gene complete |
--hicap_gene_identity | number | 0.7 | Minimum percentage identity to consider a single gene complete |
--hicap_broken_gene_length | integer | 60 | Minimum length to consider a broken gene |
--hicap_broken_gene_identity | number | 0.8 | Minimum percentage identity to consider a broken gene |
Mykrobe Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--mykrobe_species | string | | Species panel to use (choices: sonnei, staph, tb, typhi) |
--mykrobe_kmer | integer | 21 | K-mer length |
--mykrobe_min_depth | integer | 1 | Minimum depth |
--mykrobe_model | string | kmer_count | Genotype model used. (choices: kmer_count, median_depth) |
--mykrobe_report_all_calls | boolean | false | Report all calls |
--mykrobe_opts | string | | Extra Mykrobe options in quotes |
GenoTyphi Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--genotyphi_kmer | integer | 21 | K-mer length |
--genotyphi_min_depth | integer | 1 | Minimum depth |
--genotyphi_model | string | kmer_count | Genotype model used. (choices: kmer_count, median_depth) |
--genotyphi_report_all_calls | boolean | false | Report all calls |
--genotyphi_mykrobe_opts | string | | Extra Mykrobe options in quotes |
Kleborate Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--kleborate_preset | string | kpsc | Preset module to use for Kleborate (choices: kpsc, kosc, escherichia) |
--kleborate_opts | string | | Extra options in quotes for Kleborate |
legsta Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--legsta_noheader | boolean | false | Don't print header row |
LisSero Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--lissero_min_id | number | 95.0 | Minimum percent identity to accept a match |
--lissero_min_cov | number | 95.0 | Minimum coverage of the gene to accept a match |
ngmaster Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--ngmaster_csv | boolean | false | Output comma-separated format (CSV) rather than tab-separated |
pasty Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--pasty_min_pident | integer | 95 | Minimum percent identity to count a hit |
--pasty_min_coverage | integer | 95 | Minimum percent coverage to count a hit |
pbptyper Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--pbptyper_min_pident | integer | 95 | Minimum percent identity to count a hit |
--pbptyper_min_coverage | integer | 95 | Minimum percent coverage to count a hit |
SeqSero2 Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--seqsero2_run_mode | string | k | Workflow to run. 'a' allele mode, or 'k' k-mer mode (choices: a, k) |
--seqsero2_input_type | string | assembly | Input format to analyze. 'assembly' or 'fastq' (choices: assembly, fastq) |
--seqsero2_bwa_mode | string | mem | Algorithms for bwa mapping for allele mode (choices: mem, sam) |
SeroBA Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--seroba_noclean | boolean | false | Do not clean up intermediate files |
--seroba_coverage | integer | 20 | Threshold for k-mer coverage of the reference sequence |
SISTR Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--sistr_full_cgmlst | boolean | false | Use the full set of cgMLST alleles which can include highly similar alleles |
AgrVATE Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--agrvate_typing_only | boolean | false | agr typing only. Skips agr operon extraction and frameshift detection |
spaTyper Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--spatyper_repeats | string | | List of spa repeats |
--spatyper_repeat_order | string | | List spa types and order of repeats |
--spatyper_do_enrich | boolean | false | Do PCR product enrichment |
sccmec Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--sccmec_min_targets_pident | integer | 90 | Minimum percent identity to count a target hit |
--sccmec_min_targets_coverage | integer | 80 | Minimum percent coverage to count a target hit |
--sccmec_min_regions_pident | integer | 85 | Minimum percent identity to count a region hit |
--sccmec_min_regions_coverage | integer | 93 | Minimum percent coverage to count a region hit |
StaphSCAN Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--staphscan_modules | string | | Comma-separated list of modules to run |
--staphscan_db_mlst | string | | Path or tarball to custom MLST database |
STECFinder Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--stecfinder_use_reads | boolean | false | Paired-end Illumina reads will be used instead of assemblies |
--stecfinder_hits | boolean | false | Show detailed gene search results |
--stecfinder_cutoff | number | 10.0 | Minimum read coverage for gene to be called |
--stecfinder_length | number | 50.0 | Percentage of gene length needed for positive call |
--stecfinder_ipah_length | number | 10.0 | Percentage of ipaH gene length needed for positive gene call |
--stecfinder_ipah_depth | number | 1.0 | Minimum depth for positive ipaH gene call (requires --stecfinder_use_reads) |
--stecfinder_stx_length | number | 10.0 | Percentage of stx gene length needed for positive gene call |
--stecfinder_stx_depth | number | 1.0 | Minimum depth for positive stx gene call (requires --stecfinder_use_reads) |
--stecfinder_o_length | number | 60.0 | Percentage of wz_ gene length needed for positive call |
--stecfinder_o_depth | number | 1.0 | Minimum depth for positive wz_ gene call (requires --stecfinder_use_reads) |
--stecfinder_h_length | number | 60.0 | Percentage of fliC gene length needed for positive call |
--stecfinder_h_depth | number | 1.0 | Minimum depth for positive fliC gene call (requires --stecfinder_use_reads) |
TB-Profiler Profile Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--tbprofiler_call_whole_genome | boolean | false | Call whole genome |
--tbprofiler_mapper | string | bwa | Mapping tool to use. If you are using nanopore data it will default to minimap2 (choices: bwa, minimap2, bowtie2, bwa-mem2) |
--tbprofiler_caller | string | freebayes | Variant calling tool to use (choices: bcftools, gatk, freebayes) |
--tbprofiler_calling_params | string | | Extra variant caller options in quotes |
--tbprofiler_suspect | boolean | false | Use the suspect suite of tools to add ML predictions |
--tbprofiler_no_flagstat | boolean | false | Don't collect flagstats |
--tbprofiler_no_delly | boolean | false | Don't run delly |
--tbprofiler_opts | string | | Extra options in quotes for TBProfiler |
TB-Profiler Collate Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
--tbprofiler_itol | boolean | false | Generate itol config files |
--tbprofiler_full | boolean | false | Output mutations in main result file |
--tbprofiler_all_variants | boolean | false | Output all variants in variant matrix |
--tbprofiler_mark_missing | boolean | false | An asterisk will be used to mark predictions which are affected by missing data at a drug resistance position |
Filtering Parameters
Use these parameters to specify which samples to include or exclude.
| Parameter | Type | Default | Description |
|---|---|---|---|
--include | string | | A text file containing sample names (one per line) to include in the analysis |
--exclude | string | | A text file containing sample names (one per line) to exclude from the analysis |
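Both files use the same format: one sample name per line. The sketch below writes such a file and prints a matching command as a dry run. `write_sample_list` is a hypothetical helper, `SAMPLE_B` is a made-up name, and `SRR13039589` is the sample from the example output above.

```shell
#!/usr/bin/env bash
# Write sample names, one per line, to a file usable with --include or --exclude.
# Usage: write_sample_list <output-file> <sample> [<sample> ...]
write_sample_list() {
  local out="$1"
  shift
  # Refuse to write an empty list.
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" > "$out"
}

# A temporary file is used here; any writable path works.
include_file="$(mktemp)"
write_sample_list "$include_file" SRR13039589 SAMPLE_B

# Dry run: remove "echo" to actually launch the workflow.
echo bactopia --wf merlin \
  --bactopia /path/to/your/bactopia/results \
  --include "$include_file"
```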
Optional Parameters
These optional parameters can be useful in certain settings.
| Parameter | Type | Default | Description |
|---|---|---|---|
--outdir | string | bactopia | Base directory to write results to |
--skip_compression | boolean | false | Output files will not be compressed |
--datasets | string | | The path to cache datasets to |
--keep_all_files | boolean | false | Keeps all analysis files created |
Max Job Request Parameters
Set the top limit for requested resources for any single job.
| Parameter | Type | Default | Description |
|---|---|---|---|
--max_retry | integer | 3 | Maximum times to retry a process before allowing it to fail. |
--max_cpus | integer | 4 | Maximum number of CPUs that can be requested for any single job. |
--max_memory | string | 128.GB | Maximum amount of memory that can be requested for any single job. |
--max_time | string | 240.h | Maximum amount of time that can be requested for any single job. |
--max_downloads | integer | 3 | Maximum number of samples to download at a time |
Nextflow Configuration Parameters
Parameters to fine-tune your Nextflow setup.
| Parameter | Type | Default | Description |
|---|---|---|---|
--nfconfig | string | | A Nextflow compatible config file for custom profiles, loaded last and will overwrite existing variables if set. |
--publish_dir_mode | string | copy | Method used to save pipeline results to output directory. (choices: symlink, rellink, link, copy, copyNoFollow, move) |
--infodir | string | ${params.outdir}/pipeline_info | Directory to keep pipeline Nextflow logs and reports. |
--force | boolean | false | Nextflow will overwrite existing output files. |
--cleanup_workdir | boolean | false | After Bactopia is successfully executed, the work directory will be deleted. |
Institutional config options
Parameters used to describe centralized config profiles. These should not be edited.
| Parameter | Type | Default | Description |
|---|---|---|---|
--custom_config_version | string | master | Git commit id for Institutional configs. |
--custom_config_base | string | https://raw.githubusercontent.com/nf-core/configs/master | Base directory for Institutional configs. |
--config_profile_name | string | | Institutional config name. |
--config_profile_description | string | | Institutional config description. |
--config_profile_contact | string | | Institutional config contact information. |
--config_profile_url | string | | Institutional config URL link. |
Nextflow Profile Parameters
Parameters to fine-tune your Nextflow setup.
| Parameter | Type | Default | Description |
|---|---|---|---|
--condadir | string | | Directory Nextflow should use for Conda environments |
--registry | string | quay.io | Registry to pull Docker containers from. |
--datasets_cache | string | <HOME>/.bactopia/datasets | Directory where downloaded datasets should be stored. |
--singularity_cache | string | | Directory where remote Singularity images are stored. |
--singularity_pull_docker_container | boolean | | Instead of directly downloading Singularity images for use with Singularity, force the workflow to pull and convert Docker containers instead. |
--force_rebuild | boolean | false | Force overwrite of existing pre-built environments. |
--queue | string | general,high-memory | Comma-separated name of the queue(s) to be used by a job scheduler (e.g. AWS Batch or SLURM) |
--cluster_opts | string | | Additional options to pass to the executor. (e.g. SLURM: '--account=my_acct_name') |
--container_opts | string | | Additional options to pass to Apptainer, Docker, or Singularity. (e.g. Singularity: '-D pwd') |
--disable_scratch | boolean | false | All intermediate files created on worker nodes will be transferred to the head node. |
Helpful Parameters
Uncommonly used parameters that might be useful.
| Parameter | Type | Default | Description |
|---|---|---|---|
--monochrome_logs | boolean | | Do not use coloured log outputs. |
--nfdir | boolean | | Print directory Nextflow has pulled Bactopia to |
--sleep_time | integer | 5 | The amount of time (seconds) Nextflow will wait after setting up datasets before execution. |
--validate_params | boolean | true | Boolean whether to validate parameters against the schema at runtime |
--help | boolean | | Display help text. |
--wf | string | bactopia | Specify which workflow or Bactopia Tool to execute |
--list_wfs | boolean | | List the available workflows and Bactopia Tools to use with '--wf' |
--show_hidden_params | boolean | | Show all params when using --help |
--help_all | boolean | | An alias for --help --show_hidden_params |
--version | boolean | | Display version text. |
Composition
This workflow uses the following subworkflows:
- bactopia_datasets - Download and provide pre-compiled datasets required by Bactopia.
- merlin - MinmER assisted species-specific bactopia tool seLectIoN.
Citations
If you use this in your analysis, please cite the following.
- Bactopia
  Petit III RA, Read TD. Bactopia - a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020)
- Mash
  Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy AM. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biol 17, 132 (2016)