
sylph_profile

Tags: metagenomics profiling taxonomy abundance ani sylph sample-scope

Profile metagenome samples against a database using Sylph.

Uses Sylph to profile metagenomic samples for taxonomic abundance and containment ANI against a provided database. It is designed to be extremely fast and memory-efficient.

Uses explicit positional record fields for reads:

  • Input: record(meta, r1, r2, se, lr) where each read slot is Path?

Inputs

record (
meta: Record,
r1: Path?,
r2: Path?,
se: Path?,
lr: Path?
)
Field  Type    Description
meta   Record  Groovy Record containing sample information
r1     Path?   Illumina R1 reads (paired-end)
r2     Path?   Illumina R2 reads (paired-end)
se     Path?   Single-end Illumina reads
lr     Path?   Long reads (ONT/PacBio)
db: Path
Name  Type  Description
db    Path  Path to the Sylph database file (*.syldb)
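Because every read slot is nullable, anything that consumes the input record has to work out which slots are populated. The sketch below illustrates that check in Python. `ReadSlots` and `populated_reads` are hypothetical names, not part of the module, and the rule that r1 and r2 must be supplied together is an assumption based on their paired-end descriptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class ReadSlots:
    """Hypothetical mirror of the r1/r2/se/lr slots of the input record."""
    r1: Optional[Path] = None
    r2: Optional[Path] = None
    se: Optional[Path] = None
    lr: Optional[Path] = None


def populated_reads(slots: ReadSlots) -> List[Path]:
    """Return the populated read files in r1, r2, se, lr order."""
    # Assumption: paired-end reads only make sense when both mates are present.
    if (slots.r1 is None) != (slots.r2 is None):
        raise ValueError("r1 and r2 must be provided together for paired-end reads")
    return [p for p in (slots.r1, slots.r2, slots.se, slots.lr) if p is not None]
```

A sample with only long reads would be described as `ReadSlots(lr=Path("sample.fastq.gz"))`, leaving the other three slots empty.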

Outputs

record (
meta: Record,
tsv: Path,
results: Set<Path>,
logs: Set<Path?>,
nf_logs: Set<Path>,
versions: Set<Path>
)
Field     Type        Description
meta      Record      Sample information record
tsv       Path        TSV file with profiling results
results   Set<Path>   All output files to be published
logs      Set<Path?>  Optional program-specific log files
nf_logs   Set<Path>   Nextflow-specific log files (e.g. .command.{begin
versions  Set<Path>   A YAML-formatted file with program versions
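As a rough illustration of downstream use, the `tsv` output can be read with Python's standard `csv` module. The column names used here (`Genome_file`, `Taxonomic_abundance`, `Adjusted_ANI`) are assumptions about Sylph's profile header and should be checked against a real output file. `genomes_above_ani` is a hypothetical helper, and the sample rows in the usage note are made up.

```python
import csv
import io
from typing import Dict, List


def genomes_above_ani(tsv_text: str, min_ani: float = 98.0) -> List[Dict[str, str]]:
    """Keep rows whose adjusted ANI meets a stricter, post-hoc threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Assumed column name: "Adjusted_ANI" (verify against the actual header).
    return [row for row in reader if float(row["Adjusted_ANI"]) >= min_ani]
```

For example, passing the text `"Genome_file\tTaxonomic_abundance\tAdjusted_ANI\ng1.fa\t60.0\t99.1\ng2.fa\t40.0\t96.0\n"` keeps only the `g1.fa` row at the default threshold of 98.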

Parameters

Sylph Profile Parameters

Parameter                 Type     Default  Description
--sylph_db                string            Path to a Sylph-formatted database
--sylph_k                 integer  31       K-mer size for sketching
--sylph_min_spacing       integer  30       Minimum spacing between selected k-mers on the database genomes
--sylph_subsample_rate    integer  200      Subsample rate for sketching
--sylph_min_ani           integer  95       Minimum adjusted ANI to consider. Values below 95 give inaccurate results when profiling.
--sylph_min_kmers         integer  50       Exclude genomes with fewer than this number of sampled k-mers
--sylph_min_correct       integer  3        Minimum k-mer multiplicity needed for coverage correction. Higher values give more precision but lower sensitivity.
--sylph_estimate_unknown  boolean  false    Estimate true coverage and scale sequence abundance in the profile by the estimated percentage of unknown sequence
--sylph_opts              string            Extra options for Sylph, given in quotes
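The sketch below shows one way these parameters could be turned into a `sylph profile` option list. It is not the module's actual implementation. The flag spellings in `FLAG_MAP` are assumptions drawn from Sylph's command-line help and should be verified with `sylph profile --help` for the installed version. `build_sylph_options` is a hypothetical helper.

```python
import shlex
from typing import Dict, List, Union

ParamValue = Union[int, bool, str]

# Assumed mapping from pipeline parameters to sylph flags (verify before use).
FLAG_MAP: Dict[str, str] = {
    "sylph_k": "-k",
    "sylph_min_spacing": "--min-spacing",
    "sylph_subsample_rate": "-c",
    "sylph_min_ani": "--minimum-ani",
    "sylph_min_kmers": "--min-number-kmers",
    "sylph_min_correct": "--min-count-correct",
}

# Defaults copied from the parameter table above.
DEFAULTS: Dict[str, ParamValue] = {
    "sylph_k": 31,
    "sylph_min_spacing": 30,
    "sylph_subsample_rate": 200,
    "sylph_min_ani": 95,
    "sylph_min_kmers": 50,
    "sylph_min_correct": 3,
    "sylph_estimate_unknown": False,
    "sylph_opts": "",
}


def build_sylph_options(params: Dict[str, ParamValue]) -> List[str]:
    """Merge user parameters over the defaults and emit an option list."""
    merged = {**DEFAULTS, **params}
    opts: List[str] = []
    for name, flag in FLAG_MAP.items():
        opts += [flag, str(merged[name])]
    if merged["sylph_estimate_unknown"]:
        opts.append("--estimate-unknown")  # assumed flag name
    # Extra options arrive as one quoted string; split them shell-style.
    opts += shlex.split(str(merged["sylph_opts"]))
    return opts
```

Calling `build_sylph_options({"sylph_min_ani": 97})` overrides only the ANI threshold and leaves every other value at its documented default.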

Used By

Subworkflows

  • sylph - Profile microbial composition using Sylph.

Workflows

  • sylph - Taxonomic profiling by abundance-corrected MinHash.

Citations

If you use this in your analysis, please cite the following.

Version

SYLPH_PROFILE:
- sylph: 0.9.0