Metagenomic Analysis using Phylogenetic Placement -- A Review of the First Decade

Published 3 years ago | Version 2 | arXiv:2202.03534

Authors

Lucas Czech, Alexandros Stamatakis, Micah Dunthorn, Pierre Barbera

Categories

q-bio.PE, q-bio.GN

Abstract

Phylogenetic placement refers to a family of tools and methods to analyze, visualize, and interpret the tsunami of metagenomic sequencing data generated by high-throughput sequencing. Compared to alternative (e.g., similarity-based) methods, it puts metabarcoding sequences into a phylogenetic context using a set of known reference sequences and taking evolutionary history into account. Thereby, one can increase the accuracy of metagenomic surveys and eliminate the requirement for having exact or close matches with existing sequence databases. Phylogenetic placement constitutes a valuable analysis tool per se, but also entails a plethora of downstream tools to interpret its results. A common use case is to analyze species communities obtained from metagenomic sequencing, for example via taxonomic assignment, diversity quantification, sample comparison, and identification of correlations with environmental variables. In this review, we provide an overview of the methods developed during the first ten years. In particular, the goals of this review are (i) to motivate the usage of phylogenetic placement and illustrate some of its use cases, (ii) to outline the full workflow, from raw sequences to publishable figures, including best practices, (iii) to introduce the most common tools and methods and their capabilities, (iv) to point out common placement pitfalls and misconceptions, and (v) to showcase typical placement-based analyses, and how they can help to analyze, visualize, and interpret phylogenetic placement data.

Published Feb 7, 2022
