Benchmarking genome assembly methods on metagenomic sequencing data

Zhenmiao Zhang, Chao Yang, Werner Pieter Veldsman, Xiaodong Fang, Lu Zhang*

*Corresponding author for this work

Research output: Contribution to journalJournal articlepeer-review

1 Citation (Scopus)

Abstract

Metagenome assembly is an efficient approach to reconstruct microbial genomes from metagenomic sequencing data. Although short-read sequencing has been widely used for metagenome assembly, linked- and long-read sequencing have shown their advancements in assembly by providing long-range DNA connectedness. Many metagenome assembly tools were developed to simplify the assembly graphs and resolve the repeats in microbial genomes. However, there remains no comprehensive evaluation of metagenomic sequencing technologies, and there is a lack of practical guidance on selecting the appropriate metagenome assembly tools. This paper presents a comprehensive benchmark of 19 commonly used assembly tools applied to metagenomic sequencing datasets obtained from simulation, mock communities or human gut microbiomes. These datasets were generated using mainstream sequencing platforms, such as Illumina and BGISEQ short-read sequencing, 10x Genomics linked-read sequencing, and PacBio and Oxford Nanopore long-read sequencing. The assembly tools were extensively evaluated against many criteria, which revealed that long-read assemblers generated high contig contiguity but failed to reveal some medium- and high-quality metagenome-assembled genomes (MAGs). Linked-read assemblers obtained the highest number of overall near-complete MAGs from the human gut microbiomes. Hybrid assemblers using both short- and long-read sequencing were promising methods to improve both total assembly length and the number of near-complete MAGs. This paper also discussed the running time and peak memory consumption of these assembly tools and provided practical guidance on selecting them.

Original languageEnglish
Article numberbbad087
Number of pages17
JournalBriefings in Bioinformatics
Volume24
Issue number2
DOIs
Publication statusPublished - Mar 2023

Scopus Subject Areas

  • Information Systems
  • Molecular Biology

User-Defined Keywords

  • genome assembly tools
  • linked-read sequencing
  • long-read sequencing
  • metagenome-assembled genome
  • metagenomic sequencing
  • short-read sequencing

Fingerprint

Dive into the research topics of 'Benchmarking genome assembly methods on metagenomic sequencing data'. Together they form a unique fingerprint.

Cite this