Evaluation of sample pooling for gene sequencing of SARS-CoV-2: a simulation study

Authors

  • Heng Chen Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Yue Cheng Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Xun He Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Yuzhen Zhou Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Wenjun Xie Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Danyun Shen Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Zhiqun He Chengdu Center for Disease Control and Prevention, Chengdu, 610047, China
  • Ruidan Li Chengdu Center for Disease Control and Prevention, Chengdu, 610047, China
  • Weixuan Liu Chengdu Center for Disease Control and Prevention, Chengdu, 610047, China
  • Liang Wang Chengdu Workstation for Emerging Infectious Disease Control and Prevention, Chinese Academy of Medical Sciences, Chengdu, 610047, China
  • Xuejun Zhang Institute of Blood Transfusion, Chinese Academy of Medical Sciences, Chengdu, 610066, China

DOI:

https://doi.org/10.3855/jidc.20348

Keywords:

SARS-CoV-2, sample pooling, gene sequencing, simulation study

Abstract

Introduction: Coronavirus disease 2019 (COVID-19) continues to pose a significant public health threat, requiring epidemiological and genomic surveillance. Next-generation sequencing (NGS) is commonly used to monitor viral evolution, but its cost is high. This study evaluated pooled sequencing as a cost-effective tool for monitoring virus variants.

Methodology: A simulation study was conducted to evaluate the efficacy of sample pooling for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing. In total, 72 raw sequencing datasets from original samples of different genotypes were collected and combined to create 70 simulated pooled samples according to five pooling strategies. A bioinformatics tool based on Freyja was used to analyze the variant composition of these 70 simulated pooled samples. The efficiency of recovering the correct genotypes of the original samples was compared across pooling strategies, result reports, and genotypes using R software.

Results: The genetic composition of the pooled samples mostly recovered the genotype compositions of the original samples, with discrepancies between the top X results (where X is the number of original samples in the pool) and the complete results (p < 0.05). Variability in the identification efficiency of genotypes was observed across the five pooling strategies in the reports of the top X results (p < 0.05), but not in the reports of the complete results (p > 0.05). Some low-quality original samples were not accurately identified.
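
The difference between the top X report and the complete report can be illustrated with a minimal Python sketch. It is not the authors' R analysis: the function name `recovery_rate`, the lineage labels, and the abundance values are hypothetical, and it assumes that each pool's result is available as a mapping from genotype to estimated abundance and that the original samples in a pool have distinct genotypes.

```python
def recovery_rate(original_genotypes, abundances, top_only):
    """Fraction of the pool's original genotypes present in the report.

    original_genotypes: genotypes of the samples that went into the pool
                        (assumed distinct; X is their count).
    abundances: hypothetical mapping of genotype -> estimated abundance
                for the pooled sample.
    top_only: if True, keep only the X most abundant genotypes (top X
              report); if False, use every reported genotype (complete
              report).
    """
    truth = set(original_genotypes)
    # Rank reported genotypes from most to least abundant.
    ranked = sorted(abundances, key=abundances.get, reverse=True)
    if top_only:
        ranked = ranked[:len(truth)]
    reported = set(ranked)
    return len(truth & reported) / len(truth)


# Hypothetical pool of three samples; the values are illustrative only.
truth = ["BA.5.2", "XBB.1.5", "BF.7"]
abundances = {"BA.5.2": 0.50, "XBB.1.5": 0.30, "BA.2": 0.12, "BF.7": 0.08}

top_x = recovery_rate(truth, abundances, top_only=True)
complete = recovery_rate(truth, abundances, top_only=False)
print(f"top X: {top_x:.2f}, complete: {complete:.2f}")
```

In this made-up pool, a spurious low-abundance call outranks one true genotype, so the top X report misses it while the complete report still contains it. This is one way the two report types could diverge.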

Conclusions: Sample pooling coupled with streamlined genotyping offers a promising approach for cost-effective gene sequencing of SARS-CoV-2, which will aid in COVID-19 genomic surveillance.

Published

2025-01-31

How to Cite

1.
Chen H, Cheng Y, He X, Zhou Y, Xie W, Shen D, He Z, Li R, Liu W, Wang L, Zhang X (2025) Evaluation of sample pooling for gene sequencing of SARS-CoV-2: a simulation study. J Infect Dev Ctries 19:1–8. doi: 10.3855/jidc.20348

Section

Coronavirus Pandemic