Perelman School of Medicine at the University of Pennsylvania

Grice Lab

HmmUFOtu


Introduction

HmmUFOtu is an HMM-based ultra-fast OTU assignment tool for bacterial 16S and amplicon sequencing research. It has two core algorithms: the CSFM-index (Consensus Sequence FM-index) powered banded-HMM algorithm, and the SEP (Seed-Estimate-Place) local phylogenetic-placement-based taxonomy assignment algorithm.

The main program hmmufotu takes single or paired-end NGS FASTA/FASTQ reads and generates a taxonomy assignment result for every read. A second main program, hmmufotu-sum, then generates phylogeny-based OTUs, a reference-tree-based OTU tree, and consensus-based representative sequences for the OTUs. See GitHub for details.

Supported models

HmmUFOtu supports all major DNA substitution models and an optional Discrete Gamma (dΓ) model (Yang 1994) for capturing among-site rate variation.
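
For readers unfamiliar with the Discrete Gamma model, its general form (following Yang 1994) can be sketched as below. This is the textbook formulation, not a description of HmmUFOtu's internals: the number of categories K, the shape parameter α, and the way each category rate r_k is chosen (category mean or median) are assumptions here and may differ in the actual implementation.

```latex
% Substitution rate r at a site follows a gamma distribution with
% shape \alpha and mean 1 (so the rate parameter also equals \alpha)
g(r;\alpha) = \frac{\alpha^{\alpha}}{\Gamma(\alpha)}\, r^{\alpha-1} e^{-\alpha r}, \qquad r > 0

% The distribution is cut into K equal-probability categories, each
% represented by a single rate r_k; the likelihood of site x_i is the
% average over the K categories
L(x_i) = \frac{1}{K} \sum_{k=1}^{K} L(x_i \mid r_k)
```

A small α corresponds to strong rate heterogeneity across sites, while a large α approaches a single uniform rate.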

Download

Please download the source code (written in pure C++98) or pre-compiled binaries from GitHub.

Pre-built databases

You need an HmmUFOtu database before assigning taxonomies to your 16S or other amplicon sequencing reads. You can build your own database using hmmufotu-build (which may take about 10 minutes with 6 processors), or download one of the pre-built databases below.

  • gg_97_otus_GTR GreenGenes (v13.8) species-level (97% OTU) reference + GTR DNA model. This is recommended for most bacterial 16S studies.
  • gg_97_otus_TN93 GreenGenes (v13.8) species-level (97% OTU) reference + TN93 DNA model
  • gg_97_otus_HKY85 GreenGenes (v13.8) species-level (97% OTU) reference + HKY85 DNA model
  • gg_79_otus_GTR GreenGenes (v13.8) middle-level (79% OTU) reference + GTR DNA model
  • gg_79_otus_TN93 GreenGenes (v13.8) middle-level (79% OTU) reference + TN93 DNA model
  • gg_79_otus_HKY85 GreenGenes (v13.8) middle-level (79% OTU) reference + HKY85 DNA model

Citations

HmmUFOtu is currently under peer review at Genome Biology.

Contact us

Please contact Qi Zheng or Elizabeth Grice with any questions.