Why using CODA?
Microbiome count data is compositional since their total are constrained by the sequencing depth. Relative abundances (proportions) are obviously constraint by a sum equal to one. This total constraint induces strong dependencies among the observed abundances of the different taxa. In fact, nor the absolute abundance (read counts) nor the relative abundance (proportion) of one taxon alone are informative of the real abundance of the taxon in the environment. Instead, they provide information on the relative measure of abundance when compared to the abundance of other taxa in the same sample.
Ignoring the compositional nature of microbiome data can lead to spurious results. This is especially critical when dealing with microbiome variable selection. In this context, several simulation studies have shown that commonly used differential abundance tests provide large false positive rates (Weiss et al. 2017, Nearing et al. 2022).
Weiss et al. Microbiome 2017
Nearing et al. Nat.Comm 2022
Nearing, J.T. et al. (2022) Microbiome differential abundance methods produce different results across 38 datasets. Nat.Comm, 13, 342.
Weiss, S. et al. (2017) Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome, 5, 27.
Comments