coda4microbiome and Selbal
Differences with Selbal
Both coda4microbiome (Calle and Susin 2022) and Selbal (Rivera-Pinto et al. 2018) search for two groups of taxa, A and B, that are jointly associated with the outcome of interest Y.
The main differences between coda4microbiome and Selbal are (1) the model for combining the relative abundances of the taxa in groups A and B, (2) the process for selecting the taxa that constitute the microbial signature, and (3) the type of study that each method can handle.
- Model:
  - Selbal: ilr balance between A and B, that is, $B(A,B)=\log(g(A)/g(B))$ (up to a normalizing constant), the log-ratio of the geometric mean abundance of the taxa in group A to the geometric mean abundance of the taxa in group B.
  - coda4microbiome: log-contrast, that is, $\sum_i a_i \log(x_i)$ with the constraint $\sum_i a_i=0$. Taxa with a positive coefficient are in group A, taxa with a negative coefficient are in group B, and taxa with a zero coefficient are not part of the microbial signature.
- Variable selection algorithm:
  - Selbal: Forward selection
  - coda4microbiome: Penalized regression
- Studies:
  - Selbal: Cross-sectional studies
  - coda4microbiome: Cross-sectional and longitudinal studies
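The log-contrast model can be illustrated with a minimal Python sketch. It is not the coda4microbiome implementation (the package itself is written in R); the function names `log_contrast` and `signature_groups`, the taxon names, the coefficients and the abundances are all hypothetical and chosen only for illustration. The sketch evaluates $\sum_i a_i \log(x_i)$, reads groups A and B off the signs of the coefficients, and checks that the zero-sum constraint makes the score independent of the total abundance of the sample.

```python
import numpy as np


def log_contrast(x, a):
    """Return sum_i a_i * log(x_i) for strictly positive abundances x."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    if np.any(x <= 0):
        # Assumption: zeros have already been handled (e.g. imputed) upstream.
        raise ValueError("abundances must be strictly positive")
    if not np.isclose(a.sum(), 0.0):
        raise ValueError("log-contrast coefficients must sum to zero")
    return float(np.dot(a, np.log(x)))


def signature_groups(a, taxa):
    """Split taxa by coefficient sign: A (positive), B (negative), unselected (zero)."""
    group_a = [t for t, c in zip(taxa, a) if c > 0]
    group_b = [t for t, c in zip(taxa, a) if c < 0]
    unselected = [t for t, c in zip(taxa, a) if c == 0]
    return group_a, group_b, unselected


# Hypothetical example: four taxa, coefficients summing to zero.
taxa = ["taxon1", "taxon2", "taxon3", "taxon4"]
a = [0.5, 0.5, -1.0, 0.0]
x = [0.10, 0.40, 0.05, 0.45]  # relative abundances of one sample

score = log_contrast(x, a)
group_a, group_b, unselected = signature_groups(a, taxa)

# Because the coefficients sum to zero, rescaling the whole sample
# (e.g. proportions vs. counts) leaves the score unchanged.
rescaled = log_contrast([1000 * v for v in x], a)

print(score, rescaled)
print(group_a, group_b, unselected)
```

In this toy example taxon1 and taxon2 form group A, taxon3 forms group B, and taxon4 is left out of the signature.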
In summary, coda4microbiome improves on Selbal in three ways: it considers a more general model (an ilr balance is a special case of a log-contrast), it uses a more powerful variable selection process (forward selection does not guarantee a global optimum), and it can be used in both cross-sectional and longitudinal studies.
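To see why a balance is a special log-contrast, write $k_A$ and $k_B$ for the number of taxa in groups A and B (notation introduced here for illustration only) and expand the geometric means:

$$
\log\frac{g(A)}{g(B)} = \frac{1}{k_A}\sum_{i\in A}\log(x_i) - \frac{1}{k_B}\sum_{j\in B}\log(x_j) = \sum_i a_i \log(x_i),
\qquad
a_i =
\begin{cases}
1/k_A & \text{if } i \in A,\\
-1/k_B & \text{if } i \in B,\\
0 & \text{otherwise.}
\end{cases}
$$

These coefficients satisfy $\sum_i a_i = k_A \cdot \frac{1}{k_A} - k_B \cdot \frac{1}{k_B} = 0$, so the balance is a log-contrast in which all taxa within a group share the same weight. Multiplying by a normalizing constant rescales every $a_i$ equally and keeps the sum at zero. A general log-contrast drops the equal-weight restriction and allows a different coefficient for each taxon.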
References:
Calle, M.L. and Susin, A. (2022) Identification of dynamic microbial signatures in longitudinal studies. bioRxiv, https://www.biorxiv.org/content/10.1101/2022.04.25.489415v1
Rivera-Pinto, J. et al. (2018) Balances: a New Perspective for Microbiome Analysis. mSystems, 3, 4.