I achieved my goal for this project: to assess how well specific defined communities recapitulate a normal microbiome. But there are many ways this approach could be improved.

Improvements

So far I have taken a functional approach that ignores individual genes in favor of functional pathways. Another approach would use the presence/absence of specific genes (or gene clusters) instead. I could then assess the functional importance of these genes after the fact.
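As a rough illustration of the gene-level idea, here is a minimal sketch that builds a presence/absence profile for a community and scores how much of a reference gene set it covers. The species names, gene names, and the `gene_profile` and `coverage` helpers are all hypothetical, made up for this example; this is not the scoring I actually used.

```python
def gene_profile(community, genomes, reference_genes):
    """Map each reference gene to True if any community member carries it."""
    carried = set()
    for member in community:
        carried |= genomes[member]
    return {gene: gene in carried for gene in reference_genes}


def coverage(profile):
    """Fraction of reference genes present in the community."""
    if not profile:
        return 0.0
    return sum(profile.values()) / len(profile)


# Hypothetical toy data: which genes each species carries.
genomes = {
    "sp1": {"geneA", "geneB"},
    "sp2": {"geneB", "geneC"},
    "sp3": {"geneD"},
}
reference_genes = {"geneA", "geneB", "geneC", "geneD"}

profile = gene_profile(["sp1", "sp2"], genomes, reference_genes)
score = coverage(profile)
```

Because the profile keeps one entry per gene, the per-gene information is still there to inspect after the community has been scored.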

Currently I compress each simulated community down to a single number, but each community actually contains a lot of functional information. I could use this information to understand which functions (or genes) are missing in specific defined communities.
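One simple way to get at this would be a set difference between the functions found in a normal microbiome and the functions covered by a defined community. The sketch below assumes each community member is described by a set of function labels; the member names, function labels, and the `missing_functions` helper are hypothetical.

```python
def missing_functions(community, member_functions, reference_functions):
    """Return the reference functions that no community member provides."""
    present = set()
    for member in community:
        present |= member_functions[member]
    return sorted(reference_functions - present)


# Hypothetical toy data: functions provided by each community member.
member_functions = {
    "sp1": {"glycolysis", "pectin degradation"},
    "sp2": {"glycolysis", "lactate fermentation"},
}
reference_functions = {
    "glycolysis",
    "pectin degradation",
    "lactate fermentation",
    "amino acid biosynthesis",
}

gaps = missing_functions(["sp1"], member_functions, reference_functions)
```

The same function works for genes instead of pathways if the sets hold gene identifiers.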

Also I’d love to use a self-organizing map for this somehow…

Next steps

My predictions about defined communities should be validated in vivo. To do this I could inoculate different defined communities into germ-free bees and assess the phenotypic effects on these bees. Do some communities allow better utilization of sugar? Do they change susceptibility to pathogens? Only time (and a lot of work) will tell.