Nutrition-wide association study of microbiome diversity and composition in colorectal cancer patients

Background The effects of diet on the interaction between microbes and host health have been widely studied. However, its effects on the gut microbiota of patients with colorectal cancer (CRC) have not been elucidated. This study aimed to investigate the association between diet and the overall diversity and different taxonomic levels of the gut microbiota in CRC patients using a nutrition-wide association approach.
Methods This hospital-based study used data from 115 CRC patients who underwent CRC surgery in the Department of Surgery, Seoul National University Hospital. Spearman correlation analyses were conducted between 216 dietary features and three alpha-diversity indices, the Firmicutes/Bacteroidetes ratio, and the relative abundance of 439 gut microbial taxa. To identify the main enterotypes of the gut microbiota, we performed principal coordinate analysis based on the β-diversity index. Finally, we performed linear regression to examine the association between dietary intake and the main microbiome features, and linear discriminant analysis effect size (LEfSe) to identify bacterial taxa phylogenetically enriched in the low and high diet consumption groups.
Results Several bacteria were enriched in patients with higher consumption of mature pumpkin/pumpkin juice (ρ, 0.31 to 0.41) but lower intake of eggs (ρ, -0.32 to -0.26). We observed negative correlations between Bacteroides fragilis abundance and intake of pork (belly), beef soup with vegetables, animal fat, and fatty acids (ρ, -0.34 to -0.27); an inverse correlation was also observed between Clostridium symbiosum abundance and intake of some fatty acids, amines, and amino acids (ρ, -0.30 to -0.24). Furthermore, high intake of seaweed was associated with a 6% (95% CI, 2% to 11%) and 7% (95% CI, 2% to 11%) lower abundance of Rikenellaceae and Alistipes, respectively, whereas overall beverage consumption was associated with a 10% (95% CI, 2% to 18%) higher abundance of Bacteroidetes, Bacteroidia, and Bacteroidales, compared to that in the low intake group. LEfSe analysis identified phylogenetically enriched taxa associated with the intake of sugars and sweets, legumes, mushrooms, eggs, oils and fats, plant fat, carbohydrates, and monounsaturated fatty acids.
Conclusions Our data elucidate the diet-microbe interactions in CRC patients. Additional research is needed to understand the significance of these results in CRC prognosis.
Supplementary Information The online version contains supplementary material available at 10.1186/s12885-022-09735-6.


Introduction
The gut microbiota of humans is a complex community comprising bacteria, archaea, and eukarya, with approximately 100 trillion microorganisms [1,2]. It can interact with the host through several physiological processes, such as gut integrity consolidation, intestinal epithelium shaping, food digestion and energy metabolism, pathogen protection, and host immunity regulation [1][2][3]. Microbes begin colonizing the human gut immediately after birth; the gut microbiota community rapidly develops until the age of 3 years, gradually diversifies until the age of 40 years, and remains stable thereafter [4,5]. However, the abundance and diversity of the gut microbiome are affected not only by host genetics but also by the health conditions of the host, such as inflammation, metabolic diseases, and cancer [3][6][7][8].
In colorectal cancer (CRC) patients, tumorigenesis may alter the surrounding microenvironment, facilitate microbial translocation from the lumen to the lamina, and enhance the proliferation of opportunistic bacteria [9][10][11]. Overabundance of genera Prevotella, Fusobacterium, Parvimonas, Porphyromonas, Peptostreptococcus, Bacteroides, and Gemella has been observed in CRC patients compared to that in healthy individuals [9]. Even after CRC surgery, the composition of the gut microbiota varies between those with newly developed adenoma (similar to the gut microbiota of CRC patients) and those with a clean intestine (similar to the gut microbiota of healthy individuals) [12].
In the gastrointestinal tract, the microbiota plays a vital role in the fermentation of non-digestible components, especially in the production of short-chain fatty acids (SCFAs), including acetate (central appetite regulation), propionate (gluconeogenesis and satiety signaling regulation), and butyrate (main energy source for human colonocytes) [13]. Numerous studies have demonstrated the effects of various dietary features on the gut microbiota. In general, a Western diet can inhibit mucus production and increase the penetrability of the colonic mucus barrier, which co-occurs with a shift toward a microbial community characterized by lower production of SCFAs due to fiber deficiency [14]. In contrast, individuals who consume more fiber, which is abundant in plant-based diets, show a more diverse and stable microbial community and an increased abundance of SCFA-producing and lactic acid bacteria [14]. Individuals on polyphenol-based diets, another type of plant-based diet, have shown a high abundance of Bifidobacterium and Lactobacillus, which have anti-pathogenic and anti-inflammatory effects [15]. Given the geographical variation in both food culture and microbiota structure [16,17], recent studies have been conducted to elucidate the diet-microbiome relationship in the Korean population [18,19]. However, to the best of our knowledge, the relationship between diet and the microbiota in CRC patients, especially the effect of diet on CRC prognosis-related microbiota, has not been studied.
Therefore, in the present study, we performed a nutrition-wide association study to elucidate the effect of different dietary features on microbiome diversity and composition and to evaluate the diet-microbiome association in CRC patients. Understanding the microbial response to diet in CRC patients is an important step toward the development of therapeutic strategies based on dietary interventions to prevent recurrence and improve the prognosis of CRC.
[...] and the average portion size of 106 food items were recorded to estimate the average weight of and energy intake from food items consumed during the previous year. Daily intake of 106 food items, 663 food subitems, and 92 nutrients was calculated using the Computer-Aided Nutritional Analysis Program (CAN-Pro) 4.0 (The Korean Nutrition Society, Seoul, Korea). The consumption of 663 subitems was then classified into 16 groups based on the nutrient profiles and culinary usage of each food item. Additionally, the residual method was used for the energy adjustment of the 106 food items, 16 food groups, and 92 nutrients [21]. Furthermore, we derived dietary patterns using principal component analysis (PCA). We constructed a scree plot to represent the variability of food groups explained by the dietary patterns. Food groups with absolute factor loadings ≥ 0.20 were considered to have dominant contributions to the distinctive dietary pattern [22]. Moreover, we applied k-means clustering, using the 'factoextra' package [23], to the scores of the first two principal components to divide the study participants into dietary groups.
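The residual method of energy adjustment mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which each dietary variable is regressed on total energy intake by simple linear regression, and the residual is added to the intake predicted at the mean energy level; the function name `energy_adjust` and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean

def energy_adjust(intake, energy):
    """Residual-method energy adjustment (illustrative sketch).

    intake: daily intake of one food item or nutrient, one value per participant
    energy: total daily energy intake for the same participants
    """
    e_bar, x_bar = mean(energy), mean(intake)
    sxx = sum((e - e_bar) ** 2 for e in energy)
    sxy = sum((e - e_bar) * (x - x_bar) for e, x in zip(energy, intake))
    slope = sxy / sxx
    intercept = x_bar - slope * e_bar
    # Intake predicted at the mean energy level (assumed constant added back)
    expected_at_mean = intercept + slope * e_bar
    # Adjusted intake = residual from the regression + predicted intake at mean energy
    return [x - (intercept + slope * e) + expected_at_mean
            for x, e in zip(intake, energy)]

# Hypothetical data for four participants
adjusted = energy_adjust([10.0, 14.0, 15.0, 21.0], [1500.0, 2000.0, 2500.0, 3000.0])
print(adjusted)
```

By construction, the adjusted values keep the original mean intake and are uncorrelated with total energy intake.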

DNA extraction and 16S rRNA gene sequencing
DNA was isolated from fecal samples using the DNeasy power soil kit (Qiagen, Hilden, Germany) and quantified using the Quant-IT PicoGreen kit (Invitrogen), according to the manufacturers' instructions. Sequencing was performed a median of 24 days (interquartile range, 16-41 days) after the date of sample collection. Sequencing libraries were prepared according to the Illumina 16S metagenomic sequencing library protocols to amplify the V3 and V4 regions of the bacterial 16S rRNA gene. The universal primer pair with Illumina adapter overhang sequences used for the first amplification was as follows:
V3-forward primer: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3'
V4-reverse primer: 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3'
After sequencing, FASTQ files were generated from the MiSeq raw data using the MiSeq control software v2.2 and bcl2fastq (v1.8.4), and the PhiX sequence was removed using BWA. Paired-end reads from each sample were assembled into single sequences using FLASH (1.2.11). After low-quality, ambiguous, and chimeric sequences, which were regarded as sequencing errors, were removed using the CD-HIT-OTU program, the reads were clustered into operational taxonomic units (OTUs). A 16S rRNA sequence similarity threshold of 97% was used to define species-level OTUs.
For the representative sequence of each OTU, BLASTN (v2.4.0) was run against the nucleotide sequences in the reference database (NCBI 16S Microbial), and taxonomy was assigned from the sequence with the highest similarity; if the query coverage of the best hit was less than 85% or the identity of the matched region was less than 85%, the taxonomy was left undefined. A comparative analysis of the microbial communities was performed in QIIME (v1.8) using the OTU abundance and taxonomic information.
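The assignment rule described above can be sketched as a small filter. This is an illustration only: the hit records are hypothetical dictionaries rather than real BLASTN output, and taking the hit with the highest percent identity as the "best hit" is an assumption, since the text does not state how ties or scores were handled.

```python
def assign_taxonomy(hits, min_coverage=85.0, min_identity=85.0):
    """Return the taxon of the best hit, or None if taxonomy is left undefined.

    hits: list of dicts with hypothetical keys 'taxon', 'identity' and
          'coverage' (both in percent) for one OTU representative sequence.
    """
    if not hits:
        return None
    # Assumption: the best hit is the one with the highest percent identity
    best = max(hits, key=lambda h: h["identity"])
    # Undefined if either query coverage or identity falls below 85%
    if best["coverage"] < min_coverage or best["identity"] < min_identity:
        return None
    return best["taxon"]

# Hypothetical hits for one OTU
example_hits = [
    {"taxon": "Bacteroides fragilis", "identity": 99.1, "coverage": 100.0},
    {"taxon": "Bacteroides ovatus", "identity": 95.4, "coverage": 100.0},
]
print(assign_taxonomy(example_hits))
```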

Descriptive statistics
To examine the distribution of demographics and lifestyles between the low fruit-vegetable and high fruit/low meat-poultry dietary groups, the Wilcoxon and chi-square tests were applied for continuous and categorical variables, respectively.

Microbial diversity and relative abundance
Rare species with mean relative abundances lower than 1 × 10⁻⁶ and/or unspecified phylum/class/order/family/genus were excluded during the construction of the phylogenetic tree. To examine within-sample diversity, we used the 'vegan' package and calculated α-diversity indices, including Chao1, Shannon, and Simpson indices, which represent the richness, evenness, and both the richness and evenness of the microbial community, respectively [24].
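For orientation, the three α-diversity indices can be written out in a few lines. The sketch below is not the 'vegan' implementation; it assumes the bias-corrected form of Chao1, the natural logarithm for the Shannon index, and the Simpson index expressed as one minus the sum of squared proportions. The OTU counts are hypothetical.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from OTU counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson index expressed as 1 - sum(p^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical OTU counts for one fecal sample
sample = [10, 5, 2, 1, 1, 0]
print(chao1(sample), shannon(sample), simpson(sample))
```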
To identify the main enterotypes of the gut microbiota, we performed principal coordinate analysis (PCoA) based on the β-diversity index calculated using the Jensen-Shannon divergence distance. This measure may be more efficient in capturing compositional changes with low-abundance factors and can be more stable than the Euclidean, Manhattan, hypersphere, and Aitchison-based distance measures [25]. We then divided the study participants into distinct enterotypes using the k-medoids method in the 'cluster' package [26]. To characterize the microbial composition, zero values in the microbial data were imputed using Bayesian-multiplicative replacement, a compositional approach implemented in the 'zCompositions' package. The abundance data were then converted into proportions (relative abundance) [27,28]. Additionally, given that compositional data points do not map to Euclidean space but to the Aitchison simplex, the compositions were converted into real space using a log-ratio transformation [27].
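The two transformations in this paragraph can be sketched as follows. Two assumptions are made that the text does not confirm: the distance is taken as the square root of the Jensen-Shannon divergence with natural logarithms, and the log-ratio transformation is shown in its centered (CLR) form. The relative-abundance vectors are hypothetical and must already be free of zeros before the log-ratio step.

```python
import math

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two compositions."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))

def clr(composition):
    """Centered log-ratio transform; requires strictly positive parts."""
    logs = [math.log(x) for x in composition]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

# Hypothetical relative abundances of three taxa in two samples
sample_a = [0.6, 0.3, 0.1]
sample_b = [0.2, 0.3, 0.5]
print(js_distance(sample_a, sample_b))
print(clr(sample_a))
```

The pairwise distance matrix built from `js_distance` is what a PCoA or k-medoids routine would take as input.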

Microbial network structure
The network structure for pairwise correlations of the microbial community was constructed using the Gaussian graphical model (GGM) approach. In a GGM, a missing edge between two nodes represents the conditional independence of these nodes given the remaining nodes [29,30]. The network structure was estimated using the lasso regularization method from the 'glasso' package to retain only the more robust edges [31]. In general, increasing the level of regularization (the lambda, or tuning, parameter) shrinks the coefficient estimates relating one node to the remaining nodes toward zero, leading to a sparser network. The optimal regularization parameter was selected by the cross-validation procedure for glasso in the 'nethet' package to ensure both the screening and sparsity assumptions of the network [32,33].
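The link between a GGM and its edges can be made concrete with a short sketch. It does not perform the graphical lasso estimation itself; it assumes that a precision (inverse covariance) matrix has already been estimated and shows how partial correlations and edges follow from it. The 3 × 3 matrix is hypothetical.

```python
import math

def partial_correlations(precision):
    """Partial correlation matrix: rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)."""
    n = len(precision)
    return [[1.0 if i == j
             else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)] for i in range(n)]

def edges(precision, tol=1e-12):
    """Node pairs with a non-zero precision entry, i.e. conditionally dependent pairs."""
    n = len(precision)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(precision[i][j]) > tol]

# Hypothetical precision matrix for three taxa: taxa 0 and 2 are
# conditionally independent given taxon 1, so no edge joins them.
theta = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
print(partial_correlations(theta))
print(edges(theta))
```

A stronger lasso penalty drives more off-diagonal entries of the estimated precision matrix to exactly zero, which removes the corresponding edges and yields the sparser networks described above.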

Correlation analysis
The Spearman correlation coefficients between different dietary features and the microbial composition and diversity were calculated using the nutrition-wide association approach.
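As a reminder of what is computed for each diet-taxon pair, the Spearman coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration with hypothetical intake and abundance values, not the code used in the study.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values assigned the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical intake (g/day) and relative abundance for five participants
intake = [12.0, 30.5, 7.2, 18.0, 25.1]
abundance = [0.02, 0.09, 0.01, 0.03, 0.05]
print(spearman(intake, abundance))
```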

Association analysis
Based on the consumption of two diets, 16 food groups, six macronutrients, and three fatty acids, the participants were categorized into low intake and high intake groups based on the median value of consumption. Their associations with the relative abundance of major enterotypes, Firmicutes/Bacteroidetes (F/B) ratio, and α-diversity indices were investigated using multivariable regression analysis after adjustment for age, sex, family history of CRC, neoadjuvant therapy, smoking status, alcohol consumption, BMI, AJCC stage, and comorbidity.
Furthermore, we implemented linear discriminant analysis (LDA) effect size (LEfSe) on the Galaxy server to identify taxa (at the phylum, class, order, family, genus, and species levels) that differed significantly by consumption status. Features were considered discriminative at a logarithmic LDA score threshold of 2.0 and a Wilcoxon test p-value threshold of 0.01.

Identification of dietary patterns
We identified two main dietary patterns using PCA, which explained 69.5% of the variability of all food groups (Additional file 1: Figure S1). Considering food groups with absolute factor loadings ≥ 0.20 to have dominant contributions to the distinctive dietary pattern, the low fruit-vegetable pattern was characterized by high intake of cereals and grains (0.64) and low intake of vegetables (-0.30) and fruits (-0.69), whereas the high fruit/low meat-poultry pattern was characterized by high intake of cereals and grains (0.62) and fruits (0.68) and low intake of meat and poultry (-0.25) (Table 1). After applying k-means clustering analysis, the study participants were divided into two groups based on diet: the low fruit-vegetable (N = 94) and high fruit/low meat-poultry (N = 21) groups.

Characteristics of study participants according to dietary pattern groups
General characteristics
The distribution of demographics, lifestyle, and disease status of the study participants is presented in Table

Microbial abundance according to dietary patterns
The abundance of microbial taxa at the phylum, class, order, family, genus, and species levels according to dietary pattern groups is shown in Fig. 1. In general, the microbial community appeared to be dominated by Bacteroidetes and Firmicutes at the phylum level, by Bacteroidia and Clostridia at the class level, and by Bacteroidales and Clostridiales at the order level. Additionally, the Wilcoxon test revealed a significant difference between the two groups in terms of the relative abundance of class RF3 (p = 0.01), orders ML615J-28 (p = 0.01), RF32 (p = 0.03), and Spirochaetales (p = 0.04), families RF16, S24-7, and Spirochaetaceae (p = 0.04), and genera Acidaminococcus, Anaerococcus, Butyrivibrio, Enterobacter, Megamonas, and Treponema (p = 0.04).

Microbial network structure
Additional file 1: Figures S2-S6 and Additional file 2: Table S2 present differences in the interconnected relationships of the microbial community between the two diet groups. Considering the dominant taxa, Bacteroidetes was negatively correlated with Actinobacteria in [...] Figure S2). At the class level, Bacteroidia was negatively correlated with Betaproteobacteria in the low fruit-vegetable dietary group, whereas the pairwise correlation was positive in the high fruit/low meat-poultry dietary group (Additional file 1: Figure S3). Nevertheless, given the higher tuning parameter, the GGM networks of microbial taxonomy for the high fruit/low meat-poultry dietary group were sparser than those for the low fruit-vegetable dietary group.

Identification of main enterotypes in colorectal cancer patients
Enterotypes of the fecal microbiota among CRC patients based on the Jensen-Shannon divergence distance algorithm are presented in Fig

Correlation of dietary intake with microbial diversity and composition
Diet and microbial alpha-diversity
Additional file 2: Table S1 displays the Spearman correlations of log-transformed alpha-diversity indices, including Chao1, Shannon, and Simpson indices, with continuous intake of 106 food items, 16 food groups, two dietary patterns, and 92 nutrients. Overall, we did not observe any significant correlations between within-sample diversity and diet consumption in CRC patients (false discovery rate [FDR], p > 0.05).
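Because each diversity index was tested against a large number of dietary features, the crude p-values need a multiple-comparison correction. The text reports FDR-adjusted p-values without naming the procedure; the sketch below assumes the Benjamini-Hochberg step-up method, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR-adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical crude p-values from four correlation tests
crude = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(crude))
```

With several hundred dietary features per index, a crude p-value must be very small to remain below 0.05 after this adjustment, which is consistent with no correlation surviving correction here.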

Diet and microbial composition
Additional file 2: [...] (Fig. 4). This included the enrichment of several bacteria with higher consumption of mature pumpkin or pumpkin juice (ρ, 0.31 to 0.41) but lower [...]). An inverse correlation was observed between the relative abundance of Clostridium symbiosum and intake of some fatty acids, amines, and amino acids (ρ, -0.30 to -0.24). In addition, the high fruit/low meat-poultry dietary pattern was inversely correlated only with the abundance of the genus Clostridium.
Table 3 presents the beta coefficients and their corresponding 95% CIs for the difference in taxon diversity and relative abundance between the low and high intake groups (food groups, macronutrients, and fatty acids). Linear regression models were fitted to the log-transformed alpha-diversity indices, F/B ratio, and relative abundance of dominant bacteria. Participants with a high consumption of seaweed showed a significantly lower relative abundance of family Rikenellaceae (β, -0.06; 95% CI, -0.11 to -0.02) and genus Alistipes (β, -0.07; 95% CI, -0.11 to -0.02) than those in the low intake group. These associations persisted after adjustment for age, sex, family history of colorectal cancer, smoking status, alcohol consumption, BMI, AJCC stage, and underlying diseases. Additionally, the participants with a high intake of beverages had a 10% (95% CI, 2% to 18%) higher relative abundance of Bacteroidetes, Bacteroidia, and Bacteroidales than those with a low intake of beverages in the multivariable model.
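The correspondence between a coefficient on the log scale and a percentage difference can be checked with a short calculation. The sketch assumes that the outcome was transformed with the natural logarithm, which the text does not state; under that assumption, a coefficient β for the high versus low intake group corresponds to a relative difference of (exp(β) - 1) × 100%, which is close to β × 100% when β is small.

```python
import math

def percent_difference(beta):
    """Relative difference (%) implied by a coefficient on a natural-log outcome."""
    return (math.exp(beta) - 1.0) * 100.0

# Coefficients reported in the text for seaweed (Rikenellaceae, Alistipes)
for beta in (-0.06, -0.07):
    print(beta, round(percent_difference(beta), 1))
```

For β = -0.06 this gives roughly a 5.8% lower relative abundance, in line with the approximately 6% reported.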

Linear discriminant analysis effect size
We performed the LEfSe analysis and constructed a cladogram to identify the phylogenetically enriched taxa in the low and high diet consumption groups. Of the 16 food groups, two dietary patterns, eight macronutrients, and three fatty acids, phylogenetically enriched taxa were identified according to the low and high intake of sugars and sweets, legumes, eggs, and oils and fats (Figs. 5A-D and 6A-D). In addition, enriched taxa were identified among patients in the low intake group of mushrooms (Alistipes indistinctus), plant fat (Actinobacteria), and carbohydrates (Bacteroides fragilis), and those in the high intake group of monounsaturated fatty acids (MUFAs) (Clostridium symbiosum) only (data not shown, LDA > 2.0, p < 0.01).

Discussion
This is the first study to elucidate both the within-sample diversity and individual components of the gut microbial community in association with dietary features in a cohort of Korean CRC patients. We carried out a nutrition-wide association study on the effect of consumption of 106 food items, 16 food groups, two dietary patterns, and 92 nutrients on the overall microbial diversity (species richness and/or evenness) and the abundance of 439 gut microbial taxa at different taxonomic levels. After adjustment for multiple comparisons, no significant correlations were observed between diet consumption and the overall richness and/or evenness of the gut microbiota. However, we identified some bacteria that were phylogenetically enriched with higher or lower consumption of sugars and sweets, legumes, mushrooms, eggs, oils and fats, plant fat, carbohydrates, and MUFAs.
Previous studies have reported three PCA-derived dietary patterns in the Korean population [22][34][35][36][37][38][39][40]: the traditional Korean diet, characterized by high intake of food items such as vegetables, seaweeds, fish, soy, and mushrooms [22]; the Western-style diet, characterized by high intake of different meats, fast food, and oil and sugar [22]; and the prudent pattern, characterized by high intake of fruits, milk, and dairy products and low intake of refined grains [22]. In the present study, we identified only two dietary patterns (low fruit-vegetable and high fruit/low meat-poultry). Both patterns had high intake of cereals and grains but differed in the factor loadings of vegetables, fruits, and meat and poultry; this could be due to the nature of data-driven methods, such as PCA, and to the variation in the habitual diets of CRC patients in comparison with those of the general population. Nevertheless, our PCA-derived dietary patterns are relevant to CRC patients because the components of the two dietary patterns have been linked to CRC prevention [41,42]. 
Furthermore, using clustering methods, we could classify study participants into separate dietary behavior groups and examine differences in their microbiome structure.
Our results for enterotypes are partially comparable to those of previous studies. In a cohort of 1,199 Korean adults, three enterotypes were identified, namely Bacteroidaceae, Prevotellaceae, and Ruminococcaceae [19]. In another cohort of 222 healthy Koreans, enterotypes including Bacteroidetes, Prevotella, and Ruminococcus were identified [18]. In the present study, we identified Rikenellaceae and Alistipes enterotypes instead of Ruminococcaceae and Ruminococcus at the family and genus levels, respectively. We also found that the main enterotypes did not separate from each other [18,19], except when Bacteroidetes and Alistipes were combined into a single enterotype. Our results, therefore, suggest a higher abundance of Alistipes and its family compared to that of Ruminococcus and its family in CRC patients. This finding is in line with the results of previous studies reporting a higher abundance of Ruminococcaceae in healthy controls than in CRC tumor samples [43,44]; an elevated abundance of Rikenellaceae in the mucosa of colon cancer patients compared to that in controls has also been reported [45]. Similarly, an overrepresentation of Alistipes in CRC patients and of Ruminococcus in healthy controls has been reported [46], which could explain our findings.
Non-toxigenic Bacteroides fragilis is not harmful to the intestinal tract, but another subtype, enterotoxigenic Bacteroides fragilis, produces toxins that may trigger the development of advanced CRC through dysfunction of the intestinal immune system [11,47,48]. Nevertheless, in a pilot study of 180 CRC patients, significantly lower 3-year overall survival and disease-free survival were observed among those with a high abundance of Bacteroides fragilis than in the low-abundance group [49]. 
Table 3 Linear regression coefficients and their 95% confidence intervals for the difference in microbial relative abundance and alpha-diversity indices in the high intake group compared to the low intake group
Unprocessed pork (belly) contains approximately 48% fat and 39% lean meat and consists mainly of MUFAs (47%) and saturated fatty acids (36%) [50], which can promote or inhibit the outer membrane vesicles of Bacteroides fragilis in a manner dependent on fatty acid chain length and dose [51]. Furthermore, palmitoleic and palmitic acids exert an inhibitory effect on the growth of Bacteroides fragilis at low concentrations [51].
Clostridium symbiosum, which is involved in the butyrate-producing pathway [52], is postulated to activate protein synthesis in the local gut epithelium and enhance the development of carcinogenesis [53]. Clostridium symbiosum abundance has been reported to cause bacteremia in CRC patients, and noninvasive methods, such as fecal immunochemical test and carcinoembryonic antigen test, revealed an improvement in the efficacy of early CRC diagnosis [53,54]. Although acid amines act as substrates for Clostridium sp. [55], the mechanism by which the consumption of amines is related to the decreased abundance of Clostridium symbiosum needs to be further elucidated.
Recent studies have reported the contribution of gut microbiota to the progression of CRC [11,56]. Among them, Fusobacterium nucleatum is mostly associated with CpG island methylator phenotype, microsatellite instability, and BRAF, KRAS, TP53, CHD7, and CHD8 mutations, which are suggested to predispose to mortality [57][58][59]. F. nucleatum was enriched in CRC patients with or without chemotherapy treatment and depleted in healthy or postoperative individuals [60]. In contrast, Prevotella and Bacteroides co-abundance groups and Faecalibacterium prausnitzii were found to be associated with better survival outcomes [61]. However, our study failed to detect any dietary factors affecting the relative abundance of Fusobacterium nucleatum and Faecalibacterium sp.
In Korea, edible seaweeds exist in water-containing or dried forms [62,63]. Although the structural and storage polysaccharides differ in complexity across taxonomically different seaweeds, polysaccharides are the most abundant bioactive compounds in seaweeds [64]. In the digestive system, polysaccharides are fermented into SCFAs and other end-products [64,65]. The levels of SCFAs and intestinal bacterial communities may, therefore, reflect the effects of polysaccharides on the gut microbiota [66]. In particular, several polysaccharides from green algae have been shown to decrease the abundance of Rikenellaceae and Alistipes in mice [66], which supports our findings.
Previous studies have consistently shown the association of coffee and tea with a healthy gut microbial community [67,68]. A study on 147 healthy individuals revealed a higher abundance of Bacteroides-Prevotella-Porphyromonas in individuals who consumed more coffee than in those who consumed less coffee [68], which could be due to the polyphenols and caffeine in coffee beverages [68]. In a mouse model of metabolic syndrome, caffeine and chlorogenic acids were reported to partially improve gut dysbiosis and the disrupted plasma SCFA profile [69]. A pilot trial revealed possible effects of caffeine and chlorogenic acids in increasing the abundance of Bifidobacterium in patients with non-alcoholic fatty liver and diabetes, although the increases were not significant [70]. Furthermore, polyphenols present in both green tea and black tea have been reported to exert inhibitory effects on α-amylase and α-glucosidase in the saliva and small intestine, which can result in residual carbohydrates in the large intestine, providing a substrate for SCFAs and energy for the colonic epithelium and peripheral tissues [71]. In contrast, the effect of high-, low-, and non-calorie sweeteners on the abundance of Bacteroidetes remains controversial due to the complex polyols in these beverages, which limits the establishment of directionality [72,73]. Given that caffeine is one of the biologically active compounds in coffee, tea, and carbonated drinks [74,75], we combined these food items into a single food group of total beverages in the present study. Despite the variation in nutritional composition, we did not find any significant correlations between the individual consumption of coffee, green tea, or carbonated drinks and microbiome diversity or abundance.
Studies have observed a positive association of within-sample microbial diversity with dietary quality indices in healthy adults [76][77][78]. Notably, a Western-style diet was associated with lower microbial diversity, whereas a plant-based diet was associated with higher microbial diversity [79]. In the present study, we did not observe any significant association between dietary features and alpha-diversity indices. Therefore, we suggest a weaker association of within-sample microbial diversity with dietary intake in CRC patients than in healthy subjects.
However, several limitations of this study must be acknowledged. First, given the cross-sectional study design, we could not determine causal relationships or evaluate the effect of the diet-microbiome association on CRC recurrence and prognosis. However, the bias regarding this temporal relationship was minimized by assessing the average habitual diet for the year prior to the date of fecal sample collection. In addition, since our study population comprised Korean CRC patients in a hospital in Seoul, our findings might not be generalizable to other populations. Second, the possible measurement error and recall bias in using the FFQ for dietary assessment need to be addressed. However, the validated and reproducible FFQ for the Korean population was administered by well-trained staff, which minimized the risk of inaccurate collection of information [20,80]. Third, there could be residual confounding due to the lack of information on some factors. In general, probiotics are considered beneficial because they restore the composition of the gut microbiome, whereas antibiotics may decrease the population of several bacteria [81,82]. In addition, CRC patients were reported to commonly face mental health conditions such as anxiety (1.6%-57%) and depression (1.0%-47.2%) [83]. In these conditions, there was a reduction of SCFA-producing bacteria, which can contribute to gut permeability and systemic inflammation [84]. Thus, further studies should account for probiotic and antibiotic use and mental health in the diet-microbiome interaction among CRC patients. Finally, because fecal samples were obtained at a single timepoint prior to the operation, we could not take into account the variability and stability of the microbial community at different timepoints. 
Although diet consumption accounted for a relatively small proportion of microbial variation at the population level [85][86][87][88], changing habitual diets might contribute to the modification of microbial composition at the individual level [85][89][90][91][92][93]. Therefore, dense longitudinal studies are required to further elucidate personalized diet-microbiome relationships in CRC patients.

Conclusion
In summary, our data provide comprehensive evidence for diet-microbe interactions in CRC patients. Although the dietary features were not associated with within-sample diversity, we identified several bacteria that were phylogenetically enriched according to the consumption of several food items, food groups, dietary patterns, and nutrients. Additional research is needed to understand the mechanisms underlying these observations as well as their significance in CRC prognosis.
Additional file 1: Figure S1. Scree plot for variance of food groups explained by dietary patterns.
Figure S2. Gaussian graphical model networks for pairwise correlations of relative abundances of phyla in (A) low fruit-vegetable and (B) high fruit/low meat-poultry dietary groups. Nodes reflect phyla, and edges reflect the conditional dependencies between phyla. The size of the circles is proportional to the mean relative abundance of the corresponding phylum. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of edges represents the strength of correlations.
Figure S3. Gaussian graphical model networks for pairwise correlations of relative abundances of classes in (A) low fruit-vegetable and (B) high fruit/low meat-poultry dietary groups. Nodes reflect classes, and edges reflect the conditional dependencies between classes. The size of the circles is proportional to the mean relative abundance of the corresponding class. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of edges represents the strength of correlations.
Figure S4. Gaussian graphical model networks for pairwise correlations of relative abundances of orders in (A) low fruit-vegetable and (B) high fruit/low meat-poultry dietary groups. Nodes reflect orders, and edges reflect the conditional dependencies between orders. The size of the circles is proportional to the mean relative abundance of the corresponding order. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of edges represents the strength of correlations.
Figure S5. Gaussian graphical model networks for pairwise correlations of relative abundances of families in (A) low fruit-vegetable and (B) high fruit/low meat-poultry dietary groups. Nodes reflect families, and edges reflect the conditional dependencies between families. The size of the circles is proportional to the mean relative abundance of the corresponding family. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of edges represents the strength of correlations.
Figure S6. Gaussian graphical model networks for pairwise correlations of relative abundances of genera in (A) low fruit-vegetable and (B) high fruit/low meat-poultry dietary groups. Nodes reflect genera, and edges reflect the conditional dependencies between genera. The size of the circles is proportional to the mean relative abundance of the corresponding genus. Green lines show positive partial correlations, and red lines show negative partial correlations. The thickness of edges represents the strength of correlations.
Additional file 2: Table S1. Daily diet consumption and Spearman correlation between dietary factors and microbial alpha-diversity. Table S2. Pairwise correlations for relative abundances of different phylogenetical levels in low fruit-vegetable and high fruit/low meat-poultry dietary groups. Table S3. Spearman correlation coefficients between dietary factors and relative abundance of microbial taxonomy. Table S4. Crude p-values for Spearman correlation between dietary factors and relative abundance of microbial taxonomy. Table S5. False discovery rate adjusted p-values for Spearman correlation between dietary factors and relative abundance of microbial taxonomy.