Vegetable and fruit consumption and cancer of unknown primary risk: results from the Netherlands cohort study on diet and cancer

Cancer of Unknown Primary (CUP) is a metastatic cancer for which the primary lesion remains unidentifiable during life, and little is known about the modifiable risk factors that contribute to its development. This study investigates whether vegetables and fruits are associated with CUP risk. We used data from the prospective Netherlands Cohort Study on Diet and Cancer, which includes 120,852 participants aged between 55 and 69 years in 1986. All participants completed a self-administered questionnaire on cancer risk factors at baseline. Cancer follow-up was established through record linkage to the Netherlands Cancer Registry and the Dutch Pathology Registry. As a result, 867 incident CUP cases and 4005 subcohort members were available for case-cohort analyses after 20.3 years of follow-up. Multivariable adjusted hazard ratios were calculated using proportional hazards models. We observed no associations between total vegetable and fruit consumption (combined or as separate groups) and CUP risk. However, there appeared to be an inverse association between the consumption of raw leafy vegetables and CUP. With respect to individual vegetable and fruit items, we found neither vegetable nor fruit items to be associated with CUP risk. Overall, vegetable and fruit intake were not associated with CUP incidence within this cohort.

and alcohol consumption (dose-response) [8][9][10][11]. However, the relationship between diet and CUP has been less studied, especially with respect to plant-based nutrition such as vegetables and fruits.
The World Cancer Research Fund reports that the consumption of vegetables and fruits may reduce cancer risk, although the association may be restricted to specific cancers [12][13][14]. In addition, they describe that non-starchy vegetables and fruits have been linked to protection against a number of aerodigestive cancers [12,13]. Associations between diet and cancer are complex, as each bioactive food constituent has the potential to modify aspects of carcinogenesis, either individually or in combination with several micronutrients (alongside quantity, timing, and duration of exposure to those constituents) [12]. Furthermore, a lower intake of vegetables and fruits (low intake levels of carotenoids and vitamins A, C, and E) has been linked to increased levels of oxidative stress and inflammation, alongside genomic instability, reduced apoptosis, and increased proliferation [14].
To the best of our knowledge, only one Australian prospective cohort study has investigated the relationship between diet and CUP, and it did not find any associations between vegetable or fruit consumption and CUP risk [10]. However, it should be noted that the study only examined vegetable and fruit consumption in terms of the usual number of servings, categorised as ≥5 vegetables/day and ≥2 fruits/day, in relation to CUP. Moreover, it did not investigate specific groups of vegetables and fruits, nor individual vegetable and fruit items. For that reason, we decided to investigate the relationship between vegetable and fruit consumption and CUP risk in greater detail by using combined groups of vegetables and fruits, as well as individual vegetable and fruit items. In addition, we aimed to examine residual confounding by cigarette smoking status on the association between vegetable and fruit consumption and CUP risk, as cigarette smoking has been linked to increased CUP risk.

Study design and population
The prospective Netherlands Cohort Study on Diet and Cancer (NLCS) was started in September 1986 and included 58,279 men and 62,573 women aged between 55 and 69 years. Participants originated from 204 Dutch computerized municipal population registries. Data processing and analysis were based on the case-cohort design for efficiency reasons. Incident cancer cases were derived from the full cohort, while the number of person-years at risk was estimated from a subcohort of 5000 participants who were randomly sampled from the full cohort immediately after baseline [15]. The subcohort comprises a group of participants in which CUP cases can occur [16]. The case-cohort design implies that cases can arise both inside and outside the subcohort. Cases in the subcohort are at risk from baseline until cancer incidence, whereas cases outside the subcohort have been assigned a minimal person-time at risk in order to be included in the statistical analysis. Participants who had reported a history of cancer (except for skin cancer) at baseline were excluded from analyses (see Fig. 1).

Fig. 1

Outcome measure
CUP is defined here as a metastasised epithelial malignancy with no identifiable primary tumor origin after cytological and/or histological verification during a patient's lifetime. This CUP definition only includes epithelial malignancies (ICD-O-3: M-8000 to M-8570) and thus excludes non-epithelial cancers, such as sarcoma, lymphoma, mesothelioma, and melanoma.

Follow-up
Cancer follow-up was established through annual record linkage with the Netherlands Cancer Registry (NCR) and the Dutch Pathology Registry (PALGA) [17]. Information regarding the site of metastasis was obtained from the NCR, but these data were only partially available; therefore, supplementary information was retrieved from the pathology excerpts provided by PALGA. These pathology excerpts were also used to determine whether cytologically and/or histologically confirmed cases had been correctly categorised in the data received from the NCR.

Questionnaire data
All cohort members completed a self-administered questionnaire, which included detailed questions on dietary habits, lifestyle, and other cancer risk factors. The dietary section was a validated 150-item semi-quantitative food-frequency questionnaire (FFQ) that concentrated on the habitual consumption of foods and beverages during the year preceding baseline [18]. In the validation study, the Spearman correlation coefficients between the FFQ and 9 days of dietary records were 0.38 for total vegetable consumption and 0.60 for total fruit consumption. The relatively low correlation for total vegetable consumption may derive from a lack of variation in consumption and possibly from imprecise estimation of portion sizes [18,19]. Participants were asked to indicate how often they consumed vegetables (15 cooked vegetables, 4 raw vegetables), both in summer and in winter. They could choose one of six categories: never or less than once a month, 1 time per month, 2 to 3 times per month, 1 time per week, 2 times per week, or 3 to 7 times per week. Usual serving sizes were asked for string beans and cooked endive only; the mean of these values served as an indicator for serving sizes of all cooked vegetables. Participants who did not report their usual serving sizes were assigned a default value. If participants reported only one serving size, then the individual serving size was derived using a conversion factor. Both the default value and the conversion factor were derived from a pilot study [20]. Participants were asked to report tomato consumption as a frequency per week and sweet pepper consumption as a frequency per month, both in summer and in winter. Participants were asked to indicate how often they consumed fruit by choosing one of seven categories: never or less than once a month, 1 time per month, 2 to 3 times per month, 1 time per week, 2 to 3 times per week, 4 to 5 times per week, or 6 to 7 times per week.
For all the fruits of interest, participants were able to indicate the amount of each fruit that was consumed. Frequencies and amounts were converted to grams per day. For both vegetable and fruit consumption, dietary data measured in summer and winter were averaged into specific intake variables for analysis. The questionnaire was also used to measure exposure to tobacco smoking. Tobacco smoking was addressed through questions on baseline smoking status, and the ages at first exposure and last (if stopped) exposure to smoking. Questions were also asked about smoking frequency and smoking duration (excluding stopping periods), for cigarette, cigar, and pipe smokers. Participants who indicated that they had never smoked cigarettes were considered never smokers.
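The conversion of questionnaire responses to grams per day, followed by averaging over the summer and winter reports, can be illustrated with a minimal sketch. The category midpoints, the assumed month length of 30.4 days, and the function names below are illustrative assumptions and are not the values or procedures used in the NLCS.

```python
# Minimal sketch: convert an FFQ frequency category and a serving size
# to grams per day, then average the summer and winter reports.
# All numeric midpoints below are ASSUMPTIONS for illustration only.

DAYS_PER_WEEK = 7.0
WEEKS_PER_MONTH = 30.4 / 7.0  # assumed average month length of 30.4 days

# Hypothetical times-per-week value for each of the six vegetable categories.
VEG_TIMES_PER_WEEK = {
    "never or less than once a month": 0.0,
    "1 time per month": 1.0 / WEEKS_PER_MONTH,
    "2 to 3 times per month": 2.5 / WEEKS_PER_MONTH,
    "1 time per week": 1.0,
    "2 times per week": 2.0,
    "3 to 7 times per week": 5.0,
}


def grams_per_day(category, serving_size_g):
    """Daily intake in grams for one season."""
    return VEG_TIMES_PER_WEEK[category] * serving_size_g / DAYS_PER_WEEK


def seasonal_average(summer_category, winter_category, serving_size_g):
    """Average of the summer and winter daily intakes."""
    summer = grams_per_day(summer_category, serving_size_g)
    winter = grams_per_day(winter_category, serving_size_g)
    return (summer + winter) / 2.0
```

For example, `seasonal_average("2 times per week", "1 time per week", 140.0)` averages a summer intake of 40 g per day and a winter intake of 20 g per day for a hypothetical 140 g serving.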

Statistical methods
Person-years at risk were calculated from baseline (17 September 1986) until CUP diagnosis, death, emigration, loss to follow-up, or end of follow-up (31 December 2006), whichever occurred first. Patient characteristics were presented for CUP cases and stratified for histological and cytological confirmation. General characteristics were presented for subcohort members and CUP cases with frequencies (percentages) for categorical variables, and means including standard deviations for continuous variables.
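As an illustration of the person-time rule described above, the following sketch computes follow-up time from baseline to the earliest of the listed events. The function and argument names are hypothetical, and dividing by 365.25 days per year is an assumption about how days are converted to years.

```python
from datetime import date

BASELINE = date(1986, 9, 17)
END_OF_FOLLOW_UP = date(2006, 12, 31)


def person_years(cup_diagnosis=None, death=None, emigration=None,
                 loss_to_follow_up=None):
    """Years at risk from baseline until the first event that occurs.

    Each argument is a datetime.date, or None if the event did not occur.
    The 365.25 days-per-year conversion is an assumption.
    """
    events = [d for d in (cup_diagnosis, death, emigration, loss_to_follow_up)
              if d is not None]
    events.append(END_OF_FOLLOW_UP)  # administrative censoring
    exit_date = min(events)
    return (exit_date - BASELINE).days / 365.25
```

A participant with no event contributes the full follow-up period, which under this conversion rounds to the 20.3 years reported for the study.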
Based on the distribution of the subcohort, participants were compared using quartiles (Q) of vegetable, legume, and fruit consumption. For continuous analyses, increments of 25 g per day were used. The composition of the vegetable, legume, and fruit groups that were studied within the NLCS is described in Table 1.
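The categorisation can be sketched as follows: cut points are taken from the subcohort distribution only and are then applied to every participant, cases included. The function names and the choice of the "inclusive" quantile method are assumptions; the text does not state how the quartile boundaries were computed.

```python
import statistics


def quartile_cutpoints(subcohort_intakes):
    """Three cut points (Q1/Q2, Q2/Q3, Q3/Q4) from subcohort intakes in g/day.

    The 'inclusive' interpolation method is an arbitrary choice here.
    """
    return statistics.quantiles(subcohort_intakes, n=4, method="inclusive")


def assign_quartile(intake, cutpoints):
    """Quartile number (1 to 4); values equal to a cut point fall in the lower group."""
    quartile = 1
    for cut in cutpoints:
        if intake > cut:
            quartile += 1
    return quartile


def per_25g_increment(intake):
    """Rescale intake so that one unit corresponds to 25 g per day."""
    return intake / 25.0
```

With the rescaled variable, the exponentiated model coefficient is read directly as the hazard ratio per 25 g per day increment.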
Vegetable and fruit consumption were mutually adjusted in the analyses, which means that vegetable consumption was additionally adjusted for fruit consumption, whereas fruit consumption was additionally adjusted for vegetable consumption. Legume consumption was additionally adjusted for vegetable and fruit intake. The predefined confounders included: age at baseline (years, continuous); sex (male/female); current cigarette smoking status (never/ever); cigarette smoking frequency (number of cigarettes smoked per day); and cigarette smoking duration (number of years smoking). We included the smoking variables as predefined confounders, as they have been linked to increased CUP risk [8][9][10][11]. Additionally, smokers have been observed to consume lower amounts of vegetables and fruits in comparison to non-smokers [21]. The potential confounders included: alcohol consumption (ethanol intake per day); body mass index (BMI) at baseline (kg/m²); non-occupational physical activity (< 30 min/day, 30-60 min/day, 60-90 min/day and > 90 min/day); socio-economic status (highest level of education); diabetes (yes/no); and history of cancer in a first-degree relative (yes/no). Variables were considered confounders if they changed the HR by > 10%. None of the potential confounders met this criterion, and accordingly none were included in the final model.
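The 10% change-in-estimate rule can be written as a short check. This is a sketch under the assumption that the change is measured relative to the HR from the model without the candidate variable; the text does not specify the reference model, and the function names are hypothetical.

```python
def relative_change(hr_without, hr_with):
    """Relative change in the HR after adding a candidate confounder.

    Assumes the HR from the model WITHOUT the candidate is the reference.
    """
    return abs(hr_with - hr_without) / hr_without


def is_confounder(hr_without, hr_with, threshold=0.10):
    """True if adding the candidate changes the HR by more than the threshold."""
    return relative_change(hr_without, hr_with) > threshold
```

For instance, a hypothetical shift from an HR of 0.90 to 0.92 is a change of about 2% and would not qualify, whereas a shift from 0.90 to 0.78 is a change of about 13% and would.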
Cox proportional hazards models were used to estimate age- and sex-adjusted, and multivariable adjusted hazard ratios (HRs) with 95% confidence intervals (CIs). Time since baseline (1986) was used for the time axis. Standard errors were calculated using the robust Huber-White sandwich estimator to account for additional variance introduced by sampling from the full cohort [22]. The proportional hazards assumption was tested using the scaled Schoenfeld residuals [23]. In cases where the assumption had been violated, a time-varying coefficient for that variable was added to the model where appropriate. Ordinal exposure variables were fitted as continuous variables in trend analyses. Wald tests and cross-product terms were used to evaluate potential multiplicative interaction between total vegetable and fruit consumption (combined and individually) and sex in relation to CUP risk, and between total vegetable and fruit consumption (combined and individually) and cigarette smoking frequency in relation to CUP risk. Analyses were conducted using Stata version 15. P values were considered statistically significant if p < 0.05.
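To make the reporting step concrete, the sketch below shows how a hazard ratio, its 95% confidence interval, and a two-sided Wald p value follow from a fitted log-hazard coefficient and its (robust) standard error. It does not fit a Cox model; the coefficient and standard error are assumed to come from the fitted model, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Standard normal quantile for a two-sided 95% interval (about 1.96).
Z_95 = NormalDist().inv_cdf(0.975)


def hazard_ratio_with_ci(beta, se):
    """Return (HR, lower, upper) from a log-hazard coefficient and its SE."""
    hr = math.exp(beta)
    lower = math.exp(beta - Z_95 * se)
    upper = math.exp(beta + Z_95 * se)
    return hr, lower, upper


def wald_p_value(beta, se):
    """Two-sided p value for the Wald test of beta = 0."""
    z = beta / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
```

The same Wald test applies to the coefficient of an ordinal exposure fitted as a continuous term (the trend test) and to the coefficient of a cross-product term (the interaction test).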
We performed three sensitivity analyses. The first sensitivity analysis was restricted to histologically verified CUP cases alone. For this analysis, patients who received a cytological verification alone were excluded. Patients who were histologically verified are more likely to have undergone extensive diagnostic investigation(s) to rule out the primary tumour origin. For those patients who received cytological verification alone, other factors may have played a role in the decision to refrain from further diagnostic investigation, such as age, comorbidities, performance status, localisation of the metastasis, and the patient's decision. The second sensitivity analysis was performed after the first 2 years of follow-up had been excluded so as to check for potential reverse causality bias as a result of preclinical cancer at baseline. To assess whether associations differed over time, we conducted a third analysis in which we compared the first 10 years of follow-up (< 1996) to the last 10 years of follow-up (≥1996).

Results
After 20.3 years of follow-up (17 September 1986 until 31 December 2006), data were available for a total of 1353 potential CUP cases and 4774 participants of the subcohort. After excluding potential CUP cases without microscopic confirmation or with non-epithelial histology, a total of 1073 CUP cases remained. Participants with incomplete or inconsistent dietary data were excluded from analyses. This resulted in 867 available CUP cases and 4005 subcohort members with complete and consistent dietary data. In general, when comparing CUP cases and subcohort members, we observed that CUP cases consumed lower amounts of vegetables (mean values 185.8 versus 189.0 g per day, respectively) (see Table 2). Male CUP cases in particular consumed lower amounts of vegetables (mean values 182.3 versus 187.0 g per day, respectively), while female CUP cases consumed a more similar amount of vegetables (mean values 191.6 versus 190.9 g per day, respectively). We also observed that CUP cases consumed lower amounts of fruits (mean values 164.7 versus 175.5 g per day, respectively). Results from the age- and sex-adjusted analyses were comparable to the results of the multivariable adjusted analyses. Therefore, we only discuss the multivariable adjusted results. We observed no association between total vegetable and fruit consumption (HR for Q4 vs. Q1: 0.98, 95% CI: 0.92-1.05, P trend = 0.63) and CUP risk (see Table 3). In addition, when mutually adjusted, we found no association between total vegetables (HR for Q4 vs. = 0.37) and CUP risk. However, we observed a statistically significant trend between the consumption of raw leafy vegetables and a decreased CUP risk (HR for Q4 vs. Q1: 0.82, 95% CI: 0.64-1.03, P trend = 0.03). With respect to individual vegetable and fruit items, which were mutually adjusted, we found no association between the individual vegetable items or the individual fruit items and the development of CUP (see Table 4).
No multiplicative interactions were observed between sex and the association between total vegetable and fruit consumption (combined), vegetable consumption, or fruit consumption, in relation to CUP risk (P interaction = 0.20, 0.17, and 0.46, respectively). However, we did observe multiplicative interactions between vegetables and fruits (combined), and fruit consumption and smoking status in relation to CUP risk (P interaction = 0.03, 0.02, respectively), but not between vegetable consumption and smoking status in relation to CUP risk (P interaction = 0.67). Furthermore, the potential for residual confounding was evaluated based on cigarette smoking status and the relationship between vegetable and fruit consumption and CUP risk (see Table 5). In current smokers, the association of vegetables and fruits with CUP risk was inverse, although not statistically significant (per 25 g per day increment HR: 0.89, 95% CI: 0.79-1.00, P trend = 0.06). In never and ex-smokers, vegetable and fruit consumption was not associated with CUP risk. Furthermore, current smokers with the highest fruit intake compared to the lowest fruit intake appeared to have a reduced CUP risk (HR for Q4 vs. Q1: 0.65, 95% CI: 0.43-0.99, although the P trend = 0.16 was not statistically significant).
Results from all three sensitivity analyses, when restricted to histologically verified CUP cases alone (n = 614), after excluding the first 2 years of follow-up, and when comparing the first 10 years of follow-up (< 1996) to the last 10 years of follow-up (≥1996), did not differ substantially from the findings of the overall analyses (see Supplementary Tables 1-6).

Discussion
We have presented here a detailed investigation of the relationship between vegetable and fruit consumption and the development of CUP, which we accomplished by assessing combined groups of vegetables and fruits as well as individual vegetable and fruit items. Our results demonstrate that consuming vegetables and fruits is generally unrelated to CUP incidence within this cohort; however, the consumption of raw leafy vegetables did appear to be associated with a decreased CUP risk. We found no multiplicative interaction between sex and total vegetable and fruit consumption in relation to CUP risk. Yet, we did observe multiplicative interactions between total vegetables and fruits (combined), and fruit consumption and smoking status in relation to CUP risk, but not between vegetable consumption and smoking status in relation to CUP risk.
The Australian cohort study, mentioned in the introduction, investigated the relationship between consuming vegetables and fruits and the risk of developing CUP by comparing 327 incident CUP cases to two randomly selected sets of controls (3:1) using incidence density sampling with replacement [10]. It found no relation when comparing a usual number of servings of ≥5 vegetables/day and ≥2 fruits/day with < 5 vegetables/day and < 2 fruits/day [10]. Although the categories differ between the Australian study and those of the NLCS, the respective findings are comparable. Moreover, having analysed combined groups of vegetables and fruits as well as individual vegetable and fruit items in greater detail, we conclude that there is no association between vegetable and fruit consumption and CUP risk. We did, however, observe an inverse association between the consumption of raw leafy vegetables and CUP risk, but this might be a chance finding due to multiple comparisons. As described elsewhere, vegetable and fruit consumption have been associated with a protective effect against cancer, but the association may be restricted to specific cancers [12]. Nonetheless, it should be acknowledged that CUP constitutes a group of heterogeneous metastatic cancers; therefore, specific effects from vegetables and/or fruits could be masked.
In an additional analysis, residual confounding by cigarette smoking status was evaluated for its possible influence on the association between vegetable and fruit consumption and CUP risk. We observed no associations for never or ex-smokers who consumed vegetables and fruits in relation to CUP risk, while current smokers appeared to have a decreased CUP risk, although not statistically significant. This effect may derive from residual confounding by smoking. Our finding is in line with the limited-suggestive evidence by the World Cancer Research Fund that describes the consumption of non-starchy vegetables and fruit to be linked to reduced lung cancer risk in people who smoke or used to smoke tobacco [13].

Table footnotes:
a Analyses were adjusted for age at baseline (years), sex, cigarette smoking status (never/ever), frequency (continuous; centered), and duration (continuous; centered). Additionally adjusted for cigarette smoking status (never/ever), and duration (continuous; centered) as time-varying covariates
b Tests for dose-response trends were assessed by fitting ordinal variables as continuous terms in the Cox proportional hazards model
c Additionally adjusted for total fruit consumption (grams per day; continuous)
d Additionally adjusted for total vegetable and fruit consumption (grams per day; continuous)
e Additionally adjusted for total vegetable consumption (grams per day; continuous)

Strengths and limitations
The strengths of this study are its prospective cohort design, its large cohort population including 120,852 participants, its large number of 867 incident CUP cases, and its ability to correct for multiple and detailed confounders in the analyses. Data on incident CUP cases were provided by the NCR and included information from both pathology reports and clinical reports [24]. Pathology excerpts were available to confirm whether the cytologically and/or histologically confirmed cases had been correctly categorised in the data received from the NCR. Cancer follow-up through record linkage with the NCR and PALGA was at least 96% complete, thereby minimizing selection bias [25]. Cases were registered by trained NCR registry clerks who had access to the medical files and who entered data by applying uniform coding rules. It should, however, be acknowledged that we utilised a CUP definition that may differ from that used in other countries, as the criteria for defining 'CUP' are heterogeneous. Another possible limitation is that exposure data were only measured once at baseline in 1986. Vegetable and fruit consumption (both in summer and in winter) were, however, extensively addressed in the FFQ, and we expect that participants in the studied age group (55-69) had stable dietary habits at baseline. The reproducibility of the FFQ and the stability of dietary habits were assessed using test-retest correlations, which declined on average by only 0.07 for nutrients over a period of 5 years [26]. Nonetheless, it is possible that participants subsequently changed their dietary habits. If they did change their habits, that may have resulted in bias due to misclassification and may have led to underestimation of the effect of vegetable and fruit consumption on CUP risk. We do expect this bias to be non-differential between CUP cases and subcohort members. Unfortunately, we do not have data to check which diagnostic methods were used to identify the primary tumor origin.
Nevertheless, if we restrict our analysis to histologically verified CUP cases alone, for whom extended diagnostic methods are more likely, we find that the results do not differ greatly from the overall multivariable analyses. Accordingly, we can assume that the findings from the overall multivariable analyses are representative of CUP cases with or without an extensive diagnostic work-up. We were unable to conduct subgroup analyses based on histopathological findings as precision medicine was not yet available at the time of the follow-up of our study. Studies with more recent data on CUP cases would therefore be encouraged to conduct such analyses.

Conclusions
In our study, we observed no associations between total vegetable and fruit consumption, total vegetables, cooked vegetables, raw vegetables, legumes, brassica vegetables, allium vegetables, cooked leafy vegetables, total fruits, or citrus fruits and the development of CUP. However, the consumption of raw leafy vegetables appeared to be associated with a decreased risk of the malignancy. With respect to individual vegetable and fruit items, neither vegetable nor fruit items were found to be associated with CUP risk. We thus conclude that consuming vegetables and fruits is generally unrelated to CUP incidence within this cohort.