Hematological and biochemical markers influencing breast cancer risk and mortality: Prospective cohort study in the UK Biobank by multi-state models

Background: Breast cancer is the most common cancer and the leading cause of cancer-related death among women. However, evidence concerning hematological and biochemical markers influencing the natural history of breast cancer from in-situ breast cancer to mortality is limited. Methods: In the UK Biobank cohort, 260,079 women were enrolled during 2006–2010 and were followed up until 2019 to test the associations of 59 hematological and biochemical markers with breast cancer risk and mortality. The strengths of these associations were evaluated using multivariable Cox regression models. To understand the natural history of breast cancer, multi-state survival models were further applied to examine the effects of biomarkers on transitions between different states of breast cancer. Results: Eleven biomarkers were found to be significantly associated with the risk of invasive breast cancer, including mainly inflammation-related biomarkers and endogenous hormones, while serum testosterone was also associated with the risk of in-situ breast cancer. Among them, C-reactive protein (CRP) was more likely to be associated with invasive breast cancer and its transition to death from breast cancer (HR for the highest quartile = 1.46, 95 % CI = 1.07–1.97), while testosterone and insulin-like growth factor-1 (IGF-1) were more likely to impact the early state of breast cancer development (Testosterone: HR for the highest quartile = 1.31, 95 % CI = 1.12–1.53; IGF-1: HR for the highest quartile = 1.17, 95 % CI = 1.00–1.38). Conclusion: Serum CRP, testosterone, and IGF-1 have different impacts on the transitions between breast cancer states, confirming the role of chronic inflammation and endogenous hormones in breast cancer progression. This study further highlights the need for closer surveillance of these biomarkers during the course of breast cancer development.

Introduction
Breast cancer remains the most common cancer and the leading cause of cancer-related deaths among women worldwide [1]. Breast cancer prognosis largely depends on early detection and timely intervention [2]. Understanding the progression of breast cancer from preclinical biomarkers to breast cancer mortality may help identify crucial indicators for preventing breast cancer and reducing breast cancer mortality. However, previous efforts to assess the effect of hematological and biochemical markers on the disease progression of breast cancer have been scarce.
Previous studies indicate that inflammation may stimulate increases in platelet and white blood cell production [3,4], and inflammatory markers may be associated with the development of breast cancer [5,6]. In addition, high serum sex hormone-binding globulin (SHBG) concentrations may reduce the risk of breast cancer, while serum IGF-1 levels may be positively associated with the risk [7,8]. However, few studies have assessed the effects of hematological and biochemical markers on in-situ breast cancer, and to date, no study has evaluated whether these biomarkers, measured before breast cancer diagnosis, predict future breast cancer mortality.
Despite the known effects of sex hormones and other biochemical markers on breast cancer risk and survival [7,9], it is difficult to distinguish whether biomarkers have different effects on the transitions between disease states. In previous studies, multi-state survival analysis has been applied to identify risk factors for lethal breast cancer among cancer-free women, and to examine their effects on transitions between event-free, fibroadenoma and breast cancer [10,11]. No previous study has further explored the roles of hematological and biochemical markers in transitions from the event-free state to death from breast cancer.
In the present study, we aimed to comprehensively investigate the associations between hematological and biochemical markers and the risk of in-situ and invasive breast cancer and mortality in the UK Biobank cohort. Multi-state models were further used to examine the potential impact of biomarkers on transitions between different disease states during breast cancer progression.

Study populations
Our study was based on women who participated in the UK Biobank between 2006 and 2010 and were between 40 and 70 years old at enrollment. All participants were followed up by linkage to the electronic health records of the UK National Health Service (NHS). At baseline, self-administered touchscreen questionnaires were used to collect information on participants' sociodemographic characteristics, health and medical history, and lifestyle exposures. The participants also underwent physical measurements and provided blood samples. All participants provided written informed consent, and the study was approved by the North West Multi-centre Research Ethics Committee. We excluded participants who requested to withdraw from the UK Biobank cohort study. Data from the UK Biobank (http://www.ukbiobank.ac.uk/) are available to all researchers upon application.
In this study, we aimed to assess the effect of preclinical hematological and biochemical markers on the disease progression of breast cancer, and thus we excluded women with any in-situ (N = 849) or invasive breast cancer (N = 6851) diagnosis before cohort entry, and those without available data on hematological and biochemical markers (N = 5545). A total of 260,079 participants were finally included in our study.

Hematological and biochemical markers
Blood samples were collected from all participants at recruitment. As part of the UK Biobank Biomarker Project, the biomarkers were measured in UK Biobank's purpose-built facility in Stockport. Full details on processing, storage, assay performance, and rigorous quality control measures for the blood samples are available elsewhere [12][13][14][15]. Briefly, serum concentrations of biochemical markers were measured by immunoassay analyzers using several methods, such as colorimetric, enzymatic rate, chemiluminescent immunoassay, and immunoturbidimetric assays. The LH750 hematology analyzer was used for reticulocyte analysis and for automated counting of red and white blood cells. Summary descriptions of the measurement or calculation and the analytical platform for the hematological and biochemical markers are listed in Supplementary Table 1 and Supplementary Table 2.
For the current analysis, we included 59 hematological and biochemical markers, of which 80 % had a missing rate of less than 10 %. Details regarding the analytical range, distribution, and missing proportion of these biomarkers are summarized in Supplementary Table 3. If the test result for a biomarker was missing because the value fell outside the reportable range, values below the detection limit were imputed with half of the minimum detected value, and values above the detection limit were imputed with the maximum reportable concentration for that biomarker. These biomarkers were analyzed as quartiles based on their overall distributions and as standardized continuous variables, except that the number of nucleated red blood cells was dichotomized according to whether nucleated red blood cells were detected in the blood sample.
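The preprocessing described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in Stata and R). The flag values 'ok', 'below' and 'above', the function names, and the toy numbers are assumptions made for illustration only, and the quartile cut points use the default method of Python's statistics.quantiles, which may differ from the method used in the study.

```python
from statistics import mean, stdev, quantiles


def impute_out_of_range(values, flags, max_reportable):
    """Impute results missing because they fell outside the reportable range.

    values: measurements (None where no value was reported)
    flags:  'ok', 'below' or 'above' per measurement (hypothetical encoding)
    """
    detected = [v for v, f in zip(values, flags) if f == "ok"]
    half_min = min(detected) / 2  # half of the minimum detected value
    imputed = []
    for v, f in zip(values, flags):
        if f == "below":
            imputed.append(half_min)
        elif f == "above":
            imputed.append(max_reportable)  # maximum reportable concentration
        else:
            imputed.append(v)
    return imputed


def quartile_labels(values):
    """Assign each value to quartile 1-4 of the overall distribution."""
    q1, q2, q3 = quantiles(values, n=4)
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]


def standardize(values):
    """Return z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Toy data, not taken from the UK Biobank.
raw = [None, 2.0, 4.0, 6.0, 8.0, None]
flags = ["below", "ok", "ok", "ok", "ok", "above"]
imputed = impute_out_of_range(raw, flags, max_reportable=10.0)
labels = quartile_labels(imputed)
z_scores = standardize(imputed)
```

In this sketch the quartile label would enter a model as a categorical exposure and the z-score as the standardized continuous exposure.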

Ascertainment of in situ and invasive breast cancer incidence and mortality
We retrieved the main diagnoses, documented using International Classification of Diseases-10 (ICD-10) codes, from the Scottish Morbidity Record and the Patient Episode Database in England, Scotland, and Wales, available for all participants since 1997 [16]. In this study, the ICD-10 codes D05 and C50 were used to identify in-situ and invasive breast cancer, respectively. The date and cause of death were retrieved from death certificates held by the NHS Information Center and NHS Central Register.
Follow-up started on the date of enrollment and continued until the occurrence of an outcome of interest (in-situ or invasive breast carcinoma), death, loss to follow-up, or the end of the study (December 31, 2019), whichever came first. The end of follow-up was set to avoid the potential influence of the COVID-19 pandemic [17]. In analyses of breast cancer mortality, the endpoint was defined as death with breast cancer as the primary cause.
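The follow-up rule can be sketched as taking the earliest of the competing exit dates. The function and argument names below are hypothetical, and the tie-breaking order (outcome before death before loss to follow-up) is an assumption that the source does not specify.

```python
from datetime import date

STUDY_END = date(2019, 12, 31)  # administrative end of follow-up stated in the text


def follow_up_exit(enrollment, outcome=None, death=None, lost=None,
                   study_end=STUDY_END):
    """Return (exit_date, reason, years_of_follow_up) for one participant.

    Any of outcome, death and lost may be None when the event did not occur.
    Ties are resolved in the listed order, which is an assumption.
    """
    candidates = [
        ("outcome", outcome),
        ("death", death),
        ("lost", lost),
        ("study_end", study_end),
    ]
    observed = [(reason, d) for reason, d in candidates if d is not None]
    # min() keeps the first of equal dates, so the list order sets the priority.
    reason, exit_date = min(observed, key=lambda item: item[1])
    years = (exit_date - enrollment).days / 365.25
    return exit_date, reason, years


# A woman diagnosed during follow-up, and one diagnosed after the study end.
case = follow_up_exit(date(2008, 1, 1), outcome=date(2015, 6, 1))
censored = follow_up_exit(date(2008, 1, 1), outcome=date(2020, 3, 1))
```

The second call shows that an outcome dated after December 31, 2019 does not count as an event: the participant is censored at the end of the study.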

Statistical analysis
A flowchart of the study design and main analysis process is shown in Fig. 1. To identify the hematological and biochemical markers measured at baseline that may be associated with the risk of breast cancer overall, multivariable Cox regression with attained age as the underlying timescale was performed. We also performed subgroup analysis by invasiveness at diagnosis to assess potential heterogeneity. The basic model (model 1) was adjusted for the UK Biobank assessment centers, and the fully adjusted model (model 2) was further adjusted for ethnicity, BMI, smoking, family history of breast cancer, age at first birth, number of births, oral contraceptive use, hormone replacement therapy, age at menarche, menopausal status at baseline, and the product of BMI and menopausal status. To avoid false-positive findings caused by multiple testing, biomarkers with P for trend <0.05/59 (the Bonferroni-corrected threshold) were considered statistically significant. Sensitivity analyses stratified by menopausal status were also performed. To reduce reverse causality, we additionally repeated the analyses after excluding the first two years of follow-up.
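For orientation, the Bonferroni-corrected threshold of 0.05/59 is approximately 0.00085. The short sketch below computes it and applies it to P for trend values. The marker names and P values in the example are hypothetical placeholders, not results from the study.

```python
N_TESTS = 59   # number of biomarkers tested
ALPHA = 0.05   # family-wise significance level

threshold = ALPHA / N_TESTS  # approximately 0.000847


def bonferroni_significant(p_trend, alpha=ALPHA, n_tests=N_TESTS):
    """Return True when a P for trend passes the Bonferroni-corrected threshold."""
    return p_trend < alpha / n_tests


# Hypothetical P values for illustration only.
example_p = {"marker_a": 0.0004, "marker_b": 0.003}
flagged = {name: bonferroni_significant(p) for name, p in example_p.items()}
```

Under this rule a marker with P for trend = 0.003 would be nominally significant at 0.05 but would not be retained.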
Subsequently, all biomarkers significantly associated with the risk of in-situ or invasive breast cancer were included in the next step of analysis to investigate their associations with breast cancer mortality.
Considering that the risk and progression of breast cancer may vary significantly according to menopausal status [18,19], subgroup analysis was also performed by menopausal status at baseline to assess potential heterogeneity. In addition, the combined effects of these biomarkers on the risk of breast cancer and mortality were evaluated.
In the final step, for the hematological and biochemical markers related to both breast cancer incidence and mortality risk, multi-state survival models were used to assess the effects of the biomarkers on transitions from event-free (State 1) to in-situ breast cancer (State 2), invasive breast cancer (State 3), and breast cancer mortality (State 4). Breast cancer was treated as the absorbing state (i.e., in-situ cancer diagnosed after an invasive breast cancer diagnosis was not considered). All subjects started in the event-free state. Possible courses for each woman include: 1 → 1 (the woman remained event-free until the end of the study); 1 → 2 (the woman developed in-situ breast cancer but not invasive breast cancer); 1 → 3 (a direct transition from event-free to invasive breast cancer); 1 → 2 → 3 (the woman developed in-situ breast cancer and subsequently invasive breast cancer); 1 → 3 → 4 (the woman developed invasive breast cancer and subsequently died of breast cancer); 1 → 2 → 3 → 4 (the woman developed in-situ and then invasive breast cancer and subsequently died of breast cancer). Because the number of breast cancer deaths among women with in-situ breast cancer was too small, we could not calculate hazard ratios for the transition from in-situ cancer to death from breast cancer. Covariate effects were allowed to vary freely across distinct transitions, which means that incorporating covariates in multi-state models through transition intensities may explain differences in the course of the disease across individuals [20]. In the multi-state models, attained age was used as the underlying timescale; model 1 was adjusted for the UK Biobank assessment centers, and model 2 was further adjusted for ethnicity, BMI, smoking, family history of breast cancer, age at first birth, number of births, oral contraceptive use, hormone replacement therapy, age at menarche, menopausal status at baseline, and the product of BMI and menopausal status. We fitted the models for each transition separately, and the Wald test was used to test whether the effects of the biomarkers could be assumed to be identical across transitions.
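The state structure described above can be written down as a small directed graph. The sketch below is an illustration rather than the authors' implementation: it encodes the four states and the transitions that were modelled, and enumerates the possible courses for a woman starting in the event-free state. The dictionary and function names are assumptions, and the transition from State 2 to State 4 is omitted because the text states that it could not be estimated.

```python
STATES = {
    1: "event-free",
    2: "in-situ breast cancer",
    3: "invasive breast cancer",
    4: "death from breast cancer",
}

# Transitions modelled in the text; 2 -> 4 is left out (too few deaths).
TRANSITIONS = {1: [2, 3], 2: [3], 3: [4], 4: []}


def enumerate_courses(start=1):
    """List every course through the state graph, starting from `start`.

    Each course is a tuple of states; (1,) corresponds to '1 -> 1',
    i.e. remaining event-free until the end of the study.
    """
    courses = []

    def walk(path):
        courses.append(tuple(path))
        for nxt in TRANSITIONS[path[-1]]:
            walk(path + [nxt])

    walk([start])
    return courses


courses = enumerate_courses()
for course in courses:
    print(" -> ".join(STATES[s] for s in course))
```

Because the graph has no cycles, the enumeration terminates, and each course corresponds to one of the six possibilities listed in the text.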
The proportional hazards assumption was tested using Schoenfeld residuals. All statistical analyses were performed using Stata 15.1 and R 3.6.1.

Results
Among women followed in this cohort, 1410 in-situ breast cancer cases (total follow-up, 2,739,563 person-years), 8858 invasive breast cancer cases (total follow-up, 2,809,023 person-years), and 613 breast cancer deaths (total follow-up, 2,855,526 person-years) were recorded, corresponding to incidence rates of 0.51 and 3.15 per 1000 person-years and a mortality rate of 0.21 per 1000 person-years, respectively (Table 1).
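The reported rates follow directly from the event counts and person-years given above. The sketch below reproduces the arithmetic; the function and dictionary names are illustrative.

```python
def rate_per_1000(events, person_years):
    """Crude rate per 1000 person-years."""
    return 1000 * events / person_years


# Event counts and person-years as reported in the text.
rates = {
    "in_situ": rate_per_1000(1410, 2_739_563),
    "invasive": rate_per_1000(8858, 2_809_023),
    "mortality": rate_per_1000(613, 2_855_526),
}

for outcome, rate in rates.items():
    print(f"{outcome}: {rate:.2f} per 1000 person-years")
```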

Associations between hematological and biochemical markers and in-situ and invasive breast cancer risk
Twelve of the 59 hematological and biochemical markers were associated with the risk of breast cancer overall (P trend <0.0008), most of which were inflammation-related biomarkers and endogenous hormones (Supplementary Table 4). Serum testosterone was the only biomarker significantly associated with in-situ breast cancer, with higher concentrations conferring an elevated risk (HR for the highest quartile = 1.31, 95 % CI = 1.12-1.53, P trend <0.001) (Fig. 2, Supplementary Table 5). In addition, 11 biomarkers were associated with the risk of invasive breast cancer, among which the biomarkers with the smallest P trend included testosterone (HR for the highest quartile = 1.47, 95 % CI = 1.38-1.56), neutrophil count (HR for the highest quartile = 1.16, 95 % CI = 1.09-1.23), and IGF-1 (HR for the highest quartile = 1.17, 95 % CI = 1.10-1.25) (Fig. 2, Supplementary Table 6).
The results did not differ appreciably when we restricted the analysis to postmenopausal women, whereas in premenopausal women, only monocyte count, gamma-glutamyl transferase, IGF-1, SHBG, and testosterone were associated with invasive breast cancer risk (P trend <0.05) (Supplementary Table 7). In addition, the observed associations did not change substantially when follow-up started 2 years after cohort entry, suggesting that the results were unlikely to be driven by reverse causality (Supplementary Table 8).

Associations between hematological and biochemical markers and breast cancer mortality
Among the 12 biomarkers associated with breast cancer risk, higher baseline circulating concentrations of CRP (HR for the highest quartile = 1.75, 95 % CI = 1.34-2.29, P trend <0.001) and IGF-1 (HR for the highest quartile = 1.31, 95 % CI = 1.04-1.66, P trend = 0.030) were also positively associated with breast cancer mortality risk, even after adjusting for other potential risk factors for breast cancer (Table 2, Supplementary Table 9). However, a positive association between serum testosterone levels and breast cancer mortality risk was observed only in postmenopausal women (HR for the highest quartile = 1.37, 95 % CI = 1.04-1.81, P trend = 0.025) (Table 2). Moreover, the combined effects of serum CRP, IGF-1, and testosterone were also significant (P < 0.01) and consistent with their respective independent effects (Supplementary Table 10).

Multi-state analyses
Multi-state models were constructed based on the natural history of breast cancer (event-free, in-situ breast cancer, invasive breast cancer, breast cancer mortality) (Fig. 3, Supplementary Table 11). We found that CRP was more strongly associated with the risk of invasive breast cancer (HR for the highest quartile = 1.23, 95 % CI = 1.14-1.31) and breast cancer mortality (HR for the highest quartile = 1.46, 95 % CI = 1.07-1.97) than with in-situ breast cancer (HR for the highest quartile = 1.14, 95 % CI = 0.96-1.35). The effects of testosterone and IGF-1 mainly contributed to the transitions from event-free to in-situ breast cancer (Testosterone: HR for the highest quartile = 1.31, 95 % CI = 1.12-1.53; IGF-1: HR for the highest quartile = 1.17, 95 % CI = 1.00-1.38), and the transitions from event-free to invasive breast cancer (Testosterone: HR for the highest quartile = 1.46, 95 % CI = 1.37-1.56; IGF-1: HR for the highest quartile = 1.17, 95 % CI = 1.10-1.24). However, we did not find an association between testosterone level and the transition from breast cancer incidence to mortality, which was significantly different from its effect on the transition from event-free to invasive breast cancer (P = 0.001 for the Wald test). Similarly, the effect of IGF-1 differed with borderline significance between the transition from event-free to invasive breast cancer and the transition from invasive breast cancer to death from breast cancer (P = 0.064).
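To make the idea of comparing a biomarker's effect across transitions concrete, the sketch below applies a simple Wald-type test to two published hazard ratios. It is a rough approximation and not the authors' procedure: the standard errors are back-calculated from the reported 95 % confidence intervals, assuming they are symmetric on the log scale, and the two estimates are treated as independent, which is an assumption. The function names are illustrative, and the CRP estimates for the in-situ and invasive transitions are used only as example inputs.

```python
import math

Z_CRIT = 1.959964  # two-sided 95 % normal quantile


def log_hr_and_se(hr, lower, upper, z_crit=Z_CRIT):
    """Back-calculate the log hazard ratio and its standard error from a 95 % CI."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return log_hr, se


def wald_difference(est_a, est_b):
    """Wald z statistic and two-sided P value for the difference of two log HRs.

    Assumes the two estimates are independent (an approximation).
    """
    (b_a, se_a), (b_b, se_b) = est_a, est_b
    z = (b_a - b_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return z, p


# CRP, highest quartile, as reported in the text.
crp_in_situ = log_hr_and_se(1.14, 0.96, 1.35)
crp_invasive = log_hr_and_se(1.23, 1.14, 1.31)

z, p = wald_difference(crp_invasive, crp_in_situ)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

A test based on the fitted models, with access to the full covariance of the estimates, would be preferable; this version only shows the form of the calculation.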

Discussion
Using the UK Biobank data, we describe a comprehensive picture of the natural history of breast cancer from preclinical biomarkers to breast cancer mortality. Eleven hematological and biochemical markers were found to be associated with invasive breast cancer risk, among which higher serum levels of CRP, testosterone, and IGF-1 were also associated with higher breast cancer mortality. In the multi-state survival analysis, we further found that CRP was mainly associated with invasive breast cancer and the transition from cancer incidence to mortality, while testosterone and IGF-1 were more likely to impact the early state of breast cancer development.
Among the 11 hematological and biochemical biomarkers found to be associated with the risk of invasive breast cancer, some were inflammatory markers, such as white blood cell count, neutrophil count, and CRP, while others were endogenous hormones, such as SHBG, IGF-

Women younger than 55 years or self-reporting non-menopause at recruitment were categorized as premenopausal, and women older than 55 years or self-reporting menopause were categorized as postmenopausal.

Fig. 2. The forest plot for hematological and biochemical markers associated with the risk of in-situ and invasive breast cancer. Multivariable Cox regression was performed to identify hematological and biochemical markers measured at baseline that may be associated with the risk of in-situ and invasive breast cancer. The fully adjusted model was adjusted for UK Biobank assessment centers, ethnicity, BMI, smoking, family history of breast cancer, age at first birth, number of births, oral contraceptive use, hormone replacement therapy, age at menarche, menopausal status at baseline, and the product of BMI and menopausal status. To account for false-positive findings caused by multiple testing, biomarkers with P for trend <0.05/59 (the Bonferroni-corrected threshold) were considered statistically significant. The biomarkers significantly associated with breast cancer risk are shown in the forest plot, and the detailed results for all fifty-nine biomarkers are provided in Supplementary Table 5 and Supplementary Table 6.
randomization studies confirmed the potential causal associations between IGF-1, testosterone and breast cancer risk [24,25]. Consequently, elucidating how these biomarkers influence the natural progression of breast cancer is highly warranted, as it may help target screening programs at high-risk individuals, especially those who are more likely to have worse outcomes.
In addition to the known association between CRP and breast cancer risk, we observed that higher baseline serum CRP levels were associated with a higher risk of breast cancer mortality. In multi-state models, we further found that high CRP conferred greater risks for invasive breast cancer and its transition to breast cancer mortality rather than the transition from in-situ cancers to invasive cancers. As CRP is a biomarker that reflects the systemic burden of inflammation [26][27][28], our results suggest that chronic inflammation may play a more important role in breast cancer development and prognosis. Consistent with this, previous studies have shown that increasing CRP levels were associated with more advanced disease stages, such as lymph node metastasis [29], increasing tumor size, and lower histological grade [28]. Our hypothesis was also supported by evidence showing that once a tumor is present, tumor cells may recruit inflammatory cells into the tumor microenvironment, stimulating tumor growth and leading to a worse prognosis [22,30]. In addition, some studies have suggested that systemic CRP may influence therapeutic resistance to chemotherapy and trastuzumab through activation of molecular pathways and reduction of drug distribution [31,32]. An in vitro model showed that CRP activates the integrin α2 signaling pathways, demonstrating a role for CRP in the growth and the acquisition of adhesive and invasive phenotypes of breast cancer cells, including triple-negative breast cancer cells [33]. While immunotherapies, including programmed death-1 (PD-1)/programmed death ligand-1 (PD-L1) inhibitors and cytotoxic T-lymphocyte-associated antigen-4 (CTLA-4) inhibitors, are emerging therapies for metastatic and triple-negative breast cancer [34,35], CRP might also predict the response to checkpoint inhibitor treatment in these patients [36].
Previously, high levels of endogenous sex hormones, including testosterone, have been identified as risk factors for developing breast cancer [23,37,38]. Our results further suggest that serum testosterone was also associated with the early states of breast cancer development, such as in-situ breast cancer, consistent with recent findings [39]. Additionally, a nested case-control study reported that adding testosterone to the Gail model could moderately increase the accuracy of breast cancer risk prediction [40]. A possible biological mechanism is that androgen receptors are highly expressed in invasive and non-invasive breast tumors [41]. In addition to its main contribution to breast carcinogenesis via aromatization to estrogen in mammary tissues, testosterone also plays an anti-proliferative role through the activation of androgen receptors [42][43][44]. A recent study showed that selective estrogen receptor downregulators (new endocrine agents) are being developed to prevent or overcome endocrine resistance [45].
Additionally, we found that high circulating IGF-1 was a contributor to the occurrence of in-situ breast cancer and invasive breast cancer. The association between IGF-1 and the risk of breast cancer has been reported in previous studies [23,46], which might be explained by its role in stimulating cancer cell proliferation, inhibiting apoptosis, and promoting angiogenesis [47,48]. A case-control study also showed that high IGF-1 levels may be a crucial factor in the progression of benign breast disease to breast cancer [49]. In contrast, a weaker impact of IGF-1 on breast cancer mortality was observed in our study, although this association could not be validated in multi-state models. We hypothesized that the direct association between IGF-1 and breast cancer mortality was mainly due to the increased incidence, as it did not influence the transition from breast cancer incidence to mortality. Our hypothesis was further supported by previous evidence that IGF-1 expression in the peripheral blood was not associated with breast cancer recurrence [50].
The major strength of our study is that it is the largest prospective cohort study to investigate the associations between a wide range of hematological and biochemical markers and breast cancer. Moreover, to our knowledge, this is the first study to apply multi-state models to explore the effects of biomarkers on breast disease states from event-free, through in-situ breast cancer and invasive breast cancer, to death from breast cancer. However, we also acknowledge some limitations in our study. First, a large proportion of the participants in our study were of white ethnicity, and thus our findings may not reflect the associations between the selected biomarkers and breast cancer risk and mortality in other populations. Second, considering the time lag between breast cancer onset and clinical diagnosis, we might not have an accurate measure of breast cancer onset. Sensitivity analyses to reduce reverse causality were conducted by repeating the analyses after excluding the first two years of follow-up. Third, the levels of hematological and biochemical markers measured at baseline might not represent the long-term exposure levels of the participants well, which might cause misclassification bias. Fourth, despite the heterogeneity of breast cancer, breast cancer molecular subtypes and tumor characteristics at diagnosis are not available in the UK Biobank. Thus, whether biomarkers leading to death from breast cancer differ by molecular subtype requires further investigation.

Conclusions
In summary, we explored the natural history of breast cancer from preclinical hematological and biochemical markers to death from breast cancer in a community-based cohort. CRP, testosterone, and IGF-1 were found to have different impacts on the transitions between breast cancer states, confirming the role of chronic inflammation and endogenous hormones in breast cancer progression.

Fig. 1. The flow chart of the study design and main analysis steps.

Fig. 3. The effects of biomarkers on transitions between different states of breast cancer progression. The multi-state survival models with attained age as the underlying timescale were implemented to assess the effects of the biomarkers on transitions from the event-free state (State 1) to in-situ breast cancer (State 2, D05), invasive breast cancer (State 3, C50), and breast cancer mortality (State 4, BCM). The models were adjusted for the UK Biobank assessment centers, ethnicity, BMI, smoking, family history of breast cancer, age at first birth, number of births, oral contraceptive use, hormone replacement therapy, age at menarche, menopausal status at baseline, and the product of BMI and menopausal status. Panel (A) for CRP, (B) for Testosterone, (C) for IGF-1. The associations between the highest quartile levels of biomarkers and each specific transition are given on the edges. The hazard ratios and 95 % confidence intervals are provided in Supplementary Table 11.

Table 1
Basic characteristics of the UK Biobank women included in this study.

for Scientific Research, Fujian Medical University [grant no: 2019QH1002]. KC is supported by the Swedish Research Council [grant no: 2018-02547] and the Swedish Cancer Society [grant no: 190266]. WH is supported by Zhejiang University through the "Hundred Talents Program".