Original link: tecdat.cn/?p=22422
Original source: Tuoduan Data Tribe (tecdat) official account
In this paper, we describe a flexible competing risks regression model. The regression model is specified directly for the transition probability, which in the competing risks setting is the cumulative incidence. The model includes the Fine and Gray (1999) model as a special case, so it can be used to test the goodness of fit of the proportional subdistribution hazards assumption (Scheike and Zhang 2008). Confidence intervals can also be constructed for predicted cumulative incidence curves. We apply these methods to the follicular cell lymphoma data of Pintilie (2007), where the competing risks are disease relapse and death without relapse.
Worked example: follicular cell lymphoma study
We consider the follicular cell lymphoma data from Pintilie (2007). The data set consists of 541 patients with early-stage (stage I or II) follicular cell lymphoma who were treated with either radiotherapy alone (chemo = 0) or a combination of radiotherapy and chemotherapy (chemo = 1). Disease relapse or no response to treatment, and death in remission, are the two competing risks. Patient age (age: mean = 57, SD = 14) and hemoglobin level (hgb: mean = 138, SD = 15) were also recorded. The median follow-up time was 5.5 years. First we read the data, compute the cause indicator and code the covariates.
R> table(cause)
cause
0 1 2
193 272 76
R> stage <- as.numeric(clinstg == 2)
R> chemo <- as.numeric(ch == "Y")
R> times1 <- sort(unique(time[cause == 1]))
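The step that builds the cause indicator is not shown in the snippet above. A minimal sketch of how such an indicator could be coded follows, using a toy data frame. The column names relapse and died are hypothetical stand-ins, not the columns of the lymphoma data set.

```r
# Toy records (hypothetical): `relapse` and `died` are illustrative column names only.
toy <- data.frame(
  relapse = c(1, 0, 0, 1, 0),
  died    = c(0, 1, 0, 1, 0),
  clinstg = c(1, 2, 2, 1, 2),
  ch      = c("Y", "N", "Y", "N", "N")
)

# cause: 1 = relapse or no response, 2 = death without relapse, 0 = censored.
# A relapse takes precedence over a later death, so relapse is checked first.
toy$cause <- ifelse(toy$relapse == 1, 1, ifelse(toy$died == 1, 2, 0))

# Covariates coded as 0/1 indicators, mirroring the stage and chemo lines above.
toy$stage <- as.numeric(toy$clinstg == 2)
toy$chemo <- as.numeric(toy$ch == "Y")

table(toy$cause)
```

Checking relapse before death encodes the convention that death after relapse is not a competing event for the relapse endpoint.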
There were 272 events due to the disease (no response to treatment or relapse), 76 competing risk events (death without relapse) and 193 censored observations. The event time is dftime. The variable times1 gives the event times for cause 1. We first estimate the nonparametric cumulative incidence curves for comparison.
In the comp.risk call (from the timereg package), we specify the event times and the censoring variable as cause == 0. The regression model contains only an intercept term (+ 1). The cause variable gives the causes associated with the different events, and causeS = 1 specifies that we consider events of type 1. The time points at which the estimates are computed can be given by the argument times = times1.
Figure 1 (a) shows the estimated cumulative incidence curves for both causes. In Figure 1 (b), we construct 95% pointwise confidence intervals (dotted lines) and a 95% confidence band.
R> comp.risk(Surv(dftime, cause == 0) ~ + 1,
+   causeS = 1, n.sim = 5000, cens.code = 0, model = "additive")
Figure 1
R> fit <- cuminc(time, cause, group)
R> plot(fit)
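To make the nonparametric estimate concrete, here is a minimal base R sketch of the standard product-limit (Aalen-Johansen type) cumulative incidence estimator on toy data. The function name cum_inc and the toy vectors are illustrative assumptions; this is not the code used by the packages called above.

```r
# Nonparametric cumulative incidence for one cause.
# time:  follow-up times
# cause: 0 = censored, 1 or 2 = event type
cum_inc <- function(time, cause, cause_of_interest = 1) {
  event_times <- sort(unique(time[cause != 0]))
  surv_prev <- 1   # probability of being event-free just before the current time
  cif <- 0
  out <- numeric(length(event_times))
  for (i in seq_along(event_times)) {
    tk      <- event_times[i]
    at_risk <- sum(time >= tk)
    d_any   <- sum(time == tk & cause != 0)
    d_k     <- sum(time == tk & cause == cause_of_interest)
    # Increment: P(event-free before tk) * cause-specific hazard at tk
    cif <- cif + surv_prev * d_k / at_risk
    # Update overall event-free survival using events of any cause
    surv_prev <- surv_prev * (1 - d_any / at_risk)
    out[i] <- cif
  }
  data.frame(time = event_times, cif = out)
}

# Toy data (hypothetical): six subjects, one censored at time 3
time  <- c(1, 2, 3, 4, 5, 6)
cause <- c(1, 2, 0, 1, 2, 1)
f1 <- cum_inc(time, cause, cause_of_interest = 1)
f2 <- cum_inc(time, cause, cause_of_interest = 2)
```

Because the last observation in the toy data is an event, the two estimated curves add up to one at the final time point, which is a quick sanity check on the implementation.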
Both the subdistribution hazard approach and the direct binomial modeling approach rely on inverse probability of censoring weighting. When applying such weights, it is critical that the censoring weights are estimated without bias; otherwise the estimated cumulative incidence curves may themselves be biased.
In this example, we found that the censoring distribution depended significantly on the covariates hemoglobin, stage, and chemotherapy, and that it was well described by a Cox regression model. The fit of the Cox model was checked using cumulative residuals; see Martinussen and Scheike (2006) for further details. Using simple Kaplan-Meier estimates for the censoring weights can therefore lead to severely biased estimates. We consequently add the option cens.model = "cox" to the call, which uses all covariates of the competing risks model in the Cox model for the censoring weights. In general, regression models for the inverse probability of censoring weights can also be used to improve efficiency (Scheike et al., 2008).
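To illustrate what the censoring weights are, the following base R sketch computes the simple Kaplan-Meier version of the weights on toy data. This is the covariate-free variant that the text warns can be biased when censoring depends on covariates. It is not the Cox-based weighting selected with cens.model = "cox", and the function name km_censoring is an assumption made for illustration.

```r
# Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t),
# obtained by treating censoring (cause == 0) as the "event".
km_censoring <- function(time, cause) {
  cens_times <- sort(unique(time[cause == 0]))
  surv <- cumprod(vapply(cens_times, function(s) {
    1 - sum(time == s & cause == 0) / sum(time >= s)
  }, numeric(1)))
  # Return the left limit G(t-): only censoring times strictly before t count
  function(t) vapply(t, function(s) {
    idx <- sum(cens_times < s)
    if (idx == 0) 1 else surv[idx]
  }, numeric(1))
}

# Toy data (hypothetical): one subject censored at time 3
time  <- c(1, 2, 3, 4, 5, 6)
cause <- c(1, 2, 0, 1, 2, 1)

G <- km_censoring(time, cause)
# Observed events get weight 1 / G(T-); censored subjects get weight 0
w <- ifelse(cause != 0, 1 / G(time), 0)
```

Subjects whose events occur after the censoring time are up-weighted to stand in for the censored subject. A covariate-dependent censoring model would replace G(t) by an estimate of G(t | covariates).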
Now let’s fit the model
We first fit a general proportional model that allows all covariates to have time-varying effects. In the call that produces outf, only the covariates x of model (6) are defined. Covariates z of model (6) are specified with the const operator.
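Model (6) itself is not reproduced in this text. Assuming it is the proportional (complementary log-log) form from Scheike and Zhang's formulation, with time-varying coefficients for x and constant coefficients for z, it can be written as follows. The notation P_1 for the cumulative incidence of cause 1 is an assumption made here.

```latex
% Assumed form of the semiparametric model referred to as model (6)
\[
  \operatorname{cloglog}\{1 - P_1(t; x, z)\} = x^{\top}\alpha(t) + z^{\top}\gamma ,
  \qquad \operatorname{cloglog}(u) = \log\{-\log(u)\}
\]
% Equivalent expression for the cumulative incidence
\[
  P_1(t; x, z) = 1 - \exp\!\big[-\exp\{x^{\top}\alpha(t) + z^{\top}\gamma\}\big]
\]
```

Under this form, the Fine-Gray model corresponds to x containing only the intercept, so that every covariate enters through z with a constant effect.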
R> summary(outf)
Competing risks Model

Test for nonparametric terms

Test for non-significant effects
            Supremum-test of significance  p-value H_0: B(t)=0
(Intercept)                          3.29               0.0150
stage                                5.08               0.0000
age                                  4.12               0.0002
chemo                                2.79               0.0558
HGB                                  1.16               0.8890

Test for time invariant effects
            Kolmogorov-Smirnov test  p-value H_0: constant effect
(Intercept)                  8.6200                        0.0100
stage                        1.0400                        0.0682
age                          0.0900                        0.0068
chemo                        1.7200                        0.0004
HGB                          0.0127                        0.5040
            Cramer von Mises test  p-value H_0: constant effect
(Intercept)              3.69e+01                        0.0170
stage                    2.52e+00                        0.0010
age                      4.26e-03                        0.0014
chemo                    1.50e+00                        0.0900
hgb                      2.64e-04                        0.4220
The supremum tests of significance show that, in the fully nonparametric model, stage and age are significant, chemotherapy is borderline significant (p = 0.056), and hemoglobin is not significant (p = 0.889).
Figure 2
We plot the estimated regression coefficients αj(t) with their 95% confidence bands, and the observed test processes for constant effects together with test processes simulated under the null hypothesis.
R> plot(outf, score = 1)
Figure 2 shows that these effects are not constant over time and are quite pronounced in the early time period. The curves are shown with 95% pointwise confidence intervals and 95% confidence bands.
Figure 3 shows the test processes used to decide whether an effect is significantly time-varying or whether H0: αj(t) = βj is acceptable. A summary of these plots is given in the output, and we see that stage and chemotherapy are clearly time-varying and therefore inconsistent with the Fine-Gray model. The Kolmogorov-Smirnov and Cramer von Mises test statistics are two different summaries of the test processes, and they lead to the overall conclusion that none of the three variables has a proportional Cox-type effect. Hemoglobin, in contrast, is well described by a constant, so we consider a model in which hemoglobin has a constant effect and the remaining covariates have time-varying effects.
Figure 3
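The two statistics reported in the output are different summaries of one test process. The sketch below shows how a supremum (Kolmogorov-Smirnov type) and an integrated squared (Cramer von Mises type) statistic can be computed from a process evaluated on a time grid. The form of the process used here, a cumulative coefficient minus its straight-line interpolation, follows the style of test described for additive models in Martinussen and Scheike (2006). Whether the fitted model uses exactly this process is an assumption, and all numbers are hypothetical.

```r
# Hypothetical time grid and cumulative coefficient estimate (not fitted values)
times <- c(0.5, 1, 2, 3, 4, 5)
A_hat <- c(0.30, 0.55, 0.90, 1.10, 1.20, 1.25)
tau   <- max(times)

# Under a constant effect the cumulative coefficient is linear in t,
# so deviations from the straight line through (tau, A_hat(tau)) measure time variation.
process <- A_hat - (times / tau) * A_hat[length(A_hat)]

# Kolmogorov-Smirnov type summary: largest absolute deviation
ks <- max(abs(process))

# Cramer von Mises type summary: squared deviation integrated over time
# (right-endpoint Riemann sum on the grid)
dt  <- diff(c(0, times))
cvm <- sum(process^2 * dt)
```

The p-values in the summary output come from comparing such statistics with their simulated distribution under the null, which this sketch does not attempt to reproduce.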
R> summary(outf1)
Competing risks Model

Test for nonparametric terms

Test for non-significant effects
            Supremum-test of significance  p-value H_0: B(t)=0
(Intercept)                          5.46                    0
stage                                5.18                    0
age                                  4.20                    0
chemo                                3.89                    0

Test for time invariant effects
            Kolmogorov-Smirnov test  p-value H_0: constant effect
(Intercept)                  10.100                         0.000
stage                         1.190                         0.048
age                           0.101                         0.004
chemo                         1.860                         0.000
            Cramer von Mises test  p-value H_0: constant effect
(Intercept)              79.90000                         0.000
stage                     1.84000                         0.006
age                       0.00583                         0.000
chemo                     2.53000

Parametric terms:
             Coef.      SE  Robust SE      z      P
const(HGB) 0.00195 0.00401    0.00401  0.486  0.627
For comparison, the output below is from the Fine-Gray (FG) model, in which all covariates enter through const and only the intercept is nonparametric.
Competing risks Model

Test for nonparametric terms

Test for non-significant effects
            Supremum-test of significance  p-value H_0: B(t)=0
(Intercept)                          6.32                    0

Test for time invariant effects
            Kolmogorov-Smirnov test  p-value H_0: constant effect
(Intercept)                    1.93                             0
            Cramer von Mises test  p-value H_0: constant effect
(Intercept)                 14.30

Parametric terms:
                 Coef.      SE  Robust SE       z         P
const(stage)   0.45200 0.13500    0.13500   3.340  0.001838
const(age)     0.01450 0.00459    0.00459   3.150  0.001610
const(chemo)  -0.37600 0.18800    0.18800  -2.000  0.045800
const(HGB)     0.00249 0.00401    0.00401   0.622  0.534000
We note that the estimated effect of hemoglobin is almost identical to that in the more appropriate model shown above. However, because the FG model fits the other covariates poorly, its estimates may be seriously biased and may misrepresent important features of the data. Finally, we compare the predictions of the FG model with those of the semiparametric model, which describes the effects in more detail. We consider predictions for two types of patients, defined by the newdata assignment below. Patient type I: stage I disease (stage = 0), 40 years old, no chemotherapy (chemo = 0). Patient type II: stage II disease (stage = 1), 60 years old, radiotherapy plus chemotherapy (chemo = 1).
R> newdata <- data.frame(stage = c(0, 1), age = c(40, 60), chemo = c(0, 1),
+ hgb = c(138, 138))
R> predict(out, newdata)
To specify the data for which predictions are computed, we pass a newdata argument.
Predictions based on the model may not be monotone. We plot the predictions without pointwise confidence intervals (se = 0) and without confidence bands (uniform = 0). The predictions in Figure 4 (a) are based on the flexible model, while those in Figure 4 (b) are based on the FG model. The cumulative incidence curves of relapse for type I and type II patients are shown as solid and dotted lines, respectively. Figure 5 (a) compares the predictions for type I patients based on the flexible model and the FG model. Similarly, Figure 5 (b) compares the predictions for type II patients. The broken lines around the two predictions represent confidence regions based on the flexible model.
Figure 4.
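The remark that predictions may not be monotone can be illustrated with a short sketch. Assuming a complementary log-log link, a predicted cumulative incidence is obtained from a linear predictor evaluated on a time grid, and a running maximum is one simple way to obtain a non-decreasing curve. The linear predictor values are hypothetical, and the running-maximum fix is an illustration rather than the method used by the fitted model.

```r
# Hypothetical linear predictor x'alpha(t) + z'gamma on a time grid.
# Because alpha(t) is estimated freely at each time, it need not be increasing.
lin_pred <- c(-3.0, -2.2, -2.4, -1.5, -1.6, -1.0)

# Predicted cumulative incidence under the assumed cloglog link
cif_raw <- 1 - exp(-exp(lin_pred))

# The raw prediction dips where the linear predictor decreases
is.unsorted(cif_raw)

# Running maximum: the smallest non-decreasing curve lying on or above the raw one
cif_monotone <- cummax(cif_raw)
```

The transformation keeps the values inside the unit interval, so only the monotonicity needs repairing.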
R> par(mfrow = c(1, 2))
R> plot(f1, se = 0, uniform = 1, col = 1, lty = 1)
R> plot(fg, new = 0, se = 0, uniform = 0, col = 2, lty = 2)
Higher disease stage, older age, and combination therapy lead to a higher cumulative incidence, with effects that are more pronounced in the early time period (Figure 4 (a) and Figure 2). Chemotherapy, on the other hand, increases the cumulative incidence initially and subsequently reduces it (Figure 4 (a) and Figure 2). Figure 5 shows that the FG model cannot capture these time-varying effects accurately. Despite these differences, the overall predictions are somewhat similar in this case, especially when the uncertainty of the estimates is taken into account. However, the time-varying behavior of the covariates is clearly important.
Figure 5
4. Discussion
This paper implements a flexible competing risks regression model for cumulative incidence curves. The model makes it possible to analyze in detail how covariates predict the cumulative incidence, and it allows covariates to have time-varying effects. The fit of simpler models can be checked, and predictions with confidence intervals and confidence bands can be produced, which is useful to researchers.