Link to the full article: tecdat.cn/?p=5438

Original source: Tuoduan Data Tribe (tecdat) WeChat official account

 

Survival analysis refers to a series of statistical methods used to explore the timing of events of interest.

Survival analysis is used in a variety of fields, such as:

Cancer studies, for analyzing patient survival times

Sociology, for event-history analysis

Engineering, for failure-time analysis

In cancer research, typical research questions are as follows:

How do certain clinical features affect patient survival?

What is the probability of an individual surviving three years?

Is there a difference in survival among the groups?

 

Basic concepts

Here, we start by defining the basic terms of survival analysis, including:

Survival time and event

Survival function and hazard function

Survival time and event types in cancer research

There are different types of events, including:

recurrence

death

The time from the beginning of the observation period until the event of interest occurs is commonly called the survival time (or time to event).

The two most important measures in cancer studies are: i) the time to death; and ii) the relapse-free survival time, which corresponds to the time between response to treatment and recurrence of the disease. The latter is also known as disease-free survival time or event-free survival time.

As mentioned above, survival analysis focuses on the expected duration until an event of interest (relapse or death) occurs.

 

Kaplan-Meier survival estimate

The Kaplan-Meier (KM) method is a nonparametric method used to estimate the survival probability from observed survival times (Kaplan and Meier, 1958).

The survival curve plots the survival probability against time and provides a useful summary of the data, which can be used to estimate measures such as the median survival time.
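To make the estimator concrete, here is a minimal sketch that computes the Kaplan-Meier estimate by hand as the running product of (1 - d_i / n_i), where d_i is the number of events and n_i the number at risk at each event time, and compares it with survfit(). The toy vectors t_obs and event are hypothetical values made up for illustration, not data from the article.

```r
library(survival)

# Hypothetical toy data: follow-up time and event indicator
# (1 = event observed, 0 = censored)
t_obs <- c(2, 3, 3, 5, 8, 9)
event <- c(1, 1, 0, 1, 0, 1)

# Distinct times at which at least one event occurred
event_times <- sort(unique(t_obs[event == 1]))

# At each event time: number still at risk (n_i) and number of events (d_i)
n_at_risk <- sapply(event_times, function(t) sum(t_obs >= t))
n_events  <- sapply(event_times, function(t) sum(t_obs == t & event == 1))

# Kaplan-Meier estimate: running product of (1 - d_i / n_i)
km_manual <- cumprod(1 - n_events / n_at_risk)

# The same estimate from survfit(), read off at the event times
km_fit <- summary(survfit(Surv(t_obs, event) ~ 1), times = event_times)$surv

data.frame(time = event_times, km_manual = km_manual, km_fit = km_fit)
```

Censored observations never trigger a drop in the curve; they only reduce the number at risk at later event times.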

Survival analysis in R

 

Summarizing and visualizing survival analysis results

Sample data set

We will use the lung cancer data provided in the survival package.

head(lung)

  inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
1    3  306      2  74   1       1       90       100     1175      NA
2    3  455      2  68   1       0       90        90     1225      15
3    3 1010      1  56   1       0       90        90       NA      15
4    5  210      2  57   1       1       90        60     1150      11
5    1  883      2  60   1       0      100        90       NA       0
6   12 1022      1  74   1       1       50        80      513       0

inst: Institution code

time: Survival time in days

status: Censoring status (1 = censored, 2 = dead)

age: Age in years

sex: Male = 1, female = 2

ph.ecog: ECOG performance score (0 = good, 5 = dead)

ph.karno: Karnofsky performance score (bad = 0, good = 100), rated by the physician

pat.karno: Karnofsky performance score, rated by the patient

meal.cal: Calories consumed at meals

wt.loss: Weight loss in the last six months
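The coding described above can be checked directly against the data. This is a minimal sketch, assuming the lung data set that ships with the survival package; the names n_deaths, n_censored and n_by_sex are only illustrative.

```r
library(survival)

# Count deaths (status == 2) and censored observations (status == 1)
n_deaths   <- sum(lung$status == 2)
n_censored <- sum(lung$status == 1)

# Number of patients per sex (1 = male, 2 = female)
n_by_sex <- table(lung$sex)

n_deaths
n_censored
n_by_sex
```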

Computing survival curves: survfit()

We want to compute the survival probability by sex.

The function survfit() can be used to compute Kaplan-Meier survival estimates.

Its main argument is a formula whose left-hand side is a survival object created with the function Surv().
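As a rough illustration of what a survival object looks like, the sketch below builds one from the lung data. The name surv_obj is an assumption for this example. With a status column coded 1/2, Surv() treats 2 as the event and 1 as censored.

```r
library(survival)

# Build the survival object from follow-up time and event indicator.
# With 1/2 coding, Surv() treats 2 as the event (death) and 1 as censored.
surv_obj <- Surv(time = lung$time, event = lung$status)

# Printing shows censored times with a trailing "+"
surv_obj[1:6]

# Internally it is a two-column matrix: time and a 0/1 status
unclass(surv_obj)[1:3, ]
```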

 

To calculate the survival curve, enter the following:

fit <- survfit(Surv(time, status) ~ sex, data = lung)
print(fit)

        n events median 0.95LCL 0.95UCL
sex=1 138    112    270     212     310
sex=2  90     53    426     348     550

By default, the function print() shows a short summary of the survival curves: the number of observations, the number of events, the median survival time, and the confidence limits for the median.

To display a more complete summary of the survival curve, enter the following:

# Summary of the survival curves
summary(fit)$table
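The summary table is an ordinary matrix with one row per group, so individual quantities can be pulled out by name. A minimal sketch, assuming the same model as above; tab is an illustrative name.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# summary(fit)$table is a matrix with one row per group
tab <- summary(fit)$table

# Number of events and median survival time (days) for each sex
tab[, c("events", "median")]
```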

 

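Visualizing survival curves

We will produce the survival curves for the two groups of subjects.

# ggsurvplot() is provided by the survminer package
ggsurvplot(fit,
           pval = TRUE,        # show the log-rank p-value
           conf.int = TRUE,    # show confidence intervals
           risk.table = TRUE)  # add the risk table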

 

 

 

legend.labs changes the legend labels.

 

ggsurvplot(
  fit,                      # survfit object with calculated statistics
  pval = TRUE,              # show the p-value of the log-rank test
  conf.int = TRUE,          # show confidence intervals for the point estimates of the survival curves
  conf.int.style = "step",  # customize the style of the confidence intervals
  xlab = "Time in days",    # customize the x-axis label
  break.time.by = 200,      # break the x axis at intervals of 200
  ggtheme = theme_light(),  # customize the plot and risk table with a theme
  risk.table = "abs_pct"    # show absolute number and percentage at risk
)

 

 

The median survival time of each group is the time at which the survival probability S(t) is 0.5.

 

The range of the survival curves can be shortened using the argument xlim.
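A minimal sketch of such a call, assuming the survminer package is installed. The 600-day cut-off and the name p_short are chosen only for illustration.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Restrict the x axis to the first 600 days (cut-off chosen for illustration)
p_short <- ggsurvplot(fit, data = lung, conf.int = TRUE, xlim = c(0, 600))

# print(p_short) draws the plot
```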

 

 

 

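Note that the fun argument accepts three frequently used transformations: "log" (log transformation of the survival function), "event" (cumulative events, f(y) = 1 - y), and "cumhaz" (cumulative hazard, f(y) = -log(y)).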
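The sketch below shows how two of these transformations might be requested, assuming the survminer package is installed. The object names p_event and p_cumhaz are illustrative.

```r
library(survival)
library(survminer)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Cumulative events: plots 1 - S(t)
p_event <- ggsurvplot(fit, data = lung, fun = "event")

# Cumulative hazard: plots -log(S(t))
p_cumhaz <- ggsurvplot(fit, data = lung, fun = "cumhaz")

# print(p_event) or print(p_cumhaz) draws the corresponding plot
```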

 

 

 

 

The cumulative hazard is commonly used to estimate the hazard probability. It is defined as H(t) = -log(S(t)).
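The relation between the cumulative hazard and the survival function can also be applied numerically to the fitted curves. This is a sketch that derives the cumulative hazard from the Kaplan-Meier estimate; the data frame km_hazard is an illustrative construction, not part of the original article.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Cumulative hazard derived from the Kaplan-Meier estimate: H(t) = -log(S(t))
cum_hazard <- -log(fit$surv)

# Pair each value with its time point and group
km_hazard <- data.frame(
  time   = fit$time,
  strata = rep(names(fit$strata), times = fit$strata),
  surv   = fit$surv,
  cumhaz = cum_hazard
)
head(km_hazard)
```

Because S(t) never increases, the derived cumulative hazard never decreases within a group.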

 ,

 

Kaplan-Meier life table: summary of survival curves

As mentioned above, you can use the function summary() to get a complete summary of the survival curves:

summary(fit)
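The full life table is long, so it can help to request the estimates only at selected time points. A minimal sketch using the times argument of summary(); the 1-year and 2-year marks are chosen for illustration.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Survival estimates at 1 and 2 years (365 and 730 days) for each group
life_table <- summary(fit, times = c(365, 730))

data.frame(
  strata = life_table$strata,
  time   = life_table$time,
  n.risk = life_table$n.risk,
  surv   = round(life_table$surv, 3),
  lower  = round(life_table$lower, 3),
  upper  = round(life_table$upper, 3)
)
```

The same approach answers questions such as "what is the probability of surviving a given number of days?" for any time within the follow-up period.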

 


Log-rank test for comparing survival curves: survdiff()

The log-rank test is the most widely used method for comparing two or more survival curves. The null hypothesis is that there is no difference in survival between the groups.

survdiff() can be used as follows:

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)
surv_diff

        N Observed Expected (O-E)^2/E (O-E)^2/V
sex=1 138      112     91.6      4.55      10.3
sex=2  90       53     73.4      5.68      10.3

 Chisq= 10.3  on 1 degrees of freedom, p= 0.00131

 

The log-rank test for a difference in survival gives a p-value of p = 0.0013, indicating that the sex groups differ significantly in survival.
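The p-value can be recovered from the test statistic stored in the survdiff object. A minimal sketch; the names chisq, df and p_value are illustrative.

```r
library(survival)

surv_diff <- survdiff(Surv(time, status) ~ sex, data = lung)

# Chi-square statistic of the log-rank test
chisq <- surv_diff$chisq

# Degrees of freedom = number of groups - 1
df <- length(surv_diff$n) - 1

# p-value from the upper tail of the chi-square distribution
p_value <- pchisq(chisq, df = df, lower.tail = FALSE)
p_value
```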

Complex survival curves

In this section, we compute survival curves using a combination of multiple factors. Next, we plot the result using ggsurvplot().

ggsurvplot(fit,
           conf.int = TRUE,
           risk.table.col = "strata",  # change the risk table color by group
           ggtheme = theme_bw())       # change the ggplot2 theme

Visualize the output. The figure below shows the survival curves by the sex variable, split according to the values of rx and adhere.
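One way such a multi-factor plot might be produced is sketched below. It assumes that rx and adhere refer to the colon data set from the survival package, which the text does not state explicitly, and that survminer is installed. The names fit2, ggsurv and curves_facet are illustrative.

```r
library(survival)
library(survminer)

# Assumption: rx and adhere come from the colon data set in the survival package
fit2 <- survfit(Surv(time, status) ~ sex + rx + adhere, data = colon)

# Draw the curves, then split the panels by rx (rows) and adhere (columns)
ggsurv <- ggsurvplot(fit2, data = colon, conf.int = TRUE, ggtheme = theme_bw())
curves_facet <- ggsurv$plot + facet_grid(rx ~ adhere)

# print(curves_facet) draws the faceted plot
```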

 

 

 

Summary

Survival analysis is a set of statistical approaches for data analysis in which the outcome variable of interest is the time until an event occurs.

In this article, we demonstrated how to perform and visualize survival analyses using two R packages (survival and survminer).

