

Introduction to Survival Analysis
Correspondence Address: Source of Support: None, Conflict of Interest: None DOI: 10.4103/00283886.299141
In previous papers in this series, we discussed events occurring (or accumulating) in our patients over a fixed period, a measure called cumulative incidence. If all (or nearly all) the events occur within a short period of follow-up, e.g., hours, days, or weeks, then the cumulative incidence (referred to by some authors as the “event rate”) is an acceptable and meaningful measure of risk. The advantages of this cumulative measure are its simplicity of analysis and ease of interpretation. The downside is that large differences in outcomes can be hidden within similar event rates. For example, [Figure 1] shows 5-year survival for patients with dissecting aneurysm and for patients with AIDS.
For each condition, about 10% of the patients are alive at 5 years, but the survival patterns are considerably different. For dissecting aneurysm, early mortality is extremely high; however, for patients who survive the first few months, the risk of dying levels off (becomes constant). Patients with AIDS, on the other hand, die throughout the 5 years. The percentage dead at 5 years is the same, but the time to death (survival time) differs between the two conditions. To overcome this drawback, the time to death is analyzed for a cohort of patients and presented as a graph called a “survival curve.” The form of analysis that yields the curve is called survival analysis. Survival curves usually show the probability of surviving (remaining free of the event “death”) on the vertical axis, although sometimes the proportion with, rather than without, the outcome event is shown instead. In either case, the horizontal axis shows the time elapsed since the beginning of observation. There are some features that are unique to survival times:
Concept of censoring
Censoring is a form of missing-data problem in which the time to event is not observed, for reasons such as termination of the study before all recruited subjects have experienced the event of interest, or a subject leaving the study before experiencing an event.
Illustration by example
Consider a hypothetical study of stroke recurrence in patients with stroke. The study started in 2016 and enrolled patients presenting with a first-ever stroke. Patients were recruited between 2016 and 2018 and followed up every 6 months until the study ended in 2020. In all, 200 patients were enrolled during the study period. There were 20 recurrent strokes. However, there were also 50 coronary events and 30 deaths. Five patients died of cancer. Twenty patients could not be contacted after variable periods of follow-up. Points related to censoring in this example are as follows:
Kaplan–Meier analysis
Now that we have covered the concepts of survival probability and censoring, consider the simple example below. Eight patients were followed up for up to 11 time units. Their observed times are listed below; “+” marks a censored observation.
1, 5+, 6, 6, 8+, 8, 9, 11+
The Kaplan–Meier (KM) curve for this example is shown in [Figure 2]a. The x-axis shows time, while the y-axis shows the estimated probability of survival (i.e., of remaining free of the event). The curve is a step function that drops at each time point at which an event is recorded. The “+” mark denotes censoring, which may be of various types, as described above. [Figure 2]b confirms the computation we just did, which is listed in the “survival table.”
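The survival table for these eight patients can be reproduced with a short calculation. The following is a minimal Python sketch of the product-limit (Kaplan–Meier) estimate, not the authors' own code. The function name `kaplan_meier` and the 0/1 coding of events are illustrative assumptions, as is the common convention that, at a tied time, events are counted before censored patients leave the risk set.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return a list of (time, n_at_risk, n_events, survival) rows.

    times  : observed follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Assumed convention: at a tied time, events are counted before
    censored patients are removed from the risk set.
    """
    n_events = Counter(t for t, e in zip(times, events) if e == 1)
    n_leaving = Counter(times)  # events plus censored at each time
    at_risk = len(times)
    survival = 1.0
    table = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving time t
            survival *= (at_risk - d) / at_risk
            table.append((t, at_risk, d, survival))
        # Everyone observed at time t (event or censored) leaves the risk set
        at_risk -= n_leaving[t]
    return table

# The eight patients from the text: 1, 5+, 6, 6, 8+, 8, 9, 11+
times = [1, 5, 6, 6, 8, 8, 9, 11]
events = [1, 0, 1, 1, 0, 1, 1, 0]

for t, n, d, s in kaplan_meier(times, events):
    print(f"t={t:>2}  at risk={n}  events={d}  S(t)={s:.3f}")
```

Under these assumptions, the estimated survival probability steps down to 7/8 at time 1, 7/12 at time 6, 7/16 at time 8, and 7/32 at time 9. The censored observations at 5, 8, and 11 do not lower the curve; they only reduce the number at risk for later steps.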
Certain points to be kept in mind when interpreting survival curves are as follows:
Often, the number of patients at risk at various points in time is shown below the horizontal axis. The estimates on the left-hand side of the curve are sound because more patients are at risk during this time. However, at the tail of the curve, on the right, the number of patients on whom the estimate is based is often small because fewer patients are available for follow-up for that length of time. As a result, the estimates of survival toward the end of the follow-up period are less precise than those in the earlier period. The survival analysis methodology can be used to analyze not only time to death (= survival) but also time to any event, and the results can be presented as a “survival curve.” The events may be remission, recurrence, stroke, or acute myocardial infarction. Therefore, instead of survival analysis, we can also call it “time-to-event” analysis [Table 1].
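The loss of precision toward the right-hand tail can be illustrated with the eight-patient example from the Kaplan–Meier section (times 1, 5+, 6, 6, 8+, 8, 9, 11+). The sketch below is an illustration under stated assumptions rather than part of the original analysis: it uses Greenwood's formula, a standard variance approximation that the text itself does not mention, and the helper name `km_with_greenwood` is hypothetical.

```python
import math
from collections import Counter

def km_with_greenwood(times, events):
    """Kaplan-Meier estimate with Greenwood standard errors.

    Returns rows of (time, n_at_risk, survival, standard_error).
    events: 1 = event observed, 0 = censored. Events at a tied time
    are assumed to occur before censoring.
    """
    n_events = Counter(t for t, e in zip(times, events) if e == 1)
    n_leaving = Counter(times)  # events plus censored at each time
    at_risk = len(times)
    survival = 1.0
    greenwood_sum = 0.0  # running sum of d / (n * (n - d))
    rows = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d > 0:
            survival *= (at_risk - d) / at_risk
            if at_risk > d:
                greenwood_sum += d / (at_risk * (at_risk - d))
                se = survival * math.sqrt(greenwood_sum)
            else:
                # Formula is undefined once the estimate reaches zero
                se = float("nan")
            rows.append((t, at_risk, survival, se))
        at_risk -= n_leaving[t]
    return rows

# The eight patients from the text: 1, 5+, 6, 6, 8+, 8, 9, 11+
times = [1, 5, 6, 6, 8, 8, 9, 11]
events = [1, 0, 1, 1, 0, 1, 1, 0]

for t, n, s, se in km_with_greenwood(times, events):
    print(f"t={t:>2}  at risk={n}  S(t)={s:.3f}  SE={se:.3f}  SE/S(t)={se / s:.2f}")
```

In this small example the number at risk falls from 8 to 2 across the event times, and the standard error relative to the estimate, SE/S(t), grows from about 0.13 at time 1 to about 0.83 at time 9, which is the numerical counterpart of the widening uncertainty at the tail of the curve.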
When the percentage (probability) of having an event, rather than of not having it, is presented, the curve starts at zero and sweeps upward and to the right.


