Year : 2020  |  Volume : 68  |  Issue : 5  |  Page : 1172--1174

Introduction to Survival Analysis

Deepti Vibha, Kameshwar Prasad 
 Department of Neurology, All India Institute of Medical Sciences, New Delhi, India

Correspondence Address:
Prof. Kameshwar Prasad
Department of Neurology, Room Number 2, 6th Floor, Neurosciences Center, All India Institute of Medical Sciences, New Delhi - 110 029

How to cite this article:
Vibha D, Prasad K. Introduction to Survival Analysis. Neurol India 2020;68:1172-1174

In previous papers in this series, we discussed events occurring (or accumulating) in our patients over a fixed period, a measure called cumulative incidence. If all (or nearly all) the events occur within a short length of follow-up, e.g., in hours, days, or weeks, then the cumulative incidence (referred to by some authors as “event rates”) is an acceptable and meaningful measure of risk. An advantage of the cumulative measure is that it is simple to analyze and easy to interpret. The downside of these rates is that large differences in outcomes can be hidden within similar event rates. For example, [Figure 1] shows 5-year survival for patients with dissecting aneurysm and for patients with AIDS.{Figure 1}

For each condition, about 10% of the patients are alive at 5 years, but the survival pattern is considerably different. For dissecting aneurysm, early mortality is extremely high; however, for patients who survive the first few months, the risk of dying levels off (becomes constant). On the other hand, patients with AIDS die throughout the 5 years. The percentage dead at 5 years is the same, but the time to death (survival) differs between the two conditions.
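
The contrast described above can be sketched numerically. The following Python snippet is an illustration only: the two survival functions, the 3-month early window, and the 15% survival at the end of that window are hypothetical choices made here, not values read from [Figure 1]. Both curves are constructed to reach 10% survival at 5 years.

```python
import math

FIVE_YEAR_SURVIVAL = 0.10  # "about 10%" alive at 5 years, as stated in the text


def constant_hazard_survival(t_years):
    """Survival when the risk of dying is the same throughout follow-up
    (the AIDS-like pattern in the text)."""
    hazard = -math.log(FIVE_YEAR_SURVIVAL) / 5.0
    return math.exp(-hazard * t_years)


def early_hazard_survival(t_years, early_end=0.25, survival_at_early_end=0.15):
    """Survival with a very high early hazard and a low constant hazard later
    (the dissecting-aneurysm-like pattern). Both defaults are hypothetical."""
    early_hazard = -math.log(survival_at_early_end) / early_end
    late_hazard = (-math.log(FIVE_YEAR_SURVIVAL / survival_at_early_end)
                   / (5.0 - early_end))
    if t_years <= early_end:
        return math.exp(-early_hazard * t_years)
    return survival_at_early_end * math.exp(-late_hazard * (t_years - early_end))


# Compare the two patterns at a few time points (in years).
for t in (0.25, 1, 3, 5):
    print(f"t = {t:>4} y  constant hazard: {constant_hazard_survival(t):.2f}  "
          f"early hazard: {early_hazard_survival(t):.2f}")
```

Under these assumed parameters, both functions return 0.10 at 5 years, which is all a cumulative measure would report, while at 1 year they are far apart (roughly 0.63 versus 0.14).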

To overcome this drawback, the time to death (= survival) is analyzed for a cohort of patients and is presented in the form of a graph called a “survival curve.” The form of analysis that yields the curve is called survival analysis. Survival curves usually show the probability of surviving (without the event “death”) on the vertical axis (but, sometimes, the proportion with, rather than without, the outcome event is shown instead). In either case, the horizontal axis shows the time since the beginning of observation.

There are some features that are unique to survival times:

Survival times are nonnegative; they always proceed in the positive direction.

In many cases, the time to the event has an unusual distribution, i.e., it does not look like a normal distribution and may be skewed to the left or right. Therefore, naïve analysis of untransformed times may produce invalid results.

 Construction of the Survival Curve

Concept of censoring

Censoring is a form of missing-data problem in which the time to event is not observed, for reasons such as termination of the study before all recruited subjects have shown the event of interest or a subject leaving the study before experiencing an event.

Illustration by example

We have a hypothetical study in which we want to know about recurrence of stroke in patients with stroke. The study started in 2016 and enrolled patients who had presented with a first-ever stroke. The patients were recruited between 2016 and 2018. They were followed up every 6 months, and the study ended in 2020.

There were 200 patients enrolled during the study period. There were 20 recurrent strokes. However, there were also 50 coronary events and 30 deaths. Five died due to cancer. There were 20 patients who could not be contacted after variable periods of follow-up.

Points related to censoring in this example are as follows:

20 patients had the outcome of interest. We have complete information of interest for these patients.

Some patients remain without the event of interest (stroke recurrence) until the end of the study. These patients may have a stroke sometime later; however, for the purpose of the study, they do not have the event. Thus, they are right censored.

There are 30 deaths, and follow-up ends at death; this would be informative censoring.

There were 20 patients who could not be contacted; this would be noninformative censoring.

There would also be patients who were excluded because they had already had a recurrence of stroke when they were contacted. These were left censored.
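
For analysis, each patient in a study like this is usually reduced to two numbers: the length of follow-up and an indicator of whether the event of interest was observed. The Python sketch below shows one way to do this for right censoring. The function name `follow_up`, its parameters, the example dates, and the exact study end date are hypothetical (the text only says the study ended in 2020). The sketch only encodes the data; it does not address whether the censoring is informative.

```python
from datetime import date

# Assumed closing date; the article states only that the study ended in 2020.
STUDY_END = date(2020, 12, 31)


def follow_up(enrolled, recurrence=None, last_contact=None, study_end=STUDY_END):
    """Return (years of follow-up, event indicator) for one patient.

    event = 1: stroke recurrence was observed during the study.
    event = 0: right censored (no recurrence observed by the end of follow-up).
    last_contact is the date of death or of the last successful contact, if
    follow-up stopped before the study closed.
    """
    if recurrence is not None and recurrence <= study_end:
        end, event = recurrence, 1      # complete information
    elif last_contact is not None and last_contact < study_end:
        end, event = last_contact, 0    # died or lost to follow-up
    else:
        end, event = study_end, 0       # event-free when the study closed
    return (end - enrolled).days / 365.25, event


# Three hypothetical patients
patient_a = follow_up(date(2016, 3, 1), recurrence=date(2017, 9, 1))
patient_b = follow_up(date(2018, 6, 1))
patient_c = follow_up(date(2017, 1, 1), last_contact=date(2018, 1, 1))
print(patient_a, patient_b, patient_c)
```

Only patient A contributes an event; patients B and C contribute event-free follow-up time up to the point at which observation stopped.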

Kaplan–Meier analysis

Now that we have understood the concepts of survival probability and censoring, we can understand the simple example provided below.

Eight patients were followed for up to 11 time units. A “+” marks a censored observation. The observed times are listed below.

1 5+ 6 6 8+ 8 9 11+

The Kaplan–Meier (KM) curve for this example is shown in [Figure 2]a. The x-axis depicts time, while the y-axis depicts the estimated probability of surviving, i.e., of remaining free of the event. The graph is a step function that drops at each time at which an event is recorded. The “+” mark depicts censoring, which may be of the various types described above. [Figure 2]b lists the corresponding computation in the “survival table.”{Figure 2}
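
The computation behind such a survival table can be sketched in a few lines of Python. At each time with an event, the running survival estimate is multiplied by the fraction of patients at risk who did not have the event. The function name `kaplan_meier` is hypothetical, and the sketch assumes the usual convention that a patient censored at a given time is still counted as at risk for events at that same time.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate.

    times  : follow-up time of each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns one row (time, n_at_risk, n_events, survival) per distinct event time.
    """
    rows = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t
        n_at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= (n_at_risk - n_events) / n_at_risk
        rows.append((t, n_at_risk, n_events, survival))
    return rows


# The eight patients from the text: 1 5+ 6 6 8+ 8 9 11+
times = [1, 5, 6, 6, 8, 8, 9, 11]
events = [1, 0, 1, 1, 0, 1, 1, 0]   # 0 marks the "+" (censored) times

km_rows = kaplan_meier(times, events)
for t, n_at_risk, n_events, survival in km_rows:
    print(f"t = {t:>2}  at risk = {n_at_risk}  events = {n_events}  S(t) = {survival:.3f}")
```

For these data the sketch gives an estimated survival of 0.875 after time 1, 0.583 after time 6, 0.438 after time 8, and 0.219 after time 9. The censored times (5, 8, and 11) do not lower the curve; they only reduce the number at risk for later events.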

 Interpretation of the Survival Curve

Certain points to be kept in mind when interpreting survival curves are as follows:

The vertical axis does not represent the actual percent surviving in an actual cohort but the estimated probability of surviving for members of a hypothetical cohort. Since a probability always lies between 0 and 1, the y-axis ranges between 0 and 1 (or 0%–100%).

Often, the number of patients at risk at various points in time is shown below the horizontal axis. The estimates on the left-hand side of the curve are sound because more patients are at risk during this time. However, at the tail of the curve, on the right, the number of patients on whom the estimation is based is often small because fewer patients are available for follow-up for that length of time. As a result, the estimates of survival toward the end of the follow-up period are less precise than in the earlier period.

The survival analysis methodology can be used to analyze not only time to death (= survival) but also time to any event, and the results can be presented as a “survival curve.” The events may be remission, recurrence, stroke, or acute myocardial infarction. Therefore, instead of survival analysis, we can also call it “time-to-event” analysis [Table 1].{Table 1}

A survival curve can be used to describe survival after any length of time (1 year, 2 years, 3 years, or 5 years). This gives a more complete and detailed description of prognosis than a single cumulative figure.

When the percent (probability) of having an event, rather than not having it, is presented, the curve starts at “zero” and sweeps upward and to the right.
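
The loss of precision in the tail of the curve, noted above, can be illustrated with the eight-patient example from the Kaplan–Meier section. The sketch below uses Greenwood's formula for the variance of the Kaplan–Meier estimate. This is a standard method that the article itself does not describe, so it is brought in here as an assumption, and the function name `km_with_greenwood` is hypothetical.

```python
import math


def km_with_greenwood(times, events):
    """Kaplan-Meier estimate with Greenwood standard errors.

    Returns one row (time, n_at_risk, survival, std_error, relative_se)
    per distinct event time, where relative_se = std_error / survival.
    """
    rows = []
    survival = 1.0
    greenwood_sum = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)                       # at risk
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= (n - d) / n
        if n > d:  # term is undefined once everyone at risk has had the event
            greenwood_sum += d / (n * (n - d))
        relative_se = math.sqrt(greenwood_sum)
        rows.append((t, n, survival, survival * relative_se, relative_se))
    return rows


# Same data as the worked example: 1 5+ 6 6 8+ 8 9 11+
times = [1, 5, 6, 6, 8, 8, 9, 11]
events = [1, 0, 1, 1, 0, 1, 1, 0]

gw_rows = km_with_greenwood(times, events)
for t, n, s, se, rel in gw_rows:
    print(f"t = {t:>2}  at risk = {n}  S(t) = {s:.3f}  SE = {se:.3f}  SE/S = {rel:.2f}")
```

As the number at risk falls from 8 to 2, the standard error relative to the estimate itself rises at every step, which is why survival estimates near the right-hand end of a curve deserve more caution.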

Financial support and sponsorship


Conflicts of interest

There are no conflicts of interest.

 Further Reading

Cox D, Oakes D. Analysis of Survival Data. London: Chapman & Hall; 1984.

Kleinbaum D, Klein M. Survival Analysis: A Self-Learning Text. New York: Springer-Verlag; 2012.