Cox Proportional Hazards and Kaplan-Meier: A Survival Analysis Guide

Cox proportional hazards regression estimates hazard ratios from time-to-event data. Learn how it differs from Kaplan-Meier, how to read a hazard ratio, and the key assumption.

Dr. Sarah Mitchell

June 12, 2026

Key Takeaways

Cox proportional hazards regression estimates hazard ratios from time-to-event data while handling censored participants

Kaplan-Meier describes unadjusted survival with a curve and log-rank test; Cox regression estimates adjusted effects of predictors

A hazard ratio compares instantaneous event rates, not cumulative risk or median survival, and should be reported with confidence intervals

The proportional hazards assumption, a constant hazard ratio over time, must be checked rather than presumed

The number of events, not participants, limits how many covariates the model can support

Cox is semiparametric: the partial likelihood cancels the unspecified baseline hazard, and tied event times should use the Efron approximation, not Breslow

The hazard ratio is non-collapsible and carries a built-in selection effect over time, so pair it with an absolute measure such as restricted mean survival time

With competing risks, one minus the Kaplan-Meier estimate overestimates the cumulative risk of the event; report the cumulative incidence function and a cause-specific or Fine-Gray subdistribution model

Immortal time bias comes from misaligned time zero; handle changing exposures with a counting-process layout and distinguish a time-varying covariate from a time-varying effect

Cox proportional hazards regression is a survival analysis method that estimates how one or more predictors relate to the rate at which an event happens over time, while handling the fact that not everyone in a study experiences the event before it ends. It produces a hazard ratio, the relative rate of the event in one group compared with another at any given moment, without requiring you to assume a particular shape for the underlying survival curve. That flexibility is why it is the workhorse of clinical and epidemiological time-to-event analysis.

Survival data are different from the outcomes ordinary regression handles. The outcome is not simply whether an event occurred but when, and many participants are censored, meaning the study ended or they were lost to follow-up before the event happened. You know only that they survived event-free up to a certain point. Cox regression and its companion, the Kaplan-Meier estimator, are built to use that partial information rather than discard it.
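As a minimal sketch of how that partial information is encoded in R, assuming the survival package is installed. The toy data frame and its column names are made up for illustration:

```r
library(survival)

# Hypothetical toy data: follow-up time in months and an event indicator
# (1 = event observed, 0 = censored at that time)
toy <- data.frame(
  time   = c(5, 8, 12, 12, 20),
  status = c(1, 0, 1, 0, 0)
)

# Surv() pairs each time with its status; censored times print with a "+"
s <- Surv(toy$time, toy$status)
print(s)

n_events   <- sum(toy$status == 1)
n_censored <- sum(toy$status == 0)
```

Every row contributes follow-up time to the analysis, whether or not the event was observed.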

Kaplan-Meier and Cox: two tools, two jobs

The Kaplan-Meier estimator describes survival. It produces the familiar stepped survival curve showing the probability of remaining event-free over time, and groups can be compared visually and tested with the log-rank test. It is descriptive and unadjusted: it shows what happened in each group but cannot account for other variables.
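A short sketch of the descriptive step, using the lung dataset that ships with the survival package. The choice of sex as the grouping variable is only for illustration:

```r
library(survival)

# lung ships with the survival package; status is coded 1 = censored, 2 = dead
km <- survfit(Surv(time, status) ~ sex, data = lung)
print(km)                      # events and median survival per group

# Log-rank test comparing the two curves
lr <- survdiff(Surv(time, status) ~ sex, data = lung)
p_logrank <- 1 - pchisq(lr$chisq, df = length(lr$n) - 1)

# Stepped survival curves, one line type per group
plot(km, lty = 1:2, xlab = "Days", ylab = "Survival probability")
```

Nothing here adjusts for age or any other covariate; that is the job of the Cox model.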

Cox proportional hazards regression explains and adjusts. It estimates the effect of a predictor on the hazard while holding other covariates constant, which is what you need when groups differ on characteristics beyond the exposure of interest. In practice the two are used together: Kaplan-Meier curves and a log-rank test to show the unadjusted picture, then a Cox model to estimate adjusted hazard ratios.

Interpreting a hazard ratio

The central output of a Cox model is the hazard ratio. A value of 1 means no difference in the event rate between groups; a value above 1 means a higher rate and a value below 1 a lower rate. A hazard ratio of 1.5 for a treatment, for example, indicates that at any given instant the event occurs at one and a half times the rate in that group among participants still at risk. The confidence interval conveys the precision of the estimate and the p-value its statistical significance.
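To make the reading concrete, here is a sketch that pulls hazard ratios and their 95% confidence intervals out of a fitted model. It assumes the lung dataset from the survival package; the covariates are chosen only as an example:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ sex + age, data = lung)

# Coefficients are log hazard ratios, so exponentiate both the estimates
# and the confidence limits to report them on the hazard ratio scale
hr <- cbind(HR = exp(coef(fit)), exp(confint(fit)))
print(round(hr, 2))
```

In this dataset sex is coded 1 for male and 2 for female, so the sex row compares women with men at the same age.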

A common error is to read a hazard ratio as a risk ratio or as a statement about how many people will eventually have the event. It is a ratio of instantaneous rates, not of cumulative outcomes, and it does not by itself tell you the difference in median survival. Reporting it alongside Kaplan-Meier curves gives readers both the rate comparison and the absolute survival picture. The distinction between rate-based and risk-based measures parallels the issues covered in our logistic regression guide for binary outcomes.

The assumption that defines the method

The model is named for its key assumption: proportional hazards. It assumes the hazard ratio between groups stays constant over the whole follow-up period. If a treatment's effect grows, shrinks, or reverses over time, the assumption is violated and a single hazard ratio misrepresents the data.

Checking this assumption is not optional. Analysts inspect scaled Schoenfeld residuals over time, formally test the correlation between those residuals and time, and examine whether survival curves cross. When the assumption fails, options include adding a time-varying effect, stratifying on the offending variable, or modeling distinct time periods. Reporting a hazard ratio without ever checking proportional hazards is one of the most frequent reasons a survival analysis draws reviewer criticism.
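A sketch of the check in R, assuming the veteran dataset from the survival package, in which the Karnofsky score (karno) is a commonly used example of a covariate whose effect fades over time:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ trt + karno + age, data = veteran)

# Score test of each covariate's scaled Schoenfeld residuals against time,
# plus a GLOBAL row for the model as a whole
ph <- cox.zph(fit)
print(ph)

# Smoothed residual plot for the second covariate (karno);
# a flat line is what proportional hazards would look like
plot(ph[2])
```

A small p-value flags a covariate for a closer look, but the plot shows the direction and size of the departure, which the test alone does not.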

Reporting standards

A sound survival analysis reports the number of events and the censoring, presents Kaplan-Meier curves with a numbers-at-risk table, states the variables in the Cox model, gives hazard ratios with confidence intervals, and documents how the proportional hazards assumption was checked.
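One way to assemble the numbers-at-risk table is sketched below, again on the lung dataset from the survival package. The reporting times are arbitrary choices for illustration:

```r
library(survival)

km <- survfit(Surv(time, status) ~ sex, data = lung)

# Numbers at risk and survival at chosen follow-up times (days), by group
at <- summary(km, times = c(0, 180, 360, 540))
risk_table <- data.frame(
  group    = at$strata,
  time     = at$time,
  n_risk   = at$n.risk,
  survival = round(at$surv, 2)
)
print(risk_table)

# Counts of censored observations and events (1 = censored, 2 = dead here)
table(lung$status)
```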

Common mistakes

Skipping the proportional hazards check. A constant hazard ratio is an assumption that must be tested, not presumed.
Reading a hazard ratio as a risk ratio. It compares instantaneous rates, not cumulative outcomes or median survival.
Ignoring censoring. Treating censored participants as event-free or dropping them biases the estimate.
Reporting Cox without Kaplan-Meier. The adjusted ratio and the absolute survival picture answer different questions; readers need both.
Overfitting with too many covariates. The number of events, not the number of participants, limits how many predictors the model can support.

What the model actually estimates: the partial likelihood

The Cox model writes the hazard for a participant with covariates x as the product of an unspecified baseline hazard and an exponential term:

h(t | x) = h0(t) * exp( b1*x1 + b2*x2 + ... + bk*xk )

The clever part is that the baseline hazard h0(t), the event rate over time for a reference participant, is never estimated to get the hazard ratios. Cox's partial likelihood conditions on the ordered event times and asks, at each event, which of the people still at risk was the one who failed. Because h0(t) is shared by everyone in the risk set, it cancels, leaving a likelihood that depends only on the coefficients. This is why the method is called semiparametric: it makes no assumption about the shape of the survival curve, yet still estimates covariate effects.
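The cancellation can be seen by computing the log partial likelihood by hand and comparing it with what coxph reports. This is a sketch for a single covariate, with ties handled the Breslow way so the hand calculation stays simple; the helper log_partial_lik is written for this example and is not part of any package:

```r
library(survival)

# Log partial likelihood for a single covariate x at coefficient value b.
# Tied event times are handled the Breslow way: every tied event uses the
# same risk set (everyone whose time is >= that event time).
log_partial_lik <- function(b, time, status, x) {
  eta <- b * x
  event_idx <- which(status == 1)
  sum(vapply(event_idx, function(i) {
    at_risk <- time >= time[i]
    eta[i] - log(sum(exp(eta[at_risk])))
  }, numeric(1)))
}

d <- data.frame(time   = lung$time,
                status = lung$status - 1,   # recode 1/2 to 0/1
                x      = lung$age)

fit     <- coxph(Surv(time, status) ~ x, data = d, ties = "breslow")
by_hand <- log_partial_lik(unname(coef(fit)), d$time, d$status, d$x)

c(by_hand = by_hand, coxph = fit$loglik[2])
```

The baseline hazard never appears in the function: each event contributes only its own linear predictor relative to those of the people still at risk.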

One technical choice matters more than people expect: tied event times. When two or more events share a time, their order is unknown and the software must approximate the partial likelihood. The Efron approximation is the sensible default and is far more accurate than the older Breslow method when ties are common; reserve the exact method for heavily tied discrete-time data. If your output looks off and you have many ties, check which approximation was used, because defaults differ between software packages.
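A quick way to see how much the choice matters in a given dataset is to fit the same model under each method. The sketch below uses the lung dataset, where survival is recorded in days and only some event times are tied, so the differences should be expected to be small:

```r
library(survival)

# Fit the same model under each tie-handling method
methods <- c("efron", "breslow", "exact")
fits <- lapply(methods, function(m)
  coxph(Surv(time, status) ~ age + sex, data = lung, ties = m))
names(fits) <- methods

# Coefficients (log hazard ratios) side by side
coefs <- sapply(fits, coef)
round(coefs, 4)
```

If the columns disagree noticeably in your own data, ties are frequent enough that the method deserves a sentence in the methods section.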

The hazard ratio is a slippery estimand

Two subtleties keep the hazard ratio from being the simple number it appears to be. First, like the odds ratio, it is non-collapsible: adjusting for a strong predictor of survival shifts the hazard ratio even when that predictor is not a confounder. Second, and less well known, the hazard ratio has a built-in selection effect over time (Hernán, 2010). The risk set at later times is made up of survivors, who are systematically different from the original cohort, so a hazard ratio that looks constant can still hide a changing effect. The practical response is to stop treating a single hazard ratio as the whole story and to report it alongside an absolute, time-anchored measure such as survival at a fixed horizon or a restricted mean survival time. The restricted mean has a plain interpretation (average event-free time up to a chosen point) and does not depend on proportional hazards at all.
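The restricted mean survival time is simply the area under the Kaplan-Meier curve up to the chosen horizon. The sketch below computes it directly to make that definition visible; rmst_km is a helper written for this example, the one-year horizon is an arbitrary choice, and it returns point estimates only, with no standard errors:

```r
library(survival)

# Area under the Kaplan-Meier step function from 0 to tau
rmst_km <- function(time, status, tau) {
  fit  <- survfit(Surv(time, status) ~ 1)
  keep <- fit$time < tau
  t    <- c(0, fit$time[keep], tau)   # interval boundaries
  s    <- c(1, fit$surv[keep])        # survival level on each interval
  sum(diff(t) * s)
}

tau    <- 365   # horizon in days; must lie inside the observed follow-up
male   <- subset(lung, sex == 1)
female <- subset(lung, sex == 2)

rmst_male   <- rmst_km(male$time,   male$status,   tau)
rmst_female <- rmst_km(female$time, female$status, tau)

c(male = rmst_male, female = rmst_female,
  difference = rmst_female - rmst_male)
```

The difference reads as extra event-free days gained over the first year, a statement that needs no proportional hazards assumption.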

When proportional hazards fails, you have real options

Test the assumption formally with scaled Schoenfeld residuals (the cox.zph function), looking at both the global test and each covariate, and inspect the residual plots rather than trusting the p-value alone. When it fails, the fix depends on the pattern:

A time-varying effect. Interact the covariate with a function of time so the hazard ratio is allowed to change, and report how it changes rather than a single average.
Stratification. Stratify on the offending variable, which lets each stratum keep its own baseline hazard while you estimate the other effects.
A different estimand. Switch to a restricted mean survival time difference, which summarises the whole follow-up in one interpretable number when the curves are not proportional.
A parametric model. An accelerated failure time model (for instance Weibull or log-normal) models survival time directly and can be more efficient and easier to interpret when its distributional assumption fits.
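Three of these options can be sketched with the survival package alone. The example assumes the veteran dataset; the log(t) form for the time-varying effect and the choice of cell type as the stratifying variable are illustrative, not recommendations:

```r
library(survival)

# Time-varying effect: the karno coefficient is allowed to change with log(time)
tv <- coxph(Surv(time, status) ~ trt + karno + tt(karno), data = veteran,
            tt = function(x, t, ...) x * log(t))

# Stratification: a separate baseline hazard for each cell type,
# with common coefficients for trt and karno
st <- coxph(Surv(time, status) ~ trt + karno + strata(celltype), data = veteran)

# Parametric alternative: Weibull accelerated failure time model
aft <- survreg(Surv(time, status) ~ trt + karno, data = veteran, dist = "weibull")

coef(tv)   # the tt(karno) row describes how the karno effect drifts with log(time)
```

A stratified model gives no hazard ratio for the stratifying variable itself, which is the price of letting its baseline hazard vary freely.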

Competing risks change the question, not just the model

If a participant can experience an event that prevents the one you are studying, for example dying of another cause before the cancer recurrence you are tracking, you have competing risks. One minus the standard Kaplan-Meier estimate then overestimates the cumulative risk, because it treats the competing event as ordinary censoring and so assumes those participants could still go on to have the event. Two correct paths exist. Model the cause-specific hazard with a Cox model that censors competing events, which answers an aetiological question, or model the subdistribution hazard with a Fine-Gray model, which links directly to the cumulative incidence function, the quantity you actually want when predicting absolute risk in the presence of competing events. Report cumulative incidence functions (the Aalen-Johansen estimator) instead of Kaplan-Meier whenever competing events are non-trivial.
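The size of the overestimate can be checked directly. The sketch below assumes the mgus2 dataset from the survival package, where progression to plasma cell malignancy (pcm) competes with death; the 240-month horizon is an arbitrary choice for illustration:

```r
library(survival)

# Build time to first event and a three-level event factor
d <- mgus2
d$etime <- with(d, ifelse(pstat == 0, futime, ptime))
d$event <- factor(with(d, ifelse(pstat == 0, 2 * death, 1)),
                  levels = 0:2, labels = c("censor", "pcm", "death"))

# Naive: 1 - Kaplan-Meier, treating death without progression as censoring
km_naive <- survfit(Surv(etime, pstat) ~ 1, data = d)

# Aalen-Johansen: a factor status makes survfit estimate state probabilities
aj <- survfit(Surv(etime, event) ~ 1, data = d)

horizon    <- 240   # months
naive_risk <- 1 - km_naive$surv[max(which(km_naive$time <= horizon))]
aj_risk    <- aj$pstate[max(which(aj$time <= horizon)), aj$states == "pcm"]

c(naive = naive_risk, aalen_johansen = aj_risk)

# Cause-specific Cox model for progression: competing deaths are censored
cs <- coxph(Surv(etime, event == "pcm") ~ age + sex, data = d)
```

The gap between the two estimates grows with the frequency of the competing event, which is why it matters most in older or sicker cohorts.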

Design traps: immortal time bias and time-varying covariates

The most damaging survival errors are in the design, not the model. Immortal time bias arises when follow-up time during which the outcome could not occur is misattributed to a group, for instance classifying patients by a treatment they could only have received by surviving long enough to receive it. The cure is to align time zero so that eligibility, treatment assignment, and the start of follow-up coincide, and to handle genuinely time-dependent exposures with a counting-process (start-stop) data layout rather than a single baseline value. A treatment that changes during follow-up is a time-varying covariate; an effect that changes during follow-up is a time-varying coefficient. They are different problems with different fixes, and conflating them is a common slip.
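The Stanford heart transplant data, shipped with the survival package in two layouts, give a compact illustration. In jasa each patient has one row and a flag for ever receiving a transplant; in heart the same study is laid out in start-stop form. This is a sketch of the contrast, not a full analysis:

```r
library(survival)

# Naive analysis: classify patients as "ever transplanted" from time zero,
# so waiting time before the transplant is credited to the transplant group
naive <- coxph(Surv(futime, fustat) ~ transplant, data = jasa)

# Counting-process layout: one row per interval (start, stop],
# with transplant switching from 0 to 1 at the transplant date
head(heart[, c("id", "start", "stop", "event", "transplant")])
tdc <- coxph(Surv(start, stop, event) ~ transplant, data = heart)

naive_hr      <- unname(exp(coef(naive)))
start_stop_hr <- unname(exp(coef(tdc)))
c(naive_HR = naive_hr, start_stop_HR = start_stop_hr)
```

The naive model credits transplant recipients with the time they had to survive in order to be transplanted, which pushes its hazard ratio toward an apparent benefit.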

A worked analysis in R

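library(survival)

# Assumes a data frame d with columns time, status (1 = event, 0 = censored),
# group (two levels), age, stage, and event_type (0 censor, 1 event, 2 competing)

# Kaplan-Meier and the adjusted Cox model (Efron handling of ties is the default)
km  <- survfit(Surv(time, status) ~ group, data = d)
survdiff(Surv(time, status) ~ group, data = d)   # log-rank test
fit <- coxph(Surv(time, status) ~ group + age + stage, data = d)
summary(fit)                       # hazard ratios with confidence intervals

# Test proportional hazards: global and per-covariate
ph <- cox.zph(fit); print(ph); plot(ph)

# Restricted mean survival time difference (no proportional-hazards assumption)
# arm must be coded 0/1; tau must not exceed the shorter arm's longest follow-up
library(survRM2)
rmst2(d$time, d$status, arm = as.integer(factor(d$group)) - 1, tau = 60)

# Competing risks: cumulative incidence and a Fine-Gray subdistribution model
# tidycmprsk expects a factor status whose first level is censoring
library(tidycmprsk)
d$event_type <- factor(d$event_type, levels = 0:2,
                       labels = c("censor", "event", "competing"))
cuminc(Surv(time, event_type) ~ group, data = d)
crr(Surv(time, event_type) ~ group + age, data = d)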

Bringing it together

Cox proportional hazards regression and the Kaplan-Meier estimator together turn time-to-event data into a defensible story: the curves show what happened, the log-rank test compares groups, and the Cox model estimates adjusted hazard ratios, provided the proportional hazards assumption holds. Done carefully, with censoring handled and assumptions checked, survival analysis is among the most informative tools in clinical research.

To plot survival curves from your own data with censoring marks, median survival, and a log-rank test between groups, use the free Kaplan-Meier plotter. To recover data points from a published survival figure before modelling, the survival curve digitizer extracts the coordinates.

Pro Tip

Always test proportional hazards

Inspect scaled Schoenfeld residuals over time and check whether survival curves cross. If the hazard ratio is not constant, a single number misrepresents the data.

Pro Tip

Pair Cox with Kaplan-Meier

Report Kaplan-Meier curves with a numbers-at-risk table alongside the adjusted hazard ratios so readers see both absolute survival and the rate comparison.

Pro Tip

Model competing events instead of censoring them

If patients can die of other causes before the event you study, a standard Kaplan-Meier overstates the risk of that event. Report the cumulative incidence function and fit a Fine-Gray subdistribution or cause-specific hazard model.

Pro Tip

Align time zero to avoid immortal time bias

Follow-up must start when eligibility and exposure are defined. Classifying patients by a treatment they could only receive after surviving some period manufactures a spurious survival benefit.

Frequently Asked Questions

What is the difference between Kaplan-Meier and Cox regression?

Kaplan-Meier is descriptive: it estimates and plots unadjusted survival over time and compares groups with the log-rank test. Cox proportional hazards regression is explanatory: it estimates the effect of one or more predictors on the hazard while adjusting for other covariates, producing hazard ratios.

How do you interpret a hazard ratio?

A hazard ratio of 1 means no difference in the event rate between groups, above 1 means a higher rate, and below 1 a lower rate. It compares instantaneous rates at any moment, not cumulative risk or median survival, so it should be reported with a confidence interval and read alongside survival curves.

How do you check the proportional hazards assumption?

Inspect scaled Schoenfeld residuals plotted against time, test the correlation between those residuals and time formally, and examine whether the Kaplan-Meier curves cross. If the assumption fails, options include time-varying effects, stratification, or modeling distinct time periods.

What is censoring?

Censoring occurs when a participant does not experience the event before the study ends or is lost to follow-up, so you know only that they remained event-free up to a certain point. Survival methods use this partial information rather than discarding the participant, which would bias the estimate.

Written by

Dr. Sarah Mitchell

PhD, Biostatistics & Research Methodology

Systematic Review Methodology, Meta-Analysis, Biostatistics

Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences.
