Statistical Tests for Comparing Two Survival Graphs
The log-rank test is the primary and most widely used statistical test for comparing two survival curves; it evaluates whether the Kaplan-Meier estimates of the two groups differ. 1
Primary Test: Log-Rank Test
The log-rank test is specifically designed to compare different Kaplan-Meier (KM) survival estimates and is the standard approach for testing equality of survivor functions between two groups. 1
This test is most powerful under the proportional hazards assumption and is appropriate for evaluating the effect of prognostic factors on endpoints such as overall survival, disease-free survival, or progression-free survival. 1
The log-rank test can be easily implemented in all major statistical software packages (R, Stata, SPSS) using simple commands. 1
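To make the mechanics concrete, here is a minimal sketch of the two-group log-rank calculation using only the Python standard library. The function name `logrank_test` and its argument layout are illustrative choices, not part of any package; in practice the built-in routine of a statistics package would be used instead. At each distinct event time the sketch compares the observed number of events in group A with the number expected under equal hazards, then refers the squared, variance-scaled total to a chi-square distribution with one degree of freedom.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (illustrative sketch).

    times_*  : follow-up times
    events_* : 1 if the event was observed, 0 if censored
    Returns (chi-square statistic, two-sided p-value).
    """
    # Pool both groups as (time, event, group); group 0 = A, group 1 = B
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in A
    variance = 0.0       # sum of hypergeometric variances
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for x, _, _ in data if x >= t)               # at risk, both groups
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)  # at risk, group A
        d = sum(1 for x, e, _ in data if x == t and e)         # events at t, both groups
        d_a = sum(1 for x, e, g in data if x == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)` compares a group whose events all occur early with a group whose events all occur late; the made-up times are for illustration only.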
Alternative Tests Available
While the log-rank test is most common, several other tests exist for comparing KM estimates, though they are used less frequently: 1
Weighted log-rank tests, which give more weight to either early or late events, can be applied depending on the specific hypothesis being tested. 1
The Mantel-Byar test is available for specific scenarios, such as when group membership changes during follow-up (a time-dependent covariate), though it requires custom scripting in most software packages. 1
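As an illustration of how a weighted log-rank test differs from the standard one, the sketch below multiplies each event time's contribution by a weight that depends on the number of patients still at risk. The function name `weighted_logrank` and the `weight` parameter are hypothetical. The Gehan-Breslow weighting used in the example (weight equal to the number at risk, which emphasises early differences) is one well-known choice and is not named in the source; a constant weight of 1 reproduces the standard log-rank test.

```python
import math

def weighted_logrank(times_a, events_a, times_b, events_b, weight=lambda n: 1.0):
    """Weighted two-group log-rank test (illustrative sketch).

    weight : function of the number at risk n at each event time.
             lambda n: 1.0 gives the standard log-rank test;
             lambda n: n gives Gehan-Breslow weighting (assumed example).
    Returns (chi-square statistic, two-sided p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    numerator = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for x, _, _ in data if x >= t)
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)
        d = sum(1 for x, e, _ in data if x == t and e)
        d_a = sum(1 for x, e, g in data if x == t and e and g == 0)
        w = weight(n)
        numerator += w * (d_a - d * n_a / n)
        if n > 1:
            variance += w * w * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    chi2 = numerator ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Made-up data: all events in group A precede all events in group B
early = [1, 2, 3, 4, 5]
late = [6, 7, 8, 9, 10]
observed = [1] * 5
standard = weighted_logrank(early, observed, late, observed)
gehan = weighted_logrank(early, observed, late, observed, weight=lambda n: n)
```

The two weightings give different statistics for the same data, which is why the weighting scheme should be chosen before looking at the results rather than after.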
Special Consideration: Competing Risk Analysis
When competing events are present (such as death before treatment or transplant-related mortality versus disease relapse), Gray's test should be used instead of the log-rank test to compare cumulative incidence curves. 1
The standard log-rank test is inappropriate when patients experience competing events because it only considers one possible event and censors patients who experience alternative outcomes. 1
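The effect of censoring competing events can be seen in a small sketch. The code below does not implement Gray's test; it only computes the quantity Gray's test compares, the cumulative incidence of one event type, and contrasts it with the naive "1 minus Kaplan-Meier" value obtained by censoring patients who had the competing event. The function names and the coding of causes (0 = censored, 1 = event of interest, 2 = competing event) are assumptions made for this illustration, and the data are made up.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Cumulative incidence of one cause in the presence of competing events.

    causes: 0 = censored, any other integer = type of event.
    Returns a list of (time, cumulative incidence) at each event time.
    """
    data = list(zip(times, causes))
    surv = 1.0  # probability of being free of ANY event just before t
    cif = 0.0
    curve = []
    for t in sorted({t for t, c in data if c != 0}):
        n = sum(1 for x, _ in data if x >= t)                     # at risk
        d_all = sum(1 for x, c in data if x == t and c != 0)      # any event at t
        d_k = sum(1 for x, c in data if x == t and c == cause_of_interest)
        cif += surv * d_k / n      # must survive all causes up to t, then fail from k
        surv *= 1 - d_all / n
        curve.append((t, cif))
    return curve

def naive_one_minus_km(times, causes, cause_of_interest=1):
    """1 - Kaplan-Meier, treating competing events as censored (biased upward)."""
    data = list(zip(times, causes))
    surv = 1.0
    for t in sorted({t for t, c in data if c == cause_of_interest}):
        n = sum(1 for x, _ in data if x >= t)
        d = sum(1 for x, c in data if x == t and c == cause_of_interest)
        surv *= 1 - d / n
    return 1 - surv

# Four patients, no censoring: two relapse (1), two die of a competing cause (2)
times = [1, 2, 3, 4]
causes = [2, 1, 2, 1]
cif_final = cumulative_incidence(times, causes)[-1][1]   # 0.5
naive_final = naive_one_minus_km(times, causes)          # 1.0
```

Two of the four patients relapse, and the cumulative incidence estimate reflects that (0.5), whereas the naive approach reports 1.0 because it treats patients who died of the competing cause as if they could still relapse.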
Important Caveats
The log-rank test is most powerful under proportional hazards; when survival curves cross or the proportional hazards assumption is otherwise violated, early and late differences can cancel out, so the test may lose power and yield misleading results. 2
When reporting log-rank test results in publications, clearly denote whether the p-value is one-sided or two-sided. 1
The log-rank test is most appropriate for time-to-event data where censoring occurs, distinguishing it from simple binary outcome comparisons. 1