Sensitivity and Specificity of the PHQ-9
The PHQ-9 has a sensitivity of 89.5% and specificity of 77.5% at a cut-off score of 11 for detecting major depressive disorder, making it an effective screening tool for depression in clinical settings. 1
Diagnostic Accuracy of the PHQ-9
- The PHQ-9 is a validated depression screening tool that covers all nine DSM criteria for depression. Each item is scored from 0 to 3 according to symptom frequency over the past two weeks, giving a total score of 0 to 27 2
- When validated against the Diagnostic Interview Schedule for Children-IV (DISC-IV), a structured diagnostic interview for children and adolescents used as the reference standard, the PHQ-9 had a sensitivity of 89.5% and a specificity of 77.5% at a cut-off score of 11 1
- The positive predictive value (PPV) is 15.2% and the negative predictive value (NPV) is 99.4%. A score below the threshold therefore rules out depression with high confidence, whereas most scores above it are false positives that need confirmation by diagnostic assessment 1
- While the traditional cut-off for the PHQ-9 is 10, some guidelines recommend a cut-off score of 8 based on studies of diagnostic accuracy in specific populations such as cancer patients 1, 2
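The gap between a high sensitivity and a low PPV follows from Bayes' rule once prevalence is taken into account. The sketch below is illustrative only: the function name `predictive_values` is hypothetical, and the prevalence of 4.3% is an assumed value (it is not reported above) chosen because it roughly reproduces the quoted PPV and NPV.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are the figures quoted for a cut-off of 11.
# ASSUMPTION: prevalence of 4.3% is an illustrative value, not taken from the source.
ppv, npv = predictive_values(0.895, 0.775, 0.043)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With the same sensitivity and specificity, a higher assumed prevalence raises the PPV and lowers the NPV, which is one reason predictive values do not transfer directly between settings.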
Cut-off Scores and Their Impact on Diagnostic Properties
- A meta-analysis of PHQ-9 performance across multiple studies found that cut-off scores between 8 and 11 provide acceptable diagnostic properties with no substantial differences in pooled sensitivity and specificity 3
- At lower cut-off scores (≥7), specificity decreases to 0.73 (95% CI 0.63-0.82), while at higher cut-off scores (≥15), specificity increases to 0.96 (95% CI 0.94-0.97) 3
- An individual participant data meta-analysis found that a cut-off score of 10 maximized combined sensitivity and specificity across various patient subgroups 4
- The diagnostic accuracy of the PHQ-9 varies depending on the type of diagnostic interview used as the reference standard, with sensitivity 5-22% higher when compared to semi-structured interviews versus fully structured interviews 4
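One common way to express "maximized combined sensitivity and specificity" is Youden's J (sensitivity + specificity - 1). Whether the cited meta-analysis used exactly this criterion is an assumption. The sketch below computes J for a range of cut-offs; the function names are hypothetical and the scores are invented toy data, so the selected cut-off carries no clinical meaning.

```python
def sens_spec(scores, has_mdd, cutoff):
    """Sensitivity and specificity of the rule `score >= cutoff`."""
    pairs = list(zip(scores, has_mdd))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, has_mdd, candidates):
    """Return the candidate cut-off with the largest Youden's J."""
    def youden(cutoff):
        sens, spec = sens_spec(scores, has_mdd, cutoff)
        return sens + spec - 1
    return max(candidates, key=youden)


# ASSUMPTION: invented toy data for illustration, not from any study.
scores = [1, 2, 4, 5, 7, 9, 11, 8, 10, 13, 15, 18, 21]
has_mdd = [False] * 7 + [True] * 6

for cutoff in range(7, 16):
    sens, spec = sens_spec(scores, has_mdd, cutoff)
    print(f"cut-off >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print("cut-off with largest J:", best_cutoff(scores, has_mdd, range(7, 16)))
```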
PHQ-2 as an Initial Screening Tool
- The PHQ-2, which consists of the first two items of the PHQ-9 assessing depressed mood and anhedonia, can be used as an initial screening step 1, 5
- With a cut-off score of 3, the PHQ-2 has a sensitivity of 73.7% and specificity of 75.2% for detecting major depressive disorder 1
- A two-step approach, in which only patients scoring ≥2 on the PHQ-2 go on to complete the PHQ-9 (cut-off ≥10), has similar sensitivity (0.82 versus 0.86) but higher specificity (0.87 versus 0.85) than the PHQ-9 alone 5
- This combined approach can reduce the number of patients needing to complete the full PHQ-9 by approximately 57% 5
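The two-step logic described above can be sketched as a small function. The name `two_step_screen`, its parameters, and the returned status strings are hypothetical; only the cut-offs (PHQ-2 ≥2, PHQ-9 ≥10) come from the text.

```python
def two_step_screen(first_two, remaining_seven=None, phq2_cutoff=2, phq9_cutoff=10):
    """Two-step screen: PHQ-2 first, full PHQ-9 only if the PHQ-2 is positive.

    first_two: responses (0-3) to PHQ-9 items 1 and 2, i.e. the PHQ-2.
    remaining_seven: responses (0-3) to items 3-9, or None if not yet asked.
    """
    phq2_score = sum(first_two)
    if phq2_score < phq2_cutoff:
        # Step 1 negative: the remaining items are never administered.
        return "negative"
    if remaining_seven is None:
        # Step 1 positive: the patient should complete items 3-9.
        return "administer remaining items"
    phq9_score = phq2_score + sum(remaining_seven)
    return "positive" if phq9_score >= phq9_cutoff else "negative"


print(two_step_screen((0, 1)))                         # PHQ-2 score 1
print(two_step_screen((1, 1)))                         # PHQ-2 score 2, items 3-9 pending
print(two_step_screen((2, 2), (1, 1, 1, 1, 1, 1, 0)))  # PHQ-9 score 10
```

Returning early at step 1 is what saves patients from completing the full questionnaire.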
Scoring Methods and Their Impact
- The PHQ-9 can be scored using either an algorithm based on DSM-IV criteria or a summed-item score method 6
- The summed-item score method at a cut-off point of ≥10 has better diagnostic performance for screening purposes compared to the algorithm scoring method, which tends to have lower sensitivity 6
- With the continuous scoring method, a cut-off point of >9 (equivalent to ≥10, since scores are whole numbers) gave the best balance of sensitivity (77.5%) and specificity (86.7%) in some population-based studies 7
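The two scoring methods can give different results for the same responses. The sketch below assumes the commonly described PHQ-9 diagnostic algorithm, which is not spelled out above: at least five symptoms present "more than half the days" (item score ≥2), with item 9 counting if present at all, and at least one of items 1 or 2 among them. The function names are hypothetical.

```python
def summed_score_positive(items, cutoff=10):
    """Summed-item method: positive if the total of the nine items reaches the cut-off."""
    return sum(items) >= cutoff


def algorithm_positive(items):
    """Algorithm method (ASSUMED rule, see lead-in): >= 5 symptoms present,
    including depressed mood or anhedonia (items 1 and 2)."""
    if len(items) != 9:
        raise ValueError("expected nine item responses")
    # Items 1-8 count at 'more than half the days' (>= 2); item 9 counts if > 0.
    present = [score >= 2 for score in items[:8]] + [items[8] >= 1]
    has_cardinal_symptom = present[0] or present[1]
    return has_cardinal_symptom and sum(present) >= 5


# Illustrative responses: many mild symptoms, few frequent ones.
responses = [2, 2, 1, 1, 1, 1, 1, 1, 0]
print("summed score:", sum(responses), "->", summed_score_positive(responses))
print("algorithm   :", algorithm_positive(responses))
```

In this example the total reaches 10, so the summed-item method is positive, while only two symptoms meet the frequency threshold, so the algorithm is negative. Cases like this are consistent with the lower sensitivity reported for algorithm scoring.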
Clinical Implications and Implementation
- The PHQ-9 is recommended for depression screening at initial diagnosis, at appropriate intervals, and with changes in disease or treatment status 1
- Special attention should be paid to item 9 of the PHQ-9, which assesses thoughts of self-harm, as positive responses warrant immediate referral for emergency evaluation 1, 8
- Management decisions can be guided by PHQ-9 scores: mild symptoms (1-7), moderate symptoms (8-14), and moderate to severe/severe symptoms (15-27) 2
- For patients with moderate to severe symptoms (PHQ-9 score ≥15), immediate referral to psychology and/or psychiatry is recommended 2
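The score bands and the item 9 check above can be combined in a simple triage helper. This is a sketch, not a clinical decision rule: the function name `triage` and the dictionary keys are hypothetical, and the "none" label for a total of 0 is an assumption, since the bands above start at 1.

```python
def triage(items):
    """Summarize nine PHQ-9 responses (each 0-3) using the bands quoted above."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine responses, each scored 0-3")
    total = sum(items)
    if total >= 15:
        band = "moderate to severe/severe"
    elif total >= 8:
        band = "moderate"
    elif total >= 1:
        band = "mild"
    else:
        band = "none"  # ASSUMPTION: a total of 0 is not covered by the bands above
    return {
        "total": total,
        "band": band,
        # Any non-zero response to item 9 (thoughts of self-harm) is flagged.
        "item9_positive": items[8] > 0,
        # Scores >= 15 prompt referral to psychology and/or psychiatry.
        "refer": total >= 15,
    }


print(triage([3, 3, 3, 2, 2, 1, 1, 0, 0]))
print(triage([1, 1, 1, 1, 1, 1, 1, 0, 1]))
```

The item 9 flag is reported separately from the band because a positive response warrants evaluation regardless of the total score.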
Limitations and Considerations
- The PHQ-9's diagnostic accuracy may vary across populations and settings, so cut-off scores should be chosen with the context in mind 4
- The reference standard also matters: sensitivity estimates are higher when the PHQ-9 is compared with semi-structured rather than fully structured diagnostic interviews, which complicates comparisons between validation studies 4
- Cultural factors, learning disabilities, cognitive impairments, and age may affect the accuracy of the PHQ-9, requiring adjustments in assessment approaches 2
- When screening for depression in adolescents, the PHQ-9 Modified for Teens may be more appropriate 1