Statistical Test Selection for Pediatric Obesity Research with Multiple Categories
When analyzing the relationship between pediatric groups and obesity status defined by more than three diagnostic categories, multinomial logistic regression (Option D) is the appropriate statistical approach.
Why Multinomial Logistic Regression is Required
Multinomial logistic regression is designed for polytomous (multi-category) outcomes and is the recommended method when pediatric obesity research involves more than three weight classifications 1. The American Heart Association establishes that comprehensive pediatric obesity assessment requires multiple BMI percentile categories (underweight, normal weight, overweight, obesity, and severe obesity), which yields four to five distinct diagnostic categories 1, 2.
Key Advantages of Multinomial Regression
Multinomial logistic regression simultaneously models all outcome categories while accounting for the relationships between them, avoiding the inflation of Type I error that occurs when running multiple separate analyses 1.
This method preserves the full gradient of risk across weight categories rather than collapsing clinically meaningful distinctions into binary outcomes, which loses critical information and reduces statistical power 1.
Studies have successfully employed multinomial logistic regression to predict weight categories based on BMI percentiles and demographic factors in pediatric populations, demonstrating its practical utility 1.
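To make "simultaneously models all outcome categories" concrete, here is a minimal sketch of how a baseline-category (multinomial) logit model turns one set of linear predictors into probabilities for every weight category at once. The coefficients, the single numeric predictor `x`, and the choice of "normal weight" as the reference category are all assumptions invented for illustration. They are not estimates from any cited study.

```python
import math

# Hypothetical (intercept, slope) pairs for each non-reference category.
# These numbers are invented for illustration only.
COEFFICIENTS = {
    "underweight": (-2.0, -0.10),
    "overweight": (-1.5, 0.20),
    "obesity": (-2.0, 0.30),
    "severe obesity": (-3.5, 0.40),
}
REFERENCE = "normal weight"  # assumed reference category


def category_probabilities(x):
    """Return P(category | x) under a baseline-category logit model."""
    # One linear predictor per non-reference category; the reference is fixed at 0.
    logits = {cat: b0 + b1 * x for cat, (b0, b1) in COEFFICIENTS.items()}
    logits[REFERENCE] = 0.0
    # A shared denominator ties all categories together, so probabilities sum to 1.
    denom = sum(math.exp(v) for v in logits.values())
    return {cat: math.exp(v) / denom for cat, v in logits.items()}


probs = category_probabilities(3.0)
for cat, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{cat:15s} {p:.3f}")
```

Because every category shares one denominator, a change in `x` shifts all five probabilities together. A set of separate binary models cannot guarantee this.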
Why Other Options Are Inappropriate
T-test (Option A) - Incorrect
A t-test compares the mean of a continuous variable between two groups only and cannot model a categorical outcome with several levels 3. With more than three obesity categories, this test is fundamentally incompatible with the research design.
Linear Regression (Option B) - Incorrect
Linear regression requires a continuous dependent variable, but obesity categories (underweight, normal, overweight, obese, severely obese) are discrete, ordered classifications 3. Treating the category labels as a continuous outcome would violate the model's assumptions (a continuous response with approximately normally distributed residuals) and would impose equal spacing between categories.
Binary Logistic Regression (Option C) - Incorrect
Binary logistic regression is limited to dichotomous outcomes (two categories only) 3. While you could artificially collapse multiple obesity categories into two groups (e.g., "obese vs. not obese"), this approach discards valuable clinical information about the spectrum of weight status 1.
Running multiple binary logistic regressions for each pairwise comparison inflates Type I error and fails to account for the simultaneous relationships between all categories 1, 4.
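The Type I error inflation mentioned above can be illustrated with simple arithmetic. This sketch counts the pairwise comparisons among k categories and computes the family-wise error rate as 1 - (1 - alpha)^m. It assumes m independent tests at a per-test alpha of 0.05, which is a simplification, since pairwise comparisons that share data are not fully independent. The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_categories, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes the tests are independent, which is a simplification.
    """
    n_tests = comb(n_categories, 2)  # every pair of categories compared once
    fwer = 1 - (1 - alpha) ** n_tests
    return n_tests, fwer


for k in (3, 4, 5):
    n_tests, fwer = familywise_error_rate(k)
    print(f"{k} categories: {n_tests} pairwise tests, family-wise error ~ {fwer:.3f}")
```

Under these assumptions, five weight categories imply 10 pairwise tests and roughly a 40% chance of at least one false-positive finding, compared with the nominal 5% for a single test.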
Clinical Context Supporting This Choice
Severe obesity is a distinct category defined as BMI ≥99th percentile or ≥120% of the 95th percentile, and studies consistently demonstrate different risk factor profiles across the weight categories 1, 2. For example, youth with severe obesity show a higher prevalence of metabolic syndrome clustering than normal-weight participants, and Hispanic and non-Hispanic Black youth demonstrate higher prevalence rates across all obesity definitions 5, 2.
Prevalence data show distinct patterns across weight categories: approximately 4-6% of children have severe obesity, while overall obesity affects 17-21% of U.S. children aged 2-19 years 2. These epidemiological patterns underscore why maintaining separate analytical categories is clinically important.
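As a rough sketch of how the five-category outcome might be coded before modeling, the function below applies the severe obesity definition given above (BMI ≥99th percentile or ≥120% of the 95th percentile). The remaining cut points (underweight below the 5th percentile, overweight from the 85th to below the 95th, obesity at or above the 95th) are the commonly used CDC BMI-for-age thresholds and are an assumption here, since the text does not state them. The function and parameter names are hypothetical.

```python
def classify_weight_status(bmi, bmi_percentile, bmi_at_95th):
    """Assign one of five weight categories.

    bmi            -- the child's BMI in kg/m^2
    bmi_percentile -- the child's age- and sex-specific BMI percentile (0-100)
    bmi_at_95th    -- the BMI at the 95th percentile for that age and sex
    """
    # Severe obesity: either criterion from the text is sufficient.
    if bmi_percentile >= 99 or bmi >= 1.2 * bmi_at_95th:
        return "severe obesity"
    # Remaining thresholds are assumed (common CDC cut points).
    if bmi_percentile >= 95:
        return "obesity"
    if bmi_percentile >= 85:
        return "overweight"
    if bmi_percentile < 5:
        return "underweight"
    return "normal weight"


print(classify_weight_status(bmi=30.0, bmi_percentile=97, bmi_at_95th=24.0))
print(classify_weight_status(bmi=22.0, bmi_percentile=96, bmi_at_95th=21.0))
print(classify_weight_status(bmi=16.5, bmi_percentile=50, bmi_at_95th=21.0))
```

The resulting label is the polytomous outcome that a multinomial model would take as its dependent variable.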
Critical Methodological Pitfall to Avoid
Do not collapse multiple obesity categories into a binary outcome, and do not run a separate binary comparison for each category 1. When the outcome is measured in multiple diagnostic categories, it is polytomous by construction, and multinomial logistic regression is needed to model the data structure properly and maintain statistical validity 1.