The table also presents the coefficient of variation (CV) for each group. The CV is a standardized measure of dispersion calculated by dividing the standard deviation of expenditures for the group by the group’s mean expenditure. The CV for the entire sample was 1.49. Group CVs should be significantly smaller than the CV for the sample as a whole. This proved true for all groups created in the P/ECM. In fact, the CVs for only three groups (E1, E3, and S1) exceeded one, and those only marginally.
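As a minimal sketch of the CV calculation described above, the following Python snippet computes the CV for each group and for the pooled sample. The group labels and expenditure values are made up for illustration and are not drawn from the study data. The use of the sample (n - 1) standard deviation is also an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sample (n - 1) standard deviation.
    return stdev(values) / m


# Hypothetical expenditures keyed by hypothetical group labels.
expenditures = {
    "A": [100.0, 120.0, 80.0, 100.0],
    "B": [1000.0, 3000.0, 500.0, 1500.0],
}

# CV within each group.
group_cv = {g: coefficient_of_variation(v) for g, v in expenditures.items()}

# CV for the pooled sample, ignoring group membership.
pooled = [x for values in expenditures.values() for x in values]
sample_cv = coefficient_of_variation(pooled)

for g, cv in group_cv.items():
    print(f"group {g}: CV = {cv:.2f}")
print(f"whole sample: CV = {sample_cv:.2f}")
```

In this toy example each group CV is smaller than the pooled CV, which is the pattern a useful case-mix grouping is expected to produce.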
Evaluating the usefulness of the P/ECM first requires estimating how much of the variance in expenditures its 24 categories explain. Table 3 presents the results from OLS models using the P/ECM’s 24 categories as independent variables. The 24 categories explained 41% of the variance in annual expenditures.
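The explained variance of an OLS model whose only predictors are mutually exclusive category indicators can be sketched without a regression library, because such a model predicts each observation by its category mean. The snippet below is an illustration under that assumption, not the authors' code. The function name, category labels, and values are hypothetical.

```python
from statistics import mean


def r_squared_from_groups(groups, y):
    """R^2 of an OLS fit on mutually exclusive group indicators.

    With an intercept and one indicator per group (one omitted),
    the fitted value for every observation is its group mean.
    """
    grand_mean = mean(y)
    by_group = {}
    for g, value in zip(groups, y):
        by_group.setdefault(g, []).append(value)
    group_mean = {g: mean(values) for g, values in by_group.items()}

    # Residual sum of squares around the group means.
    sse = sum((value - group_mean[g]) ** 2 for g, value in zip(groups, y))
    # Total sum of squares around the grand mean.
    sst = sum((value - grand_mean) ** 2 for value in y)
    return 1.0 - sse / sst


# Hypothetical category labels and annual expenditures.
groups = ["A", "A", "B", "B"]
y = [1.0, 3.0, 5.0, 7.0]
r2 = r_squared_from_groups(groups, y)
print(f"R^2 = {r2:.2f}")
```

The same helper could be applied to a log-transformed outcome by passing `[math.log(v) for v in y]` in place of `y`, provided all values are positive.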
For purposes of comparison, the RUG-III/HC model explained 34% of the variance in total (formal and informal) home care costs with the Michigan sample.11 When the RUG-III/HC model was applied to home care data from Ontario, Canada, the model explained 37% of the variance in total per diem costs but only 21% of the variance in formal per diem care costs.30
To demonstrate that the model’s ability to explain variance did not depend heavily on outliers, the dependent variable was log-transformed. An OLS model with the 24 groups was then estimated using this transformed dependent variable. The explanatory power of the P/ECM fell relatively little in this analysis (R² = 0.38).
Table 3 also displays how well the P/ECM explained variance in sub-populations within the sample. Sample members were categorized as facing challenges deriving solely from medical conditions, facing challenges deriving solely from psychological or developmental conditions, facing both types of challenges, or having a diagnosis of an intellectual or developmental disability. OLS models using the 24 P/ECM categories were then estimated for each group. As those results indicate, the P/ECM worked well for all four sub-populations.
With all case-mix models, observers often raise reasonable concerns about seemingly important characteristics, diagnoses, or conditions not included in the model. For example, muscular dystrophy (MD) is a devastating condition that is not given special attention in the P/ECM; in addition, autism spectrum disorders are not emphasized in the model. The reason for such omissions is often relatively simple: the variance in expenditures generated by such conditions is captured by other elements already included in the model.
This assumption is relatively easily tested. For illustrative purposes, two OLS models were estimated using the 24 P/ECM groups. A binary variable representing MD was added to one model; a binary variable representing autism spectrum disorders was added to the other. Neither coefficient was statistically significant (p > 0.05).
The same process indicated that the characteristics included in the model also captured any differences that might be attributable to age or gender. When used in an OLS model in conjunction with the P/ECM groups, neither gender nor age was a statistically significant predictor (p > 0.05) of annual Medicaid home care expenditures.
To determine whether organizational factors might be affecting the results, an additional model was estimated. The assessments were completed in one of 16 regional offices. A series of binary variables representing these offices was added to the model containing the 24 P/ECM groups. The adjusted R² increased by only 0.0045 when these variables were added to the model.
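The comparison above rests on the adjusted R², which penalizes a model for each added predictor. The sketch below shows the standard adjustment and how a gain would be computed when office indicators are added. The sample size and both R² values are hypothetical placeholders, not figures from the study. The predictor counts assume one omitted reference category for the 24 groups and for the 16 offices.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    denom = n_obs - n_predictors - 1
    if denom <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / denom


# Hypothetical inputs for illustration only.
n_obs = 1000
r2_base = 0.40       # model with group indicators only
r2_extended = 0.41   # model with group and office indicators

p_base = 23              # 24 groups minus one reference group
p_extended = 23 + 15     # plus 16 offices minus one reference office

adj_base = adjusted_r_squared(r2_base, n_obs, p_base)
adj_extended = adjusted_r_squared(r2_extended, n_obs, p_extended)
gain = adj_extended - adj_base
print(f"gain in adjusted R^2: {gain:.4f}")
```

A gain close to zero, as in this toy case, suggests the added indicators contribute little beyond what the existing groups already explain.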
To examine the external validity or robustness of the P/ECM, a series of 10 random 50% sub-samples was drawn from the larger sample. The P/ECM was then retested on each sub-sample. The explained variance (R²) for those 10 sub-samples ranged from 0.37 to 0.44 and averaged 0.41. While arguably not as convincing evidence of robustness as might be achieved using split samples, the results bode well for the external validity of the P/ECM.
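The repeated sub-sampling check can be sketched as follows. This is an illustration on synthetic data, not a reconstruction of the authors' procedure. The data-generating values, function names, and seeds are all assumptions, and R² is computed from group means, which matches an OLS fit on mutually exclusive group indicators.

```python
import random
from statistics import mean


def group_r_squared(rows):
    """R^2 when each (group, y) row is predicted by its group mean."""
    ys = [y for _, y in rows]
    grand_mean = mean(ys)
    by_group = {}
    for g, y in rows:
        by_group.setdefault(g, []).append(y)
    group_mean = {g: mean(v) for g, v in by_group.items()}
    sse = sum((y - group_mean[g]) ** 2 for g, y in rows)
    sst = sum((y - grand_mean) ** 2 for y in ys)
    return 1.0 - sse / sst


def half_sample_r_squared(rows, n_draws=10, seed=0):
    """Re-estimate R^2 on n_draws random 50% sub-samples."""
    rng = random.Random(seed)
    half = len(rows) // 2
    # Each draw samples without replacement from the full data set.
    return [group_r_squared(rng.sample(rows, half)) for _ in range(n_draws)]


# Synthetic data: three hypothetical groups with different mean costs.
data_rng = random.Random(42)
rows = []
for group, centre in enumerate([100.0, 200.0, 400.0]):
    for _ in range(200):
        rows.append((group, data_rng.gauss(centre, 80.0)))

results = half_sample_r_squared(rows, n_draws=10, seed=1)
print(f"min = {min(results):.2f}, max = {max(results):.2f}, "
      f"mean = {mean(results):.2f}")
```

A narrow range across draws indicates that the explained variance does not hinge on a particular subset of cases.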
For additional insight into the model’s predictive capabilities and possible utility as a screener, logistic regression models were estimated. The first model focused on predicting which sample members had expenditures in the top 10%. The independent variables in the model were four of the five basic case-mix categories (extensive services, special care, complex care, and cognitive issues) and the ADL scale. Estimating that model resulted in a c-statistic of 0.91. Completing the same exercise with those in the top quartile of expenditures resulted in a c-statistic of 0.86. The model predicting which children or youth would have expenditures in the lowest quartile produced a c-statistic of 0.75. This same exercise with the lowest decile in expenditures resulted in a c-statistic of 0.70, which is at the low end of the range of acceptable predictive power for such models.31
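The c-statistic reported above is the probability that a randomly chosen case with the outcome receives a higher predicted score than a randomly chosen case without it, with ties counted as one half. The sketch below computes it directly from that definition. The scores and labels are invented for illustration. In practice the scores would be predicted probabilities from a fitted logistic regression.

```python
def c_statistic(scores, labels):
    """Concordance (area under the ROC curve) from pairwise comparisons."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # positive case ranked higher
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted scores; 1 marks a hypothetical high-cost case.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
c = c_statistic(scores, labels)
print(f"c-statistic = {c:.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.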