Maintaining a National Acute Kidney Injury Risk Prediction Model to Support Local Quality Benchmarking



Circulation: Cardiovascular Quality and Outcomes, Ahead of Print.
BACKGROUND: The utility of quality dashboards to inform decision-making and improve clinical outcomes is tightly linked to the accuracy of the information they provide and, in turn, the accuracy of the underlying prediction models. Despite recognition of the need to update prediction models to maintain accuracy over time, there is limited guidance on updating strategies. We compare predefined and surveillance-based updating strategies applied to a model supporting quality evaluations among US veterans.

METHODS: We evaluated the performance of a US Department of Veterans Affairs–specific model for post–cardiac catheterization acute kidney injury using routinely collected observational data over the 6 years following model development (n=90 295 procedures in 2013–2019). Predicted probabilities were generated from the original model, from an annually retrained model, and from a surveillance-based approach that monitored performance to inform the timing and method of updates. We then evaluated how updating the national model affected regional quality profiles, comparing observed-to-expected outcome ratios: values above 1 indicate more adverse outcomes than expected, and values below 1 indicate fewer.

RESULTS: The original model overpredicted risk at the national level (observed-to-expected outcome ratio, 0.75 [0.74–0.77]). Annual retraining updated the model 5 times; surveillance-based updating retrained once and recalibrated twice. While both strategies improved performance, the surveillance-based approach achieved superior calibration (observed-to-expected outcome ratio, 1.01 [0.99–1.03] versus 0.94 [0.92–0.96]). Overprediction by the original model produced overly optimistic quality assessments, incorrectly indicating that most of the US Department of Veterans Affairs’ 18 regions observed fewer acute kidney injury events than predicted. Both updating strategies instead revealed that 16 regions performed as expected and 2 regions increasingly underperformed, with more acute kidney injury events than predicted.

CONCLUSIONS: Miscalibrated clinical prediction models give an inaccurate picture of performance across clinical units, and degrading calibration further complicates our understanding of quality. Updating strategies tailored to health system needs and capacity should be incorporated into model implementation plans to promote the utility and longevity of quality reporting tools.
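For readers tracing the benchmarking metric, the observed-to-expected (O/E) ratio above is the count of observed events divided by the sum of model-predicted probabilities. The Python sketch below illustrates that calculation; the log-scale, Poisson-style confidence interval is an assumed choice, as the abstract does not state how the reported intervals were constructed, and the example data are synthetic.

```python
import numpy as np

def observed_to_expected(outcomes, predicted_probs, z=1.96):
    """Observed-to-expected (O/E) ratio with a log-scale normal CI.

    O/E > 1: more adverse events than the model predicted;
    O/E < 1: fewer events than predicted (overprediction).
    The interval method here is an assumption, not the paper's.
    """
    outcomes = np.asarray(outcomes, dtype=float)
    predicted_probs = np.asarray(predicted_probs, dtype=float)
    observed = outcomes.sum()
    expected = predicted_probs.sum()
    oe = observed / expected
    se_log = 1.0 / np.sqrt(observed)  # SE of log(O/E), treating O as Poisson
    return oe, oe * np.exp(-z * se_log), oe * np.exp(z * se_log)

# Hypothetical illustration: a model that overpredicts risk by ~25%
# yields O/E near 0.75, mirroring the national-level result above.
rng = np.random.default_rng(0)
p = rng.uniform(0.02, 0.20, size=10_000)   # predicted AKI probabilities
y = rng.binomial(1, 0.75 * p)              # true risk is 25% lower
print(observed_to_expected(y, p))
```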
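The surveillance-based strategy distinguished between recalibrating a model and fully retraining it. The abstract does not describe its recalibration method, so the sketch below assumes standard logistic recalibration, in which only a calibration intercept and slope are re-estimated on the logit of the existing predictions, preserving the model's risk ordering at far lower cost than retraining.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def logit(p):
    return np.log(p / (1.0 - p))

def recalibrate(old_probs, outcomes):
    """Logistic recalibration: refit an intercept and slope on the
    logit of the existing predictions. Only two parameters change,
    so the underlying model and its risk ordering are untouched.
    Large C effectively disables sklearn's default L2 penalty,
    approximating an unpenalized fit. This method is an assumption;
    the paper's exact recalibration procedure is not given here.
    """
    x = logit(np.asarray(old_probs, dtype=float)).reshape(-1, 1)
    calib = LogisticRegression(C=1e9).fit(x, np.asarray(outcomes))

    def predict(probs):
        x_new = logit(np.asarray(probs, dtype=float)).reshape(-1, 1)
        return calib.predict_proba(x_new)[:, 1]

    return predict

# Hypothetical usage: once surveillance flags a drifting O/E ratio,
# refit the two calibration parameters on recent data and redeploy.
rng = np.random.default_rng(1)
p_old = rng.uniform(0.02, 0.20, size=5_000)
y = rng.binomial(1, 0.75 * p_old)        # observed risk below predictions
predict = recalibrate(p_old, y)
print(predict(np.array([0.05, 0.10, 0.20])))  # recalibrated probabilities
```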


