Why Are We Weighting? Understanding the Estimates From Propensity Score Weighting and Matching Methods

January 9, 2024 Editor

Circulation: Cardiovascular Quality and Outcomes, Ahead of Print.
BACKGROUND: Propensity score methods are used in observational studies to compensate for the lack of random allocation by balancing measured baseline characteristics between treated and untreated patients. We sought to explain the treatment effect estimates derived from different propensity score methods.

METHODS: We performed a retrospective analysis of long-term mortality after single internal mammary artery versus bilateral internal mammary artery (BIMA) conduit in 47 984 index isolated coronary artery bypass grafting procedures from 1992 to 2014 in the Northern New England Cardiovascular Disease Study Group registry using multivariable Cox regression, 1:1 propensity score matching, inverse probability weighting (IPW) among the treated, and IPW among the overall population treatment estimates.

RESULTS: The mean duration of follow-up was 13.2 (interquartile range, 7.4–17.7) years. In multivariable Cox regression, the adjusted hazard ratio for mortality was 0.83 (95% CI, 0.75–0.92) in patients receiving BIMA compared with a single internal mammary artery. The 1:1 propensity matched (hazard ratio, 0.79 [95% CI, 0.69–0.91]) and IPW among the treated (hazard ratio, 0.83 [95% CI, 0.75–0.92]) estimates showed a protective treatment effect of BIMA use on mortality. However, the IPW estimate of treatment effect for the overall population showed an increased risk of mortality after BIMA that was not statistically significant (hazard ratio, 1.08 [95% CI, 0.94–1.24]).

CONCLUSIONS: While the multivariable Cox regression, 1:1 propensity matching, and IPW treatment effect in the treated estimates demonstrate that BIMA was associated with a statistically significantly decreased risk of mortality, the IPW treatment effect in the average study population showed an increased risk of mortality associated with BIMA that was not statistically significant. This is attributed to the different populations (weighted to look like the overall study population versus treated group) represented by the 2 IPW approaches. Determining how the study population is balanced is a large driver of the treatment effect. Ultimately, the treatment effect estimate desired should drive the choice of the propensity score method.
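
The difference between the 2 IPW approaches comes down to how each patient is weighted. The sketch below is illustrative only and is not the authors' code: it uses the standard textbook weight formulas for the overall-population estimand (often called ATE) and the among-the-treated estimand (often called ATT). The function name `ipw_weights`, the labels "ATE" and "ATT", and the toy data are all assumptions introduced here.

```python
def ipw_weights(treated, propensity, estimand="ATE"):
    """Return one inverse probability weight per patient (illustrative sketch).

    treated:    0/1 indicators (assumed coding: 1 = BIMA, 0 = single IMA)
    propensity: estimated P(treated = 1 | covariates), strictly in (0, 1)
    estimand:   "ATE" weights both groups to resemble the overall population;
                "ATT" weights the untreated to resemble the treated group
    """
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if estimand == "ATE":
            # Treated: 1/e, untreated: 1/(1 - e)
            w = 1.0 / e if t == 1 else 1.0 / (1.0 - e)
        elif estimand == "ATT":
            # Treated keep weight 1, untreated get the odds e/(1 - e)
            w = 1.0 if t == 1 else e / (1.0 - e)
        else:
            raise ValueError("estimand must be 'ATE' or 'ATT'")
        weights.append(w)
    return weights


# Hypothetical toy data, not taken from the registry
treated = [1, 1, 0, 0]
propensity = [0.8, 0.2, 0.5, 0.1]

ate_w = ipw_weights(treated, propensity, "ATE")
att_w = ipw_weights(treated, propensity, "ATT")
```

Under the "ATT" weights, an untreated patient with a low propensity score contributes very little, while under the "ATE" weights a treated patient with a low propensity score is weighted up heavily. In a typical workflow these weights would then be passed to a weighted outcome model, so the two choices can yield different hazard ratios from the same data.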
