Reliability Estimation of Individual Multi-target Regression Predictions

Reliability Estimation of Individual Multi-target Regression Predictions

Abstract: To estimate the quality of the induced predictive model we generally use measures of averaged prediction accuracy, such as the relative mean squared error on test data. Such evaluation fails to provide local information about reliability of individual predictions, which can be important in risk-sensitive fields (medicine, finance, industry etc.). Related work presented several ways for computing individual prediction reliability estimates for single-target regression models, but has not considered their use with multi-target regression models that predict a vector of independent target variables. In this paper we adapt the existing single-target reliability estimates to multi-target models. In this way we try to design reliability estimates, which can estimate the prediction errors without knowing true prediction errors, for multi-target regression algorithms, as well. We approach this in two ways: by aggregating reliability estimates for individual target components, and by generalizing the existing reliability estimates to higher number of dimensions. The results revealed favorable performance of the reliability estimates that are based on bagging variance and local cross-validation approaches. The results are consistent with the related work in single-target reliability estimates and provide a support for multi-target decision making.

Conclusion: In the paper we proposed several approaches for estimating the reliabilities of individual multi-target regression predictions. The aggregated variants (AM, l2 and +) produce a single-valued estimate which is preferable for interpretation and comparison. The last variant (+) is a direct generalization of the singletarget estimators from the related work. Our evaluation showed that best results were achieved using the BAGV and the LCV reliability estimates regardless the estimate variant. This complies with the related work on the single-target predictions, where these two estimates also performed well. Although all of the proposed variants achieve comparable results, our proposed generalization of existing methods (+) is still the preferred variant due to its lower computational complexity (as estimates are only calculated once for all of the target attributes) and the solid theoretical background. In our further work we intend to additionally evaluate other reliability estimates in combination with several other regression models. We also plan to test the adaptation of the proposed methods to multi-target classification. Reliability estimation of individual predictions offers many advantages especially when making decisions in highly sensitive environment. Our work provides an effective support for model-independent multi-target regression.