Reliability Estimation of Individual Multi-target Regression Predictions

2018 Mar 11

Abstract. To estimate the quality of an induced predictive model, we generally use measures of averaged prediction accuracy, such as the relative mean squared error on test data. Such evaluation fails to provide local information about the reliability of individual predictions, which can be important in risk-sensitive fields (medicine, finance, industry, etc.). Related work has presented several ways of computing individual prediction reliability estimates for single-target regression models, but has not considered their use with multi-target regression models that predict a vector of independent target variables. In this paper we adapt the existing single-target reliability estimates to multi-target models. Our aim is to design reliability estimates for multi-target regression algorithms that can estimate prediction errors without knowing the true prediction errors. We approach this in two ways: by aggregating reliability estimates for individual target components, and by generalizing the existing reliability estimates to a higher number of dimensions. The results revealed favorable performance of the reliability estimates that are based on the bagging variance and local cross-validation approaches. The results are consistent with the related work on single-target reliability estimates and provide support for multi-target decision making.
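
As an illustration of the first approach (aggregating per-target estimates), the following minimal sketch computes a bagging-variance estimate for each target component of one example and then combines the components with an arithmetic mean. It is a sketch under stated assumptions, not the exact procedure evaluated in the paper: the function names `bagv_per_target` and `aggregate_mean` are hypothetical, the variance is taken as the population variance of the bagged predictions, and the arithmetic mean is only one possible aggregation.

```python
def bagv_per_target(bagged_predictions):
    """Bagging variance per target for a single example.

    bagged_predictions: list of m prediction vectors, one per bagged
    model, each of length t (the number of targets).
    Assumption: population variance (division by m).
    """
    m = len(bagged_predictions)
    estimates = []
    # zip(*...) iterates over targets, yielding the m predictions for each.
    for column in zip(*bagged_predictions):
        mean_j = sum(column) / m
        estimates.append(sum((p - mean_j) ** 2 for p in column) / m)
    return estimates


def aggregate_mean(per_target_estimates):
    """Combine per-target estimates into one value (arithmetic mean)."""
    return sum(per_target_estimates) / len(per_target_estimates)


# Three hypothetical bagged models predicting two targets for one example.
preds = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
per_target = bagv_per_target(preds)
overall = aggregate_mean(per_target)
```

In this toy input the bagged models disagree only on the first target, so the second component contributes zero variance and the aggregated value is driven entirely by the first component. A larger aggregated value suggests a less reliable prediction.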

Conclusion

In the paper we proposed several approaches for estimating the reliabilities of individual multi-target regression predictions. The aggregated variants (AM, l and +) produce a single-valued estimate, which is preferable for interpretation and comparison. The last variant (+) is a direct generalization of the single-target estimators from the related work.

Our evaluation showed that the best results were achieved using the BAGV and LCV reliability estimates, regardless of the estimate variant. This agrees with the related work on single-target predictions, where these two estimates also performed well. Although all of the proposed variants achieve comparable results, our proposed generalization of existing methods (+) is still the preferred variant due to its lower computational complexity (as estimates are only calculated once for all of the target attributes) and its solid theoretical background.
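
To make the idea of computing one estimate for all target attributes concrete, the following sketch shows a local cross-validation style estimate in which the error is measured once per neighbour, as the Euclidean distance between the predicted and the true target vectors. This is an illustration under assumptions rather than the definition used in the paper: the name `lcv_estimate` is hypothetical, the base model is a stand-in that predicts the mean target vector of its training examples, the neighbourhood size `k` is arbitrary, and the local errors are averaged without distance weighting.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean; serves as a stand-in regression model."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lcv_estimate(query, train_x, train_y, k=3):
    """Local leave-one-out error around `query` (requires k >= 2)."""
    # Indices of the k training examples nearest to the query.
    nearest = sorted(
        range(len(train_x)), key=lambda i: euclidean(train_x[i], query)
    )[:k]
    errors = []
    for held_out in nearest:
        # "Train" the stand-in model on the remaining neighbours only.
        rest = [train_y[i] for i in nearest if i != held_out]
        prediction = mean_vector(rest)
        # One vector-valued error for all targets at once.
        errors.append(euclidean(prediction, train_y[held_out]))
    # Assumption: unweighted mean of the local errors.
    return sum(errors) / len(errors)


# Toy data: one input attribute, two targets; the last example is far away.
train_x = [[0.0], [1.0], [2.0], [10.0]]
train_y = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [50.0, -50.0]]
estimate = lcv_estimate([1.0], train_x, train_y, k=3)
```

Because the error is a single distance between vectors, the loop over neighbours runs once regardless of the number of targets, whereas a per-target variant would repeat the estimation for every target component.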

In further work we intend to evaluate other reliability estimates in combination with several other regression models. We also plan to test the adaptation of the proposed methods to multi-target classification.

Reliability estimation of individual predictions offers many advantages, especially when making decisions in highly sensitive environments. Our work provides effective support for model-independent multi-target regression.