Chapter Post-stratification as a tool for enhancing the predictive power of classification methods

d'Ovidio, Francesco Domenico; D'Uggento, Angela Maria; Mancarella, Rossana; Toma, Ernesto

doi:10.36253/978-88-5518-461-8.24

dc.contributor.author	d'Ovidio, Francesco Domenico
dc.contributor.author	D'Uggento, Angela Maria
dc.contributor.author	Mancarella, Rossana
dc.contributor.author	Toma, Ernesto
dc.date.accessioned	2022-06-01T12:20:58Z
dc.date.available	2022-06-01T12:20:58Z
dc.date.issued	2021
dc.identifier	ONIX_20220601_9788855184618_556
dc.identifier.issn	2704-5846
dc.identifier.uri	https://library.oapen.org/handle/20.500.12657/56371
dc.description.abstract	It is well known that, in classification problems, the predictive capacity of any decision-making model decreases rapidly as the asymmetry of the target variable increases (Sonquist et al., 1973; Fielding 1977). In particular, in segmentation analysis with a categorical target variable, only very small improvements in purity are obtained when the least represented modality accounts for less than 1/4 of the cases of the most represented modality. The same problem arises with other (theoretically more exhaustive) techniques such as Artificial Neural Networks. Indeed, the optimal situation for classification analyses is maximum uncertainty, that is, equidistribution of the target variable. Some classification techniques are more robust because they use, for example, the less sensitive logit transformation of the target variable (Fabbris & Martini 2002); however, the logit transformation is also strongly affected by the distributional asymmetry of the target variable. In this paper, starting from the results of a direct survey in which the (binary) target variable was extremely asymmetrical (10% vs. 90%, or greater asymmetry), we noted that even the logit model with the most significant parameters had very poor goodness-of-fit measures and almost zero predictive power. To solve this predictive issue, we tested post-stratification techniques, artificially symmetrizing a training sample. In this way, a substantial increase in fit and predictive capacity was achieved, both in the symmetrized sample and, above all, in the original sample. The paper concludes by describing an application of the same technique to a dataset of a very different nature and size, demonstrating that the method is stable even when the analysis is carried out on all the data of a population.
dc.language	English
dc.relation.ispartofseries	Proceedings e report
dc.subject.other	Classification
dc.subject.other	Asymmetry
dc.subject.other	Post-stratification
dc.subject.other	Predictive power
dc.title	Chapter Post-stratification as a tool for enhancing the predictive power of classification methods
dc.type	chapter
oapen.identifier.doi	10.36253/978-88-5518-461-8.24
oapen.relation.isPublishedBy	bf65d21a-78e5-4ba2-983a-dbfa90962870
oapen.relation.isbn	9788855184618
oapen.series.number	132
oapen.pages	6
oapen.place.publication	Florence
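
The symmetrization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the authors reweighted or resampled the cases, so the weighting approach below is an assumption, and the function name `post_stratification_weights` and its parameters are hypothetical. It computes one weight per case as the ratio of a target class share to the observed class share, which is the standard post-stratification ratio, with equal shares as the default target.

```python
from collections import Counter


def post_stratification_weights(labels, target_shares=None):
    """Return one weight per case so that weighted class shares match target_shares.

    Hypothetical helper for illustration only; not the authors' implementation.
    """
    counts = Counter(labels)
    n = len(labels)
    if target_shares is None:
        # Assumption: equidistribution, i.e. every observed class gets the same share.
        target_shares = {c: 1.0 / len(counts) for c in counts}
    # Weight = target share / observed share for the class of each case.
    ratio = {c: target_shares[c] / (counts[c] / n) for c in counts}
    return [ratio[y] for y in labels]


# Toy binary target with the 10% vs. 90% asymmetry mentioned in the abstract.
labels = [1] * 10 + [0] * 90
weights = post_stratification_weights(labels)

# Weighted share of the minority class after symmetrization.
weighted_share_of_ones = (
    sum(w for w, y in zip(weights, labels) if y == 1) / sum(weights)
)
```

With this 10% vs. 90% split, each minority case receives weight 5 and each majority case roughly 0.56, so the two classes contribute equally to the weighted sample while the total weight stays equal to the number of cases. Such weights could then be supplied as case weights to a classifier that supports them.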

Files in this item

Name: 26242.pdf
Size: 412.0Kb
Format: PDF
ISBN: 9788855184618
License: https://creativecommons.org/li ...
Webshop link: https://books.fupress.com/doi/ ...
