Evaluation of approaches to internal validation of multinomial Logit models: The case of personal travel mode choice

Bibliographic Details
Published in Communication in statistics. Case studies and data analysis Vol. 11; no. 3; pp. 316-342
Main Authors Parmar, Janak, Delle Site, Paolo
Format Journal Article
Language English
Published Taylor & Francis 03.07.2025
ISSN 2373-7484
DOI 10.1080/23737484.2025.2522358

Summary: The prediction validity of discrete choice models is key for policy making in the transportation sector. Several approaches exist for internal validation, i.e., validation in which the population used to estimate and validate the model is the same. Each approach is characterized by a sampling strategy and an accuracy metric. Sampling strategies include in-sample (also referred to as apparent), split-sample, cross-validation, and bootstrapping. Accuracy metrics include McFadden rho-squared, percentage of right classification, McFadden proportion of right predictions, Brier score, polytomous discrimination index, and hypervolume under the ROC manifold. It is widely recognized that in-sample strategies are overly optimistic because the model is optimized for performance in the sample in which it is estimated. The performance of approaches to internal validation has previously been evaluated in clinical epidemiology with logistic regression models. This paper evaluates approaches to internal validation using synthetic and real datasets of personal travel mode choices modeled with multinomial Logit. The performance of each approach is evaluated against the apparent performance in the full population. With both synthetic and real data, cross-validation produces the lowest bias for most metrics. The metric with the lowest bias is data-specific. Bootstrapping produces the lowest variability.
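As a rough illustration of three of the accuracy metrics named in the summary, and of how a cross-validation sampling strategy partitions the data, the following Python sketch may help. It is not the authors' implementation. The function names are hypothetical, the Brier score is taken in its unscaled multiclass form, and the McFadden rho-squared uses an equal-shares null model; the paper may define either differently. Estimation of the multinomial Logit model itself is left out, so the predicted probabilities are made-up inputs.

```python
import numpy as np

def brier_score(probs, chosen):
    """Mean squared distance between predicted probabilities and the
    one-hot observed choice (unscaled multiclass form, an assumption)."""
    probs = np.asarray(probs, dtype=float)
    onehot = np.eye(probs.shape[1])[np.asarray(chosen)]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def mcfadden_rho_squared(probs, chosen):
    """1 - LL(model) / LL(null), with an equal-shares null (an assumption)."""
    probs = np.asarray(probs, dtype=float)
    chosen = np.asarray(chosen)
    ll_model = np.sum(np.log(probs[np.arange(len(chosen)), chosen]))
    ll_null = len(chosen) * np.log(1.0 / probs.shape[1])
    return float(1.0 - ll_model / ll_null)

def percent_correct(probs, chosen):
    """Share of observations whose most probable alternative was chosen."""
    probs = np.asarray(probs, dtype=float)
    return float(np.mean(np.argmax(probs, axis=1) == np.asarray(chosen)))

def kfold_indices(n, k, seed=0):
    """Yield (train, test) index arrays for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

# Made-up predicted probabilities for three modes and four travellers.
probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5], [0.5, 0.4, 0.1]]
chosen = [0, 1, 2, 1]
print(brier_score(probs, chosen), percent_correct(probs, chosen),
      mcfadden_rho_squared(probs, chosen))
```

In an apparent (in-sample) evaluation the metrics would be computed on the same observations used for estimation. With `kfold_indices`, the model would instead be re-estimated on each `train` set and the metrics computed on the held-out `test` set, then averaged over folds.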