DICE: Data Imputation for Cost Estimates from Multiple Sources to Model User Decision-Making

Hailun Wu, Ziqian Dong, Roberto Rojas-Cessa

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Understanding the key factors that affect users' commute mode choice is essential for designing policies that promote sustainable transportation. However, studies of this kind rely on survey data, which is often incomplete. One regional transportation survey obtained for this study on commute mode decision-making is missing 97% of its parking cost data, an important factor in people's decisions. To tackle this problem, we propose the data imputation for cost estimates (DICE) scheme, which synthesizes data from multiple sources to infer the missing data. DICE linearly maps imputed values to missing entries based on the assumption that higher-income users can spend more on their commute. In the absence of ground-truth data, we propose using the accuracy of a regression model trained with the imputed data as a metric to evaluate DICE. We train the regression model with 75% of the imputed data, test it with the remaining 25%, and evaluate it with the complete cases. The prediction accuracies on the test data and the evaluation data are 0.89 and 0.77, respectively. These results indicate that the imputed data and the complete cases share similar distributions and that a model trained with the imputed data can perform classification. We tested DICE using two data sets in which cost is considered a key feature in decision-making: a 1995 transportation survey and a 2021 housing survey. In both cases, the regression model achieves a prediction accuracy higher than 0.7, indicating that DICE is applicable to different data sets.
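The abstract does not give the exact mapping, so the following is only a minimal sketch of one way a linear income-to-cost mapping and a 75/25 split could look. It assumes min-max scaling of income onto a cost range taken from an external source; the function names, the cost bounds, and the sample values are hypothetical and are not taken from the paper.

```python
import random


def impute_costs_linear(incomes, costs, cost_low, cost_high):
    """Fill missing costs (None) by scaling income linearly onto [cost_low, cost_high].

    Assumption (not from the paper): the lowest income maps to cost_low and
    the highest income maps to cost_high. Observed costs are kept unchanged.
    """
    inc_min, inc_max = min(incomes), max(incomes)
    span = inc_max - inc_min
    filled = []
    for income, cost in zip(incomes, costs):
        if cost is not None:
            filled.append(cost)  # complete case: keep the reported value
        elif span == 0:
            filled.append((cost_low + cost_high) / 2)  # all incomes equal
        else:
            frac = (income - inc_min) / span
            filled.append(cost_low + frac * (cost_high - cost_low))
    return filled


def split_75_25(rows, seed=0):
    """Shuffle rows reproducibly and return (train, test) with a 75/25 split."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.75 * len(rows))
    return rows[:cut], rows[cut:]


# Hypothetical toy data: one reported parking cost, three missing.
incomes = [20_000, 50_000, 80_000, 110_000]
costs = [None, 12.0, None, None]

imputed = impute_costs_linear(incomes, costs, cost_low=5.0, cost_high=20.0)
train, test = split_75_25(list(zip(incomes, imputed)))
```

In this sketch the complete cases would be held out separately and used only for the final evaluation, mirroring the train, test, and evaluate steps described in the abstract.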

Original language: English (US)
Title of host publication: Proceedings - 2023 IEEE 35th International Conference on Tools with Artificial Intelligence, ICTAI 2023
Publisher: IEEE Computer Society
Pages: 149-154
Number of pages: 6
ISBN (Electronic): 9798350342734
State: Published - 2023
Event: 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2023 - Atlanta, United States
Duration: Nov 6 2023 to Nov 8 2023

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
ISSN (Print): 1082-3409

Conference

Conference: 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2023
Country/Territory: United States
City: Atlanta
Period: 11/6/23 to 11/8/23

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
  • Computer Science Applications

Keywords

  • commute mode choice
  • data imputation
  • decision-making
  • logistic regression
  • multiple data sources
  • regression
