TY - GEN
T1 - Enhancing the Travel Experience for People with Visual Impairments through Multimodal Interaction
T2 - 2025 ACM International Conference on Supporting Group Work, GROUP Companion 2025
AU - Zhang, He
AU - Falletta, Nicholas J.
AU - Xie, Jingyi
AU - Yu, Rui
AU - Lee, Sooyeon
AU - Billah, Syed Masum
AU - Carroll, John M.
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/1/12
Y1 - 2025/1/12
N2 - Assistive technologies for people with visual impairments (PVI) have made significant advancements, particularly with the integration of artificial intelligence (AI) and real-time sensor technologies. However, current solutions often require PVI to switch between multiple apps and tools for tasks like image recognition, navigation, and obstacle detection, which can hinder a seamless and efficient user experience. In this paper, we present NaviGPT, a high-fidelity prototype that integrates LiDAR-based obstacle detection, vibration feedback, and large language model (LLM) responses to provide a comprehensive and real-time navigation aid for PVI. Unlike existing applications such as Be My AI and Seeing AI, NaviGPT combines image recognition and contextual navigation guidance into a single system, offering continuous feedback on the user’s surroundings without the need for app-switching. Meanwhile, NaviGPT compensates for the response delays of the LLM by using location and sensor data, aiming to provide practical and efficient navigation support for PVI in dynamic environments.
AB - Assistive technologies for people with visual impairments (PVI) have made significant advancements, particularly with the integration of artificial intelligence (AI) and real-time sensor technologies. However, current solutions often require PVI to switch between multiple apps and tools for tasks like image recognition, navigation, and obstacle detection, which can hinder a seamless and efficient user experience. In this paper, we present NaviGPT, a high-fidelity prototype that integrates LiDAR-based obstacle detection, vibration feedback, and large language model (LLM) responses to provide a comprehensive and real-time navigation aid for PVI. Unlike existing applications such as Be My AI and Seeing AI, NaviGPT combines image recognition and contextual navigation guidance into a single system, offering continuous feedback on the user’s surroundings without the need for app-switching. Meanwhile, NaviGPT compensates for the response delays of the LLM by using location and sensor data, aiming to provide practical and efficient navigation support for PVI in dynamic environments.
KW - accessibility
KW - AI-assisted tool
KW - disability
KW - LLM
KW - mobile application
KW - multimodal interaction
KW - navigation
KW - People with visual impairments
KW - prototype
UR - http://www.scopus.com/inward/record.url?scp=85216255475&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85216255475&partnerID=8YFLogxK
U2 - 10.1145/3688828.3699636
DO - 10.1145/3688828.3699636
M3 - Conference contribution
AN - SCOPUS:85216255475
T3 - GROUP Companion 2025 - 2025 ACM International Conference on Supporting Group Work
SP - 29
EP - 35
BT - GROUP Companion 2025 - 2025 ACM International Conference on Supporting Group Work
PB - Association for Computing Machinery, Inc
Y2 - 12 January 2025 through 15 January 2025
ER -