TY - JOUR
T1 - Proportionate Diversification of Top-k LLM Results using Database Queries
AU - On, Thinh
AU - Ghosh, Subhodeep
AU - Du, Mengnan
AU - Roy, Senjuti Basu
N1 - Publisher Copyright:
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2023
Y1 - 2023
N2 - Result diversification aims to return relevant results that cover a variety of perspectives. Attribute-based diversification groups results by shared attributes (e.g., genre for movies) and selects a proportional number of items from each group based on their distribution in the underlying data. However, large language models (LLMs) are not designed to produce proportionally diverse results. In this work, we propose leveraging external data sources to determine the distribution of groups related to a query and prompt LLMs to produce proportionally diverse results. This can improve result diversity by representing groups in proportion to their prevalence. Specifically, we first argue the benefits of making top-k results from LLMs proportionally diverse. We then show how to use external benchmark databases to enable proportional diversity. Finally, we outline a framework that prompts LLMs with proportionality information from external data and discuss challenges in automating this process. Our approach provides a path to overcoming LLMs' limitations in producing proportionally diverse responses.
AB - Result diversification aims to return relevant results that cover a variety of perspectives. Attribute-based diversification groups results by shared attributes (e.g., genre for movies) and selects a proportional number of items from each group based on their distribution in the underlying data. However, large language models (LLMs) are not designed to produce proportionally diverse results. In this work, we propose leveraging external data sources to determine the distribution of groups related to a query and prompt LLMs to produce proportionally diverse results. This can improve result diversity by representing groups in proportion to their prevalence. Specifically, we first argue the benefits of making top-k results from LLMs proportionally diverse. We then show how to use external benchmark databases to enable proportional diversity. Finally, we outline a framework that prompts LLMs with proportionality information from external data and discuss challenges in automating this process. Our approach provides a path to overcoming LLMs' limitations in producing proportionally diverse responses.
KW - large language models (LLMs)
KW - prompting LLMs
KW - querying database
KW - top-k diversification
UR - http://www.scopus.com/inward/record.url?scp=85171260694&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171260694&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85171260694
SN - 1613-0073
VL - 3462
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - Joint Workshops at the 49th International Conference on Very Large Data Bases, VLDBW 2023
Y2 - 28 August 2023 through 1 September 2023
ER -