Gauging Library Needs for Advanced AI-Assisted Cataloging
Project Introduction
The current era of unprecedented information proliferation and increasing multilingual
diversity challenges libraries’ traditional cataloging and resource management processes.
Cutting-edge artificial intelligence (AI) tools known as large language models (LLMs),
which excel at processing natural language, have the potential to assist librarians
in their quest to organize and provide access to their ever-growing collections. By
combining the capability of AI with the expertise of catalogers, we aim to create
a synergy that will empower catalogers to be as efficient and accurate as possible
as they enhance the accessibility and inclusivity of library resources.
This 2-year Applied Research grant will investigate how locally run LLMs can assist
the subject cataloging of digital and print resources. We plan to address two main
questions.
RQ1: How can LLM-based models be developed to generate accurate cataloging results, particularly
classification and subject analysis, for both English and foreign-language resources?
RQ2: How can AI models be integrated into cataloging procedures to assist librarians?
This project aims to build knowledge for the future development and deployment of
LLM-based applications for cataloging.
Project Outcomes
Publications
- Liu, J., Song, X., Zhang, D., Thomale, J., He, D., & Hong, L. (2025). A Hybrid Framework for Subject Analysis: Integrating Embedding-Based Regression Models
with Large Language Models. Proceedings of the Association for Information Science and Technology. (Accepted)
- Luo, P., Hong, L., & Nie, L. (2025). Automatic classification of research data sets into the Chinese Library Classification
with generative large language model. The Electronic Library.
Presentations
Workshop Organization
Data and Code
- Project GitHub repository:
- Code for processing MARC records:
PIs
Project Director and Lead Principal Investigator
- Dr. Lingzi Hong, Assistant Professor in the Department of Data Science at UNT.
Co-Principal Investigator
- Jason Thomale, Resource Discovery Systems Librarian at the University of North Texas Libraries.
Advisory Board
- Kevin Yanowski: Department Head of Cataloging and Metadata Services at the University
of North Texas Libraries.
- Casey Mullin: Head of Cataloging and Metadata Services at Western Washington University Libraries.
- Charlene Chou: Head of Knowledge Access at the New York University Libraries.
- Sarah Hovde: Monographs & Media Cataloging Librarian (Librarian II) at the University of Maryland
Libraries.
- Dr. Jian Wu: Assistant Professor of Computer Science at Old Dominion University.
- Dr. C. Lee Giles: David Reese Professor, College of Information Sciences and Technology at the Pennsylvania
State University.
Relevant Resources
- Library of Congress' recent experiments with AI for cataloging tasks: https://labs.loc.gov/work/experiments/ECD/
- An interview study with catalogers on the applicability of AI to cataloging tasks:
https://static.sched.com/hosted_files/2024coreforum/96/KateSlauson-ALACore.pdf
- Blog post about AI for cataloging: https://ruthtillman.com/talk/mcls-waiting-for-production/
Acknowledgement
This work was supported by the Institute of Museum and Library Services (IMLS) under
Grant LG-256666-OLS-24. The opinions, findings, and conclusions expressed in this publication are those
of the author(s) and do not necessarily reflect the views of IMLS.

Last updated Dec 2, 2024