
Conducting scientific research most often involves a search through existing literature in order to avoid repeating research efforts, review methods already developed for solving a problem, gain a better understanding of a problem, etc. Typically, this search is performed using the Internet, which is a convenient portal to various databases of books, journal articles, technical reports, preprints, etc.
The Query, Cluster, Summarize (QCS) information retrieval system is presented in an attempt to improve the efficiency of these literature searches. Given a query, QCS retrieves documents relevant to the query, separates the retrieved documents into topic clusters, and creates a single summary for each of the topic clusters. Latent Semantic Indexing is used for retrieval, generalized spherical k-means (gmeans) is used for document clustering, and a hidden Markov model coupled with a pivoted QR decomposition is used to create a single extract summary for each topic cluster.
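The clustering and summarization steps can be illustrated with a small, dependency-free Python sketch. This is not the QCS code: it uses plain spherical k-means rather than gmeans, it skips the Latent Semantic Indexing retrieval step and the hidden Markov model sentence scoring, and it assumes documents and sentences are already given as term-weight vectors. The function names (`spherical_kmeans`, `pivoted_qr_select`) and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def spherical_kmeans(docs, seeds, iterations=10):
    """Cluster document vectors by cosine similarity.

    `seeds` lists the indices of the documents used as initial centroids
    (an assumption made here to keep the sketch deterministic).
    """
    docs = [normalize(d) for d in docs]
    centroids = [docs[i] for i in seeds]
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Assign each document to the centroid with the largest cosine.
        labels = [
            max(range(len(centroids)), key=lambda c: dot(d, centroids[c]))
            for d in docs
        ]
        # Recompute each centroid as the normalized sum of its members.
        for c in range(len(centroids)):
            members = [d for d, lab in zip(docs, labels) if lab == c]
            if members:
                centroids[c] = normalize([sum(col) for col in zip(*members)])
    return labels


def pivoted_qr_select(sentences, k):
    """Greedily pick up to k sentence vectors, as in QR with column pivoting.

    Each step takes the vector with the largest remaining norm, then removes
    its direction from all vectors so that redundant sentences lose weight.
    """
    residual = [list(s) for s in sentences]
    chosen = []
    for _ in range(min(k, len(residual))):
        norms = [dot(r, r) for r in residual]
        pivot = max(range(len(residual)), key=lambda i: norms[i])
        if norms[pivot] <= 1e-12:
            break  # nothing informative is left
        chosen.append(pivot)
        q = normalize(residual[pivot])
        updated = []
        for r in residual:
            proj = dot(r, q)
            updated.append([x - proj * y for x, y in zip(r, q)])
        residual = updated
    return chosen


# Toy data: four documents over four terms, forming two obvious topics.
docs = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 1, 2], [0, 0, 2, 1]]
labels = spherical_kmeans(docs, seeds=[0, 2])

# Toy data: sentence 1 repeats the content of sentence 0, so it is skipped.
sentences = [[3, 0, 0], [2, 0, 0], [0, 1, 0]]
summary = pivoted_qr_select(sentences, k=2)
```

In this toy run the first two documents fall in one cluster and the last two in the other, and the summary selection keeps sentences 0 and 2 while passing over the redundant sentence 1. In a fuller pipeline the sentence vectors would first be weighted by a scoring model, so that the pivoting favors sentences that are both highly scored and non-redundant.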
The QCS system currently works with a set of MEDLINE abstracts and documents from news agencies and newswire services, and has been tested with documents used for summarization evaluation in the Document Understanding Conferences (2001-2003).
This project was developed for Advanced Scientific Computation (AMSC 663-4), the graduate qualifying sequence for the Scientific Computation Concentration of the Applied Mathematics and Scientific Computation (AMSC) Program at the University of Maryland, College Park.
Advisor:
Dianne O'Leary
Course Instructors:
Dave Levermore,
Bill Dorland
Comments and suggestions on anything related to QCS or this information page are certainly welcome. You can email me at ddunlavy@cs.umd.edu. Thanks in advance.