LING 531
HLT 2: Information Retrieval
This intermediate-level course is a continuation of LING 529 Human Language Technology 1 and covers the basics of information retrieval, focusing on both search and classification.
This course presents fundamental considerations and concepts for text-based search in the context of a simple keyword search function with Boolean connectives. We’ll then refine our methods for effective search, where the goal is to return the best results and satisfy a user’s actual information need, by exploring ways to represent similarity and to use term weighting. We’ll round out our view of search by considering how to evaluate how well a search application meets this goal. In the second half of the course, we’ll apply some of the concepts developed for search to document classification, comparing statistical approaches (Naive Bayes) with vector-space approaches (k-Nearest Neighbors, Rocchio).