LING 529

Human Language Technology 1

This class serves as an introduction to human language technology (HLT), an emerging interdisciplinary field that encompasses most subdisciplines of linguistics, as well as computational linguistics, natural language processing, computer science, artificial intelligence, psychology, philosophy, mathematics, and statistics. Content includes a combination of theoretical and applied topics such as (but not limited to) tokenization across languages, n-grams, word representations, basic probability theory, introductory programming, and version control.

In this course, we will cover fundamental concepts related to human language technology, such as tokens and their attributes, text normalization techniques, tokenization and regular expressions, character and word n-grams, basics of probability theory, representing words and documents as vectors, and vector-based comparisons. The course will also build technical skills, such as Linux command-line basics, virtualization and containerization technologies (virtual machines, Docker), version control (Git), and the feature branch workflow.

Most recent syllabus

All past syllabi for this course