Instructor: Markus Dickinson
Days: Mondays and Wednesdays
Location: Lindley Hall (LH) 030
L445 counts as a Natural and Mathematical Sciences (N&M) credit!
Present-day computer systems work with human language in many different forms, whether as stored data in the form of text, typed queries to a database or search engine, or speech commands in a voice-driven computer system. We also increasingly expect computers to produce human language, such as user-friendly error messages and synthesized speech. Through selected readings, exercises, demonstrations, and Python programming, this course will: a) survey a range of issues relating natural language to computers, covering real-world applications, b) provide practical experience with the representation and use of natural language on computers, and c) illustrate key principles of natural language processing through programming. Emphasis will be placed on basic natural language processing strategies and technologies that draw on linguistic theory.
Topics include text encoding, search technology, tools for writing support, machine translation, dialogue systems, computer-aided language learning, and the social context of language technology.
Prerequisites: None. No prior programming experience is expected.
There will be various reading selections throughout the term, but most of the material will be introduced solely in the classroom. There will be approximately one homework assignment (exercise sheet) every two weeks. These assignments give you the opportunity to explore new aspects of the topics discussed in class and help ensure that you understand the material covered. They will occasionally also give you the opportunity to practice your programming skills. Additionally, there will be in-class exercises, which count toward your participation grade.
For L515 students, there is an additional final project.
There is no textbook for this course, but there will be readings (mostly online) assigned periodically throughout the course.
For each unit, slides will be available from the webpage before class. These slides are meant to aid classroom discussion and cannot replace actually being in class.
Grades will be based on:
|Aug.||27||M||Intro to class|
|29||W||Text & speech encoding: text (.pdf, 2x3.pdf)|
|Sep.||3||M||Text & speech encoding: speech|
|5||W||Basics of Python (.pdf, 2x3.pdf, code)|
|10||M||Searching (.pdf, 2x3.pdf)||HW1 due|
|12||W||Searching: internals (handout) (Python handout)|
|17||M||Searching: regular expressions (Handouts: 1, 2)|
|19||W||Corpus annotation (.pdf, 2x3.pdf) + Python (.pdf, 2x3.pdf, code) (handout)||HW2 due (Code)|
|26||W||More Python (.pdf, 2x3.pdf)|
|Oct.||1||M||Text classification (TC) (.pdf, 2x3.pdf) (handout)|
|3||W||TC: Spam filtering (.pdf, 2x3.pdf)||HW3 due|
|8||M||Spelling & grammar correctors|
|15||M||Spelling correctors (.pdf, 2x3.pdf)|
|17||W||Spelling correctors for the web (handout)||HW4 due|
|22||M||Grammar correctors: parsing & n-grams|
|24||W||N-grams in Python (.pdf, 2x3.pdf; Code: 1, 2, 3, 4, 5, Final, useful.py)|
|29||M||Machine Translation (MT) (.pdf, 2x3.pdf) (handout)|
|31||W||Symbolic MT (handouts: 1, 2)||HW5 due|
|Nov.||7||W||Dialogue systems: dialogue (.pdf, 2x3.pdf) (TRAINS transcript, map)|
|12||M||Dialogue systems: chatterbots (ELIZA)|
|14||W||Dialogue systems: modern systems (handout)||HW6 due|
|19||M||Python practice (.pdf, 2x3.pdf)|
|21||W||NO CLASS, THANKSGIVING BREAK|
|26||M||Computer-aided language learning (CALL) (.pdf, 2x3.pdf)|
|28||W||CALL: authentic-text CALL||HW7 due|
|Dec.||3||M||CALL: parser-based CALL|
|5||W||Social context of language technology use (.pdf, 2x3.pdf)|
|10||M||FINAL PROJECT (L515) (description)||due by 5pm|
|14||F||FINAL EXAM (review)||due by 5pm|