Linguistics 445/515

The Computer and Natural Language

Fall 2007

This syllabus is tentative and subject to change.

Instructor: Markus Dickinson

Days: Mondays and Wednesdays

Time: 2:30-3:45pm

Location: Lindley Hall (LH) 030

Credits: 3

L445 counts as a Natural and Mathematical Sciences (N&M) credit!

Course goals

Present-day computer systems work with human language in many different forms, whether as stored data in the form of text, typed queries to a database or search engine, or speech commands in a voice-driven computer system. We also increasingly expect computers to produce human language, such as user-friendly error messages and synthesized speech. Through selected readings, exercises, demonstrations, and Python programming, this course will: a) survey a range of issues relating natural language to computers, covering real-world applications; b) provide practical experience with the representation and use of natural language on computers; and c) illustrate key principles of natural language processing through programming. Emphasis will be placed on basic natural language processing strategies and technologies that use linguistic theory.

Topics include text encoding, search technology, tools for writing support, machine translation, dialogue systems, computer-aided language learning, and the social context of language technology.

Course prerequisites: None. No prior programming experience is expected.

Course requirements:

There will be various reading selections throughout the semester, but most of the material will be introduced solely in the classroom. There will be approximately one homework (exercise sheet) every two weeks. These assignments give you the opportunity to explore new aspects of the topics discussed in class and help ensure that you understand the material covered in class. They will occasionally also give you the opportunity to practice your programming skills. Additionally, there will be in-class exercises, which count toward your participation grade.

For L515 students, there is an additional final project.


There is no textbook for this course; readings (mostly online) will be assigned periodically.

For each unit, slides will be available from the webpage before class. These slides are meant to aid classroom discussion and cannot replace actually being in class.


Grades will be based on:

Tentative Schedule:

Month Date Day Topic Assignments
Aug. 27 M Intro to class  
  29 W Text & speech encoding: text (.pdf, 2x3.pdf)  
Sep. 3 M Text & speech encoding: speech  
  5 W Basics of Python (.pdf, 2x3.pdf, code)  
  10 M Searching (.pdf, 2x3.pdf) HW1 due
  12 W Searching: internals (handout) (Python handout)  
  17 M Searching: regular expressions (Handouts: 1, 2)  
  19 W Corpus annotation (.pdf, 2x3.pdf) + Python (.pdf, 2x3.pdf, code) (handout) HW2 due (Code)
  24 M Corpus annotation  
  26 W More Python (.pdf, 2x3.pdf)  
Oct. 1 M Text classification (TC) (.pdf, 2x3.pdf) (handout)  
  3 W TC: Spam filtering (.pdf, 2x3.pdf) HW3 due
  8 M Spelling & grammar correctors  
  10 W MIDTERM (review) MIDTERM
  15 M Spelling correctors (.pdf, 2x3.pdf)  
  17 W Spelling correctors for the web (handout) HW4 due
  22 M Grammar correctors: parsing & n-grams  
  24 W N-grams in Python (.pdf, 2x3.pdf; Code: 1, 2, 3, 4, 5, Final,  
  29 M Machine Translation (MT) (.pdf, 2x3.pdf) (handout)  
  31 W Symbolic MT (handouts: 1, 2) HW5 due
Nov. 5 M Statistical MT  
  7 W Dialogue systems: dialogue (.pdf, 2x3.pdf) (TRAINS transcript, map)  
  12 M Dialogue systems: chatterbots (ELIZA)  
  14 W Dialogue systems: modern systems (handout) HW6 due
  19 M Python practice (.pdf, 2x3.pdf)  
  26 M Computer-aided language learning (CALL) (.pdf, 2x3.pdf)  
  28 W CALL: authentic-text CALL HW7 due
Dec. 3 M CALL: parser-based CALL  
  5 W Social context of language technology use (.pdf, 2x3.pdf)  
  10 M FINAL PROJECT (L515) (description) due by 5pm
  14 F FINAL EXAM (review) due by 5pm