Present-day computer systems work with human language in many different forms, whether as stored data in the form of text, typed queries to a database or search engine, or speech commands in a voice-driven computer system. We also increasingly expect computers to produce human language, such as user-friendly error messages and synthesized speech. Through selected readings, exercises, demonstrations and Python programming, this course will: a) survey a range of issues relating natural language to computers, covering real-world applications, b) provide practical experience with the representation and use of natural language on computers, and c) illustrate key principles of natural language processing through programming. Emphasis will be placed on basic natural language processing strategies and technologies that draw on linguistic theory.
Topics include text encoding, search technology, tools for writing support, machine translation, dialogue systems, computer-aided language learning, and the social context of language technology.
Office: Memorial Hall (MM) 317
E-mail: md7 ...AT... indiana ...DOT... edu
Office hours: (at least for the first week)
|or by appointment|
Meeting time: MW 9:30-10:45am
Classroom: Lindley Hall (LH) 030
Course website: http://jones.ling.indiana.edu/~mdickinson/08/445/
Assignments, slides, etc. will be posted here.
Course prerequisites: None. That means that no prior programming experience is expected.
There will be various reading selections throughout the course, but most of the material will be introduced solely in the classroom. There will be approximately one exercise sheet, or homework, every two weeks. These assignments give you the opportunity to explore new aspects of the topics discussed in class and to check that you understand the material covered in class. They will occasionally also give you the opportunity to practice your programming skills. Additionally, there will be in-class exercises, which count toward your participation grade.
There is no textbook for this course, but there will be readings assigned periodically throughout the course.
For each unit, slides will be available from the webpage before class. These slides are meant to aid classroom discussion and cannot replace actually being in class.
Grades will be based on classroom discussion/participation, homeworks, a midterm exam, and a final examination. For 515, there will be an additional final project.
For 445:
|MIDTERM||25%||Wednesday, October 20 @ 9:30am|
|FINAL||25%||Monday, December 15 @ 8:00am|
For 515:
|MIDTERM||20%||Wednesday, October 20 @ 9:30am|
|FINAL||20%||Monday, December 15 @ 8:00am|
|FINAL PROJECT||20%||Wednesday, December 17, by 5:00pm|
If you plan on missing either the midterm or final, you will have to provide extensive documentation for your excuse. See me immediately if this is the case.
For those enrolled at the 515 level, there is a final project requirement, the topics of which will be discussed individually with the instructor (beginning in October). The projects will generally be papers extending discussion of specific topics touched on in class, although they may also be implementations of a specific natural language processing algorithm (documented and evaluated) or evaluation of existing algorithms and software systems. The projects will be due on Wednesday, December 17 at 5:00pm.
To assist you in learning how to think logically & algorithmically, you are going to be taught some fundamentals of programming, using the Python programming language. We will include this in various class sessions (not always listed on the syllabus). I expect that most of you have absolutely no experience in programming and might be a little (or a lot) scared of it, and so I want to be clear about a few points:
Academic misconduct is not allowed in this course. The Indiana University Code of Student Rights, Responsibilities, and Conduct (http://dsa.indiana.edu/Code/) defines academic misconduct as "any activity that tends to undermine the academic integrity of the institution . . . Academic misconduct may involve human, hard-copy, or electronic resources . . . Academic misconduct includes, but is not limited to . . . cheating, fabrication, plagiarism, interference, violation of course rules, and facilitating academic misconduct" (II. G.1-6).
Students who need an accommodation based on the impact of a disability should contact me to arrange an appointment as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations.
I rely on Disability Services for Students for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted Disability Services are encouraged to do so (812-855-7578; http://www.indiana.edu/~iubdss/).
Links to notes and homeworks will be posted on the course website.
|Sep.||3||Intro to class|
|Sep.||8||Text & speech encoding: text (.pdf, -2x3.pdf)|
|Sep.||10||Text & speech encoding: speech|
|Sep.||15||Programming basics (.pdf, -2x3.pdf)|
|Sep.||17||Searching (.pdf, -2x3.pdf)||HW1 due|
|Sep.||22||Searching: internals (handouts: 1, 2)|
|Sep.||24||Searching: regular expressions (handouts: 1, 2)|
|Sep.||29||Corpus annotation (.pdf, -2x3.pdf)||HW2 due|
|Oct.||1||Python2 (.pdf, -2x3.pdf), Text classification (TC) (.pdf, -2x3.pdf)|
|Oct.||6||TC: Spam filtering (.pdf, -2x3.pdf)|
|Oct.||8||TC: Spam filtering|
|Oct.||13||Spelling & grammar correctors (.pdf, -2x3.pdf)||HW3 due (code)|
|Oct.||22||Spelling correctors for the web (handout)|
|Oct.||27||Grammar correctors: n-grams (handout)||HW4 due|
|Oct.||29||Grammar correctors: syntax/parsing|
|Nov.||3||Python3 (.pdf, -2x3.pdf)|
|Nov.||5||Machine Translation (MT) (.pdf, -2x3.pdf) (handout)|
|Nov.||10||Symbolic MT (handouts: 1, 2)||HW5 due (code)|
|Nov.||17||Dialogue systems: dialogue (.pdf, -2x3.pdf) (data, handout)|
|Nov.||19||Dialogue systems: chatterbots & modern systems (handout)|
|Nov.||24||N-grams in Python (.pdf, -2x3.pdf; Code: 1, 2, 3, 4, 5, Final, useful.py)||HW6 due (code, .txt)|
|Nov.||26||NO CLASS, THANKSGIVING BREAK|
|Dec.||1||Computer-aided language learning (CALL) (.pdf, -2x3.pdf)|
|Dec.||3||CALL: authentic text CALL|
|Dec.||8||CALL: parser-based CALL||HW7 due|
|Dec.||10||Social context of language technology use (.pdf, -2x3.pdf)|
|Dec.||15||FINAL EXAM, 8:00-10:00am (review)||FINAL|
|Dec.||17||FINAL PROJECT (L515) (overview)||due by 5pm|