Linguistics 555
Programming for Computational Linguists
Autumn 2012

Course goals: This course is geared towards students concentrating in Computational Linguistics with little or no experience in programming; Linguistics students are welcome, too. It will introduce the fundamentals of programming and computer science, with the aim of building practical skills for text processing. While we will work with Python and the Natural Language Toolkit (NLTK), the main focus is on introducing basic concepts in programming, such as loops and functions. In contrast to similar courses in Computer Science, we will concentrate on problems in Computational Linguistics, which generally involve managing text, searching in text, and extracting information from text. For this reason, one part of the course will concentrate on regular expression searching.

Through lectures, lab sessions, and (bi-)weekly assignments, students will learn the essentials of Python and NLTK and how to apply these skills to natural language data.

By the end of the course, you should be able to:

Meeting time: MW 11:15am–12:30pm
      …but note: we may be changing this time to accommodate everyone

Classroom: Memorial Hall (MM) 401

Course website: http://jones.ling.indiana.edu/~mdickinson/10/555/ Assignments, slides, etc. will be posted here.

Credits: 3

Course prerequisites: None. That means that no prior programming experience is expected.

Instructor: Markus Dickinson

Office: Memorial Hall (MM) 317

Phone: 856-2535

E-mail: md7@indianagoat.edu (remove the animal name)

Office hours: (at least for the first week)

M 1:00pm–2:00pm
R 11:00am–12:00pm
or by appointment

Readings: We will work with the following (recommended) textbooks, focusing more on the first one:

Grading: Grades will be based on classroom discussion/participation, homeworks, a midterm exam, and a final examination.

Participation   10%
Homeworks       50%   (10 @ 5% each)
Midterm exam    20%   Due Wednesday, October 10 by class time
Final exam      20%   Due Wednesday, December 12 @ 5:00pm

Academic Misconduct: Academic misconduct is not allowed in this course. The Indiana University Code of Student Rights, Responsibilities, and Conduct (http://dsa.indiana.edu/Code/) defines academic misconduct as “any activity that tends to undermine the academic integrity of the institution . . . Academic misconduct may involve human, hard-copy, or electronic resources . . . Academic misconduct includes, but is not limited to . . . cheating, fabrication, plagiarism, interference, violation of course rules, and facilitating academic misconduct” (II. G.1-6).

Students with Disabilities: Students who need an accommodation based on the impact of a disability should contact me to arrange an appointment as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations.

I rely on Disability Services for Students for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted Disability Services are encouraged to do so (812-855-7578; http://www.indiana.edu/~iubdss/).

Schedule: This schedule is subject to change—and I can basically promise you that it will, depending upon which concepts need more or less clarification. Links to notes and homeworks will be posted on the course website.

Month  Date  Topic                                       Readings                       Assignments

Aug.   20    Intro to class/programming (.pdf, 2x3.pdf)
       22    Unix (.pdf, 2x3.pdf)

       27    Unix (handouts: 1, 2, 3, extra)             ch. 1, p. 1–9, Unix For Poets
       29    Intro to Python (.pdf, 2x3.pdf)             ch. 1, p. 9–30                 A1 due

Sep.   3     Labor Day, no classes
       5     Lists & Tuples                              ch. 2, p. 31–40

       10    Lists & Tuples                              ch. 2, p. 40–52                A2 due
       12    Conditionals & Loops                        ch. 5, p. 83–97

       17    Conditionals & Loops                        ch. 5, p. 97–112               A3 due
       19    Conditionals & Loops

       24    File input & output                         ch. 11, p. 261–276             A4 due
       26    Strings                                     ch. 3, p. 53–59

Oct.   1     Strings                                     ch. 3, p. 60–67
       3     NLTK overview                               NLTK, ch. 1                    A5 due

       8     Midterm review
       10    MIDTERM DUE

       15    NLTK                                        NLTK, ch. 1 (cont.)
       17    Dictionaries                                ch. 6, p. 69–74

       22    Dictionaries                                ch. 6, p. 74–81                A6 due
       24    Dictionaries

       29    Regular expressions                         ch. 10, p. 242–257             A7 due
       31    Regular expressions                         ch. 10, p. 242–257

Nov.   5     Functions                                   ch. 6, p. 113–130              A8 due
       7     Functions                                   ch. 6, p. 131–140

       12    Modules                                     ch. 10, p. 209–241             A9 due
       14    NLTK tagging                                NLTK, ch. 5

       19    Thanksgiving break, no classes
       21    Thanksgiving break, no classes

       26    NLTK tagging                                NLTK, ch. 5 (cont.)
       28    Testing code                                ch. 16, p. 349–358

Dec.   3     Testing code                                ch. 16, p. 358–364             A10 due
       5     Final review

       12    FINAL DUE (5pm)

If we have time, we’ll also look at classes and objects, as outlined in chapter 7.

Disclaimer: This syllabus is subject to change. All important changes will be made in writing, with ample time for adjustment. (Midterm and final dates, however, will not change.)