I am a PhD student in the linguistics department studying computational linguistics, specifically linguistic distance and computational phonology. You might want to visit [http://www.sandersn.com my personal web site] or [http://jones.ling.indiana.edu/~ncsander my holding place on jones for various things].


Within linguistic distance, I have done work with phonological distance and syntactic distance (TODO: links go here). My favourite methods are those involving similarity, although syntactic distance seems to fit well with purely statistical methods instead. Some of the background literature is Goebl (1984), [:BrettKessler:Kessler] (1995), [http://www.let.rug.nl/~heeringa/dialectology/papers/cph97.pdf Nerbonne and Heeringa (1997)], [http://www.let.rug.nl/~heeringa/dialectology/thesis/thesis.pdf Heeringa (2004)], [http://www.let.rug.nl/nerbonne/papers/aggr-syn-dist-06.pdf Nerbonne and Wiersma (2006)], [http://www.cs.ualberta.ca/~kondrak/papers/thesis.pdf Kondrak (2002)].

In computational phonology, I am interested in the implementation of theoretical Optimality Theory (OT) models. OT is a computationally simple and (probably) tractable model, but its weakness means that additions to make it more powerful are constantly popping up. It's not always obvious that these additions are also computable. Other problems can include a lack of formal specification (as with Uniform Exponence by Kenstowicz) or a lack of testing on previous difficult examples pointed out in the literature. (For related work on Minimalism, see JoshHerring's page.)

OT also has a particularly simple and attractive explanation for learning as constraint re-ranking. Several OT learning algorithms have been proposed, with various degrees of realism. I have implemented several for comparison, but so far I have added only Recursive Constraint Demotion to [http://code.google.com/p/otableau OTableau].
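For readers unfamiliar with Recursive Constraint Demotion, the idea can be sketched in a few lines of Python. This is a minimal illustration and not the OTableau code: the function name, the dictionary encoding of winner-loser pairs ("W" for a constraint that prefers the winner, "L" for one that prefers the loser), and the constraint names in the usage note are all assumptions made for the example.

```python
def recursive_constraint_demotion(constraints, pairs):
    """Return a list of strata (sets of constraint names), highest-ranked first.

    constraints: iterable of constraint names (hypothetical labels).
    pairs: list of dicts, one per winner-loser pair, mapping a constraint
           name to "W" (prefers the winner) or "L" (prefers the loser);
           constraints with no preference are simply omitted.
    Raises ValueError if no ranking is consistent with the pairs.
    """
    remaining_constraints = set(constraints)
    remaining_pairs = list(pairs)
    strata = []
    while remaining_constraints:
        # A constraint can be ranked now if it prefers no loser
        # in any pair that is still unexplained.
        stratum = {c for c in remaining_constraints
                   if all(pair.get(c) != "L" for pair in remaining_pairs)}
        if not stratum:
            raise ValueError("inconsistent data: no constraint can be ranked next")
        strata.append(stratum)
        remaining_constraints -= stratum
        # A pair is explained once some constraint in this stratum
        # prefers its winner; explained pairs are dropped.
        remaining_pairs = [pair for pair in remaining_pairs
                           if not any(pair.get(c) == "W" for c in stratum)]
    return strata
```

With the pairs `[{"Onset": "W", "Dep": "L"}, {"NoCoda": "W", "Max": "L"}]`, the sketch places Onset and NoCoda in the top stratum and Max and Dep in the one below it.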

Current Work

My funding is through the IU School of Medicine: I work for Steve Chin at IUPUI studying the linguistic development of cochlear implant users. I've been applying the linguistic distance methods mentioned above to the speech of cochlear implant users. So far, the distances computed by the algorithms correlate well with distances perceived by humans. Now I'm trying to extract the most important features used by the distance algorithms.

The two phonological distance algorithms I've used are LevenshteinDistance and MaximumLikelihoodDistance. Levenshtein distance correlates the best with human intelligibility results, but Maximum Likelihood distance looks more promising as the work on extracting linguistically relevant features from algorithmic results develops.
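As a reminder of what Levenshtein distance computes, here is a minimal Python sketch of the textbook dynamic-programming version. It is illustrative only and not the code used in the study: the function name is hypothetical, and the unit cost for every insertion, deletion, and substitution is an assumption (phonological applications often weight substitutions by segment similarity instead).

```python
def levenshtein(source, target):
    """Return the minimum number of insertions, deletions, and
    substitutions needed to turn source into target.

    Both arguments may be strings or lists of segments.
    """
    # previous[j] holds the distance between the source prefix processed
    # so far and the first j items of target.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]  # deleting all i source items reaches the empty prefix
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]
```

Because the function only compares items for equality, it works on lists of phonetic segments as well as on plain strings, for example `levenshtein(["k", "æ", "t"], ["k", "ɑ", "t"])`.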

The syntactic distance algorithm I've used is the one described in Nerbonne & Wiersma (2006), a permutation test on some syntactic features extracted from sentences. The features used by Nerbonne & Wiersma are trigrams, which may not perfectly capture syntax, but still seem to work better than the leaf-ancestor paths (developed by Sampson (2000) for corpus evaluation) that I tried.
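The general shape of a permutation test over trigram counts can be sketched as follows. This is a simplified stand-in, not the procedure from Nerbonne & Wiersma (2006): the distance statistic (summed absolute difference in relative trigram frequency), the function names, and the default number of rounds are all assumptions chosen for brevity.

```python
import random
from collections import Counter

def trigram_counts(sentences):
    """Count trigrams over a list of sentences, each a list of tags or tokens."""
    counts = Counter()
    for sentence in sentences:
        for i in range(len(sentence) - 2):
            counts[tuple(sentence[i:i + 3])] += 1
    return counts

def distance(sentences_a, sentences_b):
    """Summed absolute difference in relative trigram frequency (assumed statistic)."""
    counts_a = trigram_counts(sentences_a)
    counts_b = trigram_counts(sentences_b)
    total_a = sum(counts_a.values()) or 1  # avoid dividing by zero
    total_b = sum(counts_b.values()) or 1
    return sum(abs(counts_a[t] / total_a - counts_b[t] / total_b)
               for t in set(counts_a) | set(counts_b))

def permutation_test(sentences_a, sentences_b, rounds=1000, seed=0):
    """Return (observed distance, estimated p-value).

    The p-value is the share of random re-splits of the pooled sentences
    whose distance is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = distance(sentences_a, sentences_b)
    pooled = list(sentences_a) + list(sentences_b)
    at_least_as_large = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(sentences_a)]
        shuffled_b = pooled[len(sentences_a):]
        if distance(shuffled_a, shuffled_b) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / rounds
```

A small p-value suggests that the two samples differ more than random re-splits of the same sentences would; shuffling whole sentences rather than individual trigrams keeps each sentence's trigrams together.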

I am working on qualifying papers right now. The first tests the statistical measure of dialect distance on the ICE-GB, a British corpus of transcribed and parsed conversations. The second will have to do with OT somehow, but I'm not sure what I'll do. The most likely possibility is finding a fast algorithm for generating T-Orders (Anttila 2006) now that I know that Heiberg (1999) already implemented Gen in her dissertation. I might try to find a working output-output analysis of Khalkha Mongolian's affix epenthesis patterns.


I work on [http://code.google.com/p/otableau OTableau] since I frequently have to publish OT tableaux. OTableau has become a testing ground for computational OT ideas that previously existed only as research prototypes or on Windows via OTSoft. If you are interested, please read the code and contribute. I hate GUI programming, and the next step is a port to wxPython. Maybe you can help.

I am a programming languages geek. My favourites are Scheme and Python, but I use a variety in practice. Most recently I've played around with Scala since I am extending [http://www.ic.arizona.edu/ic/heiberg/ the code from Andrea Heiberg's thesis], which is written in Java. (Don't [http://sandersn.com/blog/index.php?title=java_is_not_useful_for_research&more=1&c=1&tb=1&pb=1 make the mistake] of writing research code in Java.) One interesting thing about Scala is that it has special syntax for monads despite not needing them since it has plenty of ways to manage mutable state.

I also tried to learn a little R last month on a bet. I learned enough to write a recursive merge sort, along with car and cdr.


NathanSanders (last edited 2008-08-20 15:27:26 by NathanSanders)