Ling 354: Week 6

[This is extracted from the Spring 2022 version on Canvas, so some links/formatting may be broken.]

This week, we’re digging deeper into spell checking and language models. We’ll focus on simple models first, like n-grams, which we talked about at the end of class. That will require us to spend a little time talking about probabilities (especially conditional probabilities) as well.

Textbook reading

First, please read the remainder of Chapter 2 of the textbook. Don’t panic if you’re having trouble with Section 2.4.1; it’s a too-brief overview of syntax, which is quite a complex part of linguistics. We’ll go into more depth on syntax next week, once we understand simpler language models like n-grams, but getting some familiarity with the concepts now will help when we come back to them.

Additional articles, etc.

I have some additional notes that should help with understanding the concepts in this chapter. The first is a basic overview of probability theory for linguistics (PDF).  This is optional reading, but if you’re not familiar with probabilities or hate math, I hope you’ll find it an accessible introduction to the topic, which will come up a few times this semester. I originally developed it for my Ling 502 class, but the concepts apply equally well in this class. We’ll talk about conditional probability and Bayes’ Rule as ways of working with n-gram models this week.
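For quick reference, here are the standard definitions of conditional probability and Bayes' Rule, followed by two common ways they show up in this chapter's material. These are the usual textbook formulations rather than anything taken from the notes themselves. The third formula assumes a bigram model, where $C(\cdot)$ is a count in some corpus, and the fourth is the usual way of framing spelling correction as finding the most probable intended word given what was typed.

```latex
% Conditional probability and Bayes' Rule
P(A \mid B) = \frac{P(A, B)}{P(B)}
\qquad
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Bigram estimate from corpus counts (assumed notation)
P(w_i \mid w_{i-1}) \approx \frac{C(w_{i-1}\, w_i)}{C(w_{i-1})}

% Bayes' Rule applied to spelling correction (assumed framing)
P(\text{intended} \mid \text{typed}) \propto P(\text{typed} \mid \text{intended})\, P(\text{intended})
```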

The second set of notes looks at how we collect and use linguistic data to try to build better language models (PDF). In particular, you might find it interesting to play around on Google Books N-grams to look at real-world usage data and see how increased context changes the probabilities of certain words.
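To see how context changes a word's probability, here is a minimal sketch that compares a word's plain relative frequency with its probability given the previous word. The corpus is a tiny made-up example and the function names are hypothetical; real n-gram data like Google Books is vastly larger and would also need smoothing for unseen word pairs.

```python
from collections import Counter

# Toy corpus invented for illustration only.
corpus = "the cat sat on the mat . the dog sat on the cat . the cat ran".split()

unigrams = Counter(corpus)                  # how often each word occurs
bigrams = Counter(zip(corpus, corpus[1:]))  # how often each adjacent pair occurs
contexts = Counter(corpus[:-1])             # how often each word has a following word

def unigram_prob(word):
    """P(word): relative frequency, ignoring context."""
    return unigrams[word] / len(corpus)

def bigram_prob(word, prev):
    """P(word | prev): of the times prev occurred, how often was word next?"""
    if contexts[prev] == 0:
        return 0.0  # unseen context; a real model would smooth here
    return bigrams[(prev, word)] / contexts[prev]

print(unigram_prob("cat"))        # probability of "cat" with no context
print(bigram_prob("cat", "the"))  # probability of "cat" right after "the"
```

In this toy corpus, "cat" is much more probable after "the" than it is overall, which is the same effect you can look for in real usage data.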

The last set of notes (PDF) digs into the topic of Under the Hood 3: dynamic programming. I don’t think the book does a great job of explaining how dynamic programming (in the form of topological orderings on directed acyclic graphs) works, so I worked through a few examples.
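As one concrete illustration, a classic use of dynamic programming in spell checking is computing the minimum edit distance between a typed word and a candidate correction. The sketch below assumes unit costs for insertion, deletion, and substitution, which may differ from the costs used in the book or the notes. Each cell of the table can be thought of as a node in a directed acyclic graph with arrows coming in from the cells above, to the left, and diagonally up-left; filling the table row by row is one valid topological ordering of that graph.

```python
def edit_distance(source, target):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn source into target (unit costs assumed)."""
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]

    # Base cases: turning a prefix into the empty string (or vice versa).
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j

    # Each cell depends only on cells already filled in.
    for i in range(1, rows):
        for j in range(1, cols):
            sub_cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,             # delete from source
                dist[i][j - 1] + 1,             # insert into source
                dist[i - 1][j - 1] + sub_cost,  # substitute or match
            )
    return dist[-1][-1]

print(edit_distance("kitten", "sitting"))
```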

Finally, let’s wrap it up with a look at how spell checkers succeed and fail in practice. First, here are two blog posts on the Cupertino effect, an unintended consequence of early automatic spelling correction systems (link, link). Second, here is a blog post from the team at Microsoft that worked on Office 2007’s spell checker, discussing how they chose to trade off between high precision (if it labels something an error, it’s probably right, but it also misses some errors) and high recall (it catches most errors, but also flags a lot of non-errors). I found their discussion of user preferences really interesting, and I’d like us to talk on Thursday about user design in these kinds of systems.
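To make the precision/recall tradeoff concrete, here is a small sketch that scores two hypothetical spell checkers against the same set of true errors. The word lists are invented for illustration, and returning 0.0 when nothing is flagged is just one convention for an otherwise undefined value.

```python
def precision_recall(flagged, true_errors):
    """Return (precision, recall) for a set of flagged words."""
    flagged, true_errors = set(flagged), set(true_errors)
    true_positives = len(flagged & true_errors)
    # Precision: of the words flagged, what fraction were real errors?
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: of the real errors, what fraction were flagged?
    recall = true_positives / len(true_errors) if true_errors else 0.0
    return precision, recall

# Hypothetical data: four real misspellings in a document.
true_errors = {"teh", "recieve", "definately", "seperate"}

# A cautious checker flags little, but is right when it does.
cautious = {"teh", "recieve"}
# An aggressive checker catches everything, but also flags two proper names.
aggressive = {"teh", "recieve", "definately", "seperate", "Cupertino", "Zsiga"}

print(precision_recall(cautious, true_errors))
print(precision_recall(aggressive, true_errors))
```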


Ling 354: Week 5


This week, we’re building on last week’s core ideas about speech recognition. We’ll start by discussing biases in speech recognition and how to overcome them, based on the Scientific American article as well as two others I’m posting here on machine learning biases. Next, we’ll talk about a simple but conceptually useful algorithm, which can help us design a simple speech recognition model: the nearest-neighbors algorithm. Those readings will help us shore up our phonetic model, and the last topic of this week’s class will be starting in on the language model. We’ll go back to the textbook and examine how spell checkers and autocorrect systems work, and what information they use to infer what you meant when the input data is noisy or erroneous.

Additional articles/videos

We’ll start by talking about the short article from Scientific American that examines which dialects of English are actually captured by current speech recognition technology. This is the same article as last week’s reading, so hopefully you’ve already read it.

I wanted to add a little more context for how biases emerge in machine learning/AI algorithms like speech recognition systems, and I think these two are pretty good. The first, from the MIT Technology Review (HTML), is a brief summary of some of the common sources of AI bias and why fixing them is nontrivial.

The second, a Medium post by a data scientist (HTML), digs deeper into how we can quantify biases and makes the argument that the very process of machine learning is a biased perception of data, so addressing bias in machine learning is even more complex than it initially seems. (I’ll confess I’m not entirely won over by this argument, which feels a little too hand-wavy, but I think the idea is worth ruminating on.)

You might be wondering how we actually implement speech recognition. We talked in class about the features that different sounds have, but how does an algorithm classify them? I’ve written up some notes on a simple classification algorithm, known as Nearest Neighbors. This algorithm trains on labelled examples of different sounds in a language or dialect (e.g., a bunch of examples of people producing a specified vowel) and classifies a new sound based on which labelled examples it’s most similar to. While modern speech recognition systems use more complex algorithms than this one, nearest neighbors is a nice tradeoff between effectiveness and ease of implementation. We’ll discuss the algorithm and how to apply it to speech recognition and other linguistic tasks in class.
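Here is a minimal sketch of the idea, classifying a vowel from its first two formants (F1 and F2). It uses a single nearest neighbor and plain Euclidean distance on raw Hz values, which are simplifying assumptions; the notes may use more neighbors or a different distance. The training values are rough, made-up numbers for illustration, not measurements.

```python
import math

# Hypothetical labelled examples: ((F1 in Hz, F2 in Hz), vowel label).
# "i" as in "beet", "a" as in "father", "u" as in "boot".
training = [
    ((280, 2250), "i"),
    ((310, 2200), "i"),
    ((700, 1100), "a"),
    ((750, 1150), "a"),
    ((310, 900), "u"),
    ((340, 950), "u"),
]

def nearest_neighbor(sample, labelled):
    """Return the label of the labelled example closest to sample."""
    _features, label = min(labelled, key=lambda ex: math.dist(sample, ex[0]))
    return label

# A new, unlabelled sound with low F1 and high F2.
print(nearest_neighbor((300, 2300), training))
```

One design question worth thinking about: F2 spans a much wider range of Hz than F1 here, so it dominates the distance unless the features are rescaled first.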

Textbook reading

Lastly, please read Sections 2.1-2.3 of the textbook, excluding “Under the Hood 3: Dynamic programming”. These sections cover the basics of spell checking/autocorrect, as well as our first exposure to trigram models, which will pop up a few more times through the semester. We’ll cover the rest of the chapter next week, including the dynamic programming section, so read ahead if you’re interested.


Ling 354: Week 4


This week, we’re turning to speech recognition. How do Alexa, Siri, Google, and other voice-activated assistants work? What causes difficulties for them, and how can we overcome them? For that matter, how does human speech recognition work? Why do they screw up your name at cafes? Why do so many people think my name is “Dave”? We’ll try to get to the bottom of these mysteries this week and next.

Textbook reading

Section 1.4. This covers the basics of speech recognition from a computer’s perspective, as well as a quick overview of the sound patterns of human language, which is covered in much more detail in the reading below.

Additional articles/videos

Read through Sections 2.1, 2.2, 2.3 and 2.6 of Language Files. This provides more detail, from a linguistic perspective, on how linguistic sounds are produced (2.1-2.3) and perceived (2.6). Any reasonable speech recognition system will need to incorporate this sort of information to accurately determine what sounds people are making.

Also, the discussion of syllable structure should help clarify the nature of syllabaries and abugidas from our discussion of writing systems.  (Sections 2.4 and 2.5 are less important for English speech recognition, so you can skip them. But in case you’re interested in the structure of language more generally, I left them in the file for you.)

Since that’s pretty dense reading, I want to wrap up the week with one short article from Scientific American that examines which dialects of English are actually captured by current speech recognition technology. Think about cases where you or your friends are misunderstood, whether by humans or computers, and we’ll talk about how these failures arise and can be countered.

(One last thing, and strictly optional, but the Proceedings of the National Academy of Sciences article that forms the basis of the SA article is pretty good, and worth a look if you have the time/interest.)


Ling 354: Week 3


In our Week 3 meeting, we’ll first wrap up emoji, including digging a bit deeper into how much of our understanding of emoji we actually share. We’ll then turn to the QWERTY effect, research that argues that the way we type language has a subtle but significant influence on our perception of it.

Textbook reading

No textbook reading for this week.  

Additional articles/videos

We’ll start class by discussing the Bai et al 2019 paper (A Systematic Review of Emoji: Current Research and Future Perspectives) that I’d meant to get to last class. Hopefully, you’ve already read it, but here’s the link again in case it’s helpful (HTML).

A couple people in the pre-class discussion had questions about a point that Bai et al made, which is that emoji are prone to “inefficiency” and “misunderstanding”. I’ll be honest: I was also confused by Bai et al’s discussion on this point. So I went back to the papers they cited, and I found one that both clarifies this point and is interesting in its own right: Tigwell & Flatla 2016. We’ll discuss this paper alongside the Bai et al one, and talk more generally about how messages are understood and misunderstood. (Optionally, if you’re interested in these issues, you may also want to read this paper.)

For the QWERTY effect, we’ll be reading an original research paper: Jasmin and Casasanto 2012. The statistical analysis in this paper may be a little tough if you’re not familiar with such things, so if you’re feeling stuck, focus on the higher-level concepts over the specific results. What is the QWERTY effect supposed to be? What do J&C think might cause it? How do they propose testing it? Do you find their methods convincing? How could you adapt this work to investigate languages/cultures with other keyboards and other writing systems?


Ling 502: Language, Mind & Society

Class Overview

Language does not exist in a vacuum; every time we use language, it’s shaped by communicative, cognitive, and learning pressures. In this class, we combine concepts from theoretical linguistics with the real-world setting of its use, providing an overview of language acquisition, psycholinguistics (language in the mind), and sociolinguistics (language in social settings).

The core idea of this course unfolds from the ways that our individual minds shape language, which is the heart of psycholinguistics. Once we understand the influence of individual minds on language, we can move outward to examine how interactions between people, along with aspects such as communicative goals and social identities, shape language: sociolinguistics. Psycho- and sociolinguistic influences, amassed over generations, lead to language change and standardization. And all of this structure is constrained by the fact that it must be learnable, generation by generation, through language acquisition.

The goal of this class is not only to discuss key concepts within these areas of linguistics, but to build bridges between them. Ideally, this will provide a chance for you to explore applications for your linguistic knowledge, and to spur new avenues for linguistic research. We also look at ways to bring social media and other emerging linguistic data sources into linguistic research to gain new insights into the interactions of languages, minds, and society.


We use Julie Sedivy’s textbook Language in Mind as the building block for the first part of the class, as it provides a helpful overview of the state of research in language acquisition, psycholinguistics, and the basics of sociolinguistics. Throughout the class, we dive deeper into specific research papers on these topics, and they take on a more prominent role as the class progresses into sociolinguistics.

The course syllabus is available here.

Language Acquisition

By the time you’re an adult, it’s really easy to forget that you needed to learn language in the first place. Years of relatively effortless language use can make it seem like a trivial task. But the second you step inside a language classroom, that misconception evaporates. So how do kids do it, and why do they seem so much better at it than us adults?

In this portion of the class (Weeks 1-4), we examine what, if any, linguistic structure children are born with, and what they build through exposure to other people’s language use. We discuss the innatist vs. emergentist perspectives, Universal Grammar and linguistic relativism (e.g., the Whorf hypothesis), and probabilistic rational models of acquisition.

We cover chapters 4 and 5 of the textbook, along with the following papers: Maye et al 2002 (data visualization), Yurovsky et al 2017, Gentner & Goldin-Meadow 2003, and Braginsky et al 2016.

Additional readings that may be useful for this topic are in this Google Drive.

I’ve also made some introductory notes on reading linguistics research papers. The first is a video where you can read along with me on the Maye et al paper. The second is a set of notes on the Yurovsky et al paper.


Psycholinguistics

Psycholinguistics is all about the representation of language in the mind. Have you ever wondered why some sentences are harder to understand than others? Have you ever had a word stuck on the tip of your tongue? Ever said something that was perfectly clear to you but incomprehensible to everyone else?

Much of this comes from the fact that language has to filter not only through the brain of the producer but of the audience as well. Understanding the ways that language is structured in the mind can help us understand why some linguistic tasks are easy and others are hard.

We introduce probabilistic frameworks for understanding cognitive pressures on language, including Bayesian analysis and the Rational Speech Act model.
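As a rough illustration of how the Rational Speech Act model chains Bayesian reasoning, here is a minimal sketch of a toy reference game of the kind often used to introduce it. The three objects and four one-word utterances are illustrative, and the sketch assumes a uniform prior over objects, no utterance costs, and a rationality parameter alpha of 1; the full model in the readings is more general.

```python
# Toy reference game (illustrative): which object does the speaker mean?
states = ["blue square", "blue circle", "green square"]
utterances = ["blue", "green", "square", "circle"]

def is_true(utterance, state):
    """Literal semantics: the utterance names one of the object's properties."""
    return utterance in state.split()

def normalize(scores):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}

def literal_listener(utterance):
    """L0(state | utterance): uniform over states the utterance is true of."""
    return normalize({s: 1.0 if is_true(utterance, s) else 0.0 for s in states})

def speaker(state, alpha=1.0):
    """S1(utterance | state): prefers utterances that lead L0 to the state."""
    return normalize({u: literal_listener(u)[state] ** alpha for u in utterances})

def pragmatic_listener(utterance, alpha=1.0):
    """L1(state | utterance): Bayes' Rule over S1 with a uniform prior."""
    return normalize({s: speaker(s, alpha)[utterance] for s in states})

print(literal_listener("blue"))
print(pragmatic_listener("blue"))
```

The literal listener treats the two blue objects as equally likely, while the pragmatic listener leans toward the blue square, reasoning that a speaker who meant the blue circle would more likely have said "circle".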

This covers Chapters 8 through 10 of the textbook, as well as the following papers: Ferreira & Patson 2007, Doyle & Frank 2015, Goodman & Frank 2016, Yoon, Tessler, et al 2016, and Keysar et al 2012.

Additional readings on psycholinguistics can be found here.

I’ve made some notes on the RSA model to accompany Goodman & Frank 2016, as well as some notes on probability in psycholinguistics more generally.


Sociolinguistics

As we think about not just our own minds’ influences on language, but also other people’s, we are inevitably driven toward sociolinguistics, the study of how language is shaped by its use and users. We mainly consider cognition-focused aspects of sociolinguistics in this class, examining speaker and audience design, communicative goals, and assertions of identity. We also look at how small social & cognitive pressures can build up over time into language change on the historical level.

This covers Chapter 11 of the textbook, as well as the following papers: von der Malsburg et al 2020, Clark & Schaefer 1987, Guydish & Fox Tree 2021, Nevalainen & Raumolin-Brunberg 2003, Wagner 2012, Coupland 1998, Labov 1963, Eckert 2012, Eckert 2011, Lewis et al 2014, Mahowald et al 2012.

Additional readings on sociolinguistics can be found here.

images sourced from unsplash


Ling 521: Phonology

Despite the single-word name, Ling 521 covers basic phonetics and phonology. Phonetics is the study of linguistic sounds, both how they are produced (articulatory phonetics) and processed (acoustic & auditory phonetics). Phonology is the study of how sounds get used and organized within languages. In short, this class covers the key points of how spoken language is produced and perceived.

We use Elizabeth Zsiga’s book The Sounds of Language: An Introduction to Phonetics and Phonology for most of this class. I’ve included my own notes and worksheets on these various topics below.


Articulatory Phonetics

We start by covering the basics of articulation. How do we produce the various sounds of a language? What makes an “s” sound different from a “z” sound? (Hint: touch your throat as you make each of them. You should feel vibrations from one but not the other.)

How do languages differ in the set of sounds they contain? How do we explain and classify the differences between sounds? How do you make the Arabic sound that gets written as “Q” in English? Why aren’t the vowels in English lay and Spanish leche quite the same?

We cover the International Phonetic Alphabet, articulatory phonetic features, and airstream mechanisms. We also spend a lot of time making sounds to each other, feeling out how unfamiliar sounds are produced.

Acoustic Phonetics

These different articulations only matter because they change the sound of the airflow being produced; it’s primarily the sounds rather than the articulations that we use to tell what someone’s saying.

Acoustic phonetics covers the basics of sound waves, and how they are broken down into their components by our auditory system. We examine how articulatory differences induce acoustic differences, looking at waveforms, spectral slices, and spectrograms. We focus on such acoustic features as the fundamental frequency and the F1 and F2 formants.
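As a small illustration of pulling an acoustic feature out of a waveform, here is a sketch that estimates the fundamental frequency of a synthetic signal using autocorrelation: it looks for the time shift at which the wave best matches a delayed copy of itself. The signal, sample rate, and search range are all assumptions chosen for the example, and real pitch trackers are considerably more careful than this.

```python
import math

def estimate_f0(samples, sample_rate, min_hz=75.0, max_hz=400.0):
    """Estimate fundamental frequency (Hz) as the lag, within a plausible
    pitch range, at which the signal best correlates with itself."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic "voice": a 100 Hz fundamental plus two weaker harmonics,
# sampled at 8000 Hz for a tenth of a second.
sample_rate = 8000
wave = [
    math.sin(2 * math.pi * 100 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 200 * t / sample_rate)
    + 0.25 * math.sin(2 * math.pi * 300 * t / sample_rate)
    for t in range(800)
]

print(estimate_f0(wave, sample_rate))
```

The estimate should come out at or very near 100 Hz, the fundamental that was built into the signal.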

Rule-Based Phonology

Transitioning to phonology, we now examine how languages put together their sounds. We briefly look at phonotactics, the relative acceptability of different sound sequences in a specific language. We focus on formalizing the relationship between the mental lexicon’s underlying forms of a word or intonational phrase, and the way that these actually surface in production.

We cover Sound Pattern of English-style notation for rules, including feature-value pairs, feature bundles, and alpha notation. We discuss common and uncommon phonological processes cross-linguistically, and transition from articulatory phonetic features to abstract phonological features. We handle phonological analyses from a wide variety of languages, including cases of rule ordering, feeding, and bleeding.
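To show why rule ordering matters, here is a minimal sketch with two toy rules for a made-up language, written as regular-expression substitutions over ordinary letters instead of feature bundles. The rules, the underlying form, and the function name are all hypothetical. Applied in one order, vowel deletion feeds devoicing by creating the word-final environment it needs; in the opposite order, devoicing gets no chance to apply.

```python
import re

# Toy rules as (name, pattern, replacement); "$" marks the end of the word.
# In SPE-style shorthand: V -> 0 / _#   and   d -> t / _#
FINAL_VOWEL_DELETION = ("final vowel deletion", r"[aeiou]$", "")
FINAL_DEVOICING = ("final devoicing", r"d$", "t")

def derive(underlying, rules):
    """Apply each rule once, in the order given, and return the surface form."""
    form = underlying
    for _name, pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form

# Deletion first: /bada/ -> bad -> bat (deletion feeds devoicing).
print(derive("bada", [FINAL_VOWEL_DELETION, FINAL_DEVOICING]))
# Devoicing first: /bada/ -> bada -> bad (devoicing had nothing to apply to).
print(derive("bada", [FINAL_DEVOICING, FINAL_VOWEL_DELETION]))
```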

Constraint-Based Phonology

We wrap up with a brief overview of Optimality Theory, a prominent constraint-based approach to phonology. Whereas rule-based phonology treats the phonological processes like an assembly line, with a sequence of precisely-defined changes applied unwaveringly, constraint-based phonology considers multiple possible surface forms, and chooses the one that best satisfies the violable constraints of the language. (This is my favorite part of the class.)

We cover the four key components of Optimality Theory (Gen, Con, H, and Eval) and the main constraints (Max, Dep, Ident, Agree, NoCoda, etc.). We look at how constraint rankings are determined for a language through ranking arguments, and apply these to a range of complex phenomena.
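Here is a minimal sketch of how evaluation under ranked, violable constraints can be simulated. It is heavily simplified: the candidate list is written by hand in place of Gen, syllable boundaries are marked with periods, and Max and Dep violations are approximated by comparing lengths instead of tracking which segments correspond. The input and candidates are toy examples.

```python
VOWELS = set("aeiou")

def no_coda(inp, cand):
    """NoCoda: one violation per syllable ending in a consonant."""
    return sum(1 for syl in cand.split(".") if syl and syl[-1] not in VOWELS)

def max_io(inp, cand):
    """Max: violations for deleted segments (approximated by length)."""
    return max(0, len(inp) - len(cand.replace(".", "")))

def dep_io(inp, cand):
    """Dep: violations for inserted segments (approximated by length)."""
    return max(0, len(cand.replace(".", "")) - len(inp))

def evaluate(inp, candidates, ranking):
    """Pick the candidate with the best violation profile, comparing
    constraints from highest-ranked to lowest-ranked."""
    return min(candidates, key=lambda cand: tuple(c(inp, cand) for c in ranking))

candidates = ["pat", "pa", "pa.ta"]  # faithful, deletion, epenthesis

# Three rankings, three different winners for the same input /pat/.
print(evaluate("pat", candidates, [no_coda, dep_io, max_io]))
print(evaluate("pat", candidates, [no_coda, max_io, dep_io]))
print(evaluate("pat", candidates, [max_io, dep_io, no_coda]))
```

Comparing tuples of violation counts from left to right mirrors strict domination: a lower-ranked constraint only matters when all higher-ranked constraints tie.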
