So much of our modern lives happens through computers, phones, and other electronic devices. The same is true of our language. How do computers shape our language, and how do we adapt computers to our language use? This class considers a range of topics connecting language and computers, including speech recognition systems (Alexa, Siri, Google Assistant), emoji use, and sentiment analysis. It also covers basic linguistic and algorithmic concepts to help understand the strengths and failures of our contemporary language systems.
Students in this class tend to have a pretty wide range of majors, interests, and experiences. Some know much more about language, others much more about computers, and some know both or neither. My goal in the class is to fill in the gaps in students’ linguistic or computational knowledge, then build out into the applications they find interesting. For instance, one Ling 354 student ended up submitting an emoji proposal to the Unicode Consortium after this class!
Here are some of the topics we cover, along with links to the reading lists (extracted from Canvas, so apologies for their imperfections). I’ll be adding to these slowly, because extracting them from Canvas is exhausting!
Writing Systems, Unicode, and Emoji
Computers think in ones and zeros, or more accurately, in “ons” and “offs”. Human writing systems are much more complex; even the smallest alphabet, used to write Rotokas, contains just 12 letters. Furthermore, many writing systems lack “letters” in the sense of the English alphabet, instead using characters that represent whole syllables or even words.
In this topic, we’ll look at the diversity of linguistic writing systems (alphabets, abjads, syllabaries, etc.), and how these get represented on a computer through Unicode, with each character getting its own number.
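To see what “each character gets its own number” looks like in practice, here’s a quick Python example that asks for the Unicode code points of a few characters, emoji included:

```python
# Every character has a Unicode code point: a unique number,
# conventionally written in hexadecimal as U+XXXX.
for char in ["a", "ü", "한", "🙂"]:
    print(f"{char} is U+{ord(char):04X} (decimal {ord(char)})")

# chr() goes the other way, from number back to character:
print(chr(0x1F602))  # the "face with tears of joy" emoji
```

That single numbering scheme is what lets one text file mix Latin letters, Hangul syllables, and emoji without any extra bookkeeping.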
We’ll also examine emoji and start asking: what’s their relationship to language? Are they linguistic elements, like words? Are they paralinguistic, like gestures or intonation? Or are they something else entirely?
Readings:
Project: Propose an emoji
Speech recognition
Speech recognition is probably one of the most common ways that you encounter “natural language processing”: getting a computer to understand human language. How do Alexa, Siri, Google, and other voice-activated assistants work? What causes difficulties for them, and how can we overcome those difficulties?
For that matter, how does human speech recognition work? Why do baristas screw up your name at cafes? Why do so many people think my name is “Dave”? We’ll try to get to the bottom of these mysteries and more.
In this topic, we’ll look at the range of linguistic sounds, how languages structure their sound inventories, how humans hear and understand spoken language, and how computers can (try to) do the same. We’ll also examine some of the shortcomings of speech recognition, especially involving less-studied languages and dialects. What biases do these systems have, and what can we do about them?
Readings:
Spell-check, autocorrect, & grammar checking
Once we have the basics of speech recognition down, we can turn to how computers understand larger linguistic structures, like words and sentences. What “language models” does a computer have, and how are these used?
We’ll look through the lens of autocorrect and grammar checking, to understand how computers deal with input that doesn’t fit their expectations. When I type “langauge”, did I mean to type “language”, or did I mean to type this weird non-word? That’s pretty easy to tell, but what if I typed “causal”? What are the odds I meant to type “casual”? Should the system ask me? Should it autocorrect? What information can the system use to improve its guesses?
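To give a feel for how a system might weigh those guesses, here’s a back-of-the-envelope Python sketch of the “noisy channel” idea: score each candidate by how common the word is, discounted by how big the typo would have to be. The word frequencies here are made up, and plain edit distance stands in for a real error model.

```python
# Made-up frequencies; a real spell-checker would estimate these from a corpus.
word_freq = {"language": 5000, "casual": 2500, "causal": 300}

def edit_distance(a, b):
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete a character
                            curr[j - 1] + 1,             # insert a character
                            prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = curr
    return prev[-1]

def rank(typed):
    """Score candidates by frequency, steeply discounted per edit."""
    scores = {w: freq / 10 ** edit_distance(typed, w) for w, freq in word_freq.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(rank("langauge"))  # "language" wins by a mile
print(rank("causal"))    # "causal" beats "casual" here, but only because the
                         # per-edit penalty is steep; soften it and the answer flips
```

That tug-of-war between “which words are common” and “which typos are plausible” is exactly what the system has to settle before deciding whether to correct silently or ask.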
We’ll examine language models from the most basic (word frequency) to more clever ones. We’ll see how they can be used in speech recognition, autocorrect, and even autocompletion.
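As a tiny preview of the word-frequency end of that spectrum, here’s a toy Python bigram model: count which words follow which in a made-up one-sentence “corpus”, then autocomplete by suggesting the most frequent continuations. Real systems use the same basic idea, just with enormously more text and cleverer statistics.

```python
from collections import Counter, defaultdict

# A comically small "corpus"; real language models train on billions of words.
corpus = "the cat sat on the mat and the cat slept on the couch".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev_word, next_word in zip(corpus, corpus[1:]):
    following[prev_word][next_word] += 1

def autocomplete(word, n=2):
    """Suggest the n most frequent continuations of `word` in the corpus."""
    return [w for w, _ in following[word].most_common(n)]

print(autocomplete("the"))  # ['cat', 'mat'] -- "the cat" appears twice
print(autocomplete("on"))   # ['the']
```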
Readings: