SPEECH and LANGUAGE PROCESSING

An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition
Second Edition

Last Update
Monday, September 04, 2006
The 2nd Ed. should be available in its finished form in July '07. We'll continue to post new and revised chapters here as they become available. As usual, we welcome your comments; when sending them, please indicate clearly that you're referring to the new and revised chapters, as in "Bug in Ch 8, 2ed".

Table of Contents

Preface

1 Introduction

I: Words

2 Regular Expressions and Automata
3 Words and Transducers [PDF]
4 N-grams [PDF]
5 Word Classes and Part-of-Speech Tagging [PDF]
6 HMMs and Loglinear Models [PDF]

II: Speech

7 Phonetics [PDF]
8 Speech Synthesis [PDF]
9 Automatic Speech Recognition [PDF]
10 Computational Phonology [PDF]

III: Syntax

11 Formal Grammars of English [PDF]
12 Parsing with Context-Free Grammars [PDF]
13 Lexicalized and Probabilistic Parsing
14 Language and Complexity
15 Features and Unification

IV: Semantics and Pragmatics

16 Representing Meaning [PDF]
17 Semantic Analysis
18 Lexical Semantics
19 Computational Lexical Semantics [PDF]
20 Discourse [PDF]

V: Applications

21 Information Retrieval
22 Question Answering
23 Dialog and Conversational Agents [PDF]
24 Machine Translation [PDF]

Chapter 3: Words and Transducers

This new version of the chapter still focuses on morphology and FSTs, but has been expanded in several ways. There are more details on the formal description of finite-state transducers, many bugs are fixed, and two new sections have been added on words and subwords. The first new section covers word and sentence tokenization, including algorithms for English as well as the maxmatch algorithm for Chinese word segmentation. The second new section covers spelling correction and minimum edit distance; it is an extended version of the edit-distance section from Chapter 5 of the first edition, with clearer figures, for example for the minimum-edit-distance backtrace. [Chapter 3 PDF]
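To give a concrete sense of the edit-distance material, here is a minimal dynamic-programming sketch of minimum edit distance. The code and cost settings are ours, not the book's; the chapter's version also covers the backtrace needed to recover the alignment.

    def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=2):
        """Minimum edit distance via dynamic programming.

        sub_cost=2 reflects the common convention that a substitution
        counts as one deletion plus one insertion.
        """
        n, m = len(source), len(target)
        # D[i][j] = cost of editing source[:i] into target[:j]
        D = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            D[i][0] = D[i - 1][0] + del_cost
        for j in range(1, m + 1):
            D[0][j] = D[0][j - 1] + ins_cost
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                same = source[i - 1] == target[j - 1]
                D[i][j] = min(D[i - 1][j] + del_cost,
                              D[i][j - 1] + ins_cost,
                              D[i - 1][j - 1] + (0 if same else sub_cost))
        return D[n][m]

    print(min_edit_distance("intention", "execution"))  # 8 under these costs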

Chapter 4: N-grams (Formerly Chapter 6)

This language-model chapter has had a complete overhaul. The current draft includes more examples, a more complete description of Good-Turing, expanded sections on practical issues such as perplexity and evaluation, coverage of language-modeling toolkits and the ARPA format, and an overview of modern methods such as interpolated Kneser-Ney. [Chapter 4 PDF]
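As a quick illustration of the perplexity measure discussed in the evaluation section, the following sketch computes the perplexity of a test sequence from per-token probabilities. The probabilities here are made up for illustration; they are not from the book.

    import math

    def perplexity(log2_probs):
        """Perplexity from per-token log2 probabilities:
        2 ** (-(1/N) * sum of log2 p(w_i | history))."""
        n = len(log2_probs)
        return 2 ** (-sum(log2_probs) / n)

    # hypothetical per-token probabilities from some bigram model
    token_probs = [0.2, 0.1, 0.05, 0.3]
    print(perplexity([math.log2(p) for p in token_probs]))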

Chapter 5: Word Classes and Part-of-Speech Tagging (Formerly Chapter 8)

The main change to this revised chapter is a greatly expanded, and hence self-contained, description of bigram and trigram HMM part-of-speech tagging, including Viterbi decoding and deleted interpolation smoothing. Courses that don't include the speech and HMM chapters can now use this chapter to introduce HMM tagging in a self-contained way. Other changes include expanded descriptions of unknown-word modeling and of part-of-speech tagging in other languages, and many bug fixes. Finally, we've moved this chapter earlier in the book, and it is now Chapter 5; it should be used after the FST chapter (Chapter 3) and the N-gram chapter (Chapter 4). [Chapter 5 PDF]
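For readers curious what deleted interpolation looks like in practice, here is a rough sketch of the usual count-based way of setting the trigram/bigram/unigram weights (our own code, in the style of TnT-like taggers, not taken from the chapter; the dictionaries and variable names are our own).

    def deleted_interpolation(unigrams, bigrams, trigrams, total_tokens):
        """Estimate lambda weights for interpolated trigram tag probabilities.

        unigrams, bigrams, and trigrams map tags / tag tuples to counts.
        For each trigram, 'delete' it from the counts and see whether the
        trigram, bigram, or unigram relative frequency is largest; credit
        that lambda with the trigram's count, then normalize.
        """
        l1 = l2 = l3 = 0.0
        for (t1, t2, t3), c in trigrams.items():
            tri = (c - 1) / (bigrams[(t1, t2)] - 1) if bigrams[(t1, t2)] > 1 else 0.0
            bi = (bigrams[(t2, t3)] - 1) / (unigrams[t2] - 1) if unigrams[t2] > 1 else 0.0
            uni = (unigrams[t3] - 1) / (total_tokens - 1)
            best = max(tri, bi, uni)
            if best == tri:
                l3 += c
            elif best == bi:
                l2 += c
            else:
                l1 += c
        total = l1 + l2 + l3
        return l1 / total, l2 / total, l3 / total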

Chapter 6: HMMs and Loglinear Models (Formerly part of Chapter 7 and Appendix D)

This new chapter presents the Hidden Markov Model in detail, including the Forward, Viterbi, and EM algorithms. It will eventually also present loglinear models. [Chapter 6 PDF]
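As a taste of the algorithms this chapter presents, here is a compact Viterbi decoding sketch for an HMM with dictionary-based parameters. This is our own minimal version, not the chapter's code, and a real implementation would work in log space to avoid underflow.

    def viterbi(obs, states, start_p, trans_p, emit_p):
        """Most likely state sequence for obs under an HMM.

        start_p[s], trans_p[s][s2], and emit_p[s][o] are plain probabilities.
        Returns (probability of best path, best state sequence).
        """
        # V[t][s] = probability of the best path ending in state s at time t
        V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
        back = [{}]
        for t in range(1, len(obs)):
            V.append({})
            back.append({})
            for s in states:
                prob, prev = max(
                    (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                    for p in states
                )
                V[t][s] = prob
                back[t][s] = prev
        # follow backpointers from the best final state
        prob, last = max((V[-1][s], s) for s in states)
        path = [last]
        for t in range(len(obs) - 1, 0, -1):
            path.append(back[t][path[-1]])
        return prob, list(reversed(path))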

Chapter 7: Phonetics (Formerly parts of Chapters 4, 5, and 7)

This chapter is an introduction to articulatory and acoustic phonetics for speech processing, as well as foundational tools like the ARPAbet, wavefile formats, phonetic dictionaries, and PRAAT. [Chapter 7 PDF]

Chapter 8: Speech Synthesis

This is a new chapter on speech synthesis. [Chapter 8 PDF]

Chapter 9: Automatic Speech Recognition (Formerly 7)

This new, significantly expanded speech recognition chapter gives a complete introduction to HMM-based speech recognition, including Gaussian Mixture Model acoustic models and embedded training, as well as overviews of advanced topics such as decision-tree clustering for context-dependent phones, n-best lists, lattices, confusion networks, MLLR adaptation, and discriminative training. The current draft is still missing the section on the extraction of MFCC features. [Chapter 9 PDF]
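On the acoustic-model side, the basic quantity an HMM state's output distribution supplies is the likelihood of a feature vector under a Gaussian mixture. Here is a minimal diagonal-covariance sketch; the code and parameter layout are ours, not the chapter's.

    import math

    def gmm_log_likelihood(x, weights, means, variances):
        """log p(x) under a diagonal-covariance Gaussian mixture model.

        x is a feature vector (e.g. cepstral features); weights, means, and
        variances are per-component lists (means/variances per dimension).
        """
        component_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
            component_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        m = max(component_logs)
        return m + math.log(sum(math.exp(lp - m) for lp in component_logs))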

Chapter 10: Computational Phonology (Formerly parts of Chapters 4, 5, and 7)

This chapter is a brief introduction to computational phonology, including phonological and morphological learning, finite-state models, OT, and Stochastic OT. [Chapter 10 PDF]

Chapter 11: Formal Grammars of English (Formerly 9)

This chapter still focuses on CFGs for English and includes a revamped and somewhat expanded grammar for the ATIS domain. New and expanded sections cover treebanks, with a focus on the Penn Treebank; searching treebanks with tgrep and tgrep2; heads and head-finding rules; dependency grammars; categorial grammar; and grammars for spoken language processing. [Chapter 11 PDF]

Chapter 12: Parsing with Context-Free Grammars (Formerly 10)

The focus of this chapter is still on parsing with CFGs. It now includes sections on CKY, Earley, and agenda-based (chart) parsing. In addition, there is a new section on partial parsing, with a focus on machine-learning-based base-phrase chunking and the use of IOB tags. [Chapter 12 PDF]
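To give a flavor of CKY, here is a minimal recognizer sketch for a grammar already in Chomsky normal form, applied to a tiny made-up ATIS-style fragment; both the code and the toy grammar are ours, not the book's.

    from collections import defaultdict
    from itertools import product

    def cky_recognize(words, lexical, binary, start="S"):
        """CKY recognition for a CNF grammar.

        lexical: dict word -> set of nonterminals (rules A -> word)
        binary:  dict (B, C) -> set of nonterminals (rules A -> B C)
        """
        n = len(words)
        # table[(i, j)] = nonterminals that span words[i:j]
        table = defaultdict(set)
        for i, w in enumerate(words):
            table[(i, i + 1)] = set(lexical.get(w, set()))
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for B, C in product(table[(i, k)], table[(k, j)]):
                        table[(i, j)] |= binary.get((B, C), set())
        return start in table[(0, n)]

    # hypothetical toy fragment for "book that flight"
    lexical = {"book": {"Verb"}, "that": {"Det"}, "flight": {"Noun"}}
    binary = {("Det", "Noun"): {"NP"}, ("Verb", "NP"): {"S", "VP"}}
    print(cky_recognize(["book", "that", "flight"], lexical, binary))  # True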

Chapter 16: Representing Meaning (Formerly 14)

This chapter still covers basic notions surrounding meaning representation languages. It now has better coverage of model-theoretic semantics for meaning representations, and a new section on Description Logics and their role as a basis for OWL and the Semantic Web. [Chapter 16 PDF]

Chapter 19: Computational Lexical Semantics (New Chapter; Parts of old Chs. 15, 16 and 17)

The focus of this new chapter is on computing with word meanings. The three main topics are word sense disambiguation, computing relations between words (similarity, hyponymy, etc.), and semantic role labeling; the treatment of all three is considerably expanded from the first edition. [Chapter 19 PDF]
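As one concrete example of the word sense disambiguation methods treated there, here is a rough sketch of a simplified Lesk-style disambiguator run over a toy, made-up sense inventory; both the code and the inventory are ours, not the book's.

    def simplified_lesk(context, senses):
        """Pick the sense whose signature overlaps most with the context.

        senses maps a sense name to a signature: a set of words drawn from
        its gloss and example sentences.
        """
        context_words = set(context.lower().split())
        best_sense, best_overlap = next(iter(senses)), -1
        for sense, signature in senses.items():
            overlap = len(context_words & signature)
            if overlap > best_overlap:
                best_sense, best_overlap = sense, overlap
        return best_sense

    # hypothetical mini sense inventory for "bank"
    senses = {
        "bank#finance": {"money", "deposit", "loan", "account", "mortgage"},
        "bank#river": {"river", "slope", "water", "shore"},
    }
    print(simplified_lesk("i deposited money at the bank", senses))  # bank#finance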

Chapter 20: Discourse

This rewritten chapter includes a number of updates to the first edition. The anaphora resolution section has been updated to include modern log-linear methods, and a section on the more general problem of coreference has been added. The coherence section describes cue-based methods for extracting rhetorical and coherence relations. Finally, there is a significant new section on discourse segmentation (including TextTiling). [Chapter 20 PDF]

Chapter 23: Dialog and Conversational Agents (Formerly 19)

This is a completely rewritten version of the dialogue chapter. It includes much more information on modern dialogue systems, including VoiceXML, confirmation and clarification dialogues, the information-state model, Markov decision processes, and other current approaches to dialogue agents. [Chapter 23 PDF]

Chapter 24: Machine Translation

The MT chapter has been extensively rewritten, and a significant new section has been added covering statistical MT, including IBM Model 1, Model 3, and HMM alignment. A new evaluation section covering human evaluation and Bleu has also been added, as well as sections on SYSTRAN and more details on cross-linguistic divergences. [Chapter 24 PDF]
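To show the shape of the Bleu computation discussed in the evaluation section, here is a rough sentence-level sketch combining clipped n-gram precision with the brevity penalty. The code is ours; real Bleu is normally computed over a whole corpus and usually smoothed.

    import math
    from collections import Counter

    def ngrams(tokens, n):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def bleu(candidate, references, max_n=4):
        """Sentence-level Bleu sketch: clipped n-gram precision x brevity penalty.

        candidate is a list of tokens; references is a list of token lists.
        """
        log_precisions = []
        for n in range(1, max_n + 1):
            cand_counts = Counter(ngrams(candidate, n))
            max_ref_counts = Counter()
            for ref in references:
                for gram, count in Counter(ngrams(ref, n)).items():
                    max_ref_counts[gram] = max(max_ref_counts[gram], count)
            clipped = sum(min(c, max_ref_counts[g]) for g, c in cand_counts.items())
            total = max(sum(cand_counts.values()), 1)
            log_precisions.append(math.log(clipped / total) if clipped else float("-inf"))
        c = len(candidate)
        # brevity penalty uses the reference length closest to the candidate's
        r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
        bp = 1.0 if c > r else math.exp(1 - r / c)
        return bp * math.exp(sum(log_precisions) / max_n)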