Speech Technologies

Lectures: 35

Seminars: 25

Tutorials: 15

ECTS credits: 5

Lecturer(s): prof. dr. Dobrišek Simon

Introduction: Definition of the field of speech technologies, historical development and current trends, the importance of speech technologies for the Slovenian language and digital linguistics.

Fundamentals of speech production and perception in humans. Characteristics of speech signals and their representation for computational processing.

Computational speech processing: Digitization and preprocessing of speech signals, extraction of speech features, speech segmentation, speech databases, and tools for their processing.
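As an illustration of the preprocessing and feature-extraction steps named above, the following is a minimal sketch, not course material. It assumes a 16 kHz sampling rate, 25 ms frames with a 10 ms hop, a pre-emphasis coefficient of 0.97, and a synthetic sine wave standing in for a recorded speech signal. All function names are hypothetical and chosen only for this example.

```python
import math


def pre_emphasize(samples, alpha=0.97):
    """Apply the first-order filter y[n] = x[n] - alpha * x[n-1]."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def frame_signal(samples, frame_len, hop_len):
    """Split the signal into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


def hamming(frame_len):
    """Return Hamming window coefficients for a frame of the given length."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]


def log_energy(frame, window):
    """Log of the windowed short-time energy; the small constant avoids log(0)."""
    energy = sum((s * w) ** 2 for s, w in zip(frame, window))
    return math.log(energy + 1e-12)


# Assumed parameters: 16 kHz sampling, 25 ms frames, 10 ms hop.
sample_rate = 16000
frame_len = 400
hop_len = 160

# Synthetic 100 ms, 200 Hz sine wave used in place of real speech.
signal = [math.sin(2 * math.pi * 200 * n / sample_rate)
          for n in range(sample_rate // 10)]

emphasized = pre_emphasize(signal)
frames = frame_signal(emphasized, frame_len, hop_len)
window = hamming(frame_len)
energies = [log_energy(frame, window) for frame in frames]
```

Log energy is only the simplest per-frame feature. Practical systems typically compute spectral features, such as mel-frequency cepstral coefficients, from the same windowed frames.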

Speech and speaker recognition: Automatic speech and speaker recognition using statistical and deep learning models; acoustic, linguistic, and semantic analysis and modeling of speech.

Speech synthesis: Structure of speech synthesis systems, grapheme-to-phoneme conversion, prosody modeling, methods for generating synthetic speech signals, and evaluation of speech synthesis systems.

Human–computer dialogue systems: Architecture of dialogue systems, dialogue management, knowledge representation, the use of large generative language models, and evaluation of dialogue systems.

1. R. Pieraccini: The Voice in the Machine: Building Computers That Understand Speech, MIT Press, 2012. https://search.ebscohost.com/login.aspx?direct=true&db=e000xww&AN=44356…
2. L. Rabiner, R. Schafer: Theory and Applications of Digital Speech Processing, 1st ed., Prentice Hall, 2010.