Natural language processing
Study Cycle: 2
Lectures: 45
Seminars: 10
Tutorials: 20
ECTS credits: 5
The syllabus is based on a selection of modern deep learning based natural language processing techniques and their practical use. The lectures introduce the main tasks and techniques and explain their operation and theoretical background. During practical sessions and seminars, the gained knowledge is applied to practical language tasks using open-source tools. Students investigate and solve assignments based on real-world research and commercial problems from the English and Slovene languages.
1. Introduction to natural language processing: motivation, language understanding, ambiguity, traditional, statistical, and neural approaches.
2. Text preprocessing and normalization: regular expressions, grammars, string similarity, advanced normalization techniques, lemmatization.
3. Language resources: corpora, dictionaries, thesauri, networks and semantic databases, WordNet.
4. Text similarity: measures, clustering approaches, cosine distance, language networks and graphs.
5. Text representation: sparse and dense embeddings; language models; word, sentence, and document embeddings.
6. Deep neural networks for text: recurrent neural networks, convolutional networks for text, transformers.
7. Neural embeddings: word2vec, fastText, ELMo, BERT, cross-lingual embeddings.
8. Large language models: BERT, GPT, and T5; multimodal models.
9. Shallow computational and lexical semantics: part-of-speech tagging, dependency parsing, named entity recognition, semantic role labelling.
10. Word senses and their disambiguation.
11. Affective computing: sentiment, emotions.
12. Text summarization, question answering and reading comprehension: methods and evaluation.
13. Machine translation: methods and evaluation.
Jurafsky, David and Martin, James H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, 3rd edition draft. 2023.
Eisenstein, Jacob. Natural Language Processing. MIT Press, 2019.