Statistical Methods for NLP

Course Plan


Class Topics Materials Code
Class 1

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Mon 16/1 13:15 – 15:00

Notes: See also the recent New York Times article: The Great A.I. Awakening.

Introduction to the class

Introduction to Machine Learning


Computational Statistics using Python (& R)

Class 1 Presentation

Class 1 Presentation (printer friendly version)¹

Charalambos Themistocleous (2017). Introduction to R. Part A. Language Fundamentals (manuscript)
Introduction to Python: Python Programming Language
Scipy/Numpy Quickstart Tutorial (see also Class 5 notes)
¹ Please prefer the printer friendly version to save paper.
The code for the examples in the presentation (frequency lists, concordances, etc.) can be accessed here by those interested in seeing how they were created.
Class 2

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 19/1 10:15 – 12:00


Probability Theory: Introduction

Class 2 Presentation
Class 2 (printer friendly version)
Class 3

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 23/1 10:15 – 12:00

Notes: Using probabilities in everyday decision making and how to avoid biases:

Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124-1131. doi:10.1126/science.185.4157.1124


Law of Total Probability

Independent vs. Dependent Events

Conditional Probability
Bayes' Theorem


Class 3 Presentation
Class 3 (printer friendly version)



Class 4

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 26/1 10:15 – 12:00

Discrete Variables

Continuous Variables


Bernoulli Distribution

Binomial Distribution

Hypergeometric Distribution

Random Variables

Class 4 Presentation

Class 4 (printer friendly version)

The code of the examples in the presentation (frequency lists, concordances, etc.) can be accessed here.


Assignment: Task 1 
Class 5

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 30/1 10:15 – 12:00

Computer Exercise 1: Distributions and random number generation based on distributions
Class 5 Presentation

Class 5 (printer friendly version)



Class 6

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 2/2 10:15 – 12:00
Continuous Variables

Hypothesis Testing

Statistical concepts

Linear Models

Linear Mixed-Effects Models

Class 6 Presentation

Class 6 (printer friendly version)

Assignment: Task 2 / Data
Class 7

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 6/2 10:15 – 12:00

Information Theory


 Class 7 Presentation

Class 7 (printer friendly version)

Class 8
Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Thu 9/2 10:15 – 12:00

Machine learning
Basic Concepts
Class 8 Presentation

Class 8 (printer friendly version)

Machine Learning – videos by Trevor Hastie and Rob Tibshirani

Scikit Learn: Machine Learning in Python

Working With Text Data

Class 9
Lecturers: Mehdi Ghanimifard

Room: Lab 4

Mon 13/2 10:15 – 12:00


DEADLINE 23/02/17
Naive Bayes

Hints and sample code

ASSIGNMENT 1
Class 10
Lecturers: Mehdi Ghanimifard

Room: Lab 4

Thu 16/2 10:15 – 12:00


DEADLINE 23/02/17
Naive Bayes

Class 11

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Mon 20/2 10:15 – 12:00

Machine Learning Approaches

Linear Discriminant Analysis

Functional Discriminant Analysis


 Class 11 Presentation

Class 11 (printer friendly version)

Caret Package in R (used for demonstrating model comparison in class):



Class 12

ASSIGNMENT 2 (Evaluation)

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Thu 23/2 10:15 – 12:00

Class 13

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 27/2 10:15 – 12:00

Decision trees




 Class 13 Presentation

Class 13 (printer friendly version)

Class 14

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 2/3 10:15 – 12:00

Markov Chains

Hidden Markov Models


 Class 14 Presentation

Class 14 (printer friendly version)

HMM: Book chapter from Daniel Jurafsky & James H. Martin, Speech and Language Processing.
Class 15

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Mon 6/3 10:15 – 12:00



Implementation of a part-of-speech tagger with the Viterbi algorithm.


Link to old instructions, and extended material

Tagged corpora (ask for password)

Class 16

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 9/3 10:15 – 12:00

Hidden Markov Models
Training and Evaluating HMMs
Class 16 Presentation
Class 16 (printer friendly version)
Viterbi Python Code
Class 17

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Mon 13/3 10:15 – 12:00

Class 18

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Thu 16/3 10:15 – 12:00

On Unsupervised Machine Learning (Chatrine)

Neural Networks
Deep Neural Networks

Class 18 Presentation

Class 18 (printer friendly version)


20/3 12:30 – 16:30

Room: T219





Course Literature

Course Books

  • Christopher Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, Massachusetts, USA: MIT Press. See also the book’s supplemental materials website at Stanford.
  • Joseph K. Blitzstein, Jessica Hwang (2014). Introduction to Probability. London: CRC Press. Taylor & Francis.
  • Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer. Available online from the authors here. Slides and videos for the Statistical Learning MOOC by Hastie and Tibshirani are available separately here. Slides and video tutorials related to this book by Abass Al Sharif can be downloaded here.

Complementary Textbooks

  • Daniel Jurafsky and James Martin (2008). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition. Prentice Hall.
  • Russell, Stuart J.; Norvig, Peter (2009), Artificial Intelligence: A Modern Approach (3rd ed.), Upper Saddle River, New Jersey: Prentice Hall, ISBN 0-13-604259-7.


Course Description

7.5 higher education credits (hec), 2nd semester, 1st study period

The purpose of this course is to give an introduction to probabilistic modeling, statistical methods and their use within the field of language technology. The following topics will be covered in the course:

  • Probability theory
  • Information theory
  • Statistical theory (sampling, estimation, hypothesis testing)
  • Language modeling
  • Part-of-speech tagging
  • Syntactic parsing
  • Word sense disambiguation
  • Machine translation
  • Evaluation

This is an elective course offered by the programme for students taking the one-year degree: Degree of Master of Arts (60 credits) in Language Technology (Filosofie magisterexamen i språkteknologi).

Course Syllabus

The full course syllabus, as adopted by the head of department, can be downloaded as a PDF.

Course syllabus in English

Course syllabus in Swedish


The course can be offered as a freestanding single-subject course for students not on the MLT programme. Information on application deadlines and admissions is available in the university course catalogue: