Lecture: Natural Language Processing

Administrative Information

Title	Natural Language Processing
Duration	60 - 70 minutes
Module	A
Lesson Type	Lecture
Focus	Practical - AI Modelling
Topic	Statistical methods for NLP and text classification

Keywords

NLP, Natural Language Processing, Computational Linguistics

Learning Goals

Students understand the basic concepts of Natural Language Processing
Students learn NLP use cases
Students get familiar with various NLP tools and concepts

Expected Preparation

Learning Events to be Completed Before

None.

Obligatory for Students

A review of basic statistics

Optional for Students

A review of the Python programming language

References and background for students

Ethical by Design: Ethics Best Practices for Natural Language Processing
Bishop, Christopher M. (2006). Pattern recognition and machine learning
Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community (https://terpconnect.umd.edu/~kshilton/pdf/VitaketalCSCWpreprint.pdf)
Jurafsky D., Martin J. H. - An Introduction to NLP, Computational Linguistics, and Speech Recognition
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008.

Recommended for Teachers

Ethical Principles and Guidelines for the Protection of Human Subjects of Research (THE BELMONT REPORT)

Lesson materials

Lecture slides

Instructions for Teachers

You can base this class around the slides. The material is suggested but can be adapted.

Outline

Time schedule
Duration (Min)	Description	Concepts	Activity
5	Introduction to Natural Language Processing, goals, methods and challenges	computational linguistics, natural language processing
5	Processing Natural Language Text: Use cases	corpus, segmentation, tokenization, concordance
10	Regular Expressions, Text Normalisation	language modeling, edit distance
15	N-gram Models	Sequences of words as a Markov process
5	Chain Rule of Probability	General product rule
10	Markov and Maximum Likelihood Estimation	Markov chain - stochastic model
5	Evaluating Language Models	Perplexity
5	Naive Bayes Classifier	Probabilistic classifiers	Preparing the lab exercise
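
Optional illustration for teachers: the N-gram, Maximum Likelihood Estimation and Perplexity items in the schedule above could be tied together with a small worked example. The following is only a minimal sketch, not taken from the lecture slides. The function names, the sentence boundary markers <s> and </s>, and the toy corpus are all assumptions made for illustration, and no smoothing is applied.

```python
import math
from collections import Counter


def train_bigram_mle(sentences):
    """Estimate P(word | previous word) by relative frequency (MLE).

    `sentences` is a list of token lists. The markers "<s>" and "</s>"
    are illustrative sentence boundaries, not part of the lecture material.
    """
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        # Every token except the last one acts as a context for the next token.
        context_counts.update(padded[:-1])
        bigram_counts.update(zip(padded[:-1], padded[1:]))
    return {
        (prev, word): count / context_counts[prev]
        for (prev, word), count in bigram_counts.items()
    }


def perplexity(model, tokens):
    """Perplexity of one tokenized sentence under the bigram model.

    Uses the Markov assumption: each word depends only on the previous word.
    Without smoothing, any unseen bigram gives infinite perplexity.
    """
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(padded[:-1], padded[1:]):
        p = model.get((prev, word), 0.0)
        if p == 0.0:
            return math.inf
        log_prob += math.log(p)
    n_predictions = len(padded) - 1
    return math.exp(-log_prob / n_predictions)


# Toy corpus, invented purely for this sketch.
corpus = [["i", "like", "nlp"], ["i", "like", "python"]]
model = train_bigram_mle(corpus)
print(model[("like", "nlp")])
print(perplexity(model, ["i", "like", "nlp"]))
```

In this toy corpus "like" is followed by "nlp" in one of its two occurrences, so the estimate for that bigram is 0.5. A sentence containing a bigram that never occurs in the corpus gets infinite perplexity, which is one way to motivate smoothing in class.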

Acknowledgements

The Human-Centered AI Masters programme was co-financed by the Connecting Europe Facility of the European Union under Grant №CEF-TC-2020-1 Digital Skills 2020-EU-IA-0068.

Lesson plan on SURF

WikiWijs page