Administrative Information
Title | Natural Language Processing |
Duration | 60 - 70 minutes |
Module | A |
Lesson Type | Lecture |
Focus | Practical - AI Modelling |
Topic | Statistical methods for NLP and text classification |
Keywords
NLP, Natural Language Processing, Computational Linguistics
Learning Goals
- Students understand the basic concepts of Natural Language Processing
- Students learn NLP use cases
- Students get familiar with various NLP tools and concepts
Expected Preparation
Learning Events to be Completed Before
None.
Obligatory for Students
- A review of basic statistics
Optional for Students
- Review of Python Programming Language
References and background for students
- Ethical by Design: Ethics Best Practices for Natural Language Processing
- Bishop, Christopher M. (2006). Pattern recognition and machine learning
- [Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community](https://terpconnect.umd.edu/~kshilton/pdf/VitaketalCSCWpreprint.pdf)
- Jurafsky D., Martin J. H. - An Introduction to NLP, Computational Linguistics, and Speech Recognition
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008.
Lesson materials
Instructions for Teachers
You can base this class around the slides. The material is suggested but can be adapted.
Outline
Duration (Min) | Description | Concepts | Activity | Material |
---|---|---|---|---|
5 | Introduction to Natural Language Processing: goals, methods and challenges | computational linguistics, natural language processing | | |
5 | Processing Natural Language Text: Use cases | corpus, segmentation, tokenization, concordance | | |
10 | Regular Expressions, Text Normalisation | language modeling, edit distance | | |
15 | N-gram Models | Sequences of words as a Markov process | | |
5 | Chain Rule of Probability | General product rule | | |
10 | Markov and Maximum Likelihood Estimation | Markov chain - stochastic model | | |
5 | Evaluating Language Models | Perplexity | | |
5 | Naive Bayes Classifier | Probabilistic classifiers | Preparing the lab exercise | |
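
Teachers who want a short live demonstration for the N-gram, maximum likelihood estimation and perplexity items in the outline could use something like the following sketch. It is not part of the official lesson materials. The toy corpus, the function names and the whitespace tokenizer are all assumptions made for illustration, and no smoothing is applied, so any unseen bigram yields infinite perplexity.

```python
import math
from collections import Counter

# Toy corpus invented for illustration; not taken from the lesson slides.
corpus = [
    "i like nlp",
    "i like python",
    "you like nlp",
]

def tokenize(sentence):
    # Naive whitespace tokenization with sentence boundary markers.
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram_mle(sentences):
    """Collect the counts needed for maximum likelihood bigram estimates."""
    context_counts = Counter()  # how often each word appears as a context
    bigram_counts = Counter()   # how often each (previous, current) pair appears
    for sentence in sentences:
        tokens = tokenize(sentence)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def bigram_prob(prev, cur, context_counts, bigram_counts):
    # MLE under the first-order Markov assumption:
    # P(cur | prev) = count(prev, cur) / count(prev). No smoothing.
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, cur)] / context_counts[prev]

def perplexity(sentence, context_counts, bigram_counts):
    """Perplexity of one sentence: exp of the negative mean log probability."""
    tokens = tokenize(sentence)
    log_prob = 0.0
    n = 0
    for prev, cur in zip(tokens, tokens[1:]):
        p = bigram_prob(prev, cur, context_counts, bigram_counts)
        if p == 0.0:
            return math.inf  # an unseen bigram makes the sentence impossible
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / n)

context_counts, bigram_counts = train_bigram_mle(corpus)
print(bigram_prob("like", "nlp", context_counts, bigram_counts))
print(perplexity("i like nlp", context_counts, bigram_counts))
print(perplexity("i nlp", context_counts, bigram_counts))
```

The last call illustrates why smoothing is usually discussed alongside maximum likelihood estimation: a single unseen bigram drives the sentence probability to zero.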
Acknowledgements
The Human-Centered AI Masters programme was co-financed by the Connecting Europe Facility of the European Union under Grant №CEF-TC-2020-1 Digital Skills 2020-EU-IA-0068.