Friday, March 17, 2017

Lecture 4: language modeling (2); neural networks and NLP

We discussed perplexity and its close relationship with entropy, and introduced smoothing and interpolation techniques to deal with the issue of data sparsity. We also held a practical session on language modeling with Python and the Berkeley LM toolkit.
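A minimal Python sketch of the kind of computation covered: the perplexity of a bigram model with linear interpolation. The toy corpus and the interpolation weight are invented for illustration.

    import math
    from collections import Counter

    # Toy corpus; in practice these counts come from a large training set.
    corpus = "the cat sat on the mat the cat ate".split()

    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    N = len(corpus)

    def p_interp(w_prev, w, lam=0.7):
        # Linear interpolation: P(w|w_prev) = lam*P_ML(w|w_prev) + (1-lam)*P_ML(w)
        p_bi = bigrams[(w_prev, w)] / unigrams[w_prev] if unigrams[w_prev] else 0.0
        return lam * p_bi + (1 - lam) * unigrams[w] / N

    def perplexity(tokens):
        # Perplexity = 2 ** (cross-entropy in bits per bigram event)
        log_prob = sum(math.log2(p_interp(prev, w))
                       for prev, w in zip(tokens, tokens[1:]))
        return 2 ** (-log_prob / (len(tokens) - 1))

    print(perplexity("the cat sat on the mat".split()))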

Friday, March 10, 2017

Lecture 3: morphological analysis: practical session; homework 1; language modeling (1)

We had a practical session on morphological analysis in Python and Java. We reviewed basic probability concepts and introduced N-gram models (unigrams, bigrams, trigrams), together with their probability estimation and the data sparsity issues that arise.
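As a small illustration, here is maximum-likelihood bigram estimation on a made-up two-sentence corpus, including the zero-probability problem that motivates smoothing:

    from collections import Counter

    tokens = "<s> I am Sam </s> <s> Sam I am </s>".split()

    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))

    def p_mle(w_prev, w):
        # Maximum-likelihood estimate: count(w_prev, w) / count(w_prev)
        return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

    print(p_mle("<s>", "I"))   # 0.5: "I" starts one of the two sentences
    print(p_mle("I", "am"))    # 1.0: "am" always follows "I" here
    print(p_mle("am", "I"))    # 0.0: unseen bigram -- the sparsity problem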

We also discussed homework 1 (see post on the class group).

Friday, March 3, 2017

Lecture 2: intro (2); morphological analysis

We introduced words and morphemes. Before delving into morphology and morphological analysis, we introduced regular expressions as a powerful tool to deal with different forms of a word. We then introduced recent work on morphological analysis based on machine learning: unsupervised (Morfessor) and supervised (based on CRFs).
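A quick Python taste of regular expressions applied to word forms; the patterns are simplistic and purely illustrative.

    import re

    # One pattern matching several inflected forms of the same lemma.
    pattern = re.compile(r"\bwalk(s|ed|ing)?\b")
    text = "She walks to work; yesterday she walked, and she enjoys walking."
    print([m.group(0) for m in pattern.finditer(text)])
    # ['walks', 'walked', 'walking']

    # A naive suffix-stripping rule: a first step toward morphological analysis.
    print(re.sub(r"(ing|ed|s)\b", "", "walking"))  # 'walk'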


Saturday, February 25, 2017

Lecture 1: Introduction to NLP

We gave an introduction to the course and the field it is focused on, i.e., Natural Language Processing, with a focus on the Turing Test as a tool to understand whether "machines can think". We also discussed the pitfalls of the test, including Searle's Chinese Room argument.


Thursday, January 19, 2017

Ready, steady, go!

Welcome to the Sapienza NLP course blog! This year there will be important changes: first, projects will be lightweight for attending students; second, homework assignments will be part of the final project (in this respect, attending students will complete more than 50% of their projects before the end of the course); third, the class will be updated on the newest trends in neural networks; fourth, this year the (class) project will be... the development of an intelligent chatbot working on Telegram!
IMPORTANT: The 2017 class schedule is Fridays, 2:30pm-5:45pm. Please sign up for the NLP class!


Friday, May 27, 2016

Lecture 12: statistical machine translation

Introduction to Machine Translation. Rule-based vs. Statistical MT. Statistical MT: the noisy channel model. The language model and the translation model. The phrase-based translation model. Learning the translation model from training data. Phrase-translation tables. Parallel corpora. Extracting phrases from word alignments.
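To make the noisy channel concrete: the decoder seeks e* = argmax_e P(e) * P(f | e), the translation that balances language model fluency against translation model adequacy. A toy Python sketch, with all probabilities invented for illustration:

    import math

    # Hypothetical candidate translations of a foreign sentence f. In a real
    # system, "lm" comes from an n-gram language model and "tm" from
    # phrase-translation tables estimated on parallel corpora.
    candidates = {
        "the house is small": {"lm": 0.01,   "tm": 0.2},
        "house the small is": {"lm": 0.0001, "tm": 0.3},
        "the home is little": {"lm": 0.005,  "tm": 0.1},
    }

    def score(s):
        # Work in log space to avoid underflow: log P(e) + log P(f | e)
        return math.log(s["lm"]) + math.log(s["tm"])

    print(max(candidates, key=lambda e: score(candidates[e])))
    # 'the house is small': fluent AND adequate beats either alone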

IBM models for word alignment. Many-to-one and many-to-many alignments. IBM model 1 and the HMM alignment model. Training the alignment models: the Expectation Maximization (EM) algorithm. Symmetrizing alignments for phrase-based MT: symmetrizing by intersection; the growing heuristic. Calculating the phrase translation table. Decoding: stack decoding. Evaluation of MT systems. BLEU.
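Here is a compact sketch of EM training for IBM Model 1 on a three-pair toy corpus; a real implementation would add a NULL word on the English side and train on large parallel corpora.

    from collections import defaultdict

    corpus = [("das haus".split(), "the house".split()),
              ("das buch".split(), "the book".split()),
              ("ein buch".split(), "a book".split())]

    # Uniform initialization of the translation table t(f | e).
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(10):                       # EM iterations
        count = defaultdict(float)            # expected counts c(f, e)
        total = defaultdict(float)            # expected counts c(e)
        for fs, es in corpus:
            for f in fs:
                # E-step: spread f's alignment mass over candidate e's.
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-estimate t(f | e) from the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]

    print(round(t[("haus", "house")], 3))     # approaches 1.0 as EM iterates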

Saturday, May 21, 2016

Lecture 11: semantic parsing (2), AMR, research in Rome

Unsupervised semantic parsing, semi-supervised semantic parsing, Abstract Meaning Representation (AMR). NLP research in Rome.
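For reference, the AMR of the textbook example "The boy wants to go.", in the usual PENMAN notation (the re-used variable b encodes that the boy is also the one going):

    (w / want-01
       :ARG0 (b / boy)
       :ARG1 (g / go-01
                :ARG0 b))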


Friday, May 13, 2016

Lecture 10: semantic role labeling and semantic parsing

PropBank, FrameNet, semantic role labeling. Introduction to semantic parsing. Presentation of the projects.
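A quick illustration of PropBank-style role labels on an invented sentence (Arg0 = agent, Arg1 = patient/theme, ArgM-TMP = temporal modifier):

    [Arg0 The company] [REL bought] [Arg1 a new factory] [ArgM-TMP last year]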

Friday, May 6, 2016

Lecture 9: Neural Networks, word embeddings and deep learning

Motivation. The perceptron. Input encoding, sum and activation functions; objective function. Linearity of the perceptron. Neural networks. Training. Backpropagation. Connection to Maximum Entropy. Connection to language. Vector representations. NN for the bigram language model. Word2vec: CBOW and skip-gram. Word embeddings. Deep learning. Language modeling with NN. The big picture.
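A minimal Python perceptron, trained on the AND function: it shows the weighted sum plus step activation and the perceptron update rule. Because a single perceptron is a linear classifier, the same loop would never converge on XOR.

    # Perceptron: output = step(w . x + b)
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]   # AND

    w, b, lr = [0.0, 0.0], 0.0, 0.1

    def predict(x):
        s = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum
        return 1 if s > 0 else 0                       # step activation

    for _ in range(20):                                # training epochs
        for x, y in data:
            err = y - predict(x)                       # 0 if correct, else +/-1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err

    print([predict(x) for x, _ in data])               # [0, 0, 0, 1]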

Tuesday, May 3, 2016

Lecture 8: Entity Linking

Entity Linking. Main approaches. AIDA, TagMe, Wikifier, DBpedia Spotlight, Babelfy. The MASC annotated corpus. Demo of WSD and Entity Linking systems. Introduction to Neural Networks.