Lecture 1. Introduction

From the Wiki of the Faculty of Computer Science

Brief history of NLP

  • January 7, 1954 — Georgetown experiment. Russian to English machine translation;
  • 1957 — Noam Chomsky published "Syntactic Structures", laying the groundwork for generative grammar and the later idea of "universal grammar";
  • since 1961 — Brown Corpus;
  • the mid-1960s — ELIZA, a simulation of a psychotherapist;
  • 1975 — Vector Space Model by Salton;
  • up to the early 1980s — rule-based approaches;
  • after the early 1980s — machine learning, corpus linguistics;
  • 1998 — language modeling approach to information retrieval by Ponte and Croft;
  • since 1999 — probabilistic topic modeling (pLSI, LDA, etc.), building on the earlier LSI;
  • 1999 — "Foundations of Statistical Natural Language Processing" by Manning and Schütze;
  • 2009 — "Natural Language Processing with Python" by Bird, Klein, and Loper.

Major tasks of NLP

  • Machine translation
  • Text classification
    • Sentiment analysis
    • Spam filtering
    • Classification by topic or by genre
  • Text clustering
  • Named entity recognition
  • Question answering
  • Automatic summarization
  • Natural language generation
  • Speech recognition
  • Spell checking
  • User study design and evaluation


NLP techniques

  • The level of characters:
    • Word segmentation
    • Sentence breaking
  • The level of words — morphology:
    • Part of speech (POS) tagging
    • Word sense disambiguation
  • The level of sentences — syntax:
    • Parsing
  • The level of meaning — semantics:
    • Coreference resolution
    • Discourse analysis
    • Semantic role labeling
    • Synonymy detection
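
To make the character-level techniques in the list above more concrete, here is a minimal sketch of sentence breaking and word segmentation using only Python's standard `re` module. The function names `split_sentences` and `tokenize` are hypothetical and chosen for illustration; the rules are deliberately naive and are not the method used by any particular library.

```python
import re


def split_sentences(text):
    """Naive sentence breaking: split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings (e.g. when the input itself is empty).
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive word segmentation: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


if __name__ == "__main__":
    text = "NLP is fun. Is it hard? Sometimes!"
    for sentence in split_sentences(text):
        print(tokenize(sentence))
```

A splitter like this breaks on abbreviations such as "Dr." or "e.g.", and whitespace-based tokenization does not work for languages written without spaces between words. These are the kinds of cases that make even the lowest level of processing a nontrivial task.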

Main problems

About this course