Genashtim - eCornell

Topic Modeling With Unsupervised Machine Learning

COURSE ID: CIS574

Course Overview

Can a computer tell the difference between an article on “jaguar” the animal and “Jaguar” the car? It can if we teach it how. In this course, you will extract keywords and key phrases from a document, an essential step in text summarization. Part of what makes natural language processing (NLP) so powerful is that it processes text at scale, where a human would take far too long to read and process the same volume of documents. A classic use of NLP, then, is to summarize long documents, whether articles or books, to create a shorter, more readable abstract.

Extracting keywords or key phrases is a first step in this direction, and it is where you will start in this course. Once you have trained a computer to identify the most important words in a document, you then train it to identify the most important sentences. This is the second step in extracting information from a document to help create an abstract, and you will perform this step on larger text documents as well. Finally, you will calculate and interpret similarity metrics to measure how similar possibly related documents are to one another. The techniques you use throughout this course will prove useful in specific situations at work and beyond as you support your team or achieve your personal goals.
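
To make two of these steps concrete, here is a minimal sketch of keyword scoring and document similarity. It is not taken from the course materials: it assumes TF-IDF weighting for keyword importance and cosine similarity as the similarity metric, and the function names (`tf_idf_vectors`, `top_keywords`, `cosine_similarity`) and the sample sentences are illustrative only. It uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_vectors(docs):
    """Return one {term: TF-IDF weight} dict per document (illustrative helper)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample documents: two about the animal, one about the car.
docs = [
    "The jaguar is a large cat that hunts in the rainforest.",
    "A jaguar stalks prey near rivers in the rainforest.",
    "The Jaguar sedan has a powerful engine and a leather interior.",
]
vectors = tf_idf_vectors(docs)
animal_vs_animal = cosine_similarity(vectors[0], vectors[1])
animal_vs_car = cosine_similarity(vectors[0], vectors[2])
```

With these sample sentences, the two animal documents score as more similar to each other than the first animal document does to the car document, because they share terms such as "rainforest" that the car document lacks. Calling `top_keywords(vectors[2])` lists the highest-weighted terms for the car document. A real pipeline would typically also remove stop words and handle much larger corpora.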

You are required to have completed the following courses or have equivalent experience before taking this course:

Natural Language Processing Fundamentals
Transforming Text Into Numeric Vectors
Classifying Documents With Supervised Machine Learning

S$1,000
