Genashtim - eCornell

Alternative Approaches to Text Data Analysis for Investment

COURSE ID: JCB664

Course Overview

The Latent Dirichlet Allocation (LDA) algorithm is a powerful tool for text data analysis. Like any tool, however, it has limitations that you need to understand before applying it in real-world scenarios. It's therefore worth examining other algorithms and comparing their performance and applications so you can choose the best-fitting method for your NLP projects. Enter the Doc2Vec algorithm, another frequently used tool for text data analysis. Instead of generating topics based on word frequency, it creates numerical vectors that capture the context of words and their relationship to documents. Despite its own limitations, Doc2Vec has strengths that are directly relevant to the construction and management of investment portfolios.

In this course, we'll explore the Doc2Vec algorithm as an alternative approach to text data analysis. You'll replicate many of the same general operations you performed in previous courses with the LDA algorithm. Your journey will involve training and evaluating an initial Doc2Vec model, then crafting your own custom vectors to build lists of comparable companies relevant to specific investment themes.
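
To give a feel for what "building lists of comparable companies" from document vectors can look like, here is a minimal sketch in plain Python. It is not the course's own code: the company names, the three-dimensional vectors, and the helper functions `cosine_similarity` and `rank_comparables` are all hypothetical, and in practice the vectors would come from a trained Doc2Vec model rather than being typed in by hand. The sketch simply ranks companies by how closely their vectors align with a custom "theme" vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_comparables(theme_vector, company_vectors, top_n=3):
    """Return the top_n (name, score) pairs most similar to the theme vector."""
    scored = [
        (name, cosine_similarity(theme_vector, vec))
        for name, vec in company_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical document vectors; a real model would typically use far more dimensions.
company_vectors = {
    "SolarCo": [0.9, 0.1, 0.0],
    "WindCo": [0.8, 0.2, 0.1],
    "BankCo": [0.0, 0.1, 0.9],
    "RetailCo": [0.1, 0.9, 0.2],
}

# Hypothetical custom vector standing in for an investment theme.
theme_vector = [1.0, 0.0, 0.0]

top = rank_comparables(theme_vector, company_vectors, top_n=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Cosine similarity is used here because it compares the direction of two vectors rather than their length, which is a common choice for comparing document embeddings; other similarity measures would slot into the same ranking function.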

As we delve into the course, you'll introduce additional algorithms as part of your analysis. You'll explore different ways to customize and visualize results, comparing them against an industry standard and real-world investment portfolios. By the end of this course, you will have gained a deep understanding of multiple NLP algorithms, their strengths and weaknesses, and how to make an informed choice for your specific needs in the financial markets.

The following courses must be completed before taking this course:

Preparing Data for Natural Language Processing
Cleaning Text Data to Optimize Model Performance
Tuning your NLP Model for Market Relevance

S$700
