Visualization is one of the most simple and effective ways to find patterns in data. These patterns include: What is the general range and shape of the data set? Are there any clusters of observations? Which variables correlate with each other? Are there any obvious outliers?
As your data set grows in terms of the number of data points and variables, however, it becomes increasingly difficult to visualize all this information at once. At most, you can plot data points on a three-dimensional axis and add further distinctions of size, color, shape, and so on. Yet this can easily become too busy and difficult to read. How, then, do we find patterns in really big data sets?
In this course, you will explore several powerful and commonly utilized techniques for distilling patterns from data. You will implement each of these techniques using the free and open-source statistical programming language R with real-world data sets. The focus will be on making these methods accessible for you in your own work.
You are required to have completed the following courses or have equivalent experience before taking this course:
- Understanding Data Analytics