STAT 4690 - Fall 2019

General Information

This is the course website for STAT 4690: Applied Multivariate Analysis. This course aims to provide students with a broad overview of techniques used in multivariate statistical analysis, with an emphasis on Multivariate Linear Regression and Principal Component Analysis. By the end of the course, students will be able to:

  • make decisions on how and when to use the techniques discussed in class;
  • apply and assess multivariate methods on real data;
  • make sound statistical conclusions based on a multivariate analysis.

The course also aims to give students a working familiarity with the R statistical software.

The course outline can be downloaded here.

Prerequisites

Students should have a good working knowledge of statistical inference and linear algebra: STAT 3480 (005.348) (C); and a “C” or better in one of MATH 1220 (or the former MATH 2300 or MATH 2301) and MATH 2150 (or the former MATH 2720 or MATH 2721 or MATH 2750) or consent of instructor.

Textbook

Applied Multivariate Statistical Analysis (6th ed.) by R. A. Johnson and D. W. Wichern, Prentice Hall, 2007.

The textbook is not required but is strongly recommended. A hard copy will be available on course reserve.

Assessments

The assessments for this course include:

  • Two (2) assignments;
  • One (1) midterm test;
  • One (1) final project, which includes a written report and an oral presentation.
    • The guidelines for the term project can be found here.

In particular, there is no final exam.

Outline of Topics

The course is expected to cover the following topics:

  1. Aspects of multivariate analysis: handling multivariate data, graphical displays, statistical distance (Chapter 1)
  2. Matrix algebra and random vectors: eigenvalues and eigenvectors, positive definite matrices, mean vectors, covariance matrices and matrix decompositions (Chapter 2)
  3. Random samples: sample geometry, characterizing random samples (Chapter 3)
  4. Multivariate normal distribution: definition and properties, estimation and sampling distributions (Chapter 4)
  5. Inferences about a mean vector: Hotelling’s T² and likelihood ratio tests, confidence regions and multiple comparisons (Chapter 5)
  6. Multivariate linear regression: least squares estimation and inference (Chapter 7)
  7. Principal Component Analysis: interpretation and use of principal components (Chapter 8)
  8. Factor Analysis: orthogonal factor model, estimation and inference (Chapter 9)
  9. Canonical Correlation Analysis: canonical variables and canonical correlations (Chapter 10)
  10. Kernel methods and Manifold Learning (if time permits)

Statistical Software

The course requires you to make extensive use of the R statistical software for your assignments and final data project. Sample code will be provided to students.

You can download R for free (for Windows, Mac, Linux, and Solaris) from the Comprehensive R Archive Network at: https://cran.r-project.org/

For additional resources on R, see here.