1. Name of Course: Multivariate Statistics

2. Lecturer: Marianna Bolla

3. No. of Credits: 3, no. of ECTS credits: 6

4. Semester or Time Period of the course (give dates if the course does
not extend over the normal designated semester period): January 7, 2009 – March
25, 2009

5. Any other required elements of the department (e.g., is this course
the co-requisite of another course? Is this course a pre-requisite? Etc.):

Pre-requisite: Probability and Statistics

6. Course Level – (for those programmes that have 2-year MAs, the
programmes may be divided into levels – i.e., level 1 credits in the first year
etc.; or MA and PhD; or PhD only year 1 etc.)?

MA and PhD

7. Brief introduction to the course outlining its primary theme,
objective and briefly the place of the course in the overall programme of
study.

The course builds on the Probability and Statistics course and generalizes the concepts studied there to
multivariate observations and multidimensional parameter spaces. Students will
be introduced to basic models of multivariate analysis with applications. We
also aim to develop skills for working with real-world data.

8. The goals of the course – these may be expressed within a narrative
(a more detailed explanation of the course) **as long as the goals are clear
to the students**.

The first part of the course gives an introduction to the multivariate
normal distribution and deals with spectral techniques to reveal the covariance structure
of the data. In the second part, dimension reduction methods will be introduced
(factor analysis and canonical correlation analysis) together with linear
models, regression analysis and analysis of variance. In the third part, students
will learn classification and clustering methods to establish connections between
the observations. Finally, algorithmic models are introduced for large data
sets. Applications are also discussed, mainly on a theoretical basis, and
students will learn to use statistical program packages.

9. The learning outcomes of the course – these are the more specific
achievements of the students as they leave the course. They should be related
to the course goals (i.e., be designed so that fulfilling the learning outcomes
means fulfilling to a large degree the goals) and should be assessable.

Students will be able to identify multivariate statistical models, analyze
the results and draw further inferences from them. Students will gain familiarity
with basic methods of dimension reduction and classification (applied to scale,
ordinal or nominal data). They will become familiar with applications to real-world
data sets, and will be able to choose the most suitable method for given
real-life problems.

10. More detailed display of contents. This may include a longer
narrative explaining the intellectual foundations of the course, for instance,
but must include a week by week breakdown. We should note here that the
contents could be changed at a later date in discussion with the relevant
students. A course is often seen as a contract between students and teacher,
but the contract can be modified if both parties agree (!).

Week by week breakdown:

- Multivariate normal distribution, conditional distributions, multiple and partial correlations.
- The Wishart distribution and distribution of eigenvalues of sample covariance matrices.
- Multidimensional Central Limit Theorem. Multinomial sampling and the chi-square test.
- Parameter estimation and Fisher information matrix.
- Likelihood ratio tests and testing hypotheses about the mean. Hotelling’s T-square distribution.
- Multivariate statistical methods for reduction of dimensionality: principal components and factor analysis, canonical correlation analysis.
- Theory of least squares. Multivariate regression, Gauss-Markov theory.
- Fisher-Cochran Theorem. Analysis of variance.
- Classification and clustering. Discriminant analysis, k-means and hierarchical clustering methods.
- Factoring and classifying categorical data. Contingency tables, correspondence analysis.
- Algorithmic models: EM-algorithm for missing data, ACE-algorithm for generalized regression, Kaplan-Meier algorithm for censored data.
- Resampling methods: jackknife and bootstrap. Statistical graph theory.

Literature:

R.A. Johnson, G.K. Bhattacharyya, Statistics. Principles and Methods. Wiley.

C.R. Rao, Linear statistical inference and its applications. Wiley.

K.V. Mardia, J.T. Kent, M. Bibby, Multivariate analysis. Academic Press.

Handouts: ANOVA tables and outputs of the BMDP program package, obtained while
processing real-world data.

11. Assessment: Types of assessment should be clearly expressed with a
short description and with the percentage breakdowns for the course overall. In
the interests of good overall approaches to assessment, it is suggested that
individual assessment plans within the same semester and same department be
cross-referenced.

For a passing grade, a student must correctly solve at least 50% of the homework
exercises (there will be 4 homework assignments, each containing 5 exercises)
and 50% of the test exercises (there will be 2 tests, each containing 10
exercises). Grading as a function of the points collected (maximum 40):

A : 36-40

A-: 33-35

B+: 30-32

B : 26-29

B-: 23-25

C+: 20-22

However, the final grade can be contested by taking an oral exam at the end of the semester.
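
As an illustration only, the passing condition and the grading scale can be written as a short Python sketch. It assumes that each correctly solved exercise is worth one point (20 homework points plus 20 test points), which the 40-point maximum suggests but the text does not state. The function name letter_grade is hypothetical, and the oral exam option is not modeled.

```python
from typing import Optional

# Grade bands from the scale above: (lowest total, letter), highest band first.
GRADE_BANDS = [(36, "A"), (33, "A-"), (30, "B+"), (26, "B"), (23, "B-"), (20, "C+")]

HOMEWORK_MAX = 4 * 5  # 4 assignments x 5 exercises
TEST_MAX = 2 * 10     # 2 tests x 10 exercises


def letter_grade(homework_points: int, test_points: int) -> Optional[str]:
    """Return the letter grade, or None if the passing condition is not met.

    Assumption: one point per correctly solved exercise.
    """
    if not 0 <= homework_points <= HOMEWORK_MAX:
        raise ValueError("homework_points out of range")
    if not 0 <= test_points <= TEST_MAX:
        raise ValueError("test_points out of range")
    # Passing requires at least 50% on homework and on tests separately.
    if 2 * homework_points < HOMEWORK_MAX or 2 * test_points < TEST_MAX:
        return None
    total = homework_points + test_points
    for lowest, letter in GRADE_BANDS:
        if total >= lowest:
            return letter
    return None  # not reached when both 50% thresholds are met


print(letter_grade(18, 16))  # total 34 -> A-
```

Note that meeting both 50% thresholds already gives at least 20 points, so every passing student falls within the listed scale.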

12. Such further items as assessment deadlines, office hours, contact
details etc. are at the discretion of the department or the individual.

Office hours: after the classes in office 301.

Schedule of the tests: 6th and 11th weeks.

Contact details: Marianna Bolla, Budapest University of Technology and Economics

Office phone: 06-1-4631111, ext. 5902.

E-mail: marib@math.bme.hu

Homepage: www.math.bme.hu/~marib/ceu