1. Name of Course:  Multivariate Statistics


2. Lecturer: Marianna Bolla


3. No. of Credits: 3,  no. of ECTS credits: 6


4. Semester or Time Period of the course (give dates if the course does not extend over the normal designated semester period): January 7, 2009 – March 25, 2009


5. Any other required elements of the department (e.g., is this course the co-requisite of another course? Is this course a pre-requisite? etc.):

Pre-requisite:  Probability and Statistics


6. Course Level – (for those programmes who have 2 year MAs, the programmes may be divided into levels – i.e., level 1 credits in the first year etc.; or MA and PhD; or PhD only year 1 etc.)

MA and PhD


7. Brief introduction to the course outlining its primary theme, objective and briefly the place of the course in the overall programme of study.


The course builds on the Probability and Statistics course and generalizes the concepts studied there to multivariate observations and multidimensional parameter spaces. Students will be introduced to basic models of multivariate analysis with applications. We also aim to develop skills for working with real-world data.


8. The goals of the course – these may be expressed within a narrative (a more detailed explanation of the course) as long as the goals are clear to the students.


The first part of the course gives an introduction to the multivariate normal distribution and deals with spectral techniques to reveal the covariance structure of the data. In the second part, dimension reduction methods are introduced (factor analysis and canonical correlation analysis) together with linear models, regression analysis and analysis of variance. In the third part, students learn classification and clustering methods to establish connections between the observations. Finally, algorithmic models are introduced for large data sets. Applications are also discussed, mainly on a theoretical basis, but students will also learn to use statistical program packages.


9. The learning outcomes of the course – these are the more specific achievements of the students as they leave the course. They should be related to the course goals (i.e., be designed so that fulfilling the learning outcomes means fulfilling to a large degree the goals) and should be assessable.


Students will be able to identify multivariate statistical models, analyze the results and draw further inferences from them. Students will gain familiarity with basic methods of dimension reduction and classification (applied to scale, ordinal or nominal data). They will become familiar with applications to real-world data sets, and will be able to choose the most suitable method for a given real-life problem.


10. More detailed display of contents. This may include a longer narrative explaining the intellectual foundations of the course, for instance, but must include a week by week breakdown. We should note here that the contents could be changed at a later date in discussion with the relevant students. A course is often seen as a contract between students and teacher, but the contract can be modified if both parties agree (!).


            Week by week breakdown:

  1. Multivariate normal distribution, conditional distributions, multiple and partial correlations.
  2. The Wishart distribution and distribution of eigenvalues of sample covariance matrices.
  3. Multidimensional Central Limit Theorem. Multinomial sampling and the chi-square test.
  4. Parameter estimation and Fisher information matrix.
  5. Likelihood ratio tests and testing hypotheses about the mean. Hotelling’s T-square distribution.
  6. Multivariate statistical methods for reduction of dimensionality: principal components and factor analysis, canonical correlation analysis.
  7. Theory of least squares. Multivariate regression, Gauss-Markov theory.
  8. Fisher-Cochran Theorem. Analysis of variance.
  9. Classification and clustering. Discriminant analysis, k-means and hierarchical clustering methods.
  10. Factoring and classifying categorical data. Contingency tables, correspondence analysis.
  11. Algorithmic models: EM-algorithm for missing data, ACE-algorithm for generalized regression, Kaplan-Meier algorithm for censored data.
  12. Resampling methods: jackknife and bootstrap. Statistical graph theory.




References:

R.A. Johnson, G.K. Bhattacharyya, Statistics: Principles and Methods. Wiley, New York, 1992.

C.R. Rao, Linear Statistical Inference and Its Applications. Wiley, New York, 1973.

K.V. Mardia, J.T. Kent, J.M. Bibby, Multivariate Analysis. Academic Press, New York, 1979.


Handouts: ANOVA tables and outputs of the BMDP program package obtained when processing real-world data.



11. Assessment: Types of assessment should be clearly expressed with a short description and with the percentage breakdowns for the course overall. In the interests of good overall approaches to assessment, it is suggested that individual assessment plans within the same semester and same department be cross-referenced.


To obtain a passing grade, a student must correctly solve at least 50% of the homework exercises (there will be 4 homework assignments, each containing 5 exercises) and at least 50% of the test exercises (there will be 2 tests, each containing 10 exercises). Grades are assigned according to the points collected (maximum 40):

  A : 36-40

  A-: 33-35

  B+: 30-32

  B : 26-29

  B-: 23-25

  C+: 20-22

However, the final grade can be contested by taking an oral exam at the end of the semester.
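
The grading rule above can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function name `letter_grade`, the constants, and the assumption that each correctly solved exercise is worth one point (which gives the stated maximum of 40) are ours. Results of a contested grade after an oral exam are not modeled.

```python
from typing import Optional

# Grade bands copied from the scale above: (minimum total points, letter grade).
GRADE_BANDS = [
    (36, "A"),
    (33, "A-"),
    (30, "B+"),
    (26, "B"),
    (23, "B-"),
    (20, "C+"),
]

HOMEWORK_MAX = 4 * 5  # 4 homework assignments x 5 exercises
TEST_MAX = 2 * 10     # 2 tests x 10 exercises


def letter_grade(homework_points: int, test_points: int) -> Optional[str]:
    """Return the letter grade, or None if the passing condition is not met.

    Assumption: one point per correctly solved exercise.
    """
    if not (0 <= homework_points <= HOMEWORK_MAX and 0 <= test_points <= TEST_MAX):
        raise ValueError("points out of range")
    # Passing requires at least 50% on homework and on tests separately.
    if 2 * homework_points < HOMEWORK_MAX or 2 * test_points < TEST_MAX:
        return None
    total = homework_points + test_points
    for minimum, grade in GRADE_BANDS:
        if total >= minimum:
            return grade
    return None  # not reached when the passing condition holds; kept for safety
```

For example, under these assumptions `letter_grade(15, 12)` corresponds to 27 points and falls in the B band, while `letter_grade(9, 20)` returns `None` because fewer than half of the homework exercises were solved.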




12. Such further items as assessment deadlines, office hours, contact details etc. are at the discretion of the department or the individual.

Office hours: after classes, in office 301.


Schedule of the tests: 6th and 11th weeks.

Contact details: Marianna Bolla, Budapest University of Technology and Economics,

                          Institute of Mathematics, 1111. Budapest, Egry József u. 1. Bldg. H5.2

                          Office phone: 06-1-4631111, ext. 5902.

                          E-mail: marib@math.bme.hu

                          Homepage: www.math.bme.hu/~marib/ceu