Notifications
The next live Zoom session will take place on Thursday, June 3, at 9:00. We will discuss the exam, the project, etc. No recording will be available.
Project assignment has been published.
Exam dates for the oral part will be entered in the SIS on request. If no exam date in the SIS is convenient for you, contact me and give a range of dates when you will be ready to take the exam. Project evaluation can be done separately from the oral part (before or after) and will not be scheduled in the SIS. Opportunities to take either part of the exam will be offered in June, July, and September.
Final Project
The assignment explains all that is needed. If you feel you need to know more, ask questions by mail. The report from the final project is due two working days before the date of project evaluation (see the requirements for the exam below).
The dataset is available in the SIS (you need to sign in, it can be accessed by enrolled students only).
Course Materials
 Extended course notes for distance learning (last updated May 23, 2021)

Access to video recordings
(first step: ANYTHING@cuni.cz, second step: regular SIS login)
Progress of lectures

Monday March 1.
Review of linear regression.
Extended course notes, Chap. 1, pp. 4-9.
Video recording:
 lecture_01_210301.mp4: Review of linear regression (43 min.)

Thursday March 4.
Exponential family of distributions.
Extended course notes, Sec. 2.1, pp. 10-16.
Video recording:
 lecture_02_210304.mp4: Exponential family of distributions (49 min.)

Thursday March 4.
Generalized linear model: definition.
Extended course notes, Sec. 2.2, pp. 16-20.
Video recording:
 lecture_03_210304.mp4: Generalized linear model: definition (30 min.)

Thursday March 11.
MLE in the GLM: likelihood, score statistic.
Extended course notes, Sec. 2.2-2.3, pp. 20-24.
Video recording:
 lecture_04_210311.mp4: MLE in the GLM: likelihood, score statistic (33 min.)

Thursday March 11.
MLE in the GLM: information matrix. Iterative weighted least squares.
Extended course notes, Sec. 2.3-2.4, pp. 24-27.
Video recording:
 lecture_05_210311.mp4: MLE in the GLM: information matrix, iterative weighted least squares (36 min.)

Thursday March 18.
Estimation of the dispersion parameter. Deviance.
Extended course notes, Sec. 2.5-2.6, pp. 27-30.
Video recording:
 lecture_06_210318.mp4: Estimation of the dispersion parameter. Deviance (25 min.)

Monday March 22.
Asymptotics for the GLM.
Extended course notes, Sec. 2.7, pp. 30-35.
Video recording:
 lecture_07_210322.mp4: Asymptotics for the GLM (40 min.)

Monday March 22.
Diagnostic methods for the GLM. Model-building principles.
Extended course notes, Sec. 2.8, 2.9, pp. 35-39.
Video recording:
 lecture_08_210322.mp4: Diagnostic methods for the GLM, model-building principles (42 min.)

Thursday April 1.
Binary data, alternative vs. binomial distribution of the response. Link functions for binary data.
Extended course notes, Sec. 3.1.1-3.1.4, pp. 40-44.
Video recording:
 lecture_09_210401.mp4: Binary data (30 min.)

Thursday April 1.
Logistic regression.
Extended course notes, Sec. 3.1.5, pp. 45-50.
Video recording:
 lecture_10_210401.mp4: Logistic regression (48 min.)

Thursday April 8.
Loglinear models for Poisson count data. Modelling Poisson process intensity.
Extended course notes, Sec. 3.2, pp. 51-56.
Video recording:
 lecture_11_210408.mp4: Poisson regression models (43 min.)

Thursday April 8.
Loglinear models for contingency tables: introduction.
Extended course notes, Sec. 3.3.1, 3.3.2, pp. 56-62.
Video recording:
 lecture_12_210408.mp4: Loglinear models for contingency tables (48 min.)

Thursday April 15.
Loglinear models for two-way tables.
Extended course notes, Sec. 3.3.3, 3.3.4, pp. 61-66.
Video recording:
 lecture_13_210415.mp4: Loglinear models for two-way tables (37 min.)

Thursday April 15.
Loglinear models for three-way tables: first part.
Extended course notes, Sec. 3.3.5, pp. 66-74.
Video recording:
 lecture_14_210415.mp4: Marginal and conditional associations. Simpson's paradox. Confounding and causality (53 min.)

Thursday April 22.
Interpretation of loglinear models for three-way tables.
Extended course notes, Sec. 3.3.5, pp. 74-82.
Video recording:
 lecture_15_210422.mp4: Interpretation of loglinear models for three-way tables (44 min.)

Thursday April 22.
Loglinear models for multi-way tables. Equivalence between loglinear and logistic models. Overdispersion in binary data: beta-binomial distribution.
Extended course notes, Sec. 3.3.6, 3.3.7, 4.1.1, pp. 82-88.
Video recording:
 lecture_16_210422.mp4: Loglinear models for multi-way tables. Equivalence between loglinear and logistic models. Overdispersion in binary data: beta-binomial distribution (47 min.)

Thursday April 29.
Overdispersion in count data: Poisson-gamma distribution. Quasi-likelihood methods.
Extended course notes, Sec. 4.1.2-4.1.4, pp. 89-94.
Video recording:
 lecture_17_210429.mp4: Overdispersion in count data: Poisson-gamma distribution. Quasi-likelihood methods (45 min.)

Thursday April 29.
Maximum likelihood estimation under invalid models. Sandwich estimation in the GLM.
Extended course notes, Sec. 4.2, pp. 94-99.
Video recording:
 lecture_18_210429.mp4: Maximum likelihood estimation under invalid models. Sandwich estimation in the GLM (30 min.)

Thursday May 6.
Group-dependent data. Generalized Estimating Equations (GEE).
Extended course notes, Sec. 5.1-5.3, pp. 101-106.
Video recording:
 lecture_19_210506.mp4: Group-dependent data. Generalized Estimating Equations (GEE) (47 min.)

Thursday May 6.
Linear mixed effects models: introduction. One-way ANOVA with fixed and random effects.
Extended course notes, Sec. 6.1.1-6.1.2, pp. 107-114.
Video recording:
 lecture_20_210506.mp4: One-way ANOVA with fixed and random effects (35 min.)

Thursday May 13.
Two-way ANOVA with random effects. Random intercept and slope. Definition of Linear Mixed Effects Model.
Extended course notes, Sec. 6.1.3, 6.1.4, 6.2.1, 6.2.2, pp. 112-118.
Video recording:
 lecture_21_210513.mp4: Two-way ANOVA with random effects. Random intercept and slope. Definition of Linear Mixed Effects Model (50 min.)

Thursday May 13.
Marginal likelihood. Henderson's equations.
Extended course notes, Sec. 6.3.1, 6.3.2, pp. 119-123.
Video recording:
 lecture_22_210513.mp4: Marginal likelihood. Henderson's equations (36 min.)

Thursday May 20.
Estimation of variance parameters: maximum likelihood, REML.
Extended course notes, Sec. 6.3.3-6.3.5, pp. 124-130.
Video recording:
 lecture_23_210520.mp4: Estimation of variance parameters: maximum likelihood, REML (43 min.)

Thursday May 20.
Hypothesis testing. Confidence intervals.
Extended course notes, Sec. 6.4.1-6.4.3, pp. 130-135.
Video recording:
 lecture_24_210520.mp4: Hypothesis testing. Confidence intervals (36 min.)

Thursday May 27.
Extended linear mixed effects model. Comparison of LME and GEE.
Extended course notes, Sec. 6.5, 6.6, pp. 135-138.
Video recording:
 lecture_25_210527.mp4: Extended linear mixed effects model. Comparison of LME and GEE (35 min.)

Thursday May 27.
Generalized linear mixed models.
Extended course notes, Chap. 7, pp. 139-143.
Video recording:
 lecture_26_210527.mp4: Generalized linear mixed models (40 min.)
Schedule
The schedule is of little relevance in times of distance teaching...
Lectures
Monday  9:00-10:30  K4
Thursday  9:00-10:30  K3
Exercise Class
Thursday  14:00-15:30  K11  Instructor: Arnošt Komárek
Supplementary Course Materials
 Brief course notes from last year.

Summary of maximum likelihood estimation theory (pdf)
This is a useful brief summary of maximum likelihood theory. Enrolled students are assumed to know these results, which will be used throughout the semester.

J.C. Pinheiro & D.M. Bates. Mixed-Effects Models in S and S-PLUS.
Springer, New York, 2000.
A good reference on fitting mixed effects models in R (and S-PLUS).

P.J. Diggle, K.Y. Liang & S.L. Zeger. Analysis of Longitudinal Data.
Oxford University Press, Oxford, 1994.
Another useful book on GEE, linear mixed models and GLMM.
Course Plan
The course covers methods for regression analysis of data that belong to one or more of the following categories:
 do not follow the normal distribution
 violate the assumption of equal variance
 violate the assumption of independence
We will learn some of the common statistical methods that allow fitting regression models to such data.
The lectures focus on the development, theoretical justification, and interpretation of these methods.
The exercise classes will teach how to apply these methods to real problems but may include some theoretical tasks as well. A new assignment will be given about every 2 weeks.
The course will be concluded by a written data analysis project.
Prerequisites
This course assumes mid-level knowledge of linear regression theory and applications. Master's students of "Probability, statistics and econometrics" must have completed the course on Linear Regression (NMSA407) before enrolling here.
Requirements for Credit/Exam
Credit:
The credit for the exercise class will be awarded to the student who hands in a satisfactory solution to each assignment by the prescribed deadline.
Exam:
The exam has two parts:
 Evaluation of project report (has the assignment been completed in all aspects without major errors?)
 Oral part: focuses on the ability to propose an acceptable model for a particular practical problem and to demonstrate understanding of the theory underlying the chosen model (incl. derivations and proofs).
To pass the exam, both parts need to be passed.