Course in generalized linear modeling with biological applications - Spring 2006

This course is given in collaboration with the DINA research school. The course is accepted as a PhD course (9 ECTS points) at KVL.
This page was last updated: May 29, 2006

News

The course starts on Monday, 3 April 2006, at 10:00.
Participants from outside Foulum must collect a guest card at the reception.
Place:
The course will be held in Foulum.
Schedule:
The course will consist of 4 blocks: the first, third and fourth blocks last 3 days each, and the second block lasts 2 days. The dates are
3 - 5 April; 19 - 20 April; 1 - 3 May; 15 - 17 May
Each day the course will start around 9 am and end at 4 pm (the exact details will be announced later).
Accommodation:
The course is arranged in blocks of 2 to 3 days to facilitate participation from other DIAS centres: participants will not have to spend too much time on transportation, and the only additional expense is a few nights in the Foulum area. Accommodation is available at Nørresøkollegiet in Viborg, see http://www.nkvib.dk/. If participants come from far away, we can postpone the start to 10 am on the first day of a block.

Registration

Please register by March 17, 2006. To sign up, send an e-mail to Ulrich Halekoh (e-mail: ulrich.halekoh(a)agrsci.dk).

Course description

The fundamental focus in many experiments and studies is on relating a response variable to one or several explanatory variables. A traditional way of accomplishing this is through a multiple linear regression model (technically speaking, analysis of variance is also a multiple linear regression).
Practical experience with regression and analysis of variance often turns up situations where the model assumptions are questionable. The data might not be normally distributed, for example because they are counts (0, 1, 2, 3, ...) or binary (sick/not sick, yes/no). It is also not uncommon that the variance of the response variable grows with the expected value, or that the response depends on the explanatory variables in a nonlinear way. Starting from real data examples, the course shows how generalized linear models (GLM) are used to handle such data. It also describes how to analyze such data when they are correlated, e.g. because the measurements are made on the same experimental unit; this is done using generalized estimating equations (GEE). Finally, the course gives a brief introduction to the analysis of censored data.
The course is planned so that practice and theory go hand in hand. The starting point for all topics will be practical examples taken primarily, but not exclusively, from the biological sciences. The necessary statistical theory is then added as needed to solve the practical problems.
Topics: Linear normal models, logistic regression, analysis of count data, analysis of data with non-constant variance (in particular data with constant coefficient of variation), nonlinear relations between data and explanatory variables, growth curve models, analysis of correlated data (generalized mixed models, generalized estimating equations), the model concept, statistical inference, model control.
The computer labs will use the R program. An introduction to R will be given on the first two days of the course. Nevertheless, participants are strongly recommended to download, install and start playing around with R before the course starts.
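For participants who want to try R before the course starts, the kind of models described above can be explored on simulated data. The following is only a minimal sketch and not course material: the variable names (dose, counts, sick) and all simulation parameters are assumptions made up for illustration. The functions glm(), poisson() and binomial() are part of the stats package that ships with R.

```r
# Simulated data only; names and parameter values are illustrative assumptions.
set.seed(1)
n <- 200
dose <- runif(n, min = 0, max = 2)   # a single explanatory variable

# Count response (0, 1, 2, ...): Poisson regression with log link
counts <- rpois(n, lambda = exp(0.5 + 0.8 * dose))
fit_pois <- glm(counts ~ dose, family = poisson(link = "log"))

# Binary response (sick / not sick): logistic regression with logit link
sick <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * dose))
fit_logit <- glm(sick ~ dose, family = binomial(link = "logit"))

# Estimated coefficients with standard errors, on the link scale
print(coef(summary(fit_pois)))
print(coef(summary(fit_logit)))
```

Because the data are simulated from known coefficients, the estimates can be compared with the values used in the simulation; with real data the model assumptions would of course have to be checked.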

Prerequisites

Working knowledge of basic mathematical and statistical tools and concepts: solving a simple equation, logarithmic and exponential functions, probability distribution, random variable, mean, variance, normal distribution, confidence interval, linear regression, analysis of variance, hypothesis testing. If you are uncertain whether you meet these requirements, please contact the teachers.
It may be advisable to brush up your statistical skills before the start of the course. We suggest consulting e.g.

Additional information

Language:
The course language will be English.
On the web:
The course homepage is http://genetics.agrsci.dk/biometry/courses/phd06
Homepage of the previous course in 2005
Form:
The course will consist of a mixture of lectures, exercises, and computer practicals.
Credit:
The course is approved as a PhD course at RVAU (KVL) with 9 ECTS points.
Workload:
To complete this course you should expect to put about 7 weeks of full-time work into it.
Compulsory homework:
A very important part of the course is the take-home assignments. These are larger assignments which must be handed in and approved. Participants can only attend the exam if the take-home assignments have been approved.
Exam:
A project must be completed at the end of the course. The final (oral) exam is based on that project. A participant can only attend the exam if the take-home assignments have been approved.
Price:
The course is free for PhD students, for other students who are affiliated with DIAS, and for DIAS employees. Participants from outside DIAS will have to pay for participation.
Lecturers:

Course program and course material

The data sets used in the course are installed in R by executing the following command in R:
install.packages("dataRep",repos="http://gbi.agrsci.dk/biometry/software/r/packages")

In the software folder you can find some additional software used in the course.
  1. Day
  2. Day
  3. Day
  4. Day
  5. Day
  6. Day
  7. Day
  8. Day
  9. Day
  10. Day
  11. Final exam
Homework:

Literature

In addition we suggest consulting:

Useful Links



