Part A Statistical Programming

The aim of the course is to introduce students to carrying out statistical analysis on a computer, and to the theory of statistical programming and its related techniques. The course is based on the R programming language, which is now widely used in all branches of applied statistics.

Lectures and practical sessions

There will be 6 classes, each made up of a lecture followed by a computer practical session. All lectures and all practical sessions are an essential part of the course. If a practical sheet is not finished during the practical session, it should be completed outside the session. See below for details of how to install R on your own computer.

Introduction week1.pdf week1prac.pdf

Working with data week2.pdf week2prac.pdf

Programming I week3.pdf week3prac.pdf

Programming II week4.pdf week4prac.pdf

Solving Equations and Optimization week5.pdf week5prac.pdf

Simulation week6.pdf week6prac.pdf

Datasets

The following datasets are used in the practical sessions:

hellung.txt

cystfibr.txt

juul.txt

AIDS.txt

speed.txt

Written exercise sheets and classes

As the exam for this course is a written exam, it is important to practise answering questions without using a computer.

For this reason there will be two written exercise sheets with a class for each one. The details are below.

Written exercise sheet    Deadline                      Class times
Week 7: Sheet1.pdf        Tues 8 March, by 5pm, SPR1    Thur 10 March, 2-3pm, 3-4pm
Week 8: Sheet2.pdf        Tues 10 May, by 5pm, SPR1     Thur 12 May, 2-3pm, 3-4pm

Revision session

There will be a revision session in Week 7, on Thur 16 June, 2-3pm, in SPR1.104.

Examination

This course will be examined in the same way as all other Part A courses. Since this course is equivalent to an 8-hour lecture course, there will be one question on paper AS1 and one question on paper AS2.

Programming with R

The course will teach you how to write programs using the statistical computing environment called R.

If you have your own computer, you are strongly encouraged to download and install the software (it’s free) so that you can work on the problem sheets and complete any unfinished practicals.

The software can be downloaded from the R project website: http://www.r-project.org/

Books

In addition to the extensive documentation and help system included in R, there are two main books that we recommend.

‘Introductory Statistics with R’ by Peter Dalgaard, ISBN 0-387-95475-9

– a very good introduction to R that includes many biostatistical examples and covers most of the basic statistics covered in this course.

‘Modern Applied Statistics with S’ by Bill Venables and Brian Ripley, ISBN 0-387-95457-0

– a comprehensive text that details the S-PLUS and R implementation of many statistical methods using real datasets.