Predictive analytics has a well-established tradition in medicine. Developing better prediction models is an important step in improving health care: we need these tools to guide our decisions about preventive measures and individual treatments. To develop and use these models effectively, it is important that we understand them. This course will teach you how to create accurate prediction tools and how to evaluate their validity. We will first discuss predictive analytics in the context of prevention, diagnosis, and effectiveness. Next, we will discuss key concepts such as study design, sample size, and overfitting.
We then cover important modeling issues such as missing values, nonlinear relations, and model selection, as well as the bias/variance tradeoff and its role in prediction. We also discuss different ways to evaluate a model, including performance measures and the assessment of both internal and external validity, and we will talk about how to adapt a model to a particular setting. The course uses R to illustrate the concepts. You do not need to install R yourself: R and all the example datasets are accessible within the Coursera environment. We do provide references to additional packages that can be used for specific types of analyses; feel free to download and install them on your own computer. Each module may also include practice questions. These questions can be used to test your knowledge, and you will pass them regardless of whether you gave the correct answer. The best way to learn is to think about your answers first, then check them against the correct answers and explanations. This course is part of the master's program Population Health Management at Leiden University (currently under development).