MTHM503 - Applications of Data Science and Statistics (2023)

MODULE TITLE: Applications of Data Science and Statistics
CREDIT VALUE: 15
MODULE CODE: MTHM503
MODULE CONVENER: Dr Victoria Volodina (Coordinator)
DURATION (WEEKS): Term 1: 0 | Term 2: 11 | Term 3: 0
Number of Students Taking Module (anticipated): 15
DESCRIPTION - summary of the module content

This module will enable you to learn new Data Science and Statistical methods, and to use the techniques learnt in other modules, by working on analyses of real data examples. There will be a strong emphasis throughout on understanding the practical application of statistical and machine learning methods including clustering, data reduction, methods for handling missing data, study design and introductory methods for time series data. Theory and ideas will be developed to allow the implementation of methods in examples drawn from industry, medicine, finance, public health and environmental challenges, including climate change and air pollution.

Pre-requisites: None

AIMS - intentions of the module

The aim of this module is to practise the use of Data Science and Statistical modelling by working through a series of case studies. The case studies will be based on real-life problems and will start with a description of the setting of the problem and the intended outcomes. Understanding the background to a problem is an important part of any statistical analysis, so each case study will include a review of the field in which it is set. Analyses will start with raw data that will have to be sense-checked and manipulated into a form that is suitable for the intended analyses. Deciding on the exact form of the analysis in each case will be a central focus of this module. An important aim is to develop the skills needed to make these decisions, drawing on information from the setting, the exact nature of the problem being assessed and knowledge of the techniques and methods that are available. In each case study, the results of the chosen analyses will be interpreted, with particular attention given to the best way of communicating the results to a variety of technical and non-technical audiences.

Activities will include problem formulation, knowledge discovery, regression modelling, machine learning, report writing and presentation. Assessment will be based on an examination and practical analyses of real-world data.

INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)

On successful completion of this module you should be able to:

Module Specific Skills and Knowledge

1. Apply data science and statistical methods using real-world examples
2. Apply new techniques learnt through case studies to other datasets to answer questions in other applications
3. Implement machine learning and regression techniques using R/RStudio

Discipline Specific Skills and Knowledge

4. Select the most appropriate method(s) that should be used based on an understanding of the problem being addressed
5. Understand the potential issues associated with using data science and statistical methodology in real-world settings

Personal and Key Transferable / Employment Skills and Knowledge

7. Apply a range of data analysis skills to address real-world problems
8. Use R/RStudio and other software to manipulate and summarise data
9. Use learning resources effectively
10. Communicate the results of data analysis clearly and accurately, both in writing and verbally
11. Formulate real-world problems in a manner that enables statistical and data science methods to be used to answer questions

SYLLABUS PLAN - summary of the structure and academic content of the module

Data Science and Statistical modelling topics will be introduced through their application in a series of case studies. Case studies may change each year, but the initial selection will include:

·         Case study: modelling environmental hazards

·         Case study: clustering and segmentation of customers

·         Case study: forecasting electricity demands

·         Case study: modelling the effects of air pollution on health

·         Case study: mapping rates of disease

·         Case study: exploring physical activity data for health

·         Case study: using local sources of data to address local challenges

LEARNING AND TEACHING
LEARNING ACTIVITIES AND TEACHING METHODS (given in hours of study time)
Scheduled Learning & Teaching Activities: 36.00 | Guided Independent Study: 114.00 | Placement / Study Abroad: 0.00
DETAILS OF LEARNING ACTIVITIES AND TEACHING METHODS
Category | Hours of study time | Description
Scheduled learning and teaching | 24 | Lectures
Scheduled learning and teaching | 12 | Hands-on practical sessions
Guided Independent Study | 50 | Self study & background reading
Guided Independent Study | 64 | Assessed data analyses, report writing

ASSESSMENT
FORMATIVE ASSESSMENT - for feedback and development purposes; does not count towards module grade
Form of Assessment | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method
Feedback on unassessed data analysis examples (which will include report writing) | 24 | All | Oral

SUMMATIVE ASSESSMENT (% of credit)
Coursework: 80 | Written Exams: 20 | Practical Exams: 0
DETAILS OF SUMMATIVE ASSESSMENT
Form of Assessment | % of Credit | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method
Coursework – extended piece of data analysis involving data collection, analysis and reporting | 80 | Max 10 pages (plus appendixes) | All | Oral & Written
Class test | 20 | 1 hour | All | Oral & Written

DETAILS OF RE-ASSESSMENT (where required by referral or deferral)
Original Form of Assessment | Form of Re-assessment | ILOs Re-assessed | Time Scale for Re-assessment
Extended data analysis* | Extended data analysis (80%) | All | August Ref/Def Period
Class test* | Class test (20%) | All | August Ref/Def Period

*Please refer to the re-assessment notes for details on deferral vs. referral reassessment

RE-ASSESSMENT NOTES

Deferrals: Reassessment will be by coursework and/or exam in the deferred element only. For deferred candidates, the module mark will be uncapped. 

Referrals: Reassessment will be by a single piece of coursework worth 100% of the module only. As it is a referral, the mark will be capped at 50%. 

RESOURCES
INDICATIVE LEARNING RESOURCES - The following list is offered as an indication of the type & level of information that you are expected to consult. Further guidance will be provided by the Module Convener.

Reading list for this module:

Type | Author | Title | Edition | Publisher | Year | ISBN
Set | James, G., Witten, D., Hastie, T., Tibshirani, R. | An Introduction to Statistical Learning: with Applications in R | | Springer | 2013 | 978-1461471370
Set | Lantz, B. | Machine Learning with R: Expert Techniques for Predictive Modeling | 3rd | Packt | 2019 | 978-1788295864
CREDIT VALUE: 15 | ECTS VALUE: 7.5
PRE-REQUISITE MODULES: None
CO-REQUISITE MODULES: None
NQF LEVEL (FHEQ): 7 | AVAILABLE AS DISTANCE LEARNING: No
ORIGIN DATE: Friday 13 September 2019 | LAST REVISION DATE: Friday 09 December 2022
KEY WORDS SEARCH: None Defined