# Mathematics

## MTHM503 - Applications of Data Science and Statistics (2019)

MODULE TITLE: Applications of Data Science and Statistics
CREDIT VALUE: 15
MODULE CODE: MTHM503
Dorottya Fekete (Coordinator)

| DURATION | Term 1 | Term 2 | Term 3 |
|---|---|---|---|
| Weeks | 11 | 0 | 0 |

Number of Students Taking Module (anticipated): 15

DESCRIPTION - summary of the module content

This module will enable you to learn new Data Science and Statistical methods, and to use the techniques learnt in other modules, by working on analyses of real data examples. There will be a strong emphasis throughout on understanding the practical application of statistical and machine learning methods including clustering, data reduction, methods for handling missing data, study design and introductory methods for time series data. Theory and ideas will be developed to allow the implementation of methods in examples drawn from industry, medicine, finance, public health and environmental challenges, including climate change and air pollution.

Pre-requisites: None

AIMS - intentions of the module

The aim of this module is to practise the use of Data Science and Statistical modelling by working through a series of case studies. The case studies will be based on real-life problems and will start with a description of the setting of the problem and the intended outcomes. Understanding the background to a problem is an important part of any statistical analysis, so each case study will include a review of the field in which it is set. Analyses will start with raw data that will have to be sense-checked and manipulated into a form suitable for the intended analyses. Deciding on the exact form of the analyses in each case will be a central focus of this module. An important aim is to develop the skills to make these decisions, drawing on information from the setting, the exact nature of the problem being assessed and knowledge of the techniques and methods that are available. In each case study, the results of the chosen analyses will be interpreted, with particular attention given to the best way of communicating the results to a variety of technical and non-technical audiences.

Activities will include problem formulation, knowledge discovery, regression modelling, machine learning, and report writing and presentation. Assessment will be based on examination and practical examples using real-world data.

INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)

Module Specific Skills and Knowledge:

1. Apply data science and statistical methods using real-world examples
2. Apply new techniques learnt through case studies to other datasets to answer questions in other applications
3. Implement machine learning and regression techniques using R/RStudio

Discipline Specific Skills and Knowledge:

4. Select the most appropriate method(s) that should be used based on an understanding of the problem being addressed
5. Understand the potential issues associated with using data science and statistical methodology in real-world settings

Personal and Key Transferable/Employment Skills and Knowledge:

7. Apply a range of data analysis skills to address real-world problems
8. Use R/RStudio and other software to manipulate and summarise data
9. Use learning resources effectively
10. Communicate the results of data analysis clearly and accurately, both in writing and verbally
11. Formulate real-world problems in a manner that enables statistical and data science methods to be used to answer questions

SYLLABUS PLAN - summary of the structure and academic content of the module

Data Science and Statistical modelling topics will be introduced through their application in a series of case studies. Case studies may change each year, but the initial selection will include:

- Case study: modelling environmental hazards
- Case study: clustering and segmentation of customers
- Case study: forecasting electricity demands
- Case study: modelling the effects of air pollution on health
- Case study: mapping rates of disease
- Case study: exploring physical activity data for health
- Case study: using local sources of data to address local challenges

LEARNING AND TEACHING
LEARNING ACTIVITIES AND TEACHING METHODS (given in hours of study time)

| Scheduled Learning & Teaching Activities | Guided Independent Study | Placement / Study Abroad |
|---|---|---|
| 36 | 114 | 0 |

DETAILS OF LEARNING ACTIVITIES AND TEACHING METHODS

| Category | Hours of study time | Description |
|---|---|---|
| Scheduled learning and teaching | 24 | Lectures |
| Scheduled learning and teaching | 12 | Hands-on practical sessions |
| Guided Independent Study | 50 | Self study & background reading |
| Guided Independent Study | 64 | Assessed data analyses, report writing |

ASSESSMENT
FORMATIVE ASSESSMENT - for feedback and development purposes; does not count towards module grade

| Form of Assessment | Size of the assessment (e.g. duration/length) | ILOs assessed | Feedback method |
|---|---|---|---|
| Feedback on unassessed data analyses examples (which will include report writing) | 24 | All | Oral |

SUMMATIVE ASSESSMENT (% of credit)

| Coursework | Written Exams | Practical Exams |
|---|---|---|
| 60 | 40 | 0 |

DETAILS OF SUMMATIVE ASSESSMENT

| Form of Assessment | % of credit | Size of the assessment (e.g. duration/length) | ILOs assessed | Feedback method |
|---|---|---|---|---|
| Assessed data analyses and reports from practical sessions (selected ones from the weekly sessions) | 40 | 1.5 hours x 4 | All | Oral & Written |
| Coursework: extended piece of data analysis involving data collection, analysis and reporting | 40 | Max 10 pages (plus appendixes) | All | Oral & Written |
| Presentation on coursework | 20 | 20 mins | All | Oral & Written |

DETAILS OF RE-ASSESSMENT (where required by referral or deferral)
RE-ASSESSMENT NOTES

Deferral – if you miss an assessment for certificated reasons judged acceptable by the Mitigation Committee, you will normally either be deferred in the assessment or be granted an extension. The mark given for a re-assessment taken as a result of deferral will not be capped and will be treated as it would be if it were your first attempt at the assessment.

Referral – if you have failed the module overall (i.e. a final overall module mark of less than 50%) you will be required to re-take some or all parts of the assessment, as decided by the Module Convenor. The final mark given for a module where re-assessment was taken as a result of referral will be capped at 50%.

RESOURCES
INDICATIVE LEARNING RESOURCES - The following list is offered as an indication of the type & level of information that you are expected to consult. Further guidance will be provided by the Module Convener.

Reading list for this module:

| Type | Author | Title | Edition | Publisher | Year | ISBN |
|---|---|---|---|---|---|---|
| Set | James, G., Witten, D., Hastie, T., Tibshirani, R. | An Introduction to Statistical Learning: with Applications in R | | Springer | 2013 | 978-1461471370 |
| Set | Lantz, B. | Machine Learning with R: Expert Techniques for Predictive Modeling | 3rd | Packt | 2019 | 978-1788295864 |

CREDIT VALUE: 15
ECTS VALUE: 15
PRE-REQUISITE MODULES: None None
NQF LEVEL (FHEQ): 7.5
AVAILABLE AS DISTANCE LEARNING: No
Friday 13 September 2019 Friday 13 September 2019
KEY WORDS SEARCH: None Defined