COMM033DA - Data Engineering (2023)

MODULE TITLE: Data Engineering
CREDIT VALUE: 15
MODULE CODE: COMM033DA
MODULE CONVENER: Unknown
DURATION: TERM 1 2 3
DURATION: WEEKS 12
Number of Students Taking Module (anticipated)
DESCRIPTION - summary of the module content

 

The module equips you with knowledge of big data technologies for handling large-scale, varied, and real-time data. It focuses on data management, the properties of modern data storage solutions, and their relevance in the context of enterprise systems. You will learn how to create data pipelines for analytics and how to make informed platform choices when designing and implementing solutions in diverse data scenarios. The module also covers the identification and documentation of relevant data hierarchies or taxonomies. You will develop the skills and knowledge needed to become a proficient data engineer, capable of ensuring that data is organised and accessible for analysis and decision-making.

Pre-requisite modules: None.

Co-requisite modules: None.

This module is part of the MSc Digital and Technology Solutions (Integrated Degree Apprenticeship) programme. It cannot be taken as an elective by students on other programmes.

The apprenticeship standard and other documentation relating to the Level 7 Digital and Technology Solutions (Data Analyst Specialist) Apprenticeship can be found here: https://www.instituteforapprenticeships.org/apprenticeship-standards/digital-and-technology-solutions-specialist-integrated-degree/

AIMS - intentions of the module

This module covers the engineering aspects of data science and big data. Its main objective is to give you specialised knowledge and a critical understanding of the process of collecting, organising, storing and retrieving large-scale, real-time, complex datasets with the help of databases and data storage technologies such as Hadoop and Spark. It also aims to equip you with the practical knowledge needed to work with modern database management systems, including both relational and non-relational (NoSQL) databases. You will also learn to develop code in a programming language (Python) and to use its libraries for big data processing, for managing coordination, and for creating data pipelines for analytics that turn data into actionable decisions.

 

INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)

On successful completion of this module you should be able to:

Module Specific Skills and Knowledge

1. Explain Big Data technologies and the challenges to address in infrastructures, platforms and applications. 
2. Analyse the properties of different data storage solutions and their relevance in the context of enterprise systems. 
3. Explain the process of acquiring, storing, manipulating and managing large scale real-time complex datasets effectively and building pipelines for analytics. 
4. Evaluate the data platform choices available for designing and implementing data storage solutions in different data scenarios. 

Discipline Specific Skills and Knowledge

5. Understand database management and transaction processing systems
6. Demonstrate the foundation of Data Science and Big Data
7. Demonstrate proficiency at programming in languages such as Python for driving big data technologies 
8. Explain how relevant data hierarchies or taxonomies are identified and properly documented

Personal and Key Transferable / Employment Skills and Knowledge

9. Evaluate available data (including from clouds) to identify and select the business data for analytics in the context of enterprise systems
10. Synthesise data quality rule sets and guidelines for database designers
11. Communicate ideas and techniques fluently

 

SYLLABUS PLAN - summary of the structure and academic content of the module

Whilst the module’s precise content may vary from year to year, an example of an overall structure is as follows:

  • Big data, foundation, infrastructures and Platforms
  • Data representation, information modelling and mapping
  • Data storage solutions, relational and non-relational database systems
  • Transactions and their use in integrity and recovery management
  • Distributed systems, cloud computing and Data
  • Data orchestration, integration and ETL/ELT pipelines
  • CAP theorem and BASE database design principle
  • Business intelligence, data quality and data hierarchies in enterprise data management
  • Python for API programming of various big data platforms and applications  
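
To make the pipeline and transaction topics above concrete, the following is a minimal sketch of an extract-transform-load (ETL) step in Python, using only the standard library. The sample data, the table name `orders`, and all function names are hypothetical and chosen purely for illustration; real workloads would typically use a platform such as Spark and a production database rather than in-memory SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file, API, or stream.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,not_a_number
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply a simple data quality rule and normalise types."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # illustrative rule: drop rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn, records):
    """Load: insert the records inside one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)


def run_pipeline(text):
    """Run extract, transform and load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


conn = run_pipeline(RAW_CSV)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Using the connection as a context manager means the inserts are committed together or rolled back together, which is the integrity property the transactions topic refers to.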

 

LEARNING AND TEACHING
LEARNING ACTIVITIES AND TEACHING METHODS (given in hours of study time)
Scheduled Learning & Teaching Activities: 20.00
Guided Independent Study: 130.00
Placement / Study Abroad: 0.00
DETAILS OF LEARNING ACTIVITIES AND TEACHING METHODS
Category | Hours of study time | Description
Scheduled Learning and Teaching activities | 20 | Masterclasses & Webinars
Guided independent study | 6 | Asynchronous online classes
Guided independent study | 124 | Background readings, practice and preparation for the assessment. Application of knowledge in the workplace and demonstration of skills.

 

ASSESSMENT
FORMATIVE ASSESSMENT - for feedback and development purposes; does not count towards module grade
Form of Assessment | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method
Online tests | 1 hour | 1-7 | Verbal - online

 

SUMMATIVE ASSESSMENT (% of credit)
Coursework: 100
Written Exams: 0
Practical Exams: 0
DETAILS OF SUMMATIVE ASSESSMENT
Form of Assessment | % of Credit | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method
Portfolio of tasks completed and your reflections on these | 100 | 3500 words | 1-11 | Written feedback from academic tutor

 

DETAILS OF RE-ASSESSMENT (where required by referral or deferral)
Original Form of Assessment | Form of Re-assessment | ILOs Re-assessed | Time Scale for Re-assessment
Portfolio of tasks completed and your reflections on these (100%), 3500 words | Resubmission | 1-11 | Programme schedule dependent

 

RE-ASSESSMENT NOTES
RESOURCES
INDICATIVE LEARNING RESOURCES - The following list is offered as an indication of the type & level of information that you are expected to consult. Further guidance will be provided by the Module Convener.

Basic reading:

  • Soufian, M. (2014) Notes on STFC Big Data and Analytics Summer School 2014. Daresbury Laboratories, Warrington, UK.
  • Buyya, R., Calheiros, R.N., Dastjerdi, A.V. (2016) Big Data: Principles and Paradigms. Morgan Kaufmann.
  • Berman, J.J. (2018) Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information. 2nd ed. Academic Press.
  • Kleppmann, M. (2016) Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O'Reilly.
  • Ceder, N. (2018) The Quick Python Book. 3rd ed. Manning Publications Co.

 

 

Reading list for this module:

There are currently no reading list entries found for this module.

CREDIT VALUE 15 ECTS VALUE 7.5
PRE-REQUISITE MODULES None
CO-REQUISITE MODULES None
NQF LEVEL (FHEQ) 7 AVAILABLE AS DISTANCE LEARNING No
ORIGIN DATE Thursday 14 September 2023 LAST REVISION DATE Wednesday 06 March 2024
KEY WORDS SEARCH Data engineering