Data Engineering (Degree)

Build data pipelines and maintain the infrastructure required to process and store large datasets, with quizzes to reinforce each topic. Master data integration, data transformation, and data warehousing, and prepare to architect the infrastructure that supports data analytics.

Year 2

Big Data Ecosystems with Apache Spark

This course provides comprehensive training in Apache Spark, the leading unified analytics engine for large-scale data processing. Students learn to process petabytes of data across clustered computers using Spark's core APIs (RDD, DataFrame, Dataset) for batch processing and Spark Streaming for near-real-time analytics. Mastery of Spark is essential for Data …
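
As a taste of the DataFrame API named above, here is a minimal PySpark batch sketch; the file paths and column names are illustrative assumptions, not course materials.

# Minimal PySpark batch job (paths and columns are assumed for illustration)
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_sales_rollup").getOrCreate()

# Read raw records as a DataFrame (schema inferred for brevity)
sales = spark.read.csv("data/sales.csv", header=True, inferSchema=True)

# Aggregate with the DataFrame API rather than hand-written loops
daily_totals = (
    sales.groupBy("sale_date")
         .agg(F.sum("amount").alias("total_amount"))
)

# Write the result back out as Parquet for downstream consumers
daily_totals.write.mode("overwrite").parquet("output/daily_totals")

spark.stop()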

Year 3

Cloud Data Engineering on AWS and Azure

This course focuses on building and deploying data pipelines using cloud-native services from major providers like Amazon Web Services (AWS) and Microsoft Azure. Students gain hands-on experience with services such as AWS Glue, Azure Data Factory, Redshift, Synapse Analytics, and cloud object storage (S3, ADLS). As Zambian enterprises and government …
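
To illustrate one of the building blocks mentioned above (cloud object storage), the sketch below uses boto3 to land a local extract in S3; the bucket name and paths are hypothetical, and credentials are assumed to come from the environment.

# Minimal boto3 sketch: land a local extract in an S3 "landing zone"
import boto3

s3 = boto3.client("s3")

# Upload a local CSV into a raw landing prefix (bucket name is assumed)
s3.upload_file(
    Filename="exports/customers_2024-01-01.csv",
    Bucket="example-landing-zone",
    Key="raw/customers/2024-01-01/customers.csv",
)

# List what has landed so far under the prefix
response = s3.list_objects_v2(Bucket="example-landing-zone", Prefix="raw/customers/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])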

Year 4

Data Engineering Capstone: Production System Build

This culminating project course requires students to integrate all acquired knowledge to design, build, document, and present a fully functional, production-like data engineering system. Working with a real or simulated dataset from a Zambian context (e.g., utility data from ZESCO, mobile data from MTN), students will implement a complete pipeline—from …

Year 1

Data Engineering Fundamentals and Pipeline Design

This foundational course introduces the core principles, roles, and responsibilities of a Data Engineer within the modern data ecosystem. Students learn the data pipeline lifecycle—from ingestion and transformation to storage and serving—and compare batch versus streaming architectures. The course establishes the critical importance of reliable, scalable data infrastructure for Zambian …
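
The toy Python sketch below shows the ingestion, transformation, and storage stages of that lifecycle in miniature; the file names and fields are made up for illustration only.

# Toy batch pipeline: ingest -> transform -> store
import csv
import json

def ingest(path):
    # Ingestion: read raw rows from a source file
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transformation: keep only valid readings and normalise types
    return [
        {"meter_id": r["meter_id"], "kwh": float(r["kwh"])}
        for r in rows
        if r.get("kwh") not in (None, "")
    ]

def store(rows, path):
    # Storage/serving: persist the cleaned data for downstream use
    with open(path, "w") as f:
        json.dump(rows, f, indent=2)

if __name__ == "__main__":
    store(transform(ingest("readings.csv")), "readings_clean.json")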

Year 4

Data Governance, Quality, and Operations (DataOps)

This course addresses the non-technical pillars of successful data engineering: governance, quality assurance, and operational excellence (DataOps). Students learn to implement data lineage tracking, define and monitor data quality rules (completeness, validity, consistency), and establish CI/CD practices for data pipelines. In Zambia's regulated sectors like Banking and Telecommunications, or for …
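
As a small illustration of the data quality rules mentioned above, the Pandas sketch below returns a pass/fail flag per rule; the column names and rules are assumptions for illustration, not a prescribed standard.

# Simple completeness, validity, and consistency checks on a DataFrame
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> dict:
    """Return a pass/fail flag per rule so a pipeline can halt or alert."""
    return {
        # Completeness: every row has a customer identifier
        "customer_id_complete": bool(df["customer_id"].notna().all()),
        # Validity: transaction amounts are non-negative
        "amount_valid": bool((df["amount"] >= 0).all()),
        # Consistency: no transaction is dated in the future
        "date_consistent": bool(
            (pd.to_datetime(df["txn_date"]) <= pd.Timestamp.now()).all()
        ),
    }

if __name__ == "__main__":
    sample = pd.DataFrame({
        "customer_id": ["C1", "C2", None],
        "amount": [150.0, -20.0, 80.0],
        "txn_date": ["2024-01-05", "2024-01-06", "2024-01-07"],
    })
    print(run_quality_checks(sample))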

Year 2

Data Modeling and Warehousing Techniques

This course covers the systematic design of data storage systems for analytical processing, moving beyond transactional databases. Students master dimensional modeling concepts (star and snowflake schemas), learn to design fact and dimension tables, and apply data warehouse architecture patterns (Kimball, Inmon). This skill is fundamental for building the centralized reporting …
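
To make the fact/dimension distinction concrete, the sketch below defines one dimension and one fact table of a star schema with SQLAlchemy Core; the table and column names are illustrative, not a prescribed model.

# Star schema sketch: one dimension table and one fact table
from sqlalchemy import (
    MetaData, Table, Column, Integer, Numeric, String, Date, ForeignKey, create_engine
)

metadata = MetaData()

# Dimension table: descriptive attributes about customers
dim_customer = Table(
    "dim_customer", metadata,
    Column("customer_key", Integer, primary_key=True),
    Column("customer_name", String(100)),
    Column("province", String(50)),
)

# Fact table: one row per sale, with a foreign key to the dimension and numeric measures
fact_sales = Table(
    "fact_sales", metadata,
    Column("sale_id", Integer, primary_key=True),
    Column("customer_key", Integer, ForeignKey("dim_customer.customer_key")),
    Column("sale_date", Date),
    Column("amount", Numeric(12, 2)),
)

# Create the schema in a local SQLite database for experimentation
engine = create_engine("sqlite:///warehouse_demo.db")
metadata.create_all(engine)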

Year 3

Data Pipeline Orchestration with Airflow

This course focuses on the critical operational skill of orchestrating complex, interdependent data workflows using industry-standard tools like Apache Airflow. Students learn to define workflows as directed acyclic graphs (DAGs), schedule tasks, handle dependencies, monitor pipeline health, and manage alerts. Reliable orchestration is the backbone of any mature data platform, …
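
A minimal Airflow DAG looks like the sketch below (assuming Airflow 2.4 or later); the task bodies and schedule are placeholders, not course material.

# Minimal Airflow DAG: two tasks wired into a directed acyclic graph
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from the source system")

def load():
    print("load transformed data into the warehouse")

with DAG(
    dag_id="example_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # the `schedule` argument assumes Airflow 2.4+
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Dependencies form the DAG: extract must finish before load starts
    extract_task >> load_task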

Year 2

Extract, Transform, Load (ETL) Process Design

This course delves into the heart of data engineering: designing, building, and maintaining reliable ETL/ELT pipelines. Students learn to extract data from diverse sources (APIs, logs, databases), apply complex transformation logic for cleaning and business rules, and load data into target systems. The course emphasizes idempotency, error handling, and data …
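
The sketch below illustrates the idempotency and error-handling emphasis in miniature: re-running the load overwrites rather than duplicates rows, and failures are surfaced for retry. The SQLite target, table, and sample data are assumptions for illustration.

# ETL sketch with an idempotent load and basic error handling
import sqlite3

def extract():
    # Stand-in for an API, log, or database source
    return [("C1", 150.0), ("C2", 80.0)]

def transform(rows):
    # Example business rule: drop non-positive amounts
    return [(cid, amt) for cid, amt in rows if amt > 0]

def load(rows, db_path="target.db"):
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS balances (customer_id TEXT PRIMARY KEY, amount REAL)"
        )
        # INSERT OR REPLACE keyed on the primary key keeps re-runs idempotent
        conn.executemany(
            "INSERT OR REPLACE INTO balances (customer_id, amount) VALUES (?, ?)", rows
        )
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        raise  # surface the failure so the orchestrator can retry or alert
    finally:
        conn.close()

if __name__ == "__main__":
    load(transform(extract()))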

Year 1

Python for Data Engineering and Automation

This hands-on course focuses on using Python as the primary tool for building robust, automated data pipelines. Students learn core Python programming, then specialize in libraries critical for engineering: Pandas for data wrangling, PySpark for distributed processing, SQLAlchemy for database interaction, and Apache Airflow for workflow orchestration. The ability to …
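
In the spirit of the Pandas topic named above, the small wrangling sketch below cleans a raw extract and derives a daily summary; the CSV path and columns are assumptions for illustration only.

# Small Pandas wrangling sketch: clean, parse, summarise
import pandas as pd

# Ingest a raw extract
df = pd.read_csv("raw_transactions.csv")

# Clean: drop rows missing an amount and parse the date column
df = df.dropna(subset=["amount"])
df["txn_date"] = pd.to_datetime(df["txn_date"])

# Derive a simple daily summary ready for loading downstream
daily = (
    df.groupby(df["txn_date"].dt.date)["amount"]
      .sum()
      .reset_index(name="total_amount")
)

daily.to_csv("daily_summary.csv", index=False)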
