Data Engineering Capstone Project

Course Features

Duration: 6 weeks

Delivery Method: Online

Available on: Limited Access

Accessibility: Mobile, Desktop, Laptop

Language: English

Subtitles: English

Level: Beginner

Effort: 3 hours per week

Teaching Type: Self Paced

Course Description

In this Capstone, you’ll demonstrate your ability to perform as a Data Engineer. Your mission is to design, implement, and manage a complete data and analytics platform consisting of relational and non-relational databases, data warehouses, data pipelines, big data processing engines, and Business Intelligence (BI) tools.

This Capstone project will require that you apply and sharpen the skills and knowledge you developed in the various courses in the IBM Data Engineering Professional Certificate and utilize multiple tools and technologies to design databases, collect data from multiple sources, extract, transform and load data into a data warehouse, and utilize a cloud-based BI tool to create analytic reports and visualizations. You will also implement predictive analytics and machine learning models using big data tools and techniques.

This Capstone requires a significant amount of hands-on lab work throughout the course. You’ll exhibit your knowledge and proficiency working with Python, Bash scripts, SQL, NoSQL, RDBMSes, ETL, MySQL, PostgreSQL, Db2, MongoDB, Apache Airflow, Apache Spark, and Cognos Analytics.

Upon successfully completing this Capstone, you should have the confidence and portfolio to take on real-world data engineering projects and showcase your abilities to perform as an entry-level data engineer.

Course Overview

International Faculty

Post Course Interactions

Instructor-Moderated Discussions

Case Studies, Capstone Projects

What You Will Learn

Build a complete data and analytics platform.

Set up, manage, and query relational and NoSQL databases.

Create data pipelines and ETL processes using Apache Airflow.

Design and populate a star/snowflake schema data warehouse and query it using SQL.

Analyze warehouse data using the Business Intelligence (BI) tool Cognos Analytics to create reports and dashboards.

Deploy a big data machine learning model using Apache Spark.

Course Instructors

Rav Ahuja

AI and Data Science Program Director

Rav Ahuja is a Global Program Director at IBM. He leads growth strategy, curriculum creation, and partner programs for the IBM Skills Network. Rav co-founded Cognitive Class, an IBM-led initiative to...

Ramesh Sannareddy

Content Developer

Ramesh Sannareddy holds a bachelor's degree in Information Systems (Birla Institute of Technology, Pilani). He has two and a half decades of experience in Information Technology Infrastructure Managem...