Managing Big Data on Google's Cloud Platform
The Second Course in a Series for Attaining the Google Certified Data Engineer

This Course Includes
Udemy
4.5 (97 reviews)
1h 28m
English
Online - Self Paced
Professional certificate
About Managing Big Data on Google's Cloud Platform
Welcome to Managing Big Data on Google's Cloud Platform. This is the _second_ course in a series of courses designed to help you attain the coveted Google Certified Data Engineer certification. Additionally, the series of courses is going to show you the role of the _data engineer on the Google Cloud Platform_. At this juncture the Google Certified Data Engineer is the only real world certification for data and machine learning engineers.
NOTE: This is NOT Google Cloud Dataproc. The course was designed to be part of a series for those who want to become data engineers on _Google's Cloud Platform_. This course is all about Google's Cloud and migrating on-premises Hadoop jobs to GCP.
In reality, Big Data is simply about unstructured data. There are two core types of data in the real world. The first is structured data, the kind of data found in a relational database. The second is unstructured data, such as a file sitting on a file system. Approximately 90% of all data in the enterprise is unstructured, and our job is to give it structure. Why do we want to give it structure? We want to give it structure so we can analyze it.
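As a rough illustration of what "giving structure" can mean, here is a minimal Python sketch that turns raw text lines into row-like records that can then be counted or queried. The sample log lines, the assumed line layout, and the names `RAW_LINES`, `LINE_PATTERN`, and `parse_lines` are hypothetical and are not taken from the course material.

```python
import re
from collections import Counter

# Hypothetical raw log lines, standing in for a file sitting on a file system.
RAW_LINES = [
    "2024-01-05 10:02:11 ERROR payment service timeout",
    "2024-01-05 10:02:15 INFO user login ok",
    "not a log line at all",
    "2024-01-05 10:03:40 ERROR disk quota exceeded",
]

# Assumed line layout: date, time, level, then a free-text message.
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<message>.+)$"
)


def parse_lines(lines):
    """Turn raw text lines into row-like dicts, skipping lines that do not match."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


rows = parse_lines(RAW_LINES)

# Once the data has columns, simple analysis becomes possible.
level_counts = Counter(row["level"] for row in rows)
print(level_counts)
```

Each parsed row now has named fields, much like a row in a relational table, which is the state the data needs to be in before it can be analyzed.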
Recall that 99% of all applied machine learning is supervised learning. That simply means we have a labeled data set and we point our machine learning models at that data set in order to gain insight into that data. In the course we will spend much of the time working in Cloud Dataproc. This is Google’s managed Hadoop and Spark platform. Recall that the end goal of big data is to get that data into a state where it can be analyzed and modeled. Therefore, we are also going to cover how to work on machine learning projects with big data at scale. Please keep in mind this course alone will not give you the knowledge and skills to pass the exam. The course will provide you with the big data knowledge you need for working with Cloud Dataproc and for moving existing projects to the Google Cloud Platform.
Five Reasons to Take This Course
1) The Top Job in the World
The data engineer role is the single most needed role in the world. Many believe that it's the data scientist, but several studies have broken down the job descriptions and the most needed position is that of the data engineer.
2) Google's the World Leader in Data
Amazon's AWS is the most used cloud and Azure has the best UI, but no cloud vendor in the world understands data like Google. They are the world leader in open source artificial intelligence. You can't be the leader in AI without being the leader in data.
3) 90% of All Organizational Data is Unstructured
The study of big data is the study of unstructured data. As the data in companies grows, most will need to scale to unprecedented levels. Without the cloud, this won't be possible unless they make a significant investment in infrastructure and talent.
4) The Data Revolution is Now
We are in a data revolution. Data used to be viewed as a simple necessity and lower on the totem pole. Now it is more widely recognized as the source of truth. As we move into more complex systems of data management, the role of the data engineer becomes extremely important as a bridge between the DBA and the data consumer. Beyond the ubiquitous spreadsheet, graduating from RDBMS (which will always have a place in the data stack), we now work with NoSQL and Big Data technologies.
5) Data is the Foundation
Data engineers are the plumbers building a data pipeline, while data scientists are the painters and storytellers giving meaning to an otherwise static entity. Simply put, data engineers clean, prepare and optimize data for consumption. Once the data becomes useful, data scientists can perform a variety of analyses and visualization techniques to truly understand the data, and eventually, tell a story from the data.
Thank you for your interest in Managing Big Data on Google's Cloud Platform and we will see you in the course!
What You Will Learn
- At the end of the course you'll understand Cloud Dataproc.
- You'll also know how to craft machine learning projects at scale on GCP.
- You'll also know how to integrate Cloud Dataproc with other core services like BigQuery.
- Additionally, you'll learn how to migrate on-premises Hadoop and Spark jobs to Cloud Dataproc.