

Managing Big Data on Google's Cloud Platform

The Second Course in a Series for Attaining the Google Certified Data Engineer

     
  • 4.5 | Reviews (97)
  • ₹519

This Course Includes

  • Udemy
  • 4.5 (97 reviews)
  • 1h 28m
  • English
  • Online - Self Paced
  • Professional certificate

About Managing Big Data on Google's Cloud Platform

Welcome to Managing Big Data on Google's Cloud Platform. This is the _second_ course in a series of courses designed to help you attain the coveted Google Certified Data Engineer certification. Additionally, the series of courses will show you the role of the _data engineer on the Google Cloud Platform_. At this juncture, the Google Certified Data Engineer is the only real-world certification for data and machine learning engineers.

NOTE: This is NOT Google Cloud Dataproc.

The course was designed to be part of a series for those who want to become data engineers on _Google's Cloud Platform_. This course is all about Google's Cloud and migrating on-premises Hadoop jobs to GCP.

In reality, Big Data is simply about unstructured data. There are two core types of data in the real world. The first is structured data: the kind of data found in a relational database. The second is unstructured data: for example, a file sitting on a file system. Approximately 90% of all data in the enterprise is unstructured, and our job is to give it structure. Why do we want to give it structure? We want to give it structure so we can analyze it.
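As a rough illustration of what "giving structure" to a file can mean, here is a minimal Python sketch. It is not taken from the course: the log format, the `LOG_PATTERN` regex, and the helper names `parse_line` and `structure` are all assumptions made up for this example. It turns raw text lines into records with named fields, which can then be counted and filtered.

```python
import re
from typing import Optional

# Hypothetical log format, assumed for illustration only:
# "<date> <time> <LEVEL> <free-text message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)"
)


def parse_line(line: str) -> Optional[dict]:
    """Turn one raw text line into a structured record, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def structure(lines):
    """Keep only the lines that could be parsed into records."""
    return [rec for rec in map(parse_line, lines) if rec is not None]


# Unstructured input: plain lines as they might sit in a file.
raw_lines = [
    "2020-01-15 10:32:07 ERROR disk quota exceeded",
    "not a log line",
    "2020-01-15 10:32:09 INFO job finished",
]

records = structure(raw_lines)

# Once the data has fields, analysis becomes a simple query.
error_count = sum(1 for r in records if r["level"] == "ERROR")
print(records)
print("errors:", error_count)
```

On a real cluster the same idea would typically run as a distributed job rather than a local loop, but the step is the same: raw text in, records with fields out.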

Recall that 99% of all applied machine learning is supervised learning. That simply means we have a data set and we point our machine learning models at that data set in order to gain insight into that data. In the course we will spend much of the time working in Cloud Dataproc. This is Google’s managed Hadoop and Spark platform. Recall that the end goal of big data is to get that data into a state where it can be analyzed and modeled. Therefore, we are also going to cover how to work on machine learning projects with big data at scale.

Please keep in mind this course alone will not give you the knowledge and skills to pass the exam. The course will provide you with the big data knowledge you need for working with Cloud Dataproc and for moving existing projects to the Google Cloud Platform.

Five Reasons to take this Course.

1) The Top Job in the World

The data engineer role is the single most needed role in the world. Many believe that it's the data scientist, but several studies have broken down the job descriptions, and the most needed position is that of the data engineer.

2) Google's the World Leader in Data

Amazon's AWS is the most used cloud and Azure has the best UI, but no cloud vendor in the world understands data like Google. They are the world leader in open source artificial intelligence. You can't be the leader in AI without being the leader in data.

3) 90% of all Organizational Data is Unstructured

The study of big data is the study of unstructured data. As the data in companies grows, most will need to scale to unprecedented levels. Short of a significant investment in infrastructure and talent, this won't be possible without the cloud.

4) The Data Revolution is Now

We are in a data revolution. Data used to be viewed as a simple necessity and lower on the totem pole. Now it is more widely recognized as the source of truth. As we move into more complex systems of data management, the role of the data engineer becomes extremely important as a bridge between the DBA and the data consumer. Beyond the ubiquitous spreadsheet, graduating from RDBMS (which will always have a place in the data stack), we now work with NoSQL and Big Data technologies.

5) Data is Foundation

Data engineers are the plumbers building a data pipeline, while data scientists are the painters and storytellers giving meaning to an otherwise static entity. Simply put, data engineers clean, prepare and optimize data for consumption. Once the data becomes useful, data scientists can perform a variety of analyses and visualization techniques to truly understand the data, and eventually, tell a story from the data.

Thank you for your interest in Managing Big Data on Google's Cloud Platform and we will see you in the course!!

What You Will Learn?

  • At the end of the course you'll understand Cloud Dataproc.
  • You'll also know how to craft machine learning projects at scale on GCP.
  • You'll also know how to integrate Dataproc with other core services like BigQuery.
  • Additionally, you'll learn how to migrate on-premises Hadoop and Spark jobs to Cloud Dataproc.