Big Data Analytics Using Spark

Yoav Freund, UCSanDiegoX

Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform.

In data science, data is called "big" if it cannot fit into the memory of a single standard laptop or workstation.

Analyzing big datasets requires a cluster of tens, hundreds or thousands of computers. Using such clusters effectively requires distributed file systems, such as the Hadoop Distributed File System (HDFS), and corresponding computational models, such as Hadoop, MapReduce and Spark.
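
As a rough illustration of the MapReduce model mentioned above, here is a minimal single-machine sketch in plain Python. It is not Hadoop or Spark code and is not taken from the course; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical. On a real cluster, each phase would run in parallel across many machines, and the shuffle step would move data over the network.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's list of values into a single count."""
    return {
        key: reduce(lambda a, b: a + b, values)
        for key, values in groups.items()
    }


def word_count(lines):
    """Chain the three phases on one machine (illustration only)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a large distributed file.
    sample = ["big data needs big clusters", "spark processes big data"]
    print(word_count(sample))
```

The shuffle step is usually where the cost lies in a distributed setting, since grouping by key requires exchanging data between machines.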

In this course, part of the Data Science MicroMasters program, you will learn what the bottlenecks are in massively parallel computation and how to use Spark to minimize these bottlenecks.

You will learn how to perform supervised and unsupervised machine learning on massive datasets using the Machine Learning Library (MLlib).

In this course, as in the others in this MicroMasters program, you will gain hands-on experience using PySpark within the Jupyter notebook environment.

What you will learn

  • Programming Spark using PySpark
  • Identifying the computational tradeoffs in a Spark application
  • Performing data loading and cleaning using Spark and Parquet
  • Modeling data through statistical and machine learning methods

Sessions:
  • April 28, 2020
Online course features:
  • Free:
  • Paid:
  • Certificate:
  • MOOC:
  • Video lectures:
  • Audio lectures:
  • Email course:
  • Language: English

More courses on this topic:
Cloud Computing Applications
Learn how to use the cloud and write programs for data analytics. Learn about...
Data Manipulation at Scale: Systems and Algorithms
Data analysis has replaced data acquisition as the bottleneck to evidence-based...
Introduction to Big Data Analytics
A new, improved version of the Big Data Specialization will become...
Implementing Real-Time Analysis with Hadoop in Azure HDInsight
Learn how to use Hadoop technologies like HBase, Storm, and Spark in Microsoft...
Big Data Analytics with Apache Spark and Python
Learn to use Apache Spark to store and analyze data in real time.

More from the "Computer Science" category:
Bias and Discrimination in AI
Discover how even computer algorithms may be biased and have a serious impact...
Deep Learning Essentials
Do you want to learn how machines can learn tasks we thought only human brains...
Machine Learning with Python: A Practical Introduction
Machine Learning can be an incredibly beneficial tool to uncover hidden insights...
Compilers
This self-paced course will discuss the major ideas used today in the implementation...
Automata Theory
This course covers the theory of automata and languages. We begin with a study...

More from edX:
Entrepreneurship 101: Who is your customer?
Entrepreneurship can be learned. Begin your journey by learning the first important...
Climate Change Education
Learn how to work with your students to help them explore climate change through...
What Works in Education: Evidence-Based Education Policies
Learn what works in education and how to identify, analyze and implement evidence...
Risk Management in Development Projects
Learn to preemptively manage positive and negative events that may affect the...
Convex Optimization
This course concentrates on recognizing and solving convex optimization problems...

© 2013-2019