This is the code repository for Solving 10 Hadoop'able Problems [Video], published by Packt. It contains all the supporting project files necessary to work through the video course from start to finish.
The Apache Hadoop ecosystem is a popular and powerful tool to solve big data problems. With so many competing tools to process data, many users want to know which particular problems are well suited to Hadoop, and how to implement those solutions.
To know which types of problems are Hadoop-able, it helps to start with a basic understanding of the core components of Hadoop. You will learn about the ecosystem designed to run on top of Hadoop as well as software that is deployed alongside it. These tools give us the building blocks for data processing applications. This course covers the core parts of the Hadoop ecosystem, giving you a broad understanding and getting you up and running fast. Next, it presents a number of common problems that Hadoop is able to solve as case-study projects. These sections are organized by project, each serving as a specific use case for solving big data problems.
By the end of this course, you will have been exposed to a wide variety of Hadoop software and examples of how it is used to solve common big data problems.
The code bundle for this video course is available at https://github.com/PacktPublishing/-Solving-10-Hadoop-able-Problems
- Store data with HDFS and learn in detail about HBase
- Share and access data in a SQL-like interface for HDFS
- Analyze real-time events using Spark Streaming
- Perform complex big data analytics using MapReduce
- Analyze data to perform complex processing with Hive and Pig
- Explore functional programming using Spark
- Learn to import data using Sqoop
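To give a feel for the MapReduce model mentioned in the list above, here is a minimal sketch of a word count written in plain Python. It does not use Hadoop or any Hadoop API, and it is not taken from the course's code bundle; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. It simply mimics the three steps a MapReduce job goes through: map each record to key/value pairs, group the pairs by key, then reduce each group to a result.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, imitating the grouping done between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel across many machines and the grouping would be handled by the framework; this single-process version only illustrates the shape of the computation.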
To fully benefit from the coverage included in this course, you will need:
This course is for data science professionals and Machine Learning enthusiasts who want to gain practical solutions to common problems faced with Deep Learning tasks with Python. Working knowledge of Deep Learning techniques and Python programming knowledge is assumed.
This course has the following software requirements:
Minimum Hardware Requirements
To complete this course successfully, students will need a computer system with at least the following:
● OS: Windows 10
● Processor: Intel Core i5
● Memory: 4 GB
● Storage: 256 GB
Recommended Hardware Requirements
For an optimal experience with hands-on labs and other practical activities, we recommend the following configuration:
● OS: Windows 10
● Processor: Intel Core i7
● Memory: 8 GB
● Storage: 256 GB
Software Requirements
● Python 3.6 (https://www.python.org/downloads/)
● Anaconda for Python 3.6 version (https://www.anaconda.com/download/)