Learn IT - Books tagged "hdfs"

Hadoop in Practice

Manning Publications, 2014

Summary

Hadoop in Practice, Second Edition provides over 100 tested, instantly useful techniques that will help you conquer big data using Hadoop. This revised edition covers changes and new features in the Hadoop core architecture, including MapReduce 2. Brand new chapters cover YARN and integrating...

Learning Cloudera Impala

Packt Publishing, 2013

Perform interactive, real-time in-memory analytics on large amounts of data using the massively parallel processing engine Cloudera Impala

Overview

Step-by-step guidance to get you started with Impala on your Hadoop cluster

Manipulate your data rapidly by writing proper SQL statements

...

Learning Spark: Lightning-Fast Big Data Analysis

O'Reilly, 2015

Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java,...

Field Guide to Hadoop: An Introduction to Hadoop, Its Ecosystem, and Aligned Technologies

O'Reilly, 2015

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections....

Hadoop: The Definitive Guide

Yahoo Press, 2010

Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing...

Programming Pig

O'Reilly, 2011

This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application—making it easy for you to experiment with new datasets.

Programming Pig introduces...

HBase Administration Cookbook

Packt Publishing, 2012

Master HBase configuration and administration for optimum database performance

Move large amounts of data into HBase and learn how to manage it efficiently

Set up HBase on the cloud, get it ready for production, and run it smoothly with high performance

Maximize the ability of HBase with the...

Apache Accumulo for Developers

Packt Publishing, 2013

Discover how to build Accumulo, Hadoop, and ZooKeeper clusters from scratch on both Windows and Linux. With this book's examples-based approach, you'll learn the painless way through clear instructions and real-world exercises.

Overview

Shows you how to build Accumulo, Hadoop, and ZooKeeper...

Pentaho for Big Data Analytics

Packt Publishing, 2013

With your knowledge of Java and this guide, you can take the analysis of your big data to new levels using Pentaho. Covers all the essential tools, techniques, tips, and tricks in one handy volume.

Overview

A guide to using Pentaho Business Analytics for big data analysis

Learn...

Pro Couchbase Development: A NoSQL Platform for the Enterprise

Apress, 2015

Pro Couchbase Development: A NoSQL Platform for the Enterprise discusses programming for Couchbase using Java and scripting languages, querying and searching, handling migration, and integrating Couchbase with Hadoop, HDFS, and JSON. It also discusses migration from other NoSQL databases like MongoDB.

This book is for big...

Big Data Analysis with Python: Combine Spark and Python to unlock the powers of parallel computing and machine learning

Packt Publishing, 2019

Get to grips with processing large volumes of data and presenting it as engaging, interactive insights using Spark and Python.

Key Features

Get a hands-on, fast-paced introduction to the Python data science stack

Explore ways to create useful metrics and statistics from...

Python Data Analysis

Packt Publishing, 2017

Key Features

Find, manipulate, and analyze your data using the Python 3.5 libraries

Perform advanced, high-performance linear algebra and mathematical calculations with clean and efficient Python code

An easy-to-follow guide with realistic examples that are frequently used in real-world data...
