Apache Flume: Distributed Log Collection for Hadoop - Second Edition

Design and implement a series of Flume agents to send streamed data into Hadoop

About This Book

  • Construct a series of Flume agents using the Apache Flume service to efficiently collect, aggregate, and move large amounts of event data
  • Configure failover paths and load balancing to remove single points of failure
  • Use this step-by-step guide to stream logs from application servers to Hadoop's HDFS

Who This Book Is For

If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed.

What You Will Learn

  • Understand the Flume architecture, and learn how to download and install open source Flume from Apache
  • Follow along with a detailed example of transporting weblogs in Near Real Time (NRT) to Kibana/Elasticsearch and archiving them in HDFS
  • Learn tips and tricks for transporting logs and data in your production environment
  • Understand and configure the Hadoop File System (HDFS) Sink
  • Use a morphline-backed Sink to feed data into Solr
  • Create redundant data flows using sink groups
  • Configure and use various sources to ingest data
  • Inspect data records and move them between multiple destinations based on payload content
  • Transform data en-route to Hadoop and monitor your data flows

In Detail

Apache Flume is a distributed, reliable, and available service used to efficiently collect, aggregate, and move large amounts of log data. It is used to stream logs from application servers to HDFS for ad hoc analysis.

This book starts with an architectural overview of Flume and its logical components. It explores channels, sinks, and sink processors, followed by sources and channel selectors. By the end of this book, you will be fully equipped to construct a series of Flume agents to dynamically transport your stream data and logs from your systems into Hadoop.

This step-by-step book guides you through the architecture and components of Flume, covering different approaches that are then pulled together in a real-world, end-to-end use case, moving gradually from the simplest features to the most advanced.

The .NET Developer's Guide to Windows Security (Microsoft Net Development Series)

The .NET Developer's Guide to Windows Security is required reading for .NET programmers who want to develop secure Windows applications. Readers gain a deep understanding of Windows security and the know-how to program secure systems that run on...

Securing the Perimeter: Deploying Identity and Access Management with Free Open Source Software

Leverage existing free open source software to build an identity and access management (IAM) platform that can serve your organization for the long term. With the emergence of open standards and open source software, it’s now easier than ever to build and operate your own IAM stack.

The most common culprit of...

AutoCAD 2005 For Dummies
Use sheet sets, collaborate nicely, and go LT for 2005

Find out how to use DWF, live up to new standards, and share files online

AutoCAD can be complicated, but this book isn’t! Here’s where you’ll discover how to set up a drawing, toe the lines, add dimension and text, share your stuff, and more, in AutoCAD or...


Developing with Google App Engine (Firstpress)
Developing with Google App Engine introduces development with Google App Engine, a platform that provides developers and users with infrastructure Google itself uses to develop and deploy massively scalable applications.
  • Introduction to concepts
  • Development with App Engine
  • Deployment into App...
Role-Based Access Control, Second Edition
Role-based access control (RBAC) is a security mechanism that has gained wide acceptance in the field because it can greatly lower the cost and complexity of securing large networked and Web-based systems. Written by leading experts, this newly revised edition of the Artech House bestseller, Role-Based Access Control, offers practitioners...
Object-Oriented Thought Process, The, Second Edition

The Object-Oriented Thought Process, Second Edition will lay the foundation in object-oriented concepts and then explain how various object technologies are used. Author Matt Weisfeld introduces object-oriented concepts, then covers abstraction, public and private classes, reusing code, and...
