Spark for Python Developers

Spark for Python Developers, 9781784399696 (1784399698), Packt Publishing, 2015

Key Features

  • Set up real-time streaming and batch data-intensive infrastructure using Spark and Python
  • Deliver insightful visualizations in a web app using Spark (PySpark)
  • Inject live data using Spark Streaming with real-time events

Book Description

Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer: an open source, fast, and general-purpose cluster computing system. Spark's multi-stage in-memory primitives provide performance up to 100 times faster than Hadoop MapReduce for certain applications, and it is also well suited to machine learning algorithms.

Are you a Python developer inclined to work with the Spark engine? If so, this book will be your companion as you create a data-intensive app using Spark as a processing engine, Python visualization libraries, and web frameworks such as Flask.

To begin with, you will learn the most effective way to install the Python development environment powered by Spark, Blaze, and Bokeh. You will then find out how to connect with data stores such as MySQL, MongoDB, Cassandra, and Hadoop.

You'll expand your skills throughout, becoming familiar with the various data sources (GitHub, Twitter, Meetup, and blogs), their data structures, and solutions for tackling their complexities effectively. You'll explore datasets using IPython Notebook and discover how to optimize the data models and pipeline. Finally, you'll learn how to create training datasets and train machine learning models.

By the end of the book, you will have created a real-time, insightful, data-intensive trend tracker app with Spark.

What you will learn

  • Create a Python development environment powered by Spark (PySpark), Blaze, and Bokeh
  • Build a real-time, data-intensive trend tracker app
  • Visualize the trends and insights gained from data using Bokeh
  • Generate insights from data using machine learning through Spark MLlib
  • Juggle with data using Blaze
  • Create training datasets and train the machine learning models
  • Test the machine learning models on test datasets
  • Deploy the machine learning algorithms and models and scale them for real-time events

About the Author

Amit Nandi studied physics at the Free University of Brussels in Belgium, where he did his research on computer-generated holograms. Computer-generated holograms are the key components of an optical computer, which is powered by photons running at the speed of light. He then worked with the university's Cray supercomputer, sending batch jobs of programs written in Fortran. This gave him a taste for computing, which kept growing. He has worked extensively on large business reengineering initiatives, using SAP as the main enabler. For the last 15 years, he has focused on start-ups in the data space, pioneering new areas of the information technology landscape. He is currently focusing on large-scale data-intensive applications as an enterprise architect, data engineer, and software developer. He understands and speaks seven human languages. Although Python is his computer language of choice, he aims to be able to write fluently in seven computer languages too.

Table of Contents

  1. Setting Up a Spark Virtual Environment
  2. Building Batch and Streaming Apps with Spark
  3. Juggling Data with Spark
  4. Learning from Data Using Spark
  5. Streaming Live Data with Spark
  6. Visualizing Insights and Trends

Fuzzy Relational Calculus: Theory, Applications And Software (Advances in Fuzzy Systems)

This book examines fuzzy relational calculus theory with applications in various engineering subjects. The scope of the text covers unified and exact methods with algorithms for direct and inverse problem resolution in fuzzy relational calculus. Extensive engineering applications of fuzzy relation compositions and fuzzy linear systems...

The Fast Forward MBA in Project Management (Portable Mba Series)

Project management has hung on long past the “new management fad” stage. When the first edition of this book came out in 1999, project management was riding a rocket of popularity. Nine years later the discipline continues to add disciples in new fields such as health care and nonprofit aid organizations. While the...

Adobe Acrobat 6: The Professional User's Guide

Acrobat 6 contains strong business applications, and this book is the first to delve into them. In the first edition, acclaimed author Donna Baker devoted a chapter to ways that Acrobat can be used to streamline your business processes. She has expanded on this information in this edition, demonstrating the usefulness of...


Game Art for Teens (Game Development Series)

Wouldn't you love to create really great art for your games? The kind of art that keeps players coming back for more? Now you can! Game Art for Teens is full of step-by-step, hands-on projects that allow you to begin creating art right away. Each project includes easy-to-follow examples that help you master each concept, making it easy to put what...

Mac Application Development by Example Beginner's Guide

It's never been more important to have the ability to develop an App for Mac OS X. Whether it's a System Preference, a business app that accesses information in the Cloud, or an application that uses multi-touch or uses a camera, you will have a solid foundation in app development to get the job done.

Mac Application...

Computer Arithmetic: Algorithms and Hardware Implementations

The subject of this book is the analysis and design of digital devices that implement computer arithmetic. The book's presentation of high-level detail, descriptions, formalisms and design principles means that it can support many research activities in this field, with an emphasis on bridging the gap between algorithm optimization and...
