Tuesday, June 24, 2014

Hadoop meets OpenStack

The Sahara project provides a simple means to provision a Hadoop cluster on top of OpenStack.

https://wiki.openstack.org/wiki/Sahara


http://docs.openstack.org/developer/sahara/architecture.html



The Sahara architecture consists of several components:
  • Auth component - responsible for client authentication and authorization; communicates with Keystone
  • DAL - Data Access Layer; persists internal models in the database
  • Provisioning Engine - responsible for communication with Nova, Heat, Cinder and Glance
  • Vendor Plugins - pluggable mechanism responsible for configuring and launching Hadoop on provisioned VMs; existing management solutions such as Apache Ambari and Cloudera Management Console can be used for this
  • EDP - Elastic Data Processing; responsible for scheduling and managing Hadoop jobs on clusters provisioned by Sahara
  • REST API - exposes Sahara functionality via REST
  • Python Sahara Client - like other OpenStack components, Sahara has its own Python client
  • Sahara pages - the Sahara GUI is located in Horizon
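
To make the REST API and Auth pieces of the list above a little more concrete, here is a rough sketch of what a "create cluster" request might look like when built by hand in Python. It only constructs the request and never sends it. The endpoint host and port, the `/v1.1/{tenant_id}/clusters` path, the body field names and every ID are assumptions for illustration; check them against the Sahara API docs for your release before relying on them.

```python
import json
import urllib.request


def build_create_cluster_request(endpoint, tenant_id, token, cluster):
    """Build (but do not send) a POST request asking Sahara for a new cluster.

    The URL layout and header names are assumptions, not taken from the post.
    """
    url = f"{endpoint}/v1.1/{tenant_id}/clusters"  # assumed path layout
    body = json.dumps(cluster).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Token issued by Keystone; the Auth component would validate it.
            "X-Auth-Token": token,
            "Content-Type": "application/json",
        },
    )


# Hypothetical cluster description: all names and IDs are placeholders.
cluster = {
    "name": "demo-cluster",
    "plugin_name": "vanilla",             # which vendor plugin to use
    "hadoop_version": "1.2.1",
    "cluster_template_id": "TEMPLATE_ID",
    "default_image_id": "IMAGE_ID",       # Glance image with Hadoop preinstalled
}

req = build_create_cluster_request(
    "http://sahara.example.com:8386",     # hypothetical Sahara endpoint
    "TENANT_ID",
    "KEYSTONE_TOKEN",
    cluster,
)
print(req.get_method(), req.full_url)
```

In practice the Python Sahara Client or the Horizon pages would normally build this request for you; the sketch is only meant to show how the components relate: Keystone supplies the token, the REST API receives the request, and the Provisioning Engine and the chosen vendor plugin do the actual work.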

http://www.slideshare.net/mirantis/savanna-hadoop-on-openstack

https://www.youtube.com/watch?v=3bI1WjB-5AM





