  • Hue for Apache Hadoop

    • What is Hue
    • Hue architecture
    • Install and configure Hue on Hadoop
    • Tips for some common issues

    1. What is Hue

    Hue is one of the most important projects in the Hadoop ecosystem because it makes the power of the Hadoop platform much easier for users to reach. Hive and YARN give data analysts who know SQL a processing backbone on Hadoop; Hue gives them a web interface for connecting quickly to big data and to Hadoop’s tools.

     

    2. Hue architecture

    Hue applications run in a web browser and require no client installation. The following figure illustrates how Hue works. The Hue server is a "container" web application that sits between the Hadoop installation and the browser. It hosts all the Hue web applications and communicates with the various servers that interface with the Hadoop components.

    3. Install and configure Hue on Hadoop

    Hue consists of a web service that runs on a designated node in the Hadoop cluster. Here I use the existing master node as the Hue server.

    3.1 Technical Details

    • Distribution: Apache Hadoop (HDFS)
    • Cluster Manager: ResourceManager
    • Environment: Alibaba Cloud server
    • Operating System: Ubuntu 14.04 LTS

    3.2 Features confirmed to work in partial or complete fashion

    • File Browser (HDFS access through WebHDFS or HttpFS)
    • Hive/Beeswax (Beeswax uses the Hive client libraries)
    • HBase Cluster Browser (requires the Thrift 1 service)
    • Job Browser (job information access through hue-plugins)

    3.3 Hue Dependencies

    Hue uses some Python modules that contain native code, so certain development libraries must be installed on the system. To install from the tarball, the following components must be installed first:

    • sudo apt-get install -y ant
    • sudo apt-get install -y gcc g++
    • sudo apt-get install -y libkrb5-dev libmysqlclient-dev
    • sudo apt-get install -y libssl-dev libsasl2-dev libsasl2-modules-gssapi-mit
    • sudo apt-get install -y libsqlite3-dev
    • sudo apt-get install -y libtidy-0.99-0 libxml2-dev libxslt-dev
    • sudo apt-get install -y maven
    • sudo apt-get install -y libldap2-dev
    • sudo apt-get install -y python-dev python-simplejson python-setuptools
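Before starting the build, it can help to confirm that the main build tools really are on PATH. The following is a minimal sketch and not part of the original instructions: the helper name `missing_tools` is made up for illustration, and it only checks for executables, not for the `-dev` library packages.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Build tools that the packages listed above are expected to provide.
# No output means every one of them was found.
missing_tools ant mvn gcc g++ python
```

Any name that is printed points to a package that still needs to be installed, or to a tool that is installed outside PATH.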

    3.4 Install and configure Hue

     (1) Download the Hue 3.9 release tarball from the link below.

          http://gethue.com/hue-3-9-with-all-its-improvements-is-out/

     (2) Unpack the tarball into the /opt directory.
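The unpack step might look roughly like the sketch below, assuming the download is a gzip-compressed tarball. The file name `hue-3.9.0.tgz` in the usage comment is an assumption, so substitute whatever the download page actually provides; `unpack_tarball` is an illustrative helper name.

```shell
# Hypothetical helper: extract a gzip-compressed tarball into a target directory.
unpack_tarball() {
  tarball="$1"
  dest="$2"
  mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example (writing to /opt usually needs root; the file name is an assumption):
#   unpack_tarball hue-3.9.0.tgz /opt
```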

      

     (3) Make sure all the dependencies have been installed and then start the build process.

      

     By default, Hue installs to ‘/usr/local/hue’ in the master node’s local filesystem.
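As a sketch of the build step: Hue's Makefile reads the `PREFIX` environment variable and installs into `$PREFIX/hue`, which is how the default location above comes about. The helper name `build_hue` and the source directory in the usage comment are assumptions; the `MAKE` override exists only so the command can be swapped out for a dry run.

```shell
# Hypothetical helper: run "make install" from the unpacked Hue source tree.
# The result is expected to land in "$prefix/hue".
build_hue() {
  src_dir="$1"
  prefix="${2:-/usr/local}"
  ( cd "$src_dir" && PREFIX="$prefix" "${MAKE:-make}" install )
}

# Example (directory name is an assumption; /usr/local usually needs root):
#   build_hue /opt/hue-3.9.0 /usr/local
```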

     

      (4) The Hue configuration file is ‘/usr/local/hue/desktop/conf/hue.ini’. The changes I made are listed below, section by section.

      Desktop

      

      Hadoop

      

        HBase

      

       Hive
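The original values for these four sections are not reproduced here, so the following is only a hedged sketch of what such overrides might look like. Every host name and port is an assumption (a single node named `master` running all services on stock default ports), and the Hive settings live under the `[beeswax]` section of hue.ini. The snippet writes the fragment to a scratch file so it can be reviewed and merged into hue.ini by hand.

```shell
# Write an illustrative hue.ini fragment to a scratch file for manual merging.
# All host names and ports below are assumptions, NOT the author's values.
fragment="${TMPDIR:-/tmp}/hue-overrides.ini"

cat > "$fragment" <<'EOF'
[desktop]
http_host=0.0.0.0
http_port=8888

[hadoop]
[[hdfs_clusters]]
[[[default]]]
fs_defaultfs=hdfs://master:8020
webhdfs_url=http://master:50070/webhdfs/v1

[[yarn_clusters]]
[[[default]]]
resourcemanager_host=master
resourcemanager_api_url=http://master:8088
history_server_api_url=http://master:19888

[hbase]
hbase_clusters=(Cluster|master:9090)

[beeswax]
hive_server_host=master
hive_server_port=10000
EOF

echo "Sample fragment written to $fragment"
```

The values that matter most are the ones that must match the running cluster: the NameNode address, the WebHDFS URL, the ResourceManager host, the HBase Thrift port, and the HiveServer2 port.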

       

    3.5 Start Hue and browse the data

      (1) Start the Hue server with the ‘supervisor’ command.
      (2) Start the HiveServer2 service so that Hue can use Hive.
      (3) Start the HBase server and the HBase Thrift server so that Hue can use HBase.
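The three start-up steps above can be sketched as small wrapper functions. This assumes that Hue was installed to /usr/local/hue as described earlier and that the `hive`, `hbase`, and `start-hbase.sh` commands are on PATH; the function names are illustrative only.

```shell
HUE_HOME="${HUE_HOME:-/usr/local/hue}"   # install location from section 3.4

# supervisor, HiveServer2, and the Thrift server stay in the foreground,
# so run each of those in its own terminal (or under nohup).
start_hue()          { "$HUE_HOME/build/env/bin/supervisor"; }
start_hiveserver2()  { hive --service hiveserver2; }
start_hbase()        { start-hbase.sh; }
start_hbase_thrift() { hbase thrift start; }
```

Start HBase itself before its Thrift server, since the Thrift server needs a running cluster to connect to.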

       

    3.6 Browse the data with Hue

     (1) Use HiveQL to select demo data from the Hive database

      

       (2) Create a demo table with the HBase browser

       

       (3) Check job information

       

    4. Tips for some common issues

    (1)  The build process sometimes raises errors for the dependencies, so I suggest installing Ant and Maven manually. For both tools you can download the release tarball directly, then set the ANT_HOME and MAVEN_HOME environment variables and add both tools to PATH.
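A minimal sketch of that environment setup follows. The two install locations are assumptions; point them at wherever the Ant and Maven tarballs were actually unpacked.

```shell
# Install locations are assumptions; adjust them to the real unpack directories.
export ANT_HOME="${ANT_HOME:-/opt/apache-ant}"
export MAVEN_HOME="${MAVEN_HOME:-/opt/apache-maven}"

# Put both tools' bin directories ahead of any older system copies.
export PATH="$ANT_HOME/bin:$MAVEN_HOME/bin:$PATH"
```

Adding these lines to a shell startup file such as ~/.bashrc keeps the settings across logins.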

    (2)  After installation, the Hue installation folders and files are owned by the ‘root’ user. It is better to change the ownership so that Hue can run correctly without root permissions.
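The ownership fix could be sketched as below. The helper name `fix_hue_ownership` is illustrative, and the `hue:hue` user and group in the usage comment are assumptions: that account must already exist on the machine.

```shell
# Hypothetical helper: recursively hand a directory tree to another owner.
fix_hue_ownership() {
  target_dir="$1"
  owner="$2"          # user, user:group, or a numeric uid
  chown -R "$owner" "$target_dir"
}

# Example (run as root; the hue user and group are assumptions):
#   fix_hue_ownership /usr/local/hue hue:hue
```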

    (3)  The build may fail with an error message like this:

         creating build/temp.linux-x86_64-2.7/src
         gcc -pthread -fno-strict-aliasing -fwrapv -Wall -Wstrict-prototypes -fPIC -std=c99 -O3 -fomit-frame-pointer -Isrc/ -I/usr/include/ -I/home/huser/miniconda/include/python2.7 -c src/_fastmath.c -o build/temp.linux-x86_64-2.7/src/_fastmath.o
         src/_fastmath.c:36:18: fatal error: gmp.h: No such file or directory
          # include <gmp.h>
                           ^
         compilation terminated.
         error: command 'gcc' failed with exit status 1...

    This happens because gcc cannot find "gmp.h". Make sure the "libgmp3-dev" package is installed and that gmp.h is on the include path. Run the command below to install libgmp3-dev.

     sudo apt-get install libgmp3-dev
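To confirm that the header is now visible, a small search like the following may help. It is only a sketch: `find_header` is a made-up helper, and the two directories searched are common include locations rather than a complete list of what gcc looks at.

```shell
# Hypothetical helper: print the first match for a header under the given directories.
find_header() {
  header="$1"
  shift
  find "$@" -name "$header" -print 2>/dev/null | head -n 1
}

# Prints a path if gmp.h is found, nothing otherwise.
find_header gmp.h /usr/include /usr/local/include
```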
  • Original article: https://www.cnblogs.com/kinginme/p/7204976.html