I. Configure the Hadoop configuration files
The changes depend on how HDFS is deployed: HA mode or non-HA mode.
1.1 Non-HA mode
Use WebHDFS.
1) Edit hdfs-site.xml and add the following:
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
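2) Edit core-site.xml and add the following:
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
After editing, remember to distribute the configuration to the other nodes and then restart the Hadoop cluster. On a production cluster the nodes have to be taken offline and restarted in batches.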
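Once the cluster is back up, a WebHDFS request made as hduser on behalf of another user is a quick way to see whether the proxy-user settings took effect. The sketch below only builds the request URL. The port 50070 (the Hadoop 2 NameNode HTTP default; Hadoop 3 uses 9870) and the impersonated user "admin" are assumptions, so substitute your own values.

```python
from urllib.parse import urlencode


def webhdfs_url(host, port, path="/", op="LISTSTATUS", user="hduser", doas=None):
    """Build a WebHDFS REST URL.

    `user` is the caller (the proxy user); `doas` is the user it acts for.
    """
    params = {"op": op, "user.name": user}
    if doas:
        params["doas"] = doas
    return f"http://{host}:{port}/webhdfs/v1{path}?{urlencode(params)}"


if __name__ == "__main__":
    # Host "yjt" comes from this article; port and "admin" are placeholders.
    print(webhdfs_url("yjt", 50070, doas="admin"))
```

Opening the printed URL with curl or a browser should return a JSON directory listing if impersonation is allowed, and a RemoteException otherwise.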
1.2 HA mode
In HA mode, HDFS can only be accessed through HttpFS.
1) Edit httpfs-site.xml and add the following:
<property>
<name>httpfs.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>httpfs.proxyuser.hduser.groups</name>
<value>*</value>
</property>
2) Start HttpFS
[hduser@yjt hadoop]$ httpfs.sh start
HttpFS listens on port 14000 by default.
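To confirm that HttpFS actually came up, a plain TCP connection test against the port is enough. This is a minimal sketch using only the standard library; the address 192.168.0.230 is the HttpFS host used later in this article, and the helper name port_is_open is made up for the example.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace the address with the host where httpfs.sh was started.
    print(port_is_open("192.168.0.230", 14000))
```

This only shows that something is listening on the port. It does not verify that the service behind it is HttpFS or that the proxy-user settings are correct.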
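II. Configure Hue
1. Edit hue.ini
The main settings to change are the following:
default_hdfs_superuser=hduser # Default is hdfs. Set this to the user that starts the cluster; otherwise accessing HDFS from the UI may fail with permission errors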
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://yjt:9000 # URL of the HDFS file system
# NameNode logical name.
logical_name=yjt
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://192.168.0.230:14000/webhdfs/v1 # the HttpFS URL
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false # Kerberos-related
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# Directory of the Hadoop configuration
## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
hadoop_conf_dir=${HADOOP_HOME}/etc/hadoop/conf # directory holding the cluster configuration
Restart the cluster and restart Hue.
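A malformed webhdfs_url is an easy mistake to make in this file, so it can help to sanity-check the value before restarting. The sketch below splits the URL and flags a few obvious problems. The function name and the set of checks are illustrative only and are not something Hue itself provides.

```python
from urllib.parse import urlparse


def describe_webhdfs_url(url):
    """Split a hue.ini webhdfs_url value and list obvious problems with it."""
    parts = urlparse(url)
    problems = []
    if parts.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if parts.port is None:
        problems.append("no explicit port")
    # Both WebHDFS and HttpFS serve the REST API under /webhdfs/v1.
    if parts.path.rstrip("/") != "/webhdfs/v1":
        problems.append("path should be /webhdfs/v1")
    return parts.hostname, parts.port, problems


if __name__ == "__main__":
    print(describe_webhdfs_url("http://192.168.0.230:14000/webhdfs/v1"))
```

An empty problem list means the value has the expected shape. Whether the host and port are reachable still has to be checked separately.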
III. Check in the web UI
In the Hue web UI, select Files.
IV. Configure YARN
1. Edit hue.ini
First find the [[yarn_clusters]] section. The settings are as follows:
Non-HA:
[[yarn_clusters]]
[[[default]]]
resourcemanager_host=yjt # address of the ResourceManager
resourcemanager_port=8032
submit_to=True
resourcemanager_api_url=http://yjt:8088
proxy_api_url=http://yjt:8088
history_server_api_url=http://yjt:19888
HA:
Note that [[[default]]] and [[[ha]]] each configure one ResourceManager.
# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]
[[[default]]]
# Whether to submit jobs to this cluster
submit_to=True
# Name used when submitting jobs
logical_name=rm1 # corresponds to a value in yarn.resourcemanager.ha.rm-ids
# URL of the ResourceManager API
resourcemanager_api_url=http://log1:8088 # web address, the value of yarn.resourcemanager.webapp.address.rm1
# URL of the ProxyServer API
proxy_api_url=http://log1:8088
# URL of the HistoryServer API
history_server_api_url=http://log1:19888 # the value of mapreduce.jobhistory.webapp.address in mapred-site.xml
[[[ha]]]
# Enter the host on which you are running the failover Resource Manager
resourcemanager_api_url=http://log2:8088
logical_name=rm2
submit_to=True
Restart Hue after making the changes.
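With two ResourceManagers configured, it is handy to know which one is currently active. The YARN REST API exposes this as the haState field of /ws/v1/cluster/info. The sketch below queries each configured URL and returns the first one reporting ACTIVE. The helper names are made up for this example, and it assumes each ResourceManager answers this endpoint directly, so treat it as a starting point and not as how Hue itself handles failover.

```python
import json
from urllib.request import urlopen


def fetch_cluster_info(base_url, timeout=5):
    """GET <base_url>/ws/v1/cluster/info and return the parsed JSON."""
    with urlopen(f"{base_url}/ws/v1/cluster/info", timeout=timeout) as resp:
        return json.load(resp)


def ha_state(info):
    """Extract haState from a parsed cluster-info response, or None if absent."""
    return info.get("clusterInfo", {}).get("haState")


def find_active_rm(api_urls, fetch=None):
    """Return the first URL whose ResourceManager reports ACTIVE, else None."""
    fetch = fetch or fetch_cluster_info
    for url in api_urls:
        try:
            info = fetch(url)
        except (OSError, ValueError):
            # Unreachable host, HTTP error or invalid JSON: try the next one.
            continue
        if ha_state(info) == "ACTIVE":
            return url
    return None


if __name__ == "__main__":
    # The two resourcemanager_api_url values from the HA example above.
    print(find_active_rm(["http://log1:8088", "http://log2:8088"]))
```

The fetch parameter exists only so the lookup can be exercised without a live cluster, for example by passing a function that returns canned responses.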
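2. Check in the web UI
1) Click the three-line menu button in the top-left corner, then click Jobs.
2) A possible error
Clicking Jobs may show the following error in the UI: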
Failed to contact an active Resource Manager: YARN RM returned a failed response: { "RemoteException" : { "message" : "User: hue is not allowed to impersonate admin", "exception" : "AuthorizationException", "javaClassName" : "org.apache.hadoop.security.authorize.AuthorizationException" } } (error 403)
Fix:
Edit desktop/conf/hue.ini
# Webserver runs as this user
#server_user=hue
#server_group=hue
# This should be the Hue admin and proxy user
default_user=hduser # change this to the user that starts the cluster; the default is hue
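When this kind of error shows up, the message itself says which user made the request and which user it tried to act for. The sketch below pulls both names out of the JSON body of the RemoteException. The function name is made up for the example, and the regular expression is based only on the message shown above.

```python
import json
import re


def impersonation_users(error_json):
    """Return (proxy_user, target_user) from a RemoteException body, or None."""
    message = json.loads(error_json)["RemoteException"]["message"]
    match = re.search(r"User: (\S+) is not allowed to impersonate (\S+)", message)
    return match.groups() if match else None


if __name__ == "__main__":
    body = (
        '{"RemoteException": {"message": '
        '"User: hue is not allowed to impersonate admin", '
        '"exception": "AuthorizationException"}}'
    )
    print(impersonation_users(body))
```

Here the first name is hue, which has no proxy-user entries in the Hadoop configuration from section I. Setting default_user to hduser makes Hue send requests as the user that does have them.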
References: