CentOS 7 + Hadoop 2.7.2 (HA + Federation) + Hive 1.2.1 + Spark 2.1.0: Fully Distributed Cluster Installation Guide

 

This document records the installation of a Hadoop + Hive + Spark cluster, including HA (high-availability) configuration for the NameNode and ResourceManager, and horizontal scaling of the NameNode (Federation).

 

1       VM Network Configuration

Configure the VMware NAT network (virtual adapter VMnet8) as follows:

Set the subnet IP to 192.168.1.0.

Set the gateway to 192.168.1.2.

Disable DHCP.

After the configuration above, the IP of virtual adapter VMnet8 becomes 192.168.1.1.

It does not matter that the virtual machines and the physical host are not on the same subnet.

2                      CentOS Configuration

2.1       Download

IT虾米网

Download the minimal installation image (no desktop).

2.2       Activate the Network Adapter

Activate the network adapter and configure its IP address.

Set the gateway and DNS to the gateway configured for virtual adapter VMnet8 above.

2.3       SecureCRT

Once the network adapter is up, a SecureCRT terminal can be used to connect to Linux remotely, which makes the rest of the work easier. How to connect is omitted here.

After connecting, a few simple session settings were adjusted.

2.4       Change the Hostname

Edit the following files:

/etc/sysconfig/network

/etc/hostname

/etc/hosts

192.168.1.11   node1

192.168.1.12   node2

192.168.1.13   node3

192.168.1.14   node4
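As a sketch, the hostname and hosts changes on node1 can be made as follows (hostnamectl is the standard CentOS 7 tool and is an assumption, not taken from the original screenshots); repeat with the matching name on node2/node3/node4:

[root@node1 ~]# hostnamectl set-hostname node1
[root@node1 ~]# cat >> /etc/hosts <<'EOF'
192.168.1.11   node1
192.168.1.12   node2
192.168.1.13   node3
192.168.1.14   node4
EOF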

2.5       yum Behind a Proxy

The company network goes out through a proxy, so yum cannot reach the repositories directly.

Set the yum proxy: vi /etc/yum.conf
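The screenshot of /etc/yum.conf is not reproduced; the proxy is set with a line like the following in the [main] section (the proxy address here is a placeholder, not the real one):

proxy=http://proxy.example.com:8080
# if the proxy requires authentication:
# proxy_username=user
# proxy_password=password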

Run yum again; it can now search for packages online.

2.6       Install ifconfig
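The install command was shown only as a screenshot; on a CentOS 7 minimal install, ifconfig is provided by the net-tools package:

[root@node1 ~]# yum -y install net-tools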

2.7       Install wget and Configure Its Proxy
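The install command itself was shown only as a screenshot; wget is installed with yum in the usual way:

[root@node1 ~]# yum -y install wget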

After installing wget, the configuration file /etc/wgetrc is created under /etc, where the wget proxy can be configured:

[root@node1 ~]# vi /etc/wgetrc

http_proxy = IT虾米网

https_proxy = IT虾米网

ftp_proxy = IT虾米网

2.8       Install VMware Tools

VMware Tools is installed so that the guest clock stays synchronized with the host.

[root@node1 opt]# yum -y install perl

[root@node1 ~]# mount /dev/cdrom /mnt

[root@node1 ~]# tar -zxvf /mnt/VMwareTools-9.6.1-1378637.tar.gz -C /root

[root@node1 ~]# umount /dev/cdrom

[root@node1 ~]# /root/vmware-tools-distrib/vmware-install.pl

[root@node1 ~]# rm -rf /root/vmware-tools-distrib

Note: do not install the shared-folders and mouse drag-and-drop components, otherwise the installation runs into problems:

[root@node1 ~]# chkconfig --list | grep vmware

vmware-tools    0:    1:    2:    3:    4:    5:    6:

vmware-tools-thinprint  0:    1:    2:    3:    4:    5:    6:

[root@node1 ~]# chkconfig vmware-tools-thinprint off

[root@node1 ~]# find / -name '*vmware-tools-thinprint*' | xargs rm -rf

2.9       Miscellaneous

2.9.1  Problems

The following error appears right after boot (shown only as a screenshot in the original).

Editing the virtual machine configuration file node1.vmx resolves it:

vcpu.hotadd = "FALSE"

mem.hotadd = "FALSE"

 

2.9.2  Settings

2.9.2.1 Remove the Boot Menu Delay

[root@node1 ~]# vim /etc/default/grub

GRUB_TIMEOUT=0                                               # default is 5

[root@node1 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

2.9.2.2 VM Tweaks

Note: on hosts with little memory, disable the named memory file by editing node1.vmx:

mainMem.useNamedFile = "FALSE"

For convenient command-line work, switch the VM display to full screen and hide the VMware status bar.

2.9.3  Commands

2.9.3.1 Shutdown and Reboot

[root@node1 ~]# reboot

[root@node1 ~]# shutdown -h now

2.9.3.2 Stopping and Disabling Services

# List services enabled at boot

[root@node1 ~]# systemctl list-unit-files | grep enabled | sort

auditd.service                               enabled

crond.service                               enabled

dbus-org.freedesktop.NetworkManager.service enabled

dbus-org.freedesktop.nm-dispatcher.service  enabled

default.target                              enabled

dm-event.socket                             enabled

getty@.service                              enabled

irqbalance.service                          enabled

lvm2-lvmetad.socket                         enabled

lvm2-lvmpolld.socket                        enabled

lvm2-monitor.service                        enabled

microcode.service                           enabled

multi-user.target                           enabled

NetworkManager-dispatcher.service           enabled

NetworkManager.service                      enabled

postfix.service                             enabled

remote-fs.target                            enabled

rsyslog.service                             enabled

sshd.service                                enabled

systemd-readahead-collect.service           enabled

systemd-readahead-drop.service              enabled

systemd-readahead-replay.service            enabled

tuned.service                               enabled

[root@node1 ~]#  systemctl | grep running | sort 

crond.service                   loaded active running   Command Scheduler

dbus.service                    loaded active running   D-Bus System Message Bus

dbus.socket                     loaded active running   D-Bus System Message Bus Socket

getty@tty1.service              loaded active running   Getty on tty1

irqbalance.service              loaded active running   irqbalance daemon

lvm2-lvmetad.service            loaded active running   LVM2 metadata daemon

lvm2-lvmetad.socket             loaded active running   LVM2 metadata daemon socket

NetworkManager.service          loaded active running   Network Manager

polkit.service                  loaded active running   Authorization Manager

postfix.service                 loaded active running   Postfix Mail Transport Agent

rsyslog.service                 loaded active running   System Logging Service

session-1.scope                 loaded active running   Session 1 of user root

session-2.scope                 loaded active running   Session 2 of user root

session-3.scope                 loaded active running   Session 3 of user root

sshd.service                    loaded active running   OpenSSH server daemon

systemd-journald.service        loaded active running   Journal Service

systemd-journald.socket         loaded active running   Journal Socket

systemd-logind.service          loaded active running   Login Service

systemd-udevd-control.socket    loaded active running   udev Control Socket

systemd-udevd-kernel.socket     loaded active running   udev Kernel Socket

systemd-udevd.service           loaded active running   udev Kernel Device Manager

tuned.service                   loaded active running   Dynamic System Tuning Daemon

vmware-tools.service            loaded active running   SYSV: Manages the services needed to run VMware software

wpa_supplicant.service          loaded active running   WPA Supplicant daemon

# Check the status of a service

systemctl status auditd.service

# Enable a service at boot

systemctl enable auditd.service

# Disable a service at boot

systemctl disable auditd.service

systemctl disable postfix.service

systemctl disable rsyslog.service

systemctl disable wpa_supplicant.service

# Check whether a service starts at boot

systemctl is-enabled auditd.service

2.9.3.3 Finding Large Files and Directories

find . -type f -size +10M  -print0 | xargs -0 du -h | sort -nr

List the 20 largest directories; --max-depth limits the directory depth, and removing it traverses all subdirectories:

du -hm --max-depth=5 / | sort -nr | head -20

find /etc -name '*srm*'  # find files under /etc whose names contain "srm"

2.9.3.4 Checking Disk Usage

[root@node1 dev]# df -h

Filesystem               Size  Used Avail Use% Mounted on

/dev/mapper/centos-root   50G  1.5G   49G    3% /

devtmpfs                 721M     0  721M    0% /dev

tmpfs                    731M     0  731M    0% /dev/shm

tmpfs                    731M  8.5M  723M    2% /run

tmpfs                    731M     0  731M    0% /sys/fs/cgroup

/dev/mapper/centos-home   47G   33M   47G    1% /home

/dev/sda1                497M  106M  391M   22% /boot

tmpfs                    147M     0  147M    0% /run/user/0

2.9.3.5 Checking Memory Usage

[root@node1 dev]# top

3                      Installing the JDK

All older JDK releases can be downloaded from the official archive: IT虾米网

Download jdk-8u92-linux-x64.tar.gz and place it under /root:

wget -O /root/jdk-8u92-linux-x64.tar.gz IT虾米网

 

 

[root@node1 ~]# tar -zxvf /root/jdk-8u92-linux-x64.tar.gz -C /root

[root@node1 ~]# vi /etc/profile

Append the following to the end of /etc/profile:

export JAVA_HOME=/root/jdk1.8.0_92
export PATH=.:$PATH:$JAVA_HOME/bin

export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

[root@node1 ~]# source /etc/profile

[root@node1 ~]# java -version

java version "1.8.0_92"

Java(TM) SE Runtime Environment (build 1.8.0_92-b14)

Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

Use the env command to check that the environment variables are set correctly:

[root@node1 ~]# env | grep CLASSPATH

CLASSPATH=.:/root/jdk1.8.0_92/jre/lib/rt.jar:/root/jdk1.8.0_92/lib/dt.jar:/root/jdk1.8.0_92/lib/tools.jar

4                      Cloning the Virtual Machines

So far only one machine, node1, has been installed. Now clone node2/node3/node4 from node1:

node1

192.168.1.11

node2

192.168.1.12

node3

192.168.1.13

node4

192.168.1.14

In VMware, rename each cloned virtual machine accordingly.

When powering on a clone for the first time, choose the "I Copied It" option.

Then change the hostname on each clone:

[root@node1 ~]# vi /etc/sysconfig/network

[root@node1 ~]# vi /etc/hostname

5                      Passwordless SSH Login

RSA is a typical asymmetric encryption algorithm.

RSA can be used for data encryption (encrypt with the public key, decrypt with the private key) and for digital signatures/authentication (sign with the private key, verify with the public key).

5.1       Normal SSH Login (Password Required)

The client sends a connection request to the server.

The server sends its public key to the client.

The client encrypts the login password with the server's public key and sends it to the server.

Even if the exchange is intercepted, an eavesdropper who has the public key and the encrypted content but not the private key still cannot decrypt it (RSA).

The server receives the ciphertext, decrypts it with its private key, and obtains the login password.

5.2       How Passwordless Login Works

First, a key pair is created on the client and the public key is placed on the server that needs to be accessed.

The client sends a request to the server asking to be authenticated with its key.

The server looks for the client's public key in the user's home directory on the server and compares it with the public key sent by the client. If the two match, the server encrypts a "challenge" with the public key and sends it to the client.

The client decrypts the challenge with its private key and sends it back to the server.

The server checks that the returned challenge matches the original; if so, the client is authorized and the session is established.

5.3       Setting Up Passwordless SSH

Delete any previously generated keys:

rm -rf /root/.ssh

Generate a key pair on each node:

[root@node1 ~]# ssh-keygen -t rsa

[root@node2 ~]# ssh-keygen -t rsa

[root@node3 ~]# ssh-keygen -t rsa

[root@node4 ~]# ssh-keygen -t rsa

"ssh-keygen -t rsa" generates an RSA key pair; press Enter at each of the three prompts to accept the defaults.

Check the generated keys under /root/.ssh:

id_rsa.pub is the public key and id_rsa is the private key.

Copy the public keys between the servers:

ssh-copy-id -i /root/.ssh/id_rsa.pub <hostname>

This copies the local public key to the target host and appends it to that host's authorized_keys file, creating the file if it does not exist. When authorizing a host to SSH into itself, use its own hostname.

[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node1

[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node2

[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node3

[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node4

[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node1

[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node2

[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node3

[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node4

[root@node3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node1

[root@node3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node2

[root@node3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node3

[root@node3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node4

[root@node4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node1

[root@node4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node2

[root@node4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node3

[root@node4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub node4
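The sixteen commands above can also be driven by a small loop run on each node in turn; this is just a convenience sketch, not a step from the original:

[root@nodeX ~]# for h in node1 node2 node3 node4; do ssh-copy-id -i /root/.ssh/id_rsa.pub $h; done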

Note: if the cloned virtual machines all end up with the same public key, first delete /etc/udev/rules.d/70-persistent-net.rules, then delete the /root/.ssh directory, and regenerate the keys.

 

6                      HA + Federation Server Layout

|           |                                | node1        | node2        | node3        | node4        |
| Hadoop    | NameNode                       | Y (cluster1) | Y (cluster1) | Y (cluster2) | Y (cluster2) |
|           | DataNode                       |              | Y            | Y            | Y            |
|           | NodeManager                    |              | Y            | Y            | Y            |
|           | JournalNode                    | Y            | Y            | Y            |              |
|           | zkfc (DFSZKFailoverController) | Y            | Y            | Y            | Y            |
|           | ResourceManager                | Y            | Y            |              |              |
| Zookeeper | ZooKeeper (QuorumPeerMain)     | Y            | Y            | Y            |              |
| HIVE      | MySQL                          |              |              |              | Y            |
|           | metastore (RunJar)             |              |              | Y            |              |
|           | HIVE (RunJar)                  | Y            |              |              |              |
| Spark     | Scala                          | Y            | Y            | Y            | Y            |
|           | Spark-master                   | Y            |              |              |              |
|           | Spark-worker                   |              | Y            | Y            | Y            |

Note: every node that runs a NameNode also runs a zkfc. Different NameNodes share the same set of DataNodes by being formatted with the same ClusterID.

Figure: HDFS Federation architecture

NS-n unit:

Figure: Hadoop HA

Figure: MapReduce NextGen (YARN) architecture

7                      ZooKeeper

[root@node1 ~]# wget -O /root/zookeeper-3.4.9.tar.gz IT虾米网

[root@node1 ~]# tar -zxvf /root/zookeeper-3.4.9.tar.gz -C /root

[root@node1 conf]# cp /root/zookeeper-3.4.9/conf/zoo_sample.cfg /root/zookeeper-3.4.9/conf/zoo.cfg

[root@node1 conf]# vi /root/zookeeper-3.4.9/conf/zoo.cfg

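The zoo.cfg edits were shown only as screenshots. A minimal sketch of the relevant settings, consistent with the dataDir used later in this section; the 2888/3888 ports are the usual ZooKeeper defaults and are an assumption:

dataDir=/root/zookeeper-3.4.9/zkData
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888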

[root@node1 conf]# mkdir /root/zookeeper-3.4.9/zkData

[root@node1 conf]# touch /root/zookeeper-3.4.9/zkData/myid

[root@node1 conf]# echo 1 > /root/zookeeper-3.4.9/zkData/myid

[root@node1 conf]# scp -r /root/zookeeper-3.4.9 node2:/root

[root@node1 conf]# scp -r /root/zookeeper-3.4.9 node3:/root

[root@node2 conf]# echo 2 > /root/zookeeper-3.4.9/zkData/myid

[root@node3 conf]# echo 3 > /root/zookeeper-3.4.9/zkData/myid

7.1       Superuser Access

[root@node1 ~]# vi /root/zookeeper-3.4.9/bin/zkServer.sh

Where the script launches Java, add the startup parameter "-Dzookeeper.DigestAuthenticationProvider.superDigest=super:Q9YtF+3h9Ko5UNT8apBWr8hovH4=". The digest after "super:" corresponds to the password AAAaaa111:

[root@node1 ~]# /root/zookeeper-3.4.9/bin/zkCli.sh

[zk: localhost:2181(CONNECTED) 11] addauth digest super:AAAaaa111

Any znode data can now be deleted, for example:

[zk: localhost:2181(CONNECTED) 15] rmr /rmstore/ZKRMStateRoot

7.2       Problems

ZooKeeper fails to start with "Unable to load database on disk":

[root@node3 ~]# more zookeeper.out

2017-01-24 11:31:31,827 [myid:3] – ERROR [main:QuorumPeer@557] – Unable to load database on disk

java.io.IOException: The accepted epoch, d is less than the current epoch, 17

        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:554)

        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)

        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)

        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)

        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)

[root@node3 ~]# more /root/zookeeper-3.4.9/conf/zoo.cfg | grep dataDir

dataDir=/root/zookeeper-3.4.9/zkData

[root@node3 ~]# ls /root/zookeeper-3.4.9/zkData

myid  version-2  zookeeper_server.pid

Clear all files under version-2:

[root@node3 ~]# rm -f /root/zookeeper-3.4.9/zkData/version-2/*.*

[root@node3 ~]# rm -rf /root/zookeeper-3.4.9/zkData/version-2/acceptedEpoch

[root@node3 ~]# rm -rf /root/zookeeper-3.4.9/zkData/version-2/currentEpoch

8                      Hadoop

[root@node1 ~]# wget -O /root/hadoop-2.7.2.tar.gz  IT虾米网

[root@node1 ~]# tar -zxvf /root/hadoop-2.7.2.tar.gz -C /root

8.1       hadoop-env.sh

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/hadoop-env.sh

Be sure to change the directory where the daemon PID files are stored, otherwise restarts may fail with errors such as: XXX running as process 1609. Stop it first.
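The concrete hadoop-env.sh values were shown only as screenshots. A minimal sketch, assuming the JDK path used earlier in this document and a PID directory under the Hadoop installation (the exact path used by the author is unknown):

export JAVA_HOME=/root/jdk1.8.0_92
export HADOOP_PID_DIR=/root/hadoop-2.7.2/pids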

8.2       hdfs-site.xml

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/hdfs-site.xml

<configuration>

   

       <property>

               <name>dfs.replication</name>

               <value>2</value>

<description>Number of block replicas kept per block. The default is 3; there are 4 DataNodes here, so any value not larger than 4 will do.</description>

        </property>

 

<property>

  <name>dfs.blocksize</name>

  <value>134217728</value>

  <description>

      The default block size for new files, in bytes.

      You can use the following suffix (case insensitive):

      k(kilo), m(mega), g(giga), t(tera), p(peta), e(exa) to specify the size (such as 128k, 512m, 1g, etc.),

      Or provide complete size in bytes (such as 134217728 for 128 MB).

      Note: in 1.x and earlier the default was 64 MB, and the property was named dfs.block.size.

  </description>

</property>

 

<property>

     <name>dfs.permissions.enabled</name>

     <value>false</value>

     <description>Note: if permission problems remain, run "/root/hadoop-2.7.2/bin/hdfs dfs -chmod -R 777 /".</description>

</property>

 

<property>

  <name>dfs.nameservices</name>

  <value>cluster1,cluster2</value>

<description>With federation, two HDFS namespaces are used. The two NameServices declared here are simply aliases for those namespaces; any non-conflicting names will do, separated by commas. Note that the names are purely logical: cluster1 and cluster2 are not two separate clusters; together they make up one cluster, each being one logical part of it (plus the standby NameNodes for HA). Whether cluster1 and cluster2 belong to the same physical cluster is decided by the clusterID, which is specified when the NameNodes are formatted; see the NameNode format-and-start section.</description>

</property>

<property>

  <name>dfs.ha.namenodes.cluster1</name>

  <value>nn1,nn2</value>

<description>Logical names of the NameNodes in cluster1. Note: these are arbitrary logical names, not real hostnames; the mapping to hosts is configured below.</description>

</property>

<property>

  <name>dfs.ha.namenodes.cluster2</name>

  <value>nn3,nn4</value>

<description>Logical names of the NameNodes in cluster2.</description>

</property>

 

<!-- The following settings bind the logical names to physical hosts -->

<property>

  <name>dfs.namenode.rpc-address.cluster1.nn1</name>

  <value>node1:8020</value>

<description>8020 is the HDFS client access port (command line and programs); some installations use 9000.</description>

</property>

<property>

  <name>dfs.namenode.rpc-address.cluster1.nn2</name>

  <value>node2:8020</value>

</property>

<property>

  <name>dfs.namenode.rpc-address.cluster2.nn3</name>

  <value>node3:8020</value>

</property>

<property>

  <name>dfs.namenode.rpc-address.cluster2.nn4</name>

  <value>node4:8020</value>

</property>

<property>

  <name>dfs.namenode.http-address.cluster1.nn1</name>

  <value>node1:50070</value>

<description>NameNode web UI address.</description>

</property>

<property>

  <name>dfs.namenode.http-address.cluster1.nn2</name>

  <value>node2:50070</value>

</property>

<property>

  <name>dfs.namenode.http-address.cluster2.nn3</name>

  <value>node3:50070</value>

</property>

<property>

  <name>dfs.namenode.http-address.cluster2.nn4</name>

  <value>node4:50070</value>

</property>

 

<property>

  <name>dfs.namenode.shared.edits.dir</name>

  <value>qjournal://node1:8485;node2:8485;node3:8485/cluster1</value>

<description>JournalNode quorum used by the two NameNodes of cluster1 to share their edits directory.

Use this setting on node1/node2.</description>

</property>

<!--

<property>

  <name>dfs.namenode.shared.edits.dir</name>

  <value>qjournal://node1:8485;node2:8485;node3:8485/cluster2</value>

<description>JournalNode quorum used by the two NameNodes of cluster2 to share their edits directory.

Use this setting on node3/node4.</description>

</property>

-->

 

<property>

<name>dfs.ha.automatic-failover.enabled.cluster1</name>

<value>true</value>

<description>Enable automatic failover for cluster1: when the active NameNode fails, switch to the other NameNode automatically.</description>

</property>

<property>

<name>dfs.ha.automatic-failover.enabled.cluster2</name>

<value>true</value>

</property>

<property>

  <name>dfs.client.failover.proxy.provider.cluster1</name>

  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

<description>Java class that performs client-side failover for cluster1 when the active NameNode fails.</description>

</property>

<property>

  <name>dfs.client.failover.proxy.provider.cluster2</name>

  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

 

<property>

  <name>dfs.journalnode.edits.dir</name>

  <value>/root/hadoop-2.7.2/tmp/journal</value>

<description>Local disk path where the JournalNode stores its own data.</description>

</property>

 

<property>

  <name>dfs.ha.fencing.methods</name>

  <value>sshfence</value>

  <description>Fence the old active NameNode over SSH during a failover.</description>

</property>

<property>

  <name>dfs.ha.fencing.ssh.private-key-files</name>

  <value>/root/.ssh/id_rsa</value>

<description>Location of the private key used for SSH-based fencing.</description>

</property>

</configuration>

8.3       core-site.xml

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/core-site.xml

<configuration>

       <property>

                <name>fs.defaultFS</name>

                <value>hdfs://cluster1:8020</value>

                <description>Default file system used when a client (or program) does not specify an explicit URI; the value refers to the name service configured in hdfs-site.xml. Note: this setting is identical on all hosts.</description>

       </property>

       <property>

               <name>hadoop.tmp.dir</name>

               <value>/root/hadoop-2.7.2/tmp</value>

               <description>Base directory under which the NameNode, DataNode, JournalNode, etc. store their data by default.</description>

       </property>

<property>

   <name>ha.zookeeper.quorum</name>

   <value>node1:2181,node2:2181,node3:2181</value>

   <description>Addresses and ports of the ZooKeeper ensemble. Note: the number of nodes must be odd and at least three.</description>

</property>

<!-- The following setting works around NameNode-to-JournalNode connection timeout exceptions -->

<property>

  <name>ipc.client.connect.retry.interval</name>

  <value>10000</value>

  <description>Indicates the number of milliseconds a client will wait for

    before retrying to establish a server connection.

  </description>

</property>

</configuration>

8.4       slaves

This file lists the hosts that run DataNodes:

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/slaves
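The contents of the slaves file were shown only as a screenshot; according to the server layout table (DataNodes on node2/node3/node4), it should read:

node2
node3
node4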

8.5       yarn-env.sh

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/yarn-env.sh
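The yarn-env.sh change was shown only as a screenshot; presumably it is just the JAVA_HOME export, mirroring hadoop-env.sh (an assumption):

export JAVA_HOME=/root/jdk1.8.0_92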

8.6       mapred-site.xml

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/mapred-site.xml

<configuration>

          <property>

 <name>mapreduce.framework.name</name>

                <value>yarn</value>

<description>Run MapReduce on the YARN framework.</description>

           </property>

    <property>

       <name>mapreduce.jobhistory.address</name>

       <value>node1:10020</value>

<description>Note: this differs on each machine; after copying, change the hostname accordingly (node2:10020, node3:10020, node4:10020); the port stays the same.</description>

    </property>

    <property>

       <name>mapreduce.jobhistory.webapp.address</name>

       <value>node1:19888</value>

       <description>Note: this differs on each machine; after copying, change the hostname accordingly (node2:19888, node3:19888, node4:19888); the port stays the same.</description>

</property>

</configuration>

8.7       yarn-site.xml

[root@node1 ~]# vi /root/hadoop-2.7.2/etc/hadoop/yarn-site.xml

<configuration>

        <property>

               <name>yarn.nodemanager.aux-services</name>

               <value>mapreduce_shuffle</value>

        </property>

        <property>                                                               

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

               <value>org.apache.hadoop.mapred.ShuffleHandler</value>

        </property>

<property>

  <name>yarn.resourcemanager.ha.enabled</name>

  <value>true</value>

</property>

<property>

  <name>yarn.resourcemanager.cluster-id</name>

  <value>yarn-cluster</value>

</property>

<property>

  <name>yarn.resourcemanager.ha.rm-ids</name>

  <value>rm1,rm2</value>

</property>

<property>

  <name>yarn.resourcemanager.hostname.rm1</name>

  <value>node1</value>

</property>

<property>

  <name>yarn.resourcemanager.hostname.rm2</name>

  <value>node2</value>

</property>

<property>

  <name>yarn.resourcemanager.webapp.address.rm1</name>

  <value>node1:8088</value>

</property>

<property>

  <name>yarn.resourcemanager.webapp.address.rm2</name>

  <value>node2:8088</value>

</property>

<property>

  <name>yarn.resourcemanager.zk-address</name>

  <value>node1:2181,node2:2181,node3:2181</value>

</property>

<property>

<name>yarn.resourcemanager.recovery.enabled</name>

<value>true</value>

</property>

<property>

<name>yarn.resourcemanager.store.class</name>

<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

<description>By default the RM state is stored under /rmstore in ZooKeeper; the path can be changed with yarn.resourcemanager.zk-state-store.parent-path.</description>

</property>

<property>

<name>yarn.log-aggregation-enable</name>

<value>true</value>

<description>Enable log aggregation: the local log files produced on each machine that runs tasks are collected into one place on HDFS, so job logs can be viewed from any machine in the cluster.</description>

</property>

<property>

  <name>yarn.log.server.url</name>

  <value>http://node1:19888/jobhistory/logs</value>

  <description>Note: this differs on each machine; after copying, change the hostname accordingly (http://node2:19888/jobhistory/logs, http://node3:19888/jobhistory/logs, http://node4:19888/jobhistory/logs); the port stays the same.</description>

</property>

</configuration>

8.8       Copy and Adjust the Configuration

[root@node1 ~]# scp -r /root/hadoop-2.7.2/ node2:/root

[root@node1 ~]# scp -r /root/hadoop-2.7.2/ node3:/root

[root@node1 ~]# scp -r /root/hadoop-2.7.2/ node4:/root

[root@node3 ~]# vi /root/hadoop-2.7.2/etc/hadoop/hdfs-site.xml
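On node3 and node4, dfs.namenode.shared.edits.dir must point at cluster2 instead of cluster1, i.e. switch to the commented-out variant already shown in hdfs-site.xml above:

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://node1:8485;node2:8485;node3:8485/cluster2</value>
</property>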


[root@node3 ~]# scp /root/hadoop-2.7.2/etc/hadoop/hdfs-site.xml node4:/root/hadoop-2.7.2/etc/hadoop

[root@node2 ~]# vi /root/hadoop-2.7.2/etc/hadoop/mapred-site.xml

[root@node3 ~]# vi /root/hadoop-2.7.2/etc/hadoop/mapred-site.xml

[root@node4 ~]# vi /root/hadoop-2.7.2/etc/hadoop/mapred-site.xml

[root@node2 ~]# vi /root/hadoop-2.7.2/etc/hadoop/yarn-site.xml

[root@node3 ~]# vi /root/hadoop-2.7.2/etc/hadoop/yarn-site.xml

[root@node4 ~]# vi /root/hadoop-2.7.2/etc/hadoop/yarn-site.xml

8.9       Start ZooKeeper

[root@node1 bin]# /root/zookeeper-3.4.9/bin/zkServer.sh start

[root@node2 bin]# /root/zookeeper-3.4.9/bin/zkServer.sh start

[root@node3 bin]# /root/zookeeper-3.4.9/bin/zkServer.sh start

[root@node1 bin]# jps

1622 QuorumPeerMain

Check the status:

[root@node1 ~]# /root/zookeeper-3.4.9/bin/zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /root/zookeeper-3.4.9/bin/../conf/zoo.cfg

Mode: follower

[root@node2 ~]# /root/zookeeper-3.4.9/bin/zkServer.sh status

ZooKeeper JMX enabled by default

Using config: /root/zookeeper-3.4.9/bin/../conf/zoo.cfg

Mode: leader

Check the znodes:

[root@node1 hadoop-2.7.2]# /root/zookeeper-3.4.9/bin/zkCli.sh

[zk: localhost:2181(CONNECTED) 0] ls /

[zookeeper]

8.10   Format zkfc

Run this on any one node of each cluster; it creates the HA znodes for that cluster in ZooKeeper.

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs zkfc -formatZK

[root@node3 ~]# /root/hadoop-2.7.2/bin/hdfs zkfc -formatZK

After formatting, a znode named hadoop-ha is created in ZooKeeper:

[root@node1 ~]# /root/zookeeper-3.4.9/bin/zkCli.sh

8.11   Start the JournalNodes

[root@node1 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode

[root@node2 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode

[root@node3 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode

[root@node1 ~]# jps

1810 JournalNode

8.12   Format and Start the NameNodes

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs namenode -format -clusterId CLUSTER_UUID_1

[root@node1 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode

[root@node1 ~]# jps

1613 NameNode

 

All members of the same cluster must share the same cluster ID (NameNodes, DataNodes, etc.):

[root@node2 ~]# /root/hadoop-2.7.2/bin/hdfs namenode -bootstrapStandby

[root@node2 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode


[root@node3 ~]# /root/hadoop-2.7.2/bin/hdfs namenode -format -clusterId CLUSTER_UUID_1

[root@node3 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode


[root@node4 ~]# /root/hadoop-2.7.2/bin/hdfs namenode -bootstrapStandby

[root@node4 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode


8.13   Start zkfc

ZKFC (ZooKeeper Failover Controller) monitors NameNode health and performs active/standby NameNode switching. Run it on every NameNode host.

[root@node1 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc

[root@node2 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc

[root@node1 ~]# jps

5280 DFSZKFailoverController

Automatic failover works: of the two NameNodes in each cluster, one now shows as active and the other as standby in the web UI.

[root@node3 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc

[root@node4 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc

8.14   Start the DataNodes

[root@node2 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode

[root@node3 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode

[root@node4 ~]# /root/hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode

8.15   HDFS Verification

Upload a file to a specific name service:

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -put /root/hadoop-2.7.2.tar.gz hdfs://cluster2/

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -put /root/test_upload.tar hdfs://cluster1:8020/

If no full URI is given, the upload goes to the default file system, i.e. the fs.defaultFS setting in core-site.xml:

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -put /root/hadoop-2.7.2.tar.gz /

A specific host can also be addressed directly (it must be the active NameNode):

/root/hadoop-2.7.2/bin/hdfs dfs -put /root/hadoop-2.7.2.tar hdfs://node3:8020/

/root/hadoop-2.7.2/bin/hdfs dfs -put /root/hadoop-2.7.2.tar hdfs://node3/


8.16   HA Verification

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster1 -getServiceState nn1

active

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster1 -getServiceState nn2

standby

[root@node1 ~]# jps

2448 NameNode

3041 DFSZKFailoverController

3553 Jps

2647 JournalNode

2954 QuorumPeerMain

[root@node1 ~]# kill 2448

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster1 -getServiceState nn2

active


8.16.1              Manual Failover

/root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster1 -failover nn2 nn1

/root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster2 -failover nn4 nn3

8.17   Start YARN

[root@node1 ~]# /root/hadoop-2.7.2/sbin/yarn-daemon.sh start resourcemanager

[root@node2 ~]# /root/hadoop-2.7.2/sbin/yarn-daemon.sh start resourcemanager

[root@node2 ~]# /root/hadoop-2.7.2/sbin/yarn-daemon.sh start nodemanager

[root@node3 ~]# /root/hadoop-2.7.2/sbin/yarn-daemon.sh start nodemanager

[root@node4 ~]# /root/hadoop-2.7.2/sbin/yarn-daemon.sh start nodemanager

IT虾米网

Note: use the web address of the active ResourceManager (node1:8088 or node2:8088); opening the standby one automatically redirects to the active host.

IT虾米网

Command to check which ResourceManager is active:

[root@node4 logs]# /root/hadoop-2.7.2/bin/yarn rmadmin -getServiceState rm2

8.18   MapReduce Test

[root@node4 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -mkdir hdfs://cluster1/hadoop

[root@node4 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -put /root/hadoop-2.7.2/etc/hadoop/*xml* hdfs://cluster1/hadoop

[root@node4 ~]# /root/hadoop-2.7.2/bin/hadoop jar /root/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount hdfs://cluster1:8020/hadoop/h* hdfs://cluster1:8020/hadoop/m* hdfs://cluster1/wordcountOutput

Note: the MapReduce output should be on the same cluster as its input. The job can still succeed when the output is placed on the other cluster, but the output files will not be found when browsing through the web UI.

8.19   Scripts

The following scripts are run on node1.

8.19.1              Start and Stop Scripts

expect is installed for automated interaction; it is used when switching the ResourceManager manually from a script:

[root@node1 ~]# yum install expect

[root@node1 ~]# vi /root/starthadoop.sh

#rm -rf /root/hadoop-2.7.2/logs/*.*
#ssh root@node2 'export BASH_ENV=/etc/profile;rm -rf /root/hadoop-2.7.2/logs/*.*'
#ssh root@node3 'export BASH_ENV=/etc/profile;rm -rf /root/hadoop-2.7.2/logs/*.*'
#ssh root@node4 'export BASH_ENV=/etc/profile;rm -rf /root/hadoop-2.7.2/logs/*.*'

/root/zookeeper-3.4.9/bin/zkServer.sh start
ssh root@node2 'export BASH_ENV=/etc/profile;/root/zookeeper-3.4.9/bin/zkServer.sh start'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/zookeeper-3.4.9/bin/zkServer.sh start'

/root/hadoop-2.7.2/sbin/start-all.sh
ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/yarn-daemon.sh start resourcemanager'

/root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc
ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc'
ssh root@node4 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh start zkfc'

#ret=`/root/hadoop-2.7.2/bin/hdfs dfsadmin -safemode get | grep ON | head -1`
#while [ -n "$ret" ]
#do
#echo 'waiting to leave safe mode'
#sleep 1s
#ret=`/root/hadoop-2.7.2/bin/hdfs dfsadmin -safemode get | grep ON | head -1`
#done

/root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster1 -failover nn2 nn1
/root/hadoop-2.7.2/bin/hdfs haadmin -ns cluster2 -failover nn4 nn3
echo 'Y' | ssh root@node1 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/bin/yarn rmadmin -transitionToActive --forcemanual rm1'

/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh start historyserver
ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh start historyserver'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh start historyserver'
ssh root@node4 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh start historyserver'

# This line starts Spark; remove it if only Hadoop is installed
/root/spark-2.1.0-bin-hadoop2.7/sbin/start-all.sh

echo '--------------node1---------------'
jps | grep -v Jps | sort  -k 2 -t ' '
echo '--------------node2---------------'
ssh root@node2 "export PATH=/usr/bin:$PATH;jps | grep -v Jps | sort  -k 2 -t ' '"
echo '--------------node3---------------'
ssh root@node3 "export PATH=/usr/bin:$PATH;jps | grep -v Jps | sort  -k 2 -t ' '"
echo '--------------node4---------------'
ssh root@node4 "export PATH=/usr/bin:$PATH;jps | grep -v Jps | sort  -k 2 -t ' '"

# The next two lines start Hive; remove them if Hive is not installed
ssh root@node4 'export BASH_ENV=/etc/profile;service mysql start'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/hive-1.2.1/bin/hive --service metastore&'

[root@node1 ~]# vi /root/stophadoop.sh

# This line stops Spark; remove it if Spark is not installed
/root/spark-2.1.0-bin-hadoop2.7/sbin/stop-all.sh

# The next two lines stop Hive; remove them if Hive is not installed
ssh root@node4 'export BASH_ENV=/etc/profile;service mysql stop'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/jdk1.8.0_92/bin/jps | grep RunJar | head -1 | cut -f1 -d " " | xargs kill'

ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/yarn-daemon.sh stop resourcemanager'
/root/hadoop-2.7.2/sbin/stop-all.sh

/root/hadoop-2.7.2/sbin/hadoop-daemon.sh stop zkfc
ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh stop zkfc'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh stop zkfc'
ssh root@node4 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/hadoop-daemon.sh stop zkfc'

/root/zookeeper-3.4.9/bin/zkServer.sh stop
ssh root@node2 'export BASH_ENV=/etc/profile;/root/zookeeper-3.4.9/bin/zkServer.sh stop'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/zookeeper-3.4.9/bin/zkServer.sh stop'

/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh stop historyserver
ssh root@node2 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh stop historyserver'
ssh root@node3 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh stop historyserver'
ssh root@node4 'export BASH_ENV=/etc/profile;/root/hadoop-2.7.2/sbin/mr-jobhistory-daemon.sh stop historyserver'

[root@node1 ~]# chmod 777 starthadoop.sh stophadoop.sh

8.19.2              Reboot and Shutdown

[root@node1 ~]# vi /root/reboot.sh 

ssh root@node2 "export PATH=/usr/bin:$PATH;reboot"
ssh root@node3 "export PATH=/usr/bin:$PATH;reboot"
ssh root@node4 "export PATH=/usr/bin:$PATH;reboot"
reboot

[root@node1 ~]# vi /root/shutdown.sh

ssh root@node2 "export PATH=/usr/bin:$PATH;shutdown -h now"
ssh root@node3 "export PATH=/usr/bin:$PATH;shutdown -h now"
ssh root@node4 "export PATH=/usr/bin:$PATH;shutdown -h now"

shutdown -h now

[root@node1 ~]# chmod 777 /root/shutdown.sh /root/reboot.sh

8.20   Eclipse Plugin

8.20.1              Plugin Installation

1、  Extract hadoop-2.7.2.tar.gz (the CentOS build compiled earlier) to D:/hadoop, copy winutils.exe and hadoop.dll into the hadoop installation's bin folder, and also copy hadoop.dll into C:/Windows and C:/Windows/System32.

2、  Add a HADOOP_HOME environment variable with the value D:/hadoop/hadoop-2.7.2 and append %HADOOP_HOME%/bin to the Path environment variable.

3、  Double-click winutils.exe; if it complains that MSVCR120.dll is missing, install the VC++ 2013 runtime components.

4、  Copy hadoop-eclipse-plugin-2.7.2.jar (this plugin also has to be compiled on Windows, which is painful; better to find a prebuilt one) into the Eclipse plugins directory.

5、  Start Eclipse and configure the Map/Reduce location:

  • Map/ReduceV2 Master: this port does not matter and does not affect remote job submission and execution. If configured correctly, Eclipse can monitor job execution directly (I never got this working here; it did work on hadoop 1.2.1).

  • DFS Master: the NameNode IP and port, i.e. the dfs.namenode.rpc-address port from hdfs-site.xml. This setting determines whether the DFS tree on the left can connect to HDFS.

8.20.2              WordCount Project

8.20.2.1         WordCount.java

package jzj;

import java.io.IOException;
import java.net.URI;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.log4j.Logger;

public class WordCount {

	public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

		private final static IntWritable one = new IntWritable(1);
		private Text word = new Text();
		private Logger log = Logger.getLogger(TokenizerMapper.class);

		public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
			log.debug("[Thread=" + Thread.currentThread().hashCode() + "] map task (log4j): wordcount: key=" + key + ", value=" + value);
			System.out.println("[Thread=" + Thread.currentThread().hashCode() + "] map task (System.out): wordcount: key=" + key + ", value=" + value);
			StringTokenizer itr = new StringTokenizer(value.toString());
			while (itr.hasMoreTokens()) {
				word.set(itr.nextToken());
				context.write(word, one);
			}
		}
	}

	public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
		private IntWritable result = new IntWritable();
		private Logger log = Logger.getLogger(IntSumReducer.class);

		public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
			int sum = 0;
			for (IntWritable val : values) {
				sum += val.get();
			}
			result.set(sum);
			context.write(key, result);
			log.debug("[Thread=" + Thread.currentThread().hashCode() + "] reduce task (log4j): wordcount: key=" + key + ", count=" + sum);
			System.out.println("[Thread=" + Thread.currentThread().hashCode() + "] reduce task (System.out): wordcount: key=" + key + ", count=" + sum);
		}
	}

	public static void main(String[] args) throws Exception {
		Logger log = Logger.getLogger(WordCount.class);
		log.debug("JOB main method (log4j): wordcount");
		System.out.println("JOB main method (System.out): wordcount");
		Configuration conf = new Configuration();
		// Note: the job jar needs an empty yarn-default.xml, otherwise a remotely submitted job waits forever. Why?
		conf.set("mapreduce.framework.name", "yarn"); // run on the YARN framework
		conf.set("yarn.resourcemanager.address", "node1:8032"); // which machine the job is submitted to
		// Required, otherwise: java.io.IOException: The ownership on the staging
		// directory /tmp/hadoop-yarn/staging/15040078/.staging
		// is not as expected. It is owned by . The directory must be owned by
		// the submitter 15040078 or by 15040078
		conf.set("fs.defaultFS", "hdfs://node1:8020"); // specify the namenode
		// Required, otherwise: Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0:
		// fg: no job control
		conf.set("mapreduce.app-submission.cross-platform", "true");

		// Do not change the key "mapred.jar"; the value is the jar exported from this project.
		// Without it the job classes cannot be found at runtime.
		conf.set("mapred.jar", "wordcount.jar");

		Job job = Job.getInstance(conf, "wordcount");
		job.setJarByClass(WordCount.class);
		job.setMapperClass(TokenizerMapper.class);
		// If a Combiner is set, reduce logs also appear on the map side: after the map finishes,
		// the same node runs the combine (reduce) code as well, so reduce log lines on map tasks are expected.
		// job.setCombinerClass(IntSumReducer.class);
		job.setReducerClass(IntSumReducer.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		// job.setNumReduceTasks(4);
		FileInputFormat.addInputPath(job, new Path("hdfs://node1/hadoop/core-site.xml"));
		FileInputFormat.addInputPath(job, new Path("hdfs://node1/hadoop/m*"));

		FileSystem fs = FileSystem.get(URI.create("hdfs://node1"), conf);
		fs.delete(new Path("/wordcountOutput"), true);

		FileOutputFormat.setOutputPath(job, new Path("hdfs://node1/wordcountOutput"));

		System.exit(job.waitForCompletion(true) ? 0 : 1);
		System.out.println(job.getStatus().getJobID());
	}
}

 

8.20.2.2      yarn-default.xml

Note: the yarn-default.xml in the project is an empty file, but testing shows it must be present.

8.20.2.3      build.xml

<project default="jar" name="Acid">
	<property name="lib.dir" value="D:/hadoop/hadoop-2.7.2/share/hadoop" />
	<property name="src.dir" value="../src" />
	<property name="classes.dir" value="../bin" />

	<property name="output.dir" value=".." />
	<property name="jarname" value="wordcount.jar" />
	<property name="mainclass" value="jzj.WordCount" />

	<!-- classpath for third-party jars -->
	<path id="lib-classpath">
		<fileset dir="${lib.dir}">
			<include name="**/*.jar" />
		</fileset>
	</path>

	<!-- 1. initialization: create directories, clean old output -->
	<target name="init">
		<mkdir dir="${classes.dir}" />
		<mkdir dir="${output.dir}" />
		<delete file="${output.dir}/wordcount.jar" />
		<delete verbose="true" includeemptydirs="true">
			<fileset dir="${classes.dir}">
				<include name="**/*" />
			</fileset>
		</delete>
	</target>

	<!-- 2. compile -->
	<target name="compile" depends="init">
		<javac srcdir="${src.dir}" destdir="${classes.dir}" includeantruntime="on">
			<compilerarg line="-encoding GBK" />
			<classpath refid="lib-classpath" />
		</javac>
	</target>

	<!-- 3. build the jar -->
	<target name="jar" depends="compile">
		<copy todir="${classes.dir}">
			<fileset dir="${src.dir}">
				<include name="**" />
				<exclude name="build.xml" />
				<!-- Note: do not exclude log4j.properties; it must be packaged too, otherwise no logs
				     are shown at runtime. That log4j configuration only affects the JOB, i.e. logs are
				     produced on the client that submits the job, while the TASKs (Map/Reduce) are
				     governed by /root/hadoop-2.7.2/etc/hadoop/log4j.properties -->
				<!-- exclude name="log4j.properties" / -->
			</fileset>
		</copy>
		<!-- output path of the jar -->
		<jar destfile="${output.dir}/${jarname}" basedir="${classes.dir}">
			<manifest>
				<attribute name="Main-class" value="${mainclass}" />
			</manifest>
		</jar>
	</target>
</project>

8.20.2.4      log4j.properties

log4j.rootLogger=info,stdout,R 

log4j.appender.stdout=org.apache.log4j.ConsoleAppender 

log4j.appender.stdout.layout=org.apache.log4j.PatternLayout 

log4j.appender.stdout.layout.ConversionPattern=%5p%m%n 

log4j.appender.R=org.apache.log4j.RollingFileAppender 

log4j.appender.R.File=mapreduce_test.log 

log4j.appender.R.MaxFileSize=1MB 

log4j.appender.R.MaxBackupIndex=1

log4j.appender.R.layout=org.apache.log4j.PatternLayout 

log4j.appender.R.layout.ConversionPattern=%p%t%c%m%n 

 

log4j.logger.jzj =DEBUG

8.20.3              Build and Run

Open build.xml in the project and press SHIFT+ALT+X, Q to build the job jar inside the project.

Then open the WordCount.java source file in the project and run it to submit the job.

8.20.4              Permissions

If the following exception is thrown at run time:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=15040078, access=EXECUTE, inode="/tmp/hadoop-yarn/staging/15040078/.staging/job_1484039063795_0001":root:supergroup:drwxrwx---

       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)

       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)

       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)

       at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)

       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720)

       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1704)

       at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1673)

       at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setPermission(FSDirAttrOp.java:61)

       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1653)

       at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:695)

       at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setPermission(ClientNamenodeProtocolServerSideTranslatorPB.java:453)

       at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)

       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)

       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)

       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)

       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)

       at java.security.AccessController.doPrivileged(Native Method)

       at javax.security.auth.Subject.doAs(Subject.java:422)

       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)

       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

fix it by relaxing the HDFS permissions:

[root@node1 ~]# /root/hadoop-2.7.2/bin/hdfs dfs -chmod -R 777 /

8.21   Killing a Job

If a submitted job gets stuck and makes no progress, it can be killed:

[root@node1 ~]# /root/hadoop-2.7.2/bin/hadoop job -list

[root@node1 ~]# /root/hadoop-2.7.2/bin/hadoop job -kill job_1475762778825_0008
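Equivalently, on YARN the application can be listed and killed with the yarn CLI (the application id below is hypothetical; it is the job id with the job_ prefix replaced by application_):

[root@node1 ~]# /root/hadoop-2.7.2/bin/yarn application -list
[root@node1 ~]# /root/hadoop-2.7.2/bin/yarn application -kill application_1475762778825_0008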

8.22   Logs

8.22.1              Hadoop Service Logs

The logs written by the built-in services (NameNode, SecondaryNameNode, HistoryServer, ResourceManager, DataNode, NodeManager, etc.) are stored under ${HADOOP_HOME}/logs by default, and can also be viewed through the web UI:

IT虾米网

These web pages correspond to local log files on each host; logging into the host shows the raw files.

When a log file reaches a certain size it is rolled over into a new file; the larger the numeric suffix, the older the log. By default only the 20 most recent files are kept. The log location and size are configured in ${HADOOP_HOME}/etc/hadoop/log4j.properties, whose environment variables are set by the other configuration files under ${HADOOP_HOME}/etc/hadoop/.

*.out files: standard output is redirected here.

IT虾米网

IT虾米网

IT虾米网

They can also be reached by clicking through the web UI.

8.22.2              MapReduce Logs

MapReduce logs fall into two groups: job history logs and container logs.

(1) Job history records contain how many map and reduce tasks a job used, the submission time, start time, completion time, and so on. They are useful for analysis: how many jobs succeeded or failed per day, how many jobs ran in each queue, and so on. The history location is configured by the mapreduce.jobhistory.* settings shown earlier in mapred-site.xml.

Note: this class of logs is stored on HDFS.

(2) Container logs: these include the ApplicationMaster log and the ordinary task logs.

YARN provides two places to keep container logs:

1)         HDFS: if log aggregation is enabled (yarn.log-aggregation-enable), container logs are copied to HDFS and the local copies are deleted. The HDFS location is set by yarn.nodemanager.remote-app-log-dir in yarn-site.xml and defaults to /tmp/logs:

<property>

    <description>Where to aggregate logs to.</description>

    <name>yarn.nodemanager.remote-app-log-dir</name>

    <value>/tmp/logs</value>

  </property>

The subdirectory layout under /tmp/logs is configured by:

<property>

    <description>The remote log dir will be created at {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam}

    </description>

    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>

    <value>logs</value>

  </property>

By default these logs are kept under ${HADOOP_HOME}/logs/userlogs.

This location can be changed with the yarn.nodemanager.log-dirs setting.

2)         Local: when log aggregation is disabled (yarn.log-aggregation-enable=false), the logs stay on the machine that ran the task, under $HADOOP_HOME/logs/userlogs, and are not moved to HDFS after the job finishes.

Clicking through IT虾米网 shows the logs of running and completed jobs.

Clicking the corresponding links shows the log of each map and reduce task.
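When log aggregation is enabled, the aggregated container logs can also be fetched from the command line with the standard yarn logs command (the application id here is hypothetical):

[root@node1 ~]# /root/hadoop-2.7.2/bin/yarn logs -applicationId application_1484039063795_0001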

8.22.3              System.out

System.out in the job's main method is printed on the terminal of the node that submits the job. If the job is submitted remotely from Eclipse, it appears in the Eclipse console.

If the job is submitted on a server, the output appears on the terminal of whichever node the job was started from.

System.out inside the Map or Reduce classes goes to files under ${HADOOP_HOME}/logs/userlogs on the task nodes (if log aggregation is enabled, they are moved to HDFS once the task finishes, so look before the job completes).

These logs can also be viewed through the IT虾米网 web pages.

8.22.4              log4j

When launched from Eclipse:

Logging from the job-submission code (the main method) and the job-client output shown in the Eclipse console are controlled by the log4j.properties packaged inside the job jar.

Because that log4j.properties configures a Console appender, the output is printed directly in the Eclipse console.

Besides the main method's own log lines, a large amount of framework logging produced while the job runs is also emitted by log4j; all of it (the main method output plus the framework output) is written to mapreduce_test.log.

When the job is submitted on the server, the relevant configuration file is /root/hadoop-2.7.2/etc/hadoop/log4j.properties.

The log level of the MapReduce tasks themselves is configured in mapred-site.xml; the defaults are:

<property>

  <name>mapreduce.map.log.level</name>

  <value>INFO</value>

  <description>The logging level for the map task. The allowed levels are:

  OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE and ALL.

  The setting here could be overridden if "mapreduce.job.log4j-properties-file"

  is set.

  </description>

</property>

 

<property>

  <name>mapreduce.reduce.log.level</name>

  <value>INFO</value>

  <description>The logging level for the reduce task. The allowed levels are:

  OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE and ALL.

  The setting here could be overridden if "mapreduce.job.log4j-properties-file"

  is set.

  </description>

</property>

 

MapReduce类中的log4j输出日志会直接输入到${HADOOP_HOME}/logs/userlogs目录下的相应文件中(如果日志聚合服务被开启的话,则任务执行完后会移到HDFS中去存储),而不是/root/hadoop-2.7.2/etc/hadoop/log4j.properties中配的日志文件(该配置文件所指定的默认名为hadoop.log,但一直都没找到过!?):

CentOS7+Hadoop2.7.2(HA高可用+Federation联邦)+Hive1.2.1+Spark2.1.0 完全分布式集群安装详解大数据

注:如果这里设置了Combiner,则Map端与会有reduce日志,原因设置了Combiner后,Map端做完Map后,会继续运行reduce任务,所以在Map端也会看到reduce任务日志就不奇怪了

9                      MySQL

1、Download the MySQL repo package:

[root@node4 ~]# wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

2、Install mysql-community-release-el7-5.noarch.rpm:

[root@node4 ~]# rpm -ivh mysql-community-release-el7-5.noarch.rpm

Installing this package provides two MySQL yum repos: /etc/yum.repos.d/mysql-community.repo and /etc/yum.repos.d/mysql-community-source.repo.

3、Install MySQL:

[root@node4 ~]# yum install mysql-server

 

4、Start the database:

[root@node4 /root]# service mysql start

5、Set the root password:

[root@node4 /root]# mysqladmin -u root password 'AAAaaa111'

6、Enable remote access. For security, only local login is allowed by default and other IPs are blocked:

[root@node4 /root]# mysql -h localhost -u root -p

Enter password: AAAaaa111

mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'AAAaaa111' WITH GRANT OPTION;

mysql> flush privileges;

7、Check the database character set:

mysql> show variables like 'character%';

8、Change the character set:

[root@node4 /root]# vi /etc/my.cnf

[client]

default-character-set=utf8

[mysql]

default-character-set=utf8

[mysqld]

character-set-server=utf8

9、Case sensitivity: make table names case-insensitive:

[root@node4 /root]# vi /etc/my.cnf

[mysqld]

lower_case_table_names = 1

(0 = case-sensitive, 1 = case-insensitive)

10、     Restart the service:

[root@node4 /root]# service mysql stop

[root@node4 /root]# service mysql start

11、     Log in again: [root@node4 /root]# mysql -h localhost -u root -p

12、     Check the character set again after the change:

mysql> show variables like 'character%';

13、     Create the database:

mysql> create database hive;

14、     List the databases:

mysql> show databases;

15、     Use the database:

mysql> use hive;

16、     List the tables in the database:

mysql> show tables;

17、     Quit:

mysql> exit;

10               HIVE Installation

10.1   Three Installation Modes

Basic concept: the metastore consists of two parts, the service process and the data storage.

See the figure on page 374 of "Hadoop: The Definitive Guide, 2nd edition":

Figure: Hive metastore deployment modes

1. The top of the figure is embedded mode: the hive service and the metastore service run in the same process, and the Derby database also runs in that process. This mode needs no special configuration.

2. The middle is local mode: the hive service and the metastore service still run in the same process, but MySQL runs as a separate process, on the same machine or a remote one. This mode only requires pointing ConnectionURL in hive-site.xml at MySQL and configuring the driver class, user name, and password.

Figure: Hive local metastore

3. The bottom is remote mode: the hive service and the metastore service run in different processes, possibly on different machines. This mode requires setting hive.metastore.local to false and hive.metastore.uris to the metastore server URI(s), comma-separated if there are several. The URI format is thrift://host:port (Thrift is Hive's communication protocol).

<property>
<name>hive.metastore.uris</name>
<value>thrift://127.0.0.1:9083</value>
</property>

With this understood, it becomes clear that merely connecting to a remote MySQL does not make an installation "remote mode"; "remote" refers to whether the metastore service and the hive service run in the same process, i.e. how far apart the metastore and the hive service are.
10.2   Remote-Mode Installation

Install Hive on node1 and the metastore service on node3:

1、  Download: IT虾米网

The Hadoop version is 2.7.2, so download apache-hive-1.2.1-bin.tar.gz:

[root@node1 ~]# wget http://apache.fayea.com/hive/stable/apache-hive-1.2.1-bin.tar.gz

2、  [root@node1 ~]# tar -zxvf apache-hive-1.2.1-bin.tar.gz

3、  [root@node1 ~]# mv apache-hive-1.2.1-bin hive-1.2.1

4、  [root@node1 ~]# vi /etc/profile

export HIVE_HOME=/root/hive-1.2.1

export PATH=.:$PATH:$JAVA_HOME/bin:$HIVE_HOME/bin

5、  [root@node1 ~]# source /etc/profile

6、  Place the MySQL JDBC driver jar (mysql-connector-java-…-bin.jar) under /root/hive-1.2.1/lib/.

7、  [root@node1 ~]# cp /root/hive-1.2.1/conf/hive-env.sh.template /root/hive-1.2.1/conf/hive-env.sh

8、  [root@node1 ~]# vi /root/hive-1.2.1/conf/hive-env.sh
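The hive-env.sh change was shown only as a screenshot; presumably it points Hive at the Hadoop installation (an assumption):

HADOOP_HOME=/root/hadoop-2.7.2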


After the steps above, Hive should start with the default configuration (using the embedded Derby database). Note: Hadoop must be running before Hive is started:

[root@node1 ~]# hive

Logging initialized using configuration in jar:file:/root/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties

hive>

9、  Copy Hive from node1 to node3:

[root@node1 ~]# scp -r /root/hive-1.2.1 node3:/root

[root@node1 ~]# scp /etc/profile node3:/etc/profile

[root@node3 ~]# source /etc/profile

 

10、              [root@node1 ~]# vi /root/hive-1.2.1/conf/hive-site.xml

<configuration>

<property>

<name>hive.metastore.uris</name>

<value>thrift://node3:9083</value>

</property>   

</configuration>

 

11、              [root@node3 ~]# vi /root/hive-1.2.1/conf/hive-site.xml

<configuration>

    <property>

      <name>hive.metastore.warehouse.dir</name>

      <value>/user/hive/warehouse</value>

    </property>

 

    <property>

      <name>javax.jdo.option.ConnectionURL</name>

      <value>jdbc:mysql://node4:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value>

    </property>

 

    <property>

      <name>javax.jdo.option.ConnectionDriverName</name>

      <value>com.mysql.jdbc.Driver</value>

    </property>

 

    <property>

      <name>javax.jdo.option.ConnectionUserName</name>

      <value>root</value>

    </property>

 

    <property>

      <name>javax.jdo.option.ConnectionPassword</name>

      <value>AAAaaa111</value>

    </property>

</configuration>

 

12、Start the metastore service:

[root@node3 ~]# hive --service metastore &

[1] 2561

Starting Hive Metastore Server

[root@hadoop-slave1 /root]# jps

2561 RunJar

The trailing & runs the metastore service in the background.
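To confirm the metastore is listening on its default port 9083 (netstat is provided by the net-tools package installed earlier for ifconfig):

[root@node3 ~]# netstat -tlnp | grep 9083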

 

13、Start Hive Server

[root@node1 ~]# hive --service hiveserver2 &

[1] 3310

[root@hadoop-master /root]# jps

3310 RunJar

The process name is also RunJar.

Note: do not start the service with hive --service hiveserver, otherwise it throws an exception:

Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hive.service.HiveServer

When you start the shell with the hive command, a HiveServer is effectively started along with it, so in remote mode you only need to start the metastore separately and can then use the shell normally. This step can therefore be skipped; just run hive to enter the shell. (If you do run HiveServer2, clients can connect to it as in the beeline sketch below.)
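A minimal client-side check with beeline (shipped with Hive 1.2.1), assuming HiveServer2 is running on its default port 10000 on node1:

[root@node1 ~]# beeline -u jdbc:hive2://node1:10000 -n root
0: jdbc:hive2://node1:10000> show tables;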

14、Start the Hive command line

[root@hadoop-master /root]# hive

Logging initialized using configuration in jar:file:/root/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties

hive>

Note: starting hive also starts a HiveServer along the way, so there is no need to run hive --service hiveserver2 & at all.

 

15、Verify Hive

[root@hadoop-master /root]# hive

 

Logging initialized using configuration in jar:file:/root/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties

hive> show tables;

OK

Time taken: 1.011 seconds

hive> create table test(id int,name string);

One of the following two exceptions may appear:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)

 

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes

com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes

 

This is caused by the database character set; log in to MySQL and change it:

[root@node4 /root]# mysql -h localhost -u root -p

mysql> alter database hive character set latin1;
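The change can be verified afterwards (a quick check, not part of the original write-up):

mysql> show create database hive;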

 

16、Log in to MySQL and inspect the metastore tables

mysql> use hive;

(screenshot: the metastore tables created in the hive database)

17、Check the warehouse directory in HDFS

[root@node1 ~]# hadoop-2.7.2/bin/hdfs dfs -ls /user/hive/warehouse

Found 1 items

drwxr-xr-x   - root supergroup          0 2017-01-22 23:45 /user/hive/warehouse/test
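As an optional smoke test (not part of the original steps), a row can be written and read back through the remote metastore; Hive 1.2 supports INSERT ... VALUES, which runs a small MapReduce job, so YARN must be up:

hive> insert into table test values (1, 'hadoop');
hive> select * from test;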

11               Scala Installation

1、    [root@node1 ~]# wget -O /root/scala-2.12.1.tgz IT虾米网

2、    [root@node1 ~]# tar -zxvf /root/scala-2.12.1.tgz

3、    [root@node1 ~]# vi /etc/profile

export SCALA_HOME=/root/scala-2.12.1

export PATH=.:$PATH:$JAVA_HOME/bin:$HIVE_HOME/bin:$SCALA_HOME/bin

4、    [root@node1 ~]# source /etc/profile

5、    [root@node1 ~]# scala -version    

Scala code runner version 2.12.1 -- Copyright 2002-2016, LAMP/EPFL and Lightbend, Inc.

 

[root@node1 ~]# scala

Welcome to Scala 2.12.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_92).

Type in expressions for evaluation. Or try :help.

 

scala> 9*9;

res0: Int = 81

 

scala>

6、    [root@node1 ~]# scp -r /root/scala-2.12.1 node2:/root

[root@node1 ~]# scp -r /root/scala-2.12.1 node3:/root

[root@node1 ~]# scp -r /root/scala-2.12.1 node4:/root

[root@node1 ~]# scp /etc/profile node2:/etc

[root@node1 ~]# scp /etc/profile node3:/etc

[root@node1 ~]# scp /etc/profile node4:/etc

[root@node2 ~]# source /etc/profile

[root@node3 ~]# source /etc/profile

[root@node4 ~]# source /etc/profile

12               Spark Installation

1、    [root@node1 ~]# wget -O /root/spark-2.1.0-bin-hadoop2.7.tgz IT虾米网

2、    [root@node1 ~]# tar -zxvf /root/spark-2.1.0-bin-hadoop2.7.tgz

3、    [root@node1 ~]# vi /etc/profile

export SPARK_HOME=/root/spark-2.1.0-bin-hadoop2.7

export PATH=.:$PATH:$JAVA_HOME/bin:$HIVE_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin

4、    [root@node1 ~]# source /etc/profile

5、    [root@node1 ~]# cp /root/spark-2.1.0-bin-hadoop2.7/conf/spark-env.sh.template /root/spark-2.1.0-bin-hadoop2.7/conf/spark-env.sh

6、    [root@node1 ~]# vi /root/spark-2.1.0-bin-hadoop2.7/conf/spark-env.sh

export SCALA_HOME=/root/scala-2.12.1

export JAVA_HOME=/root/jdk1.8.0_92

export HADOOP_CONF_DIR=/root/hadoop-2.7.2/etc/hadoop

7、    [root@node1 ~]# cp /root/spark-2.1.0-bin-hadoop2.7/conf/slaves.template /root/spark-2.1.0-bin-hadoop2.7/conf/slaves

8、    [root@node1 ~]# vi /root/spark-2.1.0-bin-hadoop2.7/conf/slaves

node2
node3
node4

9、    [root@node1 ~]# scp -r /root/spark-2.1.0-bin-hadoop2.7 node2:/root

[root@node1 ~]# scp -r /root/spark-2.1.0-bin-hadoop2.7 node3:/root

[root@node1 ~]# scp -r /root/spark-2.1.0-bin-hadoop2.7 node4:/root

[root@node1 ~]# scp /etc/profile node2:/etc

[root@node1 ~]# scp /etc/profile node3:/etc

[root@node1 ~]# scp /etc/profile node4:/etc

[root@node2 ~]# source /etc/profile

[root@node3 ~]# source /etc/profile

[root@node4 ~]# source /etc/profile

10、    [root@node1 conf]# /root/spark-2.1.0-bin-hadoop2.7/sbin/start-all.sh


[root@node1 ~]# jps

2569 Master

[root@node2 ~]# jps

2120 Worker

 [root@node3 ~]# jps

2121 Worker

[root@node4 ~]# jps

2198 Worker

12.1 Testing

Test directly in the Spark shell:

[root@node1 conf]# spark-shell


val file=sc.textFile("hdfs://node1/hadoop/core-site.xml")

val rdd = file.flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_)

rdd.collect()

rdd.foreach(println)
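To persist the word counts instead of just printing them, the RDD can also be written back to HDFS (a small addition to the original example; the target directory must not already exist):

rdd.saveAsTextFile("hdfs://node1/output/wordcount-shell")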

Submit the WordCount example that ships with Hadoop through spark-submit:

[root@node1 ~]# spark-submit --master spark://node1:7077 --class org.apache.hadoop.examples.WordCount --name wordcount /root/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar hdfs://node1/hadoop/core-site.xml hdfs://node1/output

However, this is still submitted as a MapReduce job rather than a Spark job: the example jar is written in plain Java and never uses Spark.

Test with the WordCount example that ships with Spark:

spark-submit --master spark://node1:7077 --class org.apache.spark.examples.JavaWordCount --name wordcount /root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar hdfs://node1/hadoop/core-site.xml hdfs://node1/output

This example is also written in Java, but it is implemented with the Spark API, so it produces a real Spark job:


12.2 Hive Startup Issue

Fix for Hive failing to start against Spark 2.0.0+ with "unable to access ../lib/spark-assembly-*.jar: No such file or directory":


[root@node1 ~]# vi /root/hive-1.2.1/bin/hive

  #sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`

  sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`

[root@node1 ~]# scp /root/hive-1.2.1/bin/hive node3:/root/hive-1.2.1/bin

13               Cleanup and Disk Compaction

yum keeps downloaded packages and headers in its cache and never deletes them automatically. Clear the yum cache:

[root@node1 ~]# yum clean all

[root@node1 ~]# dd if=/dev/zero of=/0bits bs=20M       # fill the free space with zeros; dd eventually stops with a "no space left on device" error, which can be ignored

[root@node1 ~]# rm /0bits                              # remove the zero-filled file afterwards

Shut down the virtual machine, open cmd on the host, cd into your VMware installation directory (e.g. D:/BOE4), and run the following against the VM's main vmdk file (not one of the split child files):

vmware-vdiskmanager -k D:/hadoop/spark/VM/node1/node1.vmdk

14               Common Hadoop 2.x Ports

Component | Node | Default port | Configuration | Purpose
HDFS | DataNode | 50010 | dfs.datanode.address | DataNode service port, used for data transfer
HDFS | DataNode | 50075 | dfs.datanode.http.address | HTTP service port
HDFS | DataNode | 50475 | dfs.datanode.https.address | HTTPS service port
HDFS | DataNode | 50020 | dfs.datanode.ipc.address | IPC service port
HDFS | NameNode | 50070 | dfs.namenode.http-address | HTTP service port
HDFS | NameNode | 50470 | dfs.namenode.https-address | HTTPS service port
HDFS | NameNode | 8020 | fs.defaultFS | RPC port for client connections, used to fetch filesystem metadata
HDFS | JournalNode | 8485 | dfs.journalnode.rpc-address | RPC service
HDFS | JournalNode | 8480 | dfs.journalnode.http-address | HTTP service
HDFS | ZKFC | 8019 | dfs.ha.zkfc.port | ZooKeeper FailoverController, used for NameNode HA
YARN | ResourceManager | 8032 | yarn.resourcemanager.address | RM applications manager (ASM) port
YARN | ResourceManager | 8030 | yarn.resourcemanager.scheduler.address | Scheduler IPC port
YARN | ResourceManager | 8031 | yarn.resourcemanager.resource-tracker.address | IPC
YARN | ResourceManager | 8033 | yarn.resourcemanager.admin.address | IPC
YARN | ResourceManager | 8088 | yarn.resourcemanager.webapp.address | HTTP service port
YARN | NodeManager | 8040 | yarn.nodemanager.localizer.address | Localizer IPC
YARN | NodeManager | 8042 | yarn.nodemanager.webapp.address | HTTP service port
YARN | NodeManager | 8041 | yarn.nodemanager.address | NM container manager port
YARN | JobHistory Server | 10020 | mapreduce.jobhistory.address | IPC
YARN | JobHistory Server | 19888 | mapreduce.jobhistory.webapp.address | HTTP service port
HBase | Master | 60000 | hbase.master.port | IPC
HBase | Master | 60010 | hbase.master.info.port | HTTP service port
HBase | RegionServer | 60020 | hbase.regionserver.port | IPC
HBase | RegionServer | 60030 | hbase.regionserver.info.port | HTTP service port
HBase | HQuorumPeer | 2181 | hbase.zookeeper.property.clientPort | HBase-managed ZooKeeper mode; not used with a standalone ZooKeeper cluster
HBase | HQuorumPeer | 2888 | hbase.zookeeper.peerport | HBase-managed ZooKeeper mode; not used with a standalone ZooKeeper cluster
HBase | HQuorumPeer | 3888 | hbase.zookeeper.leaderport | HBase-managed ZooKeeper mode; not used with a standalone ZooKeeper cluster
Hive | Metastore | 9083 | export PORT=<port> in /etc/default/hive-metastore changes the default port |
Hive | HiveServer | 10000 | export HIVE_SERVER2_THRIFT_PORT=<port> in /etc/hive/conf/hive-env.sh changes the default port |
ZooKeeper | Server | 2181 | clientPort=<port> in /etc/zookeeper/conf/zoo.cfg | Port that serves client requests
ZooKeeper | Server | 2888 | first trailing port in server.x=[hostname]:nnnnn[:nnnnn] in /etc/zookeeper/conf/zoo.cfg | Used by followers to connect to the leader; only the leader listens on it
ZooKeeper | Server | 3888 | second trailing port in server.x=[hostname]:nnnnn[:nnnnn] in /etc/zookeeper/conf/zoo.cfg | Used for leader election; only needed when electionAlg is 1, 2, or 3 (the default)

15               Linux Commands

Find files larger than 10 MB:

find . -type f -size +10M  -print0 | xargs -0 du -h | sort -nr

List the 20 largest directories; --max-depth limits the traversal depth, and without it all subdirectories are scanned:

du -hm --max-depth=5 / | sort -nr | head -20

find /etc -name '*srm*'   # find all files under /etc whose names contain "srm"

Clear the yum cache

  yum keeps downloaded packages and headers in its cache and does not delete them automatically. If they are taking up too much disk space they can be removed with yum clean: yum clean headers removes the headers, yum clean packages removes the downloaded rpm packages, and yum clean all removes everything (commands below).
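The corresponding commands:

[root@node1 ~]# yum clean headers
[root@node1 ~]# yum clean packages
[root@node1 ~]# yum clean all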

Change the owner recursively

chown -R -v 15040078 /tmp

16               Hadoop Filesystem Commands

[root@node1 ~/hadoop-2.6.0/bin]# ./hdfs dfs -chmod -R 700 /tmp
