HDFS-HA (High Availability) | YARN-HA
Published: 2019-06-09


HDFS-HA

HA (High Availability) means uninterrupted 24/7 service.

A single point of failure means one machine going down takes everything down with it. HDFS-HA exists to remove that single point of failure, and the single point in question is the NameNode.

The Active (primary) NameNode handles reads and writes; the Standby is read-only. Every service that HA depends on must itself be highly available.

There are two options for the shared edits storage: NFS and QJM (the Quorum Journal Manager, which is the mainstream choice).

Use an odd number of JournalNode machines. QJM achieves high availability exactly the way ZooKeeper does: data is kept globally consistent, and the service keeps running as long as more than half of the machines are alive.

QJM is likewise based on the Paxos algorithm; the system tolerates at most (n-1)/2 failures, so a 5-node quorum tolerates 2 failed nodes.
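The tolerance figure is plain majority arithmetic: a quorum of N JournalNodes needs floor(N/2)+1 members alive and therefore tolerates floor((N-1)/2) failures. A quick shell check of the formula (standard quorum math, nothing Hadoop-specific):

    for N in 3 5 7; do
        echo "N=$N: needs $(( N / 2 + 1 )) alive, tolerates $(( (N - 1) / 2 )) failures"
    done
    # N=3: needs 2 alive, tolerates 1 failures
    # N=5: needs 3 alive, tolerates 2 failures
    # N=7: needs 4 alive, tolerates 3 failures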


With this architecture alone, which NameNode is active and which is standby can only be decided manually. There must be exactly one active NameNode! If two NameNodes both become active, the cluster may not even report an error, yet the data of the entire cluster can end up corrupted, which is a severe problem. The two-actives situation is called split-brain.

For the standby to become active, it has to be certain about the state of the current active. That requires a safe and reliable third party, ZooKeeper (essentially a small file system plus a notification mechanism), to mediate between the two sides, which is what makes automatic failover possible.

It is the ZKFC (ZKFailoverController), acting as a ZooKeeper client, that talks to ZooKeeper, not the NameNode itself. HA was introduced in Hadoop 2.0, while the NameNode has existed since 1.0; the ZKFC logic was deliberately kept out of the NameNode process to preserve the NameNode's robustness, since it already ran well before ZKFC existed. NameNode and ZKFC are two separate processes, but they are deployed together on the same host.

How do the two ZKFCs decide which NameNode becomes active at startup? Whichever is faster. The active NameNode is represented by an ephemeral znode in ZooKeeper: each ZKFC checks whether that znode exists; if it does not, the ZKFC creates it and its NameNode becomes active; the slower ZKFC sees the znode already exists and its NameNode becomes standby.
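You can reproduce this race by hand from two zkCli.sh sessions: both try to create the same ephemeral znode, and only the first create succeeds. The /demo-lock path and its data below are made up for illustration; the real lock znode used by ZKFC is shown in the zkCli session near the end of this post:

    [zk: localhost:2181(CONNECTED) 0] create -e /demo-lock nn1        # first session wins the race
    Created /demo-lock
    [zk: localhost:2181(CONNECTED) 0] create -e /demo-lock nn2        # second session finds the znode taken
    Node already exists: /demo-lock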

If the active NameNode hangs (a false death), its ephemeral znode in ZooKeeper is removed and the other NameNode's ZKFC is notified. Before its own NameNode takes over as active, that ZKFC forcibly kills the suspect NameNode to prevent split-brain! If the kill does not succeed, a custom script can even force the machine to power off; only after this fencing succeeds does the standby become active.
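In Hadoop's HA configuration, dfs.ha.fencing.methods takes a newline-separated list of fencing methods that are tried in order, and a shell(...) entry is the hook for such a custom kill-or-power-off script. A sketch, where /opt/module/ha/fence.sh stands for a hypothetical script of your own:

    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence
    shell(/opt/module/ha/fence.sh)</value>
    </property>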

In HA mode, merging the edit log into the fsimage (checkpointing) is done by the standby NameNode; there is no SecondaryNameNode.

 

Create an ha directory under /opt/module:
    mkdir ha
Copy hadoop-2.7.2 from /opt/module/ into /opt/module/ha/:
    cp -r hadoop-2.7.2/ /opt/module/ha/
Then delete the data and logs directories and similar leftovers.

Configure core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/ha/hadoop-2.7.2/data/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop101:2181,hadoop102:2181,hadoop103:2181</value>
    </property>
</configuration>

Configure hdfs-site.xml

<configuration>
    <!-- logical name of the nameservice -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- the two NameNodes in the nameservice -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop101:9000</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop102:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop101:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop102:50070</value>
    </property>
    <!-- shared edits directory on the JournalNode quorum -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop101:8485;hadoop102:8485;hadoop103:8485/mycluster</value>
    </property>
    <!-- fencing method used during failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/kris/.ssh/id_rsa</value>
    </property>
    <!-- local storage for JournalNode edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/module/ha/hadoop-2.7.2/data/jn</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- proxy provider clients use to find the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>
Distribute it to the other machines: xsync ha
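xsync, used here and again later, is a custom distribution script from this cluster setup, not a Hadoop command. A minimal sketch of what it does, assuming passwordless ssh to hadoop102 and hadoop103 and rsync installed on every node:

    #!/bin/bash
    # xsync sketch: rsync the given file/directory to the same path on the other nodes
    dir=$(cd -P "$(dirname "$1")" && pwd)    # absolute directory of the target
    name=$(basename "$1")
    for host in hadoop102 hadoop103; do
        rsync -av "$dir/$name" "$host:$dir"
    done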
Start the HDFS-HA cluster
1. On each JournalNode machine, start the journalnode service:

       sbin/hadoop-daemons.sh start journalnode

   (The extra "s" in hadoop-daemons.sh starts it on all 3 machines at once. All JournalNodes must be running before the NameNode can be formatted, and you may format only once!)

2. On [nn1], format the NameNode and start it:

       bin/hdfs namenode -format

   19/02/13 02:15:00 INFO util.GSet: Computing capacity for map NameNodeRetryCache
   19/02/13 02:15:00 INFO util.GSet: VM type = 64-bit
   19/02/13 02:15:00 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
   19/02/13 02:15:00 INFO util.GSet: capacity = 2^15 = 32768 entries
   19/02/13 02:15:01 INFO namenode.FSImage: Allocated new BlockPoolId: BP-26035536-192.168.1.101-1549995301800
   19/02/13 02:15:01 INFO common.Storage: Storage directory /opt/module/ha/hadoop-2.7.2/data/tmp/dfs/name has been successfully formatted.
   19/02/13 02:15:02 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
   19/02/13 02:15:02 INFO util.ExitUtil: Exiting with status 0
   19/02/13 02:15:02 INFO namenode.NameNode: SHUTDOWN_MSG:
   /************************************************************
   SHUTDOWN_MSG: Shutting down NameNode at hadoop101/192.168.1.101
   ************************************************************/

   Start the namenode:

       sbin/hadoop-daemon.sh start namenode

3. On [nn2], sync nn1's metadata:

       bin/hdfs namenode -bootstrapStandby

   ......
   STARTUP_MSG: build = Unknown -r Unknown; compiled by 'root' on 2017-05-22T10:49Z
   STARTUP_MSG: java = 1.8.0_144
   ************************************************************/
   19/02/21 17:56:25 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
   19/02/21 17:56:25 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
   =====================================================
   About to bootstrap Standby ID nn2 from:
   Nameservice ID: mycluster
   Other Namenode ID: nn1
   Other NN's HTTP address: http://hadoop101:50070
   Other NN's IPC address: hadoop101/192.168.1.101:9000
   Namespace ID: 411281390
   Block pool ID: BP-1462258257-192.168.1.101-1550740170734
   Cluster ID: CID-d20dda0d-49d1-48f4-b9e8-2c99b72a15c2
   Layout version: -63
   isUpgradeFinalized: true
   =====================================================
   19/02/13 02:16:51 INFO common.Storage: Storage directory /opt/module/ha/hadoop-2.7.2/data/tmp/dfs/name has been successfully formatted.
   19/02/13 02:16:51 INFO namenode.TransferFsImage: Opening connection to http://hadoop101:50070/imagetransfer?getimage=1&txid=0&storageInfo=-63:1640720426:0:CID-81cbaa0d-6a6f-4932-98ba-ff2a46d87514
   19/02/13 02:16:51 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
   19/02/13 02:16:52 INFO namenode.TransferFsImage: Transfer took 0.02s at 0.00 KB/s
   19/02/13 02:16:52 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 351 bytes.
   19/02/13 02:16:52 INFO util.ExitUtil: Exiting with status 0
   19/02/13 02:16:52 INFO namenode.NameNode: SHUTDOWN_MSG:
   /************************************************************
   SHUTDOWN_MSG: Shutting down NameNode at hadoop102/192.168.1.102
   ************************************************************/

   If you instead see the following prompt, the filesystem was formatted before; it only needs formatting once, so answer N:

   Re-format filesystem in Storage Directory /opt/module/ha/hadoop-2.7.2/data/tmp/dfs/name ? (Y or N) N
   Format aborted in Storage Directory /opt/module/ha/hadoop-2.7.2/data/tmp/dfs/name
   19/02/21 19:06:50 INFO util.ExitUtil: Exiting with status 5    ## the exit status
   19/02/21 19:06:50 INFO namenode.NameNode: SHUTDOWN_MSG:
   /************************************************************
   SHUTDOWN_MSG: Shutting down NameNode at hadoop102/192.168.1.102
   ************************************************************/
4. Start [nn2]:

       sbin/hadoop-daemon.sh start namenode

5. On [nn1], start all DataNodes (hadoop-daemons.sh starts the datanode on all 3 machines):

       sbin/hadoop-daemons.sh start datanode

################ manually switching the NameNode

6. Switch [nn1] to Active:

       bin/hdfs haadmin -transitionToActive nn1

7. Check whether it is Active:

       bin/hdfs haadmin -getServiceState nn1
       http://hadoop101:50070/dfshealth.html#tab-overview
       http://hadoop102:50070/dfshealth.html#tab-overview
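Besides -transitionToActive and -getServiceState, haadmin also offers a -failover subcommand that coordinates both sides of the switch in one step; for example, to hand the active role from nn1 to nn2:

    bin/hdfs haadmin -failover nn1 nn2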

Configure HDFS-HA automatic failover (once configured, you can start everything directly; there is no need to go through the manual startup sequence first)

Once automatic failover is configured, the manual mode is no longer available:

Automatic failover is enabled for NameNode at hadoop102/192.168.1.102:9000
Refusing to manually manage HA state, since it may cause a split-brain scenario or other incorrect state.
If you are very sure you know what you are doing, please specify the --forcemanual flag.
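If you truly must flip the state by hand while automatic failover is enabled, the flag named in the message can be added, at your own risk, for exactly the split-brain reasons described above:

    bin/hdfs haadmin -transitionToActive --forcemanual nn1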


1. Configuration: building on the previous setup, add the following.

(1) In hdfs-site.xml add:

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

(2) In core-site.xml add:

<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop101:2181,hadoop102:2181,hadoop103:2181</value>
</property>
2. Startup

(1) Stop all HDFS services: sbin/stop-dfs.sh
(2) Start the ZooKeeper cluster: bin/zkServer.sh start
(3) Initialize the HA state in ZooKeeper (only once): bin/hdfs zkfc -formatZK

    19/02/13 02:04:02 INFO zookeeper.ClientCnxn: Socket connection established to hadoop102/192.168.1.102:2181, initiating session
    19/02/13 02:04:02 INFO zookeeper.ClientCnxn: Session establishment complete on server hadoop102/192.168.1.102:2181, sessionid = 0x268e2dee6e40000, negotiated timeout = 5000
    19/02/13 02:04:02 INFO ha.ActiveStandbyElector: Session connected.
    19/02/13 02:04:02 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
    19/02/13 02:04:02 INFO zookeeper.ZooKeeper: Session: 0x268e2dee6e40000 closed
    19/02/13 02:04:02 INFO zookeeper.ClientCnxn: EventThread shut down

(4) Start HDFS: sbin/start-dfs.sh

    hadoop101: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-namenode-hadoop101.out
    hadoop102: starting namenode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-namenode-hadoop102.out
    hadoop102: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-datanode-hadoop102.out
    hadoop101: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-datanode-hadoop101.out
    hadoop103: starting datanode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-datanode-hadoop103.out
    Starting journal nodes [hadoop101 hadoop102 hadoop103]
    hadoop103: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-journalnode-hadoop103.out
    hadoop102: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-journalnode-hadoop102.out
    hadoop101: starting journalnode, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-journalnode-hadoop101.out
    Starting ZK Failover Controllers on NN hosts [hadoop101 hadoop102]
    hadoop101: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-zkfc-hadoop101.out
    hadoop102: starting zkfc, logging to /opt/module/ha/hadoop-2.7.2/logs/hadoop-kris-zkfc-hadoop102.out

3. Verification

(1) Kill the Active NameNode process: kill -9 <namenode pid>. The other NameNode takes over. (After being killed it can be started again on its own with sbin/hadoop-daemon.sh start namenode, but it comes back as standby, not as the active it used to be.)
(2) Kill the DFSZKFailoverController process: kill -9 8819. Even though its NameNode has not died, the other NameNode still takes over and this one becomes standby. To bring it back, stop and restart: sbin/stop-dfs.sh then sbin/start-dfs.sh.
(3) Disconnect the Active NameNode's machine from the network: sudo service network stop. After the network is cut, the configured sshfence fencing method keeps trying to log in remotely to kill the NameNode on hadoop101; the kill can only succeed once hadoop101's network is back, and only then does hadoop102 become active.

 

Taking over while the old active is merely cut off from the network makes split-brain very likely: the moment the old active's network comes back, you have two actives. So on a real network partition the failover deliberately does not happen, precisely to prevent split-brain. You can configure it so the standby takes over anyway, but that is dangerous; not taking over on a network partition is actually the safer behavior.
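For reference, the risky take-over-anyway behavior mentioned above is typically configured by appending shell(/bin/true) as a last-resort fencing method, so fencing always reports success even when sshfence cannot reach the dead machine (QJM itself still accepts only one writer):

    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence
    shell(/bin/true)</value>
    </property>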

 

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, hadoop-ha]
[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[mycluster]
[zk: localhost:2181(CONNECTED) 3] ls /hadoop-ha/mycluster
[ActiveBreadCrumb, ActiveStandbyElectorLock]

ActiveStandbyElectorLock is the key election znode: whichever NameNode holds it is the active.

[zk: localhost:2181(CONNECTED) 4] get /hadoop-ha/mycluster/ActiveStandbyElectorLock
myclusternn1hadoop101 (followed by binary data)
cZxid = 0x10000000f
ctime = Wed Feb 13 02:16:22 CST 2019
mZxid = 0x10000000f
mtime = Wed Feb 13 02:16:22 CST 2019
pZxid = 0x10000000f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x268e2dee6e40002    ## non-zero owner: this is an ephemeral node
dataLength = 33
numChildren = 0

 

Processes

hadoop101                   hadoop102                   hadoop103
NameNode                    NameNode
JournalNode                 JournalNode                 JournalNode
DataNode                    DataNode                    DataNode
DFSZKFailoverController     DFSZKFailoverController     ZooKeeperMain
ResourceManager             ResourceManager
NodeManager                 NodeManager                 NodeManager
QuorumPeerMain              QuorumPeerMain              QuorumPeerMain

(QuorumPeerMain is the ZooKeeper server, started with bin/zkServer.sh start; ZooKeeperMain is the ZooKeeper client, started with bin/zkCli.sh.)

DFSZKFailoverController is the central component of the HDFS NameNode HA implementation in Hadoop 2.7.0; it is responsible for the overall failover control. It is a daemon, started via its main() method, that extends ZKFailoverController. The two NameNodes (Active and Standby) share their edit-log data through the JournalNodes.

YARN-HA Configuration

How YARN-HA works

 

YARN integrates natively with ZooKeeper very well, so the configuration is comparatively simple.

Configure the YARN-HA cluster

yarn-site.xml:

<configuration>
    <!-- the auxiliary shuffle service; shuffle determines how reducers fetch their data -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop101</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop102</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop101:2181,hadoop102:2181,hadoop103:2181</value>
    </property>
    <!-- enable recovery of ResourceManager state -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- store the ResourceManager state in ZooKeeper -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

After yarn-site.xml is configured on hadoop101, distribute it to the other machines: xsync etc/

 

Start ZooKeeper on all three machines:

    [kris@hadoop101 ~]$ /opt/module/zookeeper-3.4.10/bin/zkServer.sh start

Start HDFS:

    sbin/start-dfs.sh

Start YARN:

(1) On hadoop101 run: sbin/start-yarn.sh

    [kris@hadoop101 hadoop-2.7.2]$ sbin/start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /opt/module/ha/hadoop-2.7.2/logs/yarn-kris-resourcemanager-hadoop101.out
    hadoop101: starting nodemanager, logging to /opt/module/ha/hadoop-2.7.2/logs/yarn-kris-nodemanager-hadoop101.out
    hadoop103: starting nodemanager, logging to /opt/module/ha/hadoop-2.7.2/logs/yarn-kris-nodemanager-hadoop103.out
    hadoop102: starting nodemanager, logging to /opt/module/ha/hadoop-2.7.2/logs/yarn-kris-nodemanager-hadoop102.out

(2) On hadoop102 run: sbin/yarn-daemon.sh start resourcemanager
    (start-yarn.sh only starts the ResourceManager on the machine it is run on; the one on hadoop102 must be started by hand.)

    [kris@hadoop102 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
    starting resourcemanager, logging to /opt/module/ha/hadoop-2.7.2/logs/yarn-kris-resourcemanager-hadoop102.out

(3) Check the service state: bin/yarn rmadmin -getServiceState rm1    (rm1 is hadoop101; rm2 is hadoop102)

    [kris@hadoop102 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm1
    active
    [kris@hadoop102 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm2
    standby

################ hdfs-ha & yarn-ha

    [kris@hadoop101 hadoop-2.7.2]$ jpsall
    -------hadoop101-------
    14288 ResourceManager
    14402 NodeManager
    13396 NameNode
    14791 Jps
    13114 QuorumPeerMain
    13946 DFSZKFailoverController
    13516 DataNode
    13743 JournalNode
    -------hadoop102-------
    9936 NameNode
    10352 ResourceManager
    9569 JournalNode
    9698 DFSZKFailoverController
    9462 DataNode
    9270 QuorumPeerMain
    10600 Jps
    10202 NodeManager
    -------hadoop103-------
    9073 Jps
    8697 JournalNode
    8522 QuorumPeerMain
    8909 NodeManager
    8590 DataNode
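jpsall, used above, is another custom helper rather than a Hadoop or JDK command. A minimal sketch, assuming passwordless ssh and that jps is on the PATH of non-interactive shells on every node:

    #!/bin/bash
    # jpsall sketch: run jps on every node and label the output by host
    for host in hadoop101 hadoop102 hadoop103; do
        echo "-------$host-------"
        ssh "$host" jps
    done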

 

Web UIs: http://hadoop101:8088/cluster and http://hadoop102:8088/cluster (the standby ResourceManager's UI redirects to hadoop101).

  

 

Reposted from: https://www.cnblogs.com/shengyang17/p/10367519.html
