MHA, MySQL's heavy hitter: installing and configuring master-slave dual-server hot standby
What MHA does:
In a MySQL master-slave replication setup, when the master crashes, MHA automatically promotes the standby to take over as the new master and keep serving, thereby providing high availability.
The lab environment for this walkthrough:
MHA consists of a manager node and data (node) servers. Normally at least three servers are used: two node servers plus one manager server. Because only two servers are available here, the manager is installed on one of the node servers.
Two servers, each with two network interfaces:
IP addresses:
10.2.16.253 10.0.0.1 node1
10.2.16.254 10.0.0.2 node2
Preparation:
1. Edit /etc/hosts on both servers to match the environment, appending the following two lines:
10.0.0.2 node2
10.0.0.1 node1
2. Disable SELinux, iptables, and ip6tables (a command sketch for steps 2 and 3 follows this list).
3. Configure the network interfaces:
eth0 on the 10.2.16.0 network (used for external traffic)
eth1 on the 10.0.0.0 network (used for internal health monitoring)
4. OS in this example: CentOS 6.4 x86_64; MHA version 0.54.
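A minimal sketch of steps 2 and 3 on node1, assuming the stock CentOS 6 tools (the sed pattern and ifcfg values are illustrative; repeat on node2 with 10.0.0.2):
[root@node1 ~]# setenforce 0                                                   # SELinux off for the running system
[root@node1 ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # keep it disabled after reboot
[root@node1 ~]# service iptables stop  && chkconfig iptables off
[root@node1 ~]# service ip6tables stop && chkconfig ip6tables off
[root@node1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1                  # internal monitoring NIC
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.1
NETMASK=255.255.255.0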
Installing the MHA manager:
MHA depends on a pile of Perl modules, and installing them is quite painful. First make sure both the CentOS yum repository and the EPEL repository are available (EPEL 6.8 is used in this example).
Note: in practice some packages exist only in EPEL and some only in the CentOS repo, so the two complement each other; use both repositories, plus CPAN, to install the dependencies.
yum -y install perl* ncftp cpan    (from the CentOS repo)
perl-Log-Dispatch was installed from a downloaded RPM; for the other missing packages, track down and download whatever dependencies are needed:
shell> wget http://downloads.naulinux.ru/pub/NauLinux/6x/i386/sites/School/RPMS/perl-Log-Dispatch-2.27-1.el6.noarch.rpm
shell> wget http://dl.Fedoraproject.org/pub/epel/6/i386/perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
shell> wget http://dl.fedoraproject.org/pub/epel/6/i386/perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
shell> wget http://mirror.centos.org/centos/6/os/x86_64/Packages/perl-Time-HiRes-1.9721-136.el6.x86_64.rpm
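After downloading, the packages still need to be installed; letting yum resolve their remaining dependencies locally is the least painful route (a sketch, assuming the RPMs sit in the current directory):
shell> yum -y localinstall perl-*.rpm    # installs the downloaded RPMs and pulls any missing dependencies from the enabled repos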
Getting the dependencies installed took a lot of effort here as well, so simply install whatever turns out to be missing.
Roughly, these are the packages required: perl-Config-Tiny perl-Params-Validate perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-MIME-Lite* perl-Mail-Send*
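To confirm the underlying Perl modules are actually loadable, perl -M&lt;Module&gt; -e1 exits silently when a module is present and complains when it is not (a quick check, using the module names behind the RPMs listed above):
shell> for m in Config::Tiny Params::Validate Log::Dispatch Parallel::ForkManager DBD::mysql MIME::Lite Mail::Sender; do perl -M"$m" -e1 || echo "missing: $m"; done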
Once the dependencies are installed, the overall procedure for this setup is:
1. Configure passwordless SSH key authentication between the two hosts
2. Set up master-slave replication between mysql-master and mysql-slave
3. Write the mha-manager configuration file
4. Verify SSH access and MySQL replication with masterha_check_ssh and masterha_check_repl
5. Start mha-manager (only after everything above checks out)
I. Configure passwordless SSH login:
[root@node1 ~]# ssh-keygen -t rsa        # press Enter three times (accept the key file location, leave the passphrase empty, confirm it)
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.0.0.2   # copy the public key into the peer's /root/.ssh/authorized_keys
[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.0.0.1
Note: the master's authorized_keys file must also contain its own id_rsa.pub, otherwise the SSH check will fail later (this one took two days to track down):
[root@node1 ~]# cat .ssh/id_rsa.pub >> .ssh/authorized_keys
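A quick way to confirm the keys work in both directions (a simple check over the internal addresses, not part of the original steps):
[root@node1 ~]# ssh root@10.0.0.2 hostname    # should print node2 without asking for a password
[root@node2 ~]# ssh root@10.0.0.1 hostname    # should print node1 without asking for a password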
II. Install MySQL and configure master-slave replication:
For convenience, MySQL is installed via yum:
- [root@node1 ~]# yum -y install mysql*
- [root@node2 ~]# yum -y install mysql*
- node1 (master) configuration file:
- [root@node1 ~]# cat /etc/my.cnf
- [mysqld]
- server-id=1 #must be unique across the replication topology
- log-bin=mysql-bin #prefix for the generated binary log files
- datadir=/var/lib/mysql
- socket=/var/lib/mysql/mysql.sock
- user=mysql
- # Disabling symbolic-links is recommended to prevent assorted security risks
- symbolic-links=0
- [mysqld_safe]
- log-error=/var/log/mysqld.log
- pid-file=/var/run/mysqld/mysqld.pid
node2 (slave) configuration file:
- [root@node2 ~]# cat /etc/my.cnf
- [mysqld]
- server-id=2
- log-bin=mysql-bin
- #binlog_format=mixed
- datadir=/var/lib/mysql
- socket=/var/lib/mysql/mysql.sock
- user=mysql
- # Disabling symbolic-links is recommended to prevent assorted security risks
- symbolic-links=0
- [mysqld_safe]
- log-error=/var/log/mysqld.log
- pid-file=/var/run/mysqld/mysqld.pid
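After editing my.cnf, restart MySQL on both nodes and confirm the server IDs differ (a simple sanity check; add -p if a root password is already set):
[root@node1 ~]# service mysqld restart
[root@node1 ~]# mysql -uroot -e "show variables like 'server_id';"    # expect 1 here and 2 on node2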
Configure the MySQL master node:
Create the users and grant replication and administration privileges:
- mysql>grant replication slave on *.* to 'repl'@'10.0.0.%' identified by '123456';
- mysql>grant all on *.* to 'root'@'10.0.0.2' identified by '123456';
- mysql>grant all on *.* to 'root'@'node1' identified by '123456';
- mysql>flush privileges;
Check the result after the grants:
- mysql> select user,host,password from mysql.user;
- +------+-----------+-------------------------------------------+
- | user | host | password |
- +------+-----------+-------------------------------------------+
- | root | localhost | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | node1 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | 127.0.0.1 | |
- | | localhost | |
- | | node1 | |
- | repl | 10.0.0.% | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | 10.0.0.2 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- +------+-----------+-------------------------------------------+
- 7 rows in set (0.02 sec)
- mysql> show master status;
- +------------------+----------+--------------+------------------+
- | File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
- +------------------+----------+--------------+------------------+
- | mysql-bin.000001 | 106 | | |
- +------------------+----------+--------------+------------------+
- 1 row in set (0.01 sec)
Configure the MySQL slave node:
- mysql> change master to
- master_host='10.0.0.1',
- master_port=3306,
- master_user='repl',
- master_password='123456',
- master_log_file='mysql-bin.000001',
- master_log_pos=106;
Grant the same privileges on the slave as well, since it is the standby master:
- mysql>grant replication slave on *.* to 'repl'@'10.0.0.%' identified by '123456';
- mysql>grant all on *.* to 'root'@'10.0.0.1' identified by '123456';
- mysql>grant all on *.* to 'root'@'node2' identified by '123456';
- mysql>flush privileges;
- mysql> select user,host,password from mysql.user;
- +------+-----------+-------------------------------------------+
- | user | host | password |
- +------+-----------+-------------------------------------------+
- | root | localhost | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | node2 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | 127.0.0.1 | |
- | | localhost | |
- | | node2 | |
- | repl | 10.0.0.% | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- | root | 10.0.0.1 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |
- +------+-----------+-------------------------------------------+
- 7 rows in set (0.00 sec)
Start the slave threads:
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
Check Slave_IO_Running and Slave_SQL_Running in the slave status; when both are Yes, replication is working:
- mysql> show slave status\G
- *************************** 1. row ***************************
- Slave_IO_State: Waiting for master to send event
- Master_Host: 10.0.0.1
- Master_User: repl
- Master_Port: 3306
- Connect_Retry: 60
- Master_Log_File: mysql-bin.000001
- Read_Master_Log_Pos: 361
- Relay_Log_File: mysqld-relay-bin.000001
- Relay_Log_Pos: 326
- Relay_Master_Log_File: mysql-bin.000001
- Slave_IO_Running: Yes
- Slave_SQL_Running: Yes
- Replicate_Do_DB:
- Replicate_Ignore_DB:
- Replicate_Do_Table:
- Replicate_Ignore_Table:
- Replicate_Wild_Do_Table:
- Replicate_Wild_Ignore_Table:
- Last_Errno: 1007
- Last_Error:
- Skip_Counter: 0
- Exec_Master_Log_Pos: 181
- Relay_Log_Space: 807
- Until_Condition: None
- Until_Log_File:
- Until_Log_Pos: 0
- Master_SSL_Allowed: No
- Master_SSL_CA_File:
- Master_SSL_CA_Path:
- Master_SSL_Cert:
- Master_SSL_Cipher:
- Master_SSL_Key:
- Seconds_Behind_Master: NULL
- Master_SSL_Verify_Server_Cert: No
- Last_IO_Errno: 0
- Last_IO_Error:
- Last_SQL_Errno: 1007
- Last_SQL_Error:
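For later spot checks, the same two flags plus the replication lag can be pulled out from the shell in one line (a hypothetical one-liner, assuming the credentials used in this example; add -p&lt;password&gt; if needed):
[root@node2 ~]# mysql -uroot -e "show slave status\G" | egrep "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master"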
III. Install and configure the MHA manager:
- [root@node1 mha4mysql-manager-0.54]# perl Makefile.PL    #the perl/CPAN step is painful and all kinds of errors may pop up; work through each one as it appears
- [root@node1 mha4mysql-manager-0.54]# make && make install
- [root@node1 ~]# mkdir /etc/masterha    #directory for the manager's main configuration
- [root@node1 ~]# mkdir -p /masterha/app1    #directory for the manager's working files and logs
- [root@node1 mha4mysql-manager-0.54]# cp samples/conf/* /etc/masterha/    #copy the sample configuration files
Edit the configuration file app1.cnf; the other file, masterha_default.cnf, does not need to be changed.
- [root@node1 ~]# cat /etc/masterha/app1.cnf
- [server default]
- manager_workdir=/masterha/app1
- manager_log=/masterha/app1/manager.log
- user=root #MySQL administrative user (password below) that the manager logs in with
- password=123456
- ssh_user=root #the user configured earlier for passwordless SSH
- repl_user=repl #the user used for replication
- repl_password=123456
- ping_interval=1 #ping the master's MySQL every second to check its status
- shutdown_script=""
- #master_ip_failover_script="/data/master_ip_failover"
- master_ip_online_change_script=""
- report_script=""
- [server1] #per-node sections
- hostname=10.0.0.1
- master_binlog_dir="/var/lib/mysql/" #location of the binary logs used for replication
- candidate_master=1 #prefer this host as the new master when the current master dies
- #no_master=1 #would prevent this host from ever becoming master
- [server2]
- hostname=10.0.0.2
- master_binlog_dir="/var/lib/mysql/"
- candidate_master=1
Save and exit; the manager configuration is done.
Install the node RPM package on both servers:
[root@node1 ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
[root@node2 ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
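The node package ships the helper scripts that the manager drives over SSH; a quick check that they ended up in $PATH:
[root@node1 ~]# which save_binary_logs apply_diff_relay_logs purge_relay_logs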
Run the SSH check:
- [root@node1 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
- Fri Jun 27 15:28:50 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
- Fri Jun 27 15:28:50 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
- Fri Jun 27 15:28:50 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
- Fri Jun 27 15:28:50 2014 - [info] Starting SSH connection tests..
- Fri Jun 27 15:28:52 2014 - [debug]
- Fri Jun 27 15:28:51 2014 - [debug] Connecting via SSH from root@10.0.0.1(10.0.0.1:22) to root@10.0.0.2(10.0.0.2:22)..
- Fri Jun 27 15:28:52 2014 - [debug] ok.
- Fri Jun 27 15:28:52 2014 - [debug]
- Fri Jun 27 15:28:51 2014 - [debug] Connecting via SSH from root@10.0.0.2(10.0.0.2:22) to root@10.0.0.1(10.0.0.1:22)..
- Fri Jun 27 15:28:52 2014 - [debug] ok.
- Fri Jun 27 15:28:52 2014 - [info] All SSH connection tests passed successfully.
Run the replication check:
[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
The key part of the output shows the current master and slave:
10.2.16.253 (current master)
+--10.2.16.254
If the output ends with the following line, the configuration is correct:
MySQL Replication Health is OK.
Start the MHA manager process:
nohup masterha_manager --conf=/etc/masterha/app1.cnf >/tmp/mha_manager.log 2>&1
Check the manager process status and the current master:
masterha_check_status --conf=/etc/masterha/app1.cnf
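If the monitor ever needs to be shut down cleanly (for instance before a planned switchover), the manager package also ships masterha_stop; a minimal usage sketch:
[root@node1 ~]# masterha_stop --conf=/etc/masterha/app1.cnf    # stops the monitoring process for this application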
***************** The MHA configuration is now complete! ***********************
Simulating a master failure: after MHA has failed over to the standby (254), here is how to switch back:
First remove the failover flag file: rm -f /masterha/app1/app1.failover.complete    #flag file left in the manager workdir after a failover; it must be removed before restarting the manager or attempting another failover
******************* Turn the old master into a slave first, so it catches up on the data it missed while down ***********
On the old master (253), run:
- mysql> reset master;
- Query OK, 0 rows affected (0.04 sec)
- #check the File and Position values (show master status) on the current master (the former standby, 10.0.0.2), then run:
- mysql> change master to master_host='10.0.0.2', master_port=3306, master_user='repl', master_password='123456', master_log_file='mysql-bin.000002', master_log_pos=106;
- Query OK, 0 rows affected (0.05 sec)
- mysql> start slave;    #run the old master as a slave for now
- shell> masterha_master_switch --master_state=alive --conf=/etc/masterha/app1.cnf
- ########################## Master successfully switched back ##########################
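To confirm the roles after the switchover (a simple check, not part of the original steps), the new slave (254) should now replicate from 10.0.0.1, and the replication check should again list 10.2.16.253 as the current master:
[root@node2 ~]# mysql -uroot -e "show slave status\G" | grep Master_Host     # expect Master_Host: 10.0.0.1
[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf            # topology summary should show 10.2.16.253 as current master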
For how to combine Keepalived with MHA so that the IP floats to the new master on failure, giving full MySQL high availability, see the other post on this blog, "Keepalived combined with MHA for MySQL high availability".