每个mysql服务器节点都需要运行mmmd_agent,同时在另外一台机器【可以是独立的一台服务器也可以是和AppServer共同享一个服务器】上运行mmmd_mon
mmm利用了虚拟IP技术,1个网卡可以使用多个IP
所以使用mmm时,需要2*n+1个IP,n为mysql节点数,包括slave和master
当数据库节点fail时,mmmd_mon检测不到mmmd_agent的心跳或者对应的mysql服务器的状态时
mmmd_mon将作出决定并下令给某个正常的数据库节点的mmmd_agent,使得该mmmd_agent“篡位”
【即 使用刚才fail的那个结点的虚拟IP,使得虚拟IP从fail结点指向此时的正常机器】
MMMM需要5个IP,两个节点使用固定IP,两个程式读IP(只读),1个 程式写IP(用来更新)
后面这三个虚拟IP是根据节点的可用性在节点之间实现跳转的
一。IP分配
IP分配如下:
A :mysql master 246
固定IP:211.100.97.246
程式读IP:211.100.97.244 (虚拟)
B:mysql master 250
固定IP:211.100.97.250
程式读IP:211.100.97.243 (虚拟)
monitor 245
程式写IP:211.100.97.248 (虚拟)
ifconfig eth1:1 211.100.97.244 netmask 255.255.255.224 up
[root@XKWB5510 software]# ifconfig -a|grep "inet addr"|head -3|tail -2|awk -F "[ :]+" '{print $4"/"$NF}'
211.100.97.246/255.255.255.224
211.100.97.244/255.255.255.224
给250添加虚拟IP 211.100.97.243
ifconfig eth0:1 211.100.97.243 netmask 255.255.255.224 up
[root@XKWB5705 software]# ifconfig -a|grep "inet addr"|head
-2|awk -F "[ :]+" '{print $4"/"$NF}'
211.100.97.250/255.255.255.224
211.100.97.243/255.255.255.224
给245添加虚拟IP:211.100.97.248
ifconfig eth1:1 211.100.97.248 netmask 255.255.255.224 up
[root@CentOS mysql-5.1.56]# ifconfig -a|grep "inet addr"|head
-3|tail -2|awk -F "[ :]+" '{print $4"/"$NF}'
211.100.97.245/255.255.255.224
211.100.97.248/255.255.255.224
二 授权
在AB机器添加代理账号 useradd rep_agent
在monitor机器上添加监控账号 useradd rep_monitor
A上授权
mysql> grant all privileges on *.* to 'rep_monitor'@'%'
identified by '123456';
mysql> grant all privileges on *.* to 'rep_agent'@'%' identified
by '123456';
查看授权情况
mysql> select host,user from mysql.user where user like
'rep%';
+----------------+-------------+
| host
| user |
+----------------+-------------+
|
%
| rep_agent |
|
%
| rep_monitor |
| 211.100.97.250 | replication |
| localhost | replication |
+----------------+-------------+
4 rows in set (0.01 sec)
mysql> grant all privileges on *.* to 'rep_monitor'@'%' identified by '123456';
mysql> grant all privileges on *.* to 'rep_agent'@'%' identified by '123456';
mysql> select host,user from mysql.user where user like
'rep%';
+----------------+-------------+
| host
| user |
+----------------+-------------+
|
%
| rep_agent |
|
%
| rep_monitor |
| 211.100.97.246 | replication |
| localhost | replication |
+----------------+-------------+
4 rows in set (0.00 sec)
CentOS软件仓库默认是不含这些软件的,必须要有epel这个包的支持。故我们必须先安装epel:
wget http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-4.noarch.rpm
rpm -Uvh epel-release-5-4.noarch.rpm
源码包安装mysql-mmm
wget
http://mysql-master-master.googlecode.com/files/mysql-master-master-1.2.3.tar.gz
tar zxf mysql-master-master-1.2.3.tar.gz
cd mysql-master-master-1.2.3
./install.pl
另外安装mmm之前需要安装以下几个必须的perl模块
Data::Dumper
POSIX
Cwd
threads
threads::shared
Thread::Queue
Thread::Semaphore
IO::Socket
Proc::Daemon
Time::HiRes
DBI
DBD::mysql
Algorithm::Diff
否则在安装mmm执行install.pl命令的时候,会出现如下报错:
1)
Checking required module 'Proc::Daemon'...Error!
Can't locate Proc/Daemon.pm in @INC (@INC contains:
/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.8 .) at (eval 13) line 2.
BEGIN failed--compilation aborted at (eval 13) line 2.
------------------------------------------------------------
Required module 'Proc::Daemon' is not found on this system!
Install it (e.g. run command 'cpan Proc::Daemon') and try
again.
根据以上报错提示运行
cpan Proc::Daemon 基本上都是回车
2)
Checking required module 'DBI'...Error!
Can't locate DBI.pm in @INC (@INC contains:
/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.8 .) at (eval 16) line 2.
BEGIN failed--compilation aborted at (eval 16) line 2.
------------------------------------------------------------
Required module 'DBI' is not found on this system!
Install it (e.g. run command 'cpan DBI') and try again.
++++++++++++++++++++++++++++
根据报错提示运行命令
cpan DBI
3)
Checking required module 'DBD::mysql'...Error!
Can't locate DBD/mysql.pm in @INC (@INC contains:
/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.8 .) at (eval 19) line 2.
BEGIN failed--compilation aborted at (eval 19) line 2.
------------------------------------------------------------
Required module 'DBD::mysql' is not found on this system!
Install it (e.g. run command 'cpan DBD::mysql') and try again.
报错如下:
CPAN.pm: Going to build C/CA/CAPTTOFU/DBD-mysql-4.020.tar.gz
Can't exec "mysql_config": No such file or directory at Makefile.PL line 83.
Cannot find the file 'mysql_config'! Your execution PATH doesn't
seem
not contain the path to mysql_config. Resorting to guessed
values!
Can't exec "mysql_config": No such file or directory at Makefile.PL
line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
Can't exec "mysql_config": No such file or directory at Makefile.PL
line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
Can't exec "mysql_config": No such file or directory at Makefile.PL
line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
For 'make test' to run properly, you must ensure that the
database user 'root' can connect to your MySQL server
and has the proper privileges that these tests require such
as 'drop table', 'create table', 'drop procedure', 'create
procedure'
as well as others.
mysql> grant all privileges on test.* to 'root'@'localhost' identified by 's3kr1t';
You can also optionally set the user to run 'make test' with:
perl Makefile.PL --testuser=username
Can't exec "mysql_config": No such file or directory at
Makefile.PL line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
Can't exec "mysql_config": No such file or directory at Makefile.PL
line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
Can't exec "mysql_config": No such file or directory at Makefile.PL
line 478.
Can't find mysql_config. Use --mysql_config option to specify where
mysql_config is located
Failed to determine directory of mysql.h. Use
perl Makefile.PL --cflags=-I<dir>
to set this directory. For details see the INSTALL.html
file,
section "C Compiler flags" or type
perl Makefile.PL --help
Running make test
Make had some problems, maybe interrupted? Won't test
Running make install
Make had some problems, maybe interrupted? Won't install
[root@XKWB5510 mysql-master-master-1.2.3]# find /usr/local/mysql/ -name "mysql_config*"
/usr/local/mysql/bin/mysql_config
/usr/local/mysql/share/man/man1/mysql_config.1
看一下mysql_config的权限
[root@XKWB5510 mysql-master-master-1.2.3]# ls -l
/usr/local/mysql/bin/mysql_config
-rwxr-xr-x 1 root root 6105 Sep 21 22:43
/usr/local/mysql/bin/mysql_config
解决办法
URL:http://search.cpan.org/dist/DBD-mysql/
DBD-mysql-4.020.tar.gz
tar zxf DBD-mysql-4.020.tar.gz
cd DBD-mysql-4.020
perl Makefile.PL
--mysql_config=/usr/local/mysql/bin/mysql_config
make
make install
Checking required module 'Algorithm::Diff'...Error!
Can't locate Algorithm/Diff.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at (eval 20) line 2.
BEGIN failed--compilation aborted at (eval 20) line 2.
------------------------------------------------------------
Required module 'Algorithm::Diff' is not found on this system!
Install it (e.g. run command 'cpan Algorithm::Diff') and try
again.
+++++++++++++++++
运行
cpan Algorithm::Diff
一直跟着错误提示走就行,直到没有Error
5)
最后安装成功的提示是这样的:
[root@XKWB5705 mysql-master-master-1.2.3]# ./install.pl
Checking platform support... linux Ok!
Checking required module 'Data::Dumper'...Ok!
Checking required module 'POSIX'...Ok!
Checking required module 'Cwd'...Ok!
Checking required module 'threads'...Ok!
Checking required module 'threads::shared'...Ok!
Checking required module 'Thread::Queue'...Ok!
Checking required module 'Thread::Semaphore'...Ok!
Checking required module 'IO::Socket'...Ok!
Checking required module 'Proc::Daemon'...Ok!
Checking required module 'Time::HiRes'...Ok!
Checking required module 'DBI'...Ok!
Checking required module 'DBD::mysql'...Ok!
Checking required module 'Algorithm::Diff'...Ok!
Checking iproute installation...Ok!
Installing mmm files...
Confgiuration:
- installation directory: '/usr/local/mmm'
- create symlinks: on
- symlinks directory: '/usr/local/sbin'
Copying files to '/usr/local/mmm' directory...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmm_control' ->
'/usr/local/sbin/mmm_control'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmmd_agent' ->
'/usr/local/sbin/mmmd_agent'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmm_restore' ->
'/usr/local/sbin/mmm_restore'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmmd_angel' ->
'/usr/local/sbin/mmmd_angel'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmm_get_dump' ->
'/usr/local/sbin/mmm_get_dump'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmm_backup' ->
'/usr/local/sbin/mmm_backup'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmm_clone' ->
'/usr/local/sbin/mmm_clone'...Ok!
Creating symlink: '/usr/local/mmm/sbin/mmmd_mon' ->
'/usr/local/sbin/mmmd_mon'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmmd_mon.1' ->
'/usr/local/man/man1/mmmd_mon.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmm_restore.1' ->
'/usr/local/man/man1/mmm_restore.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmmd_agent.1' ->
'/usr/local/man/man1/mmmd_agent.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmm_clone.1' ->
'/usr/local/man/man1/mmm_clone.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmm_get_dump.1' ->
'/usr/local/man/man1/mmm_get_dump.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmm_control.1' ->
'/usr/local/man/man1/mmm_control.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmmd_angel.1' ->
'/usr/local/man/man1/mmmd_angel.1'...Ok!
Creating symlink: '/usr/local/mmm/man/man1/mmm_backup.1' ->
'/usr/local/man/man1/mmm_backup.1'...Ok!
Installation is done!
++++++++++++++++++++++++++++++++++++++
四 配置
cp
/home/sysadmin/zhaoyj/software/mysql-master-master-1.2.3/etc/examples/mmm_agent.conf.example
/usr/local/mmm/etc/mmm_agent.conf
--------------------------------
这个配置选项是必须的吗?【不是】
# Cluster interface
cluster_interface eth0
----------------------
db1的配置文件 /usr/local/mmm/etc/mmm_agent.conf
#
# Master-Master Manager config (agent)
#
include mmm_common.conf
# Paths
pid_path /usr/local/mmm/var/mmmd_agent.pid
bin_path /usr/local/mmm/bin
# MMMD command socket tcp-port and ip
bind_port 9989
# Logging setup
log mydebug
file /usr/local/mmm/var/mmm-debug.log
level debug
log mytraps
file /usr/local/mmm/var/mmm-traps.log
level trap
# Define current server id
this db1
mode master
# For masters
peer db2
# Cluster hosts addresses and access params
host db1
ip 211.100.97.246
port 3306
user rep_agent
password 123456
host db2
ip 211.100.97.250
port 3306
user rep_agent
password 123456
------------------------
db2的配置文件 /usr/local/mmm/etc/mmm_agent.conf
# Master-Master Manager config (agent)
#
include mmm_common.conf
# Paths
pid_path /usr/local/mmm/var/mmmd_agent.pid
bin_path /usr/local/mmm/bin
# MMMD command socket tcp-port and ip
bind_port 9989
# Logging setup
log mydebug
file /usr/local/mmm/var/mmm-debug.log
level debug
log mytraps
file /usr/local/mmm/var/mmm-traps.log
level trap
this db2
mode master
# For masters
peer db1
host db1
ip 211.100.97.246
port 3306
user rep_agent
password 123456
host db2
ip 211.100.97.250
port 3306
user rep_agent
password 123456
--------------------------
db1 db2以及monitor共同的配置文件/usr/local/mmm/etc/mmm_common.conf
# Cluster interface
#cluster_interface eth0
# Debug mode
debug no
# Paths
bin_path /usr/local/mmm/bin
pid_path /usr/local/mmm/var/mmmd.pid
status_path /usr/local/mmm/var/mmmd.status
# Choose the default failover method [manual|wait|auto]
failover_method auto
# How many seconds to wait for both masters to become ONLINE
# before switching from WAIT to AUTO failover method, 0 = wait
indefinitely
wait_for_other_master 2
# How many seconds to wait before switching node status from
AWAITING_RECOVERY to ONLINE
# 0 = disabled
auto_set_online 1
log mydebug
file /usr/local/mmm/var/mmm-debug.log
level debug
log mytraps
file /usr/local/mmm/var/mmm-traps.log
level trap
email yanjing.zhao@pcstars.com
email 15810370266@139.com
# Email notification settings
email notify
from_address mmm_control@example.com
from_name MMM Control
# Define roles
active_master_role reader
# MMMD command socket tcp-port
agent_port 9989
monitor_ip 127.0.0.1
host db1
ip 211.100.97.246
port 3306
user rep_agent
password 123456
mode master
pear db2
host db2
ip 211.100.97.250
port 3306
user rep_agent
password 123456
mode master
pear db1
# Mysql Reader role
role reader
mode balanced
servers db1, db2
ip 211.100.97.243,211.100.97.244
# Mysql Writer role
role writer
mode exclusive
servers db1, db2
ip 211.100.97.248
# Replication credentials used by slaves to connect to the
master
replication_user replication
replication_password 123456
# Checks parameters
# Ping checker
check ping
check_period 1
trap_period 5
timeout 2
# Mysql checker
# (restarts after 10000 checks to prevent memory leaks)
check mysql
check_period 1
trap_period 2
timeout 2
restart_after 10000
# Mysql replication backlog checker
# (restarts after 10000 checks to prevent memory leaks)
check rep_backlog
check_period 5
trap_period 10
max_backlog 60
timeout 2
restart_after 10000
# Mysql replication threads checker
# (restarts after 10000 checks to prevent memory leaks)
check rep_threads
check_period 1
trap_period 5
timeout 2
restart_after 10000
----------------------------
monit机上 mmm_mon.conf 的配置文件
# Cluster interface
#cluster_interface eth0
# Debug mode
debug no
# Paths
bin_path /usr/local/mmm/bin
pid_path /usr/local/mmm/var/mmmd.pid
status_path /usr/local/mmm/var/mmmd.status
# Choose the default failover method [manual|wait|auto]
failover_method auto
# How many seconds to wait for both masters to become ONLINE
# before switching from WAIT to AUTO failover method, 0 = wait
indefinitely
wait_for_other_master 2
# How many seconds to wait before switching node status from
AWAITING_RECOVERY to ONLINE
# 0 = disabled
auto_set_online 1
log mydebug
file /usr/local/mmm/var/mmm-debug.log
level debug
log mytraps
file /usr/local/mmm/var/mmm-traps.log
level trap
email yanjing.zhao@pcstars.com
email 15810370266@139.com
# Email notification settings
email notify
from_address mmm_control@example.com
from_name MMM Control
# Define roles
active_master_role writer
# MMMD command socket tcp-port
agent_port 9989
monitor_ip 127.0.0.1
host db1
ip 211.100.97.246
port 3306
user rep_agent
password 123456
mode master
pear db2
host db2
ip 211.100.97.250
port 3306
user rep_agent
password 123456
mode master
pear db1
# Mysql Reader role
role reader
mode balanced
servers db1, db2
ip 211.100.97.243,211.100.97.244
# Mysql Writer role
role writer
mode exclusive
servers db1, db2
ip 211.100.97.248
# Replication credentials used by slaves to connect to the
master
replication_user replication
replication_password 123456
# Checks parameters
# Ping checker
check ping
check_period 1
trap_period 5
timeout 2
# Mysql checker
# (restarts after 10000 checks to prevent memory leaks)
check mysql
check_period 1
trap_period 2
timeout 2
restart_after 10000
# Mysql replication backlog checker
# (restarts after 10000 checks to prevent memory leaks)
check rep_backlog
check_period 5
trap_period 10
max_backlog 60
timeout 2
restart_after 10000
# Mysql replication threads checker
# (restarts after 10000 checks to prevent memory leaks)
check rep_threads
check_period 1
trap_period 5
timeout 2
restart_after 10000
--------------------------
在db1和db2上配置完mmm_agent.conf和mmm_common.conf之后才能启动agent进程
启动 mmmd_agent进程
[root@XKWB5705 etc]# /usr/local/mmm/scripts/init.d/mmm_agent start
Starting MMM Agent daemon: MySQL Multi-Master Replication Manager
Version: 1.2.3
Ok
在monit上配置完mmm_mon.conf之后启动mon进程
[root@CentOS etc]# /usr/local/mmm/scripts/init.d/mmm_mon start
Daemon bin: '/usr/local/mmm/sbin/mmmd_mon'
Daemon pid: '/usr/local/mmm/var/mmmd.pid'
Starting MMM Monitor daemon: MySQL Multi-Master Replication
Manager
Version: 1.2.3
Reading config file: 'mmm_mon.conf'
$VAR1 = {};
Ok
------------------------
db1和db2查看进程
[root@XKWB5510 etc]# ps aux |grep mmm
root 13702 0.4 0.4 107260
8444 ?
S 15:21 0:03 /usr/bin/perl
/usr/local/mmm/sbin/mmmd_agent
monit上查看进程
[root@CentOS etc]# ps aux |grep mmm
root 24608 0.6 1.9 258556 39352
? Sl
15:17 0:04 perl /usr/local/mmm/sbin/mmmd_mon
root 24611 0.0 0.4 107392
8280 ?
S 15:17 0:00 perl
/usr/local/mmm/bin/check/checker rep_backlog
root 24613 0.1 0.4 107392
8252 ?
S 15:17 0:00 perl
/usr/local/mmm/bin/check/checker mysql
root 24615 0.2 0.2
91668 5368 ?
S 15:17 0:01 perl
/usr/local/mmm/bin/check/checker ping
root 24617 0.2 0.4 107392
8280 ?
S 15:17 0:01 perl
/usr/local/mmm/bin/check/checker rep_threads
连续观察了几次monit上的进程变化情况
从变化情况可以看出monitor用fping检测两个节点的存活状况
[root@CentOS etc]# ps aux |grep mmm
root 24608 0.6 1.9 258556 39356
? Sl
15:17 0:07 perl /usr/local/mmm/sbin/mmmd_mon
root 24611 0.0 0.4 107392
8288 ?
S 15:17 0:00 perl
/usr/local/mmm/bin/check/checker rep_backlog
root 24613 0.1 0.4 107392
8264 ?
S 15:17 0:01 perl
/usr/local/mmm/bin/check/checker mysql
root 24615 0.2 0.2
91668 5368 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker ping
root 24617 0.1 0.4 107392
8280 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker rep_threads
[root@CentOS etc]# ps aux |grep mmm
root 24608 0.6 1.9 258556 39356
? Sl
15:17 0:07 perl /usr/local/mmm/sbin/mmmd_mon
root 24611 0.0 0.4 107392
8288 ?
S 15:17 0:00 perl
/usr/local/mmm/bin/check/checker rep_backlog
root 24613 0.1 0.4 107392
8264 ?
S 15:17 0:01 perl
/usr/local/mmm/bin/check/checker mysql
root 24615 0.2 0.2
91668 5368 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker ping
root 24617 0.1 0.4 107392
8280 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker rep_threads
root 25887 0.0 0.0
1804 504 ?
S 15:35 0:00
/usr/local/mmm/bin/sys/fping -q -u -t 500 -C 1 211.100.97.246
[root@CentOS etc]# ps aux |grep mmm
root 24608 0.6 1.9 258556 39356
? Sl
15:17 0:07 perl /usr/local/mmm/sbin/mmmd_mon
root 24611 0.0 0.4 107392
8288 ?
S 15:17 0:00 perl
/usr/local/mmm/bin/check/checker rep_backlog
root 24613 0.1 0.4 107392
8264 ?
S 15:17 0:01 perl
/usr/local/mmm/bin/check/checker mysql
root 24615 0.2 0.2
91668 5368 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker ping
root 24617 0.1 0.4 107392
8280 ?
S 15:17 0:02 perl
/usr/local/mmm/bin/check/checker rep_threads
root 25890 0.0 0.0
1804 504 ?
S 15:35 0:00
/usr/local/mmm/bin/sys/fping -q -u -t 500 -C 1 211.100.97.250
连续观察db2上进程的变化情况
可以看到db2不断检测db1的读进程以及monit的写进程
[root@XKWB5705 etc]# ps aux |grep mmm
root 1613 0.3 0.2
107272 8440 ?
S 15:18 0:04 /usr/bin/perl
/usr/local/mmm/sbin/mmmd_agent
root 12026 0.0 0.1
99564 7056 ?
S 15:38 0:00 perl
/usr/local/mmm/bin/agent/check_role writer(211.100.97.248;)
root 12027 0.0 0.1 102200
7572 ?
R 15:38 0:00 perl
/usr/local/mmm/bin/mysql_allow_write
root 1613 0.3 0.2 107272 8440 ? S 15:18 0:04 /usr/bin/perl /usr/local/mmm/sbin/mmmd_agent
root 12121 0.0 0.1 92552 6176 ? R 15:38 0:00 perl /usr/local/mmm/bin/agent/check_role writer(211.100.97.248;)
root 1613 0.3 0.2
107272 8440 ?
S 15:18 0:04 /usr/bin/perl
/usr/local/mmm/sbin/mmmd_agent
root 12202 0.0 0.1
90072 5744 ?
R 15:38 0:00 perl
/usr/local/mmm/bin/agent/check_role reader(211.100.97.244;)
root 1613 0.3 0.2 107272 8440 ? S 15:18 0:04 /usr/bin/perl /usr/local/mmm/sbin/mmmd_agent
连续观察db1上进程的变化情况
可以看到db1不断检测db2的读进程
[root@XKWB5510 etc]# ps aux |grep mmm |grep -v grep
root 13702 0.4 0.4 107260
8444 ?
S 15:21 0:06 /usr/bin/perl
/usr/local/mmm/sbin/mmmd_agent
root 26820 0.0 0.3
97212 6712 ?
R 15:44 0:00 perl
/usr/local/mmm/bin/agent/check_role reader(211.100.97.243;)
[root@XKWB5510 etc]# ps aux |grep mmm |grep -v grep
root 13702 0.4 0.4 107260
8444 ?
S 15:21 0:06 /usr/bin/perl
/usr/local/mmm/sbin/mmmd_agent
----------------------------
六 添加开机自动启动
db1, db2 开机自启动
cp -r /usr/local/mmm/scripts/init.d/mmm_agent /etc/init.d/mmmd
chkconfig --add mmmd
chkconfig --level 345 mmmd on
查看一下添加结果:
[root@XKWB5705 etc]# chkconfig --list mmmd
mmmd
0:off 1:off 2:off 3:on 4:on 5:on 6:off
Mon 开机自启动
[root@CentOS etc]# cp -r /usr/local/mmm/scripts/init.d/mmm_agent
/etc/init.d/mmmd
[root@CentOS etc]# chkconfig --add mmmd
[root@CentOS etc]# chkconfig --level 345 mmmd on
[root@CentOS etc]# chkconfig --list mmmd
mmmd
0:off 1:off 2:off 3:on 4:on 5:on 6:off
----------------------------------
七 测试
先介绍一下几种状态:
online 正常运行
admin_offline 主机被手动设置成offline
hard_offline 主机处于offline状态,可能是检测ping或者mysql失败
awaiting_recovery 主机正在等待恢复
replication_delay replication
backlog太大了(检测rep_backlog线程失败)
replication_fail
replication线程没有运行(检测rep_threads线程失败)
mmm_control set_online db1 让db1上线
mmm_control set_online db2 让db2上线
[root@CentOS etc]# mmm_control show
Servers status:
db1(211.100.97.246): master/ONLINE. Roles:
reader(211.100.97.243;), writer(211.100.97.248;)
db2(211.100.97.250): master/ONLINE. Roles:
reader(211.100.97.244;)
从以上可以看到db1是主写服务器
-------------------------------------
[2011-09-28 16:00:00]: 24608: Check: CHECK_FAIL('db2', 'rep_threads') Returned message: ERROR: Replication is broken
[2011-09-28 16:00:07]: 24608: Check: CHECK_OK('db1', 'mysql')
[2011-09-28 16:00:08]: 24608: Daemon: State change(db1): HARD_OFFLINE -> AWAITING_RECOVERY
[2011-09-28 16:00:10]: 24608: Daemon: State change(db1): AWAITING_RECOVERY -> ONLINE. Uptime diff = 12.109999999986 seconds; Status change diff = 1317196810
[root@CentOS var]# mmm_control show
Servers status:
db1(211.100.97.246): master/HARD_OFFLINE. Roles: None
db2(211.100.97.250): master/ONLINE. Roles:
reader(211.100.97.243;), reader(211.100.97.244;),
writer(211.100.97.248;)
从以上可以看到主写服务器已经从db1切换到db2,而且db1是offline状态
[2011-09-28 16:01:54]: 24608: Check: CHECK_OK('db1', 'mysql')
[2011-09-28 16:01:55]: 24608: Daemon: State change(db1): HARD_OFFLINE -> AWAITING_RECOVERY
[2011-09-28 16:01:56]: 24608: Check: CHECK_OK('db2', 'rep_threads')
[2011-09-28 16:01:56]: 24608: Daemon: State change(db1): AWAITING_RECOVERY -> ONLINE. Uptime diff = 2.94999999995343 seconds; Status change diff = 1317196916
-------------------------------
当我停止db2【250】的mysql进程时
[2011-09-28 16:29:22]: 24608: Check: CHECK_FAIL('db2',
'mysql') Returned message: ERROR: Connect error (host =
211.100.97.250:3306, user = rep_agent, pass = 'xxxxxx')! Lost
connection to MySQL server at 'reading initial communication
packet', system error: 111
[2011-09-28 16:29:23]: 24608: Daemon: State change(db2): ONLINE
-> HARD_OFFLINE
[2011-09-28 16:29:26]: 24608: Check: CHECK_FAIL('db1',
'rep_threads') Returned message: ERROR: Replication is
broken
[root@CentOS var]# mmm_control show
Servers status:
db1(211.100.97.246): master/ONLINE. Roles:
reader(211.100.97.243;), reader(211.100.97.244;),
writer(211.100.97.248;)
db2(211.100.97.250): master/HARD_OFFLINE. Roles: None
从以上可以看到db2处于offline状态
当我重新启动db2【250】的mysql进程时,日志里面的状态提示已经发生了变化,变成了online状态
[2011-09-28 16:34:26]: 24608: Check: CHECK_OK('db2', 'mysql')
[2011-09-28 16:34:28]: 24608: Daemon: State change(db2):
HARD_OFFLINE -> AWAITING_RECOVERY
[2011-09-28 16:34:29]: 24608: Daemon: State change(db2):
AWAITING_RECOVERY -> ONLINE. Uptime diff = 306.320000000065
seconds; Status change diff = 1317198869
[root@CentOS var]# mmm_control show
Servers status:
db1(211.100.97.246): master/ONLINE. Roles:
reader(211.100.97.244;), writer(211.100.97.248;)
db2(211.100.97.250): master/ONLINE. Roles:
reader(211.100.97.243;)
通过以上测试证明整个搭建成功,已经实现了高可用,实现失败节点的自动切换
MMM即Master-Master Replication Manager for MySQL(mysql主主复制管理器)关于mysql主主复制配置的监控、故障转移和管理的一套可伸缩的脚本套件(在任何时候只有一个节点可以被写入),这个套件也能对居于标准的主从配置的任意数量的从服务器进行读负载均衡,所以你可以用它来在一组居于复制的服务器启动虚拟ip,除此之外,它还有实现数据备份、节点之间重新同步功能的脚本。
MySQL本身没有提供replication failover的解决方案,通过MMM方案能实现服务器的故障转移,从而实现mysql的高可用。
官方网站为:http://mysql-mmm.org
Mmm主要功能由下面三个脚本提供
l mmm_mond 负责所有的监控工作的监控守护进程,决定节点的移除等等
l mmm_agentd 运行在mysql服务器上的代理守护进程,通过简单远程服务集提供给监控节点
l mmm_control 通过命令行管理mmm_mond进程