用Keepalived实现PostgreSQL高可用

一、Keepalived工作原理

Keepalived可提供VRRP以及health-check功能,可以只用它提供双机浮动的vip(VRRP虚拟路由功能),这样可以简单实现一个双机热备高可用功能。

Keepalived是以VRRP虚拟路由冗余协议为基础实现高可用的,可以认为是实现路由器高可用的协议,即将N台提供相同功能的路由器组成一个路由器组,这个组里面有一个master和多个backup,master上面有一个对外提供服务的vip(该路由器所在局域网内其他机器的默认路由为该vip),master会发组播,当backup收不到VRRP包时就认为master宕掉了,这时就需要根据VRRP的优先级来选举一个backup当master。这样的话就可以保证路由器的高可用了。

下载地址:(复制链接至浏览器中浏览)

https://www.keepalived.org/download.html
https://github.com/acassen/keepalived

Keepalived工作在TCP/IP参考模型的三层、四层、五层(物理层,链路层):

  • 网络层(3):Keepalived通过ICMP协议向服务器集群中的每一个节点发送一个ICMP数据包(有点类似与Ping的功能),如果某个节点没有返回响应数据包,那么认为该节点发生了故障,Keepalived将报告这个节点失效,并从服务器集群中剔除故障节点。

  • 传输层(4):Keepalived在传输层里利用了TCP协议的端口连接和扫描技术来判断集群节点的端口是否正常,比如对于常见的WEB服务器80端口。或者SSH服务22端口,Keepalived一旦在传输层探测到这些端口号没有数据响应和数据返回,就认为这些端口发生异常,然后强制将这些端口所对应的节点从服务器集群中剔除掉。

  • 应用层(5):Keepalived的运行方式也更加全面化和复杂化,用户可以通过自定义Keepalived工作方式,例如:可以通过编写程序或者脚本来运行Keepalived,而Keepalived将根据用户的设定参数检测各种程序或者服务是否允许正常,如果Keepalived的检测结果和用户设定的不一致时,Keepalived将把对应的服务器从服务器集群中剔除。

二、架构

2.1 主机规划

2.2 架构图

三、环境准备

3.1 关闭防火墙和selinux

[root@localhost ~]# systemctl stop firewalld
[root@localhost keepalived]# vi /etc/selinux/config
SELINUX=disabled
[root@localhost keepalived]# getenforce
Disabled

3.2 安装Keepalived

[root@localhost ~]# yum install gcc openssl openssl‐devel libnl libnl‐devel ipvsadm -y
[root@localhost ~]# cd /usr/local/src/
[root@localhost src]# tar -zxf keepalived-2.2.4.tar.gz
[root@localhost src]# cd keepalived-2.2.4
[root@localhost keepalived-2.2.4] ./configure --prefix=/usr/local/keepalived
[root@localhost keepalived-2.2.4] make && make install
[root@localhost keepalived]# cd /usr/local/keepalived
[root@localhost keepalived]#
ln -s /usr/local/src/keepalived-2.2.4/keepalived/etc/init.d/keepalived /etc/init.d/
[root@localhost keepalived]# ln -s /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@localhost keepalived]# ln -s /usr/local/keepalived/sbin/keepalived /usr/local/sbin/
[root@localhost keepalived]# mkdir /etc/keepalived/
[root@localhost keepalived]# ln -s /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@localhost ~]# cat /etc/keepalived/keepalived.conf

[root@localhost keepalived-2.2.4]# systemctl start keepalived
[root@localhost keepalived-2.2.4]# systemctl status keepalived

3.3 PostgreSQL数据库异步主从流复制搭建

192.168.254.128服务器上配置:

1、创建流复制用户

postgres=# create user replia replication superuser password '123qwert’;

给superuser权限是为了后面做pg_rewind操作。

2、数据目录下配置pg_hba.conf文件添加(192.168.254.129也需要添加)

    host replication all 192.168.254.1/24 md5

3、配置环境变量

[postgres@localhost pg_data]$ cat ~/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then . ~/.bashrc; fi
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
export PGHOME=/usr/local/pgsql
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
export PGUSER=postgres
export PGDATA=/data/pg_data
export PGDATABASE=postgres
alias rm='rm -i'
alias ll='ls -lh'
export LANG="zh_CN.UTF8"

192.168.254.129上进行:

4、~/.bash_profile 和上面输入一样的内容

5、制作基础备份

[postgres@localhost ~]$ pg_basebackup -h192.168.254.128 -Ureplia -R -Fp -P --verbose  -c fast -D /data/pg_data

6、启动从库

[postgres@localhost ~]$ pg_ctl -D /data/pg_data/ start

192.168.254.128上查看主库信息:

7、查看主库信息

postgres=# select * from pg_stat_replication ;

3.4 PostgreSQL数据库配置

创建数据库Keepalived,并且创建表探测表sr_delay,后续Keepalived探测,刷新sr_delay表的last_alive字段为当前探测时间。这张表用来判断主备延迟情况,数据库故障切换时会用到这张表。

数据库配置

postgres=# create user keepalived password '123qwert' CONNECTION LIMIT 4 ;
CREATE ROLE
postgres=# create database keepalived owner keepalived;
CREATE DATABASE
postgres=# \c keepalived keepalived
You are now connected to database "keepalived" as user "keepalived".
keepalived=> create table sr_delay(id int4, last_alive timestamp(0) without time zone);
CREATE TABLE

表sr_delay只允许写入一条记录,并且不允许删除此表数据,通过触发器实现。创建触发器函数,如下所示:

CREATE FUNCTION cannt_delete ()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
RAISE EXCEPTION 'Table sr_delay can not delete !';
END;
$$;

创建触发器:

CREATE TRIGGER trigger_sr_delay_del
BEFORE DELETE ON sr_delay
FOR EACH ROW EXECUTE PROCEDURE cannt_delete() ;
CREATE TRIGGER trigger_sr_delay_tru BEFORE TRUNCATE ON sr_delay FOR STATEMENT EXECUTE PROCEDURE cannt_delete() ;

插入数据:

INSERT INTO sr_delay VALUES(1,now()) ;

参数部分:

hot_standby = on
wal_log_hints = on
archive_mode=on
archive_command='test ! -f /data/pg_archive/%f && cp %p /data/pg_archive/%f'

后续Keepalived会每隔指定时间探测PostgreSQL数据库存活, 并且以Keepalived用户登录Keepalived数据库刷新这张表, 配置主备库pg_hba.conf,增加如下内容

host keepalived keepalived 192.168.254.128/32 md5
host keepalived keepalived 192.168.254.129/32 md5

随后pg_ctl reload 操作使配置生效。

3.5 Keepalived 配置

[root@localhost ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
# 邮件通知信息
notification_email {
# 定义收件人
394106346@qq.com
}
# 定义发件人
notification_email_from Alexandre.Cassen@firewall.loc
# SMTP服务器地址
smtp_server 192.168.254.128
smtp_connect_timeout 30
# 路由器标识,一般不用改,也可以写成每个主机自己的主机名
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}

# 定义用于实例执行的脚本内容,比如可以在线降低优先级,用于强制切换
vrrp_script check_pg_alived {
script "/data/scripts/pg_monitor.sh"
interval 10
fall 3
}

# 一个vrrp_instance就是定义一个虚拟路由器的,实例名称
vrrp_instance VI_1 {
# 定义初始状态,可以是MASTER或者BACKUP
state MASTER
#非抢占模式
nopreempt
# 工作接口,通告选举使用哪个接口进行
interface ens33
# 虚拟路由ID,如果是一组虚拟路由就定义一个ID,如果是多组就要定义多个,而且这个虚拟
# ID还是虚拟MAC最后一段地址的信息,取值范围0-255
virtual_router_id 51
#权重 如果你上面定义了MASTER,这里的优先级就需要定义的比其他的高
priority 100
#通告频率 单位s
advert_int 1
# 通信认证机制,这里是明文认证还有一种是加密认证
authentication {
auth_type PASS
auth_pass 1111
}
# 设置虚拟VIP地址
virtual_ipaddress {
192.168.254.11
}
# 追踪脚本,通常用于去执行上面的vrrp_script定义的脚本内容
track_script {
check_pg_alived
}
# 如果主机状态变成Master|Backup|Fault之后会去执行的通知脚本,脚本要自己写
notify_master "/data/scripts/failover.sh"
notify_fault "/data/scripts/fault.sh"
}

以上是Keepalived主节点的配置,Keepalived备节点的priority参数改成90 ,state改为BACKUP,以及smtp_server 改为192.168.254.129,其余参数配置一样。

3.6 数据库监控脚本

[postgres@localhost scripts]$ cat /data/scripts/pg_monitor.sh
#!/bin/bash
# 配置环境变量
pgport=5432
pguser=keepalived
pgdb=keepalived
pgpwd='123qwert'
LANG=en_US.utf8
PGHOME=/usr/local/pgsql/
LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib
PATH=$PGHOME/bin:$PATH:.
MONITOR_LOG="/data/scripts/pg_monitor.log"
SQL1="UPDATE sr_delay SET last_alive= now();"
SQL2='SELECT 1;'
keeplognums=30000

#pg_port_status=`lsof ‐i :$pgport | grep LISTEN | wc ‐l`
#pg_port_status=`ps ‐ef | grep LISTEN | wc ‐l`
#if [ $pg_port_status ‐lt 1 ];then
# echo ‐e `date +"%F %T"` "Error: The postgreSQL is not running,please chec k the postgreSQL server status!" >> $LOGFILE
# exit 1
#fi

#此脚本不检查备库存活状态,如果是各库则退出

standby_flg=`psql -p$pgport -Upostgres -At -c "SELECT pg_is_in_recovery();"`
if [ ${standby_flg} == 't' ];
then
echo ‐e "`date +%F\ %T`: This is a standby database, exit!\n" > $MONITOR_LOG
exit 0
fi
export PGPASSWORD=$pgpwd
#主库更新sr_delay 表
echo $SQL1 | psql -At -p $pgport -U $pguser -d $pgdb >> $MONITOR_LOG
#判断主库是否可用
echo $SQL2 | psql -At -p $pgport -U $pguser -d $pgdb
if [ $? -eq 0 ] ;
then
echo -e "`date +%F\ %T`:Primary db is health." >> $MONITOR_LOG
exit 0
else
echo -e "`date +%F\ %T`:Attention: Primary db is not health!" >> $MONITOR_LOG
exit 1
fi

#日志保留 keeplognums 行
if [ ! -f ${MONITOR_LOG} ] ;then touch ${MONITOR_LOG};fi
lognums=`cat ${MONITOR_LOG} |wc -l`
catnum=$((${lognums} -${keeplognums}))
if [[ $lognums -gt ${keeplognums} ]] ; then sed -i "1,${catnum}d" ${MONITOR_LOG}; fi

此脚本每隔10秒执行一次,执行频率由keepalived.conf配置文件中interval参数设置, 脚本主要作用为:

  • 检测主库是否存活。

  • 更新sr_delay表last_alive字段为当前探测时间。

[postgres@localhost scripts]$ cat /data/scripts/failover.sh
#!/bin/bash
export PGPORT=5432
export PGUSER=keepalived
export PG_OS_USER=postgres
export PGDBNAME=keepalived
export LANG=en_US.UTF-8
export PGPATH=/usr/local/pgsql/bin/
export PATH=$PATH:$PGPATH
export PGMIP=127.0.0.1
LOGFILE='/data/scripts/failover.log'
# 主备数据库同步时延,单位为秒
sr_allowed_delay_time=100

SQL1='select pg_is_in_recovery from pg_is_in_recovery();'
SQL2="select count(1) as delay_time from sr_delay where now()<(last_alive + interval '${sr_allowed_delay_time} seconds');"
#SQL2="select count(1) as delay_time from sr_delay where now()<(last_alive + interval '100 seconds');"
sleep $sr_allowed_delay_time
db_role=`echo $SQL1 | psql -At -p $PGPORT -U $PGUSER -d $PGDBNAME -w`
db_sr_delaytime=`echo $SQL2 | psql -p $PGPORT -d $PGDBNAME -U $PGUSER -At -w`
SWITCH_COMMAND='pg_ctl promote -D /data/pg_data/'
# 如果为备库,且延迟大于指定时间则切换为主库

if [ ${db_role} == 'f' ];
then
echo -e `date +"%F %T"` "Attention:The current postgreSQL DB is master database,cannot switched!" >> $LOGFILE
exit 0
fi

if [ $db_sr_delaytime -gt 0 ];
then
echo -e `date +"%F %T"` "Attention:The current master database is health,the standby DB cannot switched!" >> $LOGFILE
exit 0
fi

if [ !$db_sr_delaytime ];
then
echo -e `date +"%F %T"` "Attention:The current database is statndby,ready to switch master database!" >> $LOGFILE
su - $PG_OS_USER -c "$SWITCH_COMMAND"
elif [ $? -eq 0 ];
then
echo -e `date +"%F %T"` "success:The current standby database successed to switched the primary PG database !" >> $LOGFILE
exit 0
else
echo -e `date +"%F %T"` "Error: the standby database failed to switch the primary PG database !,pelease checked it!" >> $LOGFILE
exit 1
fi

[postgres@localhost scripts]$ cat /data/scripts/fault.sh
#!/bin/bash
GFILE=/data/scripts/pg_db_fault.log
PGPORT=5432
PGMIP=192.168.254.128
echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE

PGPID="`head -n1 /data/pg_data/postmaster.pid`"

systemctl stop keepalived

kill -9 $PGPID
if [ $? -eq 0 ] ;
then
echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE
systemctl stop keepalived
exit 1
fi

[postgres@localhost scripts]$ chmod 755 /data/scripts/pg_monitor.sh /data/scripts/failover.sh /data/scripts/fault.sh

四、切换演练

4.1 在192.168.254.128上停掉数据库

[root@localhost ~]# su - postgres
[postgres@localhost ~]$ pg_ctl -D /data/pg_data/ stop

查看192.168.254.129上的数据库状态

[postgres@localhost ~]$ psql
psql (13.3)
Type "help" for help.
postgres=# insert into t5 (name) values ('b002');
INSERT 0 1

修复192.168.254.128。

使用pg_rewind 同步新主库的数据到原主库:

pg_rewind -R --target-pgdata '/data/pg_data' --source-server 'host=192.168.254.129 port=5432 user=replia password=123qwert dbname=postgres'

如果这个地方咱们的wal日志已经被覆盖了了很多,那么就需要将我们的备份日志/data/pg_archive/给scp过来。

[postgres@localhost ~]$ scp postgres@192.168.254.129:/data/pg_archive/00000002* /data/pg_archive/

(此时需要注意新主和备上面TimeLineID的差异。并且这种情况需要在配置文件中加上:

restore_command='cp /data/pg_archive/%f %p'

启动数据库,然后检查主备wal sender和receiver是否正常:

[postgres@localhost ~]$ pg_ctl -D /data/pg_data/ start
[postgres@localhost ~]$ ps axu |grep walreceiver
postgres 53474 0.0 0.2 342008 2136 ? Ss 14:45 0:01 postgres: walreceiver streaming 1/4A036EF0

192.168.254.128上启动Keepalived,查看Keepalived状态:

[root@localhost ~]# systemctl start keepalived
[root@localhost ~]# systemctl status keepalived

4.2 再次切换:

在做切换前,先要检查192.168.254.128和192.168.254.129上面的Keepalived正常以及vip和主从正常,在192.168.254.129上停掉数据库:

[postgres@localhost ~]$ pg_ctl -D /data/pg_data/ stop -mf

查看192.168.254.128上的数据库状态:

[postgres@localhost pg_data]$ psql
psql (13.3)
Type "help" for help.
postgres=# insert into t5 (name) values ('b005');
INSERT 0 1

修复192.168.254.129。

使用pg_rewind 同步新主库的数据到原主库:

pg_rewind -R --target-pgdata '/data/pg_data' --source-server 'host=192.168.254.128 port=5432 user=replia password=123qwert dbname=postgres'

如果这个地方咱们的wal日志已经被覆盖了了很多,那么就需要将我们的备份日志/data/pg_archive/给scp过来。

[postgres@localhost ~]$ scp postgres@192.168.254.128:/data/pg_archive/00000003*  /data/pg_archive/  

(此时需要注意新主和备上面TimeLineID的差异。并且这种情况需要在配置文件中加上:

restore_command='cp /data/pg_archive/%f %p'

启动数据库,然后检查主备wal sender和receiver是否正常:

[postgres@localhost ~]$ pg_ctl -D /data/pg_data/ start
[postgres@localhost ~]$ ps axu |grep walreceiver

192.168.254.129上启动Keepalived,查看Keepalived状态:

[root@localhost ~]# systemctl start keepalived
[root@localhost ~]# systemctl status keepalived

五、总结

1、可以完善failover时自己的延迟切换逻辑。

2、可以完善pg_rewind的实现,用脚本代替手动方式。

3、Keepalived较为灵活,能够在脚本上添加更多校验和自己的规则。

墨天轮原文链接:https://www.modb.pro/db/113653?sjhy(复制到浏览器或者点击“阅读原文”立即查看)

关于作者
向前龙  ,云和恩墨PG技术顾问,拥有OGCA、PGCE证书。
END

由中国DBA联盟和墨天轮主办的 2021数据技术嘉年华 将于11月19日-20日在北京丽都皇冠假日酒店盛大召开。

大会门票限时0元领取

    扫描下方的二维码即可!

(0)

相关推荐