With the PostgreSQL physical replication introduced in the previous article, we can synchronize data from a single PostgreSQL server to multiple servers, each of which can handle read requests independently. However, as also noted there, PostgreSQL does not natively support load balancing, high availability, and similar features; we need an extra proxy layer to provide them. This article introduces one of the more feature-complete options, Pgpool-II, along with a simple configuration.
Most of the content, images, and Pgpool-II configuration below are based on the official Pgpool-II documentation.
What is Pgpool-II? What Features Does It Support?
Pgpool-II manages a pool of PostgreSQL servers. As described above, it can be seen as a middle layer: to the application it looks like a PostgreSQL server, while the servers see Pgpool-II as a client:
```mermaid
flowchart TD
    App["Application"] --> Pgpool["Pgpool-II"]
    Pgpool --> cluster
    subgraph cluster["PostgreSQL Servers"]
        direction LR
        Master -- "Physical Replication" --> Slaves
    end
```
Its supported features are roughly as follows:
- Performance
    - Load balancing: distributes read queries across multiple servers (write queries can only be routed to the primary server).
    - Connection pooling: similar to pgbouncer and pgcat; reuses established connections to reduce connection overhead.
    - In-memory query cache: caches `SELECT` query results to speed up repeated queries.
- High availability
    - For PostgreSQL databases
        - Replication: Pgpool-II also supports its own replication mode, but PostgreSQL's native physical (streaming) replication is generally recommended instead; see the previous article.
        - Automatic failover: when the primary server fails, service automatically fails over to another standby server so requests can keep being served.
        - Online recovery: attaches a detached server back to the cluster as a standby server.
    - For Pgpool-II
        - Watchdog: a coordination service that coordinates multiple Pgpool-II nodes and performs failure detection, avoiding a single point of failure and split brain. When the current active Pgpool-II fails, a standby Pgpool-II is promoted to active and takes over the virtual IP that upstream applications connect to.
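As an example of how these features are configured, the in-memory query cache is controlled by a handful of pgpool.conf parameters. The fragment below is a minimal sketch with illustrative values; see the official documentation for the full parameter list:

```
memory_cache_enabled = on      # turn on the in-memory query cache
memqcache_method = 'shmem'     # cache storage: shared memory (alternative: 'memcached')
memqcache_total_size = 64MB    # total memory reserved for cached results
memqcache_expire = 60          # cache entry lifetime in seconds
```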
Setup Steps and Configuration
As the previous section shows, Pgpool-II supports a rich set of features. In this section I will walk through the rough setup flow for load balancing and automatic failover.
The commands and configuration below are for reference only; adjust them to fit your actual environment and requirements.
General Setup
- Set up PostgreSQL physical replication. For detailed setup instructions, refer to the example in the last article.
- Create a role for replication and grant it permission to show more information in the `SHOW POOL NODES` command result:

  ```shell
  psql -U postgres -tA -c "CREATE ROLE replication WITH LOGIN REPLICATION;"
  psql -U postgres -tA -c "GRANT pg_monitor TO replication;"
  ```
- Set up the general configuration in pgpool.conf:

  ```
  backend_clustering_mode = 'streaming_replication'
  listen_addresses = '*'
  port = 9999
  sr_check_user = 'replication'      # streaming replication check for detecting replication delay
  sr_check_period = 5
  health_check_user = 'replication'  # health check for failover
  health_check_period = 10
  ```
Load Balancing
- Set up the backend settings in pgpool.conf:

  ```
  # backend_application_name*: to show more information in `SHOW POOL NODES` command result
  backend_hostname0 = '${PRIMARY_IP}'
  backend_port0 = 5432
  backend_weight0 = 1
  backend_data_directory0 = '/var/lib/pgsql/data'
  backend_application_name0 = 'primary'
  backend_flag0 = 'ALLOW_TO_FAILOVER'

  backend_hostname1 = '${REPLICA_1_IP}'
  backend_port1 = 5432
  backend_weight1 = 1
  backend_data_directory1 = '/var/lib/pgsql/data'
  backend_application_name1 = 'replica_1'
  backend_flag1 = 'ALLOW_TO_FAILOVER'

  backend_hostname2 = '${REPLICA_2_IP}'
  backend_port2 = 5432
  backend_weight2 = 1
  backend_data_directory2 = '/var/lib/pgsql/data'
  backend_application_name2 = 'replica_2'
  backend_flag2 = 'ALLOW_TO_FAILOVER'
  ```
- Start Pgpool-II:

  ```shell
  # -n: do not run in daemon mode
  sudo -u postgres pgpool -n -f ${LOCATION_OF_PGPOOL_CONF} &
  # To stop:
  sudo -u postgres pgpool -f ${LOCATION_OF_PGPOOL_CONF} stop
  ```
- Monitor:

  ```shell
  psql -U postgres -p 9999 -c "SHOW POOL NODES;"
  ```
Automatic Failover
Set up `failover_command` and `follow_primary_command` in pgpool.conf (for the meaning of the command options below, please refer to the corresponding configuration links):

```
failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
```
General steps of the scripts are described below. Please refer to failover.sh.sample and follow_primary.sh.sample for sample scripts.
`failover_command`: triggered when a backend node gets detached.
- If there is no new main node (`%m`) anymore, skip failover.
- If the detached node is a standby node (`%d` != `%P`):
    - Drop the replication slot for the detached standby node:

      ```shell
      OLD_PRIMARY_NODE_HOST="${11}"  # %N: old primary node hostname
      OLD_PRIMARY_NODE_PORT="${12}"  # %S: old primary node port number
      # REPLICA_SLOT_NAME can be composed of a self-defined slot name and $2 (%h: failed node hostname)
      psql -U postgres -h ${OLD_PRIMARY_NODE_HOST} -p ${OLD_PRIMARY_NODE_PORT} -c "SELECT pg_drop_replication_slot('${REPLICA_SLOT_NAME}');"
      ```
    - Skip failover.
- Otherwise, the detached node is the primary node, so we promote the selected standby node to be the new primary node:

  ```shell
  # The following command should be executed on the standby node via SSH or other remote access methods.
  NEW_MAIN_NODE_PGDATA="${10}"  # %R: new main database cluster path
  pg_ctl -D ${NEW_MAIN_NODE_PGDATA} -w promote
  ```
`follow_primary_command`: executed on each standby node after `failover_command` to synchronize the standby node with the new primary node.
- Check the connection status of the standby node; skip the follow primary command if the node is not running:

  ```shell
  NODE_HOST="$2"  # %h: hostname
  NODE_PORT="$3"  # %p: port number
  pg_isready -U postgres -h ${NODE_HOST} -p ${NODE_PORT}
  ```
- Synchronize the standby node with the new primary node via pg_rewind:
    - Checkpoint on the new primary node before running pg_rewind.
    - Create a new replication slot for this standby node.
    - Run pg_rewind.
    - Try pg_basebackup if pg_rewind failed.
- If the synchronization succeeded, attach this node back to Pgpool-II; otherwise, drop the replication slot we just created.
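The "pg_rewind with pg_basebackup fallback" step can be sketched roughly as below. This is a hypothetical sketch, not the official sample: the function and variable names are illustrative, and the real follow_primary.sh.sample additionally handles checkpoints, replication slots, and restarting the standby.

```shell
#!/bin/sh
# Hypothetical sketch: try pg_rewind first; if it fails, take a fresh base backup.
sync_standby() {
    new_primary_host="$1"  # host of the new primary node
    standby_pgdata="$2"    # data directory of this (stopped) standby node
    if pg_rewind --target-pgdata="${standby_pgdata}" \
                 --source-server="host=${new_primary_host} user=postgres dbname=postgres"; then
        echo "pg_rewind succeeded"
    else
        # pg_rewind can fail (e.g. when wal_log_hints is off); fall back to a full base backup
        rm -rf "${standby_pgdata}"
        pg_basebackup -h "${new_primary_host}" -U replication -D "${standby_pgdata}" -X stream
    fi
}
```

The appeal of pg_rewind is that it copies only the blocks that diverged, so it is much faster than a full pg_basebackup on large clusters.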
Issues about the Integration of Pgpool-II
At this point we have integrated Pgpool-II as a proxy in front of the PostgreSQL physical replication servers, with support for load balancing, read/write splitting, and automatic failover. For how to configure the other Pgpool-II features, refer to the official documentation and the examples it provides.
To close, let's pick up the discussion from the end of the previous article and look at issues you may run into in practice after integrating Pgpool-II.
Network Latency
The discussion at the end of the previous article already noted that network latency must be factored into performance. In particular, if the production workload consists mostly of lightweight queries or queries that frequently hit the buffer cache, the scaled-out solution can actually perform worse, and adding Pgpool-II makes the impact even more pronounced. Consider the lifecycle of a query sent from client to server:
- Send query to PostgreSQL directly
- send a query string to PostgreSQL
- process the query on PostgreSQL
- return the query result
- Send query to Pgpool-II
- send a query string to Pgpool-II
- parse the query on Pgpool-II
- send a query string to PostgreSQL
- process the query on PostgreSQL
- return the query result to Pgpool-II
- return the query result to client
Looking at the Pgpool-II case, there are clearly a few extra rounds of process communication, which amplifies the impact of network latency. Of course, real-world workloads are usually more complex, so a scaled-out solution generally still pays off.
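A rough back-of-envelope model of the extra hop makes the point concrete. All numbers below are assumptions for illustration, not measurements, and the model ignores Pgpool-II's own parsing overhead:

```shell
# Model each hop as one network round trip added on top of server-side work.
total_latency_us() {
    # $1: server-side query processing time (us)
    # $2: one-way network latency per hop (us)
    # $3: number of hops (1: client->PostgreSQL, 2: client->Pgpool-II->PostgreSQL)
    echo $(( $3 * 2 * $2 + $1 ))
}

direct=$(total_latency_us 500 200 1)   # 900 us
proxied=$(total_latency_us 500 200 2)  # 1300 us: ~44% slower for this cheap query
```

For a query that takes 50 ms of server-side work instead, the same extra 400 us is negligible, which is why the overhead matters mainly for lightweight queries.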
Connection Pooling: Pgpool-II vs Pgbouncer
Connection pooling means maintaining a cache (pool) of database connections and reusing them when requests with the same attributes (user, database, etc.) come in, reducing the overhead of repeatedly establishing connections. For PostgreSQL, the best-known example is Pgbouncer.
As mentioned above, Pgpool-II's feature set is comprehensive and includes connection pooling. Although the setup was not covered earlier, Pgpool-II actually enables connection pooling by default, and parameters such as `connection_cache` and `max_pool` can be tuned to control it.
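A minimal pgpool.conf sketch of those parameters (the values are illustrative defaults, not recommendations):

```
connection_cache = on    # enable connection pooling (on by default)
max_pool = 4             # max cached connections per Pgpool-II child process
num_init_children = 32   # number of preforked child processes
```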
Experimenting with both, however, reveals that Pgpool-II and Pgbouncer differ in their connection pooling mechanisms. Let's first look at Pgpool-II's rough architecture:
```mermaid
flowchart TD
    parent["Pgpool-II parent"] -- Fork --> children
    subgraph children["Preforked processes"]
        direction LR
        child1["Pgpool-II child"]
        child2["Pgpool-II child"]
        child3["Pgpool-II child"]
    end
    client["SQL Client"] -- Query --> children
```
That is, Pgpool-II preforks `num_init_children` (default 32) processes and serves clients through these child processes. As mentioned, the connection pooling mechanisms of Pgpool-II and Pgbouncer differ:
- Pgpool-II: each child process maintains its own connection pool, and we cannot control which child process a client connects to.
- PgBouncer: the connection pool is shared among all clients.
In other words, with Pgpool-II a connection cache miss can happen simply because a client landed on a different child process. This does not matter when all clients share the same connection attributes (user, database, etc.), but if the database is accessed by many different applications, consider another solution, such as disabling Pgpool-II's connection pool and running a Pgbouncer per application.
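If you go the per-application Pgbouncer route, a minimal pgbouncer.ini might look like the following sketch. The hostnames, database names, and file paths are illustrative:

```
[databases]
; route "appdb" through the pool to the PostgreSQL primary
appdb = host=10.0.0.10 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; return server connections to the pool after each transaction
max_client_conn = 200
default_pool_size = 20
```

`pool_mode = transaction` gives the highest connection reuse, but it is incompatible with session-level features such as prepared statements held across transactions, so check your driver's behavior before enabling it.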