
PostgreSQL 高级性能调优
PostgreSQL 功能强大,但配置不当和缺少索引会严重影响性能。本指南涵盖高流量生产系统的高级技术。
理解 EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name;
关键指标:
- Seq Scan 在大表上 = 缺少索引
- Buffers: hit = 数据来自缓存(良好)
- Buffers: read = 数据来自磁盘(缓慢)
- actual rows 与 rows 的差异 = 规划器准确性

索引策略
-- 基本 B-tree 索引
CREATE INDEX idx_users_email ON users(email);
-- 复合索引(顺序很重要!)
CREATE INDEX idx_orders_status_created
ON orders(status, created_at DESC);
-- 部分索引(更小、更快)
CREATE INDEX idx_orders_pending
ON orders(created_at)
WHERE status = 'pending';
-- 表达式索引
CREATE INDEX idx_users_lower_email
ON users(LOWER(email));
-- 覆盖索引(避免堆获取)
CREATE INDEX idx_orders_user_covering
ON orders(user_id) INCLUDE (total, status, created_at);
索引类型
-- Hash 索引(仅等值查询)
CREATE INDEX idx_sessions_token
ON sessions USING HASH (token);
-- GIN 用于数组和 JSONB
CREATE INDEX idx_products_tags
ON products USING GIN (tags);
CREATE INDEX idx_users_metadata
ON users USING GIN (metadata jsonb_path_ops);
-- BRIN 用于时间序列(体积小)
CREATE INDEX idx_events_created
ON events USING BRIN (created_at)
WITH (pages_per_range = 128);

查询优化
-- EXISTS 与 IN 子查询
-- 好:EXISTS 短路
SELECT * FROM users u WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.user_id = u.id AND o.status = 'completed'
);
-- 窗口函数与子查询
-- 好:单次扫描
SELECT name, dept,
AVG(salary) OVER (PARTITION BY dept) as dept_avg
FROM employees;
-- 避免在索引列上使用函数
-- 坏:无法使用索引
SELECT * FROM users WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';
-- 好:使用表达式索引
SELECT * FROM users WHERE email = LOWER('ALICE@EXAMPLE.COM');
表分区
CREATE TABLE orders (
id BIGSERIAL,
user_id INT NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
total DECIMAL(10,2)
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2024_01
PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- 分区裁剪:仅扫描相关分区
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

关键配置设置
# postgresql.conf
shared_buffers = 8GB # 25% 的 RAM
effective_cache_size = 24GB # 75% 的 RAM
work_mem = 64MB # 每个查询的排序/哈希
random_page_cost = 1.1 # 对于 SSD
max_parallel_workers_per_gather = 4
监控
-- 查找慢查询
SELECT query,
calls,
total_exec_time / calls AS avg_ms,
100.0 * shared_blks_hit / NULLIF(shared_blks_hit + shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements
WHERE calls > 100
ORDER BY avg_ms DESC
LIMIT 10;
-- 查找缺失索引
SELECT schemaname, tablename, seq_scan, n_live_tup
FROM pg_stat_user_tables
WHERE n_live_tup > 10000
ORDER BY seq_scan DESC;
总结
PostgreSQL 性能优化流程:
- 使用
pg_stat_statements识别慢查询 - 使用
EXPLAIN (ANALYZE, BUFFERS)分析查询计划 - 根据查询模式添加针对性索引
- 根据工作负载调整内存设置
- 对时间序列数据使用分区
- 部署 PgBouncer 进行连接池管理