
数据库性能是最常见的生产瓶颈。2 毫秒与 2 秒之间的差异,在于理解 PostgreSQL 如何读取查询计划和使用索引。
解读 EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count, SUM(o.total) AS revenue
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
AND o.status = 'completed'
GROUP BY u.id, u.name
ORDER BY revenue DESC
LIMIT 100;
需要关注的关键点:
- Seq Scan:读取每一行——对大表来说性能差
- Index Scan:使用索引——通常良好
- Index Only Scan:所有数据都在索引中——最佳,无需访问堆
- rows=X (actual rows=Y):当 X >> Y 时,统计信息过时 → 运行 ANALYZE
- Buffers: hit=N read=M:高 "read" 表示磁盘 I/O,考虑缓冲池大小

索引设计
-- 1. 复合索引:选择性最高的列放在前面
-- 查询:WHERE status = 'active' AND created_at > '2025-01-01'
CREATE INDEX idx_orders_status_created
ON orders (status, created_at)
WHERE status IN ('pending', 'processing'); -- 部分索引!
-- 2. 覆盖 ORDER BY 的索引
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at DESC);
-- 服务于:WHERE user_id = ? ORDER BY created_at DESC LIMIT 10
-- 3. 覆盖索引:完全避免堆访问
CREATE INDEX idx_users_email_covering ON users (email)
INCLUDE (name, id, subscription_tier);
-- 查询:SELECT name, id, tier FROM users WHERE email = ?
-- → Index Only Scan(无表访问)
-- 4. 用于数组和 JSONB 的 GIN 索引
CREATE INDEX idx_products_tags ON products USING GIN(tags);
CREATE INDEX idx_events_payload ON events USING GIN(payload jsonb_path_ops);
SELECT * FROM products WHERE tags @> ARRAY['python', 'ml'];
SELECT * FROM events WHERE payload @@ '$.user_id == 123';

避免 N+1 查询
-- N+1:先获取用户,然后循环获取每个用户的订单
-- 解决方案:使用 JOIN 和聚合
SELECT
u.id, u.name, u.email,
COUNT(o.id) AS order_count,
COALESCE(SUM(o.total), 0) AS total_spent
FROM users u
LEFT JOIN orders o ON o.user_id = u.id AND o.status = 'completed'
WHERE u.created_at > '2025-01-01'
GROUP BY u.id, u.name, u.email;
-- LATERAL 连接实现每组前 N 条
SELECT u.*, recent.orders
FROM users u
CROSS JOIN LATERAL (
SELECT json_agg(o ORDER BY o.created_at DESC) AS orders
FROM (
SELECT id, total, status
FROM orders WHERE user_id = u.id
ORDER BY created_at DESC LIMIT 3
) o
) recent;

使用 CTE 处理复杂多步查询
WITH
revenue_90d AS (
SELECT user_id, SUM(total) AS revenue
FROM orders
WHERE created_at >= NOW() - INTERVAL '90 days'
GROUP BY user_id
),
segments AS (
SELECT user_id,
CASE
WHEN revenue >= 1000 THEN 'high_value'
WHEN revenue >= 200 THEN 'medium_value'
ELSE 'low_value'
END AS segment
FROM revenue_90d
)
SELECT u.*, COALESCE(s.segment, 'inactive') AS segment
FROM users u
LEFT JOIN segments s ON s.user_id = u.id;
对大表进行分区
-- 按日期范围分区(时间序列数据)
CREATE TABLE events (
id BIGSERIAL,
user_id INT NOT NULL,
event_type TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
payload JSONB
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_04 PARTITION OF events
FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
-- 索引自动应用于所有分区
CREATE INDEX ON events (user_id, created_at);
-- 删除旧数据:瞬间完成 vs 数小时的 DELETE
DROP TABLE events_2024_01; -- 瞬间完成!
连接池
# pgbouncer.ini
[pgbouncer]
pool_mode = transaction # 每个事务后释放
默认池大小 = 25 # 每个数据库的连接数
最大客户端连接数 = 500 # 最大应用连接数
监听端口 = 6432
PostgreSQL 优化涉及数据分布、索引选择性和连接策略。EXPLAIN ANALYZE 会准确告诉你发生了什么——以及通常如何修复。
→ 使用 SQL Prettify 工具格式化你的 SQL 查询。