What High Availability Means for Odoo
High availability (HA) is the practice of eliminating single points of failure so that when one component fails, traffic automatically shifts to a standby. For Odoo, this means: multiple application nodes, a replicated database with automatic failover, shared storage that survives a single disk failure, and a load balancer that detects and routes around failures.
True HA is expensive and complex. Before investing, assess your actual uptime requirement. Most businesses need 99.9% uptime (about 8.8 hours of downtime per year), which is achievable with good backups and a fast restore. 99.99% (about 53 minutes per year) requires a full HA architecture.
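The downtime figures follow directly from the uptime percentage. A minimal sketch of that arithmetic, assuming a 365-day year (the function name is illustrative, not part of any tool):

```python
# Convert an uptime target (in percent) into a yearly downtime budget.
HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years are ignored


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the hours of downtime per year allowed by an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for target in (99.9, 99.99):
        hours = downtime_hours_per_year(target)
        print(f"{target}% uptime allows {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Comparing this budget with your realistic restore time is a quick way to judge whether backups alone can meet the target.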
Component Architecture
             [ DNS / Cloudflare ]
                      |
       [ Load Balancer (keepalived VIP) ]
               /              \
    [ nginx LB 1 ]      [ nginx LB 2 ] (standby)
         |    \            /    |
    [ Odoo Node 1 ]      [ Odoo Node 2 ]
               \              /
       [ pgBouncer (on each Odoo node) ]
                      |
 [ PostgreSQL Primary ] --> [ PostgreSQL Replica ]
               \              /
        [ Patroni / automatic failover ]
                      |
            [ Shared S3 Filestore ]
Layer 1 — PostgreSQL Streaming Replication
PostgreSQL's built-in streaming replication sends WAL (Write-Ahead Log) records to one or more replica servers in near-real-time. Setup on the primary:
# postgresql.conf (primary)
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB
hot_standby = on   # read by the replica; has no effect on the primary
# pg_hba.conf (primary) — allow replica to connect
host replication replicator 10.0.0.6/32 scram-sha-256
# Create replication user:
psql -U postgres -c "CREATE USER replicator WITH REPLICATION PASSWORD 'replpassword';"
# Bootstrap the replica (run on the replica host, into an empty data directory):
pg_basebackup -h 10.0.0.5 -U replicator -D /var/lib/postgresql/data -P -Xs -R
The -R flag writes a standby.signal file and configures primary_conninfo automatically. Start PostgreSQL on the replica, and it will begin streaming from the primary.
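Once streaming is running, replication lag can be expressed as the byte distance between two WAL positions (LSNs). The sketch below assumes you have already read the two values, for example from pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the replica; the helper names are illustrative.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000060' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high and low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Return how many bytes of WAL the replica is behind the primary."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn))


if __name__ == "__main__":
    # Example values only; real LSNs come from the two servers.
    print(replication_lag_bytes("0/3000060", "0/3000000"))
```

PostgreSQL can also do this subtraction server-side with pg_wal_lsn_diff(); the Python version is handy when the two values are collected by separate monitoring checks.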
Layer 2 — Automatic Failover with Patroni
Patroni manages PostgreSQL HA automatically: it monitors the primary, promotes a replica if the primary fails, and updates a distributed configuration store (etcd or Consul) so that clients always know which node is primary.
# Install Patroni:
pip install 'patroni[etcd]' psycopg2-binary
# patroni.yml (abbreviated):
name: pg-node-1
scope: odoo-cluster
restapi:
  listen: 0.0.0.0:8008
etcd:
  hosts: 10.0.0.20:2379
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10  # Patroni expects loop_wait + 2 * retry_timeout <= ttl
    maximum_lag_on_failover: 1048576
postgresql:
  listen: 0.0.0.0:5432
  data_dir: /var/lib/postgresql/data
  authentication:
    replication:
      username: replicator
      password: replpassword
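After failover, Patroni updates the cluster state in etcd. Clients do not need to read etcd themselves: a common pattern is to put HAProxy in front of the PostgreSQL nodes with an HTTP health check against each node's Patroni REST API (port 8008 above). Only the current leader answers 200 on /primary, so connections are routed to the new primary automatically, and pgBouncer simply points at the HAProxy endpoint.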
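To illustrate how a router can locate the primary, here is a minimal sketch that polls Patroni's REST API on each node and treats an HTTP 200 from /primary as "this node is the leader". The function names are hypothetical, and older Patroni releases expose the same check as /master.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def probe_primary(host: str, port: int = 8008, timeout: float = 2.0) -> bool:
    """Return True if Patroni on `host` answers 200 on /primary."""
    url = f"http://{host}:{port}/primary"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Non-200 answers (e.g. 503 on a replica), timeouts and refused
        # connections all mean "not the primary" for routing purposes.
        return False


def find_primary(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = probe_primary,
) -> Optional[str]:
    """Return the single host reporting itself as primary, else None."""
    leaders = [host for host in hosts if probe(host)]
    # Zero leaders (mid-failover) or several (misconfiguration) are both
    # treated as "no safe target" rather than guessing.
    return leaders[0] if len(leaders) == 1 else None
```

Calling find_primary(["10.0.0.5", "10.0.0.6"]) in a loop would show the result switch from one address to the other during a failover. In production, HAProxy's built-in HTTP checks do the same job without custom code.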
Layer 3 — Multiple Odoo Nodes
As covered in the load balancing guide, run at least two Odoo nodes. Each points to the same PostgreSQL (via pgBouncer) and the same S3 filestore. If one node goes down, nginx stops routing to it and the other handles all traffic.
# odoo.conf on all nodes:
[options]
# Virtual IP managed by keepalived
db_host = pgbouncer-vip
# S3-backed mount
data_dir = /mnt/s3fuse/odoo-filestore
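The "stops routing to it" behaviour is a passive health check: a node that fails a request is taken out of rotation for a while and retried later. The following sketch models that idea loosely, in the spirit of nginx's fail_timeout setting. It is not nginx's actual algorithm, and the node names are hypothetical.

```python
from typing import Dict, List


class UpstreamPool:
    """Round-robin over Odoo nodes, skipping nodes marked as failed."""

    def __init__(self, nodes: List[str], fail_timeout: float = 10.0) -> None:
        self.nodes = list(nodes)
        self.fail_timeout = fail_timeout
        self.down_until: Dict[str, float] = {}  # node -> time it may return
        self._next = 0

    def mark_failed(self, node: str, now: float) -> None:
        """Take a node out of rotation for fail_timeout seconds."""
        self.down_until[node] = now + self.fail_timeout

    def pick(self, now: float) -> str:
        """Return the next healthy node, or raise if none is available."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if self.down_until.get(node, 0.0) <= now:
                return node
        raise RuntimeError("no healthy Odoo node available")
```

With two nodes, marking one as failed sends every request to the other until the timeout expires, which is why the surviving node must be sized to carry the full load on its own.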
Layer 4 — Load Balancer HA with keepalived
A single nginx load balancer is itself a single point of failure. keepalived provides a Virtual IP (VIP) that floats between two nginx servers:
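# /etc/keepalived/keepalived.conf (on primary LB)
# Health check referenced by track_script below (adjust the path for your distro)
vrrp_script chk_nginx {
    script "/usr/bin/pgrep nginx"
    interval 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret
    }
    virtual_ipaddress {
        10.0.0.100/24 # the VIP your DNS points to
    }
    track_script {
        chk_nginx
    }
}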
DNS points to the VIP. If the primary LB fails, the VIP moves to the standby within a few seconds: with advert_int 1, the standby waits for roughly three missed advertisements before taking over.
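For a hard failure of the master host, the takeover delay is governed by the VRRP master-down interval. Here is a sketch of the VRRPv2 formula (RFC 3768), assuming the standby is configured with priority 90. That value is an assumption, since only the master's priority of 100 is shown above.

```python
def master_down_interval(advert_int: float, backup_priority: int) -> float:
    """Seconds a VRRPv2 backup waits without advertisements before taking over.

    Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
    Skew_Time            = (256 - Priority) / 256 seconds
    """
    if not 1 <= backup_priority <= 254:
        raise ValueError("backup priority must be between 1 and 254")
    skew_time = (256 - backup_priority) / 256
    return 3 * advert_int + skew_time


if __name__ == "__main__":
    # advert_int 1 as in the config above; priority 90 is an assumed value.
    print(f"{master_down_interval(1, 90):.2f} seconds")
```

A higher standby priority shortens the skew slightly, but the three-advertisement wait dominates, so reducing advert_int is the main lever for faster failover.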
Layer 5 — Shared S3 Filestore
Mount S3 on all Odoo nodes using s3fs or goofys (faster), or use the Odoo S3 attachment module to write directly to S3:
s3fs odoo-filestore /mnt/s3fuse -o passwd_file=/etc/s3fs.passwd \
-o url=https://s3.amazonaws.com -o use_path_request_style
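S3 itself is highly available and durable (designed for 99.999999999% durability). This removes local disks as a failure point for the filestore, although each Odoo node still depends on its own FUSE mount staying healthy.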
Recovery Time Objectives
| Component | Without HA | With HA |
|---|---|---|
| Odoo node failure | 5–15 min manual | <30 seconds automatic |
| PostgreSQL failure | 15–60 min restore | 30–60 seconds failover |
| Load balancer failure | 5–10 min manual | A few seconds (VIP failover) |
How DeployMonkey Provides HA
DeployMonkey's Enterprise plan includes multi-node Odoo, Patroni-managed PostgreSQL replication, and S3 filestore. The infrastructure is maintained, patched, and monitored — you get HA without building it yourself. For most businesses, this is far more cost-effective than running your own HA cluster.
Start free at deploymonkey.app.
Frequently Asked Questions
Is HA necessary for small Odoo installations?
Usually not. For teams under 50 users, reliable backups + fast restore (under 30 minutes) is sufficient and far cheaper. Invest in HA when downtime costs exceed the HA infrastructure cost.
What is the minimum number of servers for a proper HA setup?
Practically: 2 Odoo nodes, 2 PostgreSQL nodes (primary + replica), 2 load balancers, and 3 etcd nodes (for quorum). That is 9 servers minimum — this is why HA is expensive.
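The reason etcd needs three nodes rather than two is majority quorum. A short sketch of that arithmetic, together with the server count (the function names are illustrative):

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"{n} etcd nodes: quorum {quorum(n)}, "
              f"survives {tolerated_failures(n)} failure(s)")

    servers = {"odoo": 2, "postgresql": 2, "load_balancer": 2, "etcd": 3}
    print("total servers:", sum(servers.values()))
```

A two-node etcd cluster tolerates no failures at all, so the third node is what buys the ability to lose one.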
Can PostgreSQL replication lag cause data issues?
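In async replication, a failover can lose the last few seconds of committed transactions (RPO > 0). For zero data loss, use synchronous replication: set synchronous_standby_names on the primary (synchronous_commit = on is already the default, but has no synchronous effect without it), or enable synchronous_mode: true in Patroni. Either way, every write waits for the replica, which adds latency.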
Does Patroni work with pgBouncer?
Yes. Run pgBouncer on each Odoo node and point it at an endpoint that follows the Patroni leader (a VIP or HAProxy), or use a Patroni callback such as on_role_change to repoint pgBouncer after a failover.