Why AI Monitoring Beats Traditional Monitoring
Traditional server monitoring tools like Nagios, Zabbix, or Datadog tell you what is wrong: CPU is at 95%, memory is low, disk is full. An AI monitoring agent tells you why it is wrong and what to do about it: "CPU is at 95% because your ir_cron_mail_gateway action has been running for 47 minutes. The mail server at smtp.example.com is not responding. Check your SMTP configuration or temporarily disable the mail gateway cron."
The difference is context. Traditional monitoring has metrics. AI monitoring has metrics plus Odoo knowledge — it understands what workers do, how crons work, what PostgreSQL settings matter for Odoo, and what log patterns indicate specific problems.
What the Agent Monitors
Server Resources
| Metric | Warning Threshold | What the Agent Does |
|---|---|---|
| CPU Usage | > 80% sustained | Identifies the process causing high CPU, checks if it is a cron job, user request, or system process |
| Memory Usage | > 85% | Checks Odoo worker memory, identifies memory leaks in custom modules, recommends limit_memory_hard adjustments |
| Disk Space | > 90% | Identifies largest directories (filestore, logs, backups), recommends cleanup actions |
| Disk I/O | High iowait | Correlates with PostgreSQL vacuum operations, backup jobs, or filestore-heavy operations |
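The thresholds in this table can be expressed as a small rule check. The following is a minimal sketch, not the agent's actual implementation: the metric names, the `Finding` type, and the five-sample window used to decide what counts as "sustained" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Warning thresholds from the table above, in percent.
# The dictionary keys are hypothetical metric names.
THRESHOLDS = {"cpu_percent": 80.0, "memory_percent": 85.0, "disk_percent": 90.0}


@dataclass
class Finding:
    metric: str
    value: float
    threshold: float


def check_resources(sample: dict[str, float]) -> list[Finding]:
    """Return one Finding per metric that exceeds its warning threshold."""
    findings = []
    for metric, threshold in THRESHOLDS.items():
        value = sample.get(metric)
        if value is not None and value > threshold:
            findings.append(Finding(metric, value, threshold))
    return findings


def is_sustained(readings: list[float], threshold: float, min_samples: int = 5) -> bool:
    """True when the most recent `min_samples` readings all exceed the threshold.

    The window size is an assumption; the table only says "sustained".
    """
    recent = readings[-min_samples:]
    return len(recent) == min_samples and all(r > threshold for r in recent)
```

Separating the single-sample check from the sustained check keeps a brief CPU burst from raising the same alert as a process that has been saturating a core for minutes.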
Odoo Application
| Metric | What the Agent Checks |
|---|---|
| Worker Status | Active workers, queued requests, worker restarts, worker timeouts |
| Request Latency | Average and P95 response times, slow endpoint identification |
| Error Rate | 500 errors, access errors, validation errors, correlated with recent changes |
| Cron Jobs | Running duration, missed executions, stuck jobs |
| Session Count | Active user sessions, session growth trends |
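As an illustration of the request latency row, average and P95 response times per endpoint could be computed along these lines. This is a sketch under assumptions: the input is a list of `(endpoint, seconds)` pairs that something else has already extracted from access logs or a reverse proxy, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, quantiles


def latency_summary(requests: list[tuple[str, float]]) -> dict[str, dict[str, float]]:
    """Group (endpoint, seconds) pairs and report average and P95 per endpoint."""
    by_endpoint: dict[str, list[float]] = defaultdict(list)
    for endpoint, seconds in requests:
        by_endpoint[endpoint].append(seconds)

    summary = {}
    for endpoint, durations in by_endpoint.items():
        if len(durations) < 2:
            # Too few samples for a percentile; fall back to the single value.
            p95 = durations[0]
        else:
            # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
            p95 = quantiles(durations, n=100, method="inclusive")[94]
        summary[endpoint] = {"avg": mean(durations), "p95": p95}
    return summary
```

Sorting the result by `p95` in descending order gives a simple slow-endpoint ranking. The percentile matters because a single very slow request barely moves the average on a busy endpoint but shows up clearly in the tail.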
PostgreSQL
| Metric | What the Agent Checks |
|---|---|
| Connection Count | Active connections vs max_connections, connection pool utilization |
| Slow Queries | Queries exceeding threshold, missing indexes, sequential scans on large tables |
| Lock Contention | Blocked queries, deadlock detection, advisory lock usage |
| Table Bloat | Tables needing VACUUM, index bloat, dead tuple ratios |
| Replication Lag | For read replicas, replication delay monitoring |
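Two of these checks, connection utilization and dead tuple ratios, can be sketched as follows. The SQL reads PostgreSQL's standard statistics views; the database connection is deliberately left out, and the function names and cutoffs (a 20% dead tuple ratio and a 1,000 dead tuple floor) are illustrative assumptions rather than recommended values.

```python
# Queries against PostgreSQL's standard statistics views. Executing them
# requires a database driver and connection, which this sketch omits.
DEAD_TUPLES_SQL = """
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
"""
ACTIVE_CONNECTIONS_SQL = "SELECT count(*) FROM pg_stat_activity"
MAX_CONNECTIONS_SQL = "SHOW max_connections"


def connection_utilization(active: int, max_connections: int) -> float:
    """Fraction of max_connections currently in use."""
    return active / max_connections


def tables_needing_vacuum(
    rows: list[tuple[str, int, int]],
    min_ratio: float = 0.2,   # assumed cutoff for illustration
    min_dead: int = 1000,     # ignore small tables where ratios are noisy
) -> list[tuple[str, float]]:
    """Return (table, dead_ratio) pairs above the cutoffs, worst first.

    `rows` holds (relname, n_live_tup, n_dead_tup) as returned by DEAD_TUPLES_SQL.
    """
    flagged = []
    for relname, live, dead in rows:
        total = live + dead
        if total == 0 or dead < min_dead:
            continue
        ratio = dead / total
        if ratio >= min_ratio:
            flagged.append((relname, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```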
Odoo Logs
The agent continuously parses Odoo logs for:
- Error spikes — Sudden increase in error log entries, correlated with deployment timestamps
- Slow ORM queries — Queries taking more than 100ms, identifying the model and method
- Authentication failures — Brute force attempts, invalid API key usage
- Memory warnings — Worker killed by OOM, limit_memory_hard exceeded
- Deprecated API usage — Warnings about deprecated methods in custom modules
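Error spike detection, the first item in this list, might look like the sketch below. It assumes the default Odoo log line layout (timestamp, process ID, level, database name, logger name, message); a customized log format would need a different pattern. The spike rule, a minute with at least five errors and at least three times the median minute, is an arbitrary illustrative choice.

```python
import re
from collections import Counter
from statistics import median

# Assumed default Odoo layout, for example:
# 2024-03-01 10:16:05,000 4121 ERROR prod odoo.http: message text
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<db>\S+) (?P<logger>[^:]+): (?P<msg>.*)$"
)


def errors_per_minute(lines: list[str]) -> Counter:
    """Count ERROR and CRITICAL entries, keyed by minute."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") in {"ERROR", "CRITICAL"}:
            counts[match.group("minute")] += 1
    return counts


def spike_minutes(counts: Counter, factor: float = 3.0, min_errors: int = 5) -> list[str]:
    """Minutes with at least `min_errors` errors and `factor` times the median minute."""
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(
        minute
        for minute, n in counts.items()
        if n >= min_errors and n >= factor * baseline
    )
```

Lines that do not match the pattern, such as traceback continuation lines, are skipped rather than counted. The flagged minutes can then be compared against deployment timestamps to find a likely cause.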
How the Agent Diagnoses Issues
Example: Slow Instance
User asks: "My Odoo is slow today."
Agent diagnostic process:
- Check server resources → CPU 45%, Memory 72% → Not a resource issue
- Check Odoo workers → 2 workers, both busy, requests queuing → Insufficient workers
- Check PostgreSQL → 3 queries taking >5 seconds on sale_order table → Missing index
- Check recent changes → Module update deployed yesterday → Possible new slow query
Agent response: "Your instance is slow for two reasons: (1) You only have 2 workers on a 4-core server — requests are queuing during peak hours. Increase to 5 workers. (2) There is a slow query on sale_order that takes 5.2 seconds because there is no index on date_order + state. Run: CREATE INDEX idx_sale_order_date_state ON sale_order(date_order, state). This was likely introduced by the module update deployed yesterday."
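The four diagnostic steps in this example can be sketched as an ordered list of checks over a snapshot of the instance. Everything here is hypothetical scaffolding: the snapshot keys, the function names, and the 48-hour window for "recent" changes are assumptions, and a real agent would gather these values through its tools rather than receive them in a dictionary.

```python
from typing import Callable, Optional

Check = Callable[[dict], Optional[str]]


def resources_check(s: dict) -> Optional[str]:
    if s["cpu_percent"] > 80 or s["memory_percent"] > 85:
        return "Server resources are above warning thresholds"
    return None


def workers_check(s: dict) -> Optional[str]:
    if s["busy_workers"] >= s["workers"] and s["queued_requests"] > 0:
        return f"All {s['workers']} workers are busy and requests are queuing"
    return None


def slow_queries_check(s: dict) -> Optional[str]:
    slow = [q for q in s["queries"] if q["seconds"] > 5]
    if slow:
        tables = ", ".join(sorted({q["table"] for q in slow}))
        return f"{len(slow)} queries over 5 seconds on: {tables}"
    return None


def recent_changes_check(s: dict) -> Optional[str]:
    if s["hours_since_deploy"] <= 48:  # assumed definition of "recent"
        return "A module update was deployed in the last 48 hours"
    return None


# Same order as the diagnostic process above.
CHECKS: list[Check] = [resources_check, workers_check, slow_queries_check, recent_changes_check]


def diagnose(snapshot: dict) -> list[str]:
    """Run every check in order and collect the findings that fired."""
    findings = []
    for check in CHECKS:
        finding = check(snapshot)
        if finding is not None:
            findings.append(finding)
    return findings
```

Running all checks instead of stopping at the first hit is what lets the response above report two causes at once. The language model's role is then to turn the raw findings into an explanation and a recommended fix.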
Example: Instance Crashed
User asks: "My Odoo instance was down this morning at 3 AM."
Agent diagnostic process:
- Check system logs → OOM killer invoked at 02:47 AM
- Check Odoo logs → Cron job ir_cron_mass_mailing started at 02:30 AM
- Check memory pattern → Memory climbed from 4GB to 7.8GB during cron execution
- Check mail queue → 15,000 emails queued for sending
Agent response: "Your instance was killed by the Linux OOM killer at 02:47 AM. The mass mailing cron job tried to process 15,000 emails in a single batch, which pushed memory use from 4GB to 7.8GB. Set the mass mailing batch size to 500 emails in Settings → Email Marketing → Batch Size so each run has a bounded memory footprint, and set limit_memory_hard to 4GB so that Odoo terminates a runaway worker itself before the kernel OOM killer takes down the whole instance."
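The key step in this diagnosis is lining up the OOM kill with whatever was running at the time. A minimal sketch of that correlation, assuming job start times have already been parsed from the Odoo logs, is shown below. The function name and the one-hour look-back window are illustrative choices, and the dates in the usage example are made up, since the scenario only gives times of day.

```python
from datetime import datetime, timedelta


def jobs_started_before(
    event_time: datetime,
    job_starts: list[tuple[str, datetime]],
    window: timedelta = timedelta(hours=1),
) -> list[tuple[str, datetime]]:
    """Jobs that started within `window` before the event, most recent first."""
    candidates = [
        (name, started)
        for name, started in job_starts
        if timedelta(0) <= event_time - started <= window
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Usage with the timeline from the example (the date itself is invented).
oom_kill = datetime(2024, 3, 1, 2, 47)
starts = [
    ("ir_cron_mass_mailing", datetime(2024, 3, 1, 2, 30)),
    ("ir_cron_mail_gateway", datetime(2024, 2, 29, 23, 0)),
]
suspects = jobs_started_before(oom_kill, starts)
```

A job that started shortly before the kill is only a suspect. The memory curve and the size of the mail queue are what confirm it in the example.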
Setting Up AI Monitoring
Option 1: DeployMonkey (Managed)
DeployMonkey includes AI monitoring out of the box. The agent has SSH access to your server, reads Odoo logs and configs, and provides diagnostics through the control panel. No setup required — it works on all plans, including free.
Option 2: Self-Hosted Agent
Build your own monitoring agent:
- Collect metrics using Prometheus, node_exporter, and pg_exporter
- Collect Odoo logs via a log shipper (Filebeat, Fluentd)
- Connect an LLM (Claude, GPT-4) with tool access to query metrics and logs
- Create a prompt template that includes Odoo-specific diagnostic knowledge
- Set up alerting thresholds and notification channels
Best Practices
- Start with monitoring before automation — Let the agent observe and recommend before giving it permission to make changes
- Set up alerting tiers — Info, warning, critical. Not every metric spike needs immediate attention
- Keep historical data — The agent makes better diagnoses when it can compare current metrics against historical baselines
- Review agent recommendations — Always review before applying, especially for configuration changes
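The alerting tiers and historical baselines above can be combined in one small classifier. The sketch below is one possible approach, not a prescribed method: it scores a reading by how many standard deviations it sits above the historical mean, and the cutoffs of 2 and 3 standard deviations are assumptions that would need tuning per metric.

```python
from statistics import mean, stdev


def alert_tier(
    current: float,
    history: list[float],
    warn_sigma: float = 2.0,   # assumed cutoff
    crit_sigma: float = 3.0,   # assumed cutoff
) -> str:
    """Classify a reading as "info", "warning", or "critical" against its baseline."""
    if len(history) < 2:
        return "info"  # not enough history to judge a deviation
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is worth a look.
        return "info" if current == baseline else "warning"
    z = (current - baseline) / spread
    if z >= crit_sigma:
        return "critical"
    if z >= warn_sigma:
        return "warning"
    return "info"
```

Only upward deviations escalate here, which suits metrics like CPU or error counts. A metric where a drop is the problem, such as request throughput, would need the comparison reversed.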