Review Alert Conditions
This page describes the conditions for which you can trigger alerts related to your database deployments. You specify conditions and thresholds when configuring alerts.
M0 free clusters and M2/M5 shared clusters only trigger alerts related to the metrics supported by those clusters. See Atlas M0 (Free Cluster), M2, and M5 Limitations for complete documentation on M0/M2/M5 alert and metric limitations.
Host Alerts
The conditions in this section apply if you select Host as the alert target when configuring the alert. You can apply the condition to all hosts or to a specific type of host, such as primaries or config servers.
Advisor
Host has index suggestions
Sends an alert if Performance Advisor has index suggestions for the host.
If the query targeting ratio for a host is high for a period of 10 minutes, Performance Advisor checks the host for inefficient queries and possible indexes to improve performance. If Performance Advisor determines that the host benefits from one or more indexes, this alert triggers and directs you to create the suggested indexes.
This alert is only available for M10+ clusters, and is enabled by default for M10+ clusters that have Performance Advisor enabled. This alert does not trigger for clusters where Performance Advisor is disabled.
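As a rough illustration of the trigger described above, the sketch below treats the query targeting ratio as documents scanned divided by documents returned and checks whether it stays above a limit for a full 10-minute window. The sample format, the function names, and the `limit` value are assumptions made for this example; this page does not state what ratio Atlas considers high or how it samples the metric.

```python
from datetime import datetime, timedelta


def targeting_ratio(scanned, returned):
    """Documents scanned per document returned (hypothetical helper)."""
    # Treat "nothing returned" as one document to avoid dividing by zero.
    return scanned / max(returned, 1)


def high_for_window(samples, limit, window=timedelta(minutes=10)):
    """Return True if the ratio exceeded `limit` for the whole window.

    samples: list of (timestamp, scanned, returned) tuples, oldest first.
    limit:   hypothetical ratio regarded as "high".
    """
    if not samples:
        return False
    end = samples[-1][0]
    if end - samples[0][0] < window:
        return False  # not enough history to cover the window yet
    recent = [s for s in samples if end - s[0] <= window]
    return all(targeting_ratio(sc, rt) > limit for _, sc, rt in recent)


start = datetime(2024, 1, 1, 12, 0)
# Eleven one-minute samples, each scanning 5000 documents to return 2.
hot = [(start + timedelta(minutes=m), 5000, 2) for m in range(11)]
suggest_indexes = high_for_window(hot, limit=1000)
```

A check like this only flags a sustained problem; a single slow query inside an otherwise healthy window does not satisfy it.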
Asserts
The following alert conditions measure the rate of asserts for a MongoDB process, as collected from the MongoDB serverStatus command's asserts document. You can view asserts through cluster monitoring.
Atlas Search
The following alert conditions measure the amount of CPU and memory used by Atlas Search processes. You can view Atlas Search metrics through cluster monitoring.
Atlas Search: Index Replication Lag is
Sends an alert if the approximate number of milliseconds that Atlas Search is behind in replicating changes from the oplog of mongod is above or below the threshold.
Atlas Search: Index Size on Disk is
Sends an alert if the total size of all Atlas Search indexes on disk in bytes is above or below the threshold.
Atlas Search: Number of Error Queries is
Sends an alert if the number of queries for which Atlas Search is unable to return a response is above or below the threshold.
Atlas Search: Number of Index Fields is
Sends an alert if the total number of unique fields present in the Atlas Search index is above or below the threshold.
Atlas Search: Number of Successful Queries is
Sends an alert if the number of queries for which Atlas Search successfully returned a response is above or below the threshold.
Atlas Search: Total Number of Queries is
Sends an alert if the number of queries submitted to Atlas Search is above or below the threshold.
Atlas Search Opcounter: Delete is
Sends an alert if the total number of documents or fields (specified in the index definition) removed per second is above or below the threshold.
Atlas Search Opcounter: Getmore is
Sends an alert if the total number of getmore commands run on all Atlas Search queries per second is above or below the threshold.
Atlas Search Opcounter: Insert is
Sends an alert if the total number of documents or fields (specified in the index definition) that Atlas Search indexes per second is above or below the threshold.
Atlas Search Opcounter: Update is
Sends an alert if the total number of documents or fields (specified in the index definition) that Atlas Search updates per second is above or below the threshold.
Insufficient disk space to support rebuilding search indexes
Sends an alert if your database deployment does not have enough free disk space to support rebuilding your Atlas Search indexes.
Search Memory: Resident is
Sends an alert if the total bytes of resident memory occupied by the Atlas Search process is above or below the threshold.
Search Memory: Shared is
Sends an alert if the total bytes of shared memory occupied by the Atlas Search process is above or below the threshold.
Search Memory: Virtual is
Sends an alert if the total bytes of virtual memory occupied by the Atlas Search process is above or below the threshold.
Search Process: CPU (Kernel) % is
Sends an alert if the percentage of time the CPU spent servicing operating system calls for the Atlas Search process is above the threshold.
Search Process: CPU (User) % is
Sends an alert if the percentage of time the CPU spent servicing the Atlas Search process is above the threshold.
Average Execution Time
The following alert conditions measure the average execution time of reads, writes, or commands for a MongoDB process, as collected from the MongoDB serverStatus command's opLatencies document. You can view these metrics through cluster monitoring.
Average Execution Time: Commands is
Sends an alert if the average execution time for command operations meets the specified threshold.
Opcounter
The following alert conditions measure the rate of database operations on a MongoDB process since the process last started, as collected from the MongoDB serverStatus command's opcounters document. You can view opcounters through cluster monitoring.
Opcounter: Getmores is
Sends an alert if the rate of getmore operations to retrieve the next cursor batch meets the specified threshold.
Tip: To learn more, see Cursor Batches in the MongoDB manual.
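Because the opcounters document holds cumulative counts since the process started, a rate has to be derived from two samples. The following is a minimal sketch of that derivation using plain dictionaries shaped like the opcounters document; the function name, the sample values, and the threshold are assumptions for illustration and do not describe how Atlas itself computes the metric.

```python
def opcounter_rates(previous, current, elapsed_seconds):
    """Per-second rate for each opcounter between two samples.

    `previous` and `current` are dicts shaped like the `opcounters`
    document, for example {"query": 42, "getmore": 7}.
    Returns None if any counter went backwards, which suggests the
    process restarted and the counters began again from zero.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, now in current.items():
        before = previous.get(name, 0)
        if now < before:
            return None
        rates[name] = (now - before) / elapsed_seconds
    return rates


# Two hypothetical samples taken 60 seconds apart.
before = {"query": 1000, "getmore": 200}
after = {"query": 1600, "getmore": 500}
rates = opcounter_rates(before, after, elapsed_seconds=60)

# Hypothetical threshold: alert above 4 getmore operations per second.
getmore_alert = rates["getmore"] > 4.0
```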
Opcounter - Repl
The following alert conditions measure the rate of database operations on MongoDB secondaries, as collected from the MongoDB serverStatus command's opcountersRepl document. You can view these metrics on the Opcounters - Repl chart, accessed through cluster monitoring.
Opcounter: Repl Cmd is
Sends an alert if the rate of replicated commands meets the specified threshold.
Opcounter: Repl Update is
Sends an alert if the rate of replicated updates meets the specified threshold.
Atlas Free Clusters
Memory
The following alert conditions measure memory for a MongoDB process, as collected from the MongoDB serverStatus command's mem document. You can view these metrics on the Atlas Memory and Non-Mapped Virtual Memory charts, accessed through cluster monitoring.
Memory: Resident is
Sends an alert if the size of the resident memory meets the specified threshold. It is typical over time, on a dedicated database server, for the size of the resident memory to approach the amount of physical RAM on the box.
Memory: Virtual is
Sends an alert if the size of virtual memory for the mongod process meets the specified threshold. You can use this alert to flag excessive memory outside of memory mapping.
Tip: To learn more, click the Memory chart's i icon.
Memory: Computed is
Sends an alert if the size of virtual memory that is not accounted for by memory-mapping meets the specified threshold. If this number is very high (multiple gigabytes), it indicates that excessive memory is being used outside of memory mapping.
Tip: To learn how to use this metric, view the Non-Mapped Virtual Memory chart and click the chart's i icon.
System Memory: Max Used is
Sends an alert if the maximum system memory usage value meets the specified threshold.
System Memory: Free is
Sends an alert if the amount of free system memory drops below the specified threshold.
System Memory: Max Free is
Sends an alert if the maximum amount of free system memory drops below the specified threshold.
Connections
The following alert condition measures connections to a MongoDB process, as collected from the MongoDB serverStatus command's connections document. You can view this metric on the Atlas Connections chart, accessed through cluster monitoring.
Queues
The following alert conditions measure operations waiting on locks, as collected from the MongoDB serverStatus command's globalLock document. You can view these metrics on the Atlas Queues chart, accessed through cluster monitoring.
Queues: Total is
Sends an alert if the number of operations waiting on a lock of any type meets the specified average.
Queues: Readers is
Sends an alert if the number of operations waiting on a read lock meets the specified average.
Queues: Writers is
Sends an alert if the number of operations waiting on a write lock meets the specified average.
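The queue conditions above compare an average against a threshold. As a minimal sketch, the snippet below averages the `currentQueue` values of several samples shaped like the globalLock document. The sample values and the function name are assumptions, and the averaging period Atlas uses is not stated on this page.

```python
from statistics import mean


def average_queue(samples, kind="total"):
    """Average number of queued operations across samples.

    Each sample is a dict shaped like the `globalLock` document;
    `kind` is one of "total", "readers", or "writers".
    """
    return mean(sample["currentQueue"][kind] for sample in samples)


# Three hypothetical samples of the globalLock document.
samples = [
    {"currentQueue": {"total": 4, "readers": 1, "writers": 3}},
    {"currentQueue": {"total": 10, "readers": 6, "writers": 4}},
    {"currentQueue": {"total": 7, "readers": 2, "writers": 5}},
]

# Hypothetical threshold: alert when more than 5 operations wait on average.
total_alert = average_queue(samples) > 5
```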
Page Faults
The following alert condition measures the rate of page faults for a MongoDB process, as collected from the MongoDB serverStatus command's extra_info.page_faults field.
Page Faults is
Sends an alert if the rate of page faults (whether or not an exception is thrown) meets the specified threshold. You can view this metric on the Atlas Page Faults chart, accessed through cluster monitoring.
Cursors
The following alert conditions measure the number of cursors for a MongoDB process, as collected from the MongoDB serverStatus command's metrics.cursor document. You can view these metrics on the Atlas Cursors chart, accessed through cluster monitoring.
Network
The following alert conditions measure throughput for a MongoDB process, as collected from the MongoDB serverStatus command's network document. You can view these metrics on a host's Network chart, accessed through cluster monitoring.
Network: Bytes In is
Sends an alert if the number of bytes sent to MongoDB meets the specified threshold.
Replication Oplog
The following alert conditions apply to the MongoDB process's oplog. You can view these metrics on the following charts, accessed through cluster monitoring:
- Replication Oplog Window
- Replication Lag
- Replication Headroom
- Oplog GB/Hour
The following alert conditions apply to the oplog:
Replication Oplog Window is
Sends an alert if the approximate amount of time available in the primary's replication oplog meets the specified threshold.
Replication Lag is
Sends an alert if the approximate amount of time that the secondary is behind the primary meets the specified threshold. Atlas calculates replication lag using the approach described in Check the Replication Lag in the MongoDB manual.
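One common way to estimate replication lag is to subtract each secondary's last applied operation time from the primary's. The sketch below does this over dictionaries shaped like the `members` array returned by replSetGetStatus (`name`, `stateStr`, `optimeDate`). It is an approximation for illustration only; the host names are made up, and the exact calculation Atlas performs may differ.

```python
from datetime import datetime


def replication_lag_seconds(members):
    """Lag of each secondary behind the primary, in seconds.

    `members` mimics the `members` array of replSetGetStatus. Each entry
    has a `name`, a `stateStr`, and an `optimeDate` (the time of the
    last oplog entry that member applied).
    """
    primaries = [m for m in members if m["stateStr"] == "PRIMARY"]
    if not primaries:
        return {}  # lag is undefined while there is no primary
    primary_time = primaries[0]["optimeDate"]
    return {
        m["name"]: max((primary_time - m["optimeDate"]).total_seconds(), 0.0)
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical three-member replica set.
members = [
    {"name": "host-a:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2024, 1, 1, 12, 0, 30)},
    {"name": "host-b:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 1, 12, 0, 30)},
    {"name": "host-c:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2024, 1, 1, 12, 0, 18)},
]
lag = replication_lag_seconds(members)
```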
DB Storage
The following alert conditions apply to database storage, as collected for a MongoDB process by the MongoDB dbStats command. For details on how Atlas handles reaching database storage limits, refer to the FAQ page. These conditions are based on the summed total of all databases on the MongoDB process:
DB Storage is
Sends an alert if the allocated storage meets the specified threshold. This alert condition can be viewed on a host's DB Storage chart, accessed through cluster monitoring.
WiredTiger Storage Engine
The following alert conditions apply to the MongoDB process's WiredTiger storage engine, as collected from the MongoDB serverStatus command's wiredTiger.cache and wiredTiger.concurrentTransactions documents.
You can view these metrics on the following charts, accessed through cluster monitoring:
- Tickets Available
- Cache Activity
- Cache Usage
The following are the alert conditions that apply to WiredTiger:
Tickets Available: Reads is
Sends an alert if the number of read tickets available to the WiredTiger storage engine meets the specified threshold.
Tickets Available: Writes is
Sends an alert if the number of write tickets available to the WiredTiger storage engine meets the specified threshold.
Cache: Dirty Bytes is
Sends an alert when the number of dirty bytes in the WiredTiger cache meets the specified threshold.
Cache: Used Bytes is
Sends an alert when the number of used bytes in the WiredTiger cache meets the specified threshold.
System and Disk Alerts
The following alert conditions measure usage on your Atlas server clusters:
Currently, Atlas uses a single partition for data, index, and journal files. Even though the alerts reference individual partitions, they point to the same metric.
System: CPU (Steal) % is
Applicable when the EC2 cluster credit balance is exhausted.
The percentage by which the CPU usage exceeds the guaranteed baseline CPU credit accumulation rate. CPU credits are units of CPU utilization that you accumulate. The credits accumulate at a constant rate to provide a guaranteed level of performance. These credits can be used for additional CPU performance. When the credit balance is exhausted, only the guaranteed baseline of CPU performance is provided, and the amount of excess is shown as steal percent.
Note: Atlas triggers this alert only for AWS EC2 clusters that support Burstable Performance. Currently, these are M10 and M20 cluster types.
System: Max CPU (Steal) % is
Sends an alert if the maximum percentage by which the CPU usage exceeds the guaranteed baseline CPU credit accumulation rate exceeds the specified threshold.
System: CPU (User) % is
The CPU usage of the MongoDB process, normalized by the number of CPUs. This value is scaled to a range of 0-100%.
System: Max CPU (User) % is
Sends an alert if the maximum CPU usage of the MongoDB process, normalized by the number of CPUs, exceeds the specified threshold.
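To make the normalization concrete, the sketch below assumes the raw figure is a percentage summed across all cores (so a fully busy 4-core host would report 400%) and divides it by the core count to obtain a value between 0 and 100. Both that assumption and the function name are illustrative; this page does not describe the raw measurement.

```python
def normalized_cpu_percent(summed_percent, cpu_count):
    """Scale a CPU percentage summed across cores to the 0-100 range."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    value = summed_percent / cpu_count
    # Clamp to guard against sampling noise just outside the range.
    return min(max(value, 0.0), 100.0)


# Hypothetical reading: 320% summed over 4 cores.
user_percent = normalized_cpu_percent(320.0, 4)

# Hypothetical threshold: alert above 75%.
cpu_alert = user_percent > 75.0
```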
System Network In is
Sends an alert if the average rate of physical bytes received per second by the eth0 network interface reaches the specified threshold.
Max System Network In is
Sends an alert if the maximum number of bytes sent to MongoDB meets the specified threshold.
System Network Out is
Sends an alert if the average rate of physical bytes transmitted per second by the eth0 network interface reaches the specified threshold.
Max System Network Out is
Sends an alert if the maximum number of bytes sent from MongoDB meets the specified threshold.
Disk space % used on Data Partition is
The percentage of disk space used on any partition that contains the MongoDB collection's data.
To find possible solutions for this alert, see Alert Resolutions.
Max disk space % used on Data Partition is
Sends an alert if the maximum percentage of disk space used on any partition that contains the MongoDB collection's data exceeds the specified threshold.
Disk space % used on Index Partition is
The percentage of disk space used on any partition that contains the MongoDB index data.
To find possible solutions for this alert, see Alert Resolutions.
Max disk space % used on Index Partition is
Sends an alert if the maximum percentage of disk space used on any partition that contains the MongoDB index data exceeds the specified threshold.
Disk space % used on Journal Partition is
The percentage of disk space used on the partition that contains the MongoDB journal, if journaling is enabled.
To find possible solutions for this alert, see Alert Resolutions.
Max disk space % used on Journal Partition is
Sends an alert if the maximum percentage of disk space used on the partition that contains the MongoDB journal exceeds the specified threshold.
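The disk space conditions above are percentages of a partition's capacity. The following minimal sketch computes that percentage with Python's standard `shutil.disk_usage`. The function names and the threshold are assumptions, and the figure a local operating system reports can differ slightly from what a monitoring agent reports.

```python
import shutil


def percent_used(used_bytes, total_bytes):
    """Percentage of a partition's capacity that is in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def partition_percent_used(path):
    """Percentage used on the partition that contains `path`."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.used, usage.total)


# Hypothetical threshold: alert when more than 90% of the disk is used.
current = partition_percent_used(".")
disk_alert = current > 90.0
```

When data, index, and journal files share one partition, calling `partition_percent_used` on each of their paths returns the same value, which is why the three alert families point to the same metric.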
Disk I/O % utilization on Data Partition is
The percentage of time during which requests are being issued to any partition that contains the MongoDB collection's data. This includes requests from any process, not just MongoDB processes. The threshold is specified when the alert is created.
Max disk I/O % utilization on Data Partition is
Sends an alert if the maximum percentage of time during which requests are being issued to any partition that contains the MongoDB collection data exceeds the specified threshold.
Disk I/O % utilization on Index Partition is
The percentage of time during which requests are being issued to any partition that contains the MongoDB index data. This includes requests from any process, not just MongoDB processes.
Max disk I/O % utilization on Index Partition is
Sends an alert if the maximum percentage of time during which requests are being issued to any partition that contains the MongoDB index data exceeds the specified threshold.
Disk I/O % utilization on Journal Partition is
The percentage of time during which requests are being issued to the partition that contains the MongoDB journal, if journaling is enabled. This includes requests from any process, not just MongoDB processes.
Max disk I/O % utilization on Journal Partition is
Sends an alert if the maximum percentage of time during which requests are being issued to the partition that contains the MongoDB journal exceeds the specified threshold.
Disk Queue depth on Data Partition is
Sends an alert if the average length of the queue of requests issued to the data partition that MongoDB uses exceeds the specified threshold.
Max disk queue depth on Data Partition is
Sends an alert if the maximum average length of the queue of requests issued to the data partition that MongoDB uses exceeds the specified threshold.
Disk Queue depth on Index Partition is
Sends an alert if the average length of the queue of requests issued to the index partition that MongoDB uses exceeds the specified threshold.
Max disk queue depth on Index Partition is
Sends an alert if the maximum average length of the queue of requests issued to the index partition that MongoDB uses exceeds the specified threshold.
Disk Queue depth on Journal Partition is
Sends an alert if the average length of the queue of requests issued to the journal partition that MongoDB uses exceeds the specified threshold.
Max disk queue depth on Journal Partition is
Sends an alert if the maximum average length of the queue of requests issued to the journal partition that MongoDB uses exceeds the specified threshold.
Disk read IOPS on Data Partition is
Sends an alert if the average number of disk read operations per second exceeds the specified threshold.
Max disk read IOPS on Data Partition is
Sends an alert if the maximum average number of disk read operations per second exceeds the specified threshold.
Disk read IOPS on Index Partition is
Sends an alert if the average number of disk read operations per second exceeds the specified threshold.
Max disk read IOPS on Index Partition is
Sends an alert if the maximum average number of disk read operations per second exceeds the specified threshold.
Disk read IOPS on Journal Partition is
Sends an alert if the average number of disk read operations per second exceeds the specified threshold.
Max disk read IOPS on Journal Partition is
Sends an alert if the maximum average number of disk read operations per second exceeds the specified threshold.
Disk read latency on Data Partition is
Sends an alert if the amount of latency on disk read operations exceeds the specified threshold.
Max disk read latency on Data Partition is
Sends an alert if the maximum amount of latency on disk read operations exceeds the specified threshold.
Disk read latency on Index Partition is
Sends an alert if the amount of latency on disk read operations exceeds the specified threshold.
Max disk read latency on Index Partition is
Sends an alert if the maximum amount of latency on disk read operations exceeds the specified threshold.
Disk read latency on Journal Partition is
Sends an alert if the amount of latency on disk read operations exceeds the specified threshold.
Max disk read latency on Journal Partition is
Sends an alert if the maximum amount of latency on disk read operations exceeds the specified threshold.
Disk write IOPS on Data Partition is
Sends an alert if the average number of disk write operations per second exceeds the specified threshold.
Max disk write IOPS on Data Partition is
Sends an alert if the maximum average number of disk write operations per second exceeds the specified threshold.
Disk write IOPS on Index Partition is
Sends an alert if the average number of disk write operations per second exceeds the specified threshold.
Max disk write IOPS on Index Partition is
Sends an alert if the maximum average number of disk write operations per second exceeds the specified threshold.
Disk write IOPS on Journal Partition is
Sends an alert if the average number of disk write operations per second exceeds the specified threshold.
Max disk write IOPS on Journal Partition is
Sends an alert if the maximum average number of disk write operations per second exceeds the specified threshold.
Disk write latency on Data Partition is
Sends an alert if the amount of latency on disk write operations exceeds the specified threshold.
Max disk write latency on Data Partition is
Sends an alert if the maximum amount of latency on disk write operations exceeds the specified threshold.
Disk write latency on Index Partition is
Sends an alert if the amount of latency on disk write operations exceeds the specified threshold.
Max disk write latency on Index Partition is
Sends an alert if the maximum amount of latency on disk write operations exceeds the specified threshold.
Restarts
Host Down
Host is Down
Sends an alert if Atlas is unable to reach a host for several minutes.
Important: You should only configure this alert if you depend on secondary reads. For more information on secondary reads, see Tag Your Replica Sets and Read Preference.
This alert is generally triggered by one of the following conditions:
- The cluster has experienced a failure and is being auto-healed.
- The cluster could not be reached because of a network issue.
MongoDB Atlas checks that the downtime did not occur because of your actions, such as rolling index builds. If MongoDB Atlas confirms that the downtime was not intentional, MongoDB Atlas attempts to replace the affected node. If failures occur, Atlas clusters maintain node availability for both reads and writes as long as a majority of nodes are running. To learn more, see How does MongoDB Atlas deliver high availability?.
Swap
The following alert conditions apply to swap space usage:
Swap Usage: Used is
Sends an alert if the total amount of swap space in use reaches the specified threshold.
Swap Usage: Max Used is
Sends an alert if the maximum total amount of swap space in use reaches the specified threshold.
Inapplicable Host Conditions
The following host conditions do not apply to Atlas. Atlas will not generate alerts for the following conditions:
- Memory: Mapped is
- B-tree: accesses is
- B-tree: hits is
- B-tree: misses is
- B-tree: miss ratio is
- Effective Lock % is
- Background Flush Average is
- Accesses Not In Memory: Total is
- Page Fault Exceptions Thrown: Total is
- Cursors: Client Cursors Size is
- Journaling Commits in Write Lock is
- Journaling MB is
- Journaling Write Data Files MB is
Query Targeting Alerts
The following alerts apply to indexes on your collections. Either alert may indicate a missing or inefficient index.
To learn more about indexing to improve performance, see Indexing Strategies.
Cloud Backup Alerts
The following alerts apply to Cloud Backup snapshots.
Fallback snapshot taken
Sends an alert when a regular backup fails, but Atlas was able to take a fallback snapshot.
Replica Set Alerts
The following alert conditions apply to replica sets:
Replica set has no primary
Sends an alert when a replica set does not have a primary. Specifically, when none of the members of a replica set have a status of PRIMARY, the alert triggers. For example, this condition may arise when a set has an even number of voting members resulting in a tie.
If Atlas collects data during an election, this alert might send a false positive. To prevent such false positives, set the alert configuration's after waiting interval (in the configuration's Send to section).
To find possible solutions for this alert, see Alert Resolutions.
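The effect of a waiting interval can be sketched as follows: the condition is true whenever no member reports PRIMARY, but a notification is only due once that state has persisted for the whole interval. The sample format, function names, and interval lengths are assumptions for illustration and do not describe Atlas internals.

```python
def has_primary(members):
    """True if any member reports a state of PRIMARY."""
    return any(m["stateStr"] == "PRIMARY" for m in members)


def should_notify(samples, wait_seconds):
    """Decide whether a "no primary" notification is due.

    samples: list of (unix_time, members) tuples, oldest first.
    Returns True only if the most recent run of samples without a
    primary has lasted at least `wait_seconds`.
    """
    missing_since = None
    for timestamp, members in samples:
        if has_primary(members):
            missing_since = None  # condition cleared, reset the clock
        elif missing_since is None:
            missing_since = timestamp
    if missing_since is None:
        return False
    return samples[-1][0] - missing_since >= wait_seconds


healthy = [{"stateStr": "PRIMARY"}, {"stateStr": "SECONDARY"}]
electing = [{"stateStr": "SECONDARY"}, {"stateStr": "SECONDARY"}]

# A brief election: the primary is back by the next sample.
brief = [(0, healthy), (10, electing), (20, healthy)]
# A sustained outage: no primary from t=10 through t=130.
outage = [(0, healthy), (10, electing), (70, electing), (130, electing)]
```

With a 120-second interval, `should_notify(brief, 120)` stays false while `should_notify(outage, 120)` becomes true, which is the false-positive suppression the waiting interval is meant to provide.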
Replica set elected a new primary
Sends an alert when a replica set elects a new primary.
Number of elections in last hour is > X
Sends an alert when the number of elections that have occurred in the last hour exceeds the user-specified value of X. The value of X is set when you create the alert. This alert may indicate that the cluster's replication is not in a healthy state, as evidenced by constant elections.
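Counting events in the last hour is a sliding-window problem. The sketch below keeps election timestamps in a deque and discards those older than the window before comparing the count with X. The class name and the timestamps are hypothetical, and this is not a description of how Atlas tracks elections.

```python
from collections import deque


class ElectionCounter:
    """Count elections inside a sliding time window (default: one hour)."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self.times = deque()

    def record(self, timestamp):
        """Record an election at `timestamp` (seconds, increasing)."""
        self.times.append(timestamp)

    def count(self, now):
        """Number of elections within the window ending at `now`."""
        while self.times and now - self.times[0] > self.window_seconds:
            self.times.popleft()  # drop elections older than the window
        return len(self.times)

    def exceeds(self, now, x):
        """True if more than `x` elections occurred in the window."""
        return self.count(now) > x


counter = ElectionCounter()
for election_time in (0, 600, 1200, 3000):
    counter.record(election_time)

alert_at_3500 = counter.exceeds(now=3500, x=3)
alert_at_4300 = counter.exceeds(now=4300, x=3)
```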
Sharded Cluster Alerts
The following alert condition applies to sharded clusters:
Cluster is missing an active mongos
Sends an alert if Atlas cannot reach a mongos for the cluster.
Serverless Alerts
The following alert conditions apply to serverless instances:
User Alerts
The following alert conditions apply to Atlas users.
Organization users do not have multi-factor authentication enabled
Sends an alert when one or more users in an organization do not have multi-factor authentication enabled.
Project Alerts
The following alert conditions apply to your Atlas project.
Billing Alerts
The following alert conditions apply to Atlas billing. You can configure billing alerts from the Atlas UI at the organization level or the project level.
To configure organization-level alerts, select your organization and navigate to Alerts.
To configure project-level alerts, select your project. Navigate to your project settings and then to Alerts, or click the icon in the top right corner of your project cluster view.
All amounts billed are in USD.
Credit card is about to expire
Sends an alert if the credit card on file is about to expire. The alert is triggered at the beginning of the month that the card expires. Atlas enables this alert when a credit card is added for the first time.
This condition applies to both organizations and projects.
Amount billed ($) yesterday is above the threshold
Sends an alert if the organization or project's last daily amount billed exceeds your configured threshold. Atlas does not account for any credits applied for the previous day when calculating the billed amount.
This condition applies to both organizations and projects.
Current bill ($) for any single project is above the threshold
Sends an alert if the monthly total for any project within the organization exceeds your configured threshold for all projects. When the current pending invoice closes, this alert resets.
This alert condition applies to organizations only.
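The single-project billing condition applies one threshold to every project in the organization. As a small sketch, the function below returns the projects whose month-to-date total exceeds that threshold. The project names and amounts are invented for the example.

```python
def projects_over_threshold(monthly_totals_usd, threshold_usd):
    """Names of projects whose month-to-date bill exceeds the threshold.

    monthly_totals_usd: dict mapping project name to amount in USD.
    """
    return sorted(
        name
        for name, total in monthly_totals_usd.items()
        if total > threshold_usd
    )


# Hypothetical month-to-date totals for one organization.
totals = {"analytics": 420.0, "staging": 80.5, "production": 1310.25}
over = projects_over_threshold(totals, threshold_usd=400.0)
bill_alert = bool(over)
```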
Federation Alerts
Organization's IdP certificate is about to expire
Sends an alert when an IdP certificate associated with an organization for which you have the Organization Owner role expires within 14 days. Atlas sends this alert daily until you acknowledge it.
Note: Atlas creates this alert automatically when you map an organization to an IdP provider. If you remove the mapping, Atlas deletes all instances of this alert.
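The 14-day expiry check can be sketched as a comparison between a certificate's expiry time and the current time. The function name is hypothetical, and treating an already expired certificate as matching the condition is an assumption of this example rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone


def certificate_expiring(not_after, now, warn_within=timedelta(days=14)):
    """True once the certificate has `warn_within` or less validity left.

    Assumption: an already expired certificate also counts as expiring.
    """
    return not_after - now <= warn_within


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
expires_in_10_days = certificate_expiring(now + timedelta(days=10), now)
expires_in_30_days = certificate_expiring(now + timedelta(days=30), now)
```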
Encryption at Rest Alerts
The following alert conditions apply to projects using Encryption at Rest using Customer Key Management.
AWS encryption key elapsed time since last rotation is above (n) days
Sends an alert if the AWS Customer Master Key (CMK) used by the Atlas project has been active for more than the configured number of days (90 by default). You can modify the alert threshold from the Alert Settings tab of the Alerts view.
Atlas automatically rotates MongoDB master encryption keys every 90 days, but does not rotate the project's CMK.
This alert resets automatically if you rotate the project CMK. For documentation on how to rotate your project CMK, see Rotate your AWS Customer Master Key.
Azure encryption key elapsed time since last rotation is above (n) days
Sends an alert if the Azure Key Vault Key Identifier used by the Atlas project has been active for more than the configured number of days (90 by default). You can modify the alert threshold from the Alert Settings tab of the Alerts view.
Atlas automatically rotates MongoDB master encryption keys every 90 days, but does not rotate the project's Key Identifier.
This alert resets automatically if you rotate the project Key Identifier. For documentation on how to rotate your project Key Identifier, see Rotate your Azure Key Identifier.
GCP encryption key elapsed time since last rotation is above (n) days
Sends an alert if the GCP Key Version Resource ID used by the Atlas project has been active for more than the configured number of days (90 by default). You can modify the alert threshold from the Alert Settings tab of the Alerts view.
Atlas automatically rotates MongoDB master encryption keys every 90 days, but does not rotate the project's Key Version Resource ID.
This alert resets automatically if you rotate the project Key Version Resource ID.
Tip: To learn how to rotate your project Key Version Resource ID, see Rotate your GCP Key Version Resource ID.
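The three key rotation conditions in this section share the same logic: count the days since the key was last rotated and compare the result with the configured limit (90 by default). A minimal sketch, with hypothetical function names and dates:

```python
from datetime import date


def days_since_rotation(last_rotated, today):
    """Whole days elapsed since the key was last rotated."""
    return (today - last_rotated).days


def rotation_overdue(last_rotated, today, max_days=90):
    """True if the key has been active for more than `max_days` days."""
    return days_since_rotation(last_rotated, today) > max_days


last_rotated = date(2024, 1, 1)
overdue_on_day_90 = rotation_overdue(last_rotated, date(2024, 3, 31))
overdue_on_day_91 = rotation_overdue(last_rotated, date(2024, 4, 1))
```

Rotating the key moves `last_rotated` forward, so the check returns false again, mirroring the automatic reset described above.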
Maintenance Window Alerts
The following alert conditions apply to projects with configured maintenance windows.
You can only configure maintenance window alerts if a project has an active maintenance window.