query alert
query alert or trace-analytics alert
composite
service check
event alert
query alert
service check
query alert or service check
process alert
log alert
metric alert
service check
query alert
service check
rum alert
slo alert
event alert
event-v2 alert
time_aggr(time_window):space_aggr:metric{tags} [by {key}] operator #
time_aggr
: avg, sum, max, min, change, or pct_change
time_window
: last_#m (with # between 1 and 2880 depending on the monitor type), last_#h (with # between 1 and 48 depending on the monitor type), or last_1d
space_aggr
: avg, sum, min, or max
tags
: one or more tags (comma-separated), or *
key
: a 'key' in key:value tag syntax; defines a separate alert for each tag in the group (multi-alert)
operator
: <, <=, >, >=, ==, or !=
#
: an integer or decimal number used to set the threshold

If you are using the _change_ or _pct_change_ time aggregator, instead use change_aggr(time_aggr(time_window), timeshift):space_aggr:metric{tags} [by {key}] operator # with:
change_aggr
: change, pct_change
time_aggr
: avg, sum, max, min
time_window
: last_#m (between 1 and 2880 depending on the monitor type), last_#h (between 1 and 48 depending on the monitor type), or last_#d (1 or 2)
timeshift
: #m_ago (5, 10, 15, or 30), #h_ago (1, 2, or 4), or 1d_ago

Outlier monitor query example:
avg(last_30m):outliers(avg:system.cpu.user{role:es-events-data} by {host}, 'dbscan', 7) > 0
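The metric query syntax above can be assembled programmatically. The following is a minimal sketch, not part of the Datadog client libraries: `build_metric_query`, its parameter names, and the `WINDOW_LIMITS` table are hypothetical and only mirror the ranges listed above. It covers the plain `time_aggr(time_window)` form and deliberately rejects `change` and `pct_change`, which use the separate `change_aggr(...)` form.

```python
import re

# Allowed values copied from the syntax description above. change/pct_change
# are excluded because they require the change_aggr(...) form instead.
TIME_AGGRS = {"avg", "sum", "max", "min"}
SPACE_AGGRS = {"avg", "sum", "min", "max"}
OPERATORS = {"<", "<=", ">", ">=", "==", "!="}
# Upper bounds per unit (assumption: the effective limit depends on the monitor type).
WINDOW_LIMITS = {"m": 2880, "h": 48, "d": 1}


def build_metric_query(time_aggr, time_window, space_aggr, metric,
                       threshold, operator=">", tags=None, by=None):
    """Hypothetical helper: assemble a metric alert query string."""
    if time_aggr not in TIME_AGGRS:
        raise ValueError(f"unsupported time_aggr: {time_aggr!r}")
    if space_aggr not in SPACE_AGGRS:
        raise ValueError(f"unsupported space_aggr: {space_aggr!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")

    # time_window must look like last_5m, last_4h, or last_1d.
    match = re.fullmatch(r"last_(\d+)([mhd])", time_window)
    if not match or not 1 <= int(match.group(1)) <= WINDOW_LIMITS[match.group(2)]:
        raise ValueError(f"invalid time_window: {time_window!r}")

    scope = ",".join(tags) if tags else "*"              # {tags}, or * for everything
    group = " by {" + ",".join(by) + "}" if by else ""   # optional multi-alert grouping
    return (f"{time_aggr}({time_window}):{space_aggr}:{metric}"
            f"{{{scope}}}{group} {operator} {threshold}")


query = build_metric_query("avg", "last_5m", "avg", "system.cpu.user", 90,
                           tags=["env:prod"], by=["host"])
print(query)
```

Validating the pieces before joining them catches malformed windows and operators locally rather than when the query is submitted.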
Service Check Query
Example: "check".over(tags).last(count).by(group).count_by_status()
check
: name of the check, e.g. datadog.agent.up
tags
: one or more quoted tags (comma-separated), or "*", e.g. .over("env:prod", "role:db"); over cannot be blank.
count
: must be greater than or equal to your max threshold (defined in the options). It is limited to 100. For example, if you've specified to notify on 1 critical, 3 ok, and 2 warn statuses, count should be at least 3.
group
: must be specified for check monitors. Per-check grouping is already explicitly known for some service checks. For example, Postgres integration monitors are tagged by db, host, and port, and Network monitors by host, instance, and url. See Service Checks documentation for more information.

events('sources:nagios status:error,warning priority:normal tags: "string query"').rollup("count").last("1h")
event
: the event query string, made up of:
string_query
: free text query to match against event title and text.
sources
: event sources (comma-separated).
status
: event statuses (comma-separated). Valid options: error, warn, and info.
priority
: event priorities (comma-separated). Valid options: low, normal, all.
host
: event reporting host (comma-separated).
tags
: event tags (comma-separated).
excluded_tags
: excluded event tags (comma-separated).
rollup
: the stats roll-up method. count is the only supported method now.
last
: the timeframe to roll up the counts. Examples: 45m, 4h. Supported timeframes: m, h and d. This value should not exceed 48 hours.

events(query).rollup(rollup_method[, measure]).last(time_window) operator #
query
: The search query, following the Log search syntax.
rollup_method
: The stats roll-up method. Supports count, avg and cardinality.
measure
: For the avg and cardinality rollup_method, specify the measure or the facet name you want to use.
time_window
: #m (between 1 and 2880), #h (between 1 and 48).
operator
: <, <=, >, >=, ==, or !=.
#
: an integer or decimal number used to set the threshold.

processes(search).over(tags).rollup('count').last(timeframe) operator #
search
: free text search string for querying processes. Matching processes match results on the Live Processes page.
tags
: one or more tags (comma-separated)
timeframe
: the timeframe to roll up the counts. Examples: 10m, 4h. Supported timeframes: s, m, h and d
operator
: <, <=, >, >=, ==, or !=
#
: an integer or decimal number used to set the threshold

logs(query).index(index_name).rollup(rollup_method[, measure]).last(time_window) operator #
query
: The search query, following the Log search syntax.
index_name
: For multi-index organizations, the log index in which the request is performed.
rollup_method
: The stats roll-up method. Supports count, avg and cardinality.
measure
: For the avg and cardinality rollup_method, specify the measure or the facet name you want to use.
time_window
: #m (between 1 and 2880), #h (between 1 and 48).
operator
: <, <=, >, >=, ==, or !=.
#
: an integer or decimal number used to set the threshold.

12345 && 67890, where 12345 and 67890 are the IDs of non-composite monitors
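A composite query like the one above is simply a boolean expression over existing monitor IDs. The sketch below is illustrative only: `build_composite_query` and `monitor_ids_in` are hypothetical helpers, only the `&&` operator shown above is used, and the requirement of at least two IDs is an assumption drawn from the example.

```python
import re


def build_composite_query(monitor_ids):
    """Hypothetical helper: AND together the given non-composite monitor IDs."""
    ids = list(monitor_ids)
    # Assumption: a composite query combines at least two monitors.
    if len(ids) < 2:
        raise ValueError("a composite query needs at least two monitor IDs")
    # bool is a subclass of int in Python, so reject it explicitly.
    if any(isinstance(i, bool) or not isinstance(i, int) or i <= 0 for i in ids):
        raise ValueError("monitor IDs must be positive integers")
    return " && ".join(str(i) for i in ids)


def monitor_ids_in(query):
    """Return the monitor IDs referenced by a composite query, in order."""
    return [int(token) for token in re.findall(r"\d+", query)]


composite = build_composite_query([12345, 67890])
print(composite)
print(monitor_ids_in(composite))
```

Extracting the IDs back out of a query is useful for checking that every referenced monitor exists and is itself non-composite before submitting.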
name [required, default = dynamic, based on query]: The name of the alert.
message [required, default = dynamic, based on query]: A message to include with notifications for this monitor. Email notifications can be sent to specific users by using the same '@username' notation as events.
tags [optional, default = empty list]: A list of tags to associate with your monitor. When getting all monitor details via the API, use the monitor_tags argument to filter results by these tags. It is only available via the API and isn't visible or editable in the Datadog UI.

error_budget("slo_id").over("time_window") operator #
slo_id
: The alphanumeric SLO ID of the SLO you are configuring the alert for.
time_window
: The time window of the SLO target you wish to alert on. Valid options: 7d, 30d, 90d.
operator
: >= or >.

Parameter | Description |
---|---|
Message | A message to include with notifications for this monitor. |
Name | The monitor name. |
Query | The monitor query. |
Type | The type of the monitor. |
Parameter | Description |
---|---|
Enable Logs Sample | Whether or not to send a log sample when the log monitor triggers. |
Priority | Integer from 1 (high) to 5 (low) indicating alert severity. |
Restricted Roles | A list of role identifiers that can be pulled from the Roles API. Cannot be used with locked option. |
Tags | Tags associated to your monitor. |
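The parameters in the two tables above can be combined into a single request body. This is a minimal sketch under stated assumptions: `build_monitor_payload` is a hypothetical helper, the lowercase JSON key names are assumed from the parameter names, and nesting `enable_logs_sample` and `locked` under an `options` object is likewise an assumption. It only builds and validates the body; it does not call the API.

```python
import json


def build_monitor_payload(name, monitor_type, query, message, tags=None,
                          priority=None, restricted_roles=None,
                          enable_logs_sample=None, locked=False):
    """Hypothetical helper: build a monitor request body as a dict."""
    # Priority is an integer from 1 (high) to 5 (low).
    if priority is not None and not 1 <= priority <= 5:
        raise ValueError("priority must be an integer from 1 to 5")
    # Restricted roles cannot be combined with the locked option.
    if restricted_roles and locked:
        raise ValueError("restricted_roles cannot be used with locked")

    # Required parameters (key names are assumptions).
    payload = {
        "name": name,
        "type": monitor_type,
        "query": query,
        "message": message,
        "tags": list(tags or []),
    }
    # Optional parameters are only included when set.
    if priority is not None:
        payload["priority"] = priority
    if restricted_roles:
        payload["restricted_roles"] = list(restricted_roles)

    options = {}  # assumed nesting for option-style settings
    if enable_logs_sample is not None:
        options["enable_logs_sample"] = enable_logs_sample
    if locked:
        options["locked"] = True
    if options:
        payload["options"] = options
    return payload


payload = build_monitor_payload(
    name="High CPU",
    monitor_type="metric alert",
    query="avg(last_5m):avg:system.cpu.user{*} > 90",
    message="CPU is high @ops",
    tags=["team:infra"],
    priority=2,
)
print(json.dumps(payload, indent=2))
```

Leaving unset optional parameters out of the body, rather than sending nulls, keeps the request limited to the values that were actually chosen.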