Monitor Types
The type of monitor, chosen from:

- anomaly: `query alert`
- APM: `query alert` or `trace-analytics alert`
- composite: `composite`
- custom: `service check`
- event: `event alert`
- forecast: `query alert`
- host: `service check`
- integration: `query alert` or `service check`
- live process: `process alert`
- logs: `log alert`
- metric: `metric alert`
- network: `service check`
- outlier: `query alert`
- process: `service check`
- rum: `rum alert`
- SLO: `slo alert`
- watchdog: `event alert`
- event-v2: `event-v2 alert`
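As a minimal sketch, the category-to-type list above can be kept as a lookup table so that a type value is checked before a monitor is submitted. The names `MONITOR_TYPES` and `allowed_types` are hypothetical and exist only for this illustration.

```python
# Monitor category -> accepted type value(s), transcribed from the list above.
MONITOR_TYPES = {
    "anomaly": ["query alert"],
    "APM": ["query alert", "trace-analytics alert"],
    "composite": ["composite"],
    "custom": ["service check"],
    "event": ["event alert"],
    "forecast": ["query alert"],
    "host": ["service check"],
    "integration": ["query alert", "service check"],
    "live process": ["process alert"],
    "logs": ["log alert"],
    "metric": ["metric alert"],
    "network": ["service check"],
    "outlier": ["query alert"],
    "process": ["service check"],
    "rum": ["rum alert"],
    "SLO": ["slo alert"],
    "watchdog": ["event alert"],
    "event-v2": ["event-v2 alert"],
}


def allowed_types(category: str) -> list:
    """Return the type values listed for a monitor category (hypothetical helper)."""
    if category not in MONITOR_TYPES:
        raise ValueError(f"unknown monitor category: {category!r}")
    return MONITOR_TYPES[category]
```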
Query Types
Metric Alert Query
Example: `time_aggr(time_window):space_aggr:metric{tags} [by {key}] operator #`

- `time_aggr`: avg, sum, max, min, change, or pct_change
- `time_window`: `last_#m` (with `#` between 1 and 2880 depending on the monitor type), `last_#h` (with `#` between 1 and 48 depending on the monitor type), or `last_1d`
- `space_aggr`: avg, sum, min, or max
- `tags`: one or more tags (comma-separated), or `*`
- `key`: a 'key' in key:value tag syntax; defines a separate alert for each tag in the group (multi-alert)
- `operator`: <, <=, >, >=, ==, or !=
- `#`: an integer or decimal number used to set the threshold

If you are using the _change_ or _pct_change_ time aggregator, instead use `change_aggr(time_aggr(time_window), timeshift):space_aggr:metric{tags} [by {key}] operator #` with:

- `change_aggr`: change, pct_change
- `time_aggr`: avg, sum, max, min
- `time_window`: `last_#m` (between 1 and 2880 depending on the monitor type), `last_#h` (between 1 and 48 depending on the monitor type), or `last_#d` (1 or 2)
- `timeshift`: `#m_ago` (5, 10, 15, or 30), `#h_ago` (1, 2, or 4), or `1d_ago`

Outlier monitor example: `avg(last_30m):outliers(avg:system.cpu.user{role:es-events-data} by {host}, 'dbscan', 7) > 0`
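The plain metric alert template above can be assembled from its parts. The following is a sketch only: `metric_alert_query` is a hypothetical helper, it covers the simple form (not the `change_aggr` variant), and it does not check the numeric range of the time window.

```python
TIME_AGGRS = {"avg", "sum", "max", "min"}
SPACE_AGGRS = {"avg", "sum", "min", "max"}
OPERATORS = {"<", "<=", ">", ">=", "==", "!="}


def metric_alert_query(time_aggr, time_window, space_aggr, metric,
                       tags=("*",), by=None, operator=">", threshold=0):
    """Build time_aggr(time_window):space_aggr:metric{tags} [by {key}] operator #."""
    if time_aggr not in TIME_AGGRS:
        raise ValueError(f"unsupported time_aggr: {time_aggr!r}")
    if space_aggr not in SPACE_AGGRS:
        raise ValueError(f"unsupported space_aggr: {space_aggr!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    scope = ",".join(tags)  # tags are comma-separated inside the braces
    query = f"{time_aggr}({time_window}):{space_aggr}:{metric}{{{scope}}}"
    if by:
        query += f" by {{{by}}}"  # one alert per value of this tag key
    return f"{query} {operator} {threshold}"


print(metric_alert_query("avg", "last_5m", "avg", "system.cpu.user",
                         tags=["env:prod"], by="host", threshold=90))
```

The metric name and tag in the usage line are placeholders chosen for the example.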
Service Check Query
Example: `"check".over(tags).last(count).by(group).count_by_status()`

- `check`: name of the check, e.g. `datadog.agent.up`
- `tags`: one or more quoted tags (comma-separated), or "*", e.g. `.over("env:prod", "role:db")`; `over` cannot be blank.
- `count`: must be greater than or equal to your max threshold (defined in the `options`). It is limited to 100. For example, if you've specified to notify on 1 critical, 3 ok, and 2 warn statuses, `count` should be at least 3.
- `group`: must be specified for check monitors. Per-check grouping is already explicitly known for some service checks. For example, Postgres integration monitors are tagged by `db`, `host`, and `port`, and Network monitors by `host`, `instance`, and `url`. See Service Checks documentation for more information.
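The rule that `count` must be at least the largest threshold can be sketched in code. `service_check_query` is a hypothetical helper; quoting the group names the same way as the tags is an assumption, since the template above does not show how `group` is written.

```python
def service_check_query(check, tags, groups, thresholds):
    """Build "check".over(tags).last(count).by(group).count_by_status().

    thresholds maps a status name to the number of consecutive results that
    should trigger a notification, e.g. {"critical": 1, "ok": 3, "warning": 2}.
    """
    if not tags:
        raise ValueError("over cannot be blank; pass at least one tag or '*'")
    if not groups:
        raise ValueError("group must be specified for check monitors")
    count = max(thresholds.values())  # smallest count that satisfies every threshold
    if count > 100:
        raise ValueError("count is limited to 100")
    over = ", ".join(f'"{t}"' for t in tags)
    by = ", ".join(f'"{g}"' for g in groups)  # assumption: groups are quoted like tags
    return f'"{check}".over({over}).last({count}).by({by}).count_by_status()'


print(service_check_query("datadog.agent.up", ["env:prod", "role:db"], ["host"],
                          {"critical": 1, "ok": 3, "warning": 2}))
```

With the thresholds from the example in the text (1 critical, 3 ok, 2 warn), the helper picks a `count` of 3.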
Event Alert Query
Example: `events('sources:nagios status:error,warning priority:normal tags: "string query"').rollup("count").last("1h")`

`event`, the event query string:

- `string_query`: free text query to match against event title and text.
- `sources`: event sources (comma-separated).
- `status`: event statuses (comma-separated). Valid options: error, warn, and info.
- `priority`: event priorities (comma-separated). Valid options: low, normal, all.
- `host`: event reporting host (comma-separated).
- `tags`: event tags (comma-separated).
- `excluded_tags`: excluded event tags (comma-separated).
- `rollup`: the stats roll-up method. `count` is the only supported method now.
- `last`: the timeframe to roll up the counts. Examples: 45m, 4h. Supported timeframes: m, h and d. This value should not exceed 48 hours.
Event V2 Alert Query
Example: `events(query).rollup(rollup_method[, measure]).last(time_window) operator #`

- `query`: the search query, following the Log search syntax.
- `rollup_method`: the stats roll-up method. Supports `count`, `avg`, and `cardinality`.
- `measure`: for the `avg` and `cardinality` `rollup_method`, the measure or facet name you want to use.
- `time_window`: #m (between 1 and 2880), #h (between 1 and 48).
- `operator`: `<`, `<=`, `>`, `>=`, `==`, or `!=`.
- `#`: an integer or decimal number used to set the threshold.
Process Alert Query
Example: `processes(search).over(tags).rollup('count').last(timeframe) operator #`

- `search`: free text search string for querying processes. Matching processes match results on the Live Processes page.
- `tags`: one or more tags (comma-separated).
- `timeframe`: the timeframe to roll up the counts. Examples: 10m, 4h. Supported timeframes: s, m, h and d.
- `operator`: <, <=, >, >=, ==, or !=
- `#`: an integer or decimal number used to set the threshold.
Logs Alert Query
Example: `logs(query).index(index_name).rollup(rollup_method[, measure]).last(time_window) operator #`

- `query`: the search query, following the Log search syntax.
- `index_name`: for multi-index organizations, the log index in which the request is performed.
- `rollup_method`: the stats roll-up method. Supports `count`, `avg`, and `cardinality`.
- `measure`: for the `avg` and `cardinality` `rollup_method`, the measure or facet name you want to use.
- `time_window`: #m (between 1 and 2880), #h (between 1 and 48).
- `operator`: `<`, `<=`, `>`, `>=`, `==`, or `!=`.
- `#`: an integer or decimal number used to set the threshold.
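To illustrate how `rollup_method` and `measure` interact in the log alert template, here is a rough sketch. `log_alert_query` is a hypothetical helper, and wrapping each argument in double quotes is an assumption, because the template above shows only bare placeholders.

```python
ROLLUP_METHODS = {"count", "avg", "cardinality"}
OPERATORS = {"<", "<=", ">", ">=", "==", "!="}


def log_alert_query(search, index_name, rollup_method, time_window,
                    operator, threshold, measure=None):
    """Build logs(query).index(index_name).rollup(method[, measure]).last(window) op #."""
    if rollup_method not in ROLLUP_METHODS:
        raise ValueError(f"unsupported rollup_method: {rollup_method!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    if rollup_method == "count":
        rollup_args = '"count"'  # count takes no measure
    else:
        if measure is None:
            raise ValueError(f"{rollup_method} requires a measure or facet name")
        rollup_args = f'"{rollup_method}", "{measure}"'
    # Assumption: every argument is passed as a double-quoted string.
    return (f'logs("{search}").index("{index_name}")'
            f'.rollup({rollup_args}).last("{time_window}") {operator} {threshold}')


print(log_alert_query("status:error", "main", "count", "5m", ">", 10))
```

The search string, index name, and measure used in any call are placeholders. The same shape applies to the event-v2 template, minus the `index` step.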
Composite Query
Example: `12345 && 67890`, where `12345` and `67890` are the IDs of non-composite monitors.

- `name` [required, default = dynamic, based on query]: The name of the alert.
- `message` [required, default = dynamic, based on query]: A message to include with notifications for this monitor. Email notifications can be sent to specific users by using the same '@username' notation as events.
- `tags` [optional, default = empty list]: A list of tags to associate with your monitor. When getting all monitor details via the API, use the `monitor_tags` argument to filter results by these tags. It is only available via the API and isn't visible or editable in the Datadog UI.
SLO Alert Query
Example: `error_budget("slo_id").over("time_window") operator #`

- `slo_id`: The alphanumeric SLO ID of the SLO you are configuring the alert for.
- `time_window`: The time window of the SLO target you wish to alert on. Valid options: `7d`, `30d`, `90d`.
- `operator`: `>=` or `>`.
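The constraints on the SLO alert query (three valid windows, two valid operators) are easy to check before building the string. A minimal sketch, with `slo_alert_query` as a hypothetical helper and `abc123` as a made-up SLO ID:

```python
SLO_WINDOWS = {"7d", "30d", "90d"}
SLO_OPERATORS = {">=", ">"}


def slo_alert_query(slo_id, time_window, operator, threshold):
    """Build error_budget("slo_id").over("time_window") operator #."""
    if time_window not in SLO_WINDOWS:
        raise ValueError(f"time_window must be one of {sorted(SLO_WINDOWS)}")
    if operator not in SLO_OPERATORS:
        raise ValueError("operator must be '>=' or '>'")
    return f'error_budget("{slo_id}").over("{time_window}") {operator} {threshold}'


# "abc123" is a placeholder, not a real SLO ID.
print(slo_alert_query("abc123", "7d", ">", 75))
```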
External Documentation
To learn more, visit the Datadog documentation.
Basic Parameters
| Parameter | Description |
|---|---|
| Message | A message to include with notifications for this monitor. |
| Name | The monitor name. |
| Query | The monitor query. |
| Type | The type of the monitor. |
Advanced Parameters
| Parameter | Description |
|---|---|
| Enable Logs Sample | Whether or not to send a log sample when the log monitor triggers. |
| Priority | Integer from 1 (high) to 5 (low) indicating alert severity. |
| Restricted Roles | A list of role identifiers that can be pulled from the Roles API. Cannot be used with locked option. |
| Tags | Tags associated with your monitor. |
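The parameters in the two tables can be gathered into one structure before the action runs. This is a sketch under assumptions: `build_monitor_params` is a hypothetical helper, and the lowercase key names are guesses derived from the parameter labels, not confirmed field names.

```python
def build_monitor_params(name, monitor_type, query, message,
                         tags=None, priority=None):
    """Collect basic and advanced monitor parameters into a dict."""
    if priority is not None and not 1 <= priority <= 5:
        raise ValueError("priority must be an integer from 1 (high) to 5 (low)")
    # Assumption: keys are the lowercase forms of the table's parameter names.
    params = {
        "name": name,
        "type": monitor_type,
        "query": query,
        "message": message,
    }
    if tags:
        params["tags"] = list(tags)
    if priority is not None:
        params["priority"] = priority
    return params


params = build_monitor_params(
    "CPU high",
    "metric alert",
    "avg(last_5m):avg:system.cpu.user{*} > 90",
    "CPU usage is above 90%.",
    tags=["team:ops"],
    priority=2,
)
print(params)
```

Optional fields are left out of the dict when unset, so that defaults apply on the receiving side.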