
Create Monitor

Create a monitor using the specified options.

Monitor Types

The type of the monitor, chosen from the following (each entry lists a monitor category followed by its type value):

  • anomaly: query alert
  • APM: query alert or trace-analytics alert
  • composite: composite
  • custom: service check
  • event: event alert
  • forecast: query alert
  • host: service check
  • integration: query alert or service check
  • live process: process alert
  • logs: log alert
  • metric: metric alert
  • network: service check
  • outlier: query alert
  • process: service check
  • rum: rum alert
  • SLO: slo alert
  • watchdog: event alert
  • event-v2: event-v2 alert

Query Types

Metric Alert Query

Example: time_aggr(time_window):space_aggr:metric{tags} [by {key}] operator #

  • time_aggr: avg, sum, max, min, change, or pct_change
  • time_window: last_#m (with # between 1 and 2880, depending on the monitor type), last_#h (with # between 1 and 48, depending on the monitor type), or last_1d
  • space_aggr: avg, sum, min, or max
  • tags: one or more tags (comma-separated), or *
  • key: a 'key' in key:value tag syntax; defines a separate alert for each tag in the group (multi-alert)
  • operator: <, <=, >, >=, ==, or !=
  • #: an integer or decimal number used to set the threshold
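
The grammar above can be assembled and checked in code before the query is submitted. The following is a minimal sketch, not part of any Datadog client library: the function names `check_time_window` and `metric_alert_query` are hypothetical, and the window check enforces only the outer bounds listed above, since the exact limit depends on the monitor type.

```python
import re

TIME_AGGRS = {"avg", "sum", "max", "min", "change", "pct_change"}
SPACE_AGGRS = {"avg", "sum", "min", "max"}
OPERATORS = {"<", "<=", ">", ">=", "==", "!="}


def check_time_window(window):
    """Accept last_#m (1-2880), last_#h (1-48), or last_1d; raise otherwise."""
    match = re.fullmatch(r"last_(\d+)([mhd])", window)
    if not match:
        raise ValueError(f"malformed time window: {window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    upper = {"m": 2880, "h": 48, "d": 1}[unit]
    if not 1 <= amount <= upper:
        raise ValueError(f"time window out of range: {window!r}")
    return window


def metric_alert_query(time_aggr, time_window, space_aggr, metric,
                       tags, operator, threshold, by=None):
    """Build time_aggr(time_window):space_aggr:metric{tags} [by {key}] operator #."""
    if time_aggr not in TIME_AGGRS:
        raise ValueError(f"unsupported time_aggr: {time_aggr!r}")
    if space_aggr not in SPACE_AGGRS:
        raise ValueError(f"unsupported space_aggr: {space_aggr!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    scope = ",".join(tags) if tags else "*"          # "*" matches every tag
    group = f" by {{{','.join(by)}}}" if by else ""  # optional multi-alert grouping
    window = check_time_window(time_window)
    return f"{time_aggr}({window}):{space_aggr}:{metric}{{{scope}}}{group} {operator} {threshold}"


print(metric_alert_query("avg", "last_5m", "sum", "system.net.bytes_rcvd",
                         ["host:host0"], ">", 100))
# avg(last_5m):sum:system.net.bytes_rcvd{host:host0} > 100
```

Passing `by=["host"]` appends ` by {host}`, which turns the query into a multi-alert with one alert per host.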

If you are using the change or pct_change time aggregator, use change_aggr(time_aggr(time_window), timeshift):space_aggr:metric{tags} [by {key}] operator # instead, with:

  • change_aggr: change or pct_change
  • time_aggr: avg, sum, max, or min
  • time_window: last_#m (between 1 and 2880, depending on the monitor type), last_#h (between 1 and 48, depending on the monitor type), or last_#d (1 or 2)
  • timeshift: #m_ago (5, 10, 15, or 30), #h_ago (1, 2, or 4), or 1d_ago
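
As an illustration of the change form, the sketch below builds such a query and rejects timeshift values outside the listed set. The helper name `change_alert_query` is hypothetical, and the output follows the template above literally, including the space after the comma.

```python
CHANGE_AGGRS = {"change", "pct_change"}
TIME_AGGRS = {"avg", "sum", "max", "min"}
TIMESHIFTS = {"5m_ago", "10m_ago", "15m_ago", "30m_ago",
              "1h_ago", "2h_ago", "4h_ago", "1d_ago"}


def change_alert_query(change_aggr, time_aggr, time_window, timeshift,
                       space_aggr, metric, tags, operator, threshold):
    """Build change_aggr(time_aggr(time_window), timeshift):space_aggr:metric{tags} operator #."""
    if change_aggr not in CHANGE_AGGRS:
        raise ValueError(f"unsupported change_aggr: {change_aggr!r}")
    if time_aggr not in TIME_AGGRS:
        raise ValueError(f"unsupported time_aggr: {time_aggr!r}")
    if timeshift not in TIMESHIFTS:
        raise ValueError(f"unsupported timeshift: {timeshift!r}")
    scope = ",".join(tags) if tags else "*"
    return (f"{change_aggr}({time_aggr}({time_window}), {timeshift})"
            f":{space_aggr}:{metric}{{{scope}}} {operator} {threshold}")


print(change_alert_query("pct_change", "avg", "last_5m", "1h_ago",
                         "avg", "system.load.1", ["env:prod"], ">", 50))
# pct_change(avg(last_5m), 1h_ago):avg:system.load.1{env:prod} > 50
```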

To create an outlier monitor, use a query such as: avg(last_30m):outliers(avg:system.cpu.user{role:es-events-data} by {host}, 'dbscan', 7) > 0

Service Check Query

Example: "check".over(tags).last(count).by(group).count_by_status()

  • check: name of the check, e.g. datadog.agent.up
  • tags: one or more quoted tags (comma-separated), or "*", e.g. .over("env:prod", "role:db"); over cannot be blank.
  • count: must be greater than or equal to your max threshold (defined in the options). It is limited to 100. For example, if you've specified to notify on 1 critical, 3 ok, and 2 warn statuses, count should be at least 3.
  • group: must be specified for check monitors. Per-check grouping is already explicitly known for some service checks. For example, Postgres integration monitors are tagged by db, host, and port, and Network monitors by host, instance, and url. See the Service Checks documentation for more information.
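
The constraints on count and over can be enforced when the query string is built. This is a rough sketch with a hypothetical helper name, `service_check_query`. Quoting each group name the same way as the tags is an assumption, as the text above does not show the exact syntax for by.

```python
def service_check_query(check, tags, count, groups, thresholds):
    """Build "check".over(tags).last(count).by(group).count_by_status().

    thresholds maps a status name to the number of consecutive checks
    required, e.g. {"critical": 1, "ok": 3, "warning": 2}.
    """
    if not tags:
        raise ValueError('over cannot be blank; pass ["*"] to match everything')
    if not groups:
        raise ValueError("group must be specified for check monitors")
    needed = max(thresholds.values())
    if not needed <= count <= 100:
        raise ValueError(f"count must be between {needed} and 100, got {count}")
    over = ", ".join(f'"{tag}"' for tag in tags)
    by = ", ".join(f'"{group}"' for group in groups)  # assumption: quoted like tags
    return f'"{check}".over({over}).last({count}).by({by}).count_by_status()'


print(service_check_query("datadog.agent.up", ["env:prod", "role:db"], 3,
                          ["host"], {"critical": 1, "ok": 3, "warning": 2}))
# "datadog.agent.up".over("env:prod", "role:db").last(3).by("host").count_by_status()
```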

Event Alert Query

Example: events('sources:nagios status:error,warning priority:normal tags: "string query"').rollup("count").last("1h")

  • event: the event query string, which can combine the following fields:
  • string_query: free-text query to match against the event title and text.
  • sources: event sources (comma-separated).
  • status: event statuses (comma-separated). Valid options: error, warn, and info.
  • priority: event priorities (comma-separated). Valid options: low, normal, all.
  • host: event reporting host (comma-separated).
  • tags: event tags (comma-separated).
  • excluded_tags: excluded event tags (comma-separated).
  • rollup: the stats roll-up method. count is currently the only supported method.
  • last: the timeframe to roll up the counts. Examples: 45m, 4h. Supported timeframes: m, h, and d. This value should not exceed 48 hours.

NOTE: Only available on US1 and EU.

Event V2 Alert Query

Example: events(query).rollup(rollup_method[, measure]).last(time_window) operator #

  • query: The search query, following the Log search syntax.
  • rollup_method: The stats roll-up method. Supports count, avg, and cardinality.
  • measure: For the avg and cardinality rollup methods, the measure or facet name you want to use.
  • time_window: #m (between 1 and 2880) or #h (between 1 and 48).
  • operator: <, <=, >, >=, ==, or !=.
  • #: an integer or decimal number used to set the threshold.

NOTE: Only available on US1-FED, US3, and in closed beta on EU and US1.

Process Alert Query

Example: processes(search).over(tags).rollup('count').last(timeframe) operator #

  • search: free-text search string for querying processes. Matching processes correspond to the results shown on the Live Processes page.
  • tags: one or more tags (comma-separated)
  • timeframe: the timeframe to roll up the counts. Examples: 10m, 4h. Supported timeframes: s, m, h, and d
  • operator: <, <=, >, >=, ==, or !=
  • #: an integer or decimal number used to set the threshold
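
A process alert query can be put together along the same lines. In this sketch the helper name `process_alert_query` is hypothetical, and wrapping search, tags, and timeframe in single quotes is an assumption modelled on the rollup('count') call in the example above.

```python
import re

OPERATORS = {"<", "<=", ">", ">=", "==", "!="}


def process_alert_query(search, tags, timeframe, operator, threshold):
    """Build processes(search).over(tags).rollup('count').last(timeframe) operator #."""
    if not re.fullmatch(r"\d+[smhd]", timeframe):
        raise ValueError(f"timeframe must look like 10m or 4h, got {timeframe!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    over = ",".join(tags)
    # Assumption: arguments are single-quoted, like rollup('count').
    return (f"processes('{search}').over('{over}')"
            f".rollup('count').last('{timeframe}') {operator} {threshold}")


print(process_alert_query("java", ["env:prod"], "10m", ">", 5))
# processes('java').over('env:prod').rollup('count').last('10m') > 5
```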

Logs Alert Query

Example: logs(query).index(index_name).rollup(rollup_method[, measure]).last(time_window) operator #

  • query: The search query, following the Log search syntax.
  • index_name: For multi-index organizations, the log index in which the request is performed.
  • rollup_method: The stats roll-up method. Supports count, avg, and cardinality.
  • measure: For the avg and cardinality rollup methods, the measure or facet name you want to use.
  • time_window: #m (between 1 and 2880) or #h (between 1 and 48).
  • operator: <, <=, >, >=, ==, or !=.
  • #: an integer or decimal number used to set the threshold.
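
The rule that measure accompanies avg and cardinality, together with the time window limits, can be checked up front. The sketch below uses a hypothetical helper, `logs_alert_query`. Double-quoting each argument is an assumption, and the helper does not escape quotes inside the search query.

```python
import re

ROLLUPS = {"count", "avg", "cardinality"}
OPERATORS = {"<", "<=", ">", ">=", "==", "!="}


def logs_alert_query(query, index_name, rollup_method, time_window,
                     operator, threshold, measure=None):
    """Build logs(query).index(index_name).rollup(method[, measure]).last(window) operator #."""
    if rollup_method not in ROLLUPS:
        raise ValueError(f"unsupported rollup_method: {rollup_method!r}")
    if rollup_method != "count" and not measure:
        raise ValueError(f"{rollup_method} needs a measure or facet name")
    match = re.fullmatch(r"(\d+)([mh])", time_window)
    if not match:
        raise ValueError(f"malformed time window: {time_window!r}")
    amount, unit = int(match.group(1)), match.group(2)
    if not 1 <= amount <= {"m": 2880, "h": 48}[unit]:
        raise ValueError(f"time window out of range: {time_window!r}")
    if operator not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    rollup = f'"{rollup_method}"'
    if rollup_method != "count":
        rollup += f', "{measure}"'  # measure only applies to avg and cardinality
    return (f'logs("{query}").index("{index_name}")'
            f'.rollup({rollup}).last("{time_window}") {operator} {threshold}')


print(logs_alert_query("service:web status:error", "main", "count", "15m", ">", 100))
# logs("service:web status:error").index("main").rollup("count").last("15m") > 100
```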

Composite Query

Example: 12345 && 67890, where 12345 and 67890 are the IDs of non-composite monitors

  • name [required, default = dynamic, based on query]: The name of the alert.
  • message [required, default = dynamic, based on query]: A message to include with notifications for this monitor. Email notifications can be sent to specific users by using the same '@username' notation as events.
  • tags [optional, default = empty list]: A list of tags to associate with your monitor. When getting all monitor details via the API, use the monitor_tags argument to filter results by these tags. It is only available via the API and isn't visible or editable in the Datadog UI.
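
To show how the composite query and the name, message, and tags fields fit together, here is a minimal sketch. The helper name `composite_monitor` is hypothetical, and the dictionary keys mirror the field names in the Example Output further down rather than a documented request schema. Only the && operator shown above is used.

```python
def composite_monitor(monitor_ids, name, message, tags=None):
    """Combine the IDs of existing non-composite monitors into one definition."""
    if len(monitor_ids) < 2:
        raise ValueError("a composite query needs at least two monitor IDs")
    query = " && ".join(str(int(monitor_id)) for monitor_id in monitor_ids)
    return {
        "type": "composite",
        "query": query,
        "name": name,
        "message": message,        # may contain @username notifications
        "tags": list(tags or []),  # defaults to an empty list
    }


definition = composite_monitor(
    [12345, 67890],
    name="Disk full and high error rate",
    message="Both conditions are firing. @username",
    tags=["team:infra"],
)
print(definition["query"])
# 12345 && 67890
```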

SLO Alert Query

Example: error_budget("slo_id").over("time_window") operator #

  • slo_id: The alphanumeric SLO ID of the SLO you are configuring the alert for.
  • time_window: The time window of the SLO target you wish to alert on. Valid options: 7d, 30d, 90d.
  • operator: >= or >.
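
The restrictions on the time window and operator are easy to check before the query is sent. This sketch uses a hypothetical helper name, `slo_alert_query`, and a made-up SLO ID purely for illustration.

```python
SLO_WINDOWS = {"7d", "30d", "90d"}
SLO_OPERATORS = {">=", ">"}


def slo_alert_query(slo_id, time_window, operator, threshold):
    """Build error_budget("slo_id").over("time_window") operator #."""
    if not slo_id.isalnum():
        raise ValueError(f"SLO ID must be alphanumeric, got {slo_id!r}")
    if time_window not in SLO_WINDOWS:
        raise ValueError(f"time_window must be one of {sorted(SLO_WINDOWS)}")
    if operator not in SLO_OPERATORS:
        raise ValueError("operator must be >= or >")
    return f'error_budget("{slo_id}").over("{time_window}") {operator} {threshold}'


# "abc123" is a placeholder, not a real SLO ID.
print(slo_alert_query("abc123", "7d", ">", 75))
# error_budget("abc123").over("7d") > 75
```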

External Documentation

To learn more, visit the Datadog documentation.

Basic Parameters

Parameter    Description
Message      A message to include with notifications for this monitor.
Name         The monitor name.
Query        The monitor query.
Type         The type of the monitor.

Advanced Parameters

Parameter             Description
Enable Logs Sample    Whether or not to send a log sample when the log monitor triggers.
Priority              Integer from 1 (high) to 5 (low) indicating alert severity.
Restricted Roles      A list of role identifiers that can be pulled from the Roles API. Cannot be used with the locked option.
Tags                  Tags associated with your monitor.
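
The basic and advanced parameters can be gathered into a single definition with the two stated constraints applied: priority ranges from 1 to 5, and restricted roles cannot be combined with locked. This is a sketch only. The function name `monitor_definition` is hypothetical, and the key names are taken from the Example Output below, on the assumption that the request uses the same field names.

```python
def monitor_definition(name, monitor_type, query, message, *, priority=None,
                       tags=None, restricted_roles=None, locked=False,
                       enable_logs_sample=None):
    """Collect the basic and advanced parameters into one dictionary."""
    if priority is not None:
        if not isinstance(priority, int) or not 1 <= priority <= 5:
            raise ValueError("priority must be an integer from 1 (high) to 5 (low)")
    if restricted_roles and locked:
        raise ValueError("restricted_roles cannot be used with the locked option")

    body = {"name": name, "type": monitor_type, "query": query, "message": message}
    options = {}
    if locked:
        options["locked"] = True
    if enable_logs_sample is not None:
        options["enable_logs_sample"] = enable_logs_sample
    if options:
        body["options"] = options
    if priority is not None:
        body["priority"] = priority
    if tags:
        body["tags"] = list(tags)
    if restricted_roles:
        body["restricted_roles"] = list(restricted_roles)
    return body


definition = monitor_definition(
    "High inbound traffic",
    "metric alert",
    "avg(last_5m):sum:system.net.bytes_rcvd{host:host0} > 100",
    "Inbound traffic is above the threshold.",
    priority=3,
    tags=["team:network"],
)
print(sorted(definition))
# ['message', 'name', 'priority', 'query', 'tags', 'type']
```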

Example Output

{
  "created": "2000-01-23T04:56:07.000+00:00",
  "creator": {
    "email": "email",
    "handle": "handle",
    "name": "name"
  },
  "deleted": "2000-01-23T04:56:07.000+00:00",
  "id": 0,
  "message": "message",
  "modified": "2000-01-23T04:56:07.000+00:00",
  "multi": true,
  "name": "name",
  "options": {
    "aggregation": {
      "group_by": "host",
      "metric": "metrics.name",
      "type": "count"
    },
    "device_ids": [
      null,
      null
    ],
    "enable_logs_sample": true,
    "escalation_message": "none",
    "evaluation_delay": 6,
    "groupby_simple_monitor": true,
    "include_tags": true,
    "locked": true,
    "min_failure_duration": 1055,
    "min_location_failed": 5,
    "new_host_delay": 5,
    "no_data_timeframe": 2,
    "notify_audit": false,
    "notify_no_data": false,
    "renotify_interval": 7,
    "require_full_window": true,
    "silenced": {
      "key": 9
    },
    "synthetics_check_id": "synthetics_check_id",
    "threshold_windows": {
      "recovery_window": "recovery_window",
      "trigger_window": "trigger_window"
    },
    "thresholds": {
      "critical": 3.616076749251911,
      "critical_recovery": 2.027123023002322,
      "ok": 4.145608029883936,
      "unknown": 7.386281948385884,
      "warning": 1.2315135367772556,
      "warning_recovery": 1.0246457001441578
    },
    "timeout_h": 1
  },
  "priority": 3,
  "query": "avg(last_5m):sum:system.net.bytes_rcvd{host:host0} \u003e 100",
  "restricted_roles": [
    "restricted_roles",
    "restricted_roles"
  ],
  "state": {
    "groups": {
      "key": {
        "last_nodata_ts": 7,
        "last_notified_ts": 1,
        "last_resolved_ts": 4,
        "last_triggered_ts": 5,
        "name": "name"
      }
    }
  },
  "tags": [
    "tags",
    "tags"
  ],
  "type": "metric alert"
}
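
A downstream step will typically read only a few fields from this output. The sketch below parses a trimmed copy of the sample with Python's standard json module. The helper name `summarize_monitor` is hypothetical. Note that the \u003e escape in the query field is ordinary JSON encoding for the > character and is decoded automatically.

```python
import json

# Trimmed copy of the sample output above; the raw string keeps \u003e intact.
SAMPLE = r'''
{
  "id": 0,
  "name": "name",
  "type": "metric alert",
  "query": "avg(last_5m):sum:system.net.bytes_rcvd{host:host0} \u003e 100",
  "options": {"thresholds": {"critical": 3.616076749251911}},
  "state": {"groups": {"key": {"last_triggered_ts": 5, "name": "name"}}}
}
'''


def summarize_monitor(raw):
    """Extract the ID, type, decoded query, critical threshold, and group names."""
    monitor = json.loads(raw)
    thresholds = monitor.get("options", {}).get("thresholds", {})
    groups = monitor.get("state", {}).get("groups", {})
    return {
        "id": monitor["id"],
        "type": monitor["type"],
        "query": monitor["query"],
        "critical": thresholds.get("critical"),
        "groups": sorted(groups),
    }


summary = summarize_monitor(SAMPLE)
print(summary["query"])
# avg(last_5m):sum:system.net.bytes_rcvd{host:host0} > 100
```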

Workflow Library Example

Create Datadog Monitor for Kubernetes Namespace
