Alerts are actionable pieces of data that correspond to some problem and are generated either by a machine or by a human. They can be created via the API, by one of the integrations, through the CLI, or through the web application. Examples of alerts include:
- a full hard drive or an unreachable host, as reported by monitoring software
- a broken URL reported by an uptime check service
- a new ticket created in your bug tracking software, or
- a new support ticket filed by a customer
Each alert is associated with a service.
An alert is always in one of three states:
- firing: the alert is active and hasn't been acknowledged yet by its assignee
- acknowledged: the on-call assignee is aware of and is working on the issue
- resolved: the problem has been fixed or has gone away (an alert can be resolved by a machine or a human)
Several actions can be taken on an alert during its lifecycle. As soon as you become aware of the alert, you can acknowledge it, and then take one of the following actions:
- escalate: the schedule defined by the escalation policy for that service will be contacted
- assign: assign to another user who knows how to fix the issue
- resolve: once the issue has been fixed or goes away, marking it as resolved prevents any additional notifications from being sent
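The states and actions above can be sketched as a small state model. This is only an illustration and not the product's implementation: the `Alert` class, its fields, and its method names are hypothetical, and the rules that a resolved alert cannot be acknowledged again and that reassigning leaves the state unchanged are assumptions. Escalation is left out because it depends on the service's escalation policy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AlertState(Enum):
    FIRING = "firing"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


@dataclass
class Alert:
    """Hypothetical in-memory model of an alert (not a real API object)."""
    summary: str
    service: str  # each alert is associated with a service
    assignee: Optional[str] = None
    state: AlertState = AlertState.FIRING
    history: List[str] = field(default_factory=list)

    def acknowledge(self, user: str) -> None:
        # Assumption: a resolved alert cannot be acknowledged again.
        if self.state is AlertState.RESOLVED:
            raise ValueError("cannot acknowledge a resolved alert")
        self.assignee = user
        self.state = AlertState.ACKNOWLEDGED
        self.history.append(f"acknowledged by {user}")

    def assign(self, user: str) -> None:
        # Assumption: reassigning changes the assignee but not the state.
        if self.state is AlertState.RESOLVED:
            raise ValueError("cannot assign a resolved alert")
        self.assignee = user
        self.history.append(f"assigned to {user}")

    def resolve(self) -> None:
        # Resolving ends the lifecycle; no further notifications are sent.
        self.state = AlertState.RESOLVED
        self.history.append("resolved")
```

For example, an alert created as `Alert("disk full", "storage")` starts in the firing state, moves to acknowledged after `acknowledge("alice")`, and ends in resolved after `resolve()`.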
Each alert is created with a severity level. Only alerts with the critical severity notify the on-call user. If no severity level is specified, the alert defaults to critical.
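The severity rule can be expressed in a few lines. In this sketch only `critical` comes from the text above; the other level names (`warning`, `info`) and both helper functions are hypothetical placeholders, so check the API documentation for the actual values.

```python
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"  # hypothetical level name
    INFO = "info"        # hypothetical level name


def parse_severity(value: Optional[str]) -> Severity:
    """Return the severity for an incoming alert, defaulting to critical."""
    if value is None:
        return Severity.CRITICAL
    return Severity(value.lower())


def notifies_on_call(severity: Severity) -> bool:
    """Only critical alerts notify the on-call user."""
    return severity is Severity.CRITICAL
```

An alert submitted without a severity is therefore treated as critical and notifies the on-call user, while an alert with any lower severity is recorded without a notification.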
Read about how alerts are created through the API.