At the most basic level, the Zabbix monitoring system performs three tasks:
- Periodically poll hosts and APIs – this is called metric collection
- Based on the value of metrics, produce events – these are called triggers
- Based on triggers, perform actions such as notification, control, or configuration
The main monitoring object in Zabbix is a host. Usually a host in Zabbix corresponds to a host on the network, but Zabbix does not restrict how many times the same network host, IP address, combination of IP addresses, or port can appear. It is not recommended, but one host on the network can be split into several Zabbix hosts to divide the items being monitored. A host can have multiple IP addresses and TCP or UDP port numbers. Zabbix requires a port number to connect to the host, but for protocols that do not use ports, such as ICMP, this parameter is ignored.
Each host has its own set of items, each of which describes one value of the host. An item can be based on anything that can be monitored, such as the amount of installed memory, free disk space, network interface speed, the amount of traffic received, the number of connected users, or anything else that the host can tell Zabbix about itself.
- One item is responsible for one parameter
- Each item has a name that is unique within the host; this is called the “key expression”
- Each item has a type – an integer, a text string, etc.
- Each item has a history of its past values
- Each value in the history is stored together with the server time at which it was logged
- There is no current value of an item, only the last value of the item
- The history of an item is accessed through functions
- The description and configuration of an item are typically static, but items can also be created automatically
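The properties listed above can be sketched as a small data structure. This is only an illustrative model, not Zabbix's actual schema: the `Item` class and its `record`, `last`, and `avg` methods are hypothetical names, chosen to show that an item keeps timestamped history and that its values are read through functions.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical model of one monitored parameter (not Zabbix's real schema)."""
    key: str          # unique within its host
    value_type: type  # e.g. int, float, str
    history: List[Tuple[float, object]] = field(default_factory=list)

    def record(self, server_time: float, value) -> None:
        # Store the value together with the server time at which it was logged.
        self.history.append((server_time, self.value_type(value)))

    def last(self) -> Optional[object]:
        # There is no "current" value, only the most recent history entry.
        return self.history[-1][1] if self.history else None

    def avg(self, count: int) -> float:
        # Average over the most recent `count` values.
        return mean(v for _, v in self.history[-count:])


free_disk = Item(key="vfs.fs.size[/,free]", value_type=int)
for server_time, value in [(1000.0, 500), (1060.0, 400), (1120.0, 300)]:
    free_disk.record(server_time, value)

print(free_disk.last())  # most recent value
print(free_disk.avg(2))  # average of the two most recent values
```

Reading values only through functions such as `last` keeps the model consistent with the idea that an item has a history rather than a live value.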
Each host has its own set of triggers that are linked to the host’s items. A trigger is a calculation over the history of item values, written as a formula in Zabbix’s internal expression language. The default value of a trigger is “false”; when the associated calculation is true, the trigger value becomes “true”. A trigger with a “true” value usually signifies an accident, a communication failure, or an abnormal situation. A host with a large number of items may have anywhere from a few triggers to several tens of them.
When a trigger changes status, a trigger event is generated. The event contains the details of the state change: when it happened and what the new state is. There are two types of trigger events: a “problem event” is generated when the trigger becomes “true”, and an “OK event” is generated when the trigger returns to “false”.
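The relationship between trigger state and trigger events might be sketched as follows. The `Trigger` class, its `evaluate` method, and the lambda standing in for a trigger formula are all assumptions made for illustration; real Zabbix triggers are written in its own expression language, not Python.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Trigger:
    """Hypothetical trigger: a boolean calculation over an item's history."""
    name: str
    expression: Callable[[List[float]], bool]
    state: bool = False  # triggers start out "false"
    events: List[Tuple[float, str]] = field(default_factory=list)

    def evaluate(self, server_time: float, history: List[float]) -> None:
        new_state = self.expression(history)
        # An event is generated only when the state changes, not on every check.
        if new_state != self.state:
            self.state = new_state
            self.events.append((server_time, "PROBLEM" if new_state else "OK"))


low_disk = Trigger("Low free disk space", lambda h: h[-1] < 100)
history: List[float] = []
for server_time, value in [(0, 500), (60, 90), (120, 80), (180, 400)]:
    history.append(value)
    low_disk.evaluate(server_time, history)

print(low_disk.events)
```

In this run the value drops below the threshold at time 60 and recovers at time 180, so exactly two events are recorded even though the trigger was evaluated four times.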
A trigger event may cause an action to take place. Actions can be configured in various ways to notify people or external programs that a trigger has fired. For example, an action can create a ticket in an external ticketing application.
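As a rough sketch of this idea, an action can be thought of as a list of operations that run when an event of a given type arrives. Everything here is hypothetical: `open_ticket` and `notify_on_call` only append to in-memory lists that stand in for a ticketing application and a messaging channel.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]

tickets: List[str] = []        # stand-in for an external ticketing application
notifications: List[str] = []  # stand-in for e-mail or chat messages


def open_ticket(event: Event) -> None:
    tickets.append(f"{event['host']}: {event['trigger']}")


def notify_on_call(event: Event) -> None:
    notifications.append(f"[{event['type']}] {event['trigger']} on {event['host']}")


# Which operations run for which type of trigger event (illustrative only).
ACTIONS: Dict[str, List[Callable[[Event], None]]] = {
    "PROBLEM": [open_ticket, notify_on_call],
    "OK": [notify_on_call],
}


def handle(event: Event) -> None:
    for operation in ACTIONS.get(event["type"], []):
        operation(event)


handle({"type": "PROBLEM", "host": "switch-01", "trigger": "Port down"})
handle({"type": "OK", "host": "switch-01", "trigger": "Port down"})
```

Here the problem event opens one ticket and sends one notification, while the OK event only sends a notification.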
Hosts can have macros associated with them, which allow custom values to be used within Zabbix. Macros are similar to command-line parameters: they allow generic programs and scripts to be written once and used with many different hosts. An example of a macro is a password used to log in to a host.
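The following sketch shows how one generic command line could be reused across hosts by substituting per-host macro values. Zabbix writes user macros as `{$NAME}`; the `expand` function, the `check_login` command, and the macro names and values are hypothetical and exist only for this example.

```python
import re
from typing import Dict

# Matches user-macro references of the form {$NAME}.
MACRO_PATTERN = re.compile(r"\{\$([A-Z0-9_.]+)\}")


def expand(text: str, macros: Dict[str, str]) -> str:
    """Replace each {$NAME} in `text` with the host's value for NAME."""
    def replace(match):
        name = match.group(1)
        # Leave unknown macros untouched rather than guessing a value.
        return macros.get(name, match.group(0))
    return MACRO_PATTERN.sub(replace, text)


# One generic command, written once...
command = "check_login --user {$LOGIN_USER} --password {$LOGIN_PASSWORD}"

# ...and per-host values, much like command-line parameters.
host_a = {"LOGIN_USER": "monitor", "LOGIN_PASSWORD": "s3cret-a"}
host_b = {"LOGIN_USER": "zbx", "LOGIN_PASSWORD": "s3cret-b"}

print(expand(command, host_a))
print(expand(command, host_b))
```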
It is normal for a host to be associated with potentially thousands of items, triggers, actions, and macros. With templates, these can easily be applied to, removed from, and modified on an unlimited number of hosts. Zabbix’s specialty is collecting and retaining the values of items applied to hosts, even when there are tens of thousands of hosts, each with thousands of items, so templates are required to update hosts in bulk. Manually updating hosts without templates would not be feasible in most environments. Configuring a template is like configuring a single host. All parameters applied to the template, including but not limited to items, triggers, actions, macros, and discovery rules, are applied to all hosts associated with that template.
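Why a template makes mass updates practical can be illustrated with a minimal model in which hosts look up their items through the templates they are linked to. The `Template` and `Host` classes and the `effective_items` method are hypothetical and greatly simplified compared with how Zabbix stores configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Template:
    name: str
    items: Dict[str, str] = field(default_factory=dict)  # item key -> description


@dataclass
class Host:
    name: str
    templates: List[Template] = field(default_factory=list)

    def effective_items(self) -> Dict[str, str]:
        # Items are resolved through the linked templates, so a change to a
        # template is visible on every linked host without editing the host.
        merged: Dict[str, str] = {}
        for template in self.templates:
            merged.update(template.items)
        return merged


linux = Template("Linux base", {"system.uptime": "Uptime"})
hosts = [Host(f"web-{n:02d}", [linux]) for n in range(1, 4)]

# One edit to the template reaches all three hosts.
linux.items["vm.memory.size[total]"] = "Installed memory"

for host in hosts:
    print(host.name, sorted(host.effective_items()))
```

The same single edit would apply whether the template is linked to three hosts or to thirty thousand.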
Discovery rules simplify the work of configuring items that are repeated on hosts. An example is a switch on which every port needs 10 items and several triggers. It would be a lot of work to set this up manually for every port on every switch, but with discovery rules it is automated. Discovery rules work like a template for a template: they describe how items should be created for each entity that is found.
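The "template for a template" idea can be sketched as a set of item prototypes that are expanded once per discovered port. Zabbix writes discovery macros as `{#NAME}`; the `ITEM_PROTOTYPES` table, the `apply_discovery` function, and the port names are assumptions made for this example, and only two prototypes are shown instead of ten.

```python
from typing import Dict, List

# Item prototypes: key pattern -> description, with a placeholder per port.
ITEM_PROTOTYPES: Dict[str, str] = {
    "net.if.in[{#IFNAME}]": "Incoming traffic on {#IFNAME}",
    "net.if.out[{#IFNAME}]": "Outgoing traffic on {#IFNAME}",
}


def apply_discovery(discovered: List[Dict[str, str]]) -> Dict[str, str]:
    """Create one concrete item per prototype for every discovered entity."""
    items: Dict[str, str] = {}
    for entity in discovered:
        for key, description in ITEM_PROTOTYPES.items():
            for placeholder, value in entity.items():
                key = key.replace(placeholder, value)
                description = description.replace(placeholder, value)
            items[key] = description
    return items


# Pretend the discovery rule found three ports on a switch.
ports = [{"{#IFNAME}": f"Gi0/{n}"} for n in range(1, 4)]
items = apply_discovery(ports)

for key, description in items.items():
    print(key, "->", description)
```

With three ports and two prototypes, six concrete items are produced without configuring any of them by hand.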
Hosts are organized into groups. There is no limit to the number of groups a host can belong to or to the number of hosts within a group. It is best practice to associate every host with several groups in order to make that host easier to find.
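A small sketch shows why membership in several groups makes hosts easier to find: each extra group is another way to narrow a search. The host names, group names, and the `find` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set

# Many-to-many: each host belongs to several groups.
host_groups: Dict[str, Set[str]] = {
    "web-01": {"Linux servers", "Web", "Datacenter A"},
    "web-02": {"Linux servers", "Web", "Datacenter B"},
    "db-01": {"Linux servers", "Databases", "Datacenter A"},
}

# Invert the mapping so hosts can be looked up by group.
group_hosts: Dict[str, Set[str]] = defaultdict(set)
for host, groups in host_groups.items():
    for group in groups:
        group_hosts[group].add(host)


def find(*groups: str) -> Set[str]:
    """Return the hosts that belong to every one of the given groups."""
    if not groups:
        return set()
    return set.intersection(*(group_hosts[g] for g in groups))


print(find("Web", "Datacenter A"))
print(find("Linux servers"))
```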