Check Receivers ---> Executive ---> Gateways
Check Receivers read check data generated by external programs. Flapjack ships with a Nagios check receiver, and connectors to other sources of check data could also be written. Checks define the entity they refer to as being in one of three states (OK, WARNING, CRITICAL), are timestamped, and can optionally include further data about the reason for the current entity state. The check receivers add the checks to an events queue.
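As a rough illustration of what a check receiver does, the sketch below builds a timestamped event and places it on a queue. The field names (`entity`, `check`, `state`, `time`, `summary`), the Nagios-style state names, and the in-memory `deque` standing in for the events queue are all assumptions made for this example, not Flapjack's actual event schema.

```python
import json
import time
from collections import deque

# Assumed Nagios-style state names, used only for this illustration.
VALID_STATES = {"OK", "WARNING", "CRITICAL"}


def build_event(entity, check, state, summary=None, timestamp=None):
    """Return a check event as a JSON string (field names are hypothetical)."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state!r}")
    event = {
        "entity": entity,
        "check": check,
        "state": state,
        # Every event is timestamped; default to the current time.
        "time": int(timestamp if timestamp is not None else time.time()),
    }
    if summary is not None:
        # Optional further data about the reason for the current state.
        event["summary"] = summary
    return json.dumps(event)


# In-memory stand-in for the events queue.
events_queue = deque()
events_queue.appendleft(
    build_event("app-01", "HTTP", "CRITICAL",
                summary="Connection refused", timestamp=1700000000)
)
```

A receiver for another monitoring system would only need to map that system's output onto the same event shape before enqueueing it.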
The Flapjack Executive reads events from the events queue and determines which contacts should be notified for them. It runs filters to prevent notifications being fired too often for the same check.
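One way to picture a filter that stops notifications firing too often for the same check is a minimum-interval gate. This is a minimal sketch under that assumption; the class name `RepeatFilter` and its interface are hypothetical and do not describe Flapjack's own filter implementations.

```python
class RepeatFilter:
    """Block a notification if one was sent for the same check too recently."""

    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self.last_sent = {}  # (entity, check) -> time of last allowed notification

    def allow(self, entity, check, now):
        """Return True if a notification may be sent at time `now` (seconds)."""
        key = (entity, check)
        last = self.last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # too soon after the previous notification
        self.last_sent[key] = now
        return True


repeat_filter = RepeatFilter(min_interval_seconds=300)
first = repeat_filter.allow("app-01", "HTTP", now=0)     # allowed
second = repeat_filter.allow("app-01", "HTTP", now=100)  # blocked, within 300s
third = repeat_filter.allow("app-01", "HTTP", now=300)   # allowed again
```

Passing `now` in explicitly, rather than reading the clock inside the filter, keeps the sketch deterministic and easy to test.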
Flapjack Gateways serve as the public interface to Flapjack. There are unidirectional notifiers (e.g. sms), which read notifications from a queue and send them out to registered contacts; bidirectional notifiers (e.g. pagerduty), which do the same and also offer a back-channel for acknowledgements etc.; and api gateways for retrieving reporting data, creating maintenance periods, acknowledging checks, etc.
In the beginning, there was Executive. This has been split into two separate components, processor and notifier.
Flapjack processor processes events from the events list in redis. It does a blocking read on redis, so new events are picked off the events list and processed as soon as they are created.
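The effect of a blocking read can be sketched without a redis server by using Python's `queue.Queue` as a stand-in for the events list: `get()` blocks until an item arrives, much as a blocking list pop such as redis `BLPOP` or `BRPOP` would. Which redis command Flapjack actually uses is not stated here, and the `None` shutdown sentinel is purely an assumption of this example.

```python
import queue
import threading

events = queue.Queue()  # stand-in for the events list in redis
processed = []


def processor_loop():
    """Consume events as soon as they appear, sleeping while the queue is empty."""
    while True:
        event = events.get()  # blocks until an event is available
        if event is None:     # hypothetical shutdown sentinel
            break
        processed.append(event)


worker = threading.Thread(target=processor_loop)
worker.start()

for name in ("event-1", "event-2"):
    events.put(name)
events.put(None)  # ask the loop to stop
worker.join()
```

Because the consumer blocks rather than polls, no CPU is spent while the queue is empty and there is no polling delay before a new event is handled.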
When the processor decides somebody ought to be notified (for a problem, recovery, acknowledgement, or what-have-you), it generates a job on the notifications queue.
Notifier picks up jobs from the notifications queue, looks up contact information, applies notification rules, per-media intervals, and rollup logic, and then creates notification jobs for the appropriate notification gateways.
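The notifier's steps can be sketched roughly as follows. The data shapes, the function name `plan_notifications`, and the simplification of "notification rules" to a set of subscribed checks are all assumptions for illustration, and rollup logic is left out entirely.

```python
def plan_notifications(job, contacts, last_sent, now):
    """Return one gateway job per contact medium that is due a notification."""
    gateway_jobs = []
    for contact in contacts:
        # Simplified "notification rule": the contact must subscribe to the check.
        if job["check"] not in contact["checks"]:
            continue
        for media, cfg in contact["media"].items():
            key = (contact["id"], media, job["check"])
            last = last_sent.get(key)
            # Per-media interval: skip if this medium was notified too recently.
            if last is not None and now - last < cfg["interval"]:
                continue
            last_sent[key] = now
            gateway_jobs.append({
                "gateway": media,
                "address": cfg["address"],
                "check": job["check"],
                "state": job["state"],
            })
    return gateway_jobs


contacts = [{
    "id": "c1",
    "checks": {"app-01:HTTP"},
    "media": {
        "sms": {"address": "+61400000000", "interval": 300},
        "email": {"address": "ops@example.com", "interval": 900},
    },
}]
job = {"check": "app-01:HTTP", "state": "critical"}
last_sent = {}

first_round = plan_notifications(job, contacts, last_sent, now=0)
second_round = plan_notifications(job, contacts, last_sent, now=600)
```

In this example the first call yields jobs for both media, while the second call at 600 seconds yields only the sms job, because the email interval of 900 seconds has not yet elapsed.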
flapjack=> starts multiple components (‘pikelets’) within the one ruby process as specified in the configuration file.
flapjack receiver nagios=> reads nagios check output on standard input and places it on the events queue in redis as JSON blobs.
flapjack receiver nsca=> reads the nagios command file output and places it on the events queue in redis as JSON blobs.
flapjack flapper=> runs a daemon that intermittently listens on port 12345 (one minute on, one minute off, …), to be used for generating heartbeat events for end-to-end monitoring of flapjack.
flapjack simulate fail=> simulates a failed check by creating a stream of events for flapjack to process
processor=> processes monitoring events off the events queue (a redis list) and decides what actions to take (generate notification event, record state changes, etc)
notifier=> processes notification events off the notifications queue (a redis list) and works out who to notify, on which media, and with what kind of notification message. It then creates jobs for the various notification gateways (below).
jabber=> connects to an XMPP (jabber) server, sends notifications (to rooms and individuals), handles acknowledgements from jabber users and other commands (xmpp4r)
pagerduty=> sends notifications to and accepts acknowledgements from PagerDuty (NB: contacts will need to have a registered PagerDuty account to use this)
sms_aspsms=> generates sms notifications through aspsms.com
sms_messagenet=> generates sms notifications through MessageNet
sms_nexmo=> generates sms notifications through Nexmo
sms_twilio=> generates sms notifications through Twilio
web=> browsable web interface (sinatra, puma)
jsonapi=> HTTP API server (sinatra, puma)
oobetet=> “out-of-band” end-to-end testing, used for monitoring other instances of flapjack to ensure that they are running correctly
Pikelets are flapjack components which can be run within the same ruby process, or as separate processes.
The simplest configuration will have one flapjack process running processor, notifier, web, and some notification gateways. The same process will also run the receiver, which receives events from Nagios and places them on the events queue for processing.
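The idea of one process hosting whichever pikelets the configuration enables can be sketched as below. The use of threads, the `enabled` flag, and the names `start_pikelets` and `registry` are assumptions of this example; they are not taken from Flapjack's configuration format or its Ruby implementation.

```python
import threading


def start_pikelets(config, registry):
    """Start one thread per enabled pikelet and return the threads by name."""
    threads = {}
    for name, settings in config.items():
        if not settings.get("enabled"):
            continue  # pikelets disabled in the config are not started
        thread = threading.Thread(target=registry[name], name=name, args=(settings,))
        thread.start()
        threads[name] = thread
    return threads


started = []
started_lock = threading.Lock()


def make_pikelet(name):
    """Build a trivial pikelet that only records that it ran."""
    def run(settings):
        with started_lock:
            started.append(name)
    return run


config = {
    "processor": {"enabled": True},
    "notifier": {"enabled": True},
    "web": {"enabled": False},
}
registry = {name: make_pikelet(name) for name in config}

threads = start_pikelets(config, registry)
for thread in threads.values():
    thread.join()
```

Running a pikelet as a separate process would then amount to starting another process whose configuration enables only that pikelet.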
We use the following redis database numbers by convention: