Alerting and monitoring system

Existing systems

Requirements

  • Must:
    • alert the Slack team within 5 minutes when key infrastructure goes offline
  • Should:
    • be easy to update for new equipment
    • be easy to configure to notify new volunteers
    • be easy to deploy
    • be reliable
    • be configurable through a version-controlled config to enable easy updates
    • be editable by multiple volunteers
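
To make the requirements above more concrete, here is a minimal sketch of how a config-driven offline check might look. It is only an illustration, not a chosen design: the config layout, the device names, the channel name, the 3-minute threshold (picked to leave headroom inside the 5-minute requirement), and the functions devices_to_alert and build_alert are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical config. In practice this could live in a version-controlled
# file so several volunteers can add equipment or change who gets notified.
CONFIG = {
    "slack_channel": "#infra-alerts",        # assumed channel name
    "offline_after": timedelta(minutes=3),   # assumed; below the 5-minute limit
    "devices": ["router-1", "switch-1"],     # assumed equipment names
}


def devices_to_alert(last_seen, now, config=CONFIG):
    """Return configured devices not seen within the offline threshold.

    last_seen maps a device name to the time of its last successful check.
    A device with no entry at all is treated as offline.
    """
    offline = []
    for name in config["devices"]:
        seen = last_seen.get(name)
        if seen is None or now - seen > config["offline_after"]:
            offline.append(name)
    return offline


def build_alert(offline, config=CONFIG):
    """Build a simple message dict for the offline devices, or None if all are up."""
    if not offline:
        return None
    return {
        "channel": config["slack_channel"],
        "text": "Offline: " + ", ".join(offline),
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_seen = {
        "router-1": now - timedelta(minutes=1),
        "switch-1": now - timedelta(minutes=4),
    }
    print(build_alert(devices_to_alert(last_seen, now)))
```

Adding new equipment or a new notification target would then be a one-line change to the config, which is the property the "Should" items ask for. How the message is actually delivered to Slack is left open here.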

Questions

  • frequency? ~1 point/hour

Proposed software

Log