Services & Software

UISP, Unifi, Zabbix, VPN, and the other services hosted on or for the Mesh

Alerting and monitoring system
Bookstack wiki software

Adding attachments
Bookstack wiki tips
How to create pages

Connection Troubleshooting
Grafana Monitoring
OSPF Data API
Overview
Phone System and Configuration

Call Termination: Google Voice and Callcentric
Callcentric Inbound/Call Treatments
Callcentric IVR Setup

Security (outdated)
Slack Support Follow Up Bot
Software services list
Wiki (Bookstack)
Zabbix
Website

Website Update Ideas
Media Ideas

MeshDB

MeshDB Schema Design
How to onboard applications to MeshDB

Alerting and monitoring system

Out of date

Existing systems

#monitoring-unms/UISP
Grafana/Prometheus
- public, set up 4 years ago: https://stats.nycmesh.net
- Mesh only, Omni's etc: http://10.70.90.82:3000/dashboards
support report generator

Zabbix

IP: http://10.70.73.58/
Details: Runs on Quincy's server, connected to Beta Slack

Requirements

Must:
- alert Slack team within 5 minutes when key infrastructure goes offline
Should:
- be easy to update for new equipment
- be easy to configure to notify new volunteers
- be easy to deploy
- be reliable
- be configurable through a version-controlled config to enable easy updates
- be editable by multiple volunteers

Questions

Major
- What key metrics should we alert based on?
Minor
- frequency? ~1 point/hour

Proposed software

Zabbix
Nagios
Grafana
[add your suggestion here]

Next Steps

Log

prompted by this Slack discussion on Grand St. outage
added Zabbix server during Hack night

Bookstack wiki software

Adding attachments

Only logged in Editors can add attachments.

Click the paper clip icon on the right menu in edit mode:

Click "Upload File":

Once the file is uploaded click the link button to add a link to the current page:

Bookstack wiki tips

Enable keyboard shortcuts (ex. press "e" to edit page):
- click profile dropdown on upper right
- click shortcuts
- click checkbox to enable shortcuts

How to create pages

To create a page:

Navigate to the "Book" or "Chapter" where your page will be located.
From the right hand menu select "New Page".
Edit the page and select "Save Page" from the right hand menu.

Connection Troubleshooting

Restart Network If Ping Google.Com && 8.8.8.8 Fails 4 Times

Wireless networks have a bit of a reputation for instability. Modern hardware has fixed most hardware problems, but there is work that needs to be done to make the firmware reliable. You can do this with "watchdog" scripts. I haven't had to reboot a router that is running our watchdog script.

Our firmware image (based on qMp) comes with a "bmx6health" script that checks whether the mesh software is running correctly and restarts it if necessary. This script by default runs once per day. I've found it better to run this every 5 minutes. You can do this by editing the crontab-

ssh into the router and in the terminal-

crontab -e

This opens a vi editor and you can change or add different scripts to run at different times. (The vi commands you need are "i" to insert, "esc" to stop editing, and ":x" to save and eXit.)

For some nodes, their main purpose is to be an internet gateway. To ensure that they always try to be online, you can add a watchdog script that pings a known website and calls "network restart" if it fails. These kinds of scripts often ping 8.8.8.8, which is Google's DNS server.

I've discovered three ways to recover a qMp mesh router that has functioning wifi but has lost internet: a network restart, a bmx6 restart, and restarting dnsmasq (killall dnsmasq; /etc/init.d/dnsmasq start). Sometimes the DNS forwarder, dnsmasq, stops working correctly, letting you ping some things and not others. dnsmasq will then forward bad DNS info to the other routers too, so it needs to be fixed quickly! Restarting dnsmasq as above will fix it.

gwck is a qMp utility that is restarted after network restart.

Another problem I've had occasionally is that the wifi will lose connections. Even though the radio is on and the router lights are normal you can't connect. I've written a simple script to restart wifi if both the ad-hoc and access point interfaces have no connections. It is a bit of a hack since the interface may be ok, but since nothing is connected via wifi it doesn't hurt too much to restart it. I've also found that a network restart is necessary to make the wifi stable.

By default wlan0 is the ad-hoc interface that is used to mesh the routers and wlan0ap is the access point. This script checks to see the number of wireless interfaces so it works with dual-band routers and routers that are only ad-hoc or ap.

I'm using "Signal: unknown" to show there is no connection. It seems to work reliably. You could also try iwinfo wlan0 assoclist.

"sleep 5" is usual between "wifi down" and "wifi up". I've found it not necessary when there are no connections, but I'll leave it there in case.

You can download the watchdog here

in the terminal-

vi /root/mesh-watchdog.sh

and paste this:

#!/bin/sh
# mesh-watchdog v1.1.1, NYC Mesh, Brian Hall

restartWifi()
{
  wifi down
  sleep 5
  wifi up
}

restartNetwork()
{
  /etc/init.d/network restart
  if /etc/init.d/gwck enabled; then
    /etc/init.d/gwck restart
  fi
  /etc/init.d/bmx6 restart
  sleep 4
  killall dnsmasq
  /etc/init.d/dnsmasq start
}

#gets date-time from log and exit if recently run. date-time is first two words of last line
exitIfRecentRestart()
{
if [ -e $LOG ]; then
  set -- `tail -1 $LOG`
  LASTRUN=`date --date="$1 $2" +%s`
  if [ "$?" = "0" ]; then
    #don't run for 1200s (20 minutes)
    NEXTRUN=$(($LASTRUN + 1200))
    NOW=`date +%s`
    es=$(($NOW - $LASTRUN))
    printf "time since last restartNetwork: "
    printf '%dd %dh:%dm:%ds\n' $(($es/86400)) $(($es%86400/3600)) $(($es%3600/60)) $(($es%60))
    if [ $NOW -lt $NEXTRUN ]; then
      echo "waiting $(($NEXTRUN - $NOW)) seconds, use option -f to force"
      exit 1
    else
      echo "run tests-"
    fi
  else
    echo "invalid date from log, run tests-"
  fi
else
 echo "no log, run tests-"
fi
}

LOG="/tmp/log/mesh-watchdog.log"
FORCE=0

if [ "$1" = "-n" ]; then
  echo "restartNetwork"
  restartNetwork
  exit 1
elif [ "$1" = "-f" ]; then
  echo "force tests-"
  FORCE=1  
elif [ "$1" = "-w" ]; then
  echo "restartWifi"
  restartWifi
  exit 1
elif [ "$1" = "-b" ]; then
  echo "restart wifi, wait, restart network"
  # sleep (not the shell builtin "wait") gives wifi 60 seconds to settle
  restartWifi; sleep 60; restartNetwork
  exit 1
elif [ "$1" != "" ]; then
  echo -e "Usage: `basename $0` [OPTION]\n\nTests wifi and internet connections and restarts if necessary (default)\n\n\t-f\tforce test\n\t-n\trestart network\n\t-w\trestart wifi\n\t-b\trestart both wifi and network\n\t-h\toptions\n"
  exit 1
fi

if [ $FORCE != 1 ]; then
  exitIfRecentRestart
fi

DATE=`date +%Y-%m-%d\ %H:%M:%S`  
IWINFO=`iwinfo`

# find lines containing "ESSID"|get name (previous word)|replace return with ","
WI=`echo "$IWINFO" | grep ESSID | grep -Eo '^[^ ]+' | sed ':a;N;$!ba;s/\n/, /g`
# count the number of wlan interfaces, and number of wlans with 'no signal'
WLAN=`echo "$WI" | wc -w`
NOSIGNAL=`echo "$IWINFO" | grep 'Signal: unknown' | wc -l`

if [ $WLAN -eq 0 ]; then
  echo "no wlan interfaces, wifi is probably disabled"
elif [ $WLAN -eq $NOSIGNAL ]; then
  # all wlan interfaces are down, so restart wifi
  echo "$DATE restart wifi- wlans:$WLAN no-signal:$NOSIGNAL interfaces:$WI" | tee -a $LOG
  restartWifi
  sleep 60
  restartNetwork
  exit 1
else
  echo "wifi:ok wlans:$WLAN no-signal:$NOSIGNAL interfaces:$WI"
fi

# restart network if ping google.com && 8.8.8.8 fails 4 times
count=1
while [ "$count" -le 4 ]
 do
   if /bin/ping -c 1 google.com >/dev/null && /bin/ping -c 1 8.8.8.8 >/dev/null; then
      echo "wan:ok  ping-count:$count"
      exit 0
   fi
 let count++
done
echo "$DATE network restart" | tee -a $LOG
restartNetwork

Make it executable-

chmod +x /root/mesh-watchdog.sh

Afterwards, add the following entry with crontab -e

* * * * * /root/mesh-watchdog.sh

It can run once a minute as it detects whether a network restart has just occurred and will wait 20 minutes before restarting again. I added the 20 minute delay so the router is still functional without an internet gateway.

Thanks to Nitin for help with the wifi problem and Zach for help with dnsmasq.

Email me if you have any questions or suggestions.

Grafana Monitoring

Services

Grafana

URL: http://10.70.90.82:3000/
Contents: Dashboards

Prometheus

URL: http://10.70.90.82:9090
Contents: Main hub data
Datasource name: Prometheus

Updating SNMP Data Scraper

Enable SNMP on the device
- login to RouterOS with the device IP
- in webfig go to IP>SNMP and enable
- save
- in quick set note the router ID
Update Prometheus config
- ssh root@10.70.90.82
- nano /opt/prometheus-2.39.1.linux-amd64/prometheus.yml
- add IP for device at the end and use router ID from above in the comments
- save the file
- systemctl restart prometheus.service
- if anything is unclear feel free to look at the command history with the history command
Update Grafana
- Go to a Grafana page where you want to add the new panel, ex. http://10.70.90.82:3000/d/EfHFIMWSz/nostrand-5283?orgId=1
- login with standard password
- duplicate panel
  - right click panel header > more > duplicate
- rename using router id above
- replace the IP
  - not all devices have the same metrics, so you may have to select a different one
- Make sure to save the dashboard!
- If everything worked you should see data
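
Before opening Grafana, one way to confirm that Prometheus is scraping the new device is to ask Prometheus itself. The sketch below uses Prometheus' standard HTTP API (an instant query for the built-in `up` metric). The helper names `query_up` and `scrape_status`, the sample IPs, and the `snmp` job name are made up for illustration, and it assumes the scrape job is configured so that the `instance` label holds the device IP; check prometheus.yml to see whether that holds here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROMETHEUS_URL = "http://10.70.90.82:9090"  # from the Services section above


def query_up(base_url=PROMETHEUS_URL, timeout=10):
    """Run the instant query `up` through Prometheus' HTTP API (mesh access needed)."""
    url = base_url + "/api/v1/query?" + urlencode({"query": "up"})
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def scrape_status(response):
    """Map each target's `instance` label to True (last scrape OK) or False."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % response.get("error"))
    return {
        sample["metric"].get("instance", "?"): sample["value"][1] == "1"
        for sample in response["data"]["result"]
    }


# Made-up response in the API's documented shape; the IPs are placeholders.
SAMPLE = {"status": "success", "data": {"resultType": "vector", "result": [
    {"metric": {"__name__": "up", "instance": "10.69.0.1", "job": "snmp"},
     "value": [1700000000.0, "1"]},
    {"metric": {"__name__": "up", "instance": "10.69.0.2", "job": "snmp"},
     "value": [1700000000.0, "0"]},
]}}

print(scrape_status(SAMPLE))
```

On the mesh, `scrape_status(query_up())` would run the same check against the live server. A device that is missing from the result, or mapped to False, is probably not being scraped correctly yet.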

Prometheus-1

URL: http://10.70.76.98:9090
Contents: Omni data
Datasource name: Prometheus-1

OSPF Data API

Ever wanted to play with the OSPF link data yourself but didn’t want to take the time to build an OSPF node or parse the output from bird? Look no further than the OSPF JSON API:

> curl http://api.andrew.mesh/api/v1/ospf/linkdb
{"areas": {"0.0.0.0": {"routers": ... }}}

Data Available

The full state of the entire OSPF network on the mesh is available via this endpoint. The format is as follows:

{
  "areas": {
    "0.0.0.0": {
      "routers": {
        "<router_id>": {
          "links": {
            "router": [
              {"id": "<other router id>", "metric": <integer link cost>},
              {"id": "<other router id>", "metric": <integer link cost>, "via": "<another router id>"},
              ...
            ],
            "external": [
              {"id": "<external CIDR>", "metric": <integer link cost>},
              ...
            ],
            "stubnet": [
              {"id": "<stubnet CIDR>", "metric": <integer link cost>},
              ...
            ],
            "network": [
              {"id": "<network CIDR>", "metric": <integer link cost>},
              {"id": "<network CIDR>", "metric2": <integer link cost>},
              ...
            ]
          }
        },
        ...
      },
      "networks": {
        "<network CIDR>": {
          "dr": "<router id>",
          "routers": [
            "<router id>",
            ...
          ]
        },
        ...
      }
    }
  },
  "updated": <integer epoch timestamp>
}

For each router, you can see the OSPF links it is advertising, and which type of link they are. Some links have a metric2 value instead of a metric value. This represents a semantically meaningful difference in that router's configuration and the OSPF behavior for that node, but one that is beyond the scope of this document to explain.
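
As a rough illustration of walking this structure, the sketch below flattens the router-to-router links into (router, neighbor, cost) tuples. The function names `fetch_linkdb` and `router_links` and the sample router IDs are invented for this example. The URL is the one from the curl example above, and the code assumes the response matches the format shown.

```python
import json
from urllib.request import urlopen

# URL taken from the curl example above; reachable only from inside the mesh.
LINKDB_URL = "http://api.andrew.mesh/api/v1/ospf/linkdb"


def fetch_linkdb(url=LINKDB_URL, timeout=10):
    """Download and parse the link database (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def router_links(linkdb):
    """Return (router_id, neighbor_id, cost) for every router-to-router link."""
    found = []
    for area in linkdb.get("areas", {}).values():
        for router_id, router in area.get("routers", {}).items():
            for link in router.get("links", {}).get("router", []):
                # Some links carry "metric2" instead of "metric"; accept either.
                cost = link.get("metric", link.get("metric2"))
                found.append((router_id, link["id"], cost))
    # Sort by the two router IDs only, so a missing cost cannot break the sort.
    return sorted(found, key=lambda item: item[:2])


# Made-up sample in the documented shape, so the sketch also runs off-mesh.
SAMPLE = {
    "areas": {"0.0.0.0": {"routers": {
        "10.69.0.1": {"links": {"router": [{"id": "10.69.0.2", "metric": 10}]}},
        "10.69.0.2": {"links": {"router": [{"id": "10.69.0.1", "metric": 10}],
                                "stubnet": [{"id": "10.96.0.0/26", "metric": 10}]}},
    }, "networks": {}}},
    "updated": 1700000000,
}

for router_id, neighbor_id, cost in router_links(SAMPLE):
    print(router_id, "->", neighbor_id, "cost", cost)
```

From a machine on the mesh, `router_links(fetch_linkdb())` would produce the same kind of list for the live network.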

Update Frequency

The server refreshes the JSON data blob once per minute. See the updated field in the top-level JSON object to confirm data freshness.
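
A freshness check can be as small as the sketch below. It assumes the updated field is in seconds since the Unix epoch (the format above only says "integer epoch timestamp"), and the 180-second threshold and the function names are arbitrary choices for this example, not values taken from the API.

```python
import time


def data_age_seconds(linkdb, now=None):
    """Seconds elapsed since the server last refreshed the data."""
    if now is None:
        now = time.time()
    return now - linkdb["updated"]


def is_fresh(linkdb, max_age=180, now=None):
    """True if the data is at most max_age seconds old (threshold is an assumption)."""
    return data_age_seconds(linkdb, now) <= max_age


snapshot = {"areas": {}, "updated": 1700000000}  # made-up sample
print(is_fresh(snapshot, now=1700000060))  # data is 60 seconds old
print(is_fresh(snapshot, now=1700003600))  # data is an hour old
```

The threshold sits a few refresh cycles above the one-minute interval so that a single slow refresh does not count as stale.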

Authentication

None. This data is publicly available to any OSPF node, so no authentication is needed when accessing from the mesh private IP space.

Source Code?

GitHub Link!

Contact

The OSPF data API is maintained by Andrew Dickinson. Reach out to @Andrew Dickinson on slack for questions and comments. I'd love to see what you build.

Overview

This page lists the services "hosted" on NYC Mesh and available directly to NYC Mesh members. Some may be available only to NYC Mesh members, while others may also be available from the Internet via a public IP address (or through public DNS).

There are different types of services. Some are network specific or meant for devices, such as DNS or NTP; others are more people oriented, such as an email server or video chat server.

If you do host a service that you would like to make available to the Mesh Community please let us know so we can add it here.

You can also discuss services on our slack channel #mesh-services

Network services

Public services

NYC Mesh Meet by @Zach
ExcellentFiles by @Eric Zhu.

It is a free file host hosted on sn3. Anyone can get 10G of free storage. It can support around 25 users for now.
"I choose Nextcloud because it is very user friendly, and there is a nice mobile app, and desktop sync app. I have also enabled contacts + calendar sync. I use it myself coz i want to rely on other services less; to be more autonomous :)"

Mastodon on @Daniel Heredia's server at SN3, open to all.
NYC Building KML Tool by @Daniel Heredia, takes two addresses and uses NYC DCP and DOB databases to create a KML line between the rooftops to determine LoS (code).

Projects

Services that are in development...

Support Bot on Slack to automatically respond to #support channel inquiries
Chat app by @George on slack

Phone System and Configuration

Call routing and automation flow spanning two third-party services for Voice-over-IP (VoIP) operation from the public switched telephone network (PSTN) to virtual and physical endpoints.

Call Termination: Google Voice and Callcentric

Background

In early 2022 the 833-NYC-MESH number was parked at NumberBarn until further research could be done on how to implement the number. In the meantime the Google for Nonprofits platform was used to get access to Google Voice for a preliminary IVR/auto-attendant solution for routing calls to particular Google Workspace users and email to Slack for voicemail.

Google Voice does not support porting-in toll-free numbers for terminating calls, so an external provider must be used in conjunction with the 833-NYC-MESH number. Currently, Callcentric has been chosen as our SIP provider due to widespread adoption, low rates, and positive testimonials, though the features it provides can all be replicated using a SIP Trunk->IP PBX architecture.

Routing

The 833-NYC-MESH number is ported into Callcentric for inbound termination and outbound origination. Currently, the Call Treatments feature forwards all incoming calls to the Google Voice auto-attendant "Hotline - Main", though the beta IVR setup through Callcentric's internal portal is partially configured and can be switched over at any time for testing.

Pricing

To compare the plans available through Callcentric against the cost of expanding the Google Voice service further, multiple scenarios were drawn up, each priced for a fixed number of minutes per month.

This Google Sheet has the pricing for three scenarios: pay-as-you-go inbound and outbound calls (which results in a double charge, because the forward to the Google Voice Hotline Root is billed as well as the inbound call to the toll-free number); the 500-minute package for outbound calls (to eliminate the double charge for forwarding to Google Voice and for regular outbound calls); and eliminating the forward to Google Voice altogether and handling calls entirely internally.

Based on findings after testing the plans and billing behavior in Callcentric, it was determined that the 500-minute package makes the most sense for preserving the Google Voice auto-attendant integration while saving a small amount of funds. Switching entirely to Callcentric for IVR handling is the most cost-effective option, though it has the drawback of needing to manually configure SIP endpoints for all users (those endpoints are, however, more flexibly configured than Google Voice users).

Callcentric Cost Calculator - Google Sheets

Callcentric Inbound/Call Treatments

Call Treatments in this context are practically synonymous with Inbound Routes, which may be more commonly seen in VoIP configuration software. NYC Mesh owns two direct-inbound-dialing (DID) numbers registered with Callcentric: toll-free DID 833-NYC-MESH (18336926374) and local DID 13475147546. For more information about how calls are routed through the Callcentric portal, see this separate page. This page outlines the options, both currently used and unused, that are or can be used in routing NYC Mesh calls.

Interface

"Treatments"

833 to GV: Catches calls coming from the toll-free DID and forwards to the Google Voice "Hotline - Main" auto-attendant.
347 Ring Group: Catches calls coming from the local DID and does a simultaneous ring between the Mesh Room desk-phone and the voice@nycmesh.net Google Voice User, which is responsible for voicemail.
833 to IVR: Future setup to catch calls coming from the toll-free DID and forwards to the internal IVR: 1 - Main - Language.
347 IVR: Testing route to catch calls coming from the local DID and forwards to the internal IVR: 1 - Main - Language. This is not for production use, but to avoid incurring costs when configuring the Callcentric IVR through the toll-free rate plan.

Parameters

The Callcentric-provided documentation can be found here.

The main nuance with the currently-implemented setup is the use of "Ring for (seconds)". In all Treatments and IVRs, any SIP extensions are set to time out at 20 seconds of ringing, while the forward to the Google Voice endpoints is set to 60 seconds of ringing. This is to allow the Google Voice endpoint to connect the call to its voicemail for now.

Callcentric IVR Setup

Callcentric has a built-in IVR utility that allows for practically infinite permutations of menus, auto-attendant scripts, and scriptable interactive call flows using Call Treatments. The flow itself can be found here; this page only contains the scripts for the uploaded recordings used as announcement and menu audio files. Each script was read by the NaturalReader software's "Guy Online (Natural) (Free)" voice at x1 speed and recorded into a WAV file with Audacity, using the Windows WASAPI loopback device as the audio source. The WAV files can be easily uploaded through the portal as described below.

Setup

On the left, you can add and configure IVRs and their menu trees which follow the structure and naming convention listed in the below section. On the right, MP3 or WAV audio files below 1Mb can be uploaded to be used within the IVRs. There is a built-in validator to ensure there are audio files in the mandatory places for calls to be handled correctly.

When adding an IVR or clicking "modify" on an existing IVR, the Edit IVR screen will open. On the left, the Announcement Audio selection is for audio files to be played only once when entering the IVR, and the Menu Audio selection is for audio files to be played after the Announcement Audio, and repeatedly after user error events such as a timeout or invalid entry. The audio played on such errors is set in the User error audio selection, which currently only plays a built-in female voice saying "Sorry".

On the right, there are multiple options to route calls based on user entry, between direct transfers to extensions, sending to other IVRs, or connecting to other menus through a transfer. Depending on the setting of Repeat on error, after the error limit is reached the call will terminate.

Call Tree Key

Audio file names are based on the menus where they are used, either as a menu option or as an announcement. Files that begin with 0 refer to common elements shared among multiple root hotlines. Items in red are options that are either planned but not implemented or ideas pending discussion.

IVRs in the 0 zone:

0. a. iv. Common - English - Org Info
Non-Default Parameters: Timeout: 0 sec
Comment: with no Last Route setting configured, the call just drops per the documentation. It would be nice to send this back "up" a menu, but unfortunately it doesn't appear that there is any option that allows you to select the previous IVR menu.

Hotline Roots:

Main - Language (Root Hotline)
a. Main - English - Menu
Non-Default Parameters: Repeat on error: 3, User error audio: Sorry
i. To Get Connected: Simultaneous ringing to Mesh Room and Marco/VM
ii. Tech Support: Single forward to Marco/VM
iii. Buildings Projects Fiber: Single forward to Mesh Room
iv. Org Info: Special forward to IVR 0.a.iv - Org Info
b. Grand - Spanish - Menu (doesn't exist yet!)
c. Grand - Chinese - Menu (doesn't exist yet!)
Grand - Language (Root Hotline)
a. Grand - English - Menu
Non-Default Parameters: Repeat on error: 3, User error audio: Sorry
i. To Get Connected: Simultaneous ringing to Mesh Room and Marco/VM
ii. Tech Support: Single forward to Marco/VM
iv. Org Info: Special forward to IVR 0.a.iv - Org Info
b. Grand - Spanish - Menu (doesn't exist yet!)
c. Grand - Chinese - Menu (doesn't exist yet!)

Text-to-Speech Audio Files and Scripts

Comment: The pound keys are not truly configurable in Callcentric, and despite the script advising its use to repeat the menu, it triggers the User error audio and subsequently the Last route if pressed after the third failure, which disconnects the call.

0a - Root - Language - English

To continue in English, press 1.

0ai - Root - English - Get Connected

To get connected to the mesh, press 1.

0aii - Root - English - Tech Support

For technical support, press 2.

0aiv - Root - Org Info

For more information about our organization, press 4.

0aiv - Root - Org Info - Info

NYC Mesh is a community network offering fast, affordable, and fair access to the Internet for all New Yorkers. By joining NYC Mesh, you can access the Internet while helping your neighbors get better and more accessible internet access. NYC Mesh is a neutral network and we do not monitor, collect, or store any user data or content.

For more information about our community network, visit our website at n y c mesh dot net, and find a list of frequently asked questions and answers at n y c mesh dot net slash f a q.

1 - Main - Language - Thank You

Thank you for calling NYC Mesh.

1aiii - Main - English - Buildings Projects Fiber

For buildings, projects, and fiber installs, press 3.

2 - Grand - Thank You

Thank you for calling NYC Mesh at Grand Street Guild.

Incomplete Recordings

The Callcentric call handling only has IVRs in English. The entry points for other languages would be formatted along the lines of the below:

0b - Root - Language - Spanish

0c - Root - Language - Chinese

Security (outdated)

Security

The goal of this document is to provide the most useful information for anyone interested in the security of the network. If there is missing information that would help understand and improve our network, please reach out to contact@nycmesh.net or join our slack.

We are actively looking for ways to improve the security, resiliency, and ease-of-use of the network to help the widest range of use cases. If you have ideas on how to improve anything, please join our Slack.

Our current threat landscape is most concerned with in-mesh security. Once traffic is routed over an IXP, provider gateway, or peer, it's equivalent to what people are used to.

In-mesh threats include:

DoS by announcement of bogus routes
MiTM attacks on SSL servers using letsencrypt (should be alleviated by multiroute verification if we interconnect in more places)
Visibility of who you talk to when using unencrypted HTTP, DNS queries, SNI, etc for someone along the route chain

Data

We do not keep logs of anything in-mesh. However anyone along the route chain could view unencrypted data or metadata (just like any ISP can).
The organizers of NYC Mesh can see a spreadsheet of signup information volunteered by participants on the join nycmesh page (name, email, phone, address; all but email are optional).
We create a map using map-nodes, from the above spreadsheet

Wifi

A typical home install creates two wireless networks - one open 802.11 access point (with a captive portal), and one WPA2 encrypted upstream gateway. You can change the open access point to be encrypted if you wish.

DNS

The default setup routes .mesh tld DNS requests to 10.10.10.10, which is anycast. Multiple people are running our knot-dns setup available on github (including supernode 1 at 10.10.10.11), but a malicious actor that is closer could take advantage of this.

Slack Support Follow Up Bot

Features

- ticket created on 1st support thread
  - subject includes “follow-up-bot: ”
- if there is no Slack response: nag every 48 hr, up to 3 times, then reopen the OS Ticket ticket and send an email nag, with auto-re-close

Problems to solve

member has issue that gets forgotten about after reporting on slack thread
support thread is never responded to by a volunteer
atypical support threads
- volunteer message to many people
  - Is this a community announcement?
    - if no or no response, then run support bot

Complication

slack threads are not structured causing false positives, identical treatment for different types of threads
someone responds out of thread

Programming

need database
- ignore multiple threads

Process

content matches goes to funnel
automated follow up after 48 hours
- is issue resolved?
  - if no response after 3 cycles then reopen ticket
    - reopen in OS Ticket, send a message, and re-close
    - false positives
  - if yes, then say thank you and do nothing (maybe record analytics somewhere)
    - stop nagging
  - if no
    - should stay in slack
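
The process above can be sketched as a single decision function. This is only an illustration of the flow as written on this page, not the bot's actual code: the function name `next_action`, its arguments, and the returned action strings are all made up for this example.

```python
NAG_INTERVAL_HOURS = 48  # "every 48 hr" from the feature list
MAX_NAGS = 3             # "up to 3 times"


def next_action(reply, nags_sent, hours_since_last_contact):
    """Decide the bot's next step for one support thread.

    reply is "yes" (issue resolved), "no" (not resolved) or None (no answer yet).
    """
    if reply == "yes":
        return "thank_and_stop"            # say thank you, stop nagging
    if reply == "no":
        return "keep_in_slack"             # the issue stays with volunteers in Slack
    if hours_since_last_contact < NAG_INTERVAL_HOURS:
        return "wait"                      # too early for another follow-up
    if nags_sent < MAX_NAGS:
        return "nag"                       # send the next automated follow-up
    return "reopen_ticket_email_and_reclose"


print(next_action(None, 0, 10))   # wait
print(next_action(None, 1, 50))   # nag
print(next_action(None, 3, 50))   # reopen_ticket_email_and_reclose
print(next_action("yes", 2, 5))   # thank_and_stop
```

Keeping the decision in one pure function makes the false-positive cases listed under Complication easier to test without touching Slack or the ticket system.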

Diagram

out of date

link

Software services list

Incomplete list - add your service!

Name	Purpose	Link	Active?	Maintained by
Supportbot	Help diagnose support issues	https://github.com/nycmeshnet/nycmesh-support-bot	Yes	Andrew + Andy
Grafana Private		http://10.70.90.82:3000/dashboards		Olivier + Andy
Millimeter Outages		On Grafana Private (dashboard link)		Andy Baumgartner
Grafana Public		https://stats.nycmesh.net		Zach
Mastodon	Self hosted Twitter alternative		Yes	Daniel
Wiki	Evergreen docs, etc.		Yes	Andy Baumgartner
UISP	Manage Ubiquiti devices	https://uisp.mesh.nycmesh.net	Yes	Olivier
OSPF JSON API	Access OSPF Link DB data without running an OSPF node	http://api.andrew.mesh.nycmesh.net/api/v1/ospf/linkdb	Yes	Andrew Dickinson
OS Ticket	Support and install tickets	https://support.nycmesh.net/scp/login.php		Jason
Node Explorer	Shows OSPF Graph	http://node-explorer.andrew.mesh.nycmesh.net/explorer	Yes	Andrew Dickinson
Node Impact Analyzer	Show downstream nodes affected by outage	http://outage-analyzer.andrew.mesh.nycmesh.net/	Yes	Andrew Dickinson
Contacts Map	Shows emails of nodes associated with a hub	http://10.70.178.21:5000/	Yes	Andy Baumgartner
Uptime Kuma	Monitor Mesh services uptime and alert in slack	http://10.70.178.21:3001/	Yes	Andy Baumgartner
Status Page	Mesh Status Page	http://status.mesh.nycmesh.net	Yes	Willard + Andy + Lydon
Zabbix	Metrics and Alerting	http://zabbix.mesh.nycmesh.net	Yes	Willard
UISP2Zabbix	UISP -> Zabbix Broker	zabbix.mesh.nycmesh.net	Yes	Willard
OSPF2Zabbix	OSPF Device Enroller and Noise Report Generator	zabbix.mesh.nycmesh.net	Yes	Willard
Mesh DB	Database to hold mesh network information	https://db.grandsvc.mesh.nycmesh.net/	Yes	Willard + Andrew

Wiki (Bookstack)

Bookstack is a user friendly Wiki software which the NYC Mesh Wiki is built on.

MVP Wiki Launch Features

Before the Wiki can be more broadly used (and possibly replace docs.nycmesh.net) we must add some key features:

Required

Nice to have

Mesh hosting - Currently on AWS, but we could move to Mesh hosting fairly easily.
Mesh LAN based user creation - Ideally a member on the mesh should be allowed to create an editor account without involving admins. This is possible with a custom theme but needs more development. A disabled in-development theme file is included in the themes directory.

Zabbix

Zabbix lives at http://zabbix.mesh.nycmesh.net

Zabbix is used primarily for historical data collection and Slack alerting. There are a handful of dashboards configured for a few devices, but for the most part, the rest of its configuration is unused.

Data Collection

Zabbix is fed through the following sources:

Data gathered via SNMP from various OSPF devices (mainly OmniTiks) discovered through OSPF2Zabbix
Data forwarded from the UISP API by UISP2Zabbix

Custom Templates

We have a variety of custom templates, some of which were set up manually at one point, the rest either auto-generated or managed by one of the above tools.

Alerting

The main purpose of Zabbix is Alerting. Alerting can be found in the #zabbix-alerts channel. Alerts need to be tuned to what we really care about, such as the antennas on the larger links.

To make a trigger show up in Slack, add the slack tag to it.

The trigger can be any severity level. By default, many triggers are straight-up disabled. Alerting is, unfortunately, a manual process. We're still figuring out what is important and what isn't.

Weekly reports of noisy triggers are published in #zabbix-reports, where the top 20 noisiest triggers are aggregated. This can help us identify problems over time.

Todos:

There is a plan to use certain triggers to automatically switch over links. For example, we'd like to disable the AF60xr on Vernon and use a backup link when it rains.
(Willard): I was working on a service to generate Zabbix templates from MIB files using the Zabbix API. I'd like to tailor it towards specific Ubiquiti devices and use it + the UISP API to discover compatible antennas and use the SNMP data to enrich our DataLink data.
(Willard): Expand UISP2Zabbix to cover more than just DataLinks. It would be cool to get all kinds of data out of it and into Zabbix for analysis
(Willard): Problem heatmap. If I could overlay problems on top of Andrew's Node Explorer, we could see problem areas within the mesh.
(Willard): Integrate Grafana with Zabbix. I know this is possible, the question is what's the best way to do this? And, if we're primarily doing this for UISP, then why not build something that integrates with UISP? (Couldn't be that hard to just query UISP's database directly, right?)

More Info

For (outdated-ish) information on how this was set up, including how Slack alerting was configured, refer to this doc: https://docs.google.com/document/d/1mJI8DWe882P6GCEGdT0xazxwrrCQZD7qEBcsDEjDU7Q/edit?usp=sharing

Website

https://www.nycmesh.net/

Website Update Ideas

Add your website update ideas!

Branding

Clean but playful design: Incorporate hand drawn graphics using standard Mesh color palette
Photo integration: show our physical network
- Photos of our diverse member community
- Photo visualization of mesh: like the below example, but with more playful hand-drawn lines etc. to highlight our community values:
Clear graphics for most important topics (also good for non English speakers)

Consistency

Avoid repeating information (e.g., the $290 install fee is mentioned on a number of pages, sometimes referred to as a donation and sometimes as an equipment cost). Use links to an authoritative page instead, so updates are reflected everywhere.

Interactive interface

Users could benefit from interactive design

pre-join/new member presentation
troubleshooting flowchart
install-team sign-up

Media Ideas

[add your photo, graphic, video, etc. ideas here!]

MeshDB

MeshDB Schema Design

Background

MeshDB is an under-development software application with the goal of replacing the New Node Responses Google Sheet (the spreadsheet) as the source of truth for NYC Mesh member, install, geolocation, device, and connection information via a proper SQL database. It is built on the Django ORM, using Python Model objects to represent underlying database schema structures. The schema used for development up to this point is unable to faithfully represent some edge cases that occur at atypical NYC Mesh sites. In this document, we propose a modified schema and explain each edge case, detailing how the edge case will be represented under the proposed schema.

The Schema (Simplified)

The following diagram depicts the proposed schema, showing the relationships between models (SQL tables) and some key attributes of each model. For clarity, non-essential attributes are omitted (see Appendix A for a comprehensive diagram).

We propose the following models:

Member - Represents a single NYC Mesh Member (even if they have moved between multiple addresses and therefore have multiple installs, or "own" multiple active installs). Tracks their name, email address, and other contact details.
Install - Represents the deployment (or potential deployment) of NYC Mesh connectivity to a single household. This most closely maps to the concept of a row in the spreadsheet. Tracks the unit number of the household, which member lives there, and which building the unit is located within. It is keyed by install number, which corresponds to row number on the spreadsheet. With foreign keys to Member, Building, and Device, it acts as the central model, tying the entire schema together. Many objects have a status field, but the install status field maps most closely onto the status tracked in the spreadsheet today. Completed Installs have a foreign key to the Device model (via_device), which keeps track of the device they use to connect to the mesh.
Building - Represents a location in NYC identified by a single street address (house number and street name). In the case of large physical structures with more than one street address, we will store one Building object for each address that we have received Install requests for. Buildings track a primary network number, to represent the way the site is referred to colloquially. In the case that a building has more than one network number, the primary network number will be set to the one volunteers designate as the “primary” (usually the first assigned, busiest router, etc.).
Device - Represents a networking device (router, AP, P2P antenna, etc.). Most closely corresponds to a “dot” on the map. Not comprehensive of all devices on the mesh, only those that need a map dot. For big hub sites, this may be only the core router. Contains a mandatory field for “network number” (NN), which will be set to the NN of the device, or of the “first hop” router used by this device (for devices like APs which have no NN assigned). It contains optional lat/lon override fields, which can be used to refine the exact location of this device (e.g. for map display). When no lat/lon are provided for a device, it is assumed to reside at the lat/lon of the building it is associated with (via the Install model). Devices can optionally track which install delivers them power, via a powered_by_install foreign key to the Install model, which tells us which unit has the PoE injector.
- Sector - A special type of device (using Django Model Inheritance to inherit all fields from Device) which adds fields related to the display of sector coverage information on the map (azimuth, width, and radius).
Link - A connection between devices, which represents a cable or wireless link, whether directly between the devices or via other antennas not represented with their own device objects.
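
As a rough illustration of how these models relate, here is a sketch using plain Python dataclasses. It is not the actual MeshDB Django code: apart from via_device, which is named above, the field names and the device_location helper are assumptions made for this example, and Sector and powered_by_install are left out for brevity. The helper shows the lat/lon fallback described under Device.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Member:
    name: str
    email: str


@dataclass
class Building:
    address: str
    primary_nn: Optional[int]
    latitude: float
    longitude: float


@dataclass
class Device:
    network_number: int
    # Optional overrides that refine the device's position on the map.
    lat_override: Optional[float] = None
    lon_override: Optional[float] = None


@dataclass
class Install:
    install_number: int
    member: Member
    building: Building
    unit: str
    status: str
    via_device: Optional[Device] = None  # set once the install is completed


@dataclass
class Link:
    from_device: Device
    to_device: Device


def device_location(device, installs):
    """Return (lat, lon): the override if set, else the building's position."""
    if device.lat_override is not None and device.lon_override is not None:
        return (device.lat_override, device.lon_override)
    for install in installs:
        if install.via_device is device:
            return (install.building.latitude, install.building.longitude)
    return None  # no override and no install ties this device to a building
```

Because a Device has no direct reference to a Building in this sketch, the fallback has to walk through the installs, which mirrors the "via the Install model" wording above.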

Example 1 - NN492 - Typical Multi-Tenant Install

In this simple example, we have two tenants in a single building with a single address, both connected via cables directly to an omni on their shared roof. They are connected to the rest of the mesh via an LBE to Saratoga. The database tables for this scenario look like this:

Installs
Install Number	Via Device	Building
13134	1	1
13276	1	1

Buildings
ID	Primary NN	Address	BIN
1	492	216 Schaefer Street	3079532

Devices
ID	Network Number	lat/lon overrides
1	492	-

Links
ID	From Device	To Device
1	1	<saratoga device id>
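
To make the table relationships concrete, the sketch below loads the Example 1 rows into an in-memory SQLite database and joins them. The table and column names are assumptions chosen for this illustration (the real MeshDB tables are generated by Django and will differ), and the Links table is left out because the Saratoga device id is not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE building (id INTEGER PRIMARY KEY, primary_nn INTEGER, address TEXT, bin INTEGER);
CREATE TABLE device (id INTEGER PRIMARY KEY, network_number INTEGER);
CREATE TABLE install (
    install_number INTEGER PRIMARY KEY,
    via_device_id INTEGER REFERENCES device(id),
    building_id INTEGER REFERENCES building(id)
);
INSERT INTO building VALUES (1, 492, '216 Schaefer Street', 3079532);
INSERT INTO device VALUES (1, 492);
INSERT INTO install VALUES (13134, 1, 1), (13276, 1, 1);
""")

# For each install, look up the NN of the device it connects through
# and the address of the building it sits in.
rows = conn.execute("""
    SELECT i.install_number, d.network_number, b.address
    FROM install AS i
    JOIN device AS d ON d.id = i.via_device_id
    JOIN building AS b ON b.id = i.building_id
    ORDER BY i.install_number
""").fetchall()

for row in rows:
    print(row)
```

Both installs resolve to the same device (NN 492) and the same address, which is what makes this the "typical" case.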

Example 2 - NN 4734 - Cross-Building Installs

In this example, members in 3 adjacent buildings, each with its own address, are connected via a single omni, with cable runs across the roofs directly to the members’ apartments. They are connected to the rest of the mesh via a mANT 802.11 sector at 4507. The database tables for this scenario look like this:

Installs
Install Number	Via Device	Building
4734	2	2
6972	2	3
13663	2	4

Buildings
ID	Primary NN	Address	BIN
2	4734	31 Clarkson Ave	3115982
3	4734	25 Clarkson Ave	3115985
4	4734	27 Clarkson Ave	3115984

Devices
ID	Network Number	lat/lon overrides
2	4734	-

Links
ID	From Device	To Device
2	2	<4507 device id>

Example 3 - 7th Street (NN 731) - Multiple Omnis on one building

In this example, we have one regular tenant in a single building with a single address. However, there is also a rooftop office with its own omni, connected wirelessly to the primary one. They are connected to the rest of the mesh via a GBELR to Grand. The database tables for this scenario look like this:

Installs
Install Number	Via Device	Building
731	3	5
12985	4	5

Buildings
ID	Primary NN	Address	BIN
5	731	190 East 7th Street	1086499

Devices
ID	Network Number	lat/lon overrides
3	731	-
4	311	x, y

Links
ID	From Device	To Device
3	3	4
4	3	<1932 device id>

Example 4 - Vernon (NN 5916) - Courtyard APs

In this example, we have a core hub site in a single building with a single address. However, there are many Access Points (APs) on light poles in the building’s courtyard. These light poles are unquestionably associated with the same building/address as the core router of this hub, but need to be shown separately on the map.

In this scenario, we treat the light poles as if they are “apartments” in the Vernon building. Each gets its own install #, and, imagining a tenant living in the light pole, we say that this imaginary install is “connected via” a device object representing the AP. The network number for these APs is set to 5916, reflecting their first hop router (and the fact that they are not themselves assigned NNs). Links between the courtyard APs and the core router are included so that they are rendered on the map.

The database tables for this scenario look like this:

Installs
Install Number	Via Device	Building
5916	5	6
6345	-	6
11875	6	6
11876	7	6
11877	8	6
11878	9	6
11879	10	6
11880	11	6

Buildings
ID	Primary NN	Address	BIN
6	5916	303 Vernon Avenue	3042881

Devices
ID	Network Number	lat/lon overrides
5	5916	-
6	5916	x, y
7	5916	x, y
8	5916	x, y
9	5916	x, y
10	5916	x, y
11	5916	x, y

Links
ID	From Device	To Device
5	5	<SN3 device id>
6	5	<grand device id>
7	8	9
8	8	6
9	6	5
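
Since the courtyard APs reach the core router through other APs, the Links table effectively describes a small graph. The sketch below walks that graph with a breadth-first search to find the chain of devices between an AP and the core router. It assumes links can be followed in either direction, and it only uses links 7 to 9, because the far-end device ids of links 5 and 6 are not given; the function name is made up for this example.

```python
from collections import deque

# (from_device, to_device) pairs for links 7-9 in the table above; links 5 and 6
# are left out because their far-end device ids are not given.
LINKS = [(8, 9), (8, 6), (6, 5)]


def path_between(start, goal, links):
    """Breadth-first search over undirected links; returns device ids or None."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(path_between(9, 5, LINKS))  # [9, 8, 6, 5]
```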

Example 5 - Prospect Heights (NN 3461) - Multiple NNs for one building

In this example, we have a core hub site in a single building with a single address. The primary NN, 3461, also serves a member’s apartment as install #3461. However, another apartment could not be connected via a cable due to practical considerations, and had to be connected via an antenna in its window to a sector on the roof. This antenna needed an NN for configen and naming, so this building received multiple NNs.

The database tables for this scenario look like this:

Installs
Install Number	Via Device	Building
3461	12	7
3921	-	7
6723	-	7
11024	13	7
14399	-	7
14960	-	7

Buildings
ID	Primary NN	Address	BIN
7	3461	135 Eastern Parkway	3029628

Devices
ID	Network Number	lat/lon overrides
12	3461	-
13	377	x, y

Links
ID	From Device	To Device
10	12	<SN3 device id>
11	13	12

Example 6 - Jefferson (NN 3606) - Multiple NNs for multiple buildings

In this example, we have a building with 4 addresses and 3 omnis on the roof, each with its own network number. There is no clean mapping between NNs and addresses, since each omni serves installs in multiple buildings. The omni of the primary NN, 3606, provides the uplink to Hex House (NN 1417).

The database tables for this scenario look like this:

Installs (omitting abandoned & potential for brevity)
Install Number	Via Device	Building
3606	14	8
5933	15	8
7177	15	8
8152	16	8
8274	14	9
8085	16	11

Buildings
ID	Primary NN	Address	BIN
8	3606	476 Jefferson Street	3819572
9	3606	488 Jefferson Street	3819572
10	3606	28 Scott Avenue	3819572
11	3606	16 Cypress Avenue	3819572

Devices
ID	Network Number	lat/lon overrides
14	3606	x, y
15	5933	x, y
16	169	x, y

Links
ID	From Device	To Device
12	14	<1417 device id>
13	14	15
14	15	16
15	16	14
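
The lack of a clean NN-to-address mapping can be seen by following each install to its device and its building. The sketch below does this with plain Python structures, using rows copied from the tables above; the variable names are assumptions for this illustration.

```python
from collections import defaultdict

# Rows copied from the Installs, Buildings, and Devices tables above.
INSTALLS = [  # (install_number, via_device, building)
    (3606, 14, 8), (5933, 15, 8), (7177, 15, 8),
    (8152, 16, 8), (8274, 14, 9), (8085, 16, 11),
]
BUILDINGS = {8: "476 Jefferson Street", 9: "488 Jefferson Street",
             10: "28 Scott Avenue", 11: "16 Cypress Avenue"}
DEVICE_NN = {14: 3606, 15: 5933, 16: 169}

# Collect the set of network numbers that serve installs at each address.
nns_by_address = defaultdict(set)
for _install_number, device_id, building_id in INSTALLS:
    nns_by_address[BUILDINGS[building_id]].add(DEVICE_NN[device_id])

for address, nns in sorted(nns_by_address.items()):
    print(address, sorted(nns))
```

476 Jefferson Street is served by all three NNs, while NN 3606 and NN 169 each also serve a second address, so neither direction of the mapping is one-to-one. 28 Scott Avenue does not appear because none of the listed installs are in that building.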

Appendix A - Full Schema Diagram

The following is a complete schema diagram, showing all fields. Additions relative to the current implementation are shown in yellow, and removed fields are shown in red.

MeshDB

How to onboard applications to MeshDB

Adding a new user for an application

Navigate to the admin portal at db.grandsvc.mesh.nycmesh.net/admin/ and select "Add user".

Make a new user specifically for the application, not just for its author. For example, if Andy is creating an application to measure member distance to LinkNYC kiosks, don't create a user called AndyB; create a user called AndyB-LinkNYCKioskTool. For the password, enter something secure, such as a random password generated by your browser. There is no need to save this password, since we will use a token to authenticate this user.

Save the user, and then click on the username in the Users list to add the necessary permissions directly on the user object. Do not add the user to any groups. Do not grant the user Staff or Superuser permissions.

Use the arrows or double click to select permissions from the list of all possible permissions the application could be granted. Most applications do not need change/delete/add permissions. In this example, we grant Andy's tool "view" access to the Install, Building, and Member tables. Save the changes you've made to the user object.

Adding an API token

Follow the instructions under "Adding a new user for an application" above. Then select "Add" next to Tokens, and select the user you just created in the dropdown provided.

Save the new token, then send it to the author of the application. For more information on using this token to query the API, see the API docs here: https://db.grandsvc.mesh.nycmesh.net/api-docs/swagger/
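
For application authors, a minimal sketch of attaching the token to a request using only the Python standard library is shown below. The header scheme ("Bearer" here) and any endpoint path you pass in are assumptions for this example; confirm both against the API docs linked above before relying on them.

```python
import json
import urllib.request

API_BASE = "https://db.grandsvc.mesh.nycmesh.net"


def build_request(token, path, scheme="Bearer"):
    """Build (but do not send) an authenticated GET request.

    The scheme default is an assumption; check the API docs for the real one.
    """
    return urllib.request.Request(
        API_BASE + path,
        headers={"Authorization": f"{scheme} {token}", "Accept": "application/json"},
    )


def fetch_json(token, path):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(token, path), timeout=10) as resp:
        return json.load(resp)
```

A call would look like fetch_json(my_token, "/api/v1/installs/"), where the path is a hypothetical example. Keep the token out of source control, for instance by reading it from an environment variable.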

Adding a new webhook recipient

Follow the instructions under "Adding a new user for an application" above. You may use the same "User" object for both tokens and webhooks if they are for the same application.

Select the "Add" button next to Webhook Targets, then use the magnifying glass icon to select the user you created for this application. Enter the target URL for notification delivery (this will be provided by the application owner). This URL will receive an HTTP POST request every time the selected event is fired.

Select the appropriate event in the dropdown, based on the event the application needs to receive, and save. If the application needs to receive more than one event type, add a separate webhook target for each event it needs to receive.
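
On the receiving side, the application owner needs an endpoint that accepts these POST requests. The following is a minimal sketch using Python's built-in http.server. It assumes the request body is a JSON object, which is not confirmed here, and the names parse_event, WebhookHandler, and make_server are made up for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(body):
    """Decode a webhook body, assuming it is a JSON object."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_event(self.rfile.read(length))
        except ValueError:
            # Malformed or unexpected body: reject it.
            self.send_response(400)
            self.end_headers()
            return
        print("received event:", event)
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()


def make_server(port=8000):
    """Create the server; call .serve_forever() on the result to start it."""
    return HTTPServer(("", port), WebhookHandler)
```

Running make_server().serve_forever() starts the listener. The URL at which this server is reachable is what would be entered as the target URL above.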