by Tykling
05. nov 2020 11:34 UTC
I've been playing with mobile modems recently, and found myself in need of a good way to monitor the signal strength of the modems. Standard AT commands can be used to query each modem about its (perceived) signal strength. I want to get these numbers periodically and put them into Prometheus so I can make myself a nice Grafana dashboard showing how my modems are doing.
This innocent-sounding task ended up spawning a couple of small new software projects to get the job done. First I needed something scriptable to send a command to a serial device and get the output, like an Expect script but from the command line, without writing a script as such. Ideally I just wanted to do echo "AT" | something and get "OK" on stdout, so that it would also be usable as a general tool for quick little serial commands.
It turns out such a tool didn't really exist, or at least I was unable to find one. I decided to write one, and the result is PipeSerial (install from PyPI for now). I use it in the monitoring script to get the signal strength from my modems, but it could be used for any sort of serial device interaction where you need to send a payload to a serial device and return the output. It is just a wrapper around pySerial and Pexpect (with the convenient pexpect-serial acting as glue between the two).
Once that was in place I needed something that uses my new PipeSerial tool to get modem signal strength/quality and puts it somewhere in a format which can be picked up by Prometheus. This became mobile_modem_exporter (install from PyPI for now), a Prometheus exporter which uses PipeSerial and the Python Prometheus client library to export the metrics I need to a location ingested by the Node Exporter Textfile Collector.
The finished dashboard looks like this. I've created alerts in Prometheus for low dBm, no traffic on the interfaces and a few other things. I am pretty happy with it!
First things first: I need to get the raw data out of the modems. The AT+CSQ command is the standard method of getting signal strength information from the device, and it should work on all mobile modems. Each vendor may also have some additional proprietary commands which can return more detailed information, but for this I am using AT+CSQ. The output looks like this:
    AT+CSQ
    +CSQ: 18,99
    OK
Here 18 is the rssi and 99 is the BER (bit error rate), which is not supported on these modems and thus always 99. An rssi of 18 on this modem means -77 dBm, because (18 * 2) - 113 = -77 (more info below).
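As a rough illustration, picking the two numbers out of that reply could look something like the Python sketch below. The name parse_csq is made up for this example, and this is not necessarily how mobile_modem_exporter does it. It assumes the reply contains exactly the kind of +CSQ: line shown above.

```python
import re
from typing import Tuple

# Matches a line such as "+CSQ: 18,99" anywhere in the modem's reply.
CSQ_LINE = re.compile(r"^\+CSQ:\s*(\d+),(\d+)\s*$", re.MULTILINE)


def parse_csq(reply: str) -> Tuple[int, int]:
    """Return (rssi, ber) from the raw reply to AT+CSQ (hypothetical helper)."""
    match = CSQ_LINE.search(reply)
    if match is None:
        raise ValueError(f"no +CSQ line in reply: {reply!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    rssi, ber = parse_csq("AT+CSQ\r\n+CSQ: 18,99\r\nOK\r\n")
    print(rssi, ber)
```

Raising an error when no +CSQ: line is found keeps a garbled reply from silently turning into a bogus measurement.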
Getting the data periodically with pipeserial is as simple as:
    [tykling@container1 ~]$ while true; do date; echo "AT+CSQ" | sudo pipeserial /dev/modem-quectel-control ; sleep 10; done
    Thu Nov 5 11:16:08 UTC 2020
    AT+CSQ
    +CSQ: 22,99
    OK
    Thu Nov 5 11:16:19 UTC 2020
    AT+CSQ
    +CSQ: 25,99
    OK
    Thu Nov 5 11:16:31 UTC 2020
    AT+CSQ
    +CSQ: 21,99
    OK
    ^C
    [tykling@container1 ~]$
Wrapping this in a script and grepping for ^+CSQ is enough to get something to work with, which is basically what mobile_modem_exporter does. I added an Ansible task to install it in a virtualenv and run it under supervisord.
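The "wrap it in a script and grep" idea can be sketched in Python roughly as follows. This only relies on the command-line behaviour shown above (command on stdin, reply on stdout). The helper names query_modem and csq_lines are invented for the example, and the real exporter may well talk to PipeSerial differently.

```python
import subprocess
from typing import List, Sequence


def query_modem(command: str, argv: Sequence[str], timeout: float = 10.0) -> List[str]:
    """Pipe one command into a pipeserial-style program and return its non-empty output lines."""
    result = subprocess.run(
        list(argv),
        input=command + "\n",   # same as: echo "AT+CSQ" | pipeserial <device>
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,             # raise if the program exits non-zero
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def csq_lines(lines: List[str]) -> List[str]:
    """Keep only the +CSQ lines, much like grepping for ^+CSQ."""
    return [line for line in lines if line.startswith("+CSQ")]


# Example call, assuming pipeserial is on PATH and the device is readable:
#   csq_lines(query_modem("AT+CSQ", ["pipeserial", "/dev/modem-quectel-control"]))
```

Passing the program as an argument list keeps the shell out of the picture and makes the function easy to try against a fake command before pointing it at a real modem.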
I then made supervisord run mobile_modem_exporter with the following supervisord.d config snippet:
    [tykling@container1 ~]$ cat /usr/local/etc/supervisord.d/mobile_modem_exporter.conf
    ; run mobile_modem_exporter
    [program:mobile_modem_exporter]
    command=/usr/local/bin/mobile_modem_exporter /var/tmp/node_exporter/mobile_modem.prom /dev/modem-huawei-control /dev/modem-quectel-control
    user=root
    stdout_syslog=True
    stderr_syslog=True
    startsecs=5
    autostart=True
    [tykling@container1 ~]$
The output ends up in /var/tmp/node_exporter/mobile_modem.prom and looks like this:
    [tykling@container1 ~]$ cat /var/tmp/node_exporter/mobile_modem.prom
    # HELP mobile_modem_up This metric is always 1 if the mobile_modem scrape worked, 0 if there was a problem getting info from one or more modems.
    # TYPE mobile_modem_up gauge
    mobile_modem_up 1.0
    # HELP mobile_modem_build_info Information about the mobile_modem_exporter itself.
    # TYPE mobile_modem_build_info gauge
    mobile_modem_build_info{pipeserial_version="0.3.0",version="0.1.0"} 1.0
    # HELP mobile_modem_info Information about the mobile modem being monitored, including device path, manufacturer, model, revision and serial number.
    # TYPE mobile_modem_info gauge
    mobile_modem_info{device="/dev/modem-huawei-control",manufacturer="Huawei Technologies Co., Ltd.",model="ME909s-120",revision="11.617.15.00.00",serial="864172044791624"} 1.0
    mobile_modem_info{device="/dev/modem-quectel-control",manufacturer="Quectel",model="EC25",revision="EC25EFAR06A06M4G",serial="860548043742078"} 1.0
    # HELP mobile_modem_atcsq_rssi RSSI for the mobile modem as returned by AT+CSQ
    # TYPE mobile_modem_atcsq_rssi gauge
    mobile_modem_atcsq_rssi{device="/dev/modem-huawei-control"} 19.0
    mobile_modem_atcsq_rssi{device="/dev/modem-quectel-control"} 25.0
    # HELP mobile_modem_atcsq_ber BER for the mobile modem as returned by AT+CSQ
    # TYPE mobile_modem_atcsq_ber gauge
    mobile_modem_atcsq_ber{device="/dev/modem-huawei-control"} 99.0
    mobile_modem_atcsq_ber{device="/dev/modem-quectel-control"} 99.0
    [tykling@container1 ~]$
As you can see, mobile_modem_exporter also exports a mobile_modem_info metric which has metadata about each modem: manufacturer, model, revision and serial number. This information is displayed in the tables between the Stat panels at the top and the dBm graphs.
Node Exporter will automatically pick up the new metrics from the .prom file as often as it is scraped by Prometheus, and they are immediately available in Prometheus and Grafana for visualisation.
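To make the file format above a bit more concrete, here is a standard-library-only sketch that renders one gauge and writes it where the Textfile Collector can find it. The real mobile_modem_exporter uses the Python Prometheus client library instead, so render_gauge and write_atomically are purely illustrative names. The temp-file-then-rename step is an assumption on my part about good practice: it keeps Node Exporter from reading a half-written file.

```python
import os
import tempfile
from typing import Dict


def render_gauge(name: str, help_text: str, samples: Dict[str, float]) -> str:
    """Render one gauge, labelled by device path, in the Prometheus text format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for device, value in samples.items():
        # Assumes device paths contain no quotes or backslashes (no label escaping here).
        lines.append(f'{name}{{device="{device}"}} {float(value)}')
    return "\n".join(lines) + "\n"


def write_atomically(path: str, content: str) -> None:
    """Write to a temp file in the same directory, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        os.chmod(tmp_path, 0o644)   # mkstemp creates the file with mode 0600
        os.replace(tmp_path, path)  # atomic when both paths are on one filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Calling render_gauge("mobile_modem_atcsq_rssi", "RSSI for the mobile modem as returned by AT+CSQ", {"/dev/modem-huawei-control": 19}) produces the same three-line shape as the rssi section of the file shown above. The temp file deliberately does not end in .prom, so the collector ignores it until the rename.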
While the rssi number itself is a subjective (and manufacturer-specific) value, the AT command reference for a modem should contain the information needed to convert the rssi number to dBm (decibels relative to one milliwatt), which is a standard and comparable unit of measurement. I found the conversion ratio in the AT+CSQ command description in the AT Command Reference for each of my two modems (Huawei, Quectel).
The two modems I have in this server fortunately both use more or less the same scale (at least in the sweet spot where the modems are likely to be operating). The formula is (rssi * 2) - 113, which results in a dBm value that is (more) comparable across the two (and other) modems. I added this as a Prometheus recording rule so I always have the dBm value available. The recording rule is simple:
    [tykling@prometheus1 ~]$ cat /usr/local/etc/prometheus-rules.yml
    ---
    groups:
      - name: modem-exporter-group
        rules:
          - expr: (mobile_modem_atcsq_rssi * 2) - 113
            record: mobile_modem_atcsq_dbm
    [tykling@prometheus1 ~]$
I have other recording rules but they are not relevant here. This recording rule makes the new metric mobile_modem_atcsq_dbm available with the same timestamps and labels as the original mobile_modem_atcsq_rssi metric. I don't actually use mobile_modem_atcsq_rssi in any graphs.
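For checking the arithmetic outside Prometheus, the same formula can be written as a small Python function. This is only a sketch: the name csq_rssi_to_dbm is made up, and the handling of 99 as "unknown" and of 0 to 31 as the valid range is my assumption based on the generic AT+CSQ definition, so the modem's own AT command reference should have the final word.

```python
from typing import Optional


def csq_rssi_to_dbm(rssi: int) -> Optional[int]:
    """Convert an AT+CSQ rssi index to dBm using (rssi * 2) - 113.

    Returns None for 99 (assumed to mean "not known") and for any other
    value outside 0..31, where this formula is not assumed to apply.
    """
    if not 0 <= rssi <= 31:
        return None
    return rssi * 2 - 113


if __name__ == "__main__":
    for sample in (18, 19, 25, 99):
        print(sample, csq_rssi_to_dbm(sample))
```

With the readings seen earlier, 18 gives -77, 19 gives -75 and 25 gives -63, which matches what the recording rule computes for the same rssi values.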
For the record, the color thresholds I've defined in Grafana for the dBm graphs are:
Note that the values above are for LTE networks; if the modems switch to GSM or something else I will need to use different thresholds. This gives me a pair of dBm graphs which look like this:
These graphs will serve us well when we go to mount the antennas in their permanent location (readings every 10 seconds are wonderful when adjusting antenna angles!), and they will also be great to have in the future when troubleshooting connectivity or whatever.
This took longer than expected, but it did spawn a couple of nice small tools along the way! Stuff like this can always be done faster by yoloing a shell script here and there, and it might work for a while, but in my experience it always breaks eventually. Additionally, little crappy shell scripts are unlikely to be reused by others - people might as well write their own little crappy shell script.
In contrast, a properly linted, documented, tested and packaged tool is so much more likely to benefit others, or attract new contributors and become even better. Finally, a well-crafted utility is much nicer to work with. It has the -v knob to show the version, it does what you would expect when you type -h, and so on.
No matter what the future brings for the two new tools, PipeSerial and mobile_modem_exporter, I am really happy knowing that they are already useful for providing metrics about the BornHack infrastructure. And if I ever run into a similar task in the future I have the tools to handle it!
I've recently signed up for GitHub Sponsors, meaning it is now easy to sponsor me and my work. If this post or some of my other writing, software or services have helped you, then you can consider becoming a sponsor.