by Tykling
25. nov 2020 13:54 UTC
I recently configured an unattended remote computer for monitoring humidity in the 20-foot shipping container we use for BornHack storage. It uses mobile modems for uplinks and it can tell us all sorts of info about the environment inside the container.
Any time a remote unattended system stops responding it can be impossible to know if it is related to network issues, problems with the OS or hardware, or a power outage. We wanted our monitoring to be able to differentiate between a power outage and a network or computer related issue.
For this purpose I purchased a cheap small UPS, which will of course do the normal stuff a UPS does: function as a surge/lightning protector and keep the computer running through any small power outages shorter than a few hours. But I also wanted to configure communications with the FreeBSD APU so I could get an alert if the power goes away, and to have some idea of how the UPS is doing.
NUT, or Network UPS Tools, is a (somewhat messy) software package which includes drivers and software to communicate with and monitor UPS devices. It fought me every step of the way, but eventually I managed to make it act the way I wanted. This post documents the experience.
The UPS hardware I went for was a PowerWalker Basic 600 STL which has USB (HID) management supported by NUT, is rated at 600 VA (more than enough for a small APU with a few sensors, webcams and antennas), and cost me less than 50€. It seems UPS devices have become very cheap while I haven't been paying attention.
Picking which UPS to get from the PowerWalker selection wasn't easy. They have a mind-blowing number of models available. I am still not sure I picked the best one for my purpose, because it wasn't easy to see the differences, even using the comparison tool and configurator on their website.
The model I ended up with works well with NUT (I think most or all of their models do!), and it can tell me most of the metrics I expect, except for load (as in: how many watts are being used by devices connected to the UPS right now). It would have been nice to have load too, but I can live without.
The UPS has a 230 V plug to connect to the mains, a USB A plug to speak to the computer, and two 230 V sockets in the back to power devices.
NUT is a bit of a mouthful. I've tried to break down the different commands and daemons here for a bit of background before we dive in. Once everything is set up and working it is actually pretty intuitive and it works really well, but getting there was painful.
The main NUT daemon is called upsd and it has two main config files: /usr/local/etc/nut/upsd.conf, which is empty on my system, and a users file at /usr/local/etc/nut/upsd.users. Finally, the UPS devices are defined in a third config file at /usr/local/etc/nut/ups.conf.
The daemon responsible for monitoring UPS and battery state is called upsmon. It has a config file at /usr/local/etc/nut/upsmon.conf.
The upsc command is used to retrieve one or all variables from the UPS (by speaking to upsd). It doesn't have a config file.
The upscmd command is used to run commands on the UPS (also by speaking to upsd). It also doesn't have a config file, but for some commands it will ask for credentials for a upsd account (ideally one with permissions for the command in question).
The upsrw command is used to change variables in the UPS (also by speaking to upsd). It also doesn't have a config file.
First things first: I needed to write a ups.conf to match my hardware. This involved quite a bit of trial and error. The main problem was finding the correct driver to use. The HCL section of the NUT website has the following sentence on the page for a similar sounding PowerWalker model: "This device is known to work with driver megatec_usb, now replaced by the blazer_usb one. It may also work with driver nutdrv_qx."
This text is what originally made me choose a PowerWalker model, so imagine my surprise when it didn't work at all with either blazer_usb or nutdrv_qx. I finally found the usbhid-ups driver which worked for me.
For me, the process of finding the right driver was something of an adventure. All the drivers are installed in /usr/local/libexec/nut/:
    [tykling@container1 ~]$ ls -l /usr/local/libexec/nut/
    total 2398
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 al175
    -r-xr-xr-x  1 root  wheel   89464 Nov  4 01:00 apcsmart
    -r-xr-xr-x  1 root  wheel   77176 Nov  4 01:00 apcsmart-old
    -r-xr-xr-x  1 root  wheel   60760 Nov  4 01:00 apcupsd-ups
    -r-xr-xr-x  1 root  wheel  101768 Nov  4 01:00 bcmxcp
    -r-xr-xr-x  1 root  wheel  101752 Nov  4 01:00 bcmxcp_usb
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 belkin
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 belkinunv
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 bestfcom
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 bestfortress
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 bestuferrups
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 bestups
    -r-xr-xr-x  1 root  wheel   81288 Nov  4 01:00 blazer_ser
    -r-xr-xr-x  1 root  wheel   89464 Nov  4 01:00 blazer_usb
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 clone
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 clone-outlet
    -r-xr-xr-x  1 root  wheel   60792 Nov  4 01:00 dummy-ups
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 etapro
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 everups
    -r-xr-xr-x  1 root  wheel   73080 Nov  4 01:00 gamatronic
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 genericups
    -r-xr-xr-x  1 root  wheel   64904 Nov  4 01:00 isbmex
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 ivtscd
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 liebert
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 liebert-esp2
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 masterguard
    -r-xr-xr-x  1 root  wheel   73080 Nov  4 01:00 metasys
    -r-xr-xr-x  1 root  wheel  142712 Nov  4 01:00 mge-shut
    -r-xr-xr-x  1 root  wheel   73080 Nov  4 01:00 mge-utalk
    -r-xr-xr-x  1 root  wheel   73080 Nov  4 01:00 microdowell
    -r-xr-xr-x  1 root  wheel  105832 Nov  4 01:00 netxml-ups
    -r-xr-xr-x  1 root  wheel   64872 Nov  4 01:00 nutdrv_atcl_usb
    -r-xr-xr-x  1 root  wheel  773528 Nov  4 01:00 nutdrv_qx
    -r-xr-xr-x  1 root  wheel   89464 Nov  4 01:00 oldmge-shut
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 oneac
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 optiups
    -r-xr-xr-x  1 root  wheel   81288 Nov  4 01:00 powercom
    -r-xr-xr-x  1 root  wheel   85384 Nov  4 01:00 powerpanel
    -r-xr-xr-x  1 root  wheel   64904 Nov  4 01:00 rhino
    -r-xr-xr-x  1 root  wheel   64872 Nov  4 01:00 richcomm_usb
    -r-xr-xr-x  1 root  wheel   85384 Nov  4 01:00 riello_ser
    -r-xr-xr-x  1 root  wheel   89464 Nov  4 01:00 riello_usb
    -r-xr-xr-x  1 root  wheel   64888 Nov  4 01:00 safenet
    -r-xr-xr-x  1 root  wheel   52568 Nov  4 01:00 skel
    -r-xr-xr-x  1 root  wheel  187880 Nov  4 01:00 snmp-ups
    -r-xr-xr-x  1 root  wheel   89480 Nov  4 01:00 solis
    -r-xr-xr-x  1 root  wheel   64904 Nov  4 01:00 tripplite
    -r-xr-xr-x  1 root  wheel   85384 Nov  4 01:00 tripplite_usb
    -r-xr-xr-x  1 root  wheel   77176 Nov  4 01:00 tripplitesu
    -r-xr-xr-x  1 root  wheel   93576 Nov  4 01:00 upscode2
    -r-xr-xr-x  1 root  wheel  191848 Nov  4 01:00 usbhid-ups
    -r-xr-xr-x  1 root  wheel   68984 Nov  4 01:00 victronups
    [tykling@container1 ~]$
Each of these is an executable, and they all have their own manpages and options. Here is the -h output from the driver I ended up using:
    [tykling@container1 ~]$ /usr/local/libexec/nut/usbhid-ups -h
    Network UPS Tools - Generic HID driver 0.41 (2.7.4)
    USB communication driver 0.33

    usage: usbhid-ups -a <id> [OPTIONS]
      -a <id>        - autoconfig using ups.conf section <id>
                     - note: -x after -a overrides ups.conf settings

      -V             - print version, then exit
      -L             - print parseable list of driver variables
      -D             - raise debugging level
      -q             - raise log level threshold
      -h             - display this help
      -k             - force shutdown
      -i <int>       - poll interval
      -r <dir>       - chroot to <dir>
      -u <user>      - switch to <user> (if started as root)
      -x <var>=<val> - set driver variable <var> to <val>
                     - example: -x cable=940-0095B

    Acceptable values for -x or ups.conf in this driver:

    Set low battery level, in % (default=30). : -x lowbatt=<value>
    Set shutdown delay, in seconds (default=20) : -x offdelay=<value>
    Set startup delay, in seconds (default=30) : -x ondelay=<value>
    Set polling frequency, in seconds, to reduce data flow (default=30) : -x pollfreq=<value>
    Don't use interrupt pipe, only use polling : -x pollonly
    Regular expression to match UPS Manufacturer string : -x vendor=<value>
    Regular expression to match UPS Product string : -x product=<value>
    Regular expression to match UPS Serial number : -x serial=<value>
    Regular expression to match UPS Manufacturer numerical ID (4 digits hexadecimal) : -x vendorid=<value>
    Regular expression to match UPS Product numerical ID (4 digits hexadecimal) : -x productid=<value>
    Regular expression to match USB bus name : -x bus=<value>
    Force redundant call to usb_set_altinterface() (value=bAlternateSetting; default=0) : -x usb_set_altinterface=<value>
    Diagnostic matching of unsupported UPS : -x explore
    Activate tweak for buggy APC Back-UPS firmware : -x maxreport
    Don't use polling, only use interrupt pipe : -x interruptonly
    Number of bytes to read from interrupt pipe : -x interruptsize=<value>
    [tykling@container1 ~]$
I use no special -x foo=bar options to speak to the UPS; the driver just works with the following ups.conf file:
    [tykling@container1 ~]$ sudo cat /usr/local/etc/nut/ups.conf
    user = "root"

    [bhups1]
        driver = usbhid-ups
        port = /dev/powerwalker-ups
        desc = "PowerWalker Basic VI 600 STL"
    [tykling@container1 ~]$
One thing that was a bit annoying while testing different drivers was that the drivers actually read ups.conf to find the id/entry specified with -a and then complain if another driver is specified in the file. So to test a new driver, first change the driver name in the file, then call the driver directly with -DDDD to get some debug output.
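The edit-then-run loop described above can be wrapped in a small helper. This is only a sketch under assumptions: set_driver and try_driver_cmd are hypothetical names, the sed expression assumes a ups.conf with a single UPS section (it rewrites every driver line it finds), and the demo works on a scratch file rather than the real config.

```shell
#!/bin/sh
# Hypothetical helper: point every "driver = ..." line in a ups.conf at a new driver.
# Assumes a single UPS section, like the config in this post.
set_driver() {
    conf=$1
    driver=$2
    tmp=$(mktemp) || return 1
    sed "s|^\([[:space:]]*driver[[:space:]]*=[[:space:]]*\).*|\1${driver}|" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Print (not run) the debug invocation: $1 = driver name, $2 = ups.conf section id.
try_driver_cmd() {
    echo "/usr/local/libexec/nut/$1 -a $2 -DDDD"
}

# Demo on a scratch copy, so nothing touches /usr/local/etc/nut/ups.conf.
scratch=$(mktemp)
printf '[bhups1]\n    driver = blazer_usb\n    port = /dev/powerwalker-ups\n' > "$scratch"
set_driver "$scratch" usbhid-ups
cat "$scratch"
try_driver_cmd usbhid-ups bhups1
rm -f "$scratch"
```

Printing the driver command instead of running it keeps the sketch harmless on a machine without a UPS attached; on the real system the printed line would be run as root.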
You may notice that the USB device name in ups.conf is /dev/powerwalker-ups and not /dev/ttyUx like you might expect. This is thanks to the following devd.conf snippet:
    [tykling@container1 ~]$ cat /usr/local/etc/devd/symlinkups.conf
    notify 100 {
        match "system" "USB";
        match "subsystem" "DEVICE";
        match "type" "ATTACH";
        match "vendor" "0x0764";
        match "product" "0x0601";
        action "rm -f /dev/powerwalker-ups && ln -s /dev/$cdev /dev/powerwalker-ups";
    };
    [tykling@container1 ~]$
This means I always have the UPS available with the same device name regardless of the number of connected serial USB devices, and the order they were enumerated in.
upsd is started by the rc.d script after enabling it in rc.conf:
    nut_enable="YES"
    nut_upsshut="YES"
The nut_upsshut="YES" bit makes the rc.d script shut down UPS power when needed. More on that later.
As mentioned, the upsd.conf defaults were fine for me, so the file is empty. I've defined two users, an all-powerful admin and a less powerful user used by upsmon:
    [tykling@container1 ~]$ sudo cat /usr/local/etc/nut/upsd.conf
    [tykling@container1 ~]$ sudo cat /usr/local/etc/nut/upsd.users
    [bhupsadmin]
        password = hunter12
        actions = SET
        instcmds = ALL

    [upsmon]
        password = hunter12
        upsmon master
    [tykling@container1 ~]$
The options are described in upsd.users(5) but basically actions permit different actions in upsd and instcmds permit different commands to be run with upscmd. The SET action permits setting variables in the UPS, and the ALL instcmd permits all commands the UPS supports.
The upsmon master bit grants a collection of permissions suitable for a upsmon user. The advice is to use this rather than trying to manually match the permissions upsmon needs to work.
The upsmon user credentials are then included in upsmon.conf, enabling upsmon to speak to upsd.
If all I wanted to do was to manually speak to the UPS, query/change settings on it, and monitor battery state etc. I would be all set now. But if I want the system to be shut down nicely in case of a power outage which lasts longer than the battery, then I need upsmon too.
upsmon has a config file at /usr/local/etc/nut/upsmon.conf and is enabled by adding nut_upsmon_enable="YES" to /etc/rc.conf. After some fiddling my config file looks like this:
    [tykling@container1 ~]$ sudo cat /usr/local/etc/nut/upsmon.conf
    MONITOR bhups1@localhost 1 upsmon hunter12 master
    SHUTDOWNCMD "/sbin/shutdown -h now"
    NOTIFYCMD /usr/local/bin/upsmon_notify_command.sh
    DEADTIME 15
    FINALDELAY 60
    POWERDOWNFLAG /etc/killpower
    NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
    NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
    NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
    NOTIFYFLAG FSD SYSLOG+WALL+EXEC
    NOTIFYFLAG COMMOK SYSLOG+WALL+EXEC
    NOTIFYFLAG COMMBAD SYSLOG+WALL+EXEC
    NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC
    NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
    NOTIFYFLAG NOCOMM SYSLOG+WALL+EXEC
    [tykling@container1 ~]$
The MONITOR line defines the UPS to be monitored:
- The name of the UPS (as defined in ups.conf) is bhups1, and the host running the upsd process responsible for managing this UPS is localhost, so bhups1@localhost.
- 1 defines how many power supplies on this system are powered by this UPS. Big systems might have 2 or 4 power supplies, and may have several UPS devices. In my case I have 1 UPS powering 1 PSU on this system.
- upsmon and hunter12 are the username and password of the upsd.users user with the upsmon privileges I defined earlier.
- master defines the relationship with this UPS. This can be either master or slave, the former meaning that this system is drawing power from as well as controlling this UPS, the latter meaning that this system is drawing power from this UPS, but a different system is controlling it (for cases where multiple servers are powered by the same UPS).
The SHUTDOWNCMD line defines what to do to shut down this system in low battery situations. It doesn't really matter if I use -h or -p here, because the system will never get as far as actually powering off: the UPS will have cut the power by then. More on that later.
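To make the field order of the MONITOR line concrete, here is a small sketch that splits such a line into named parts. parse_monitor is a hypothetical helper, not part of NUT, and falling back to localhost when no @host part is given is an assumption on my part.

```shell
#!/bin/sh
# Hypothetical helper: split a MONITOR line from upsmon.conf into named fields.
parse_monitor() {
    # Word-split the line into positional parameters (unquoted on purpose).
    set -- $1
    if [ "$#" -ne 6 ] || [ "$1" != "MONITOR" ]; then
        echo "not a MONITOR line" >&2
        return 1
    fi
    system=$2
    ups=${system%%@*}
    case "$system" in
        *@*) host=${system#*@} ;;
        *)   host=localhost ;;   # assumption: no @host part means localhost
    esac
    # The password (field 5) is deliberately not printed.
    printf 'ups=%s host=%s powervalue=%s user=%s type=%s\n' "$ups" "$host" "$3" "$4" "$6"
}

parse_monitor "MONITOR bhups1@localhost 1 upsmon hunter12 master"
```

Run against the line from my config this prints the UPS name, host, power value, user and relationship as key=value pairs, which is handy when sanity-checking a config from a script.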
The NOTIFYCMD line defines the script to run when using the flag EXEC in one or more NOTIFYFLAG lines. This tripped me up for a while because the default config has NOTIFYCMD set to /usr/bin/logger and I did see the messages in syslog. Turns out this was not because of NOTIFYCMD being run, it was because the default NOTIFYFLAG is SYSLOG+WALL meaning the message did make it to syslog, just not because of my NOTIFYCMD. So if you do set a NOTIFYCMD remember to also set NOTIFYFLAG to include EXEC for the events where you wish to run NOTIFYCMD. This is my /usr/local/bin/upsmon_notify_command.sh notify script:
    [tykling@container1 ~]$ cat /usr/local/bin/upsmon_notify_command.sh
    #!/bin/sh
    # $UPSNAME is set in the environment by upsmon
    (echo "Notify from upsmon@$(/bin/hostname): $*" &&
        /bin/echo &&
        /bin/echo "This email generated by $0 at $(/bin/date)" &&
        /bin/echo &&
        /bin/echo "output from upsc ${UPSNAME}:" &&
        /usr/local/bin/upsc ${UPSNAME}) | /usr/bin/mail -s "upsmon@$(/bin/hostname): $*" sysadm@bornhack.org
    [tykling@container1 ~]$
This results in an email like this arriving whenever upsmon detects an event:
    Return-Path:
    Delivered-To: sysadm@bornhack.org
    Received: from mail.bornhack.org ([85.235.250.93])
        by imap2.servers.bornhack.org with LMTP id z7fXKIV5wl/NcQAA+yNRXw
        (envelope-from ) for ; Sat, 28 Nov 2020 16:23:33 +0000
    Received: (nullmailer pid 73726 invoked by uid 66);
        Sat, 28 Nov 2020 16:23:32 -0000
    DKIM-Filter: OpenDKIM Filter v2.10.3 mail.bornhack.org 4E6382CAF
    Authentication-Results: mail.bornhack.org; dkim=none
    To: sysadm@bornhack.org
    Subject: upsmon@container1.servers.bornhack.org: UPS bhups1@localhost on battery
    Date: Sat, 28 Nov 2020 16:23:32 +0000
    Message-Id: <1606580612.444150.73246.nullmailer@container1.servers.bornhack.org>
    From:

    Notify from upsmon@container1.servers.bornhack.org: UPS bhups1@localhost on battery

    This email generated by /usr/local/bin/upsmon_notify_command.sh at Sat Nov 28 16:23:32 UTC 2020

    output from upsc bhups1@localhost:
    battery.charge: 100
    battery.charge.low: 10
    battery.charge.warning: 20
    battery.mfr.date: 1
    battery.runtime: 3600
    battery.runtime.low: 300
    battery.type: PbAcid
    battery.voltage: 13.6
    battery.voltage.nominal: 12
    device.mfr: 1
    device.model: 600
    device.serial:
    device.type: ups
    driver.name: usbhid-ups
    driver.parameter.pollfreq: 30
    driver.parameter.pollinterval: 2
    driver.parameter.port: /dev/powerwalker-ups
    driver.parameter.synchronous: no
    driver.version: 2.7.4
    driver.version.data: CyberPower HID 0.4
    driver.version.internal: 0.41
    input.transfer.high: 290
    input.transfer.low: 162
    input.voltage: 235.7
    input.voltage.nominal: 230
    output.voltage: 235.7
    ups.beeper.status: enabled
    ups.delay.shutdown: 20
    ups.delay.start: 30
    ups.load: 0
    ups.mfr: 1
    ups.model: 600
    ups.productid: 0601
    ups.realpower.nominal: 360
    ups.serial:
    ups.status: OB DISCHRG
    ups.timer.shutdown: -60
    ups.timer.start: -60
    ups.vendorid: 0764
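Besides UPSNAME, upsmon also sets NOTIFYTYPE in the environment of the notify command (ONBATT, ONLINE and so on, the same names used on the NOTIFYFLAG lines). A variation of the notify script could use it to tag the subject line so the mails are easier to filter. A minimal sketch: subject_tag and build_subject are hypothetical helpers, and the grouping into CRITICAL/WARNING/OK is an arbitrary choice for illustration, not something NUT defines.

```shell
#!/bin/sh
# Hypothetical helper: map upsmon's NOTIFYTYPE to a subject tag.
# The grouping below is an arbitrary choice for illustration.
subject_tag() {
    case "$1" in
        ONBATT|LOWBATT|FSD|SHUTDOWN) echo "[CRITICAL]" ;;
        COMMBAD|NOCOMM|REPLBATT)     echo "[WARNING]" ;;
        ONLINE|COMMOK)               echo "[OK]" ;;
        *)                           echo "[UNKNOWN]" ;;
    esac
}

# Build a subject line: $1 = NOTIFYTYPE, $2 = hostname, $3 = message text from upsmon.
build_subject() {
    printf '%s upsmon@%s: %s\n' "$(subject_tag "$1")" "$2" "$3"
}

# In a real notify script the arguments would be "$NOTIFYTYPE", "$(/bin/hostname)" and "$*".
build_subject ONBATT container1 "UPS bhups1@localhost on battery"
```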
My notify script calls upsc so the email contains some info, but usually you'd want to log onto the system to check what is happening of course. The flags in the ups.status line define the current state: OB and DISCHRG mean "on battery" and "discharging", respectively. I just use the battery charge percent and line input voltage to alert on though.
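For reading ups.status at a glance, a tiny decoder can help. This is a sketch only: describe_status is a hypothetical helper, and apart from OB, DISCHRG and OL (which show up in the output in this post) the flags are taken from the NUT documentation and cover just the common battery-related ones.

```shell
#!/bin/sh
# Hypothetical helper: turn the space-separated ups.status flags into words.
describe_status() {
    out=""
    for flag in $1; do          # unquoted on purpose: split on spaces
        case "$flag" in
            OL)      desc="on line power" ;;
            OB)      desc="on battery" ;;
            LB)      desc="low battery" ;;
            RB)      desc="replace battery" ;;
            CHRG)    desc="charging" ;;
            DISCHRG) desc="discharging" ;;
            *)       desc="unknown flag $flag" ;;
        esac
        out="${out:+$out, }$desc"
    done
    echo "$out"
}

describe_status "OB DISCHRG"   # prints: on battery, discharging
```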
The primary job for upsmon (apart from sending notifications to keep us updated on power stuff) is to shut down the system gracefully when there is no power and the battery is running low. This is not as simple as it sounds though, because you want to make sure the server(s) come back up when the power returns (regardless of when it happens, power might come back during shutdown for example!).
We (the good people of this planet) have two main types of power supplies: the normal ATX power supplies like most servers have, and the simpler ones like the APU has, which is just a little wall wart and a barrel plug. ATX power supplies are usually configured for "last known state" meaning an ATX PSU will turn itself and the server on when it gets power, if it was on when power was lost. The simpler power supplies do not have this feature; there is no way to make them turn on after a shutdown -p now short of cycling power to the PSU.
Keeping all this in mind the ideal way for a low battery scenario to play out is to make the operating system shut down gracefully, and once all disk drives are synced make the UPS cut power to the outputs in a way that turns them on again when line power returns. This means ATX power supplies as well as simpler ones both turn on as intended when the power returns to the UPS, and the UPS turns on the output load again.
As you can probably imagine this can be a bit tricky to get right, but it does work great once configured. The order of events in a typical power outage is like this:
1. Line power goes away and upsmon registers an ONBATT event which triggers my notify script sending an email, logs to syslog, and sends out a wall to all logged-in users. The system keeps running as normal until battery.charge is equal to or lower than battery.charge.low, triggering a LOWBATT event and the beginning of the low battery shutdown scenario.
2. upsmon waits for FINALDELAY seconds, 60 in this case, and then initiates a system shutdown after writing the POWERDOWNFLAG file /etc/killpower. This file is later used to tell the UPS to turn off power to the load, effectively turning the server off during the shutdown process.
3. SHUTDOWNCMD is executed, initiating the normal OS shutdown sequence, beginning with /etc/rc.shutdown which in turn calls all the rc.d scripts with the faststop argument, shutting everything down nicely.
4. One of those scripts is /usr/local/etc/rc.d/nut which contains the following poststop() function which makes the UPS kill power to the system:

        nut_poststop() {
            if ${nut_prefix}/sbin/upsdrvctl stop && checkyesno nut_upsshut; then
                if ${nut_prefix}/sbin/upsmon -K; then
                    ${nut_prefix}/sbin/upsdrvctl shutdown
                fi
            fi
        }

   - It runs upsdrvctl stop to stop the UPS drivers […] upsd is still running.
   - It checks that nut_upsshut is set to YES in /etc/rc.conf.
   - It runs /usr/local/sbin/upsmon -K which checks for the existence of the POWERDOWNFLAG file and sets the exit code 0 if it exists, exit code 1 if it does not:

            -K checks POWERDOWNFLAG, sets exit code to 0 if set

   - If the POWERDOWNFLAG file exists it then executes /usr/local/sbin/upsdrvctl shutdown which makes the UPS turn off the output power after waiting ups.delay.shutdown seconds, 20 in this case. This value should be long enough to allow the OS to finish any remaining shutdown tasks; the UPS default of 20 seconds was enough here.
5. When line power returns the UPS turns the output back on, the POWERDOWNFLAG file is deleted and everything boots up normally. The UPS is charging the battery and at some point it will be back to 100% and everything is back to normal.

I tested the various events and the full shutdown procedure a few times to make sure it works as intended. Make sure you do the same!
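The decision nut_poststop makes at the end of the shutdown can be modelled without touching a real UPS. The following is a simplified dry-run sketch: should_cut_power is a hypothetical function, it only checks that the flag file exists (which is how this post describes upsmon -K), and it only accepts YES where the rc.subr checkyesno function accepts a few more spellings.

```shell
#!/bin/sh
# Dry-run model of the nut_poststop decision; nothing here talks to a UPS.
# $1 = path of the POWERDOWNFLAG file, $2 = value of nut_upsshut
should_cut_power() {
    case "$2" in
        [Yy][Ee][Ss]) ;;        # simplified stand-in for checkyesno
        *) return 1 ;;
    esac
    [ -f "$1" ]                 # simplified stand-in for "upsmon -K"
}

flag=$(mktemp)                  # stands in for /etc/killpower
if should_cut_power "$flag" YES; then
    echo "flag present: would run upsdrvctl shutdown"
fi
rm -f "$flag"
if ! should_cut_power "$flag" YES; then
    echo "flag missing: UPS output stays on"
fi
```

The second case is the one that matters for ordinary reboots: without the flag file a normal shutdown leaves the UPS output alone.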
The next few sections cover the tools of the trade, upsc, upscmd, and upsrw.
The upsc program is used to display variables (including metrics) from the UPS. It has a couple of different modes of operation. It doesn't require root or authentication with upsd since it only does read-only operations.
    [tykling@container1 ~]$ upsc -h
    Network UPS Tools upsc 2.7.4

    usage: upsc -l | -L [<hostname>[:port]]
           upsc <ups> [<variable>]
           upsc -c <ups>

    Demo program to display UPS variables.

    First form (lists UPSes):
      -l         - lists each UPS on <hostname>, one per line.
      -L         - lists each UPS followed by its description (from ups.conf).
                   Default hostname: localhost

    Second form (lists variables and values):
      <ups>      - upsd server, <upsname>[@<hostname>[:<port>]] form
      <variable> - optional, display this variable only.
                   Default: list all variables for <host>

    Third form (lists clients connected to a device):
      -c         - lists each client connected on <ups>, one per line.
      <ups>      - upsd server, <upsname>[@<hostname>[:<port>]] form
    [tykling@container1 ~]$
It can list the UPS devices upsd knows about, with or without description:

    [tykling@container1 ~]$ upsc -l
    bhups1
    [tykling@container1 ~]$ upsc -L
    bhups1: PowerWalker Basic VI 600 STL
    [tykling@container1 ~]$

It can also show connected upsd clients:

    [tykling@container1 ~]$ upsc -c bhups1
    ::1
    [tykling@container1 ~]$

Finally and most importantly it can show all the variables in a UPS:

    [tykling@container1 ~]$ upsc bhups1
    battery.charge: 100
    battery.charge.low: 10
    battery.charge.warning: 20
    battery.mfr.date: 1
    battery.runtime: 3600
    battery.runtime.low: 300
    battery.type: PbAcid
    battery.voltage: 13.6
    battery.voltage.nominal: 12
    device.mfr: 1
    device.model: 600
    device.serial:
    device.type: ups
    driver.name: usbhid-ups
    driver.parameter.pollfreq: 30
    driver.parameter.pollinterval: 2
    driver.parameter.port: /dev/powerwalker-ups
    driver.parameter.synchronous: no
    driver.version: 2.7.4
    driver.version.data: CyberPower HID 0.4
    driver.version.internal: 0.41
    input.transfer.high: 290
    input.transfer.low: 162
    input.voltage: 235.7
    input.voltage.nominal: 230
    output.voltage: 235.7
    ups.beeper.status: enabled
    ups.delay.shutdown: 20
    ups.delay.start: 30
    ups.load: 0
    ups.mfr: 1
    ups.model: 600
    ups.productid: 0601
    ups.realpower.nominal: 360
    ups.serial:
    ups.status: OL
    ups.timer.shutdown: -60
    ups.timer.start: -60
    ups.vendorid: 0764
    [tykling@container1 ~]$
Which variables are available depends on the UPS and on the driver in use. A more expensive and advanced UPS would have a (non-0) value for load, possibly even one per output plug, and maybe more stuff. I used upsc a lot while playing around with the UPS and testing stuff, and it is also used by the Prometheus exporter I am using. More on that later.
The upscmd command is used to run commands on the UPS itself.
    [tykling@container1 ~]$ upscmd -h
    Network UPS Tools upscmd 2.7.4

    usage: upscmd [-h]
           upscmd [-l <ups>]
           upscmd [-u <username>] [-p <password>] <ups> <command> [<value>]

    Administration program to initiate instant commands on UPS hardware.

      -h              display this help text
      -l <ups>        show available commands on UPS <ups>
      -u <username>   set username for command authentication
      -p <password>   set password for command authentication

      <ups>           UPS identifier - <upsname>[@<hostname>[:<port>]]
      <command>       Valid instant command - test.panel.start, etc.
      [<value>]       Additional data for command - number of seconds, etc.
    [tykling@container1 ~]$
Supported commands vary depending on the UPS model in use. upscmd can list the supported commands:

    [tykling@container1 ~]$ upscmd -l bhups1
    Instant commands supported on UPS [bhups1]:

    beeper.disable - Disable the UPS beeper
    beeper.enable - Enable the UPS beeper
    beeper.mute - Temporarily mute the UPS beeper
    beeper.off - Obsolete (use beeper.disable or beeper.mute)
    beeper.on - Obsolete (use beeper.enable)
    load.off - Turn off the load immediately
    load.off.delay - Turn off the load with a delay (seconds)
    load.on - Turn on the load immediately
    load.on.delay - Turn on the load with a delay (seconds)
    shutdown.return - Turn off the load and return when power is back
    shutdown.stayoff - Turn off the load and remain off
    shutdown.stop - Stop a shutdown in progress
    test.battery.start.deep - Start a deep battery test
    test.battery.start.quick - Start a quick battery test
    test.battery.stop - Stop the battery test
    [tykling@container1 ~]$

I've mostly used upscmd bhups1 beeper.mute when testing power outages, because the beeping was driving me a bit crazy. When running commands upscmd will prompt for upsd credentials before running the command, like so:

    [tykling@container1 ~]$ upscmd bhups1 beeper.disable
    Username (tykling): bhupsadmin
    Password:
    OK
    [tykling@container1 ~]$
The upsrw command is used to set/change variables inside the UPS. Basic usage:
    [tykling@container1 ~]$ upsrw -h
    Network UPS Tools upsrw 2.7.4

    usage: upsrw [-h]
           upsrw [-s <variable>] [-u <username>] [-p <password>] <ups>

    Demo program to set variables within UPS hardware.

      -h              display this help text
      -s <variable>   specify variable to be changed
                      use -s VAR=VALUE to avoid prompting for value
      -u <username>   set username for command authentication
      -p <password>   set password for command authentication

      <ups>           UPS identifier - <upsname>[@<hostname>[:<port>]]

    Call without -s to show all possible read/write variables.
    [tykling@container1 ~]$
Like upscmd, upsrw will prompt for credentials since changing variables in the UPS is for grownups only.
I was a bit pressed for time when I got around to extracting metrics from the UPS. I searched around and found a few different options, and went with a Prometheus exporter. The output from https://github.com/DRuggeri/nut_exporter looks like this:
    [tykling@container1 ~]$ fetch -qo - http://127.0.0.1:9199/ups_metrics
    # HELP nut_battery_charge Value of the battery.charge variable from Network UPS Tools
    # TYPE nut_battery_charge gauge
    nut_battery_charge 100
    # HELP nut_battery_charge_low Value of the battery.charge.low variable from Network UPS Tools
    # TYPE nut_battery_charge_low gauge
    nut_battery_charge_low 10
    # HELP nut_battery_charge_warning Value of the battery.charge.warning variable from Network UPS Tools
    # TYPE nut_battery_charge_warning gauge
    nut_battery_charge_warning 20
    # HELP nut_battery_mfr_date Value of the battery.mfr.date variable from Network UPS Tools
    # TYPE nut_battery_mfr_date gauge
    nut_battery_mfr_date 1
    # HELP nut_battery_runtime Value of the battery.runtime variable from Network UPS Tools
    # TYPE nut_battery_runtime gauge
    nut_battery_runtime 3600
    # HELP nut_battery_runtime_low Value of the battery.runtime.low variable from Network UPS Tools
    # TYPE nut_battery_runtime_low gauge
    nut_battery_runtime_low 300
    # HELP nut_battery_voltage Value of the battery.voltage variable from Network UPS Tools
    # TYPE nut_battery_voltage gauge
    nut_battery_voltage 13.6
    # HELP nut_battery_voltage_nominal Value of the battery.voltage.nominal variable from Network UPS Tools
    # TYPE nut_battery_voltage_nominal gauge
    nut_battery_voltage_nominal 12
    # HELP nut_device_info UPS Device information
    # TYPE nut_device_info gauge
    nut_device_info{contact="",description="",location="",macaddr="",mfr="1",model="600",part="",serial="",type="ups"} 1
    # HELP nut_device_mfr Value of the device.mfr variable from Network UPS Tools
    # TYPE nut_device_mfr gauge
    nut_device_mfr 1
    # HELP nut_device_model Value of the device.model variable from Network UPS Tools
    # TYPE nut_device_model gauge
    nut_device_model 600
    # HELP nut_driver_parameter_pollfreq Value of the driver.parameter.pollfreq variable from Network UPS Tools
    # TYPE nut_driver_parameter_pollfreq gauge
    nut_driver_parameter_pollfreq 30
    # HELP nut_driver_parameter_pollinterval Value of the driver.parameter.pollinterval variable from Network UPS Tools
    # TYPE nut_driver_parameter_pollinterval gauge
    nut_driver_parameter_pollinterval 2
    # HELP nut_driver_version_internal Value of the driver.version.internal variable from Network UPS Tools
    # TYPE nut_driver_version_internal gauge
    nut_driver_version_internal 0.41
    # HELP nut_input_transfer_high Value of the input.transfer.high variable from Network UPS Tools
    # TYPE nut_input_transfer_high gauge
    nut_input_transfer_high 290
    # HELP nut_input_transfer_low Value of the input.transfer.low variable from Network UPS Tools
    # TYPE nut_input_transfer_low gauge
    nut_input_transfer_low 162
    # HELP nut_input_voltage Value of the input.voltage variable from Network UPS Tools
    # TYPE nut_input_voltage gauge
    nut_input_voltage 239.7
    # HELP nut_input_voltage_nominal Value of the input.voltage.nominal variable from Network UPS Tools
    # TYPE nut_input_voltage_nominal gauge
    nut_input_voltage_nominal 230
    # HELP nut_output_voltage Value of the output.voltage variable from Network UPS Tools
    # TYPE nut_output_voltage gauge
    nut_output_voltage 239.7
    # HELP nut_ups_delay_shutdown Value of the ups.delay.shutdown variable from Network UPS Tools
    # TYPE nut_ups_delay_shutdown gauge
    nut_ups_delay_shutdown 20
    # HELP nut_ups_delay_start Value of the ups.delay.start variable from Network UPS Tools
    # TYPE nut_ups_delay_start gauge
    nut_ups_delay_start 30
    # HELP nut_ups_load Value of the ups.load variable from Network UPS Tools
    # TYPE nut_ups_load gauge
    nut_ups_load 0
    # HELP nut_ups_mfr Value of the ups.mfr variable from Network UPS Tools
    # TYPE nut_ups_mfr gauge
    nut_ups_mfr 1
    # HELP nut_ups_model Value of the ups.model variable from Network UPS Tools
    # TYPE nut_ups_model gauge
    nut_ups_model 600
    # HELP nut_ups_productid Value of the ups.productid variable from Network UPS Tools
    # TYPE nut_ups_productid gauge
    nut_ups_productid 601
    # HELP nut_ups_realpower_nominal Value of the ups.realpower.nominal variable from Network UPS Tools
    # TYPE nut_ups_realpower_nominal gauge
    nut_ups_realpower_nominal 360
    # HELP nut_ups_timer_shutdown Value of the ups.timer.shutdown variable from Network UPS Tools
    # TYPE nut_ups_timer_shutdown gauge
    nut_ups_timer_shutdown -60
    # HELP nut_ups_timer_start Value of the ups.timer.start variable from Network UPS Tools
    # TYPE nut_ups_timer_start gauge
    nut_ups_timer_start -60
    # HELP nut_ups_vendorid Value of the ups.vendorid variable from Network UPS Tools
    # TYPE nut_ups_vendorid gauge
    nut_ups_vendorid 764
    # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
    # TYPE promhttp_metric_handler_requests_in_flight gauge
    promhttp_metric_handler_requests_in_flight 1
    # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
    # TYPE promhttp_metric_handler_requests_total counter
    promhttp_metric_handler_requests_total{code="200"} 0
    promhttp_metric_handler_requests_total{code="500"} 0
    promhttp_metric_handler_requests_total{code="503"} 0
    [tykling@container1 ~]$
With these metrics I have all the info I need to make a nice dashboard in Grafana and some rules in Alertmanager, although upsmon also sends email notifications.
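For a quick look at a single metric without going through Prometheus, the exporter output can be filtered in the shell. A small sketch: metric_value is a hypothetical helper that only handles metrics without labels, and the demo uses a canned sample instead of fetching from the exporter.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one unlabelled metric from
# Prometheus text exposition on stdin. "# HELP" and "# TYPE" lines never
# match because their first field is "#".
metric_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Canned sample; on the real system the input would come from:
#   fetch -qo - http://127.0.0.1:9199/ups_metrics
sample='# HELP nut_battery_charge Value of the battery.charge variable from Network UPS Tools
# TYPE nut_battery_charge gauge
nut_battery_charge 100
# TYPE nut_input_voltage gauge
nut_input_voltage 239.7'

printf '%s\n' "$sample" | metric_value nut_battery_charge
printf '%s\n' "$sample" | metric_value nut_input_voltage
```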
I am pretty happy with the result! The UPS can power the whole setup (including the APU itself, the USB hub, two webcams, two mobile modems and a bunch of SparkFun QWIIC sensors) for just over 3 hours. The 600 VA rating of the UPS is clearly more than enough for this job, even though it was the smallest UPS model I could find. The graphs are very nice, and the email notifications make it easy for us to differentiate between a power outage and some other issue.
Given the very low price of the UPS and since I now know how to do it, and since I have the Ansible roles ready to go, I will absolutely be adding UPSes to more systems in the future :)
I've recently signed up for GitHub Sponsors, meaning it is now easy to sponsor me and my work. If this post or some of my other writing, software or services have helped you then you can consider becoming a sponsor.