by Tykling
02. nov 2020 22:27 UTC
I recently had to configure a computer for environmental monitoring in the 20-foot shipping container we use for BornHack storage. The container sits outside in the wind and rain all year. This means we've had to fit it with a dehumidifier to make sure the moisture doesn't destroy all our stuff. Being nerds, we want graphs to tell us about the temperature and humidity levels in the container, but since it sits in a place with no Internet, the first challenge was to get online.
The solution was an APU3 with a couple of LTE modems and a whole bunch of sensors attached. Configuring the modems and connectivity meant refamiliarising myself with AT commands and ppp, and considering how to handle routing and networking on the APU.
All of BornHack's infrastructure is managed through Ansible, and the container APU is going to be a permanent part of the BornHack infrastructure. This means I need to be able to reach it from our Ansible server, and our Prometheus server needs to be able to connect to it. Both of these present a challenge, since Carrier Grade NAT on LTE providers makes it impossible to connect in to the server's SSH or other services.
We don't currently have a VPN as part of BornHack's infrastructure, meaning there is no obvious way to get a TCP connection into the container APU, even if it is online. Any thoughts of putting up a few reverse SSH tunnels were cast aside. Instead I decided early on to use Tor Onion Services as a poor man's VPN. It works really well and saved a lot of hassle. YMMV.
This post is not really about Tor though, it is mainly about configuring ppp and about the wonderful world of AT commands anno 2020. Since the final solution uses multiple Forwarding Information Bases or FIBs (multiple routing tables) I will also discuss the configuration and merits of such a setup.
The computer is an APU3C4 from PC Engines, a Swiss company (meaning European, but not in the EU). For this reason I went with a Swedish dealer named Teklager.se which I can highly recommend. They have great technical documentation on their website, and they have competent people on their support email, both of which I thought were things of the past.
The APUs are great little computers. The APU3C4 has 3 ethernet ports and of course a few USB ports. Inside it has GPIO ports for expansion and sensors, and three internal miniPCIe slots:
Quectel EC25-EU modem mounted.
Huawei ME909s-120 modem mounted.
Note: The APU arrived assembled with modems and SSD mounted by Teklager, but I was unable to detect the LTE modem in the first miniPCIe slot (the slot which can also take an mSATA drive). Turns out that EHCI0 Controller needs to be enabled in BIOS before an LTE modem can be used in that miniPCIe slot. Thanks to Teklager support for figuring this one out for me! I spent quite some time scratching my head, but after swapping slots for the modems and then detecting "the other one" I knew both modems worked, so I asked support, they asked PC Engines, and we found the solution. Very nice.
For reference, this is the dmesg from the APU3C4 with both modems detected. The modem-related output (the ugen2.3, ugen1.3, u3g0 and u3g1 lines) is near the end:
[tykling@container1 ~]$ cat /var/run/dmesg.boot
---<<BOOT>>---
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.2-STABLE r367109 GENERIC amd64
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
VT(vga): resolution 640x480
CPU: AMD GX-412TC SOC (998.15-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x730f01 Family=0x16 Model=0x30 Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x3ed8220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x1d4037ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,Topology,PNXC,DBE,PTSC,PL2I>
  Structured Extended Features=0x8<BMI1>
  XSAVE Features=0x1<XSAVEOPT>
  SVM: NP,NRIP,AFlush,DAssist,NAsids=8
  TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 4084944896 (3895 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <COREv4 COREBOOT>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: unblocking device.
ioapic0 <Version 2.1> irqs 0-23 on motherboard
ioapic1 <Version 2.1> irqs 24-55 on motherboard
Launching APs: 2 3 1
Timecounter "TSC" frequency 998148069 Hz quality 1000
random: entropy device external interface
kbd0 at kbdmux0
000.000023 [4336] netmap_init netmap: loaded module
[ath_hal] loaded
module_register_init: MOD_LOAD (vesa, 0xffffffff81116e40, 0) error 19
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS> on motherboard
acpi0: <COREv4 COREBOOT> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
apei0: <ACPI Platform Error Interface> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x818-0x81b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: could not evaluate _ADR - AE_NOT_FOUND
pci0: <ACPI PCI bus> on pcib0
pci0: <base peripheral, IOMMU> at device 0.2 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> irq 25 at device 2.2 on pci0
pcib1: failed to allocate initial I/O port window: 0x1000-0x1fff
pci1: <ACPI PCI bus> on pcib1
igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> mem 0xf7a00000-0xf7a1ffff,0xf7a20000-0xf7a23fff irq 28 at device 0.0 on pci1
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 2 RX queues 2 TX queues
igb0: Using MSI-X interrupts with 3 vectors
igb0: Ethernet address: 00:0d:b9:58:14:64
igb0: netmap queues/slots: TX 2/1024, RX 2/1024
pcib2: <ACPI PCI-PCI bridge> irq 26 at device 2.3 on pci0
pci2: <ACPI PCI bus> on pcib2
igb1: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x2000-0x201f mem 0xf7b00000-0xf7b1ffff,0xf7b20000-0xf7b23fff irq 32 at device 0.0 on pci2
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 2 RX queues 2 TX queues
igb1: Using MSI-X interrupts with 3 vectors
igb1: Ethernet address: 00:0d:b9:58:14:65
igb1: netmap queues/slots: TX 2/1024, RX 2/1024
pcib3: <ACPI PCI-PCI bridge> irq 27 at device 2.4 on pci0
pci3: <ACPI PCI bus> on pcib3
igb2: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3000-0x301f mem 0xf7c00000-0xf7c1ffff,0xf7c20000-0xf7c23fff irq 36 at device 0.0 on pci3
igb2: Using 1024 TX descriptors and 1024 RX descriptors
igb2: Using 2 RX queues 2 TX queues
igb2: Using MSI-X interrupts with 3 vectors
igb2: Ethernet address: 00:0d:b9:58:14:66
igb2: netmap queues/slots: TX 2/1024, RX 2/1024
pci0: <encrypt/decrypt> at device 8.0 (no driver attached)
xhci0: <AMD FCH USB 3.0 controller> mem 0xf7f22000-0xf7f23fff irq 18 at device 16.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
xhci0: Unable to map MSI-X table
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
ahci0: <AMD Hudson-2 AHCI SATA controller> port 0x4010-0x4017,0x4020-0x4023,0x4018-0x401f,0x4024-0x4027,0x4000-0x400f mem 0xf7f25000-0xf7f253ff at device 17.0 on pci0
ahci0: AHCI v1.30 with 2 6Gbps ports, Port Multiplier supported with FBS
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ehci0: <AMD FCH USB 2.0 controller> mem 0xf7f26000-0xf7f260ff irq 18 at device 18.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
usbus1: 480Mbps High Speed USB v2.0
ehci1: <AMD FCH USB 2.0 controller> mem 0xf7f27000-0xf7f270ff irq 18 at device 19.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
usbus2: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
sdhci_pci0: <Generic SD HCI> mem 0xf7f28000-0xf7f280ff at device 20.7 on pci0
sdhci_pci0: 1 slot(s) allocated
acpi_tz0: <Thermal Zone> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0: <ISA Option ROM> at iomem 0xee800-0xeffff pnpid ORM0000 on isa0
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
            to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
ugen0.1: <0x1022 XHCI root HUB> at usbus0
ugen2.1: <AMD EHCI root HUB> at usbus2
ugen1.1: <AMD EHCI root HUB> at usbus1
Trying to mount root from zfs:zroot/ROOT/12-STABLE-367109 []...
uhub0: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
uhub2: Root mount waiting for: usbus0 CAM usbus1 usbus2 <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <WDC WDS500G1R0A-68A4W0 411000WR> ACS-4 ATA SATA 3.x device
ada0: Serial Number 201802A00C74
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors)
uhub0: 4 ports with 4 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus2
ugen2.2: <vendor 0x0438 product 0x7900> at usbus2
uhub3 on uhub1
uhub3: <vendor 0x0438 product 0x7900, class 9/0, rev 2.00/0.18, addr 2> on usbus2
ugen1.2: <vendor 0x0438 product 0x7900> at usbus1
uhub4 on uhub2
uhub4: <vendor 0x0438 product 0x7900, class 9/0, rev 2.00/0.18, addr 2> on usbus1
uhub3: 4 ports with 4 removable, self powered
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1 usbus2
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2
ugen1.3: <Android Android> at usbus1
GEOM_ELI: Device ada0p2.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI: Crypto: hardware
intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
smbus0: <System Management Bus> on intsmb0
lo0: link state changed to UP
lo1: link state changed to UP
u3g0 on uhub3
u3g1 on uhub4
u3g1: <Android Android, class 239/2, rev 2.00/3.18, addr 3> on usbus1
u3g1: Found 5 ports.
u3g0: <Huawei Mobile Connect - Modem> on usbus2
u3g0: Found 5 ports.
pflog0: promiscuous mode enabled
tun0: link state changed to UP
Accounting enabled
[tykling@container1 ~]$
The modems are miniPCIe devices but they show up as USB devices. This is what usbconfig(8) has to say about them:
[tykling@container1 ~]$ sudo usbconfig list
ugen0.1: <0x1022 XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen2.1: <AMD EHCI root HUB> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <AMD EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen2.2: <vendor 0x0438 product 0x7900> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen1.2: <vendor 0x0438 product 0x7900> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)
ugen1.3: <Android Android> at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (500mA)
ugen0.2: <Targus Group Intl Targus Group Intl> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (50mA)
[tykling@container1 ~]$ sudo usbconfig -d ugen2.3 dump_device_desc
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)
  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0200
  bDeviceClass = 0x0000 <Probed by interface class>
  bDeviceSubClass = 0x0000
  bDeviceProtocol = 0x00ff
  bMaxPacketSize0 = 0x0040
  idVendor = 0x12d1
  idProduct = 0x15c1
  bcdDevice = 0x0102
  iManufacturer = 0x0001 <Huawei Technologies Co., Ltd.>
  iProduct = 0x0002 <HUAWEI Mobile V7R11>
  iSerialNumber = 0x0003 <0123456789ABCDEF>
  bNumConfigurations = 0x0003
[tykling@container1 ~]$ sudo usbconfig -d ugen1.3 dump_device_desc
ugen1.3: <Android Android> at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (500mA)
  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0200
  bDeviceClass = 0x00ef <Miscellaneous device>
  bDeviceSubClass = 0x0002
  bDeviceProtocol = 0x0001
  bMaxPacketSize0 = 0x0040
  idVendor = 0x2c7c
  idProduct = 0x0125
  bcdDevice = 0x0318
  iManufacturer = 0x0001 <Android>
  iProduct = 0x0002 <Android>
  iSerialNumber = 0x0000 <no string>
  bNumConfigurations = 0x0001
[tykling@container1 ~]$
Why the Quectel EC25-EU shows up as Android Android I have no idea! I am guessing some device ID is wrong or needs to be added somewhere in the USB stack. Oh well :)
The modems both expose a bunch of serial ports under /dev:
[tykling@container1 ~]$ ls -l /dev/ttyU*
crw------- 1 root wheel 0x65 Nov 1 12:01 /dev/ttyU0.0
crw------- 1 root wheel 0x66 Nov 1 12:01 /dev/ttyU0.0.init
crw------- 1 root wheel 0x83 Nov 1 12:01 /dev/ttyU0.0.lock
crw------- 1 root wheel 0x87 Nov 1 12:01 /dev/ttyU0.1
crw------- 1 root wheel 0x88 Nov 1 12:01 /dev/ttyU0.1.init
crw------- 1 root wheel 0x89 Nov 1 12:01 /dev/ttyU0.1.lock
crw------- 1 root wheel 0x8d Nov 1 12:01 /dev/ttyU0.2
crw------- 1 root wheel 0x8e Nov 1 12:01 /dev/ttyU0.2.init
crw------- 1 root wheel 0x8f Nov 1 12:01 /dev/ttyU0.2.lock
crw------- 1 root wheel 0x93 Nov 2 14:01 /dev/ttyU0.3
crw------- 1 root wheel 0x94 Nov 1 12:01 /dev/ttyU0.3.init
crw------- 1 root wheel 0x95 Nov 1 12:01 /dev/ttyU0.3.lock
crw------- 1 root wheel 0x99 Nov 1 12:01 /dev/ttyU0.4
crw------- 1 root wheel 0x9a Nov 1 12:01 /dev/ttyU0.4.init
crw------- 1 root wheel 0x9b Nov 1 12:01 /dev/ttyU0.4.lock
crw------- 1 root wheel 0x9f Nov 1 12:01 /dev/ttyU1.0
crw------- 1 root wheel 0xa0 Nov 1 12:01 /dev/ttyU1.0.init
crw------- 1 root wheel 0xa1 Nov 1 12:01 /dev/ttyU1.0.lock
crw------- 1 root wheel 0xa5 Nov 1 12:01 /dev/ttyU1.1
crw------- 1 root wheel 0xa6 Nov 1 12:01 /dev/ttyU1.1.init
crw------- 1 root wheel 0xa7 Nov 1 12:01 /dev/ttyU1.1.lock
crw------- 1 root wheel 0xab Nov 2 22:53 /dev/ttyU1.2
crw------- 1 root wheel 0xac Nov 1 12:01 /dev/ttyU1.2.init
crw------- 1 root wheel 0xad Nov 1 12:01 /dev/ttyU1.2.lock
crw------- 1 root wheel 0xb1 Nov 1 12:01 /dev/ttyU1.3
crw------- 1 root wheel 0xb2 Nov 1 12:01 /dev/ttyU1.3.init
crw------- 1 root wheel 0xb3 Nov 1 12:01 /dev/ttyU1.3.lock
crw------- 1 root wheel 0xb7 Nov 1 12:01 /dev/ttyU1.4
crw------- 1 root wheel 0xb8 Nov 1 12:01 /dev/ttyU1.4.init
crw------- 1 root wheel 0xb9 Nov 1 12:01 /dev/ttyU1.4.lock
[tykling@container1 ~]$
The *.init and *.lock devices can be used to set initial and locked/unchangeable settings for the serial ports, respectively. See stty(1) and section 26.2.2, Serial Port Configuration, in the FreeBSD Handbook for more info. Serial clients like screen and tip set the serial port settings themselves, so the *.init and *.lock devices can be ignored for our purposes.
For antennas I initially used the antennas supplied with the LTE modems from Teklager. They work great, but since the computer is going to live inside a container I need antennas with a length of cable so I can drill a couple of holes and mount the antennas outside. Additionally, the container is located in a rural woodland area with very shoddy LTE coverage, so a couple of quality outdoor-rated 4G antennas would likely make a real difference in transfer speeds. I got 2 Macab Pro 1000 directional antennas (90 degrees) with 8-9 dB gain. The SMA connectors fit right on the APU, and 2 meters of cable is perfect for this.
The last time I got my hands dirty with AT commands was very briefly about 10 years ago, when configuring a 3G modem on a FreeBSD laptop I used back then. Before that we have to go all the way back to the last millennium and my awesome US Robotics Sportster 56K in my teenage years. I barely recall any of the AT command foo, so this LTE modem stuff was an opportunity to refresh some of it and see what changed over the last 10-20 years. By the way, I checked, and US Robotics still has a support page for my old teenager modem - impressive :)
Modems basically present a serial port (actually more than one) which you or the ppp software can speak to with a serial program. I usually use screen or tip; the latter is available in FreeBSD base. A typical session might look like this (the commands I typed are AT, ATI and AT+CSQ):
AT
OK
ATI
Manufacturer: Huawei Technologies Co., Ltd.
Model: ME909s-120
Revision: 11.617.15.00.00
IMEI: 123456789012345
+GCAP: +CGSM,+DS,+ES
OK
AT+CSQ
+CSQ: 21,99
OK
The first command is simply AT and the response is OK. This command does nothing; it simply confirms that the serial connection is working and the modem is responsive. Since each of the two modems exposes 5 serial ports, a bit of trial and error is involved in finding the right one. This is where the AT command comes in handy.
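The trial and error can be scripted. This is only a minimal sketch, not what I actually ran: it assumes a port object with `write()` and `read()` methods (for example a pyserial `Serial` opened with a read timeout), and the `FakePort` class and the function names are made up so the example runs without hardware.

```python
def responds_to_at(port) -> bool:
    """Send a bare AT command and report whether the reply contains OK.

    `port` is any object with write(bytes) and read(int) methods. With real
    hardware this could be a pyserial Serial opened with a read timeout
    (an assumption; the post itself uses screen or tip by hand).
    """
    port.write(b"AT\r")
    reply = port.read(64)
    return b"OK" in reply


class FakePort:
    """Hypothetical stand-in for a serial port, for running without a modem."""

    def __init__(self, reply: bytes):
        self._reply = reply
        self.written = b""

    def write(self, data: bytes) -> int:
        self.written += data
        return len(data)

    def read(self, size: int) -> bytes:
        return self._reply[:size]


def find_at_ports(ports: dict) -> list:
    """Return the names of all ports that answered OK, in sorted order."""
    return sorted(name for name, port in ports.items() if responds_to_at(port))


if __name__ == "__main__":
    # Which ports answer here is invented for the demonstration.
    demo = {
        "/dev/ttyU0.0": FakePort(b""),
        "/dev/ttyU0.2": FakePort(b"\r\nOK\r\n"),
    }
    print(find_at_ports(demo))
```

A port that stays silent simply returns an empty read and is skipped, which mirrors what happens when you type AT at the wrong port in tip.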
The second command ATI asks the modem for manufacturer information. The IMEI number has been edited out.
The third command AT+CSQ asks the modem for signal quality. The first number is RSSI (higher is better), the second number is supposed to be BER (lower is better, but BER doesn't appear to be supported on my modems; the value 99 means "not known or not detectable").
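For graphing it helps to turn the +CSQ reply into a real unit. The sketch below uses the mapping from the 3GPP AT command standard (TS 27.007), where RSSI index 0 to 31 corresponds to -113 dBm up to -51 dBm in 2 dB steps and 99 means unknown. The function name `parse_csq` is my own invention, and individual modems may deviate from the standard, so check the vendor AT reference.

```python
import re


def parse_csq(line: str):
    """Parse a '+CSQ: <rssi>,<ber>' reply into (dBm or None, BER or None).

    Mapping per 3GPP TS 27.007: index 0..31 -> -113 + 2 * index dBm,
    and 99 means "not known or not detectable" for either field.
    """
    match = re.fullmatch(r"\+CSQ:\s*(\d+),(\d+)", line.strip())
    if match is None:
        raise ValueError(f"not a +CSQ reply: {line!r}")
    rssi, ber = int(match.group(1)), int(match.group(2))
    if rssi != 99 and not 0 <= rssi <= 31:
        raise ValueError(f"RSSI index out of range: {rssi}")
    dbm = None if rssi == 99 else -113 + 2 * rssi
    return dbm, (None if ber == 99 else ber)


if __name__ == "__main__":
    print(parse_csq("+CSQ: 21,99"))
```

For the session above, an index of 21 works out to -71 dBm, with no BER reported.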
The second and third AT commands are both part of a standard extended AT command set for interacting with mobile (GSM/LTE) modems. The details can vary a bit (well, a lot) between implementations in different modems, especially between different manufacturers. A nice general resource is m2msupport.net, which has good explanations of all the AT commands. But the best thing to do really is to find the AT command reference for the modem in question, since each manufacturer has their own proprietary AT commands in addition to the standard set.
The basic commands to get working internet are the same on the two modems, but advanced troubleshooting stuff sometimes requires a lookup in the AT command reference.
The two modems I use identify as USB devices, and on FreeBSD serial ports from USB devices are named ttyUx, where x is a number starting from 0. The ports are enumerated in the order the hardware and driver discover them, meaning that the port for the first USB serial device is /dev/ttyU0. If the device has multiple ports they are named /dev/ttyU0.0, /dev/ttyU0.1 and so on, as shown above. The device numbers (and thus names/paths) are consistent as long as no new USB serial ports are added to the system.
While I was still configuring all this I happened to plug in an unrelated USB-serial adapter which I needed to establish communication with a UPS device I was trying out. The new USB serial device got the device name /dev/ttyU2 and all was well until I rebooted the machine a few days later and suddenly ppp refused to work because the modem devices changed numbers.
Fortunately it is possible with a bit of devd foo to create symlinks for the devices when they attach, and then use the symlinks as devices, not caring which number they actually got assigned.
The devd software fires actions on various hardware events, which is just what I need here - I want to run the ln command when each of my USB modems is attached (detected).
A few examples from the default /etc/devd.conf:
# Firmware downloader for Atheros AR3011 based USB Bluetooth devices
#attach 100 {
#    match "vendor" "0x0cf3";
#    match "product" "0x3000";
#    action "sleep 2 && /usr/sbin/ath3kfw -d $device-name -f /usr/local/etc/ath3k-1.fw";
#};

# When a USB keyboard arrives, attach it as the console keyboard.
attach 100 {
    device-name "ukbd0";
    action "service syscons setkeyboard /dev/ukbd0";
};
detach 100 {
    device-name "ukbd0";
    action "service syscons setkeyboard /dev/kbd0";
};
The first example (which is commented out) loads some firmware when a device matching the specified vendor and product ID is attached. The next two examples enable and disable a USB keyboard as console keyboard when it is attached and detached.
The manpage for devd.conf has all the details and pretty soon I had a functioning symlinked modem device setup which I am very happy with. devd looks for files under /usr/local/etc/devd so I added the file /usr/local/etc/devd/symlinkmodems.conf with my devd.conf snippet:
[tykling@container1 ~]$ cat /usr/local/etc/devd/symlinkmodems.conf
attach 100 {
    match "device-name" "u3g[0-9]+$";
    match "vendor" "0x2c7c";
    match "product" "0x0125";
    action "ln -s /dev/tty$ttyname.2 /dev/modem-quectel-data";
    action "ln -s /dev/tty$ttyname.3 /dev/modem-quectel-control";
};
attach 100 {
    match "device-name" "u3g[0-9]+$";
    match "vendor" "0x12d1";
    match "product" "0x15c1";
    action "ln -s /dev/tty$ttyname.0 /dev/modem-huawei-data";
    action "ln -s /dev/tty$ttyname.2 /dev/modem-huawei-control";
};
[tykling@container1 ~]$
The first rule means that every time a u3g device is attached matching vendor 0x2c7c and product 0x0125 (I got the IDs from the sudo usbconfig -d ugen1.3 dump_device_desc command I ran earlier) then it runs the two commands specified with the action keyword. This is the Quectel modem and it uses port 2 for data and port 3 for control, so I just symlink those to some new device names which make sense for me. Note that I am using the $ttyname variable made available by devd, which will contain U1 if this USB serial device is getting /dev/ttyU1.* device names.
The second rule is identical, but with the vendor and product IDs for the Huawei modem, and symlinks for port 0 for data and port 2 for control.
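To make the effect of the two rules concrete, here is a small Python model of what they compute. It is an illustration only and not how devd works internally; the `RULES` table and the `symlinks_for` function are hypothetical names, while the IDs, port numbers and link names are the ones from my config.

```python
# (vendor, product) -> {symlink name: port number on that modem}
RULES = {
    ("0x2c7c", "0x0125"): {"modem-quectel-data": 2, "modem-quectel-control": 3},
    ("0x12d1", "0x15c1"): {"modem-huawei-data": 0, "modem-huawei-control": 2},
}


def symlinks_for(vendor: str, product: str, ttyname: str) -> dict:
    """Return {symlink path: target device} for an attached u3g device.

    `ttyname` plays the role of devd's $ttyname variable, e.g. "U1".
    Unknown devices match no rule and get no symlinks.
    """
    ports = RULES.get((vendor, product), {})
    return {
        f"/dev/{link}": f"/dev/tty{ttyname}.{port}"
        for link, port in ports.items()
    }


if __name__ == "__main__":
    # The Huawei modem keeps its symlink names even if it enumerates as U3.
    for link, target in symlinks_for("0x12d1", "0x15c1", "U3").items():
        print(link, "->", target)
```

The point of the design is that only the right-hand side changes when the unit number moves; everything that refers to /dev/modem-* keeps working.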
Testing is made easy by the fact that devd can be stopped with sudo service devd stop and then run by hand with sudo devd -f /etc/devd.conf -d, and then it outputs all the stuff it tries to match and what it does.
After the devd config worked I rebooted the machine and now had these lovely symlinks available on the system:
[tykling@container1 ~]$ ls -l /dev/modem-*
lrwxr-xr-x 1 root wheel 12 Nov 3 15:24 /dev/modem-huawei-control -> /dev/ttyU1.2
lrwxr-xr-x 1 root wheel 12 Nov 3 15:24 /dev/modem-huawei-data -> /dev/ttyU1.0
lrwxr-xr-x 1 root wheel 12 Nov 3 15:24 /dev/modem-quectel-control -> /dev/ttyU0.3
lrwxr-xr-x 1 root wheel 12 Nov 3 15:24 /dev/modem-quectel-data -> /dev/ttyU0.2
[tykling@container1 ~]$
I use the new device names in ppp.conf, in my monitoring, and when speaking manually to the modems via AT commands. It is much easier to remember /dev/modem-huawei-control than /dev/ttyU1.2. Works like a charm, and will keep working no matter how many USB serial devices I attach. The only weakness in this approach is that it would not be possible to differentiate between two identical modems, since it doesn't appear that the $sernum variable with the device serial number is exposed at this time. Anyway, this is not a problem in this setup, since I deliberately purchased two different modems so the server doesn't go offline on both modems due to some stupid firmware bug.
To connect the modems to the Internet I use ppp, which can be configured through /etc/rc.conf and /etc/ppp/ppp.conf. Finally I also created a /etc/ppp/ppp.linkup file with some commands to run after the connection is established.
The format of the ppp.conf file looks a bit fucked because the chat(8) dial command needs lots of backslashes to escape everything. I tried to use !include to put the chat(8) script in an external file, which ppp.conf is supposed to support, but could not get it working, so everything is in ppp.conf.
Since both of my providers need the same settings and chat script I put almost all the configuration in the default: section, so I only have the set device, set ifaddr and add default HISADDR statements in each profile section.
[tykling@container1 ~]$ sudo cat /etc/ppp/ppp.conf
default:
 set log Phase Chat IPCP CCP LCP tun command
 set log local Phase Chat IPCP CCP LCP tun command
 ident user-ppp VERSION
 disable ipv6
 set timeout 0
 set dial "ABORT ERROR \
           ABORT BUSY \
           ABORT NO\\sCARRIER \
           TIMEOUT 5 \"\" \
           AT OK \
           ATH OK \
           AT+CGDCONT=1,\\\"IP\\\",\\\"internet\\\" OK \
           \\dATD*99# TIMEOUT 40 CONNECT"

telmore_quectel:
 set device /dev/modem-quectel-data
 set ifaddr 10.100.100.1/0 10.100.100.2 255.255.255.255 0.0.0.0
 add default HISADDR

telmore_huawei:
 set device /dev/modem-huawei-data
 set ifaddr 10.200.200.1/0 10.200.200.2 255.255.255.255 0.0.0.0
 add default HISADDR
[tykling@container1 ~]$
The first few lines set the logging configuration, ident (so the "other end" can see which ppp software I am using in case of problems, unlikely to ever be used), disable IPv6 (not supported by this ISP), and set timeout to 0. This is the timeout that hangs up the internet connection after some idle period. It has nothing to do with the expect timeout mentioned below!
The set dial statement defines the chat(8) script to use for dialling the ISP. It breaks down like this:
The ABORT statements tell ppp to stop if it encounters any of the strings ERROR, BUSY, or NO CARRIER.
The TIMEOUT 5 line sets the expect timeout to 5 seconds, and the empty string "" after it means ppp expects nothing before sending the first command. The same TIMEOUT will be used for the following commands (until a different TIMEOUT is defined).
Send AT (which does nothing), and wait for the OK response.
Send ATH, which hangs up any existing data connection which might be in progress, and wait for the OK response.
Send AT+CGDCONT=1,"IP","internet", which is a UMTS Packet Domain command used to define a new Packet Data Protocol (PDP) context. A PDP context is a connection profile for data connections. This is where the APN your ISP uses is defined; in my case the APN is the string internet. Wait for the response OK from the modem.
Send \dATD*99#, where the \d is an escape sequence which makes ppp wait 2 seconds before firing the command, and ATD is the command to dial the ISP, which in this case uses the number *99#. Set the timeout to 40 seconds and wait for the response CONNECT.
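The expect/send alternation is easier to see when it is walked mechanically. The following is a rough Python sketch and not ppp's actual parser: it assumes the script is already split into unescaped tokens, and that ABORT and TIMEOUT are only recognised where an expect string would otherwise stand. The name `describe_chat` is made up.

```python
def describe_chat(tokens: list) -> list:
    """Describe a chat script given as a list of unescaped tokens.

    Simplifying assumptions: tokens alternate expect/send starting with
    expect, and ABORT/TIMEOUT (each taking one argument) are only
    interpreted in the expect position.
    """
    steps = []
    expecting = True
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if expecting and token in ("ABORT", "TIMEOUT"):
            if i + 1 >= len(tokens):
                raise ValueError(f"{token} needs an argument")
            arg = tokens[i + 1]
            if token == "ABORT":
                steps.append(f"abort on {arg}")
            else:
                steps.append(f"set timeout to {arg} seconds")
            i += 2
            continue
        if expecting:
            steps.append("expect nothing" if token == "" else f"expect {token}")
        else:
            steps.append(f"send {token}")
        expecting = not expecting
        i += 1
    return steps


# The dial script from ppp.conf with the ppp.conf escaping removed.
DIAL = [
    "ABORT", "ERROR", "ABORT", "BUSY", "ABORT", "NO CARRIER",
    "TIMEOUT", "5", "",
    "AT", "OK",
    "ATH", "OK",
    'AT+CGDCONT=1,"IP","internet"', "OK",
    "\\dATD*99#", "TIMEOUT", "40", "CONNECT",
]

if __name__ == "__main__":
    for step in describe_chat(DIAL):
        print(step)
```

Printing the steps for my dial script gives the same sequence as the breakdown above, which is a handy sanity check when editing the script for another ISP.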
The APN (internet) and the phone number (*99#) will need to be changed to the values your ISP uses. The device and profile names should also be changed to something matching your world of course.
Configuring the right dial chat(8) script is always going to involve a bit of trial and error. Some modems might need an init string (meaning there are extra AT commands that need to be executed before dialing with ATD) but these two modems both work fine without an init string. /var/log/ppp.log has lots of output during connecting so be sure to tail it while working on this.
The two profiles are named telmore_quectel and telmore_huawei. The only things I need to define in them are the device path, the default route, and the ifaddr line, which sets the IP configuration (although usually, and weirdly, you only get to dictate the IP for the other end):
tun0: flags=8051 metric 0 mtu 1500
        options=80000
        inet 10.130.25.54 --> 10.100.100.2/32
        groups: tun
        nd6 options=21
        Opened by PID 24622
tun1: flags=8051 metric 0 mtu 1500
        options=80000
        inet 10.132.101.230 --> 10.200.200.2/32
        groups: tun
        nd6 options=21
        Opened by PID 85756
Note how the IP of the other end of the tunnel is as specified in ppp.conf but my own end has an assigned IP from the ISP. The strange world of ppp!
Since I have two modems I need to run two instances of ppp. This is supported in the /etc/rc.d/ppp script but it doesn't work very well :( I used the following /etc/rc.conf lines to enable the two ppp profiles and pin the tun interface each will use, so firewall configuration is easier:
[tykling@container1 ~]$ grep ppp /etc/rc.conf
ppp_enable="YES"
ppp_profile="telmore_quectel telmore_huawei"
ppp_telmore_quectel_mode="ddial"
ppp_telmore_quectel_unit="0"
ppp_telmore_huawei_mode="ddial"
ppp_telmore_huawei_unit="1"
[tykling@container1 ~]$
The profile names are freehand; I named mine $isp_$manufacturer so I can tell them apart more easily. This example uses the same provider telmore for both modems, but one of the SIM cards was replaced for the final setup, so the two modems use two different cellular networks. This means it should be possible to stay online even if one of the providers is having a bad day.
This setup means that two instances of ppp are started on boot, and the _mode="ddial" means each will keep trying to redial forever if it loses the connection. Finally the _unit="x" determines the number of the tun interface which will be used by this ppp instance. Note that it doesn't say ppp_telmore_huawei_unit="tun1" but merely has the number of the interface, so just ppp_telmore_huawei_unit="1".
Starting and stopping them together works well. What doesn't work is starting a profile individually if another profile is already running:
[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 83309 92053.
[tykling@container1 ~]$ sudo service ppp stop telmore_huawei
Stopping PPP profile: telmore_huawei.
[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 83309.
[tykling@container1 ~]$ sudo service ppp start telmore_huawei
ppp already running? (pid=83309).
[tykling@container1 ~]$ sudo service ppp stop telmore_quectel
Stopping PPP profile: telmore_quectel.
[tykling@container1 ~]$ sudo service ppp status
ppp is not running.
[tykling@container1 ~]$ sudo service ppp start telmore_quectel
Starting PPP profile: telmore_quectel .
[tykling@container1 ~]$ sudo service ppp start telmore_huawei
ppp already running? (pid=69181).
[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 69181.
[tykling@container1 ~]$
This means it is impossible to (re)start one of the links while keeping the other one running, a pretty significant flaw. Maybe I am the first person ever to actually use this feature.
Furthermore the ppp_stop_profile() function is missing a pwait call, and since it sometimes takes ppp a second or two to hang up and disconnect, a simple service ppp restart doesn't always work either:
[tykling@container1 ~]$ sudo service ppp restart
Stopping PPP profile: telmore_quectel telmore_huawei.
ppp already running? (pid=23948).
[tykling@container1 ~]$ sudo service ppp status
ppp is not running.
[tykling@container1 ~]$ # great, now I'm offline :(
So after spending an afternoon trying to fix the issues I gave up and decided to file a bug and run ppp under supervisord instead. The following supervisord.d config snippets do the trick:
[tykling@container1 ~]$ cat /usr/local/etc/supervisord.d/ppp_telmore_quectel.conf
; run ppp for telmore isp, quectel modem
[program:ppp_telmore_quectel]
command=/usr/sbin/ppp -foreground -nat -unit0 telmore_quectel
user=0
stdout_syslog=True
stderr_syslog=True
startsecs=5
autostart=True
priority=1
[tykling@container1 ~]$ cat /usr/local/etc/supervisord.d/ppp_telmore_huawei.conf
; run ppp for telmore isp, huawei modem
[program:ppp_telmore_huawei]
command=/usr/sbin/ppp -foreground -nat -unit1 telmore_huawei
user=0
stdout_syslog=True
stderr_syslog=True
startsecs=5
autostart=True
priority=1
[tykling@container1 ~]$
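The two snippets differ only in profile name and tun unit, so they lend themselves to templating. This is a hedged Python sketch of that idea, not the Ansible template we actually use; the function name `supervisord_program` is hypothetical, and the settings are copied from the snippets above.

```python
def supervisord_program(profile: str, unit: int) -> str:
    """Render a supervisord [program:...] section for one ppp profile.

    `profile` is the ppp.conf label and `unit` the tun interface number.
    The option values mirror the hand-written config files above.
    """
    lines = [
        f"[program:ppp_{profile}]",
        f"command=/usr/sbin/ppp -foreground -nat -unit{unit} {profile}",
        "user=0",
        "stdout_syslog=True",
        "stderr_syslog=True",
        "startsecs=5",
        "autostart=True",
        "priority=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for unit, profile in enumerate(["telmore_quectel", "telmore_huawei"]):
        print(supervisord_program(profile, unit))
```

Because each profile is its own supervisord program, one link can be restarted without touching the other, which is exactly what the rc.d script could not do.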
So I now have two ppp profiles defined, the Quectel modem uses tun0 and the Huawei modem uses tun1:
[tykling@container1 ~]$ sudo ps auxww | grep sbin/ppp | grep -v grep
root 1485 0.1 0.1 16132 5076 - Ss 15:24 3:30.22 /usr/sbin/ppp -quiet -ddial -nat -unit0 telmore_quectel
root 4677 0.0 0.1 16300 5052 - Ss 15:24 0:32.66 /usr/sbin/ppp -quiet -ddial -nat -unit1 telmore_huawei
[tykling@container1 ~]$
I can see what they are doing by tailing /var/log/ppp.log.
Both of my ppp profiles have the line add default HISADDR, meaning they will set the default gateway of the server after a successful ppp connection. Since there can only ever be one default gateway it will be a bit random which of the modems is currently handling the default gateway traffic - it will just be the last ppp profile to have connected at any given time. This is okay! The modems are equally fast, and I don't really care which modem is the primary right now. But I am concerned that something could go wrong and the default gateway might suddenly point nowhere, and I would then be unable to contact the server without starting the car and going for a long drive.
I decided to run with multi-FIB support and have a routing table dedicated to each modem, in addition to the default routing table. This will help me make a couple of "emergency backdoors" into the system, as a last resort in case of connectivity issues.
I added the line net.fibs=5 to /boot/loader.conf and after a reboot I can check with sysctl(8) that the setting has taken effect:
[tykling@container1 ~]$ sysctl net.fibs
net.fibs: 5
[tykling@container1 ~]$
NOTE: The current default in FreeBSD is to add interface routes in all FIBs, which is a weird and stupid default. The following entry in /etc/sysctl.conf makes the FIBs act as completely separate routing tables, as you would expect:
[tykling@container1 ~]$ grep fib /etc/sysctl.conf
net.add_addr_allfibs=0
[tykling@container1 ~]$
There is some chatter about changing the default, which would be a welcome change. It is terribly confusing as it is now, because extra FIBs start out with a whole bunch of interface routes from FIB 0.
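The two settings above are easy to forget on a rebuilt machine. A minimal sketch of a sanity check, assuming it is run on the FreeBSD host itself; check_fib_settings is a hypothetical helper name and is not part of the setup described in this post:

```sh
# Sketch only: check_fib_settings is a made-up helper, not part of FreeBSD.
# Usage: check_fib_settings <number of FIBs needed>
check_fib_settings() {
    wanted=$1
    # sysctl -n prints only the value, without the "name: " prefix
    fibs=$(sysctl -n net.fibs) || return 2
    allfibs=$(sysctl -n net.add_addr_allfibs) || return 2
    rc=0
    if [ "$fibs" -lt "$wanted" ]; then
        echo "net.fibs is $fibs, need at least $wanted (set it in /boot/loader.conf)"
        rc=1
    fi
    if [ "$allfibs" -ne 0 ]; then
        echo "net.add_addr_allfibs is $allfibs, interface routes are copied into every FIB"
        rc=1
    fi
    [ "$rc" -eq 0 ] && echo "multi-FIB settings look fine: net.fibs=$fibs"
    return "$rc"
}
```

Calling check_fib_settings 3 would cover FIB 0 plus one FIB per modem; a non-zero exit status means one of the two settings needs attention.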
Anyway, multi-FIB support in FreeBSD is a great tool for advanced network setups! A FIB is simply a routing table. Usually there is just one, FIB 0, which is the one you are used to interacting with when typing netstat -rn to show the routing table.
Enabling more than one FIB does nothing before you start configuring some routes in the new FIB and then ask some applications to use the alternative FIB instead of the default FIB 0. This is done with the setfib(1) command as demonstrated here:
[tykling@container1 ~]$ netstat -rnf inet | grep default
default            10.200.200.2       US       tun1
[tykling@container1 ~]$ setfib 0 netstat -rnf inet | grep default
default            10.200.200.2       US       tun1
[tykling@container1 ~]$ setfib 1 netstat -rnf inet | grep default
default            10.100.100.2       UGS      tun0
[tykling@container1 ~]$ setfib 2 netstat -rnf inet | grep default
default            10.200.200.2       UGS      tun1
[tykling@container1 ~]$
What the above output tells me is that the Huawei modem (tied to tun1) is currently handling the traffic in FIB 0. The setfib command sets the active routing table for the command following it. Since FIB 0 is the default FIB, prefixing a command with setfib 0 is a no-op; it is the same as running the command without setfib. Using setfib 1 makes it use FIB 1, so the usual routing table is not considered at all when executing the program. setfib can be used with any command:
[tykling@container1 ~]$ setfib 1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=116 time=35.991 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=50.908 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=48.970 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 35.991/45.290/50.908/6.622 ms
[tykling@container1 ~]$ setfib 2 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=116 time=39.765 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=38.633 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=35.848 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=116 time=49.123 ms
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 35.848/40.842/49.123/4.989 ms
[tykling@container1 ~]$
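Checking each FIB by hand gets old quickly. The lookups above can be wrapped in a small helper; this is a rough sketch under the assumption that the netstat column order is the one shown in this post (Destination, Gateway, Flags, Netif), and fib_default_routes is a hypothetical name of my choosing:

```sh
# Sketch only: fib_default_routes is a made-up helper name.
# Prints the IPv4 default route of FIBs 0..N-1, where N is the value of net.fibs.
fib_default_routes() {
    nfibs=$(sysctl -n net.fibs) || return 1
    fib=0
    while [ "$fib" -lt "$nfibs" ]; do
        # Run netstat inside the FIB and pick out the "default" row:
        # column 2 is the gateway, column 4 the interface (Netif).
        setfib "$fib" netstat -rnf inet | awk -v fib="$fib" '
            $1 == "default" { printf "fib %s: default via %s on %s\n", fib, $2, $4; found = 1 }
            END { if (!found) printf "fib %s: no default route\n", fib }'
        fib=$((fib + 1))
    done
}
```

With net.fibs=5 and only two modems, FIBs 3 and 4 would simply report that they have no default route.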
The full routing tables look like this:
[tykling@container1 ~]$ netstat -rnf inet
Routing tables
Internet:
Destination        Gateway            Flags    Netif Expire
default            10.200.200.2       US       tun1
10.100.100.2       link#6             UHS      tun0
10.130.25.54       link#6             UHS      lo0
10.132.101.230     link#7             UHS      lo0
10.200.200.2       link#7             UHS      tun1
127.0.0.1          link#4             UH       lo0
[tykling@container1 ~]$ setfib 1 netstat -rnf inet
Routing tables (fib: 1)
Internet:
Destination        Gateway            Flags    Netif Expire
default            10.100.100.2       UGS      tun0
10.100.100.2       tun0               UHS      tun0
127.0.0.1          lo0                UHS      lo0
[tykling@container1 ~]$ setfib 2 netstat -rnf inet
Routing tables (fib: 2)
Internet:
Destination        Gateway            Flags    Netif Expire
default            10.200.200.2       UGS      tun1
10.200.200.2       tun1               UHS      tun1
127.0.0.1          lo0                UHS      lo0
[tykling@container1 ~]$
So FIB 0 has a host route (flag H) for the remote end of each ppp connection, so it knows which interface to use to reach that IP. It also has the default route of course, which points to the remote end of the ppp connection running on tun1. These routes are configured by the ppp.linkup script.
Both of the ppp processes run in FIB 0 (unsurprisingly, adding ppp_telmore_quectel_fib="1" to /etc/rc.conf did nothing), so instead I've created a custom /etc/ppp/ppp.linkup script. Linkup scripts are used to do stuff after a ppp connection is established. I use it to add routes to FIB 1 and 2 as needed:
[tykling@container1 ~]$ cat /etc/ppp/ppp.linkup
telmore_quectel:
 !bg /usr/sbin/setfib 1 /sbin/route delete default
 !bg /usr/sbin/setfib 1 /sbin/route add 10.100.100.2 -iface INTERFACE
 !bg /usr/sbin/setfib 1 /sbin/route add default 10.100.100.2
telmore_huawei:
 !bg /usr/sbin/setfib 2 /sbin/route delete default
 !bg /usr/sbin/setfib 2 /sbin/route add 10.200.200.2 -iface INTERFACE
 !bg /usr/sbin/setfib 2 /sbin/route add default 10.200.200.2
[tykling@container1 ~]$
Each section is named for the ppp profile it relates to. The !bg statement means "execute this command in the background", and setfib is used to make the changes in the desired FIB. First I delete any existing default gateway in the FIB, then I add the host/interface route (remember the extra FIBs have no routes at all, not even interface routes!), and finally I set the default gateway in the FIB.
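Once the linkup script has populated FIB 1 and 2, it is worth confirming that each modem really carries traffic on its own. The following is a minimal sketch of such a check, not something taken from the running system; check_fib is a hypothetical helper name, and 8.8.8.8 is used as the default target only because it is the address pinged earlier in this post:

```sh
# Sketch only: check_fib is a made-up helper name.
# Usage: check_fib <fib number> [target address]
check_fib() {
    fib=$1
    target=${2:-8.8.8.8}
    # -c 3 sends three probes; ping exits non-zero if no reply comes back
    if setfib "$fib" ping -c 3 "$target" >/dev/null 2>&1; then
        echo "fib $fib: ok"
    else
        echo "fib $fib: no reply from $target"
        return 1
    fi
}

# Example use, e.g. from cron:
#   for f in 0 1 2; do check_fib "$f"; done
```

A failure in FIB 1 or 2 points at one specific modem, while a failure only in FIB 0 suggests the shared default route is the problem.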
This setup means I can now tie applications to FIB 1 or 2 and be sure they are using a specific modem. Armed with this knowledge I could start for example a VPN client in each FIB, so I always have a fallback connection in case FIB 0 isn't working for whatever reason. I ended up using Tor as a poor man's VPN, where I run an onion service for SSH in each FIB:
tor_enable="YES"
tor_instances="telmore telmore2 "
tor_telmore_fib="1"
tor_telmore2_fib="2"
The multi-instance support in the Tor rc.d script works much better than in the ppp rc.d script. The above configuration runs the default Tor instance "main" in FIB 0, and defines two more instances, telmore and telmore2. It then specifies that the telmore instance should run in FIB 1 and the telmore2 instance in FIB 2. The end result is this (add -O fib to ps to see which FIB each process is running in):
[tykling@container1 ~]$ sudo ps auxO fib | grep -E "(FIB|bin/tor)" | grep -v grep
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND PID FIB TT STAT TIME COMMAND
_tor 84367 0.2 1.0 49592 40296 - S 08:28 0:16.70 /usr/local/bin/t 84367 1 - S 0:16.70 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc@telmore --PidFile /var/run/tor/tor.pid@telmore --RunAsDaemon 1 --DataDirectory /var/db/tor/instance@telmore
_tor 6539 0.0 0.9 47888 38504 - S 20:13 1:20.16 /usr/local/bin/t 6539 2 - S 1:20.16 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc@telmore2 --PidFile /var/run/tor/tor.pid@telmore2 --RunAsDaemon 1 --DataDirectory /var/db/tor/instance@telmore2
_tor 10155 0.0 1.0 49936 40600 - S 20:13 5:24.00 /usr/local/bin/t 10155 0 - S 5:24.00 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc --PidFile /var/run/tor/tor.pid --RunAsDaemon 1 --DataDirectory /var/db/tor
[tykling@container1 ~]$
Note how each Tor instance is running in a different FIB. Tor itself has no idea this is happening; it just makes connections and the kernel routes them as usual, but with a different routing table.
I could have used OpenVPN or SSH tunnels or something else, but I like Tor and it requires no extra infrastructure in the other end. YMMV.
Apart from a few headaches around AT commands, ppp rc.d scripts and other little snags along the way, this was pretty smooth to set up. FreeBSD is a fantastic tool for this sort of thing. The multi-FIB network stuff works so well! Since I had all the Ansible roles for the basic setup ready, configuring everything was a breeze. I dogfooded along the way to make sure the connections are stable, and I can report that running Ansible over a Tor Onion service over an LTE connection is not at all as painful as it sounds :)
Next thing I configured was Prometheus monitoring of the signal strength of the modems. I added it to a Grafana dashboard along with a few traffic and pps graphs on the two tun interfaces. The final result looks like this, a great tool to monitor the connectivity of our storage container.
Prometheus monitors the exporters via an Onion service running in FIB 0, so it uses whichever modem connected to the ISP most recently. It works great! Since the two modems use two different networks there is a decent chance we can find coverage even in the low coverage area.
That's about it! Well done if you got this far. Take care :)
I've recently signed up for GitHub Sponsors, meaning it is now easy to sponsor me and my work. If this post or some of my other writing, software or services have helped you then you can consider becoming a sponsor.