FreeBSD, Multiple LTE Modems, PPP and Multi-FIB on APU3C4

02. nov 2020 22:27 UTC

I recently had to configure a computer for environmental monitoring in the 20-foot shipping container we use for BornHack storage. The container sits outside in the wind and rain all year. This means we've had to fit it with a dehumidifier to make sure the moisture doesn't destroy all our stuff. Being nerds, we want graphs to tell us about the temperature and humidity levels in the container, but since it is sitting in a place with no Internet, the first challenge was to get online.

The solution was an APU3 with a couple of LTE modems and a whole bunch of sensors attached. Configuring the modems and connectivity meant refamiliarising myself with AT commands and ppp, and considering how to handle routing and networking on the APU.

All of BornHack's infrastructure is managed through Ansible, and the container APU is going to be a permanent part of the BornHack infrastructure. This means I need to be able to reach it from our Ansible server, and our Prometheus server needs to be able to connect to it. Both of these present a challenge, since Carrier Grade NAT on LTE providers makes it impossible to connect in to the server's SSH or other services.

We don't currently have a VPN as part of BornHack's infrastructure, meaning there is no obvious way to get a TCP connection into the container APU, even if it is online. Any thoughts of putting up a few reverse SSH tunnels were cast aside. Instead I decided early on to use Tor Onion Services as a poor man's VPN. It works really well and saved a lot of hassle. YMMV.

This post is not really about Tor though. It is mainly about configuring ppp and about the wonderful world of AT commands anno 2020. Since the final solution uses multiple Forwarding Information Bases or FIBs (multiple routing tables) I will also discuss the configuration and merits of such a setup.

Hardware

The computer is an APU3C4 from PC Engines, a Swiss company (meaning European, but not in the EU). For this reason I went with a Swedish dealer named Teklager.se which I can highly recommend. They have great technical documentation on their website, and they have competent people on their support email, both of which I thought were things of the past.

The APUs are great little computers. The APU3C4 has 3 ethernet ports and of course a few USB ports, inside it has GPIO ports for expansion and sensors, and then three internal miniPCIe slots:

The first miniPCIe slot can take either an msata drive, a modem or some other miniPCIe device. One of the SIM card slots is wired to this miniPCIe slot. This slot has a Quectel EC25-EU modem mounted.
The second miniPCIe slot can take either a modem or some other miniPCIe device. The other SIM card slot is wired to this miniPCIe slot. This slot has a Huawei ME909s-120 modem mounted.
The third miniPCIe slot has no SIM card slot wired to it. It can be used for example for a wifi card. It is unused in this setup.

Note: The APU arrived assembled with modems and SSD mounted by Teklager, but I was unable to detect the LTE modem in the first miniPCIe slot (the slot which can also take an msata drive). It turns out that the EHCI0 controller needs to be enabled in the BIOS before an LTE modem can be used in that miniPCIe slot. Thanks to Teklager support for figuring this one out for me! I spent quite some time scratching my head, but after swapping slots for the modems and then detecting "the other one" I knew both modems worked, so I asked support, they asked PC Engines, and we found the solution. Very nice.

For reference, this is a dmesg from the APU3C4 with both modems being detected. The modem-related output (the ugen2.3, ugen1.3, u3g0 and u3g1 lines) is near the end:

[tykling@container1 ~]$ cat /var/run/dmesg.boot
---<<BOOT>>---
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.2-STABLE r367109 GENERIC amd64
FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
VT(vga): resolution 640x480
CPU: AMD GX-412TC SOC                                (998.15-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x730f01  Family=0x16  Model=0x30  Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x3ed8220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x1d4037ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,Topology,PNXC,DBE,PTSC,PL2I>
  Structured Extended Features=0x8<BMI1>
  XSAVE Features=0x1<XSAVEOPT>
  SVM: NP,NRIP,AFlush,DAssist,NAsids=8
  TSC: P-state invariant, performance statistics
real memory  = 4294967296 (4096 MB)
avail memory = 4084944896 (3895 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <COREv4 COREBOOT>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: unblocking device.
ioapic0 <Version 2.1> irqs 0-23 on motherboard
ioapic1 <Version 2.1> irqs 24-55 on motherboard
Launching APs: 2 3 1
Timecounter "TSC" frequency 998148069 Hz quality 1000
random: entropy device external interface
kbd0 at kbdmux0
000.000023 [4336] netmap_init               netmap: loaded module
[ath_hal] loaded
module_register_init: MOD_LOAD (vesa, 0xffffffff81116e40, 0) error 19
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
aesni0: <AES-CBC,AES-CCM,AES-GCM,AES-ICM,AES-XTS> on motherboard
acpi0: <COREv4 COREBOOT> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
apei0: <ACPI Platform Error Interface> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x818-0x81b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: could not evaluate _ADR - AE_NOT_FOUND
pci0: <ACPI PCI bus> on pcib0
pci0: <base peripheral, IOMMU> at device 0.2 (no driver attached)
pcib1: <ACPI PCI-PCI bridge> irq 25 at device 2.2 on pci0
pcib1: failed to allocate initial I/O port window: 0x1000-0x1fff
pci1: <ACPI PCI bus> on pcib1
igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> mem 0xf7a00000-0xf7a1ffff,0xf7a20000-0xf7a23fff irq 28 at device 0.0 on pci1
igb0: Using 1024 TX descriptors and 1024 RX descriptors
igb0: Using 2 RX queues 2 TX queues
igb0: Using MSI-X interrupts with 3 vectors
igb0: Ethernet address: 00:0d:b9:58:14:64
igb0: netmap queues/slots: TX 2/1024, RX 2/1024
pcib2: <ACPI PCI-PCI bridge> irq 26 at device 2.3 on pci0
pci2: <ACPI PCI bus> on pcib2
igb1: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x2000-0x201f mem 0xf7b00000-0xf7b1ffff,0xf7b20000-0xf7b23fff irq 32 at device 0.0 on pci2
igb1: Using 1024 TX descriptors and 1024 RX descriptors
igb1: Using 2 RX queues 2 TX queues
igb1: Using MSI-X interrupts with 3 vectors
igb1: Ethernet address: 00:0d:b9:58:14:65
igb1: netmap queues/slots: TX 2/1024, RX 2/1024
pcib3: <ACPI PCI-PCI bridge> irq 27 at device 2.4 on pci0
pci3: <ACPI PCI bus> on pcib3
igb2: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3000-0x301f mem 0xf7c00000-0xf7c1ffff,0xf7c20000-0xf7c23fff irq 36 at device 0.0 on pci3
igb2: Using 1024 TX descriptors and 1024 RX descriptors
igb2: Using 2 RX queues 2 TX queues
igb2: Using MSI-X interrupts with 3 vectors
igb2: Ethernet address: 00:0d:b9:58:14:66
igb2: netmap queues/slots: TX 2/1024, RX 2/1024
pci0: <encrypt/decrypt> at device 8.0 (no driver attached)
xhci0: <AMD FCH USB 3.0 controller> mem 0xf7f22000-0xf7f23fff irq 18 at device 16.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
xhci0: Unable to map MSI-X table 
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
ahci0: <AMD Hudson-2 AHCI SATA controller> port 0x4010-0x4017,0x4020-0x4023,0x4018-0x401f,0x4024-0x4027,0x4000-0x400f mem 0xf7f25000-0xf7f253ff at device 17.0 on pci0
ahci0: AHCI v1.30 with 2 6Gbps ports, Port Multiplier supported with FBS
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ehci0: <AMD FCH USB 2.0 controller> mem 0xf7f26000-0xf7f260ff irq 18 at device 18.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
usbus1: 480Mbps High Speed USB v2.0
ehci1: <AMD FCH USB 2.0 controller> mem 0xf7f27000-0xf7f270ff irq 18 at device 19.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
usbus2: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
sdhci_pci0: <Generic SD HCI> mem 0xf7f28000-0xf7f280ff at device 20.7 on pci0
sdhci_pci0: 1 slot(s) allocated
acpi_tz0: <Thermal Zone> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0: <ISA Option ROM> at iomem 0xee800-0xeffff pnpid ORM0000 on isa0
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
            to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
ugen0.1: <0x1022 XHCI root HUB> at usbus0
ugen2.1: <AMD EHCI root HUB> at usbus2
ugen1.1: <AMD EHCI root HUB> at usbus1
Trying to mount root from zfs:zroot/ROOT/12-STABLE-367109 []...
uhub0: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
uhub1: <AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
uhub2: Root mount waiting for: usbus0 CAM usbus1 usbus2
<AMD EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <WDC WDS500G1R0A-68A4W0 411000WR> ACS-4 ATA SATA 3.x device
ada0: Serial Number 201802A00C74
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors)
uhub0: 4 ports with 4 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus2
ugen2.2: <vendor 0x0438 product 0x7900> at usbus2
uhub3 on uhub1
uhub3: <vendor 0x0438 product 0x7900, class 9/0, rev 2.00/0.18, addr 2> on usbus2
ugen1.2: <vendor 0x0438 product 0x7900> at usbus1
uhub4 on uhub2
uhub4: <vendor 0x0438 product 0x7900, class 9/0, rev 2.00/0.18, addr 2> on usbus1
uhub3: 4 ports with 4 removable, self powered
uhub4: 4 ports with 4 removable, self powered
Root mount waiting for: usbus1 usbus2
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2
ugen1.3: <Android Android> at usbus1
GEOM_ELI: Device ada0p2.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: hardware
intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
smbus0: <System Management Bus> on intsmb0
lo0: link state changed to UP
lo1: link state changed to UP
u3g0 on uhub3
u3g1 on uhub4
u3g1: <Android Android, class 239/2, rev 2.00/3.18, addr 3> on usbus1
u3g1: Found 5 ports.
u3g0: <Huawei Mobile Connect - Modem> on usbus2
u3g0: Found 5 ports.
pflog0: promiscuous mode enabled
tun0: link state changed to UP
Accounting enabled
[tykling@container1 ~]$

The modems are miniPCIe devices but they show up as USB devices. This is what usbconfig(8) has to say about them:

[tykling@container1 ~]$ sudo usbconfig list
ugen0.1: <0x1022 XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
ugen2.1: <AMD EHCI root HUB> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <AMD EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen2.2: <vendor 0x0438 product 0x7900> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen1.2: <vendor 0x0438 product 0x7900> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (100mA)
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)
ugen1.3: <Android Android> at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (500mA)
ugen0.2: <Targus Group Intl Targus Group Intl> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (50mA)
[tykling@container1 ~]$ sudo usbconfig -d ugen2.3 dump_device_desc
ugen2.3: <Huawei Technologies Co., Ltd. HUAWEI Mobile V7R11> at usbus2, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)

  bLength = 0x0012 
  bDescriptorType = 0x0001 
  bcdUSB = 0x0200 
  bDeviceClass = 0x0000  <Probed by interface class>
  bDeviceSubClass = 0x0000 
  bDeviceProtocol = 0x00ff 
  bMaxPacketSize0 = 0x0040 
  idVendor = 0x12d1 
  idProduct = 0x15c1 
  bcdDevice = 0x0102 
  iManufacturer = 0x0001  <Huawei Technologies Co., Ltd.>
  iProduct = 0x0002  <HUAWEI Mobile V7R11>
  iSerialNumber = 0x0003  <0123456789ABCDEF>
  bNumConfigurations = 0x0003 

[tykling@container1 ~]$ sudo usbconfig -d ugen1.3 dump_device_desc
ugen1.3: <Android Android> at usbus1, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (500mA)

  bLength = 0x0012 
  bDescriptorType = 0x0001 
  bcdUSB = 0x0200 
  bDeviceClass = 0x00ef  <Miscellaneous device>
  bDeviceSubClass = 0x0002 
  bDeviceProtocol = 0x0001 
  bMaxPacketSize0 = 0x0040 
  idVendor = 0x2c7c 
  idProduct = 0x0125 
  bcdDevice = 0x0318 
  iManufacturer = 0x0001  <Android>
  iProduct = 0x0002  <Android>
  iSerialNumber = 0x0000  <no string>
  bNumConfigurations = 0x0001 

[tykling@container1 ~]$

Why the Quectel EC25-EU shows up as Android Android I have no idea! I am guessing some device ID is wrong or needs to be added somewhere in the usb stack maybe? Oh well :)

The modems both expose a bunch of serial ports under /dev:

[tykling@container1 ~]$ ls -l /dev/ttyU*
crw-------  1 root  wheel  0x65 Nov  1 12:01 /dev/ttyU0.0
crw-------  1 root  wheel  0x66 Nov  1 12:01 /dev/ttyU0.0.init
crw-------  1 root  wheel  0x83 Nov  1 12:01 /dev/ttyU0.0.lock
crw-------  1 root  wheel  0x87 Nov  1 12:01 /dev/ttyU0.1
crw-------  1 root  wheel  0x88 Nov  1 12:01 /dev/ttyU0.1.init
crw-------  1 root  wheel  0x89 Nov  1 12:01 /dev/ttyU0.1.lock
crw-------  1 root  wheel  0x8d Nov  1 12:01 /dev/ttyU0.2
crw-------  1 root  wheel  0x8e Nov  1 12:01 /dev/ttyU0.2.init
crw-------  1 root  wheel  0x8f Nov  1 12:01 /dev/ttyU0.2.lock
crw-------  1 root  wheel  0x93 Nov  2 14:01 /dev/ttyU0.3
crw-------  1 root  wheel  0x94 Nov  1 12:01 /dev/ttyU0.3.init
crw-------  1 root  wheel  0x95 Nov  1 12:01 /dev/ttyU0.3.lock
crw-------  1 root  wheel  0x99 Nov  1 12:01 /dev/ttyU0.4
crw-------  1 root  wheel  0x9a Nov  1 12:01 /dev/ttyU0.4.init
crw-------  1 root  wheel  0x9b Nov  1 12:01 /dev/ttyU0.4.lock
crw-------  1 root  wheel  0x9f Nov  1 12:01 /dev/ttyU1.0
crw-------  1 root  wheel  0xa0 Nov  1 12:01 /dev/ttyU1.0.init
crw-------  1 root  wheel  0xa1 Nov  1 12:01 /dev/ttyU1.0.lock
crw-------  1 root  wheel  0xa5 Nov  1 12:01 /dev/ttyU1.1
crw-------  1 root  wheel  0xa6 Nov  1 12:01 /dev/ttyU1.1.init
crw-------  1 root  wheel  0xa7 Nov  1 12:01 /dev/ttyU1.1.lock
crw-------  1 root  wheel  0xab Nov  2 22:53 /dev/ttyU1.2
crw-------  1 root  wheel  0xac Nov  1 12:01 /dev/ttyU1.2.init
crw-------  1 root  wheel  0xad Nov  1 12:01 /dev/ttyU1.2.lock
crw-------  1 root  wheel  0xb1 Nov  1 12:01 /dev/ttyU1.3
crw-------  1 root  wheel  0xb2 Nov  1 12:01 /dev/ttyU1.3.init
crw-------  1 root  wheel  0xb3 Nov  1 12:01 /dev/ttyU1.3.lock
crw-------  1 root  wheel  0xb7 Nov  1 12:01 /dev/ttyU1.4
crw-------  1 root  wheel  0xb8 Nov  1 12:01 /dev/ttyU1.4.init
crw-------  1 root  wheel  0xb9 Nov  1 12:01 /dev/ttyU1.4.lock
[tykling@container1 ~]$

The *.init and *.lock devices can be used to set initial and locked/unchangeable settings for the serial ports, respectively. See stty(1) and 26.2.2. Serial Port Configuration in the FreeBSD Handbook for more info. Serial clients like screen and tip set the serial port settings themselves, so the *.init and *.lock devices can be ignored for our purposes.

External Antennas

For antennas I initially used the antennas supplied with the LTE modems from Teklager. They work great, but since the computer is going to live inside a container I need antennas with a length of cable so I can drill a couple of holes and mount the antennas outside. Additionally, the container is located in a rural woodland area with very shoddy LTE coverage, so a couple of quality, outdoor-rated 4G antennas would likely make a real difference in transfer speeds. I got 2 Macab Pro 1000 directional antennas (90 degrees) with 8-9 dB gain. The SMA connectors fit right on the APU and 2 meters of cable is perfect for this.

AT Commands

Last time I got my hands dirty with AT commands was very briefly about 10 years ago when configuring a 3G modem on a FreeBSD laptop I used back then. Before that we have to go all the way back to the last millennium and my awesome US Robotics Sportster 56K in my teenage years. I barely recall any of the AT command foo so this LTE modem stuff was an opportunity to refresh some of it and see what changed over the last 10-20 years. By the way, I checked, and US Robotics still has a support page for my old teenager modem - impressive :)

Modems basically present a serial port (actually more than one) which you or the ppp software can speak to with a serial program. I usually use screen or tip; the latter is available in FreeBSD base. A typical session might look like this (my commands are the lines starting with AT, the rest is the modem's replies):

AT
OK
ATI
Manufacturer: Huawei Technologies Co., Ltd.
Model: ME909s-120
Revision: 11.617.15.00.00
IMEI: 123456789012345
+GCAP: +CGSM,+DS,+ES

OK
AT+CSQ
+CSQ: 21,99

OK

The first command is simply AT and the response is OK. This command does nothing; it simply confirms that the serial connection is working and the modem is responsive. Since each of the two modems exposes 5 serial ports, a bit of trial and error is involved in finding the right one. This is where the AT command comes in handy.

The second command ATI asks the modem for manufacturer information. The IMEI number has been edited out.

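The third command AT+CSQ asks the modem for signal quality. The first number is the RSSI (higher is better; the values 0-31 map to roughly -113 to -51 dBm, and 99 means unknown). The second number is supposed to be the bit error rate, BER (lower is better), but BER doesn't appear to be supported on my modems, which report 99 (not known).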
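To make the RSSI number easier to graph, it can be converted to dBm. The following is a minimal sketch, assuming the standard +CSQ mapping (dBm = 2 * rssi - 113 for values 0-31) and a reply line in the format shown above. The function name csq_to_dbm is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert a "+CSQ: <rssi>,<ber>" reply line to dBm.
# Assumes the standard mapping where rssi 0..31 covers -113..-51 dBm
# in 2 dB steps and anything else (normally 99) means "unknown".
csq_to_dbm() {
    # Pull out the first number after "+CSQ:"; a trailing CR is ignored.
    csq_rssi=$(printf '%s\n' "$1" | sed -n 's/^+CSQ: *\([0-9][0-9]*\),.*/\1/p')
    if [ -z "$csq_rssi" ] || [ "$csq_rssi" -gt 31 ]; then
        echo unknown
    else
        echo "$((2 * csq_rssi - 113)) dBm"
    fi
}

# The reply from the session above.
csq_to_dbm '+CSQ: 21,99'
```

For the session above this gives -71 dBm, which is a usable signal.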

The second and third AT commands are both part of a standard extended AT command set for interacting with mobile (GSM/LTE) modems. The details can vary a bit (well, a lot) between implementations in different modems, especially between different manufacturers. A nice general resource is m2msupport.net which has good explanations of all the AT commands. But the best thing to do really is to find the AT command reference for the modem in question, since each manufacturer has their own proprietary AT commands in addition to the standard set.

For the Huawei ME909s-120 I used HUAWEI ME909s Series LTE Module V100R001 AT Command Interface Specification, Issue 03, dated 2019-03-13.
For the Quectel EC25-EU I used Quectel_EC25&EC21_AT_Commands_Manual_V1.3.pdf.

The basic commands to get working internet are the same on the two modems, but advanced troubleshooting stuff sometimes requires a lookup in the AT command reference.

Static Device Names

The two modems I use identify as USB devices, and on FreeBSD the serial ports of USB devices are named ttyUx, where x is a number starting from 0 (the matching call-out nodes are named cuaUx). The ports are enumerated in the order the hardware and driver discover them, meaning that the port for the first USB serial device is /dev/ttyU0. If the device has multiple ports they are named /dev/ttyU0.0, /dev/ttyU0.1 and so on, as shown above. The device numbers (and thus names/paths) are consistent as long as no new USB serial ports are added to the system.

While I was still configuring all this I happened to plug in an unrelated USB-serial adapter which I needed to establish communication with a UPS device I was trying out. The new USB serial device got the device name /dev/ttyU2 and all was well until I rebooted the machine a few days later and suddenly ppp refused to work because the modem devices had changed numbers.

Fortunately it is possible with a bit of devd foo to create symlinks for the devices when they attach, and then use the symlinks as devices, not caring which number they actually got assigned.

The devd software fires actions on various hardware events, which is just what I need here - I want to run the ln command when each of my USB modems is attached (detected).

A few examples from the default /etc/devd.conf:

# Firmware downloader for Atheros AR3011 based USB Bluetooth devices
#attach 100 {
#       match "vendor" "0x0cf3";
#       match "product" "0x3000";
#       action "sleep 2 && /usr/sbin/ath3kfw -d $device-name -f /usr/local/etc/ath3k-1.fw";
#};

# When a USB keyboard arrives, attach it as the console keyboard.
attach 100 {
        device-name "ukbd0";
        action "service syscons setkeyboard /dev/ukbd0";
};
detach 100 {
        device-name "ukbd0";
        action "service syscons setkeyboard /dev/kbd0";
};

The first example (which is commented out) loads some firmware when a device matching the specified vendor and product ID is attached. The next two examples enable and disable a USB keyboard as console keyboard when it is attached and detached.

The manpage for devd.conf has all the details and pretty soon I had a functioning symlinked modem device setup which I am very happy with. devd looks for files under /usr/local/etc/devd so I added the file /usr/local/etc/devd/symlinkmodems.conf with my devd.conf snippet:

[tykling@container1 ~]$ cat /usr/local/etc/devd/symlinkmodems.conf 
attach 100 {
     match "device-name"             "u3g[0-9]+$";
     match "vendor"                  "0x2c7c";
     match "product"                 "0x0125";
     action "ln -s /dev/tty$ttyname.2 /dev/modem-quectel-data";
     action "ln -s /dev/tty$ttyname.3 /dev/modem-quectel-control";
};

attach 100 {
     match "device-name"             "u3g[0-9]+$";
     match "vendor"                  "0x12d1";
     match "product"                 "0x15c1";
     action "ln -s /dev/tty$ttyname.0 /dev/modem-huawei-data";
     action "ln -s /dev/tty$ttyname.2 /dev/modem-huawei-control";
};

[tykling@container1 ~]$

The first rule means that every time a u3g device is attached matching vendor 0x2c7c and product 0x0125 (I got the IDs from the sudo usbconfig -d ugen1.3 dump_device_desc command I ran earlier) then it runs the two commands specified with the action keyword. This is the Quectel modem and it uses port 2 for data and port 3 for control, so I just symlink those to some new device names which make sense for me. Note that I am using the $ttyname variable made available by devd which will contain U1 if this USB serial device is getting /dev/ttyU1.* device names.

The second rule is identical, but with the vendor and product IDs for the Huawei modem, and symlinks for port 0 for data and port 2 for control.
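One thing to be aware of is that plain ln -s fails if the symlink already exists, for example if a modem detaches and attaches again without a reboot. The following is a minimal sketch of a helper that the devd actions could call instead, using ln -sf so repeated runs are harmless. The function name link_modem_ports and its arguments are made up for this example, and the last argument (the target directory) only exists so the function can be tried outside /dev.

```shell
#!/bin/sh
# Hypothetical helper: create (or replace) the data and control symlinks
# for one modem.
# Arguments: <ttyname> <modem name> <data port> <control port> [directory]
# <ttyname> is what devd puts in $ttyname, for example "U1".
link_modem_ports() {
    lm_tty=$1
    lm_name=$2
    lm_data=$3
    lm_ctrl=$4
    lm_dir=${5:-/dev}
    # -f replaces an existing symlink instead of failing on it.
    ln -sf "/dev/tty${lm_tty}.${lm_data}" "${lm_dir}/modem-${lm_name}-data" &&
    ln -sf "/dev/tty${lm_tty}.${lm_ctrl}" "${lm_dir}/modem-${lm_name}-control"
}
```

Saved as a script somewhere on the system (the location is up to you), a devd action would then call it with something like $ttyname quectel 2 3 as arguments.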

Testing is made easy by the fact that devd can be stopped with sudo service devd stop and then run by hand in the foreground with sudo devd -f /etc/devd.conf -d, and then it outputs all the stuff it tries to match and what it does.

After the devd config worked I rebooted the machine and now had these lovely symlinks available on the system:

[tykling@container1 ~]$ ls -l /dev/modem-*
lrwxr-xr-x  1 root  wheel  12 Nov  3 15:24 /dev/modem-huawei-control -> /dev/ttyU1.2
lrwxr-xr-x  1 root  wheel  12 Nov  3 15:24 /dev/modem-huawei-data -> /dev/ttyU1.0
lrwxr-xr-x  1 root  wheel  12 Nov  3 15:24 /dev/modem-quectel-control -> /dev/ttyU0.3
lrwxr-xr-x  1 root  wheel  12 Nov  3 15:24 /dev/modem-quectel-data -> /dev/ttyU0.2
[tykling@container1 ~]$

I use the new device names in ppp.conf, and in my monitoring, and when speaking manually to the modems via AT commands. It is much easier to remember /dev/modem-huawei-control than /dev/ttyU1.2. Works like a charm, and will keep working no matter how many USB serial devices I attach. The only weakness in this approach is that it would not be possible to differentiate between two identical modems, since it doesn't appear that the $sernum variable with the device serial number is exposed at this time. Anyway, this is not a problem in this setup, since I deliberately purchased two different modems so the server doesn't go offline on both modems due to some stupid firmware bug.

ppp

To connect the modems to the Internet I use ppp which can be configured through /etc/rc.conf and /etc/ppp/ppp.conf. Finally I also created a /etc/ppp/ppp.linkup file with some commands to run after the connection is established.

ppp.conf

The format of the ppp.conf file looks a bit fucked because the chat(8) dial command needs lots of backslashes to escape everything. I tried to use !include to put the chat(8) script in an external file which ppp.conf is supposed to support, but could not get it working, so everything is in ppp.conf.

Since both of my providers need the same settings and chat script I put almost all the configuration in the default: section, so I only have the set device, set ifaddr and add default HISADDR statements in each profile section.

[tykling@container1 ~]$ sudo cat /etc/ppp/ppp.conf 
default:
 set log Phase Chat IPCP CCP LCP tun command
 set log local Phase Chat IPCP CCP LCP tun command
 ident user-ppp VERSION
 disable ipv6
 set timeout 0
 set dial "ABORT ERROR \
        ABORT BUSY \
        ABORT NO\\sCARRIER \
        TIMEOUT 5 \"\" \
        AT OK \
        ATH OK \
        AT+CGDCONT=1,\\\"IP\\\",\\\"internet\\\" OK \
        \\dATD*99# TIMEOUT 40 CONNECT"

telmore_quectel:
 set device /dev/modem-quectel-data
 set ifaddr 10.100.100.1/0 10.100.100.2 255.255.255.255 0.0.0.0
 add default HISADDR

telmore_huawei:
 set device /dev/modem-huawei-data
 set ifaddr 10.200.200.1/0 10.200.200.2 255.255.255.255 0.0.0.0
 add default HISADDR
[tykling@container1 ~]$

The first few lines set the logging configuration, set ident (so the "other end" can see which ppp software I am using in case of problems, unlikely to ever be used), disable ipv6 (not supported by this ISP), and set timeout to 0. This is the timeout that hangs up the internet connection after some idle period. It has nothing to do with the expect timeout mentioned below!

The set dial statement defines the chat(8) script to use for dialling the ISP. It breaks down like this:

The first three ABORT statements tell ppp to stop if it encounters any of the strings ERROR, BUSY, or NO CARRIER.
TIMEOUT 5 sets the timeout for the expected responses that follow to 5 seconds (until a different TIMEOUT is defined). The empty string "" right after it is an expect string that matches immediately, so nothing is waited for before the first command is sent.
Send AT (which does nothing), and wait for OK response.
Send ATH which hangs up any existing data connection which might be in progress, and wait for OK response.
Send AT+CGDCONT=1,"IP","internet" which is a UMTS Packet Domain Command used to define a Packet Data Protocol (PDP) context. A PDP context is a connection profile for data connections. This is where the APN your ISP uses is defined, in my case the APN is the string internet. Wait for the response OK from the modem.
Send \dATD*99# where the \d is an escape sequence which makes ppp wait 2 seconds before firing the command, and ATD is the command to dial the ISP, which in this case uses the number *99#. Set the timeout to 40 seconds and wait for the response CONNECT.

The APN (internet) and the phone number (*99#) will need to be changed to the values your ISP uses. The device and profile names should also be changed to something matching your world of course.
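Since the escaping is easy to get wrong by hand, the set dial statement can also be generated. The following is a rough sketch that prints the same statement as in the ppp.conf above for a given APN and dial string. The function name make_set_dial is made up for this example, and it assumes the APN and number contain no quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print a ppp.conf "set dial" statement for the given
# APN and dial string, with the same escaping as the hand-written version.
# Each single-quoted argument below is one output line; a backslash inside
# single quotes is literal, so the lines are printed exactly as written.
make_set_dial() {
    sd_apn=$1
    sd_number=$2
    printf '%s\n' \
        ' set dial "ABORT ERROR \' \
        '        ABORT BUSY \' \
        '        ABORT NO\\sCARRIER \' \
        '        TIMEOUT 5 \"\" \' \
        '        AT OK \' \
        '        ATH OK \' \
        '        AT+CGDCONT=1,\\\"IP\\\",\\\"'"$sd_apn"'\\\" OK \' \
        '        \\dATD'"$sd_number"' TIMEOUT 40 CONNECT"'
}

# Values from this post: APN "internet", dial string "*99#".
make_set_dial internet '*99#'
```

The output can be pasted into the default: section of ppp.conf. Remember to quote the dial string on the command line, since * and # both mean something to the shell.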

Configuring the right dial chat(8) script is always going to involve a bit of trial and error. Some modems might need an init string (meaning one or more extra AT commands that need to be executed before dialing with ATD) but these two modems both work fine without an init string. /var/log/ppp.log has lots of output during connecting so be sure to tail it while working on this.

The two profiles are named telmore_quectel and telmore_huawei. The only things I need to define in them are the device path, the ifaddr line which sets the IP configuration, and the default route with add default HISADDR. Usually (and weirdly) the ifaddr line only gets to dictate the IP for the other end:

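tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 10.130.25.54 --> 10.100.100.2/32
        groups: tun
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 24622
tun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 10.132.101.230 --> 10.200.200.2/32
        groups: tun
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 85756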

Note how the IP of the other end of the tunnel is as specified in ppp.conf but my own end has an assigned IP from the ISP. The strange world of ppp!
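For monitoring it can be handy to pull the two addresses out of the ifconfig output. The following is a small sketch that reads ifconfig output on stdin and prints the local and peer address of each point-to-point inet line. The function name tun_addrs is made up for this example, and it assumes the "inet local --> peer" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig output on stdin and print
# "<local address> <peer address>" for each point-to-point inet line.
tun_addrs() {
    # Field 2 is the local address, field 4 the peer; strip a "/32" suffix.
    awk '$1 == "inet" && $3 == "-->" { sub(/\/.*/, "", $4); print $2, $4 }'
}

# Sample line copied from the tun0 output above.
printf '        inet 10.130.25.54 --> 10.100.100.2/32\n' | tun_addrs
```

On the live system this would be used as ifconfig tun0 | tun_addrs.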

Problem: Multiple PPP Profiles

Since I have two modems I need to run two instances of ppp. This is supported in the /etc/rc.d/ppp script but it doesn't work very well :( I used the following /etc/rc.conf lines to enable the two ppp profiles and to pin the tun interface each of them will use, so firewall configuration is easier:

[tykling@container1 ~]$ grep ppp /etc/rc.conf
ppp_enable="YES"
ppp_profile="telmore_quectel telmore_huawei"
ppp_telmore_quectel_mode="ddial"
ppp_telmore_quectel_unit="0"
ppp_telmore_huawei_mode="ddial"
ppp_telmore_huawei_unit="1"
[tykling@container1 ~]$

The profile names are freehand. I named mine $isp_$manufacturer so I can tell them apart more easily. This example uses the same provider telmore for both modems, but one of the SIM cards was replaced for the final setup, so the two modems use two different cellular networks. This means it should be possible to stay online even if one of the providers is having a bad day.

This setup means that two instances of ppp are started on boot, and the _mode="ddial" means it will keep trying to redial forever if it loses the connection. Finally the _unit="x" determines the number of the tun interface which will be used by this ppp instance. Note that it doesn't say ppp_telmore_huawei_unit="tun1" but merely has the number of the interface, so just ppp_telmore_huawei_unit="1".

Starting and stopping them together works well, what doesn't work is starting a profile individually if another profile is already running:

[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 83309 92053.
[tykling@container1 ~]$ sudo service ppp stop telmore_huawei
Stopping PPP profile: telmore_huawei.
[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 83309.
[tykling@container1 ~]$ sudo service ppp start telmore_huawei
ppp already running?  (pid=83309).
[tykling@container1 ~]$ sudo service ppp stop telmore_quectel
Stopping PPP profile: telmore_quectel.
[tykling@container1 ~]$ sudo service ppp status
ppp is not running.
[tykling@container1 ~]$ sudo service ppp start telmore_quectel
Starting PPP profile: telmore_quectel
.
[tykling@container1 ~]$ sudo service ppp start telmore_huawei 
ppp already running?  (pid=69181).
[tykling@container1 ~]$ sudo service ppp status
ppp is running as pid 69181.
[tykling@container1 ~]$

This means it is impossible to (re)start one of the links while keeping the other one running, a pretty significant flaw. Maybe I am the first person ever to actually use this feature.

Furthermore, the ppp_stop_profile() function is missing a pwait call. Since it sometimes takes ppp a second or two to hang up and disconnect, a simple service ppp restart doesn't always work either:

[tykling@container1 ~]$ sudo service ppp restart
Stopping PPP profile: telmore_quectel telmore_huawei.
ppp already running?  (pid=23948).
[tykling@container1 ~]$ sudo service ppp status 
ppp is not running.
[tykling@container1 ~]$ # great, now I'm offline :(
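
The race above can be worked around by waiting until the old ppp process is really gone before starting a new one. The following is a minimal sketch, not a patch for the rc.d script: wait_for_exit is a hypothetical helper name, and it polls with kill -0 where FreeBSD's pwait(1) would do the same job.

```shell
# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper: poll once per second until PID no longer exists.
# Returns 0 when the process is gone, 1 if it is still alive after the timeout.
# Assumption: run as root or as the process owner, since kill -0 also fails
# on a permission error and that would be mistaken for "exited".
wait_for_exit() {
    wfe_pid=$1
    wfe_left=${2:-10}
    while kill -0 "$wfe_pid" 2>/dev/null; do
        [ "$wfe_left" -gt 0 ] || return 1
        sleep 1
        wfe_left=$((wfe_left - 1))
    done
    return 0
}
```

With the pid reported by service ppp status, something like service ppp stop; wait_for_exit 23948 10 && service ppp start would only start ppp again once the old process has hung up.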

So after spending an afternoon trying to fix the issues I gave up and decided to file a bug and run ppp under supervisord instead. The following supervisord.d config snippets do the trick:

[tykling@container1 ~]$ cat /usr/local/etc/supervisord.d/ppp_telmore_quectel.conf 
; run ppp for telmore isp, quectel modem
[program:ppp_telmore_quectel]
command=/usr/sbin/ppp -foreground -nat -unit0 telmore_quectel
user=0
stdout_syslog=True
stderr_syslog=True
startsecs=5
autostart=True
priority=1

[tykling@container1 ~]$ cat /usr/local/etc/supervisord.d/ppp_telmore_huawei.conf 
; run ppp for telmore isp, huawei modem
[program:ppp_telmore_huawei]
command=/usr/sbin/ppp -foreground -nat -unit1 telmore_huawei
user=0
stdout_syslog=True
stderr_syslog=True
startsecs=5
autostart=True
priority=1
[tykling@container1 ~]$

So I now have two ppp profiles running. The Quectel modem uses tun0 and the Huawei modem uses tun1:

[tykling@container1 ~]$ sudo ps auxww | grep sbin/ppp | grep -v grep
root      1485   0.1  0.1  16132  5076  -  Ss   15:24      3:30.22 /usr/sbin/ppp -quiet -ddial -nat -unit0 telmore_quectel
root      4677   0.0  0.1  16300  5052  -  Ss   15:24      0:32.66 /usr/sbin/ppp -quiet -ddial -nat -unit1 telmore_huawei
[tykling@container1 ~]$

I can see what they are doing by tailing /var/log/ppp.log.

ppp.linkup and multi-FIB

Both of my ppp profiles have the line add default HISADDR, meaning they will set the default gateway of the server after a successful ppp connection. Since there can only ever be one default gateway, it will be a bit random which of the modems is currently handling the default gateway traffic: it will just be the last ppp profile to have connected at any given time. This is okay! The modems are equally fast, and I don't really care which modem is the primary right now. But I am concerned that something could go wrong and the default gateway might suddenly point nowhere, and I would then be unable to contact the server without starting the car and going for a long drive.

I decided to run with multi-FIB support and have a routing table dedicated to each modem, in addition to the default routing table. This will help me make a couple of "emergency backdoors" into the system, as a last resort in case of connectivity issues.

I added the line net.fibs=5 to /boot/loader.conf and after a reboot I can check with sysctl(8) that the setting has taken effect:

[tykling@container1 ~]$ sysctl net.fibs     
net.fibs: 5
[tykling@container1 ~]$

NOTE: The current default in FreeBSD is to add interface routes in all FIBs, which is a weird and stupid default. The following entry in /etc/sysctl.conf makes the FIBs act as completely separate routing tables, as you would expect:

[tykling@container1 ~]$ grep fib /etc/sysctl.conf 
net.add_addr_allfibs=0
[tykling@container1 ~]$

There is some chatter about changing the default, which would be a welcome change. It is terribly confusing as it is now, because extra FIBs start out with a whole bunch of interface routes from FIB 0.
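
Both settings can be checked in one go before relying on the extra FIBs. This is a small sketch under the assumptions of this post: fib_settings_ok is a hypothetical helper name, and it only reads the two sysctl values mentioned above.

```shell
# fib_settings_ok MIN_FIBS
# Hypothetical helper: succeed only if the kernel has at least MIN_FIBS
# routing tables and interface routes are NOT copied into every FIB.
# Returns 2 if a sysctl cannot be read (for example on a non-FreeBSD system).
fib_settings_ok() {
    fso_fibs=$(sysctl -n net.fibs) || return 2
    fso_all=$(sysctl -n net.add_addr_allfibs) || return 2
    [ "$fso_fibs" -ge "$1" ] && [ "$fso_all" -eq 0 ]
}
```

For the setup described here, fib_settings_ok 3 || echo "check loader.conf and sysctl.conf" would flag a missing reboot or a forgotten sysctl.conf entry.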

Anyway, multi-FIB support in FreeBSD is a great tool for advanced network setups! A FIB is simply a routing table. Usually there is just one, FIB 0, which is the one you are used to interacting with when typing netstat -rn to show the routing table.

Enabling more than one FIB does nothing until you configure some routes in the new FIB and then ask some applications to use the alternative FIB instead of the default FIB 0. This is done with the setfib(1) command as demonstrated here:

[tykling@container1 ~]$ netstat -rnf inet | grep default
default            10.200.200.2       US         tun1
[tykling@container1 ~]$ setfib 0 netstat -rnf inet | grep default
default            10.200.200.2       US         tun1
[tykling@container1 ~]$ setfib 1 netstat -rnf inet | grep default
default            10.100.100.2       UGS        tun0
[tykling@container1 ~]$ setfib 2 netstat -rnf inet | grep default
default            10.200.200.2       UGS        tun1
[tykling@container1 ~]$

What the above output tells me is that the Huawei modem (tied to tun1) is currently handling the traffic in FIB 0. The setfib command sets the active routing table for the command following it. Since FIB 0 is the default FIB, prefixing a command with setfib 0 is a no-op: it is the same as running the command without setfib. Using setfib 1 makes it use FIB 1, so the usual routing table is not considered at all when executing the program. setfib can be used with any command:

[tykling@container1 ~]$ setfib 1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=116 time=35.991 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=50.908 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=48.970 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 35.991/45.290/50.908/6.622 ms
[tykling@container1 ~]$ setfib 2 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=116 time=39.765 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=38.633 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=35.848 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=116 time=49.123 ms
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 35.848/40.842/49.123/4.989 ms
[tykling@container1 ~]$
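
The same ping test can be wrapped in a small function to check each modem on its own. This is only a sketch: check_fib is a hypothetical name, 8.8.8.8 is just the target used in the transcript above, and the -q, -c and -t flags are those of FreeBSD's ping (on Linux -t means something else).

```shell
# check_fib FIB [TARGET]
# Hypothetical helper: ping TARGET through the routing table FIB and
# report whether the link behind that FIB is usable.
# FreeBSD ping flags: -q quiet, -c 3 three packets, -t 10 give up after 10 s.
check_fib() {
    cf_fib=$1
    cf_target=${2:-8.8.8.8}
    if setfib "$cf_fib" ping -q -c 3 -t 10 "$cf_target" >/dev/null 2>&1; then
        echo "fib $cf_fib: up"
    else
        echo "fib $cf_fib: down"
        return 1
    fi
}
```

Running check_fib 1; check_fib 2 then tells the Quectel and Huawei links apart, regardless of which one currently owns the default route in FIB 0.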

The full routing tables look like this:

[tykling@container1 ~]$ netstat -rnf inet
Routing tables

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.200.200.2       US         tun1
10.100.100.2       link#6             UHS        tun0
10.130.25.54       link#6             UHS         lo0
10.132.101.230     link#7             UHS         lo0
10.200.200.2       link#7             UHS        tun1
127.0.0.1          link#4             UH          lo0
[tykling@container1 ~]$ setfib 1 netstat -rnf inet
Routing tables (fib: 1)

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.100.100.2       UGS        tun0
10.100.100.2       tun0               UHS        tun0
127.0.0.1          lo0                UHS         lo0
[tykling@container1 ~]$ setfib 2 netstat -rnf inet
Routing tables (fib: 2)

Internet:
Destination        Gateway            Flags     Netif Expire
default            10.200.200.2       UGS        tun1
10.200.200.2       tun1               UHS        tun1
127.0.0.1          lo0                UHS         lo0
[tykling@container1 ~]$

So FIB 0 has a host route (flag H) for the remote ends of each ppp connection, so it knows which interface to use to reach that IP. It also has the default route of course, which points to the remote end of the ppp connection running on tun1. These routes are configured by the ppp.linkup script.
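
To see at a glance which modem each FIB points at, the default route can be picked out of the netstat output. A minimal sketch, assuming the column layout shown above (destination, gateway, flags, interface); default_route is a hypothetical helper name.

```shell
# default_route
# Hypothetical helper: read `netstat -rnf inet` output on stdin and print
# the gateway and interface of the first default route, e.g. "10.100.100.2 tun0".
# Prints nothing if the table has no default route.
default_route() {
    awk '$1 == "default" { print $2, $4; exit }'
}

# Example loop over the three FIBs used in this post (FreeBSD only):
#   for fib in 0 1 2; do
#       printf 'fib %s: ' "$fib"
#       setfib "$fib" netstat -rnf inet | default_route
#   done
```

An empty result for FIB 1 or 2 would mean the corresponding linkup commands have not run, or the link is down.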

Both of the ppp processes run in FIB 0 (unsurprisingly, adding ppp_telmore_quectel_fib="1" to /etc/rc.conf did nothing), so I've created a custom /etc/ppp/ppp.linkup script instead. Linkup scripts are used to do stuff after a ppp connection is established. I use it to add routes to FIB 1 and 2 as needed:

[tykling@container1 ~]$ cat /etc/ppp/ppp.linkup 
telmore_quectel:
 !bg /usr/sbin/setfib 1 /sbin/route delete default
 !bg /usr/sbin/setfib 1 /sbin/route add 10.100.100.2 -iface INTERFACE
 !bg /usr/sbin/setfib 1 /sbin/route add default 10.100.100.2

telmore_huawei:
 !bg /usr/sbin/setfib 2 /sbin/route delete default
 !bg /usr/sbin/setfib 2 /sbin/route add 10.200.200.2 -iface INTERFACE
 !bg /usr/sbin/setfib 2 /sbin/route add default 10.200.200.2
[tykling@container1 ~]$

Each section is named for the ppp profile it relates to. The !bg statement means "execute this command in the background", and setfib is used to make the changes in the desired FIB. First I delete any existing default gateway in the FIB, then I add the host/interface route (remember, the extra FIBs have no routes at all, not even interface routes!), and finally I set the default gateway in the FIB.
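
Since the two sections differ only in profile name, FIB number and gateway address, they can be generated rather than typed by hand. This is a sketch of that idea, not how the file above was made: linkup_stanza is a hypothetical helper, and the gateway addresses are the ones from the listing.

```shell
# linkup_stanza PROFILE FIB GATEWAY
# Hypothetical helper: print one ppp.linkup section in the same three-step
# order as above: drop the old default route, add the host route via the
# interface, then add the default route. INTERFACE is left as a literal
# word because ppp substitutes it when it runs the commands.
linkup_stanza() {
    ls_profile=$1
    ls_fib=$2
    ls_gw=$3
    printf '%s:\n' "$ls_profile"
    printf ' !bg /usr/sbin/setfib %s /sbin/route delete default\n' "$ls_fib"
    printf ' !bg /usr/sbin/setfib %s /sbin/route add %s -iface INTERFACE\n' "$ls_fib" "$ls_gw"
    printf ' !bg /usr/sbin/setfib %s /sbin/route add default %s\n' "$ls_fib" "$ls_gw"
}
```

Calling linkup_stanza telmore_quectel 1 10.100.100.2 and linkup_stanza telmore_huawei 2 10.200.200.2 reproduces the two sections, which is handy if the file is templated from Ansible or a third modem is added.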

This setup means I can now tie applications to FIB 1 or 2 and be sure they are using a specific modem. Armed with this knowledge I could, for example, start a VPN client in each FIB, so I always have a fallback connection in case FIB 0 isn't working for whatever reason. I ended up using Tor as a poor man's VPN, where I run an onion service for SSH in each FIB:

tor_enable="YES"
tor_instances="telmore telmore2 "
tor_telmore_fib="1"
tor_telmore2_fib="2"

The multi-instance support in the Tor rc.d script works much better than in the ppp rc.d script. The above configuration runs the default Tor instance "main" in FIB 0, and defines two more instances, telmore and telmore2. It then specifies that the telmore instance should run in FIB 1 and the telmore2 instance in FIB 2. The end result is this (add -O fib to ps to see which FIB each process is running in):

[tykling@container1 ~]$ sudo ps auxO fib | grep -E "(FIB|bin/tor)" | grep -v grep
USER       PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED       TIME COMMAND            PID FIB TT  STAT       TIME COMMAND
_tor     84367   0.2  1.0  49592 40296  -  S    08:28      0:16.70 /usr/local/bin/t 84367   1  -  S       0:16.70 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc@telmore --PidFile /var/run/tor/tor.pid@telmore --RunAsDaemon 1 --DataDirectory /var/db/tor/instance@telmore
_tor      6539   0.0  0.9  47888 38504  -  S    20:13      1:20.16 /usr/local/bin/t  6539   2  -  S       1:20.16 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc@telmore2 --PidFile /var/run/tor/tor.pid@telmore2 --RunAsDaemon 1 --DataDirectory /var/db/tor/instance@telmore2
_tor     10155   0.0  1.0  49936 40600  -  S    20:13      5:24.00 /usr/local/bin/t 10155   0  -  S       5:24.00 |-- /usr/local/bin/tor -f /usr/local/etc/tor/torrc --PidFile /var/run/tor/tor.pid --RunAsDaemon 1 --DataDirectory /var/db/tor
[tykling@container1 ~]$

Note how each Tor instance is running in a different FIB. Tor itself has no idea this is happening, it just makes connections and the kernel routes them as usual, but with a different routing table.
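
The FIB column can also be checked from a script, for instance to confirm that a Tor instance really ended up in the intended FIB. A small sketch, assuming ps is asked for just the pid, fib and command columns; procs_in_fib is a hypothetical helper name.

```shell
# procs_in_fib FIB PATTERN
# Hypothetical helper: read "PID FIB COMMAND" lines on stdin (first line is
# the ps header) and print the pids of processes running in FIB whose
# command line contains the literal string PATTERN.
procs_in_fib() {
    awk -v fib="$1" -v pat="$2" \
        'NR > 1 && $2 == fib && index($0, pat) { print $1 }'
}

# Example on FreeBSD:
#   ps -axo pid,fib,command | procs_in_fib 1 bin/tor
```

An empty result for FIB 1 or 2 would mean no matching Tor process is bound to that modem's routing table.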

I could have used OpenVPN or SSH tunnels or something else, but I like Tor and it requires no extra infrastructure at the other end. YMMV.

Conclusion

Apart from a few headaches around AT commands, ppp rc.d scripts and other little snags along the way, this was pretty smooth to set up. FreeBSD is a fantastic tool for this sort of thing. The multi-FIB network stuff works so well! Since I had all the Ansible roles for the basic setup ready, configuring everything was a breeze. I dogfooded along the way to make sure the connections are stable, and I can report that running Ansible over a Tor Onion service over an LTE connection is not at all as painful as it sounds :)

The next thing I configured was Prometheus monitoring of the signal strength of the modems. I added it to a Grafana dashboard along with a few traffic and pps graphs for the two tun interfaces. The final result is a great tool to monitor the connectivity of our storage container.

Prometheus monitors the exporters via an Onion service running in FIB 0 so it uses whichever modem connected to the ISP most recently. It works great! Since the two modems use two different networks there is a decent chance we can find coverage even in the low coverage area.

That's about it! Well done if you got this far. Take care :)

Donating

I've recently signed up for GitHub Sponsors, meaning it is now easy to sponsor me and my work. If this post or some of my other writing, software or services have helped you, then you can consider becoming a sponsor.
