Multi-Vender MPSK deployment with Dynamic VLAN Assignment on FreeRadius 

Multi-Vender MPSK deployment with Dynamic VLAN Assignment on FreeRadius 

In today’s diverse networking environments, there’s a growing trend towards combining proprietary vendor solutions with open-source alternatives to achieve a balance of reliable performance and cost-effectiveness. This article explores this idea by demonstrating a Multi-Pre-Shared Key (MPSK) Wi-Fi solution using both Juniper APs and OpenWiFi APs, where authentication and accounting is managed through FreeRadius server.

The Network Setup

At the heart of the network, we have the 3.0 version of the FreeRadius server—an open-source solution capable of handling authentication for a variety of devices which I am running on my home NUC Ubuntu box, alongside a ISC-DHCP-Server to handle DHCP requests, and some iptables NAT rules to nat clients to the outside world . In this scenario, we’re looking at the Juniper AP45 and the TIP OpenWiFi CIG WF-196, both are configured through their cloud controller and added as clients within the FreeRadius’s clients.conf file.

# Juniper AP45
client a83a79a8367e {
    secret = secret
    ipaddr = 10.0.122.18
}

# OpenWiFi WF-196
client d4babaa1484 {
    secret = secret
    ipaddr = 10.0.122.23
}

For continuous roaming tests between these two APs, the Candela LANforge WiFi Roam test is employed. The test basically allows the user to construct a script that can run for a set amount of time or indefinitely. The operator can determine where, when, and how long the client should roam.

Configuring MPSK

The MPSK configuration requires minimal adjustments on the FreeRadius server. For each device, the MAC address and the PSK are entered into the users file:

# LANforge AX210
e8f4080324ba Cleartext-Password := "e8f4080324ba"
        Tunnel-Type = 13,
        Tunnel-Medium-Type = 6,
        Tunnel-Private-Group-Id = 1000,
        # Expected PSK for OpenWiFi using the tunneled password attribute
        Tunnel-Password = 12345678,
        # Expected PSK for Juniper in a Cisco AVPair attributes
        Cisco-AVPair = "psk-mode=ascii",
        Cisco-AVPair += "psk=12345678"

Notably, TIP OpenWiFi APs require the Tunnel-Password attribute, while the Juniper APs utilize the Cisco-AVPair attributes. The 2 different approaches show case how an homogenous ecosystem can be built by integrating proprietary and open source solutions with a universal authentication system like FreeRadius.

Mixed Ecosystem Benefits

The merger of TIP OpenWiFi and proprietary solutions like Juniper’s APs provides numerous benefits:

  • Cost Efficiency: TIP OpenWiFi APs like the WF-196 are significantly more affordable than many proprietary solutions and can sometimes be the only realistic solution in industries with limited budget. A simple comparison of the two APs demonstrates why the open source solution might be a game changer for industries where budgets are limited.
  • Flexibility: TIP OpenWiFi’s open-source nature allows for extensive customization and integration. The open source code allows for customizations that might not be available with the vendor locked solution.
  • Innovation: Proprietary solutions often come with advanced features and robust support, while open-source projects bring community-driven innovation. As a member of the OpenWiFi community you have the opportunity to lead the development of new features on corner cases that the vendor might not be interested in developing.

Configuraing The OpenWiFi Controller: An SDK with a UI

The OpenWiFi controller stands out as an SDK (Software Development Kit) with a user interface, designed for developers and organizations to create their own user-friendly controllers.below we show a screenshot of the configs in a JSON file format. Not all fields have been expanded to not overwhelm the reader. Only the important parts that highlight the settings for the SSID and the RADIUS server.

The Juniper Controllers Configuration

Configuring the Juniper Controller was straightforward and uncomplicated. As someone who had never used it before, I felt as if I had been using it for years. Every configuration was where you expected it to be, and the transitions between steps were effortless.

Running the Test.

The brief movie below shows a wireless STA managed by the Canddela LANforge roaming between the AP45 (channel 48) and the WF-196 (channel 44). Two additional radios were utilized as sniffers to catch the traffic. An EAPOL display filter was employed to separate the 4-Way Handshake from the rest of the air traffic.

I did not get into the specifics of how to set up a FreeRadius server or the more in-depth aspects of the Juniper Mist MPSK setups because that was not the purpose of this article. If you need assistance with the issues listed above, please see my wonderful friend Mohammad Al‘s post at https://artofrf.com/2024/04/02/mist-mpsk-with-freeradius/.

DHCP Option 82 and the endless possibilties

DHCP Option 82 and the endless possibilties

Recently, the development team at OpenWiFi has added a DHCP-Relay feature to the OpenWiFi firmware. I worked with the Q&A team to come up with a test plan for the feature. At first it sounded straight forward. All we have to do is capture the DHCP request packets leaving the AP and verify whether they are sent out as broadcast or unicast. The other aspect we needed to verify was that under option 82 the development team allowed for 3 values to be forwarded to the DHCP server under the 1st and 2nd sub-options, (Agent Circuit ID and Agent Remote ID) the SSID, the VLAN, or the AP MAC address but, this could also be verified with the same simple Wireshark capture.

This is all good but, is it enough? Does that mean that the DHCP-Relay will work as a part of an ecosystem? Will the DHCP server recognize the information sent under option 82 and filter clients accordingly ? Those questions made me realize that a simple capture is insufficient to guarantee whether or not this new feature actually works. We needed to come up with a scenario where this feature is tested under real world conditions. A test plan that mimics how this feature will be used in the field.

But, first let’s take a step back and refresh our memory on what does a DHCP Relay do and why is it important ? For a client to obtain an IP from a DHCP server it has to go through the DORA frame work.

The process starts with broadcasting a Discover packet with a destination address 255.255.255.255. If there is an active DHCP sever on the same VLAN or subnet it will reply with an Offer message and the client and server go through the DORA framework. However, sometimes the DHCP server is on a different VLAN or sets in the cloud serving many locations at once. We all know that broadcasts are dropped by the routers so how can a client reach a DHCP server that doesn’t live on the same subnet. This is where the DHCP-Relay comes into play. The DHCP-Relay intercepts the Discover packet and transform it on behave of the client from a broadcast to a unicast that targets a specific preconfigured DHCP server.

Considering all of the above, it was time to get to the whiteboard. I decided to use my home network as the ground for testing. My home router which is just an Ubuntu NUC running linux will act as the NAT router for 2 new VLANs 100, and 200 (along side my native VLAN subnet 10.0.0.0/16 that I use for my home devices and AP and switches management interfaces). My Ubuntu server will provide DHCP pools for the 2 new VLANs through ISC-DHCP-Server service. I needed a switch to tie it all together so I bought a managed L3 TP-Link switch (TL-SG2008P). I created VLANs on it and connected everything together.

I configured the OpenWiFi AP to act as the DHCP-Relay for any clients connected on the 2 SSIDs I created, WAN100 and WAN200 (WAN100 -> VLAN100, WAN200-> VLAN200). I also configured the DHCP-Relay to include the SSID name under the 1st sub-option of option 82. The reason for that is to use this piece of information as the basis for filtering. The DHCP server will decide on what pool should the clients gets its IP from.

The real challenge laid with understanding and configuring the ISC-DHCP-Server to distinguish between the different request and assign the correct IP address.

I created 2 classes to filter sub-option1 under DHCP option 82. I used the allow and deny statement to filter which pool should serve each VLAN. I deep dove into the packet capture to find out the offset and length of the information included in sub-option 1 to input the correct values for the match if condition statement. This required a capture of the unicast DHCP Discover packet and finding out the number of bytes included in the sub-option.

It can be seen in the screenshot above the total number of highlighted bytes are 8 and the offset is 0 since we started counting from the first received byte.

All was left now if to create 2 STAs and connect each one to a different SSID on the AP and hope they get the correct IP addresses. I took me around 3 days to perfect the setup but at the end the 2 virtual STAs I created on my Candela LANforge box acquired the correct IP addresses with respect to the SSIDs they were connected to.

The 2 highlighted STAs wlan0 and wlan1 are shown to be connected to their respected SSIDs and the IP addresses match the pools assigned to those SSIDs.

The DHCP-Relay feature combined with the ability to filter based on the information included in option 82 on the DHCP server side opens up countless opportunities for different deployment scenarios. Clients can identified and assigned to specific pools based on many different aspects. For example, one can include the AP MAC to differentiate clients based on their physical location, or assign variable lease times based on whether this is a guest or employee VLAN.

DHCP issues with IoT devices?  How the misbehavior of few IoT devices made me appreciate OpenWiFi more

DHCP issues with IoT devices?  How the misbehavior of few IoT devices made me appreciate OpenWiFi more

As a new fan for home automation, I’ve recently started a journey to install several IoT devices that improve the convenience and effectiveness of my daily life. 95% of those devices are 2.4GHz only, which has a unique set of challenges to begin with. A few of the gadgets I have so far installed are a Govee water leak detector, a Ring Amazon Camera, and TP-Link light switches and plugs. Unfortunately, when Proxy-ARP was enabled on my personal OpenWiFi APs setup, I had an issue where some IoT devices were unable to receive IP addresses.

The Govee Wi-Fi gateway and the Amazon Ring Camera were the main devices that displayed the IP address acquisition problem in the presence of active Proxy-ARP functionality on the AP. Understanding how devices behave during the IP acquiring process is crucial for identifying the root cause of this issue.

When an IoT device (or just any device for that matter) connects to a network, it requests an IP address from the network’s DHCP server. To ensure that the IP address it receives is available for use, the device sends a special type of ARP (Address Resolution Protocol) called the DHCP ARP Probe to check if any other devices on the network are using the same IP address. The expectation is that if the IP address is free, no response will be received, indicating its availability. The device may, however, think the IP address is already in use if it receives a response to its ARP Probe and declines the DHCP offer.

The IoT devices’s misbehavior manifests when Proxy-ARP is enabled on the AP. In this scenario, when an IoT device sends an ARP probe to validate IP address availability, the access point responds on behalf of the device. However, the IoT device fails to recognize that the MAC address in the response from the Proxy-ARP is its own.

As a result, the IoT device mistakenly believes that the IP address it acquired from the DHCP server is already in use since it receives a response to its ARP probe. Consequently, the device declines the DHCP offer, causing it to fail in acquiring an IP address and establishing connectivity.

As it can be seen in both packet captures for the Amazon Ring Camera and the Govee WiFi Gateway, both devices decline the DHCP offer after receiving an ARP response to the ARP probe of the offered IP address by the DHCP server.

On the other hand, the remaining devices (about 45 devices from different vendors) on both the 5GHz and 2.4GHz bands recognize the MAC addresses in the responses as being their own MAC addresses and accept the DHCP IP offers. When the Proxy-ARP capability is enabled, an iPhone 13 in the example below behaves as predicted.

Disabling the Proxy-ARP feature on the 2.4GHz SSID (as 90% of IoT devices can only operate on 2.4GHz) was the simplest and most direct solution to the issue. it worked… in my relatively quiet 2.4GHz home environment but, do I really want to compromise a topology design because of couple of misbehaving clients? and I can only picture the AirTime utilization nightmare that would arise in a larger, industrial, and much noisy environment.

I wanted to conduct one more test to further my conviction that this is a device-specific issue, which I was 99% certain of at this time. To my amazement, the Ring Camera functioned flawlessly with Proxy-ARP enabled on the SSID when I tested it against a Ruckus R730 AP that I had lying around. This prompted me to return to the drawing board and think about additional key factors that may have influenced the issue on the AP side.

Reading about how Proxy-ARP is implemented in OpenWiFi (or, to be more precise, the Linux kernel) and discussing the issue with the community. We uncovered that the ARP bounce-back capability is enabled by default in the Linux Kernel. This meant that the Kernel was replying back to the source of the ARP Probe and telling them they are the owner of that MAC address. Now all I had to is to disable that feature in the Kernel and Ring Camera should work with Proxy-ARP enabled.

Disabling the a feature in Kernel’s and building a new functional image may sounds like a tremendous task to take on that could weeks or even months. Imagine all the support nightmare you will have to go through with a vendor to just get such a fix included in their next release. The beauty of OpenWiFi being open-source is that you can easily reconfigure any fundamental functionality and create a new image to mitigate against any issues, like misbehaving clients for example. Utilizing the CICD pipeline of OpenWiFi to disable the ARP bounce-back feature in the Kernel, a new dev image with the Kernel ARP bounce-back feature set to disable from one of the program maintainers took only a few hours.

The fresh AP NOS image had the capacity to correct the misbehavior of a few IoT devices while utilizing less AirTime since I was able to enable the Proxy-ARP capabilities. While it fixed the issue, I wanted to also make sure there are no unintended consequences.  Running the new image through out nightly open-source 5000 test cases ensued that I had a stable image that is ready for deployment.

My objective when begin to write the article was to present an interesting technical case that could potentially save someone from losing a few days of their life attempting to troubleshoot it. By the time the draft was finished, I could clearly see how the message had evolved to emphasize how this case demonstrates the strength and potential of OpenWiFi and its transformative open and collaborative lifecycle, which gives everyone a voice, the power to directly affect and prioritize roadmaps, and the agility to ensure the stability of the ecosystem.

Screenshots of the packet capture with Proxy-ARP functionality is disabled on the 2.4GHz SSID. The IoT devices tend to send around 3-4 ARP Probes before it declares to the network over an ARP announcement that it has updated it is IP address.