Multi-Vendor MPSK Deployment with Dynamic VLAN Assignment on FreeRADIUS

In today’s diverse networking environments, there’s a growing trend towards combining proprietary vendor solutions with open-source alternatives to balance reliable performance and cost-effectiveness. This article explores the idea by demonstrating a Multi-Pre-Shared Key (MPSK) Wi-Fi solution using both Juniper APs and OpenWiFi APs, where authentication and accounting are managed through a FreeRADIUS server.

The Network Setup

At the heart of the network is a FreeRADIUS 3.0 server, an open-source solution capable of handling authentication for a variety of devices. It runs on my home NUC Ubuntu box alongside an ISC-DHCP-Server instance that handles DHCP requests and a few iptables NAT rules that NAT clients to the outside world. In this scenario, we’re looking at the Juniper AP45 and the TIP OpenWiFi CIG WF-196. Both are configured through their respective cloud controllers and added as clients in the FreeRADIUS clients.conf file.

# Juniper AP45
client a83a79a8367e {
    secret = secret
    ipaddr = 10.0.122.18
}

# OpenWiFi WF-196
client d4babaa1484 {
    secret = secret
    ipaddr = 10.0.122.23
}

For continuous roaming tests between these two APs, the Candela LANforge WiFi Roam test is employed. The test basically allows the user to construct a script that can run for a set amount of time or indefinitely. The operator can determine where, when, and how long the client should roam.

Configuring MPSK

The MPSK configuration requires minimal adjustments on the FreeRADIUS server. For each device, the MAC address, the PSK, and the VLAN attributes are entered into the users file:

# LANforge AX210
e8f4080324ba Cleartext-Password := "e8f4080324ba"
        Tunnel-Type = 13,
        Tunnel-Medium-Type = 6,
        Tunnel-Private-Group-Id = 1000,
        # Expected PSK for OpenWiFi using the tunneled password attribute
        Tunnel-Password = 12345678,
        # Expected PSK for Juniper in a Cisco AVPair attributes
        Cisco-AVPair = "psk-mode=ascii",
        Cisco-AVPair += "psk=12345678"

Notably, TIP OpenWiFi APs require the Tunnel-Password attribute, while the Juniper APs use the Cisco-AVPair attributes. The two approaches showcase how a homogeneous ecosystem can be built by integrating proprietary and open-source solutions with a universal authentication system like FreeRADIUS.
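With more than a handful of devices, typing these entries by hand gets error-prone. Below is a minimal Python sketch that renders one users entry in the same layout as the example above. The function name render_mpsk_user and its parameters are my own invention for illustration and are not part of FreeRADIUS. The length check assumes a WPA2 passphrase of 8 to 63 characters.

```python
def render_mpsk_user(mac: str, psk: str, vlan_id: int) -> str:
    """Return a FreeRADIUS `users` entry for one MPSK client (illustrative helper)."""
    # Normalise the MAC to the bare lowercase form used as the user name above.
    name = mac.replace(":", "").replace("-", "").lower()
    if len(name) != 12 or any(c not in "0123456789abcdef" for c in name):
        raise ValueError(f"not a MAC address: {mac!r}")
    if not 8 <= len(psk) <= 63:
        raise ValueError("a WPA2 passphrase must be 8 to 63 characters")

    reply = [
        "Tunnel-Type = 13",                      # 13 = VLAN
        "Tunnel-Medium-Type = 6",                # 6 = IEEE-802
        f"Tunnel-Private-Group-Id = {vlan_id}",  # VLAN handed to the AP
        f"Tunnel-Password = {psk}",              # PSK as read by the OpenWiFi AP
        'Cisco-AVPair = "psk-mode=ascii"',       # PSK as read by the Juniper AP
        f'Cisco-AVPair += "psk={psk}"',
    ]
    head = f'{name} Cleartext-Password := "{name}"'
    # Reply items are indented and comma-separated; the last one has no comma.
    return head + "\n" + ",\n".join("        " + item for item in reply) + "\n"


if __name__ == "__main__":
    print(render_mpsk_user("E8:F4:08:03:24:BA", "12345678", 1000))
```

The output can be appended to the users file and picked up after a FreeRADIUS restart. Treat it as a starting point, not a provisioning tool.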

Mixed Ecosystem Benefits

Combining TIP OpenWiFi with proprietary solutions like Juniper’s APs provides numerous benefits:

  • Cost Efficiency: TIP OpenWiFi APs like the WF-196 are significantly more affordable than many proprietary solutions. A simple comparison of the two APs shows why the open-source option can be a game changer, and sometimes the only realistic choice, for industries with limited budgets.
  • Flexibility: TIP OpenWiFi’s open-source nature allows for extensive customization and integration, including customizations that might not be available with a vendor-locked solution.
  • Innovation: Proprietary solutions often come with advanced features and robust support, while open-source projects bring community-driven innovation. As a member of the OpenWiFi community, you have the opportunity to lead the development of new features for corner cases that a vendor might not be interested in addressing.

Configuring the OpenWiFi Controller: An SDK with a UI

The OpenWiFi controller stands out as an SDK (Software Development Kit) with a user interface, designed for developers and organizations to create their own user-friendly controllers. Below is a screenshot of the configuration in JSON format. To avoid overwhelming the reader, only the important parts are expanded: the settings for the SSID and the RADIUS server.

The Juniper Controller Configuration

Configuring the Juniper controller was straightforward and uncomplicated. Although I had never used it before, I felt as if I had been using it for years. Every setting was where I expected it to be, and the transitions between steps were effortless.

Running the Test

The short video below shows a wireless STA managed by the Candela LANforge roaming between the AP45 (channel 48) and the WF-196 (channel 44). Two additional radios were used as sniffers to capture the traffic. An EAPOL display filter was applied to separate the 4-Way Handshake from the rest of the air traffic.

I did not go into the specifics of how to set up a FreeRADIUS server or the more in-depth aspects of the Juniper Mist MPSK setup because that was not the purpose of this article. If you need help with those topics, please see my wonderful friend Mohammad Al‘s post at https://artofrf.com/2024/04/02/mist-mpsk-with-freeradius/.

DHCP Option 82 and the endless possibilities

Recently, the development team at OpenWiFi added a DHCP-Relay feature to the OpenWiFi firmware. I worked with the QA team to come up with a test plan for the feature. At first it sounded straightforward: capture the DHCP request packets leaving the AP and verify whether they are sent out as broadcast or unicast. The other aspect we needed to verify was option 82, where the development team allows one of three values (the SSID, the VLAN, or the AP MAC address) to be forwarded to the DHCP server under the 1st and 2nd sub-options (Agent Circuit ID and Agent Remote ID). This, too, could be verified with the same simple Wireshark capture.

This is all good, but is it enough? Does it mean that the DHCP-Relay will work as part of an ecosystem? Will the DHCP server recognize the information sent under option 82 and filter clients accordingly? Those questions made me realize that a simple capture is not enough to guarantee that this new feature actually works. We needed a scenario that tests the feature under real-world conditions: a test plan that mimics how it will be used in the field.

But first, let’s take a step back and refresh our memory on what a DHCP relay does and why it is important. For a client to obtain an IP address from a DHCP server, it has to go through the DORA (Discover, Offer, Request, Acknowledge) framework.

The process starts with the client broadcasting a Discover packet to the destination address 255.255.255.255. If there is an active DHCP server on the same VLAN or subnet, it replies with an Offer message, and the client and server complete the rest of the DORA exchange. However, sometimes the DHCP server is on a different VLAN or sits in the cloud serving many locations at once. Routers drop broadcasts, so how can a client reach a DHCP server that doesn’t live on the same subnet? This is where the DHCP-Relay comes into play. The DHCP-Relay intercepts the Discover packet and, on behalf of the client, transforms it from a broadcast into a unicast that targets a specific preconfigured DHCP server.
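To make the broadcast-to-unicast step concrete, here is a small Python model of what a relay does to a Discover. It is a sketch, not the OpenWiFi implementation: the DhcpPacket class, the relay function, and the IP addresses are all made up for illustration. The only protocol details it relies on are the giaddr (relay agent address) and hops fields of the DHCP header.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DhcpPacket:
    """Only the fields that matter for relaying (illustrative, not a full header)."""
    src_ip: str
    dst_ip: str
    giaddr: str = "0.0.0.0"  # relay agent address; empty until a relay fills it in
    hops: int = 0
    msg_type: str = "DISCOVER"


def relay(packet: DhcpPacket, relay_ip: str, server_ip: str) -> DhcpPacket:
    """Turn a client broadcast into a unicast aimed at the configured server."""
    if packet.dst_ip != "255.255.255.255":
        return packet  # not a broadcast, nothing to relay
    return replace(
        packet,
        src_ip=relay_ip,
        dst_ip=server_ip,
        # giaddr tells the server which subnet the client sits on;
        # it is only set by the first relay that sees the packet.
        giaddr=relay_ip if packet.giaddr == "0.0.0.0" else packet.giaddr,
        hops=packet.hops + 1,
    )


if __name__ == "__main__":
    discover = DhcpPacket(src_ip="0.0.0.0", dst_ip="255.255.255.255")
    # Hypothetical addresses: relay interface on the client VLAN, server elsewhere.
    print(relay(discover, relay_ip="10.100.0.2", server_ip="10.0.0.1"))
```

The server uses giaddr both to pick a matching subnet and to know where to send its Offer back.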

Considering all of the above, it was time to get to the whiteboard. I decided to use my home network as the testing ground. My home router, which is just a NUC running Ubuntu Linux, acts as the NAT router for two new VLANs, 100 and 200 (alongside my native VLAN subnet, 10.0.0.0/16, which I use for my home devices and for the AP and switch management interfaces). The same Ubuntu server provides DHCP pools for the two new VLANs through the ISC-DHCP-Server service. I needed a switch to tie it all together, so I bought a managed L3 TP-Link switch (TL-SG2008P), created the VLANs on it, and connected everything together.

I configured the OpenWiFi AP to act as the DHCP-Relay for any clients connected to the two SSIDs I created, WAN100 and WAN200 (WAN100 -> VLAN100, WAN200 -> VLAN200). I also configured the DHCP-Relay to include the SSID name under the 1st sub-option of option 82, so that the DHCP server can use it as the basis for filtering and decide which pool each client should get its IP address from.
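On the wire, option 82 is a nested type-length-value structure: the option itself carries code 82 and a length, and inside it each sub-option carries its own code (1 for Agent Circuit ID, 2 for Agent Remote ID) and length. The Python sketch below encodes that layout. The function name build_option82 is mine, and the assumption that the Circuit ID holds nothing but the raw SSID string is for illustration only. What the firmware actually puts there should be checked in a capture.

```python
def build_option82(circuit_id: bytes, remote_id: bytes = b"") -> bytes:
    """Encode DHCP option 82 with sub-option 1 and, optionally, sub-option 2."""

    def tlv(code: int, value: bytes) -> bytes:
        # One byte of code, one byte of length, then the value itself.
        if len(value) > 255:
            raise ValueError("value too long for a one-byte length field")
        return bytes([code, len(value)]) + value

    payload = tlv(1, circuit_id)      # sub-option 1: Agent Circuit ID
    if remote_id:
        payload += tlv(2, remote_id)  # sub-option 2: Agent Remote ID
    return tlv(82, payload)           # option 82: Relay Agent Information


if __name__ == "__main__":
    # Assumption: the relay places the SSID name, as ASCII, in the Circuit ID.
    print(build_option82(b"WAN100").hex(" "))
```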

The real challenge lay in understanding and configuring the ISC-DHCP-Server to distinguish between the different requests and assign the correct IP address.

I created two classes to filter on sub-option 1 of DHCP option 82, and used allow and deny statements to control which pool serves each VLAN. I dove deep into the packet capture to find the offset and length of the information included in sub-option 1 so I could enter the correct values in the match if condition. This required capturing the unicast DHCP Discover packet and counting the number of bytes included in the sub-option.

As the screenshot above shows, the total number of highlighted bytes is 8 and the offset is 0, since we start counting from the first received byte.
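The server-side logic boils down to "take a slice of sub-option 1 at a given offset and length, compare it to a known value, and pick a pool". The Python sketch below models that decision so the role of the offset and length is easy to see. It is not ISC-DHCP-Server configuration, and the pool names, the CLASSES table, and the 6-byte SSID-only Circuit ID are assumptions of mine. The real offset and length must come from your own capture, as they did above.

```python
def parse_suboptions(option82_payload: bytes) -> dict:
    """Split the option 82 payload into {sub-option code: value}."""
    subs, i = {}, 0
    while i + 2 <= len(option82_payload):
        code, length = option82_payload[i], option82_payload[i + 1]
        subs[code] = option82_payload[i + 2 : i + 2 + length]
        i += 2 + length
    return subs


# Hypothetical classes: (offset, length, expected bytes, pool name).
CLASSES = [
    (0, 6, b"WAN100", "vlan100-pool"),
    (0, 6, b"WAN200", "vlan200-pool"),
]


def select_pool(option82_payload: bytes):
    """Return the pool whose class matches the Circuit ID, or None."""
    circuit_id = parse_suboptions(option82_payload).get(1, b"")
    for offset, length, expected, pool in CLASSES:
        # Same idea as a substring match: slice, then compare.
        if circuit_id[offset : offset + length] == expected:
            return pool
    return None  # no class matched, so no pool is allowed to serve the client


if __name__ == "__main__":
    print(select_pool(bytes([1, 6]) + b"WAN200"))
```

A wrong offset or length makes every comparison fail silently, which is why counting the bytes in the capture matters so much.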

All that was left was to create two STAs, connect each one to a different SSID on the AP, and hope they got the correct IP addresses. It took me around 3 days to perfect the setup, but in the end the two virtual STAs I created on my Candela LANforge box acquired the correct IP addresses for the SSIDs they were connected to.

The two highlighted STAs, wlan0 and wlan1, are shown connected to their respective SSIDs, and their IP addresses match the pools assigned to those SSIDs.

The DHCP-Relay feature, combined with the ability to filter on the DHCP server side based on the information included in option 82, opens up countless opportunities for different deployment scenarios. Clients can be identified and assigned to specific pools based on many different criteria. For example, one can include the AP MAC to differentiate clients based on their physical location, or assign different lease times depending on whether the client is on a guest or an employee VLAN.

DHCP issues with IoT devices? How the misbehavior of a few IoT devices made me appreciate OpenWiFi more

As a new fan of home automation, I recently started installing several IoT devices that make my daily life more convenient. 95% of those devices are 2.4GHz only, which comes with a unique set of challenges to begin with. The gadgets I have installed so far include a Govee water leak detector, an Amazon Ring Camera, and TP-Link light switches and plugs. Unfortunately, when Proxy-ARP was enabled on my personal OpenWiFi AP setup, some IoT devices were unable to obtain IP addresses.

The Govee Wi-Fi gateway and the Amazon Ring Camera were the main devices that showed the IP address acquisition problem when Proxy-ARP was active on the AP. Understanding how devices behave during the IP acquisition process is crucial for identifying the root cause of this issue.

When an IoT device (or any device, for that matter) connects to a network, it requests an IP address from the network’s DHCP server. To ensure that the IP address it receives is available for use, the device sends a special type of ARP (Address Resolution Protocol) request, called the DHCP ARP Probe, to check whether any other device on the network is using the same IP address. The expectation is that if the IP address is free, no response will be received. If the device does receive a response to its ARP Probe, however, it may conclude that the IP address is already in use and decline the DHCP offer.

The IoT devices’ misbehavior manifests when Proxy-ARP is enabled on the AP. In this scenario, when an IoT device sends an ARP probe to validate IP address availability, the access point responds on behalf of the device. However, the IoT device fails to recognize that the MAC address in the Proxy-ARP response is its own.

As a result, the IoT device mistakenly believes that the IP address it acquired from the DHCP server is already in use since it receives a response to its ARP probe. Consequently, the device declines the DHCP offer, causing it to fail in acquiring an IP address and establishing connectivity.
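The difference between the two kinds of clients comes down to a single check. The Python sketch below models it under my own simplifying assumptions: the ArpReply class, the accepts_address function, and the sample addresses are made up for illustration, and real client firmware is certainly more involved than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpReply:
    sender_ip: str
    sender_mac: str


def accepts_address(offered_ip, own_mac, replies, ignore_own_mac):
    """Return True if the client keeps the offered address after probing."""
    for reply in replies:
        if reply.sender_ip != offered_ip:
            continue  # unrelated ARP traffic
        if ignore_own_mac and reply.sender_mac.lower() == own_mac.lower():
            continue  # the reply names this client itself, so no real conflict
        return False  # the address looks taken, so the client declines it
    return True


if __name__ == "__main__":
    mac = "aa:bb:cc:dd:ee:ff"  # hypothetical client MAC
    bounced = [ArpReply("10.0.0.50", mac)]  # probe answered with the client's own MAC
    print("tolerant client keeps address:", accepts_address("10.0.0.50", mac, bounced, True))
    print("strict client keeps address:", accepts_address("10.0.0.50", mac, bounced, False))
```

In this model, the devices that failed behave like the strict client, while the rest behave like the tolerant one.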

As can be seen in the packet captures for both the Amazon Ring Camera and the Govee WiFi Gateway, both devices decline the DHCP offer after receiving an ARP response to their ARP probe for the IP address offered by the DHCP server.

On the other hand, the remaining devices (about 45 devices from different vendors) on both the 5GHz and 2.4GHz bands recognize the MAC addresses in the responses as their own and accept the DHCP IP offers. In the example below, an iPhone 13 behaves as expected with Proxy-ARP enabled.

Disabling the Proxy-ARP feature on the 2.4GHz SSID (as 90% of IoT devices can only operate on 2.4GHz) was the simplest and most direct solution to the issue. It worked in my relatively quiet 2.4GHz home environment, but do I really want to compromise a topology design because of a couple of misbehaving clients? I can only picture the airtime utilization nightmare that would arise in a larger, industrial, and much noisier environment.

I wanted to conduct one more test to strengthen my conviction that this was a device-specific issue, which I was 99% certain of at this point. To my amazement, the Ring Camera functioned flawlessly with Proxy-ARP enabled on the SSID when I tested it against a Ruckus R730 AP that I had lying around. This prompted me to return to the drawing board and think about additional key factors that may have influenced the issue on the AP side.

After reading about how Proxy-ARP is implemented in OpenWiFi (or, to be more precise, in the Linux kernel) and discussing the issue with the community, we uncovered that the ARP bounce-back capability is enabled by default in the Linux kernel. This meant that the kernel was replying to the source of the ARP Probe and telling it that the probed IP address belongs to its own MAC address. Now all I had to do was disable that feature in the kernel, and the Ring Camera should work with Proxy-ARP enabled.

Disabling a feature in the kernel and building a new functional image may sound like a tremendous task that could take weeks or even months. Imagine the support nightmare you would have to go through with a vendor just to get such a fix included in their next release. The beauty of OpenWiFi being open-source is that you can easily reconfigure any fundamental functionality and create a new image to mitigate issues such as misbehaving clients. Using the OpenWiFi CI/CD pipeline, one of the program maintainers produced a new dev image with the kernel ARP bounce-back feature disabled in only a few hours.

The fresh AP NOS image corrected the misbehavior of a few IoT devices while using less airtime, since I was able to keep Proxy-ARP enabled. While it fixed the issue, I also wanted to make sure there were no unintended consequences. Running the new image through our nightly suite of 5,000 open-source test cases ensured that I had a stable image ready for deployment.

My objective when I began writing this article was to present an interesting technical case that could save someone from losing a few days of their life troubleshooting it. By the time the draft was finished, I could clearly see how the message had evolved to emphasize how this case demonstrates the strength and potential of OpenWiFi and its transformative open and collaborative lifecycle, which gives everyone a voice, the power to directly affect and prioritize roadmaps, and the agility to ensure the stability of the ecosystem.

Screenshots of the packet capture with Proxy-ARP disabled on the 2.4GHz SSID. An IoT device tends to send around 3-4 ARP Probes before it declares to the network, through an ARP announcement, that it has updated its IP address.

Why choose a 4×4 AP over a 2×2 one?

In today’s market, 2×2 APs are widely available, and several suppliers promote them as a less expensive, high-density deployment-capable alternative to the 4×4. On paper, 2×2 APs can theoretically offer a 2×2 client the same throughput numbers as a 4×4 one, which makes them more desirable.

The real question is: in practice, do they offer a 2×2 client the same throughput numbers? I’ve always known by heart that a 4×4 AP has about 3dB of beam-forming gain on the downlink because of the two extra chains, and about 3dB of gain on the uplink because of the MRC gain of the extra chains. That sounds wonderful, until you have to explain to someone why they should choose the more expensive 4×4 AP over the less expensive 2×2 AP when the manufacturer has assured them that the 2×2 AP can handle high-density installations and offer the same throughput numbers to a 2×2 client.
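The 3dB rule of thumb is just the ratio of chain counts expressed in decibels. Here is the arithmetic as a tiny Python sketch. It is an idealized figure that assumes perfectly coherent combining across chains, so real-world gains will differ, and the function name array_gain_db is mine.

```python
import math


def array_gain_db(chains_a: int, chains_b: int) -> float:
    """Ideal combining gain, in dB, of chains_a antennas relative to chains_b."""
    return 10 * math.log10(chains_a / chains_b)


if __name__ == "__main__":
    # Doubling the chain count from 2 to 4 doubles the ideal combining gain.
    print(f"4 chains vs 2 chains: {array_gain_db(4, 2):.2f} dB")
```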

I started looking for scientific evidence to back up my argument that in real-life scenarios a 4×4 AP is superior to a 2×2. During my research, I came across Wes Purvis’ outstanding talk from the 2018 WLPC in Phoenix. Wes demonstrated that a 4×4 AP had a gain over a 3×3 AP of roughly 2.4dB on the downlink and 1dB on the uplink. In multi-client scenarios, this resulted in a 15% increase in data rate and a 10% increase in throughput.

I was almost content with what I had discovered up to this point, but not quite. After all, this demonstrates that a 4×4 is superior to a 3×3, not that it is superior to a 2×2. What if the odd number of antennas is to blame for the 3×3’s subpar performance? The question might seem absurd or strange to us wireless engineers, but it illustrates the kind of question non-technical management might ask in an effort to understand why you selected the more expensive 4×4 AP.

As a result, I carried out more research and discovered this fantastic MATLAB example, which demonstrates the advantages of beam-forming from a 4×4 AP to a 2×2 client over techniques such as spatial expansion.
https://www.mathworks.com/help/wlan/ug/802-11ac-transmit-beamforming.html?fbclid=IwAR2ZBkgO_NV4Fz4e0-qwR7dmGusgv6fen9pRoLVnIQSbDpMx-rON799OOQo#responsive_offcanvas

I hypothesized that by just changing the number of transmitting antennas from 4 to 2, I could use the same illustration to demonstrate how beam-forming from a 4×4 AP to a 2×2 client is superior to beam-forming from a 2×2 AP to a 2×2 client. I executed the code twice under identical setup conditions, first for the case of 4×4=>2×2 and a second time for the case of 2×2=>2×2.

The setup evaluated a two-spatial-stream transmission using MCS4 on a 20MHz channel and the TGac channel model with a Model-B delay profile, from the AP to a client at a distance of 100 meters.

            4×4=>2×2 SS1   2×2=>2×2 SS1   4×4=>2×2 SS2   2×2=>2×2 SS2
EVM RMS     2.0%           4.7%           4.1%           8.4%
EVM (dB)    -33.9          -26.5          -27.7          -21.5

Error Vector Magnitude values for SS1 and SS2 for the 4×4=>2×2 and the 2×2=>2×2 cases

Constellation patterns for spatial streams 1 and 2 in the case of beam-forming from a 4×4 AP to a 2×2 client.

Constellation patterns for spatial streams 1 and 2 in the case of beam-forming from a 2×2 AP to a 2×2 client.

One can clearly see from the above results that the 4×4 AP had lower EVM values than the 2×2 AP. This would lead to improved SNR values in real-world scenarios, which in turn would enable the 4×4 AP to shift gears and switch to a higher MCS rate.
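The percentage and dB rows of the table are the same measurement in two units, related by EVM(dB) = 20·log10(EVM% / 100). The short Python sketch below reproduces the conversion from the percentages reported above and prints the per-stream advantage of the 4×4 AP. The function name evm_percent_to_db is mine. Because the percentages in the table are rounded, the converted values can differ from the dB row in the last digit.

```python
import math


def evm_percent_to_db(evm_percent: float) -> float:
    """Convert an RMS EVM given in percent to dB."""
    return 20 * math.log10(evm_percent / 100)


# EVM RMS values from the table: (4x4 => 2x2, 2x2 => 2x2) per spatial stream.
RESULTS = {"SS1": (2.0, 4.7), "SS2": (4.1, 8.4)}

if __name__ == "__main__":
    for stream, (evm_4x4, evm_2x2) in RESULTS.items():
        db_4x4 = evm_percent_to_db(evm_4x4)
        db_2x2 = evm_percent_to_db(evm_2x2)
        print(f"{stream}: 4x4 {db_4x4:.1f} dB, 2x2 {db_2x2:.1f} dB, "
              f"4x4 advantage {db_2x2 - db_4x4:.1f} dB")
```

In this simulation the 4×4 AP comes out roughly 6 to 7dB cleaner on each stream.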

CableLabs Innovation Bootcamp

The CableLabs Innovation Bootcamp was a unique experience. It was different from the wireless-oriented events that I have attended in the last couple of years. The camp was designed to introduce attendees to the FIRE framework of innovation. FIRE (Focus, Ideation, Ranking, and Execution) is a framework put together by Phil McKinney, the current CEO of CableLabs and a former CTO of HP.

I left the camp with a bunch of new skills, like how to ideate and how to define a clear problem statement. If I were to name the most important thing that I got out of attending the bootcamp, it would be how it helped me overcome the fear that my ideas are silly and not worth pursuing. I have come to realize that no matter how silly an idea may sound, it is always worth writing down and ideating on.

Upon graduating from the camp, every attendee was assigned a code name based on their personality, contributions, and the way they ideate and attack problems. I was assigned the code name ROCKER 🙂 I was told that the name was chosen because of my love for Heavy Metal and Rock music and because I was not afraid to shake the norms and question the facts.

Code Name : Rocker