Oh Give Me a Home, Where the Wireless Clients Roam…

3 01 2009

Roaming…it’s a huge reason to deploy a wireless network, and it’s also the part you spend the most time troubleshooting.  As I’ve said before, it’s easy to just set up an Access Point to serve a small area like a conference room.  Setting it up so users can walk from that conference room through the building and never lose connection?  Not so easy.  Making them able to do that?  Priceless.  Roaming is how we do that, and in larger environments, say a hospital with 5 or 6 buildings on its campus, layer 3 roaming becomes a necessity.

CCNA-Wireless only seems to cover lightweight infrastructure deployments…at least so far in my studies.  Where I work we have both autonomous and lightweight deployments, and we’re working to upgrade and transition them all to lightweight.  The differences in how the two are set up are huge, and they almost seem to outweigh what the two have in common.  The biggest and most important difference is that in an autonomous infrastructure, much more of the work is done on the Access Point itself.  For roaming in this kind of environment, we depend on a device called a WLSM (Wireless LAN Services Module), which is a blade in our 6509 switches.  This serves as a Wireless Domain Manager.  Wireless domains exist in both autonomous and lightweight infrastructures and are similar in concept to any other domain…they are a logical grouping within which clients can roam.  In the case of lightweight, a wireless domain is called a mobility domain and is a logical group of WLAN controllers that may or may not be in the same mobility group.  In the case of autonomous, this logical group is made up of APs, which may or may not be in the same subnet.

A good real-world example would be that hospital again.  In this case, all the buildings would belong to the same mobility domain while each individual building could be in its own mobility group.  A user walking from one floor to another in the same building would most likely just perform layer 2 roaming, which takes 10ms or less.  They may remain on the same WLAN controller or might move to another controller, but from the user’s perspective, nothing happens.  They simply walk along from AP to AP without having to reauthenticate and without having to change IP address, because the controllers keep track of them and are aware they are already authenticated.  In the case of an autonomous infrastructure, it is the WLSM that keeps track of clients in a mobility table.  The WLSM depends on the APs to tell it about clients that associate with them.  It puts the client’s MAC address and the MAC address of the AP in its mobility table.  Then things get complicated…the WLSM has no way of knowing layer 3 information *except* by overhearing it.  In our case, the switch the WLSM is installed in uses something called “DHCP snooping” to listen in on DHCP messages going to and from the wireless clients.  It uses this information to keep track of the IP addresses given to the clients and adds those addresses to the mobility table.
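To make the mobility table a little more concrete, here is a minimal Python sketch of the bookkeeping described above.  It is only a toy model and not how the WLSM is actually implemented…the class names, method names, and addresses (MobilityEntry, MobilityTable, client_associated, dhcp_lease_snooped) are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MobilityEntry:
    client_mac: str
    ap_mac: str
    # Layer 3 info is unknown until a DHCP exchange is overheard.
    ip_address: Optional[str] = None


class MobilityTable:
    """Toy model of a WLSM-style mobility table (hypothetical, not Cisco code)."""

    def __init__(self) -> None:
        self.entries: Dict[str, MobilityEntry] = {}

    def client_associated(self, client_mac: str, ap_mac: str) -> MobilityEntry:
        """An AP reports that a client has associated with it."""
        entry = self.entries.get(client_mac)
        if entry is None:
            entry = MobilityEntry(client_mac, ap_mac)
            self.entries[client_mac] = entry
        else:
            # Known client: just record the AP it is on now.
            entry.ap_mac = ap_mac
        return entry

    def dhcp_lease_snooped(self, client_mac: str, ip_address: str) -> None:
        """The switch overheard a DHCP lease handed to this client."""
        entry = self.entries.get(client_mac)
        if entry is not None:
            entry.ip_address = ip_address


# Example values below are invented.
table = MobilityTable()
table.client_associated("00:11:22:33:44:55", "aa:aa:aa:00:00:01")
table.dhcp_lease_snooped("00:11:22:33:44:55", "10.1.20.57")
print(table.entries["00:11:22:33:44:55"])
```

The point of the sketch is the order of events: the layer 2 information arrives first from the AP, and the IP address is only filled in later, once a lease has been overheard.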

As I said, things get complicated when clients do layer 3 roaming.  In our example, this would happen when our user leaves their building but needs to keep their connection as they move to another one.  Say that user is on a wireless VoIP phone.  The same SSID exists on the APs in each building, but in each building that SSID is tied to a different subnet since the buildings are in different networks.  In both lightweight and autonomous infrastructures, the preparation for them to roam was made the moment they connected to their first AP back in the building they began in.  For lightweight, the controller that they first registered with marked their entry as an “anchor” entry and then informed all other controllers about it.  The other controllers, including those in the other buildings, mark this entry as “foreign,” but are aware of it.  In an autonomous implementation, an entry was made for them in the WLSM’s mobility table with their MAC address, the MAC address of the AP they connected to, and the IP address they were given, which was learned via DHCP snooping.  The client is ready to roam just as soon as they have connected.
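The anchor/foreign bookkeeping on the lightweight side can be sketched the same way.  Again, this is just an illustration under my own assumptions…the Controller class, the make_mobility_group helper, and the controller names are hypothetical and are not Cisco’s implementation or API.

```python
from typing import Dict, List


class Controller:
    """Toy WLAN controller (hypothetical model, not Cisco's implementation)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["Controller"] = []
        # client MAC -> {"role": "anchor" or "foreign", "anchor": name, "ip": address}
        self.clients: Dict[str, Dict[str, str]] = {}

    def client_joined(self, mac: str, ip: str) -> None:
        """First association: anchor entry here, foreign entry on every peer."""
        self.clients[mac] = {"role": "anchor", "anchor": self.name, "ip": ip}
        for peer in self.peers:
            peer.clients[mac] = {"role": "foreign", "anchor": self.name, "ip": ip}


def make_mobility_group(names: List[str]) -> Dict[str, Controller]:
    """Build controllers that all know about each other."""
    controllers = {name: Controller(name) for name in names}
    for ctrl in controllers.values():
        ctrl.peers = [c for c in controllers.values() if c is not ctrl]
    return controllers


# Invented controller names and addresses.
group = make_mobility_group(["wlc-building-a", "wlc-building-b"])
group["wlc-building-a"].client_joined("00:11:22:33:44:55", "10.1.20.57")
for name, ctrl in group.items():
    print(name, ctrl.clients["00:11:22:33:44:55"]["role"])
```

Notice that every controller has an entry for the client before the client ever moves, which is what “ready to roam as soon as they have connected” means here.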

*When* the client roams to another building, in a lightweight infrastructure, the controllers note that the client has moved, and all traffic to and from that client is tunneled between the controller in the building they have roamed to and their anchor controller.  The client keeps the same IP address even though they are physically on a different network.  It works much like a GRE tunnel, with traffic for that original IP reaching the client regardless of where they roam to.  You can even force clients to be anchored to a specific network if you like.  Things aren’t quite as nice and neat in an autonomous infrastructure.  There, the client does have to get a new IP address, but they do not have to re-authenticate.  The WLSM keeps track of the client and knows that the client is authenticated.  The client does lose connection for a couple of pings while they get a new DHCP address, but the process is a lot quicker than it would be without the WLSM.  Generally, only VoIP clients notice the loss of pings, while laptop users may not even notice that they are obtaining a new IP address.  Once the client has roamed, the mobility table is updated with their new information.  In the lightweight scenario, there is no new information to update, as the client remains tied to their anchor entry on the controller they are anchored to.
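Here is a rough side-by-side of the two layer 3 roaming outcomes described above, as a small Python sketch.  Everything in it is an assumption for illustration only: the subnets, the controller and building names, and the roam_lightweight and roam_autonomous functions are invented, and real gear does far more than this.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Client:
    mac: str
    ip: str
    authenticated: bool = True


# Hypothetical subnets tied to the same SSID in two buildings.
BUILDING_SUBNETS = {
    "building-a": ipaddress.ip_network("10.1.0.0/24"),
    "building-b": ipaddress.ip_network("10.2.0.0/24"),
}


def roam_lightweight(client: Client, anchor: str, new_controller: str) -> Dict[str, object]:
    """Lightweight: the IP is unchanged and traffic is tunneled to the anchor."""
    tunnel: Optional[Tuple[str, str]] = (new_controller, anchor)
    return {"ip": client.ip, "reauth": False, "tunnel": tunnel}


def roam_autonomous(client: Client, new_building: str, lease_ip: str) -> Dict[str, object]:
    """Autonomous: new DHCP lease in the new subnet, but no re-authentication
    as long as the client is already known to be authenticated."""
    subnet = BUILDING_SUBNETS[new_building]
    if ipaddress.ip_address(lease_ip) not in subnet:
        raise ValueError(f"{lease_ip} is not in {subnet}")
    client.ip = lease_ip  # the mobility table would be updated with this address
    return {"ip": client.ip, "reauth": not client.authenticated, "tunnel": None}


phone = Client("00:11:22:33:44:55", "10.1.0.20")
print(roam_lightweight(phone, "wlc-building-a", "wlc-building-b"))
print(roam_autonomous(phone, "building-b", "10.2.0.33"))
```

The difference to look for is in the returned values: the lightweight roam reports the original address plus a tunnel back to the anchor, while the autonomous roam reports a new address and no tunnel.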

If all this sounds complicated…that’s because it is.  Roaming gives wireless clients their mobility and is the goal of a wireless LAN.  It is also the most frequent thing to break on a wireless network.  Most of the work I have done as a WLAN administrator has been to troubleshoot issues with layer 3 roaming.  A big part of the problem is that so many different problems on the infrastructure side can show up with the same symptom on the client side.  From a user’s point of view, they either connect or they don’t.  Wireless either works or it doesn’t.  They don’t realize how many stages they have to go through in order to connect and then in order to roam, or that a problem can happen at any one of those stages.  This can make troubleshooting a WLAN challenging since you may not get much more information than “user can’t connect” or “user keeps disconnecting.”  Even worse, no matter how many problems you fix, the users will have the impression that you haven’t done anything until you have fixed everything.  When it all works, though…it is golden.  :D

