WIFI & CAPTIVE PORTALS
Published 2020

Deutsche Bahn AG

Surf the WiFi on a moving ICE, then carry on in the station WiFi at the next stop without signing in again. Under the hood, that's a RADIUS aggregator we built for Deutsche Bahn, and it works where standard RADIUS servers (FreeRADIUS and friends) hit their limits.
Pattern WiFi platforms & captive portals
  • .NET Core 3.1
  • RADIUS client/server (capabilities beyond FreeRADIUS)
  • SNMP Agent

THE STARTING POINT

Travellers experience WiFi today as part of the basic infrastructure β€” especially on trains and at stations. What they shouldn't experience: signing in twice at the transition. Someone who's just signed into the on-board WiFi of an ICE and then steps off at the station shouldn't have to click through the station network's captive portal again. The same in the other direction when boarding.

That sounds like a trivial authentication problem. It isn't. Deutsche Bahn doesn't operate every WiFi node itself; sub-contracting partners and roaming partners provide a substantial part of the infrastructure. An incoming RADIUS authentication request must therefore be routed intelligently to the right authentication source, based on the realm in the username and on the MAC address, and at a latency the traveller perceives as "seamless".

Standard RADIUS servers like FreeRADIUS don't solve this out of the box. They simply don't know the pattern "send the request to several providers in parallel and answer with the first positive response". Anyone trying anyway ends up building half a custom stack around them β€” and overlooks the requirements for multi-client support, realm-specific routing, and SNMP-based monitoring of the request paths.

An out-of-the-box solution wasn't an option here.

WHAT WE BUILT

A customer-specific RADIUS proxy architecture in two services, packaged for container operation on Debian Linux infrastructure.

The RadiusAggregator as an intelligent proxy

The central service receives RADIUS access requests and routes them according to clear rules. The architecture follows four principles:

  • Serve multiple request sources ("clients") in parallel β€” each with its own IP and shared secret. The aggregator knows who's asking.
  • Realm-based routing β€” the realm in the username decides which providers (sub-contracting partners / roaming partners) the request goes to.
  • "Fastest positive response wins" β€” the request goes to all plausible providers in parallel, and the first positive response is returned to the original caller. That's exactly the pattern FreeRADIUS-style standard solutions don't handle.
  • Granular monitoring of all paths β€” per client, per realm, per provider, metrics are collected and made available to a central monitoring system via SNMP.

The UserService as data layer

Alongside the RadiusAggregator runs a UserService that manages end-user-relevant data β€” decoupled from the authentication logic. That keeps the two responsibilities cleanly separated.

SNMP as a first-class observation layer

That the aggregator publishes its own state and metrics over SNMP isn't an afterthought. It means existing network monitoring systems (Spectre and similar at DB) see the aggregator like any other network component: no special handling, no separate toolchain. Both the requesting manager systems and the SNMPv3 security level (authentication None / MD5 / SHA, privacy DES / 3DES / AES up to AES-256) are configurable. The system speaks SNMPv2 and SNMPv3.
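As a rough idea of what that configuration could look like, here is a hypothetical SNMP section of a daemonconfig.json; the field names and values are illustrative, not the actual schema:

```json
{
  "snmp": {
    "versions": ["v2", "v3"],
    "managers": ["10.1.2.3", "10.1.2.4"],
    "v3": {
      "user": "monitoring",
      "authProtocol": "SHA",
      "privProtocol": "AES256"
    }
  }
}
```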

Container-first deployment

Both services are packaged as Docker containers and orchestrated together via Docker Compose. That has a quiet but important effect: the solution can be deployed at a new site within minutes, instead of via a complex Linux setup with system service configuration.
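A deployment of this shape could be described in a Compose file roughly like the following; service and image names are invented for illustration. Host networking matters here, for the reason explained in the note below:

```yaml
# Hypothetical docker-compose.yml -- image and service names are illustrative.
services:
  radiusaggregator:
    image: example/radiusaggregator:latest
    network_mode: host            # preserve UDP source IPs (no NAT)
    volumes:
      - ./daemonconfig.json:/app/daemonconfig.json:ro
      - ./infrastructure.json:/app/infrastructure.json:ro
    restart: unless-stopped
  userservice:
    image: example/userservice:latest
    network_mode: host
    volumes:
      - ./daemonconfig.json:/app/daemonconfig.json:ro
    restart: unless-stopped
```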

Note from practice: Linux isn't a matter of taste here β€” both services need to see the source IP of incoming UDP packets to identify the logical sender. Docker on Windows or macOS rewrites the source IP via NAT β€” that only works when there's exactly one caller. For multi-client scenarios, Linux is mandatory.
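The point about source IPs comes down to the UDP receive path: a RADIUS server identifies the caller, and thereby the shared secret, from the datagram's source address. A minimal Python sketch, with invented client IPs and secrets:

```python
import socket

# Hypothetical per-client shared secrets, keyed by source IP (values invented).
CLIENTS = {
    "10.0.0.11": b"secret-ice",
    "10.0.0.12": b"secret-station",
}

def secret_for(addr):
    """Resolve the shared secret for a datagram's source (ip, port) address.

    Behind Docker Desktop's NAT on Windows/macOS, every datagram arrives with
    the gateway's IP, so all callers collapse into one entry -- which is why
    multi-client operation requires Linux (host networking).
    """
    ip, _port = addr
    return CLIENTS.get(ip)

def serve_once(sock):
    """Receive one datagram and identify the logical sender by source IP."""
    data, addr = sock.recvfrom(4096)
    secret = secret_for(addr)
    if secret is None:
        return None            # unknown client: silently discard (RFC 2865)
    return addr[0], data, secret
```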

Configuration as JSON, not as code

Both the RadiusAggregator and the UserService are controlled by two JSON files: daemonconfig.json (basic behaviour, logging, SNMP, RADIUS ports) and infrastructure.json (request sources, providers, servers, routing rules). An operator can change the topology without touching code β€” add new providers, reroute realms, rotate secrets.
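To make the operator workflow concrete, an infrastructure.json could look roughly like this; the structure, names, and values are a hypothetical sketch, not the real schema:

```json
{
  "clients": [
    { "name": "ice-portal", "ip": "10.0.0.11", "secret": "example-secret-1" }
  ],
  "providers": [
    { "name": "station-auth", "host": "auth.example.net", "port": 1812,
      "secret": "example-secret-2" },
    { "name": "roaming-a", "host": "radius.partner.example", "port": 1812,
      "secret": "example-secret-3" }
  ],
  "routes": [
    { "realm": "bahnhof.de", "providers": ["station-auth", "roaming-a"] }
  ]
}
```

Adding a provider or rerouting a realm is then an edit to this file rather than a code change.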

WHAT IT GIVES THEM

  • "Sign in again" becomes "keep surfing". The traveller experiences a transition that, for them, doesn't exist.
  • DB can integrate sub-contracting partners without adapting their own architecture every time. New providers are added via infrastructure.json without redeploying the aggregator.
  • "Fastest positive response wins" reduces perceived latency for the end user. Whoever answers first wins β€” the aggregator doesn't wait for all providers.
  • SNMP-first means no second monitoring layer. Existing network operations tools see the aggregator as a familiar participant.
  • Docker Compose deployment lowers the barrier for new sites and test environments β€” from "Linux setup workshop" to "one Compose file and three seconds".

WHAT WE DELIBERATELY DID NOT AUTOMATE

  • We don't supply WiFi hardware. Access points stay with the respective site operator; we focus on the authentication layer.
  • We don't make business decisions about provider contracts. Which sub-contracting partners are integrated is decided by DB; the aggregator only knows the configuration.
  • We don't operate the system 24/7. After rollout, DB's operations team takes over. We deliver containers, documentation, and updates β€” day-to-day operations belong with internal IT.
  • No identity stack of our own. Identity stays with the respective authentication providers. The aggregator doesn't decide what's true; it routes the question and forwards the first reliable answer.

WHY THIS PATTERN TRANSFERS

The setup works wherever a multi-site infrastructure with roaming requirements exists and standard RADIUS solutions aren't enough β€” transport operators (bus, tram, ferry), hospitality chains with cross-site WiFi, stadium federations, university networks (eduroam-style), corporate branch structures with a shared identity pool.

The pattern: site captive portal β†’ custom RADIUS aggregator (realm routing + fastest response) β†’ multiple authentication providers in parallel β†’ seamless transition between sites without re-authentication. With Docker Compose deployment and SNMP monitoring, no separate operations layer.

The DB solution has been in production since 2020. The pattern itself has since been reused multiple times in variations.
