On the Monday before Christmas, Southwest Airlines executives met to plan for Winter Storm Elliott.
The next day, after internal debate and after alerting the pilots’ union, Southwest headquarters began cancelling flights ahead of the storm.
Customers at the affected airports were initially advised to check for flight disruptions. Then the list of airports began to grow.
The airline canceled hundreds of flights. When the winter storm developed, the call proved worthwhile; in hindsight, they didn’t go far enough. But the idea that they could handle cancellations early enough for customers to change arrangements and for staff to reposition for other routings proved fatally incorrect.
You know the rest: temperatures fell, with Denver thermometers dropping 37 degrees in one hour; failed travel plans disrupted holidays; 5,700 flights nationwide were canceled before Christmas; and while other carriers struggled, Southwest collapsed, cancelling over 17,000 flights and losing $825 million in the final ten days of the year.
The meltdown was unprecedented. Some blamed the storm, others the carrier’s point-to-point strategy or the company’s culture. As the heated takes cooled, a less interesting culprit emerged, one that’s not unique to Southwest and is less a one-off bug than an industry-wide problem.
The collapse was Southwest’s largest, but it wasn’t the first. The breakdown was triggered by Winter Storm Elliott, not caused by it. Southwest collapsed because of its legacy. The airline pioneered the low-cost business model, with short turnarounds and a point-to-point route network.
A Southwest aircraft might start its day in San Diego, then head to Sacramento, Denver, and Nashville before ending in Tampa. Southwest became the sixth biggest airline by exploiting this niche, offering non-stop flights where competitors only had one-stops. This technique is efficient in ideal conditions, but delays and cancellations make it a liability. Pilots time out, crews miss their overnight destination, and planes aren’t in place for the next day’s flights.
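To make the knock-on effect concrete, here is a minimal sketch of how one delay eats into a crew’s duty day on a multi-stop routing. The flight times, the 40-minute turnaround, and the 660-minute duty limit are all invented for illustration; they are not Southwest’s or any regulator’s actual figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    origin: str
    dest: str
    block_minutes: int  # hypothetical gate-to-gate time


def legs_flown_before_timeout(legs, delay_minutes, duty_limit_minutes, turn_minutes=40):
    """Return the legs one crew can finish before exceeding its duty limit.

    The delay is charged against the duty day up front; every leg after the
    first also costs a turnaround on the ground.
    """
    elapsed = delay_minutes
    flown = []
    for index, leg in enumerate(legs):
        needed = leg.block_minutes + (turn_minutes if index > 0 else 0)
        if elapsed + needed > duty_limit_minutes:
            break  # the crew would time out mid-routing
        elapsed += needed
        flown.append(leg)
    return flown


# The routing described above, with made-up flight times.
ROUTING = [
    Leg("SAN", "SMF", 90),
    Leg("SMF", "DEN", 140),
    Leg("DEN", "BNA", 150),
    Leg("BNA", "TPA", 110),
]

on_time = legs_flown_before_timeout(ROUTING, delay_minutes=0, duty_limit_minutes=660)
delayed = legs_flown_before_timeout(ROUTING, delay_minutes=120, duty_limit_minutes=660)
```

Under these assumed numbers, the on-time crew finishes all four legs, while a two-hour delay leaves the crew and the plane in Nashville, with the Tampa leg and the next morning’s Tampa departure uncovered.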
Weather and other unpredictable events are a constant. For these circumstances, Southwest uses GE Aeronautics’ SkySolver. According to SkySolver’s brochure, its algorithms perform flight schedule modifications and cancellations, aircraft routing and equipment swaps, and crew assignments and pairing.
That means that if a developing storm rolls over Denver and Southwest cancels the inbound flight from Sacramento because of the risk that the plane and crew will be grounded by the weather, it calls on SkySolver to stem the bleeding and ensure that problems don’t cascade through the rest of the routing. SkySolver may send the jet on a passenger-less deadhead to Nashville, where the crew can clock out and hand the plane over to the next crew.
Of course, that’s just the beginning: although finding the most cost-effective solution for one routing is straightforward, it gets orders of magnitude more difficult when considering the hundreds of additional Southwest planes passing through. For example, where should a 737 that would have been flying in and out of Denver all day be moved, so that the airport has space for it and it can start service the next morning?
What about a 737 that’s flying between cities in the storm’s path and might get caught at a snowed-in airport? How should the later legs of a route be flown when the plane scheduled to run it won’t leave Denver that morning? SkySolver can work through a batch of 300 cancellations in about 20 minutes, but it has limits.
First, it solves cancellations against a static baseline. If half of Denver’s flights are canceled, SkySolver has to know that the Sacramento-to-Denver trip has been canceled. If it doesn’t, or if the flight is canceled while the system is running, SkySolver’s solutions become less feasible, necessitating re-runs or manual intervention. Second, SkySolver is limited by its data inputs. Some of that data comes from a Southwest tool called The Baker, which reroutes planes and people after delays and cancellations and then passes its data to SkySolver. Crucially missing from The Baker’s data is crew information: who is available, who is where, and how long until they time out.
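Both limits can be illustrated with a small sketch. This is not SkySolver’s logic, which isn’t public; it only shows why a plan computed against a snapshot goes stale, and why a plan built without crew data can look fine until someone checks who is actually at the gate. The flight labels, tail numbers, and minimum crew counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    flight: str    # hypothetical flight label
    aircraft: str  # hypothetical tail number


def stale_assignments(plan, baseline_cancelled, live_cancelled):
    """Assignments for flights cancelled after the baseline snapshot was taken."""
    newly_cancelled = set(live_cancelled) - set(baseline_cancelled)
    return [a for a in plan if a.flight in newly_cancelled]


def uncrewed_assignments(plan, crew_on_hand, min_pilots=2, min_attendants=1):
    """Assignments without enough pilots or flight attendants.

    crew_on_hand maps flight -> (pilots, attendants). A flight missing from
    the mapping is treated as having nobody available. The minimums are
    illustrative assumptions.
    """
    short = []
    for a in plan:
        pilots, attendants = crew_on_hand.get(a.flight, (0, 0))
        if pilots < min_pilots or attendants < min_attendants:
            short.append(a)
    return short


PLAN = [
    Assignment("SMF-DEN", "N101"),
    Assignment("DEN-BNA", "N101"),
    Assignment("BNA-TPA", "N202"),
]
BASELINE_CANCELLED = {"SAN-SMF"}           # what the solver knew at start
LIVE_CANCELLED = {"SAN-SMF", "SMF-DEN"}    # what changed while it ran
CREW = {"SMF-DEN": (2, 3), "DEN-BNA": (2, 0), "BNA-TPA": (2, 3)}

stale = stale_assignments(PLAN, BASELINE_CANCELLED, LIVE_CANCELLED)
uncrewed = uncrewed_assignments(PLAN, CREW)
```

In this toy data, one assignment is invalid because its flight was cancelled mid-run, and another has pilots but no flight attendants, which is the kind of gap that only surfaces close to departure when crew data sits outside the solver’s inputs.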
As the storm progressed and cancellations surged, the solver’s limits compounded, causing a catastrophic spiral. Working without pilot and flight attendant data, the algorithms produced answers that only appeared viable. As it became clear close to departure time that these planes had only half a crew, SkySolver’s baseline assumptions changed, and everything collapsed.
The software’s blind spot turned the predictable challenges of a winter storm (hundreds of flights canceled at least 24 hours out on the 20th and 21st) into a full-blown crisis in which Southwest’s tools compounded the problems they were meant to repair, only worsening them further. Through Christmas, cancellations came closer and closer to departure time as pilots arrived at planes with no flight attendants, while planes with no passengers deadheaded on circular, nonsensical routings as the software panicked, positioning and repositioning planes only for them to time out and get stuck again.
Southwest went manual, training volunteer headquarters workers to handle pilot and flight attendant paperwork and preparing for a hard reset. Southwest canceled 10,700 flights in the four days after Christmas because it neglected to resolve a known weakness.
They overburdened and complicated a vulnerable IT infrastructure that everyone recognized required an update.
As the calendar turned, Southwest stabilized, and more details about the breakdown emerged, a new meltdown took center stage: the FAA ordered a nationwide ground stop. For the first time since 9/11, American commercial aviation came to a total halt, for 90 minutes, after a software upgrade went wrong, forcing pre-flight safety messages to be sent manually until staff could no longer keep up with the morning rush of aircraft.
Then the cycle began again: disgruntled customers, irate journalists, and promises that the outdated technology everyone knew would fail would be looked into. And this tendency didn’t start with Southwest, either. IT meltdowns hit British Airways in 2017, then in 2019, then again in 2022; they led Delta to postpone hundreds of flights in 2016 and 2017; they drove American Airlines to cancel over 1,000 regional flights in 2018.
All said they’d do better next time, and yet all still rely on outdated, overstretched IT systems that were never designed to handle the industry’s growth in the decades since their initial development. But the problems of old, overloaded IT systems don’t stop with the back end: while customer-facing systems might look good on the surface, the most important one, the very system that makes it all possible, was created as close in time to the Wright Brothers’ first flight as to today.
When a traveler sees a flight search page, it may look modern, but what they’re actually looking at is a translation of plain-text GDS output. Behind the scenes, Google Flights entered a query into the Global Distribution System: an interconnected set of software systems run by Amadeus, Sabre, or Travelport. Specifically, it entered this: AD, to initiate an availability lookup; 08FEB, to indicate the date; then DENLHR, to specify Denver as the origin and London as the destination. This initiates a query from the GDS to another system: OAG, which essentially acts as the definitive source of airline scheduling data worldwide. The GDS then comes back with two options, one by British Airways, one by United.
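The availability entry described above is compact enough to take apart in a few lines. The sketch below handles only the exact shape given in the text (AD, a two-digit day, a three-letter month, then two three-letter airport codes). Real GDS entries support many more options, and the function name and returned field names are made up for illustration.

```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# AD + DDMMM + origin + destination, e.g. AD08FEBDENLHR
_ENTRY = re.compile(r"^AD(\d{2})([A-Z]{3})([A-Z]{3})([A-Z]{3})$")


def parse_availability_entry(entry: str) -> dict:
    """Split an availability entry into its parts, or raise ValueError."""
    match = _ENTRY.match(entry.strip().upper())
    if match is None:
        raise ValueError(f"not an availability entry: {entry!r}")
    day, month_name, origin, destination = match.groups()
    if month_name not in MONTHS:
        raise ValueError(f"unknown month: {month_name}")
    return {
        "action": "availability",
        "day": int(day),
        "month": MONTHS.index(month_name) + 1,
        "origin": origin,
        "destination": destination,
    }


query = parse_availability_entry("AD08FEBDENLHR")
```

A consumer-facing site effectively does the reverse: it takes the date and airports from a form and assembles the terse string the GDS expects.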
The numbers indicate availability across the different fare classes, and the right side provides most other data: departure time, arrival time, aircraft type, and flight duration. But fare data is still missing: for that, Google Flights needs to pick a flight, pick a fare class, and enter another query for its price. The GDS doesn’t have that data itself: it seeks the answer from the Airline Tariff Publishing Company (ATPCO), which, like OAG with scheduling, is the one-stop shop for pricing data, used by essentially every major airline globally. So, once again, the GDS repackages the response, reiterating the flight schedule, then displaying the base fare, the carrier surcharge, the US passenger facility charge, the security fee, and the international departure tax, to give a total one-way fare of $630.20 on the February 8th British Airways flight from Denver to London. This entire process is repeated for the United option, which gives Google Flights all the information necessary to populate the results screen. But next up is booking, and that’s even more involved.
Wherever the booking takes place, with some limited exceptions, the user interface is just a translation tool that inputs a string of commands into the GDS. The sequence begins by creating a passenger name record, or PNR, into which it then enters a phone number, the name of the person making the booking, and a precisely formatted description of the desired flight itinerary. After reconfirming availability and fare price, the GDS submits the PNR and returns the time limit for actually paying for the booking. At this point, the booking is confirmed: that seat is reserved and will stay reserved unless that payment deadline passes without payment.
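The hold-then-pay behavior can be modeled in a few lines. This is a simplified sketch, not a real PNR layout: the field names, the free-text itinerary, and the 24-hour hold window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PassengerNameRecord:
    booker_name: str
    phone: str
    itinerary: str              # free text here; real PNRs use a strict format
    payment_deadline: datetime
    paid: bool = False

    def seat_is_held(self, now: datetime) -> bool:
        """The seat stays reserved once paid, or until the deadline passes."""
        return self.paid or now <= self.payment_deadline


def create_pnr(booker_name, phone, itinerary, created_at, hold=timedelta(hours=24)):
    """Create a record whose payment deadline is `hold` after creation."""
    return PassengerNameRecord(booker_name, phone, itinerary, created_at + hold)


created = datetime(2023, 1, 15, 9, 0)
pnr = create_pnr("A. Traveler", "555-0100", "BA DEN-LHR 08FEB", created)

held_same_day = pnr.seat_is_held(created + timedelta(hours=3))
held_after_deadline = pnr.seat_is_held(created + timedelta(hours=30))
pnr.paid = True
held_once_paid = pnr.seat_is_held(created + timedelta(hours=30))
```

The point of the design is the gap between reservation and payment: the seat is taken out of inventory first, and money changes hands second.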
While customers these days typically pay immediately, behind the scenes that is rarely required. The delayed deadline gives an online travel agency like Expedia, for example, time to process and verify payment from the customer before paying for the ticket itself, at which point it might become nonrefundable, leaving the company in a tricky spot if the customer’s credit card payment were declined. Yet even before paying, Expedia will finish the transaction with the GDS, which sends the passenger name record to the airline. The airline responds with a modified version containing a confirmation number, which is then extracted and given to the customer in a flashy, well-formatted email.
The GDS and its linked technologies were groundbreaking for the airline industry. Not only did they make booking a far less labor-intensive procedure, they made basically every airline reservation system compatible with every other. Airlines use the same GDS, so United’s internal systems interpret and change a PNR the same way Lufthansa’s would. United could technically sell you an itinerary connecting onto an American Airlines flight. It wouldn’t, because it has no commercial partnership with that airline, but if your United flight is canceled, the airline can and sometimes will rebook you onto American, Delta, and others using this system.
The GDS was innovative once, but it isn’t anymore. In 2023, when you approach a booking agent at an airport, the GDS is still operated through a command-line interface, the way computers worked before graphical user interfaces became widespread in the 1980s. Command-line interfaces are fine, but using them well takes imagination and experience.
For example, say a customer in Denver is flying to Zurich, but their 1:40 PM flight to Chicago, connecting to the 7:10 PM overnight flight to Zurich, was canceled. A rebooking agent would check the GDS for the next itinerary between the two cities and find a plausible one connecting onto a 7:20 PM Swiss Air flight out of LAX, but the only two flights to LA that would make that connection, at 1:36 and 3:55, are oversold, so the stranded customer has no seat. As far as the GDS is concerned, that passenger cannot get to Zurich until the next day.
That doesn’t mean it’s impossible. An expert rebooking agent would know that United also flies to John Wayne Airport in Orange County, roughly an hour from LAX, so if desperate, the traveler could fly there, take a taxi to LAX, and then fly overnight to Switzerland. With this knowledge, an agent could book each leg and thread them together into an itinerary in the PNR, even though the GDS doesn’t offer it because the airline wouldn’t sell it. Likewise, the GDS might not account for United’s 11:45 AM LAX-bound flight being delayed until 1:30, which would allow this customer to catch the connection. Or United could buy the stranded passenger a ticket on American’s 4:10 PM flight to LAX through the GDS to continue their travel on Swiss Air, even though Swiss Air and American don’t usually ticket together.
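The difference between what the system offers and what an experienced agent can assemble comes down to which connections the search is allowed to consider. The sketch below is a toy two-flight search, not how any GDS works internally. All times are minutes after midnight on a single reference clock, the arrival times and the Orange County departure are invented, and the 45-minute minimum connection is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    origin: str
    dest: str
    depart: int  # minutes after midnight, one reference clock
    arrive: int
    seats: int   # 0 means sold out


def at(hour, minute):
    return hour * 60 + minute


def find_itineraries(flights, origin, dest, ground=None, min_connect=45):
    """Two-flight itineraries, optionally with one ground hop between airports.

    `ground` maps (from_airport, to_airport) -> transfer minutes.
    Each result is (first_flight, ground_hop_or_None, second_flight).
    """
    ground = ground or {}
    found = []
    for first in flights:
        if first.origin != origin or first.seats == 0:
            continue
        # Where the traveler can be, and when, after the first flight.
        positions = [(first.dest, first.arrive, None)]
        for (a, b), minutes in ground.items():
            if a == first.dest:
                positions.append((b, first.arrive + minutes, (a, b)))
        for airport, ready, hop in positions:
            for second in flights:
                if (second.origin == airport and second.dest == dest
                        and second.seats > 0
                        and second.depart >= ready + min_connect):
                    found.append((first, hop, second))
    return found


FLIGHTS = [
    Flight("DEN", "LAX", at(12, 36), at(15, 5), seats=0),   # oversold
    Flight("DEN", "LAX", at(14, 55), at(17, 25), seats=0),  # oversold
    Flight("DEN", "SNA", at(13, 0), at(15, 30), seats=4),   # hypothetical
    Flight("LAX", "ZRH", at(19, 20), at(30, 20), seats=9),  # arrives next day
]
TAXI = {("SNA", "LAX"): 60}

system_view = find_itineraries(FLIGHTS, "DEN", "ZRH")
agent_view = find_itineraries(FLIGHTS, "DEN", "ZRH", ground=TAXI)
```

With only published connections, the search comes back empty. Add the one piece of local knowledge (a taxi between two airports) and a same-day itinerary appears.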
This is only the beginning: there are countless ways to use the GDS to solve problems, but it takes expertise to know which ones work. As a result, passengers are regularly left stuck, which is terrible for airlines and travelers alike. It’s the luck of the draw whether the rebooking agent you walk up to has been there for thirty years and knows the GDS like the back of their hand, or has just completed the minimal training needed to carry out the basic tasks. This increases the number of stranded passengers and the expense to airlines of handing out meal vouchers, booking accommodations, and paying delay compensation.
The GDS’s constraints don’t just increase expenses; they also block revenue. Air New Zealand created Skycouch, a short, lie-flat bed made from three economy-class seats. The seats can be sold separately as usual, or as a set of three to one or two passengers.
The GDS cannot effectively describe or reserve convertible seats, so Skycouch seats are always shown as occupied on the GDS. Booking them requires visiting Air New Zealand’s website. This costs the airline revenue. Because United feeds inbound US traffic to Air New Zealand’s long-haul flights out of LA, San Francisco, Houston, and Chicago, a large share of the airline’s traffic is ticketed through United. Those passengers may book and check in entirely with United, so they never have the option to pay Air New Zealand for a Skycouch. The same goes for customers who booked through other airlines or online travel agencies.
Airlines have complained about GDS systems for decades, so there are newer technological fixes and workarounds, but even those have issues. IATA’s New Distribution Capability (NDC) standard provides more sophisticated back-end connectivity. Rather than fixing the issue, though, it illustrates what’s wrong. IATA has recognized 162 airlines, travel agencies, system suppliers, and others as having implemented this new standard, or part of it, into their systems.
Of those, 156 have adopted the NDC standard for flight shopping. But then you come to the more interesting functions: SHPAN2 enables the sale of increasingly common unbundled ancillaries like in-flight internet, lounge access, priority boarding, and more, so in theory Expedia customers could buy in-flight wifi in their checkout flow. Only 51 of the 162 are qualified for this capability.
Then you get into the really specialized stuff: NDC has enabled airlines to have interoperable dynamic pricing systems, so instead of working with the GDS and ATPCO, which only allow price updates a certain number of times per day, each fare would be customized for each consumer based on an algorithm’s estimation of their price sensitivity.
No airline is qualified for this revenue-generating capability. This is a common pattern: airlines want more advanced technologies, the technologies are built, and then the airlines decide the expense of integrating them isn’t worth it.
Some airlines are even rejecting interoperable systems altogether. If you search Expedia for a Denver-Long Beach itinerary, it will suggest a one-stop American flight through Phoenix, even though there are three non-stop flights a day on Southwest. Southwest flights cannot be booked through the GDS, so Expedia cannot book them for customers. Google Flights, which directs consumers where to book rather than booking itself, shows Southwest schedules, which are filed with OAG, but not fares, because Southwest doesn’t sell through the GDS.
Because most bookings are made on its own website, Southwest can customize and revenue-optimize the booking process. Before confirming a fare, a pop-up touts the benefits of the next-highest fare class. The same up-sell, a $200 credit for signing up for the airline’s credit card, and an opportunity to pay for early check-in to secure a good boarding position appear on the next screens. Upsells matter.
Southwest earned $4 billion in 2021 from ancillary fees, which increase when customers book directly and are shown every up-sell. Direct booking is also cheaper for the airline, since there are no developer fees, GDS fees, or Expedia charges. Budget airlines are increasingly opting out of GDSs, following Southwest’s lead, while full-service network carriers are going around them to establish direct business and system partnerships with key sales channels and partners.
Yet this is fragmenting the industry: airlines are innovating, which is good, but they are losing the collective interoperability that created the seamless worldwide travel experience customers demand.
It’s like a reverse tragedy of the commons. If all nodes invested in interoperable systems, each would increase its income. If only one node invests, it spends money to build interoperability but has no partners to interoperate with, resulting in no added income and a net cost. So each node has an incentive to restrict interoperability and steer users toward its own systems, which are better at increasing per-customer income. The result is a strategic deadlock.
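The logic of that deadlock can be checked with a toy two-airline game. The payoff numbers below are invented purely to match the description: both investing pays best, investing alone loses money, and restricting yields a modest gain from direct sales either way. The code simply looks for outcomes where neither side gains by changing course alone.

```python
from itertools import product

STRATEGIES = ("invest", "restrict")

# (airline A payoff, airline B payoff); illustrative numbers only
PAYOFFS = {
    ("invest", "invest"): (3, 3),
    ("invest", "restrict"): (-2, 1),
    ("restrict", "invest"): (1, -2),
    ("restrict", "restrict"): (1, 1),
}


def pure_nash_equilibria(payoffs, strategies=STRATEGIES):
    """Outcomes where neither player can do better by switching alone."""
    equilibria = []
    for a, b in product(strategies, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_stays = all(payoffs[(alt, b)][0] <= pay_a for alt in strategies)
        b_stays = all(payoffs[(a, alt)][1] <= pay_b for alt in strategies)
        if a_stays and b_stays:
            equilibria.append((a, b))
    return equilibria


stable_outcomes = pure_nash_equilibria(PAYOFFS)
```

Under these assumed payoffs there are two stable outcomes: everyone invests, or everyone restricts. The second is worse for all, but once the industry is in it, no single airline can profitably be the first to move.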
The same logic explains the deadlock in airline IT. Every airline’s IT is awful, so there’s no competitive motivation to fix it. Why spend to reach a level the public doesn’t realize is possible?
The airline industry has settled on a global mentality of “good enough,” which is a problem because airlines now effectively function as public utilities. They are the only realistic means of long-haul and medium-haul transport in North America, Russia, Australia, and other large, less-dense countries.
When airlines are delayed, people and economies are delayed with them. The US acknowledges this by financing airport development and upkeep, and even more aggressively by paying airlines to fly to rural and isolated regions, arguing that a lack of air service would hurt local economies. Countries recognized this early on, classifying the first airlines as government services and running them as state-owned firms, but in the century since, the trend has been privatization, meaning the only true commitment is to the shareholder. This is one of those all-too-common cases where shareholder interests and global interests diverge.