Published July 2020
During the 2008 financial crisis, a number of Wall Street firms were considered “too big to fail.” Following the taxpayer-funded bailout, the term has remained etched in mainstream consciousness. In this article, we examine how “too connected to fail” could be a relevant concept for Big Tech, particularly with reference to the dominance of Amazon’s AWS in the cloud business.
In addition, the fast-growing Cloud Management Platform (CMP) sector could address cloud vulnerability issues by making it easier for companies to use multiple cloud providers, accelerating the move towards the multicloud.
In the deals space, CloudBolt raised $23 million in a Series A round in 2018. There were also a number of acquisitions, mostly by larger software and technology companies. The financial services sector could identify future acquisition opportunities, especially among CMPs that specialize in areas such as data security on the cloud.
The Everything Store
Amazon has not only dominated the online retail market, but also successfully ventured into other industries such as online streaming (Amazon Prime Video), grocery stores (Whole Foods) and artificial intelligence (powering its Echo smart speakers).
Chief among Amazon’s business lines is Amazon Web Services (AWS), its cloud-computing division that has posted accelerating growth in profits over the years. Its quarterly revenue exceeded $10 billion for the first time in 1Q 2020, with Apple and Facebook counting among its biggest customers.
Amazon’s scale is not merely a reflection of its valuation; it also highlights our increasing dependence on the company to provide critical cloud-based infrastructure for our economy. Is it time to consider a “too big to fail” analog for Amazon and other Big Tech firms?
Too Connected to Fail
The use of the term “too big to fail” may have obscured the more important issue of being “too connected to fail”. “Too big to fail” refers to financial institutions that have become so large that their failure would be disastrous to the health of the economy. The term was originally applied to a sequence of bank bailouts in the 1970s and was later popularized in a 1984 congressional hearing on the eventual bailout of Continental Illinois, the eighth largest bank in the US before it found itself in financial distress.
“Too big to fail” was again referenced during the 2008 financial crisis, but since then many have come to understand how bank interconnectedness played a key role in amplifying shocks throughout the global financial system. Interconnectivity has its purported benefits, but beyond an optimal level the systemic risk it creates could unleash a global financial crisis that outweighs them. In this way, some banks would be considered “too connected to fail.” Although size and interconnectivity could reinforce each other, they should not be considered synonyms; the former is a symptom while the latter is the root of a crisis.
The primary concern of systemic risk is that a firm’s centrality in a particular network could make the entire system vulnerable. In the context of the financial crisis, the failure of Lehman Brothers resulted in widespread contagion around the world because it was a highly connected node with over one million counterparties. In a similar way, considering the scale of data processed through AWS, a prolonged outage could paralyze the operations of many essential companies and send shockwaves throughout the economy.
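The idea of centrality can be illustrated with a toy dependency network. The sketch below is only an illustration: the customer and provider names and the links between them are hypothetical, and counting dependents is a crude stand-in for the centrality measures used in systemic-risk research.

```python
# Toy dependency network: each customer maps to the providers it relies on.
# All names and links are hypothetical and chosen only to illustrate centrality.
depends_on = {
    "retailer": {"hub"},
    "streaming": {"hub"},
    "bank": {"hub", "small_a"},
    "news_site": {"small_a"},
    "startup": {"small_b"},
}

def degree(provider):
    """Number of customers relying on the provider (a crude centrality measure)."""
    return sum(provider in providers for providers in depends_on.values())

def disrupted_by(provider):
    """Customers that lose at least one dependency if the provider goes down."""
    return sorted(c for c, providers in depends_on.items() if provider in providers)

for p in ("hub", "small_a", "small_b"):
    print(p, degree(p), disrupted_by(p))
```

In this toy network, the failure of the most connected provider disrupts three of the five customers, while the failure of the least connected one disrupts only one.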
Growing Costs from Technology Risks
Since its inception in 2006, AWS has faced a series of technical issues. Recent ones include the 2017 outage of S3 (Amazon Simple Storage Service) and the 2018 disruption of Direct Connect (a service to enhance connectivity and reduce network costs). These disruptions lasted hours before they were resolved. Early on, such disruptions affected only a fraction of companies, but the consequences have become much more significant in recent years.
According to a Wall Street Journal article, the 2017 outage cost S&P 500 companies an estimated “$150 million, according to Cyence Inc., a startup that specializes in estimating cyber-risks. Apica Inc., a website-monitoring company, said 54 of the internet’s top 100 retailers saw website performance slow by 20% or more.” In terms of consumer experience, this meant that many websites were not accessible at all. Quora was returning a “504 Gateway Timed Out” error, while Mashable failed to load many images hosted by S3.
To put this in perspective, the International Working Group on Cloud Computing Resiliency (IWGCR) estimated a combined cost of $72 million for all outages among 13 cloud service providers from 2007 to 2012. In comparison, a single AWS outage in 2017, the result of a mistyped command during a debugging process, cost more than twice that amount. As Amazon has grown its market share over the years, it has also become responsible for the lion’s share of financial losses from outages.
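The comparison can be checked with simple arithmetic. The sketch below uses only the two figures quoted above; the variable names are illustrative, and the two estimates come from different sources with different scopes (S&P 500 companies in one case, 13 providers over six years in the other), so the ratio is indicative only.

```python
# Figures quoted in the text, in millions of US dollars.
aws_2017_outage_cost_musd = 150   # Cyence estimate for S&P 500 companies
iwgcr_combined_cost_musd = 72     # IWGCR estimate, 13 providers, 2007 to 2012

ratio = aws_2017_outage_cost_musd / iwgcr_combined_cost_musd
print(f"Single 2017 outage vs. combined 2007-2012 estimate: {ratio:.2f}x")
```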
The Move towards the Multicloud
To diversify their cloud operations and protect themselves from outages, many companies have begun to adopt a “multicloud” strategy by migrating some services to other cloud providers like Microsoft Azure and Google Cloud Platform (GCP). A 2020 survey conducted by O’Reilly Media showed that slightly more than half of the 1,283 IT managers surveyed were using multiple cloud providers. Also, a Gartner survey in 2019 revealed that 81% of the surveyed public cloud users were working with multiple providers, with most companies doing so “to avoid vendor lock-in or to take advantage of best-of-breed solutions.”
However, the use of “multicloud” does not necessarily imply cloud redundancy (i.e., data are duplicated across multiple providers as a failsafe against outages); it only refers broadly to the use of two or more cloud providers. Regardless, this trend suggests a decreasing dependence on AWS as the single cloud provider.
To cater to this strategy, various cloud management platforms (CMPs) have emerged in recent years. One such platform, RightScale, was acquired by Flexera for an undisclosed amount in 2018. It was then the best-funded cloud management company, with total funding of $62 million. Other CMP providers include CloudBolt, Scalr, Embotics and VMware CloudHealth.
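The difference between multicloud and redundancy can be made concrete with a small sketch. The service names and the deployment map below are hypothetical; the point is only that a company can use several providers and still have services that depend on a single one.

```python
# Hypothetical map of services to the cloud providers hosting them.
deployments = {
    "checkout": {"aws"},
    "analytics": {"gcp"},
    "image_store": {"aws", "azure"},  # duplicated across two providers
}

def is_multicloud(deployments):
    """True if the company uses two or more providers overall."""
    providers = set().union(*deployments.values())
    return len(providers) >= 2

def services_down(deployments, failed_provider):
    """Services left with no provider once failed_provider is unavailable."""
    return sorted(
        service
        for service, providers in deployments.items()
        if not (providers - {failed_provider})
    )

print(is_multicloud(deployments))         # the company is multicloud
print(services_down(deployments, "aws"))  # yet some services lack a fallback
```

Here the company counts as multicloud, but only the service duplicated across two providers survives an outage at one of them.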
The Cloud Economy’s Continued Reliance on AWS
AWS has a commanding share of the cloud computing market, and it will continue to have favorable revenue growth prospects in the post-Covid-19 era. Meanwhile, the increasing adoption of the multicloud approach will inevitably shrink its share of the pie, but that is far from a complete shift away.
As companies continue to transition to the cloud, a much larger portion of the economy will also become more dependent on AWS. As for those companies with a multicloud strategy, many will continue to rely on AWS to support some of their services. Since a sustained outage would result in tremendous financial losses, Amazon (and its main competitors) will increasingly become “too connected to fail.”
Despite any assurances or improvements in technical processes, cloud outages will not cease to happen in the foreseeable future. It may take a single disastrous event for companies to begin taking the issue seriously (in a parallel case, it took Covid-19 to expose the lack of pandemic stockpiles and trigger the subsequent scramble to fix the supply shortage).
The rise of the CMP sector will hopefully drive more awareness of the importance of having multiple cloud providers, and eventually lead companies to implement cloud redundancy measures in their most critical operations.