In my last post, we covered the basic concepts of cloud computing. This time, we’ll be diving down a little deeper into the cloud to cover the actual structure that most cloud environments use to operate. There are a lot of differences between cloud providers, but they all use very similar hardware and network designs to get things done.
1. The Physical Cloud
Every cloud provider uses a very similar, if not identical, physical hardware design to power its solutions. Cloud systems rely on data centers filled with servers, storage, and network hardware. All of this equipment is designed to be “Highly Available.” This means that if a single component fails, the system will “fail over” to an identical component that takes over until the original component is repaired or replaced. For cloud systems, that “fail over” period must be extremely short and invisible to the end user. For the big providers, these data centers are spread around the world.
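To make the “fail over” idea concrete, here is a minimal Python sketch of an active/standby pair. The class names, server names, and the `healthy` flag are all made up for illustration; real providers rely on health checks and hardware-level redundancy that are far more involved than this.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    healthy: bool = True


class HighlyAvailablePair:
    """Two identical components: one active, one standby (hypothetical model)."""

    def __init__(self, primary: Component, standby: Component) -> None:
        self.active = primary
        self.standby = standby

    def handle_request(self, request: str) -> str:
        # If the active component has failed, swap roles before serving,
        # so the caller never sees the failure.
        if not self.active.healthy:
            if not self.standby.healthy:
                raise RuntimeError("both components are down")
            self.active, self.standby = self.standby, self.active
        return f"{self.active.name} served {request}"


pair = HighlyAvailablePair(Component("server-a"), Component("server-b"))
first = pair.handle_request("req-1")
pair.active.healthy = False  # simulate a hardware failure on server-a
second = pair.handle_request("req-2")
print(first)
print(second)
```

The caller makes the same call before and after the failure, which is the “invisible to the end user” part. The failed component stays in the standby slot until someone repairs it.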
Data centers in the cloud are dispersed across many physical locations. For instance, Office 365 may have a data center in Texas and another in Oregon. Portions of the data that exists in a cloud “tenant” (a single cloud environment) regularly move and replicate between multiple data centers. This allows the cloud customer’s data to be available in another data center if something happens to take the primary out. This could be anything from power outages and natural disasters to dinosaur attacks and alien abductions. This physical resilience further advances the Highly Available goal of cloud systems.
2. The Virtual Cloud
Cloud data centers obviously consist of a huge number of physical servers, but modern servers are far too powerful to run just a single application. Doing so is usually a waste of money, power, and other resources. This is where virtualization comes into play. Virtualization in IT refers to slicing a physical server’s resources into chunks. Each “Virtual” server is assigned a portion of a physical server’s CPU (also called Compute), RAM, and Storage capacity, so a single machine can run multiple applications on separate OS installations that are completely segregated from one another. Virtualized servers are what allow the cloud to function. Without virtualization, cloud services would not be profitable, secure, or useful. There are some notable exceptions to this rule, however.
Microsoft’s Exchange Online (a single piece of Office 365) is built on physical servers with little to no virtualization for the Server OS or the Exchange server application. However, Exchange Online’s resources are different from most other cloud solutions in that its chunks of data are based on Mailboxes. Each Exchange Online mailbox is assigned a specific amount of drive space (up to 100GB for a primary mailbox and unlimited storage for archived data). CPU and RAM resources are shared between all mailboxes on a server. Exchange is not a CPU intensive application and its RAM use is extremely stable and independent of mailbox use. This makes it easy to sell server resources for email and collaboration through Microsoft’s premier messaging suite without having to virtualize servers. In general, SaaS solutions follow this model of selling services by user, organization, or other easily defined boundaries, but most SaaS systems also run on virtualized infrastructure.
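To make the resource slicing behind that virtualized infrastructure concrete, here is a rough Python sketch of one physical server handing out CPU, RAM, and storage to virtual servers. The class names and the host and guest sizes are invented for the example, and it assumes no overcommitment (a guest is refused if any one resource doesn't fit), which is a simplification of what real hypervisors do.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    name: str
    cpu_cores: int
    ram_gb: int
    storage_gb: int


@dataclass
class PhysicalServer:
    cpu_cores: int
    ram_gb: int
    storage_gb: int
    guests: list[VirtualServer] = field(default_factory=list)

    def free(self) -> tuple[int, int, int]:
        """Return unallocated (cpu_cores, ram_gb, storage_gb)."""
        return (
            self.cpu_cores - sum(g.cpu_cores for g in self.guests),
            self.ram_gb - sum(g.ram_gb for g in self.guests),
            self.storage_gb - sum(g.storage_gb for g in self.guests),
        )

    def allocate(self, vm: VirtualServer) -> bool:
        """Place the VM on this host only if every resource fits."""
        cpu, ram, storage = self.free()
        if vm.cpu_cores > cpu or vm.ram_gb > ram or vm.storage_gb > storage:
            return False
        self.guests.append(vm)
        return True


# Hypothetical host: 32 cores, 256 GB RAM, 4000 GB storage.
host = PhysicalServer(cpu_cores=32, ram_gb=256, storage_gb=4000)
placed_web = host.allocate(VirtualServer("web", 8, 32, 200))
placed_db = host.allocate(VirtualServer("db", 16, 128, 2000))
# Only 8 cores remain, so a 16-core guest is refused.
placed_big = host.allocate(VirtualServer("analytics", 16, 64, 500))
print(placed_web, placed_db, placed_big, host.free())
```

Each guest sees only its own slice, which is what lets one physical machine host several isolated workloads instead of sitting mostly idle.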
Along with server virtualization, we also have network virtualization. Building data networks with virtual routers, firewalls, switches, etc. is usually referred to as Software Defined Networking (SDN). SDN makes it possible to create simple or complex data transmission and routing networks without worrying about wires, networking equipment, or all the other physical pieces of a traditional network.
Next we have storage virtualization, which is the allocation of data storage to applications, servers, and users. The allocation of storage in the cloud is based on a lot of factors. In Azure and AWS, storage is mostly sold based on things like read/write speed, the number of times data is written to or read from storage, how much data is read from the storage medium, and other factors. It’s usually fairly complicated and confusing to talk about, but the large number of factors used to sell storage allows for a variety of needs. For instance, Archive servers will prioritize size over speed and won’t have a lot of interaction, so low tier storage, which tends to be much cheaper, can be used. On the other hand, back-end databases for web applications that serve millions of users a month will need faster, more expensive storage, resulting in higher costs for customers.
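The pricing factors above can be sketched as a small Python cost function. The tier names and every price here are made up purely to show the shape of the trade-off; they are not Azure or AWS rates, and the model ignores speed entirely and only looks at the bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    name: str
    price_per_gb: float        # monthly charge per GB stored
    price_per_10k_ops: float   # charge per 10,000 read/write operations
    price_per_gb_read: float   # charge per GB read back out


# Hypothetical prices, not real provider rates.
ARCHIVE = StorageTier("archive", 0.002, 0.10, 0.02)
PREMIUM = StorageTier("premium", 0.15, 0.005, 0.0)


def monthly_cost(tier: StorageTier, stored_gb: float,
                 operations: int, read_gb: float) -> float:
    """Add up the capacity, operation, and read charges for one month."""
    return (
        stored_gb * tier.price_per_gb
        + operations / 10_000 * tier.price_per_10k_ops
        + read_gb * tier.price_per_gb_read
    )


def cheapest_tier(stored_gb: float, operations: int, read_gb: float) -> StorageTier:
    """Pick whichever tier gives the lower monthly bill for this workload."""
    return min(
        (ARCHIVE, PREMIUM),
        key=lambda tier: monthly_cost(tier, stored_gb, operations, read_gb),
    )


# An archive server: lots of data, very little interaction.
print(cheapest_tier(stored_gb=50_000, operations=20_000, read_gb=100).name)
# A busy database back end: little data, constant reads and writes.
print(cheapest_tier(stored_gb=500, operations=500_000_000, read_gb=20_000).name)
```

With these invented numbers, the archive workload lands on the cheap tier because capacity dominates its bill, while the busy database lands on the premium tier because per-operation charges dominate.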
3. Cloud Limitations
The range of options available does provide most companies with all they will need from storage and compute resources, but there is a drawback: cloud systems limit how far the infrastructure and services can be customized. You are completely limited to the options available, and while modern cloud platforms provide a lot of choice, companies with specific, special needs are often better served by keeping systems on-prem. This is especially true with SaaS solutions like Exchange Online. The on-prem Exchange Server offers significantly more control over email and has some situational advantages.
For instance, mailbox size in Exchange Server is effectively limitless. If you want to store 300GB of email in a mailbox, you can do so with no special actions or limitations. Exchange Online, on the other hand, does *offer* unlimited email for its enterprise-level subscriptions, but the implementation of that is clunky. To use unlimited mail in the cloud, you first have to use up your 100GB allotment, then an administrator needs to enable an archive mailbox for you. The archive mailbox is stored in slower, cheaper storage systems than the original 100GB, which is what allows Microsoft to offer it with unlimited storage. The archive mailbox starts with a 50GB allocation of additional space. Once that space is almost used up, the archive mailbox is expanded by another 10-20GB. Each time the archive mailbox is expanded, the user will see a new folder created that denotes the date range for the mail stored in the folder.
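As a rough worked example of that expansion process, this Python sketch counts how many times an archive would grow to hold a given amount of mail. It takes the 100GB and 50GB figures from the description above; the fixed 10GB step and the 90% “almost full” trigger are assumptions picked for illustration, not documented Exchange Online behavior.

```python
PRIMARY_GB = 100        # primary mailbox allotment described above
ARCHIVE_START_GB = 50   # initial archive allocation described above


def archive_expansions(total_mail_gb: int, step_gb: int = 10,
                       threshold_pct: int = 90) -> tuple[int, int]:
    """Return (number_of_expansions, final_archive_capacity_gb).

    step_gb and threshold_pct are assumed values: the text only says the
    archive grows by 10-20GB once it is "almost" full.
    """
    # Mail that overflows the primary mailbox lands in the archive.
    archived = max(0, total_mail_gb - PRIMARY_GB)
    capacity = ARCHIVE_START_GB
    expansions = 0
    # Grow the archive each time usage reaches the "almost full" trigger.
    while archived * 100 >= capacity * threshold_pct:
        capacity += step_gb
        expansions += 1
    return expansions, capacity


for mail_gb in (80, 150, 300):
    print(mail_gb, archive_expansions(mail_gb))
```

Under these assumptions, the 300GB mailbox from the Exchange Server example would go through 18 expansions in the cloud.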
This is all fine for most purposes, except that searching through a mailbox requires users to search each folder individually. This feature may change in the future, but for now, it has major limitations in comparison with the on-prem solution that allows you to store everything in a user’s mailbox or to expand mailboxes into an archive mailbox that has as much or as little storage as company policy allows. So there are limits to what cloud services can do, and you are often limited by the controls made available by the cloud provider. It may be annoying at times, but it is a necessity that is caused by the shared responsibilities that cloud providers and cloud consumers have. And that’s where we’ll go with the next part of this guide, the “Shared Responsibility Model” that explains who is responsible for what in a cloud deployment.
Links to the Other Parts
Because I like links and it seems to be standard to include a link to each part of a multi-part guide, here are the links.