
Understanding Scope and Durability of Data on AWS


Understanding the Scope and Durability of Data

By Brady O’Brien Cloud Insight Blog January 26, 2021

“We moved it to the cloud! How did we lose our data? We thought the cloud was automatically redundant and infinitely durable!” Anyone who has worked in the public cloud segment for any amount of time has heard these questions. The truth is that moving valuable data and workloads to the cloud requires careful thought to ensure that the data gets the durability and availability that best fit the use case. There is no magic cloud button that ensures 100% durability and availability, not even in the most expensive solutions. But have no fear, there is hope. To illustrate the importance of durability and availability, we will use the largest cloud provider, Amazon Web Services (AWS), as an example.

Durability vs. Availability
So, what’s the difference between durability and availability anyway? Durability is the probability that your data remains intact in the storage system, so that you can retrieve it uncorrupted when you need it. In AWS, most S3 storage tiers are designed for what is often called “eleven 9s” of durability, meaning 99.999999999% durability of objects over the course of a year. In other words, if you store 10,000,000 objects, you can expect to lose one of them, on average, once every 10,000 years. Pretty good, right? If only we could expect that kind of durability from our cars. Availability, on the other hand, is a bit of a different story.
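The arithmetic behind a 99.999999999% annual durability figure can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the model assumes each object is lost independently with the same annual probability, which is a simplification.

```python
from decimal import Decimal

# Annual per-object durability of 99.999999999%, written as a fraction.
# Decimal is used so that 1 - durability is exact rather than a rounded float.
ELEVEN_NINES = Decimal("0.99999999999")


def expected_annual_losses(object_count: int, durability: Decimal = ELEVEN_NINES) -> Decimal:
    """Average number of objects lost per year under the stated assumptions."""
    return Decimal(object_count) * (Decimal(1) - durability)


def years_per_lost_object(object_count: int, durability: Decimal = ELEVEN_NINES) -> Decimal:
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_losses(object_count, durability)


if __name__ == "__main__":
    for count in (10_000, 10_000_000):
        print(count, "objects:", years_per_lost_object(count), "years per lost object")
```

Under these assumptions, ten million stored objects work out to one expected loss every 10,000 years, and ten thousand objects to one expected loss every ten million years.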

Availability is better described as the probability that you can retrieve your data immediately when you need it. S3 Standard, for example, is designed for 99.99% availability (some lower-cost tiers are designed for less), so there is a 0.01% chance that you won’t be able to access your data at a given moment. As mentioned before, the chance that the data is not intact when it does arrive is far lower.
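To make an availability percentage more concrete, it can be converted into expected unavailable time per year. The sketch below is a back-of-the-envelope calculation, not an AWS formula; the function name is hypothetical, and the result is a long-run average rather than a guarantee about any particular year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def expected_unavailable_minutes(availability_percent: float) -> float:
    """Average minutes per year during which the data cannot be reached."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.99, 99.9, 99.5):
        minutes = expected_unavailable_minutes(pct)
        print(f"{pct}% available: about {minutes:.1f} minutes unavailable per year")
```

At 99.99% availability this comes to a little under an hour per year on average, which may or may not be acceptable depending on how critical the dataset is.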

To differentiate between the two concepts in simple terms: availability is whether you can reach your data at the moment you ask for it, while durability is whether that data is still intact and uncorrupted when you get it. High availability does not come automatically; it has to be architected.

Architecting High Availability
For this discussion, we will focus on Amazon’s object storage solution, Simple Storage Service (S3), with the goal of understanding the basics. Most tiers of Amazon S3 store your data on multiple devices, across what Amazon refers to as multiple Availability Zones (AZs). Think of Availability Zones as independent locations, but all in the same region. If the server hosting your data in one location (AZ) goes down, there will be two other locations ready to pick up the slack immediately.

The tricky part here is that most, but not all, S3 tiers store data across at least three AZs; some less expensive tiers, such as S3 One Zone-IA, store data in only one AZ. This is where planning comes into play. Often, the assumption that “cloud is automatically redundant” causes issues when the wrong tier or service is chosen. Sadly, most organizations don’t find this out until it’s time to retrieve their data. Part of that planning is deciding how quickly each dataset must be retrievable: while some data is highly critical, other data can wait to be retrieved for a few hours or even a few days.
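One way to keep the single-AZ pitfall visible during planning is to write the redundancy of each tier down explicitly and check choices against it. The sketch below is illustrative only: the table is a partial snapshot keyed by the S3 API storage class names, the helper names are made up, and the values should be verified against current AWS documentation before being relied on.

```python
# Minimum number of Availability Zones each S3 storage class stores data in.
# Partial, illustrative snapshot; verify against current AWS documentation.
MIN_AZS_BY_STORAGE_CLASS = {
    "STANDARD": 3,
    "INTELLIGENT_TIERING": 3,
    "STANDARD_IA": 3,
    "ONEZONE_IA": 1,
    "GLACIER": 3,
    "DEEP_ARCHIVE": 3,
}


def classes_meeting(min_azs: int) -> list[str]:
    """Return the storage classes that keep data in at least `min_azs` AZs."""
    return sorted(
        name for name, azs in MIN_AZS_BY_STORAGE_CLASS.items() if azs >= min_azs
    )


def is_single_az(storage_class: str) -> bool:
    """True if losing one AZ could make (or leave) the data unavailable or lost."""
    return MIN_AZS_BY_STORAGE_CLASS[storage_class] == 1


if __name__ == "__main__":
    print("Multi-AZ classes:", classes_meeting(3))
    print("ONEZONE_IA is single-AZ:", is_single_az("ONEZONE_IA"))
```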

So, is that it? Just choose a storage tier that uses multiple AZs? The short answer is probably not; it depends on just how critical access to the particular dataset is. Let’s say there’s a natural disaster, such as a flood, earthquake, or tornado, and all three of the AZs go down. Now what? This is where the previously mentioned regions come into play. If the data is located in California, we can also replicate it to a region that is less earthquake prone, such as Virginia. We now have the data stored in six Availability Zones across two regions, which, of course, will roughly double the cost of data storage.
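As a rough sketch of what such a setup involves, the snippet below builds a cross-region replication configuration as a plain Python dictionary, in the shape that the S3 `PutBucketReplication` API (for example, boto3's `put_bucket_replication`) takes as its `ReplicationConfiguration`. The role ARN, bucket ARN, and rule ID are hypothetical placeholders, the code does not call AWS, and versioning must already be enabled on both the source and destination buckets for replication to work.

```python
import json


def build_replication_config(role_arn: str, destination_bucket_arn: str) -> dict:
    """Build a one-rule configuration that replicates every object in a bucket."""
    return {
        # IAM role that S3 assumes to copy objects (hypothetical placeholder).
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-everything",  # illustrative rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # an empty filter matches all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    # Bucket in the second region (hypothetical placeholder).
                    "Bucket": destination_bucket_arn,
                    "StorageClass": "STANDARD",
                },
            }
        ],
    }


if __name__ == "__main__":
    config = build_replication_config(
        role_arn="arn:aws:iam::123456789012:role/example-replication-role",
        destination_bucket_arn="arn:aws:s3:::example-bucket-us-east-1",
    )
    print(json.dumps(config, indent=2))
```

Choosing a cheaper storage class for the destination copy is one lever for reducing the added cost, at the price of slower or more expensive retrieval from the second region.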

Many of the planning techniques and features of object storage carry over to operating system data as well. In either case, when moving workloads to the cloud, it is critical that the environment be carefully architected and vetted, usually by a third party like Presidio, to balance availability, durability, and price, to name just a few considerations.

Learn more about Presidio Cloud Solutions.


