70-534 – Maintaining the Azure Cloud

Azure Overview

As explained in the previous post on the Azure Datacentres, Microsoft’s Azure offerings have to be reliable, perform well and be incredibly resilient, so maintaining the Azure Datacentres can be quite a complex procedure. Microsoft has to have a plan in place for two possible maintenance scenarios: planned and unplanned. Planned maintenance happens on a schedule, while unplanned maintenance occurs in response to an unexpected event, normally a hardware failure.

Azure Planned Maintenance

Microsoft routinely schedules maintenance of its hosting hardware, whether that is a firmware update or a security patch applied to the underlying hypervisor. While most of this work will not affect the virtual machines you have running on the infrastructure, there are some circumstances in which your VMs may be shut down and restarted. Because Microsoft provides a multi-tenanted environment, it would be near impossible to schedule the downtime with every customer whose servers are affected by the maintenance, so this may happen to your VMs.

Azure Availability Sets

So how do you avoid this and ensure your application keeps on going? Microsoft has a Service Level Agreement (SLA) in place only for multi-instance VMs in the same logical group, which is called an availability set. When Microsoft performs maintenance, it ensures that the virtual machines within the same availability set are not all restarted at the same time. So to give your applications the best chance, ensure that you have at least two virtual machines performing the same function (clustered, for example) within the one availability set. Always remember that a single virtual machine does not have an SLA and could be restarted at any time. During Azure datacentre maintenance, single-instance VMs are brought down in parallel, then upgraded and restarted in no particular order. So if your applications run on single-instance virtual machines, they will be unavailable during the maintenance window. Microsoft does send customers an email prior to any scheduled maintenance, detailing the date and time the outage is to be expected, but this is only for planned maintenance. You will of course not be notified of unplanned maintenance.

Azure Availability Sets

Azure Resilience

As shown in the example picture above, we have two front end servers for the application within their own availability set, with the corresponding database servers in an availability set of their own. AppSrv1 will be on a different host, and perhaps even a different rack, from AppSrv2. Should the host running AppSrv1 have an issue and need to restart all the virtual machines running on it, this will not affect AppSrv2. The same goes for the database servers. It is best practice to separate your application databases and other roles and place each in its own availability set.

Where possible, always create multiple instances of your virtual machines and have them within the same availability set. If you do this you will then qualify for the Microsoft SLA.
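
To illustrate the rule above, here is a minimal Python sketch that flags availability sets holding fewer than two virtual machines. The function name, the dictionary layout, the set names and ReportSrv1 are all hypothetical and chosen for this example only; nothing here calls a real Azure API.

```python
def sets_without_sla(availability_sets):
    """Return the names of availability sets holding fewer than two VMs.

    `availability_sets` is a hypothetical mapping of set name -> list of VM names.
    """
    return sorted(
        name for name, vms in availability_sets.items() if len(vms) < 2
    )


# Example deployment; names are made up for illustration.
deployment = {
    "AppAvSet": ["AppSrv1", "AppSrv2"],
    "DbAvSet": ["DBSrv1", "DBSrv2"],
    "ReportingAvSet": ["ReportSrv1"],  # single instance, so not covered
}

print(sets_without_sla(deployment))
```

Any set the function returns would need a second instance before the advice in this section is met.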

Azure Update Domains

Azure Update Domains (sometimes called Upgrade Domains) are used for planned updates to the Azure Cloud service. The default number of Update Domains is five, with a maximum of twenty available to each availability set. Your virtual machines are spread across Update Domains, and as Microsoft rolls out updates to its infrastructure it will only ever update one Update Domain at a time. This avoids unnecessary outages to your applications.
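
The effect of updating one Update Domain at a time can be sketched in a few lines of Python. This is only an illustration: it assumes VMs are placed round-robin across the domains, it walks the domains in numerical order for simplicity (the real order is not guaranteed), and the VM names are invented.

```python
from collections import defaultdict


def rolling_update(vm_names, update_domains=5):
    """Simulate updating one update domain at a time.

    Assumes round-robin placement of VMs across update domains.
    Returns a list of (update_domain, vms_restarting, vms_still_running).
    """
    domains = defaultdict(list)
    for index, vm in enumerate(vm_names):
        domains[index % update_domains].append(vm)

    steps = []
    for ud in sorted(domains):  # numerical order is a simplification
        restarting = domains[ud]
        running = [vm for vm in vm_names if vm not in restarting]
        steps.append((ud, restarting, running))
    return steps


for ud, restarting, running in rolling_update(["Web1", "Web2", "Web3"]):
    print(f"UD{ud}: restarting {restarting}, still running {running}")
```

With three VMs and five Update Domains, each step restarts a single VM while the other two keep serving traffic.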

Azure Unplanned Maintenance

So what happens when there is unplanned maintenance, I hear you ask? As I am sure you are aware, hardware problems are a regular occurrence. Network failures, server issues and even total rack failures can and do happen. Azure detects these failures automatically and will migrate your virtual machines to another, healthy host.

Azure Fault Domains

Azure fault domains are a boundary between pieces of infrastructure within the same datacentre, intended to limit the impact of unplanned outages. Multiple virtual machines deployed in the same availability set are allocated to different fault domains. Fault domains can be on separate racks, with separate power supplies, different switches and sometimes even separate cooling systems. Fault domains within Azure are assigned in a pattern: FD0, FD1, FD0, FD1 and so forth. All this helps guard against unplanned, localised hardware failures interrupting services to your virtual machines. It is very unlikely that two or more fault domains will have issues at once; in fact, a whole datacentre outage is more likely, in which case you would need cross region replication.

Azure Fault Domain Example

Here we show two fault domains together with the availability sets detailed in the earlier diagram. You can see that AppSrv1 and DBSrv1 are in the same fault domain, and are therefore more than likely on the same hardware or within the same rack. Should that rack or hardware fail, AppSrv2 and DBSrv2 will not be affected by the outage and will continue delivering your applications.

VM          Fault Domain
AppSrv1     0
AppSrv2     1
DBSrv1      0
DBSrv2      1

When you boot your servers within your availability set they will be allocated to a fault domain in an order, e.g. FD0, FD1, FD0, FD1, FD0, FD1 etc. The pattern of fault domain allocation never changes and will always follow this pattern.

So how does this work?

It is worth noting that each availability set automatically creates two Fault Domains and is assigned five Update Domains. For example, say you build an availability set with six virtual machines. The first five are allocated to the five Update Domains, and the sixth virtual machine is then added to the first Update Domain, alongside the first VM. In the worst case, VMs number one and six could be restarted at the same time if a maintenance event were to occur. Update Domains are only ever restarted one at a time, but the restart order isn't always sequential, so they can be restarted in any order.
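
The six virtual machine example can be worked through with a short Python sketch. It assumes simple round-robin placement in boot order across two fault domains and five update domains, as described in this post; the function and the VM1 to VM6 names are hypothetical.

```python
def allocate(vm_count, fault_domains=2, update_domains=5):
    """Assign each VM (in boot order) to a fault domain and an update domain,
    assuming simple round-robin placement."""
    return [
        {"vm": f"VM{i + 1}", "fd": i % fault_domains, "ud": i % update_domains}
        for i in range(vm_count)
    ]


placement = allocate(6)
for row in placement:
    print(f"{row['vm']}: FD{row['fd']} UD{row['ud']}")

# VMs sharing update domain 0 would restart together during planned maintenance.
shared_ud0 = [row["vm"] for row in placement if row["ud"] == 0]
print(shared_ud0)
```

Under these assumptions the first and sixth VMs land in the same update domain, which is the worst case mentioned above.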

Cross Region Redundancy

Now, what happens in the unlikely event that a complete Azure datacentre has an issue? Cross region redundancy is available within Azure, which is basically a backup copy of your data in a secondary Azure datacentre (replication of your VMs to a second region). You can set up cross region redundancy for the applications that require this level of service (Tier 1 applications for the most part). You select the primary region to deliver your services from, choose a secondary region, and Azure will take care of the replication. In the event of something catastrophic happening to the primary region, the system will automatically fail over to the secondary region. The beauty of this service is that no manual intervention is required: Azure takes care of both the replication and the failover.
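
The failover decision itself is simple to picture. The Python sketch below only illustrates the logic of preferring the primary region and falling back to the secondary; Azure performs this internally, and the function name and the health mapping are assumptions made for this example.

```python
def active_region(primary, secondary, healthy):
    """Pick the region that should serve traffic.

    `healthy` is a hypothetical mapping of region name -> bool.
    """
    if healthy.get(primary, False):
        return primary
    if healthy.get(secondary, False):
        return secondary
    raise RuntimeError("neither region is healthy")


# Primary is down, so the secondary takes over.
status = {"North Europe": False, "West Europe": True}
print(active_region("North Europe", "West Europe", status))
```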

Service Throttling

As Microsoft’s Azure is a multi-tenant environment with a great many customers, how can Microsoft fairly manage consumption? Service throttling ensures consistent delivery of services to every customer according to that customer's subscription limits. If throttling does occur, you will experience degraded services. Azure bases this throttling on a few different criteria: the amount of data stored, the number of transactions and system throughput. You always have the option to increase your limits should you ever reach them. As always, you should plan your architecture within Azure with performance in mind, but if the need arises you can scale up and scale out as needed.
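
As a rough picture of how limits like these might be checked, the Python sketch below compares current usage against subscription limits for the three criteria mentioned. The metric names and every number are invented for illustration and are not real Azure quotas.

```python
def throttled_metrics(usage, limits):
    """Return the metrics whose usage has reached or passed the limit.

    Metrics with no configured limit are treated as unlimited.
    """
    return sorted(
        metric
        for metric, value in usage.items()
        if value >= limits.get(metric, float("inf"))
    )


# Hypothetical limits and usage figures.
limits = {"stored_gb": 500, "transactions_per_sec": 20000, "throughput_mbps": 60}
usage = {"stored_gb": 120, "transactions_per_sec": 20000, "throughput_mbps": 45}

print(throttled_metrics(usage, limits))
```

A non-empty result is the point at which you would either raise the limit or scale out.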

FAQs

Question Answer
What is Azure planned maintenance?
Azure planned maintenance is when Microsoft schedules maintenance of its hosting hardware, which could include firmware updates or applying security patches to the underlying hypervisor. Some virtual machines may need to be shut down and restarted during this process.
What are Azure Availability Sets?
Azure Availability Sets are a feature that lets customers place virtual machines in the same logical group so that they are not all restarted at the same time during maintenance. Having multiple instances of virtual machines in the same availability set qualifies customers for the Microsoft SLA.
What is Azure resilience?
Azure resilience refers to the ability of a system to withstand and recover from hardware failures or other unexpected events. To ensure resilience, it is best practice to separate application databases and other roles, and to have them in their own availability sets.
What are Azure Update Domains?
Azure Update Domains are used for planned updates to the Azure Cloud service. Virtual machines are spread across update domains to avoid outages to applications, and Microsoft will only ever update one update domain at any time.
What is Azure unplanned maintenance?
Azure unplanned maintenance occurs in response to unexpected events such as hardware failures. Azure automatically detects these failures and migrates virtual machines to another healthy host.
What are Azure Fault Domains?
Azure Fault Domains are a boundary between infrastructure within the same datacenter to prevent issues caused by unplanned outages. Multiple virtual machines deployed in the same availability set are allocated to different fault domains, which can be on separate racks, power supplies, switches, or cooling systems.
How do I ensure my applications are resilient in Azure?
To ensure application resilience in Azure, it is recommended to group virtual machines in the same availability set, separate application databases and other roles, and have them in their own availability sets.
How does Azure handle unplanned outages?
Azure automatically detects unplanned outages and migrates virtual machines to another healthy host.
How does Azure prevent outages during planned maintenance?
Azure uses Azure Availability Sets and Azure Update Domains to prevent outages during planned maintenance.

Well, that's it for today's post. I'll continue with the Architecting Azure Solutions 70-534 study in a further post. Make sure you bookmark this site for further updates.

70-534 – Azure Datacentres

This is the second post of many to come to help you understand and pass the Architecting Microsoft Azure Solutions exam and gain that sought-after certification.

Well, first things first, let's cover off the Microsoft Azure Datacentres. The datacentres may be known as Azure GFS (Global Foundation Services) datacentres, or by the newer name Microsoft Cloud Infrastructure and Operations (MCIO).

Microsoft’s Azure datacentres are in 17 different regions throughout the world, all networked together, with access available from 140 different countries. They operate in 10 different languages and 24 different currencies. Not only can you run your servers and applications in these datacentres, Microsoft also uses them to deliver its own services, such as Office 365, Bing search and Xbox Live, as well as the Azure platform. These datacentres are huge (some as big as three large cruise ships placed end to end), with over one million servers serving over one billion customers. They have to be, to provide infrastructure to Microsoft as well as to all its clients around the world with real time replication, low latency and very high reliability.

The regions they are available in are:

Azure Region            Location
Central US              Iowa
East US                 Virginia
East US 2               Virginia
US Gov Iowa             Iowa
US Gov Virginia         Virginia
North Central US        Illinois
South Central US        Texas
West US                 California
North Europe            Ireland
West Europe             Netherlands
East Asia               Hong Kong
Southeast Asia          Singapore
Japan East              Tokyo, Saitama
Japan West              Osaka
Brazil South            Sao Paulo State
Australia East          New South Wales
Australia South East    Victoria
Central India           Pune
South India             Chennai
West India              Mumbai

Choosing a Microsoft Azure Datacentre

Whenever you choose a datacentre to build your environment in, it's best practice to pick the one that is closest to your users, as this will help with latency, performance and reliability. Not all of the Microsoft Azure datacentres share the same set of services. (Microsoft regularly rolls out new services. To see which services are available and where, visit the Microsoft website https://azure.microsoft.com/en-us/regions/services/). Australia has an additional constraint: only customers residing within Australia and New Zealand can use the services within that region. Additionally, China, which you may have noticed isn't listed above, delivers Azure services independently from the other regions, as the service is offered by one of China's largest Internet Service Providers, 21Vianet. Data within the China Azure infrastructure remains within China and isn't replicated to or shared with the other regions.
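
The selection rule above (closest region that offers everything you need) can be sketched in Python as follows. The latency figures and the per-region service lists are invented purely for illustration and do not reflect real availability; check the Microsoft services page for that.

```python
def choose_region(latency_ms, services_by_region, required):
    """Return the lowest-latency region offering every required service.

    Returns None if no region qualifies. All inputs are hypothetical.
    """
    candidates = [
        region
        for region, offered in services_by_region.items()
        if required <= offered and region in latency_ms
    ]
    return min(candidates, key=latency_ms.get, default=None)


# Made-up measurements from the users' location, in milliseconds.
latency_ms = {"West Europe": 18, "North Europe": 25, "East US": 95}
# Made-up service lists, for illustration only.
services_by_region = {
    "West Europe": {"Virtual Machines", "SQL Database"},
    "North Europe": {"Virtual Machines", "SQL Database", "Media Services"},
    "East US": {"Virtual Machines", "SQL Database", "Media Services"},
}

print(choose_region(latency_ms, services_by_region,
                    {"Virtual Machines", "Media Services"}))
```

In this made-up data the nearest region lacks one required service, so the next closest region is chosen instead.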

Azure Datacentre Resiliency

Having datacentres that big and making them highly available creates a huge problem. Just think about having to manage over one million servers: patching them, updating firmware, replacing failed hardware. The number of servers alone is enough to make the average administrator faint. The advantage that Azure has over the average datacentre is the sheer amount of physical server hardware. When one server starts to fail, its virtual machines can be migrated to another healthy server. Faults are detected and migration is handled automatically. The ability to recover quickly or, in most instances, to migrate these virtual machines live means high resilience is built in. Keeping the Mean Time to Recover (MTTR) low in this way allows Microsoft to keep services available to its customers, quickly and without user intervention.

Azure Security

Microsoft takes security seriously. Imagine all the data belonging to all these customers, and a rogue Microsoft employee starting to steal it. To prevent this, Microsoft has locked down Azure so that administrators have only enough access and time to do the task required. This is known as Just in Time Administrator Access. By default, Microsoft administrators do not have access to customer data and can only gain access when granted by the client, and only during a predetermined window. All administrator access and actions are logged, monitored and audited. Physical access to the Microsoft Azure Datacentres and hardware is also monitored with continuous surveillance.

As you can imagine, Microsoft Azure datacentres are a target for all sorts of hackers and threats. Threat management is also provided as part of the service. Data is scrubbed and monitored for any potential threats before it reaches your precious servers. Intrusion detection, Denial of Service attack prevention, regular penetration testing, data analytics and machine learning tools help to keep your servers and data safe. Azure scans all software during all physical server builds, and also provides real time protection and on demand scanning of its cloud services and virtual machines.

Deployment of patches to the Azure infrastructure is automated, and is prioritised based on the severity of the patch. Azure will also patch customers' virtual machines unless the customer has asked to patch their systems manually themselves (i.e. using SCCM, WSUS or the like).

Having so many customers share infrastructure in a multi-tenant environment could be a huge security risk. Azure logically isolates each customer from every other so that no customer should be able to access any other customer's data. For customers' own security and compliance, Microsoft Azure provides a set of tools to help the client achieve this. Azure offers technology like data encryption in transit and at rest (Azure storage is encrypted). Azure also holds some of the highest security certifications, such as ISO 27001 and ISO 27002, HIPAA, FISMA, FedRAMP etc. (The Microsoft Azure Trust Centre details the certifications held. Please visit https://www.microsoft.com/en-us/trustcenter/Compliance for more information.)

Azure Datacentre Designs

With so many datacentres this large, and so many customers using their services and expecting reliability and performance, every Azure datacentre is designed with infrastructure availability as the main concern. Every critical component of Azure is built with redundancy in mind: multiple Uninterruptible Power Supplies (UPS), huge arrays of batteries and large generators with fuel reserves to compensate in case of a major disaster.

As you can imagine, running each of these datacentres is a huge expense for Microsoft, so each datacentre is also designed to lower its total cost of ownership. The Azure datacentres operate with a Power Usage Effectiveness (PUE) rating as low as 1.125; in comparison, an average datacentre has a PUE rating of 1.8. A low PUE means that the datacentre consumes less power, and Microsoft achieves this by looking at the datacentre as a whole, not just focusing on each single component.
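
PUE is the ratio of the total energy a facility draws to the energy that reaches the IT equipment, so a rating of 1.0 would mean no overhead at all. The Python sketch below uses that definition to compare the two ratings quoted above; the 1,000 kWh IT load is an arbitrary figure chosen for the example.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(it_equipment_kwh, pue_rating):
    """Energy spent on everything other than IT equipment (cooling, power loss)."""
    return it_equipment_kwh * (pue_rating - 1)


it_load = 1000  # arbitrary example load in kWh
for rating in (1.125, 1.8):
    print(f"PUE {rating}: {overhead_kwh(it_load, rating):.0f} kWh of overhead")
```

For the same IT load, the lower rating means far less energy spent on overhead such as cooling and power distribution.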

Azure Datacentre FAQs

Question Answer
What are Microsoft Azure Datacentres?
Microsoft Azure Datacentres are facilities that house and maintain servers and other infrastructure for running applications and services on the Azure platform. They are located in 17 different regions throughout the world and are used by Microsoft to deliver their own services as well as provide infrastructure to clients around the world.
What are the regions in which Microsoft Azure Datacentres are available?
Microsoft Azure Datacentres are available in 17 different regions around the world, including Central US, East US, West US, North Europe, West Europe, East Asia, Southeast Asia, Japan East, Japan West, Brazil South, Australia East, Australia South East, Central India, South India, and West India.
How do I choose a Microsoft Azure Datacentre?
When choosing a Microsoft Azure Datacentre to build your environment in, it’s best practice to choose the one that is closest to your users to improve latency, performance, and reliability. Not all of the datacentres share the same set of services, so it’s important to check which services are available and where on the Microsoft website. Additionally, only customers residing within Australia and New Zealand can use the services within the Australia region.
What is Azure Datacentre Resiliency?
Azure Datacentre Resiliency refers to the high resilience built into Microsoft’s Azure datacentres, which allows virtual machines to be quickly recovered or migrated live to another healthy server in the event of a failure. Faults are detected and migration is handled automatically, resulting in a Mean Time to Recover (MTTR) that allows Microsoft to provide the availability of services to their customers quickly and without user intervention.
How does Microsoft ensure the security of its Azure Datacentres?
Microsoft takes the security of its Azure Datacentres seriously, and has implemented measures such as Just in Time Administrator Access, physical access monitoring, and continuous surveillance to prevent unauthorized access. Threat management is also provided as part of the service, which includes intrusion detection, Denial of Service attack prevention, regular penetration testing, data analytics, and machine learning tools to help keep customers’ servers and data safe. Azure logically isolates each customer from each other to reduce the risk of security breaches in the multi-tenant environment.

Well, that's enough for the moment. I will continue with the next blog post for the 70-534 exam another day.

70-534 – Architecting Microsoft Azure Solutions

70-534 – Skills to study

If you love Microsoft products as much as we do, and you wish to learn more about them and be recognised as one of the select few solution experts, you will need to study, study and then study some more. There is nothing better than hands on experience combined with excellent articles and blogs to help you pass. The first thing to do is sign up to Azure for a free account. Microsoft offers free $200 a month, which should be more than enough to put what you learn into practice. As always with Microsoft’s Azure, you are billed by the minute, so make sure to shut down and deallocate anything you build to avoid using up your free credits. (You could also use our tool, the Azure Virtual Machine Scheduler, to automate shutting down, deallocating and powering your VMs back on to a schedule you specify. You can download a free 30 day trial to test it for yourself.)

If you are familiar with Microsoft exams, you will know they are never easy. Be prepared to spend many nights reading books, watching videos and playing in your test Azure subscription to gain first hand experience. The following is taken from the Microsoft site and lists the sections that you will need to study to pass the 70-534 exam. I will cover these in future posts to help you.

Design Microsoft Azure infrastructure and networking (15-20%)

Describe how Azure uses Global Foundation Services (GFS) datacenters

Understand Azure datacenter architecture, regional availability, and high availability

Design Azure virtual networks, networking services, DNS, DHCP, and IP addressing configuration

Extend on-premises Active Directory, deploy Active Directory, define static IP reservations, understand ACLs and Network Security Groups, design resource groups

Design Azure Compute

Design Azure virtual machines (VMs) and VM architecture for IaaS and PaaS; understand availability sets, fault domains, and update domains in Azure; differentiate between machine classifications

Describe Azure virtual private network (VPN) and ExpressRoute architecture and design

Describe Azure point-to-site (P2S) and site-to-site (S2S) VPN, understand the architectural differences between Azure VPN and ExpressRoute

Describe Azure services

Understand, at a high level, Azure load balancing options, including Traffic Manager, Azure Media Services, CDN, Azure Active Directory (Azure AD), Azure Cache, Multi-Factor Authentication, and Service Bus

Secure resources (15-20%)

Secure resources by using managed identities

Describe the differences between Active Directory on-premises and Azure AD, programmatically access Azure AD using Graph API, secure access to resources from Azure AD applications using OAuth and OpenID Connect

Secure resources by using hybrid identities

Use SAML claims to authenticate to on-premises resources, describe DirSync synchronization, implement federated identities using Azure Access Control service (ACS) and Active Directory Federation Services (ADFS)

Secure resources by using identity providers

Provide access to resources using identity providers, such as Microsoft account, Facebook, Google, and Yahoo!; manage identity and access by using Azure Active Directory B2C

Identify an appropriate data security solution

Use the appropriate Access Control List (ACL), identify security requirements for data in transit and data at rest; identify, assess, and mitigate security risks by using Azure Operations Management Suite

Design a role-based access control strategy

Secure resource scopes, such as the ability to create VMs and Azure Web Apps

Design an application storage and data access strategy (15-20%)

Design data storage

Design storage options for data, including Table Storage, SQL Database, DocumentDB, Blob Storage, MongoDB, and MySQL; design security options for SQL Database or Azure Storage; identify the appropriate VM type and size for a solution

Design applications that use Mobile Apps

Create Azure Mobile Services, consume Mobile Apps from cross-platform clients, integrate offline sync capabilities into an application, extend Mobile Apps using custom code, implement Mobile Apps using Microsoft .NET or Node.js, secure Mobile Apps using Azure AD

Design applications that use notifications

Implement push notification services in Mobile Apps, send push notifications to all subscribers, specific subscribers, or a segment of subscribers

Design applications that use a web API

Implement a custom web API, scale using Azure Web Apps, offload long-running applications using WebJobs, secure a web API using Azure AD

Design a data access strategy for hybrid applications

Connect to on-premises data from Azure applications using Service Bus Relay, Hybrid Connections, or the VPN capability of Websites, identify constraints for connectivity with VPN, identify options for joining VMs to domains or cloud services

Design a media solution

Describe Media Services, understand key components of Media Services, including streaming capabilities, video on-demand capabilities, and monitoring services

Design an advanced application (15-20%)

Create compute-intensive applications

Design high-performance computing (HPC) and other compute-intensive applications using Azure Services

Create long-running applications

Implement worker roles for scalable processing, design stateless components to accommodate scale

Select the appropriate storage option

Use a queue-centric pattern for development, select the appropriate storage for performance, identify storage options for cloud services and hybrid scenarios with compute on-premises and storage on Azure, differentiate between cloud services and VMs interacting with storage service and SQL Database

Integrate Azure services in a solution

Identify the appropriate use of Azure Machine Learning, big data, Azure Media Services, and Azure Search services

Design Azure Web Apps (15-20%)

Design Azure Web Apps for scalability and performance

Globally scale Azure Web Apps, create Azure Web Apps using Visual Studio, debug Azure Web Apps, understand supported languages, differentiate between Azure Web Apps to VMs and cloud services

Deploy Azure Web Apps

Implement Azure Site Extensions, create packages, App service plans, deployment slots, resource groups, publishing options, Web Deploy, and FTP locations and settings

Design Azure Web Apps for business continuity

Scale up and scale out using Azure Web Apps and SQL Database, configure data replication patterns, update Azure Web Apps with minimal downtime, back up and restore data, design for disaster recovery, deploy Azure Web Apps to multiple regions for high availability, design the data tier; use Azure Resource Manager (ARM) templates to configure highly available Web Apps

Design a management, monitoring, and business continuity strategy (15-20%)

Evaluate hybrid and Azure-hosted architectures for Microsoft System Center deployment

Understand, at an architectural level, which components are supported in Azure; describe design considerations for managing Azure resources with System Center; understand which scenarios would dictate a hybrid scenario

Design a monitoring strategy

Identify the Microsoft products and services for monitoring Azure solutions; understand the capabilities of System Center for monitoring an Azure solution; understand built-in Azure capabilities; identify third-party monitoring tools, including open source; describe use cases for Operations Manager, Global Service Monitor, and Application Insights; describe the use cases for Windows Software Update Services (WSUS), Configuration Manager, and custom solutions; describe the Azure architecture constructs, such as availability sets and update domains, and how they impact a patching strategy; analyze logs by using the Azure Operations Management Suite

Describe Azure business continuity/disaster recovery (BC/DR) capabilities

Understand the architectural capabilities of BC/DR, describe Hyper-V Replica and Azure Site Recovery (ASR), describe use cases for Hyper-V Replica and ASR; use Azure Backup to back up ARM VMs

Design a disaster recovery strategy

Design and deploy Azure Backup and other Microsoft backup solutions for Azure, understand use cases when StorSimple and System Center Data Protection Manager would be appropriate, design and deploy Azure Site recovery

Design Azure Automation and PowerShell workflows

Create a PowerShell script specific to Azure, automate tasks by using the Azure Operations Management Suite

Describe the use cases for Azure Automation configuration

Understand when to use Azure Automation, Chef, Puppet, PowerShell, or Desired State Configuration (DSC)

Azure Exam FAQs

Question Answer

What is the 70-534 exam?

The 70-534 exam is a Microsoft certification exam that tests your ability to design and implement Azure solutions.

What skills do I need to study for the 70-534 exam?

You will need to have a solid understanding of Microsoft Azure infrastructure and networking, as well as experience with Azure virtual networks, networking services, DNS, DHCP, and IP addressing configuration.

What resources are available to help me prepare for the exam?

Microsoft offers $200 a month in free Azure credits, which should be more than enough to put what you learn into practice. You can also gain first-hand experience in a test Azure subscription, read articles and blogs, and watch videos.

How can I secure my resources when using Azure?

You can use managed identities, hybrid identities, and identity providers to secure your resources. You can also implement role-based access control strategies and identify appropriate data security solutions.

What options do I have for data storage in Azure?

You can choose from several storage options, including Table Storage, SQL Database, DocumentDB, Blob Storage, MongoDB, and MySQL. You should also design security options for SQL Database or Azure Storage and identify the appropriate VM type and size for a solution.

How can I design applications that use notifications or a web API?

You can implement push notification services in Mobile Apps and send push notifications to all subscribers or specific segments. You can also implement a custom web API, scale using Azure Web Apps, offload long-running applications using WebJobs, and secure a web API using Azure AD.

What is the importance of a data access strategy for hybrid applications?

A data access strategy is important for connecting to on-premises data from Azure applications using Service Bus Relay, Hybrid Connections, or the VPN capability of Websites. It helps you identify constraints for connectivity with VPN and options for joining VMs to domains or cloud services.

How can I design an advanced application?

You can create compute-intensive applications using Azure Services, implement worker roles for scalable processing, and select the appropriate storage option. You should also design stateless components to accommodate scale.

Continue reading the next blog post to learn about the Azure Datacentres for the 70-534 exam.