LearnwithVishnu
Basics → Production → Architect
🌐 Multi-Cloud
Multi-cloud strategy, Terraform multi-provider, challenges and patterns

🌐 What is Multi-Cloud?

Multi-cloud means using services from more than one public cloud provider at the same time. For example: AWS for machine learning workloads (SageMaker has a deep ML ecosystem), Azure for Kubernetes and enterprise identity (AKS plus Active Directory integration), and GCP for analytics (BigQuery is a strong fit for large-scale data warehousing). Each cloud is used for what it does best.

Multi-cloud is different from hybrid cloud. Hybrid cloud = a public cloud connected to your own on-premises data centre. Multi-cloud = multiple public cloud providers (AWS, Azure, GCP), with no on-premises component required.

Why organisations use multi-cloud

| Reason | Explanation | Real example |
| --- | --- | --- |
| Best-of-breed services | Each cloud has distinctive strengths that the others do not fully match | GCP BigQuery, Azure AD + M365, AWS broadest managed services |
| Avoid vendor lock-in | With one cloud, the provider controls your pricing. Two clouds give you negotiation leverage. | Renegotiate an AWS contract because Azure is a real option |
| Regulatory compliance | Some jurisdictions require data to be stored or processed in-country, which can constrain provider choice | India's RBI rules on payment data, European data residency requirements |
| Resilience | Major cloud outages happen: AWS us-east-1 and Azure East US have both had multi-hour incidents | Route traffic to Azure if AWS has an outage |
| Mergers and acquisitions | The acquired company uses a different cloud, and consolidation takes years | Post-acquisition: one team on AWS, another on Azure |

Multi-cloud by design vs by accident

Most organisations are multi-cloud by accident — different teams chose different clouds independently, or acquisitions brought in companies using different providers. Managing this unplanned multi-cloud is harder than designing for it. The tools and patterns below apply to both scenarios.

🔧 Tools — Terraform, Kubernetes, Observability

Why organisations go multi-cloud

| Reason | Explanation | Real example |
| --- | --- | --- |
| Avoid vendor lock-in | No single provider can hold you hostage on pricing or features | Use AWS for compute, Azure for AD integration |
| Best-of-breed services | Each cloud has unique strengths | GCP BigQuery for analytics, Azure for enterprise identity, AWS for ML |
| Regulatory compliance | Some countries require certain data to stay in-country, which can dictate the provider | India: RBI requires payment data to be stored locally |
| Resilience | One cloud outage does not take down everything | Route traffic to AWS if Azure has an outage |
| Mergers and acquisitions | The acquired company uses a different cloud | Post-acquisition: one team on AWS, another on Azure |
| Cost arbitrage | Pricing differs between clouds for the same kind of workload | Spot instances cheaper on AWS for batch |

Multi-cloud challenges

| Challenge | Impact | Mitigation |
| --- | --- | --- |
| Skills gap | Need expertise in 2-3 clouds, which is expensive | Focus on cloud-agnostic tools (K8s, Terraform) |
| Data egress costs | Moving data between clouds is expensive | Keep compute close to data; process where data lives |
| Security complexity | Separate IAM, policies, tools per cloud | CSPM tools (Prisma Cloud, Wiz) for unified view |
| Operational overhead | 2x the monitoring, alerting, ops runbooks | Unified observability (Datadog, Prometheus federation) |
| Networking complexity | Cross-cloud connectivity needs VPN or SD-WAN | Avoid cross-cloud data paths where possible |

Tools that make multi-cloud manageable

| Tool | Category | What it does |
| --- | --- | --- |
| Terraform | IaC | Manage AWS, Azure, GCP with one codebase. Provider per cloud, same workflow. |
| Kubernetes | Compute | Same manifests, Helm charts, ArgoCD on any cloud K8s cluster |
| Datadog / Prometheus | Observability | Unified metrics and alerts across all clouds |
| Vault (HashiCorp) | Secrets | One secrets store accessed from any cloud |
| Packer | Images | Build machine images for AMI (AWS), VHD (Azure), GCE image (GCP) from one template |
| Pulumi | IaC | Like Terraform but uses general-purpose languages such as Python or TypeScript instead of HCL |
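
To make the "same definition on any cloud K8s cluster" idea concrete, here is a minimal sketch using the Terraform `kubernetes` provider with one alias per cluster. The kubeconfig context names (`eks-prod`, `aks-prod`) and the `payments` namespace are hypothetical, and the sketch assumes both clusters already exist and are reachable from a local kubeconfig.

```hcl
terraform {
  required_providers {
    kubernetes = { source = "hashicorp/kubernetes", version = "~> 2.0" }
  }
}

# One provider alias per cluster. Context names are assumptions.
provider "kubernetes" {
  alias          = "eks"
  config_path    = "~/.kube/config"
  config_context = "eks-prod"
}

provider "kubernetes" {
  alias          = "aks"
  config_path    = "~/.kube/config"
  config_context = "aks-prod"
}

locals {
  # Shared labels so the namespace looks identical on both clusters
  namespace_labels = {
    team = "payments"
  }
}

# The same namespace definition, applied once per cluster
resource "kubernetes_namespace" "payments_eks" {
  provider = kubernetes.eks

  metadata {
    name   = "payments"
    labels = local.namespace_labels
  }
}

resource "kubernetes_namespace" "payments_aks" {
  provider = kubernetes.aks

  metadata {
    name   = "payments"
    labels = local.namespace_labels
  }
}
```

The resource definition is identical on both sides; only the provider alias changes. Cloud-specific pieces such as load balancers and storage classes would still differ per cluster.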

Terraform multi-cloud example

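# providers.tf: configure multiple clouds
terraform {
  required_providers {
    aws     = { source = "hashicorp/aws", version = "~> 5.0" }
    azurerm = { source = "hashicorp/azurerm", version = "~> 3.0" }
    google  = { source = "hashicorp/google", version = "~> 5.0" }
  }
}

# Inputs referenced by the provider blocks below
variable "azure_subscription_id" {
  type = string
}

variable "gcp_project" {
  type = string
}

provider "aws" {
  region = "us-east-1"
}

provider "azurerm" {
  features {}
  subscription_id = var.azure_subscription_id
}

provider "google" {
  project = var.gcp_project
  region  = "us-central1"
}

# Use AWS for primary, Azure for DR (resource bodies elided)
# Note: azurerm_virtual_machine is the legacy resource; newer configurations
# usually use azurerm_linux_virtual_machine or azurerm_windows_virtual_machine.
resource "aws_instance" "primary" { ... }
resource "azurerm_virtual_machine" "dr" { ... }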

⚠️ Challenges and Tradeoffs

Common multi-cloud architecture patterns

| Pattern | How it works | When to use |
| --- | --- | --- |
| Cloud-agnostic apps | Apps containerised with K8s, deploy anywhere, use abstracted storage/DB APIs | Portability is a requirement, no vendor lock-in acceptable |
| Cloud bursting | Primary workload on one cloud, burst to second cloud for peaks | Cost optimisation, seasonal peaks |
| Active-active DR | Full deployment on two clouds, Route 53/Traffic Manager routes between them | Zero downtime requirement, can afford 2x cost |
| Best-of-breed | Each cloud used for what it does best, integrated via APIs | GCP BigQuery + Azure AD + AWS EKS in same platform |
| Shadow IT consolidation | Different business units chose different clouds; consolidate with unified governance | Post-merger, large enterprise with many teams |
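
As an illustration of the active-active row, here is a minimal sketch of Route 53 weighted routing with health checks across an AWS and an Azure deployment. The hostnames, the `/healthz` path, and the `zone_id` variable are hypothetical, and the sketch assumes an `aws` provider is already configured. If one endpoint fails its health check, Route 53 stops returning it and all traffic goes to the other.

```hcl
variable "zone_id" {
  description = "ID of an existing Route 53 hosted zone (assumption)"
  type        = string
}

locals {
  # Hypothetical public endpoints of the two deployments
  endpoints = {
    aws   = "app-aws.example.com"
    azure = "app-azure.example.com"
  }
}

# One HTTPS health check per cloud endpoint
resource "aws_route53_health_check" "endpoint" {
  for_each = local.endpoints

  fqdn              = each.value
  port              = 443
  type              = "HTTPS"
  resource_path     = "/healthz"
  failure_threshold = 3
  request_interval  = 30
}

# Two weighted records under the same name, split 50/50.
# An unhealthy record is dropped from DNS answers.
resource "aws_route53_record" "app" {
  for_each = local.endpoints

  zone_id         = var.zone_id
  name            = "app.example.com"
  type            = "CNAME"
  ttl             = 60
  records         = [each.value]
  set_identifier  = each.key
  health_check_id = aws_route53_health_check.endpoint[each.key].id

  weighted_routing_policy {
    weight = 50
  }
}
```

One design caveat: hosting DNS in Route 53 makes AWS a dependency of the failover path itself, so some teams place the traffic manager with a third party.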

Multi-cloud networking options

| Option | What it is | Cost |
| --- | --- | --- |
| Cloud VPN | IPSec VPN tunnels between clouds over the internet | Low: pay for gateway + data transfer |
| SD-WAN | Software-defined WAN connecting all clouds and on-prem | Medium: managed service cost |
| Direct interconnect | Dedicated private circuits (AWS Direct Connect, Azure ExpressRoute) linked together | High: dedicated bandwidth |
| Aviatrix / Megaport | Third-party cloud networking fabric | Medium: subscription-based |

Multi-cloud adds real operational complexity — know the tradeoffs before choosing it

| Challenge | Impact | How to address |
| --- | --- | --- |
| Skills gap | Team needs expertise across 2-3 clouds, which is expensive to hire and train for | Invest in cloud-agnostic skills (K8s, Terraform, Prometheus). Accept some per-cloud specialisation. |
| Data egress costs | Moving data between clouds costs roughly $0.08-0.09/GB, so active-active multi-cloud is expensive | Keep compute close to data. Process where data lives. Avoid cross-cloud data movement. |
| Security complexity | Separate IAM per cloud, more attack surface, harder to audit | CSPM tool (Prisma Cloud, Wiz) for unified security view. Consistent tagging and policy enforcement. |
| Operational overhead | Separate on-call runbooks, separate monitoring, double the expertise required | Unified observability (Datadog or Prometheus federation). Standardised runbook format. |
| Networking latency | Cross-cloud API calls add significant latency compared to intra-cloud | Design workloads as self-contained within one cloud. Use async patterns for cross-cloud. |
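
"Consistent tagging" can be sketched in Terraform by defining one tag map and applying it in each cloud. This is an illustrative fragment, not a complete configuration: the tag values, resource names, and bucket name are hypothetical, and it assumes the `azurerm` and `google` providers are configured elsewhere. The `aws` provider block shown here would take the place of a plain one, since a configuration can only have one default `aws` provider.

```hcl
locals {
  # Lowercase keys and values so the same map is also valid as GCP labels
  common_tags = {
    owner       = "platform-team"
    environment = "production"
    cost_center = "cc-1234"
  }
}

# AWS: default_tags applies the map to every taggable resource
provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = local.common_tags
  }
}

# Azure: tags are set per resource
resource "azurerm_resource_group" "app" {
  name     = "rg-app-prod"
  location = "eastus"
  tags     = local.common_tags
}

# GCP: the equivalent concept is labels
resource "google_storage_bucket" "data" {
  name     = "example-app-prod-data" # bucket names must be globally unique
  location = "US"
  labels   = local.common_tags
}
```

With one source of truth for tags, cost reports and CSPM policies can filter on the same keys in every cloud.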

The honest answer for interviews

Multi-cloud adds significant operational complexity. Only adopt it deliberately when the business benefit (best-of-breed services, resilience, compliance) clearly outweighs the cost. Most medium-sized organisations are better served by one cloud done excellently than two clouds done adequately. If you inherit a multi-cloud situation, standardise with cloud-agnostic tools as the priority.

🎯 Interview Questions

MULTI-CLOUD · ARCHITECT
When does multi-cloud make sense and when is it over-engineering?
Multi-cloud makes sense in specific, well-defined scenarios.

First: DR with vendor diversity. Primary on AWS, DR on Azure. If AWS has a major regional outage (it has happened), you fail over to Azure. This protects against provider-level failures that multi-AZ within one cloud does not cover.

Second: regulatory requirements. Some regulators mandate data residency or geographic distribution across different legal jurisdictions, which can force a second provider.

Third: best-of-breed services. Your analytics team needs BigQuery (GCP). Your K8s runs on AWS. Your enterprise identity is Azure Active Directory. You are already multi-cloud by necessity.

Fourth: a large enterprise with existing contracts on multiple clouds, separate teams, and the engineering maturity to manage the complexity.

Multi-cloud is over-engineering when you are a startup or small team that should master one cloud first, or when you want to avoid vendor lock-in in theory but have no concrete plan. Kubernetes does not fully abstract cloud differences: networking, storage, load balancers, and IAM all differ significantly. The operational overhead of running two cloud environments is real: two sets of tooling expertise, two sets of monitoring dashboards, two incident response playbooks.

My recommendation: start with one cloud and do it excellently. Add a second cloud only when you have a specific, documented business requirement that justifies the added complexity.
MULTI-CLOUD · ENGINEER
What is multi-cloud and what are the main reasons organisations adopt it?
Multi-cloud means using services from multiple public cloud providers simultaneously. It is different from hybrid cloud, which is on-premises plus a public cloud.

Main reasons:

Best-of-breed services: GCP BigQuery is widely regarded as a leading choice for large-scale analytics. Azure Active Directory (now Microsoft Entra ID) integrates best with Microsoft 365. AWS has the widest range of managed services. Using each for its strength is rational.

Avoid vendor lock-in: with one cloud, the provider controls your pricing. Negotiating is hard when migration would take years. Two clouds create leverage.

Regulatory: some regions require data to be kept in-country or processed under specific conditions, which can force a second provider.

Resilience: a major cloud outage (AWS us-east-1 has had multi-hour outages) can take down the entire business if you are single-cloud. Multi-cloud provides geographic and provider redundancy.

M&A: after an acquisition, the two companies are on different clouds. Consolidation takes years.

The reality: most organisations are multi-cloud by accident (different teams chose different clouds) rather than by strategy. The challenge is managing the complexity: separate IAM systems, different billing models, different CLI tools, double the ops runbooks. Use cloud-agnostic tools (Kubernetes, Terraform, Datadog) to reduce this overhead.
MULTI-CLOUD · ARCHITECT
How do you use Terraform to manage infrastructure across multiple clouds?
Terraform supports multi-cloud through its provider system. Each cloud has a provider (hashicorp/aws, hashicorp/azurerm, hashicorp/google), and each is configured in its own provider block. In one Terraform codebase you can create an AKS cluster in Azure and an EKS cluster in AWS, manage DNS in Route53, and create a GCS bucket, all in the same plan.

Structure for multi-cloud Terraform: separate module directories per cloud (modules/aws/, modules/azure/, modules/gcp/). Each module is self-contained and uses only its cloud provider. Root modules compose them together. Keep separate state per cloud and per environment: azure/production has its own state file, aws/production has its own.

Cross-cloud references: Terraform outputs from one module can be used as inputs to another. Example: output the Azure AKS endpoint, then use it as the ArgoCD target cluster URL in a Kubernetes provider block.

Credentials: each provider gets its own auth. AWS via IAM role or env vars, Azure via service principal, GCP via service account key or application default credentials. In CI/CD, pipelines authenticate to every cloud the configuration touches, and terraform plan shows the changes across all of them.

The risk: a single plan can accidentally destroy resources in multiple clouds. Always review plan output carefully before apply.
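
The separate-state and cross-cloud-reference ideas can be sketched as follows. This is a hedged illustration, not a prescribed layout: the bucket name, state keys, file paths, and the `azurerm_kubernetes_cluster.main` resource are assumptions, and an S3 backend is only one of several possible state stores. The comments mark which root module each fragment would live in.

```hcl
# --- azure/production/backend.tf (hypothetical path) ---
terraform {
  backend "s3" {
    bucket = "example-tf-state" # assumed state bucket
    key    = "azure/production/terraform.tfstate"
    region = "us-east-1"
  }
}

# --- azure/production/outputs.tf ---
# Assumes an azurerm_kubernetes_cluster named "main" is defined in this root.
output "aks_host" {
  value     = azurerm_kubernetes_cluster.main.kube_config[0].host
  sensitive = true # kube_config is a sensitive attribute
}

# --- aws/production/main.tf: a different root with its own state ---
# Reads the Azure root's outputs without sharing its state file.
data "terraform_remote_state" "azure_prod" {
  backend = "s3"

  config = {
    bucket = "example-tf-state"
    key    = "azure/production/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Cross-cloud reference: the AKS API endpoint, for example as the
  # ArgoCD target cluster URL
  argocd_target_cluster_url = data.terraform_remote_state.azure_prod.outputs.aks_host
}
```

Splitting state this way limits the blast radius of any single apply, at the cost of needing an explicit read step for cross-cloud values.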
MULTI-CLOUD · ARCHITECT
What are the main challenges of multi-cloud and how do you address them?
Skills and knowledge: teams need expertise in multiple clouds. Address by investing in cloud-agnostic skills (Kubernetes, Terraform, Prometheus work the same everywhere) and accepting some cloud-specific expertise per team.

Security complexity: each cloud has different IAM models, security services, and compliance tools. A misconfiguration in one cloud is harder to detect when you are monitoring multiple dashboards. Address with a CSPM (Cloud Security Posture Management) tool like Prisma Cloud or Wiz for a unified security view across all clouds.

Data egress costs: moving data between clouds is expensive ($0.08-0.09 per GB). Address by keeping compute close to data: process where data lives and minimise cross-cloud data movement.

Operational overhead: separate alerting, separate runbooks, separate on-call procedures. Address with unified observability: Datadog, New Relic, or Prometheus federation pulling from all clouds into one dashboard.

Networking latency: cross-cloud API calls add latency. Design so each workload is self-contained within one cloud. Only use cross-cloud for async operations (event queues, batch jobs).

The honest answer for interviews: multi-cloud adds significant complexity. Only adopt it when the business benefit (resilience, best-of-breed, compliance) outweighs the operational cost. For most medium-sized organisations, one cloud done well is better than two clouds done poorly.
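
To put the egress figure in perspective, here is a small worked estimate written as Terraform variables and an output, so it can be evaluated with `terraform apply` or `terraform console` without any provider. The 50,000 GB monthly volume is a hypothetical workload, and $0.09/GB is taken from the upper end of the range quoted above; actual rates vary by provider, region, and volume tier.

```hcl
variable "cross_cloud_gb_per_month" {
  description = "Data moved between clouds each month, in GB (hypothetical)"
  type        = number
  default     = 50000
}

variable "egress_usd_per_gb" {
  description = "Assumed egress price per GB"
  type        = number
  default     = 0.09
}

# 50000 GB x 0.09 USD/GB = 4500 USD per month
output "monthly_egress_usd" {
  value = var.cross_cloud_gb_per_month * var.egress_usd_per_gb
}
```

At these assumptions, replicating about 50 TB a month between clouds costs around $4,500 in transfer fees alone, which is why keeping compute next to the data matters.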