Privacy-Preserving Inter-DC IT Load Live-Migration: A CATALYST Success Story

On April 22, 2020, Google published a blog post highlighting the ability of its data centres (DCs) to shift IT workloads across its infrastructure, based on the availability of green power at the various sites it operates. Such a statement, coming from one of the world’s biggest DC operators, highlights the great need (and business opportunity) for more flexibility in the way cloud IT loads are operated and served.

It is exactly this kind of flexibility that CATALYST counts among its key features, taking the idea a step further: shifting IT loads not only among DCs of the same administrative domain, but also across DCs of different domains, thus achieving inter-DC flexibility!

Privacy-preserving inter-DC IT Load Live-Migration with zero (or close to zero) downtime has, until now, been uncharted territory in Cloud Computing. This is not to dismiss earlier efforts: remarkable work exists with the live-migration of virtual machines at its centre, and various contemporary attempts have addressed not only intra-DC but also inter-DC IT Load Migration. Many of these approaches, however, make two fundamental assumptions:

  • The task performed by the IT Load is not critical, in the sense that a reasonably short downtime does not harm the running services. Widely applied approaches create a snapshot of the IT Load in the source DC, transfer the snapshot to the destination DC (or to shared storage accessible by both DCs), and re-instantiate the IT Load in the destination DC.
  • The owner / tenant of the IT Load also owns at least one account in a different DC. For the inter-DC IT Load Migration to be realized, a tenant would need at least two accounts in distinct DCs, so that the IT Load can be migrated between them.

These assumptions raise two fundamental questions:

  • What if the task performed by the IT Load is critical, in the sense that downtime longer than one second (1″) is dangerous and/or unaffordable?
  • What if the owner / tenant of the IT Load cannot afford to own and/or manage multiple accounts in multiple DCs?
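
A back-of-the-envelope estimate shows why the snapshot-based approach described above struggles with a downtime budget of about one second. The sketch below is illustrative only: the function name, the 20 GB snapshot size, the 1 Gbit/s link, and the snapshot and boot times are assumptions chosen for the example, not CATALYST measurements. It also assumes that the IT Load is stopped for the whole procedure.

```python
def cold_migration_downtime_s(snapshot_gb, link_gbit_s, snapshot_s, boot_s):
    """Estimate the downtime (seconds) of a snapshot-transfer-restart migration.

    Assumption: the IT Load is unavailable while the snapshot is taken,
    copied to the destination DC, and re-instantiated there.
    """
    if link_gbit_s <= 0:
        raise ValueError("link bandwidth must be positive")
    transfer_s = snapshot_gb * 8 / link_gbit_s  # GB -> Gbit, divided by Gbit/s
    return snapshot_s + transfer_s + boot_s


# Hypothetical figures, for illustration only.
downtime = cold_migration_downtime_s(
    snapshot_gb=20, link_gbit_s=1, snapshot_s=30, boot_s=60
)
print(f"Estimated downtime: {downtime:.0f} s")  # 30 + 160 + 60 = 250 s
print("Fits a 1 s budget:", downtime <= 1)
```

Even with generous figures, the transfer term alone is orders of magnitude above one second, which is why a critical IT Load calls for live-migration rather than snapshot and restart.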

The H2020 CATALYST Project moves past these assumptions and addresses the questions they raise by implementing the CATALYST Datacenter Migration Controller (DCMC), a framework that supports inter-DC IT Load Live-Migration. This framework makes only one assumption:

The DCs participating in the migration are members of the CATALYST Federation. Among other things, the CATALYST Federation enables DCs to sell or purchase energy in the form of IT Loads. In the CATALYST context, a DC can live-migrate an IT Load to another DC that is also a member of the Federation, which then hosts it.

To understand the main operating principle of the DCMC, note that a simple OpenStack deployment might include a Controller node and one or more Compute nodes. The Compute nodes serve as hypervisors: they are where the VMs are deployed. OpenStack also supports live-migration, i.e. migration with zero (or close to zero) downtime, of virtual machines between Compute nodes managed by the same Controller. Hence one of the aforementioned issues, the downtime of the IT Load, is solvable when the IT Load is migrated between two Compute nodes of the same OpenStack deployment.

What if we exploit the intra-DC live-migration feature for inter-DC live-migration? Then, since the DCs involved are members of the CATALYST Federation, a tenant no longer needs to own accounts in multiple DCs, which resolves the multiple-accounts issue.

How can the intra-DC live-migration be exploited? In OpenStack, live-migration is feasible between Compute nodes that belong to the same deployment. Hence, the trick is to make OpenStack think that it is migrating the IT Load to another Compute node of its own. This is where the core ideas of the DCMC framework are applied. The source DC requests the creation of a VPN network from a VPNaaS, a component able to create VPN connections on demand. At the same time, a Compute node is deployed as a VM in the DC that is going to host the migrated load, thus forming a Virtual Compute (vCMP) node. The vCMP node connects to the VPN network and, once the connection is established, registers with the Controller of the source DC as a Compute node. The Controller recognizes this vCMP node as one of its own, so the live-migration of the IT Load can now be executed!
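
The sequence described above can be sketched as a toy model. This is not the DCMC implementation and it does not call OpenStack or any VPNaaS: every class, function, and name below (`Controller`, `VirtualCompute`, `migrate_across_dcs`, `dc-a`, `vm-42`, and so on) is hypothetical and exists only to show the order of the steps. The text says the VPN request and the vCMP deployment happen at the same time; the sketch runs them one after the other for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Controller:
    """Toy stand-in for the source DC's OpenStack Controller (hypothetical)."""
    dc: str
    compute_nodes: set = field(default_factory=set)
    placement: dict = field(default_factory=dict)  # VM name -> Compute node name

    def register_compute(self, node_name):
        self.compute_nodes.add(node_name)

    def live_migrate(self, vm, target):
        # Mirrors the constraint from the text: the target must be one of
        # the Controller's own Compute nodes.
        if target not in self.compute_nodes:
            raise ValueError(f"{target} is not a Compute node of {self.dc}")
        self.placement[vm] = target


@dataclass
class VirtualCompute:
    """A Compute node deployed as a VM inside the destination DC (vCMP)."""
    name: str
    hosted_in: str
    vpn: Optional[str] = None


def migrate_across_dcs(controller, vm, destination_dc):
    """Walk through the steps in the order the text describes them."""
    vpn = f"vpn-{controller.dc}-{destination_dc}"   # 1. VPN requested from the VPNaaS
    vcmp = VirtualCompute(f"vcmp-{destination_dc}", hosted_in=destination_dc)  # 2. vCMP deployed
    vcmp.vpn = vpn                                  # 3. vCMP joins the VPN
    controller.register_compute(vcmp.name)          # 4. vCMP registers with the source Controller
    controller.live_migrate(vm, vcmp.name)          # 5. ordinary live-migration
    return vcmp


source = Controller(dc="dc-a", compute_nodes={"cmp-1"}, placement={"vm-42": "cmp-1"})
vcmp = migrate_across_dcs(source, "vm-42", "dc-b")
print(source.placement["vm-42"], "is hosted in", vcmp.hosted_in)
```

From the Controller's point of view, step 5 is the same operation it would perform between two local Compute nodes; only the physical location of the target differs.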

What about the privacy of the migrated IT Load? A fair question. After all, the IT Load is hosted in a foreign DC, which has administrative access to the infrastructure hosting it. The CATALYST DCMC framework claims privacy-preserving inter-DC IT Load Migration. Besides the various legal agreements that would enforce data protection within the CATALYST Federation, the vCMP node is deployed password-less and non-accessible, with encrypted volumes and ephemeral disks. As a result, a malicious administrator of the destination DC cannot access the actual IT Load. The only malicious action still possible is deleting the vCMP node while it hosts the IT Load, an action that would be strictly forbidden by the aforementioned legal agreements.
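
The four deployment properties listed above can be read as a checklist. The sketch below is one possible way to express such a check; the field names and the interpretation of "password-less" (no password-based login) and "non-accessible" (no remote access) are assumptions made for illustration, not the DCMC's actual configuration format.

```python
# Hypothetical field names that mirror the four properties named in the text.
REQUIRED_VCMP_PROPERTIES = {
    "password_login": False,    # password-less
    "remote_access": False,     # non-accessible
    "volumes_encrypted": True,  # encrypted volumes
    "ephemeral_disks": True,    # ephemeral disks
}


def privacy_violations(spec):
    """Return the sorted names of properties where `spec` deviates from the required values."""
    return sorted(
        key
        for key, required in REQUIRED_VCMP_PROPERTIES.items()
        if spec.get(key) != required
    )


hardened = dict(REQUIRED_VCMP_PROPERTIES)
weakened = {**hardened, "remote_access": True}
print(privacy_violations(hardened))
print(privacy_violations(weakened))
```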

 

The CATALYST DCMC framework comprises the DCMC Server, which sits at the core of the federation and synchronizes the essential actions of the participating DCs, and the DCMC Master and DCMC Lite Clients, which are deployed in all DCs of the federation. The framework was implemented and tested over OpenStack, yet it is considered easily extensible to other types of Virtualized Infrastructure Managers.

As a final note, the design of the CATALYST IT Load migration framework is generic and is intended to work with any cloud-management infrastructure; the reference implementation is OpenStack-based (Open Source for the win!), but others, from VMware to Kubernetes, could also be supported.

At a glance

  • No: 768739
  • Acronym: CATALYST
  • Title: Converting data centres in energy flexibility ecosystems
  • Starting date: October 1, 2017
  • Duration in months: 36
  • Call identifier: H2020-EE-2017-RIA-IA
  • Topic: EE-20-2017; Bringing to market more energy efficient and integrated data centres