Wai-Leong Yeow, Cédric Westphal, and Ulaş C. Kozat
DoCoMo USA Labs, 3240 Hillview Ave, Palo Alto, CA 94304, USA
e-mail: wlyeow@ieee.org, {cwestphal,kozat}@docomolabs-usa.com
Abstract—Virtualization is a key enabler to autonomic management of hosted services in data centers. We show that it can be used to manage reliability of these virtual entities with virtual backups. An architecture is proposed to autonomously manage and allocate the physical resources, ensure reliability guarantees, and manage the pools of virtual backups for failure recovery and resource conservation. It is fault tolerant by design so that component failures do not bring down the entire data center.
I. INTRODUCTION
Virtualization technology has changed the way hosted services are managed in today’s data centers. Significant cost savings are passed on to service providers, because resources in the physical infrastructure are used more efficiently when they are pooled and shared across the hosted servers. More importantly, virtualization is a key enabler to autonomic management of hosted services in a data center [1]. Planned or unplanned maintenance, asynchronous backups, and service migration can all be achieved easily with virtualization [2]. For more details on network virtualization, see the survey in [3].
The many benefits of virtualization can be extended to managing the reliability of the hosted services in a virtualized data center. Load balancing between k service replicas with over-provisioning has been a common and straightforward way to provide fault tolerance. However, this approach is unsuitable for “stateful” services, in which a failure interrupts the service. Through asynchronous backups of the virtual hosted entities, the states of the active services can be saved to backup nodes that are reserved with complete fail-over bandwidth for reliability guarantees. Furthermore, the backup