CloudStack + KVM + HA
Published: 2019-06-21


KVM high availability is not yet implemented in CloudStack 4.2.

The Linux Kernel Virtual Machine (KVM) is a very popular hypervisor choice among CloudStack and OpenStack users. It is free and ships with popular Linux distributions such as CentOS/RedHat and Ubuntu. In some cases, customers insist on an end-to-end open source solution for their private cloud, and KVM ends up being the only choice available.

In a recent deployment, where the private cloud had to run entirely on mature open source solutions, the obvious choice was Apache CloudStack 4.1 with KVM hypervisors on CentOS 6.x. Building the management tier and the KVM hosts was a breeze with CentOS Kickstart and SSH-based post-install configuration of services.

The infrastructure was also built for resilience: dual power supplies, dual 4-port network controllers wired across east-west switches with LACP, and HA for storage. On the CloudStack side, the management servers sat behind load balancers with MySQL replication, and there were multiple PODs, multiple clusters, and multiple hosts per cluster. New service offerings were also created with HA enabled.

One of the resilience tests was to simply power off a random KVM hypervisor within a logical cluster and watch the affected HA-enabled VMs start automatically on another host in the same cluster after the timeout period. To everyone's surprise, the guest VMs just sat there marked as ‘Up’ despite being physically offline. A close look at the management logs showed little to no sign that CloudStack had even noticed the affected guest VMs or the failed KVM host.
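The symptom seen in this test can be checked mechanically. The following is a minimal sketch, assuming the VM inventory has already been fetched into plain dictionaries. The function name, the field names (`name`, `host`, `state`, `ha_enabled`) and the host names are illustrative assumptions, not CloudStack's API schema; the state string ‘Up’ simply follows the wording of this post.

```python
def stale_ha_vms(vms, powered_off_host, running_state="Up"):
    """Return names of HA-enabled VMs still reported as running on a host known to be off.

    All field names here are hypothetical; adapt them to however the
    inventory is actually exported.
    """
    return [
        vm["name"]
        for vm in vms
        if vm["host"] == powered_off_host
        and vm["ha_enabled"]
        and vm["state"] == running_state
    ]


# Hypothetical inventory taken after powering off host "kvm-02".
inventory = [
    {"name": "web-1", "host": "kvm-02", "state": "Up", "ha_enabled": True},
    {"name": "db-1", "host": "kvm-01", "state": "Up", "ha_enabled": True},
    {"name": "tmp-1", "host": "kvm-02", "state": "Up", "ha_enabled": False},
]

print(stale_ha_vms(inventory, "kvm-02"))
```

Any name in the result is an HA-enabled VM that should have been restarted elsewhere but is still recorded against the dead host.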

CloudStack VM HA with KVM was simply not working.

After spending some time on the Apache CloudStack mailing lists and JIRA, it turns out that it is a CloudStack feature to “do nothing” in a host-down scenario. This is primarily to avoid split-brain situations, where network connectivity problems could leave multiple copies of the same guest VM running on more than one physical host. Since KVM has no built-in clustering/HA features, it is up to the CloudStack layer to decide on a corrective course of action. At this time, CloudStack simply chooses to ignore failed KVM hosts.
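The reasoning behind “do nothing” can be sketched in a few lines. This is not CloudStack's actual code: the `HostStatus` values, the `Host` class and the `ha_action` policy are hypothetical names used only to illustrate why a manager that cannot tell a dead host from an unreachable one may refuse to restart VMs.

```python
from dataclasses import dataclass
from enum import Enum


class HostStatus(Enum):
    ALIVE = "alive"                    # agent responds normally
    UNREACHABLE = "unreachable"        # no response; VMs may still be running
    CONFIRMED_DOWN = "confirmed_down"  # verified powered off or fenced


@dataclass
class Host:
    name: str
    status: HostStatus


def ha_action(host: Host) -> str:
    """Decide what to do with HA-enabled VMs on a host (hypothetical policy)."""
    if host.status is HostStatus.ALIVE:
        return "none"
    if host.status is HostStatus.CONFIRMED_DOWN:
        # Safe to restart: the original VM instances cannot still be running.
        return "restart_elsewhere"
    # Unreachable may only be a network partition. Restarting now could leave
    # two copies of the same VM writing to the same disk (split brain).
    return "wait"


for status in HostStatus:
    print(status.value, "->", ha_action(Host("kvm-02", status)))
```

In the situation described above, a powered-off KVM host is indistinguishable from an unreachable one without some form of confirmation, so a cautious policy like this one never gets past “wait”.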

The situation is even more problematic if the CloudStack “virtual router” happens to be running on the failed host. All basic network services such as DHCP, DNS and routing for that POD will fail because the router is offline. This actually happened to someone on the mailing lists. The “fix” is to go into the CloudStack database and mark the virtual router as “destroyed”. CloudStack then creates a new virtual router and services resume.

This issue is currently being discussed in a JIRA ticket, and there is developer interest in coming up with a solution for an upcoming Apache CloudStack 4.1.x release. Also see the thread on the cloudstack-users mailing list.

Please note that this problem is specific to KVM hypervisors, as they do not have built-in clustering capabilities. CloudStack with VMware or XenServer does not have this issue: VMware and XenServer clusters recover automatically using their built-in clustering features.

As a side note, Citrix XenServer 6.2 was fully open sourced in July and installation ISOs are available for download. Given the enterprise features that XenServer already has over KVM (like HA clustering and fault tolerance), it is very likely to see massive adoption in fully open source clouds with future releases of Apache CloudStack.

Update: According to the mailing list thread, XCP is also affected.

 


Reposted from: https://www.cnblogs.com/heidsoft/p/3422839.html
