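This article walks through the CloudStack High Availability (HA) source code: how the management server notices that a host has stopped answering pings, how it investigates the host's real state, and how it finally restarts the VMs that were running on that host.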
We start with PingTask, an inner class of DirectAgentAttache. Every host registered with CloudStack (CS) has a corresponding DirectAgentAttache, which means every host has a PingTask running periodically in the background. The interval is set by the global setting ping.interval, which defaults to 60 seconds.
Here is the relevant PingTask code:
    ServerResource resource = _resource;
    if (resource != null) {
        PingCommand cmd = resource.getCurrentStatus(_id);
        int retried = 0;
        while (cmd == null && ++retried <= _HostPingRetryCount.value()) {
            Thread.sleep(1000 * _HostPingRetryTimer.value());
            cmd = resource.getCurrentStatus(_id);
        }
        if (cmd == null) {
            s_logger.warn("Unable to get current status on " ...
            // (the listing is cut off at this point in the source)
_id is the host ID. If getCurrentStatus returns a non-null cmd, the host answered the ping, and the next step is to call _agentMgr.handleCommands:
    public void handleCommands(final AgentAttache attache, final long sequence, final Command[] cmds) {
        for (final Pair<Integer, Listener> listener : _cmdMonitors) {
            final boolean processed = listener.second().processCommands(attache.getId(), sequence, cmds);
        }
    }
The listener we care about here is BehindOnPingListener. Its processCommands method looks like this:
    @Override
    public boolean processCommands(final long agentId, final long seq, final Command[] commands) {
        final boolean processed = false;
        for (final Command cmd : commands) {
            if (cmd instanceof PingCommand) {
                pingBy(agentId);
            }
        }
        return processed;
    }
Next comes the pingBy method:
    public void pingBy(final long agentId) {
        // Update PingMap with the latest time if agent entry exists in the PingMap
        if (_pingMap.replace(agentId, InaccurateClock.getTimeInSeconds()) == null) {
            s_logger.info("PingMap for agent: " + agentId + " will not be updated because agent is no longer in the PingMap");
        }
    }
The key piece here is _pingMap. It is a ConcurrentHashMap whose key is the agentId (for example, a hostId) and whose value is a timestamp: each time a ping succeeds, the current time is written as the value for that agent. Because pingBy uses replace, only agents that already have an entry are updated. Recall that PingTask runs once every ping.interval, so for a healthy host the entry in _pingMap is refreshed roughly once per ping.interval (allowing for some latency in getCurrentStatus). If the host suddenly fails and becomes unreachable over the network, its timestamp in _pingMap stays frozen at the last successful ping.
To summarize PingTask: every ping.interval (60 seconds by default) it pings the host. If the ping succeeds, it updates the host's value in _pingMap to the current timestamp; otherwise it does nothing.
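The update rule can be tried in isolation. The following is a minimal sketch, not CloudStack code: the class PingMapSketch and its track and lastPing methods are hypothetical names, and timestamps are passed in as plain seconds instead of coming from InaccurateClock. It relies only on the documented behavior of ConcurrentHashMap.replace, which returns null and inserts nothing when the key is absent.

```java
import java.util.concurrent.ConcurrentHashMap;

class PingMapSketch {
    // hostId -> time of the last successful ping, in seconds
    private final ConcurrentHashMap<Long, Long> pingMap = new ConcurrentHashMap<>();

    // Hypothetical helper: start tracking an agent (for example when it connects).
    void track(long agentId, long nowSeconds) {
        pingMap.put(agentId, nowSeconds);
    }

    // Same idea as pingBy: refresh the timestamp only if the agent is still tracked.
    // Returns true when an existing entry was updated.
    boolean pingBy(long agentId, long nowSeconds) {
        return pingMap.replace(agentId, nowSeconds) != null;
    }

    // Returns the last recorded ping time, or null for an untracked agent.
    Long lastPing(long agentId) {
        return pingMap.get(agentId);
    }

    public static void main(String[] args) {
        PingMapSketch sketch = new PingMapSketch();
        sketch.track(1L, 100L);
        System.out.println(sketch.pingBy(1L, 160L)); // true: tracked host is refreshed
        System.out.println(sketch.pingBy(2L, 160L)); // false: unknown host is ignored
        System.out.println(sketch.lastPing(1L));     // 160
        System.out.println(sketch.lastPing(2L));     // null: replace never inserts
    }
}
```

Using replace rather than put means a late ping from an agent that has already been removed cannot bring its entry back.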
The other background thread to look at is MonitorTask, which also runs once every ping.interval. It begins by calling findAgentsBehindOnPing:
    protected List<Long> findAgentsBehindOnPing() {
        final List<Long> agentsBehind = new ArrayList<Long>();
        final long cutoffTime = InaccurateClock.getTimeInSeconds() - getTimeout();
        for (final Map.Entry<Long, Long> entry : _pingMap.entrySet()) {
            if (entry.getValue() < cutoffTime) {
                agentsBehind.add(entry.getKey());
            }
        }
        return agentsBehind;
    }

    protected long getTimeout() {
        return (long) (PingTimeout.value() * PingInterval.value());
    }
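As a worked example of the cutoff arithmetic, here is a small self-contained sketch. It assumes the default values ping.timeout = 2.5 and ping.interval = 60; the class name BehindOnPingSketch and the sample timestamps are made up for illustration, and the current time is passed in as a parameter rather than read from a clock.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BehindOnPingSketch {
    // Timeout in seconds: ping.timeout is a multiplier applied to ping.interval.
    static long timeoutSeconds(double pingTimeout, int pingIntervalSeconds) {
        return (long) (pingTimeout * pingIntervalSeconds);
    }

    // Returns the ids whose last successful ping is strictly older than now - timeout.
    static List<Long> findBehind(Map<Long, Long> pingMap, long nowSeconds, long timeoutSeconds) {
        long cutoff = nowSeconds - timeoutSeconds;
        List<Long> behind = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : pingMap.entrySet()) {
            if (entry.getValue() < cutoff) {
                behind.add(entry.getKey());
            }
        }
        return behind;
    }

    public static void main(String[] args) {
        long timeout = timeoutSeconds(2.5, 60);
        System.out.println(timeout); // 150

        Map<Long, Long> pingMap = new LinkedHashMap<>();
        pingMap.put(1L, 960L); // pinged 40 s ago
        pingMap.put(2L, 840L); // pinged 160 s ago
        pingMap.put(3L, 850L); // pinged exactly 150 s ago
        System.out.println(findBehind(pingMap, 1000L, timeout)); // [2]
    }
}
```

Host 3 sits exactly on the cutoff and is not reported, because the comparison is a strict less-than.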
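The global setting ping.timeout defaults to 2.5. It is a multiplier rather than a duration: this code finds the hosts whose last successful ping lies more than 2.5 times ping.interval in the past (150 seconds with the defaults). Put simply, these are hosts that cannot be pinged at all, or whose ping is delayed beyond what we consider reasonable. Normally the method returns an empty list and MonitorTask ends its current run. If there is network delay or a host failure, the following code runs: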
    final List<Long> behindAgents = findAgentsBehindOnPing();
    for (final Long agentId : behindAgents) {
        final QueryBuilder<HostVO> sc = QueryBuilder.create(HostVO.class);
        sc.and(sc.entity().getId(), Op.EQ, agentId);
        final HostVO h = sc.find();
        if (h != null) {
            final ResourceState resourceState = h.getResourceState();
            if (resourceState == ResourceState.Disabled || resourceState == ResourceState.Maintenance
                    || resourceState == ResourceState.ErrorInMaintenance) {
                disconnectWithoutInvestigation(agentId, Event.ShutdownRequested);
            } else {
                final HostVO host = _hostDao.findById(agentId);
                if (host != null && (host.getType() == Host.Type.ConsoleProxy
                        || host.getType() == Host.Type.SecondaryStorageVM
                        || host.getType() == Host.Type.SecondaryStorageCmdExecutor)) {
                    disconnectWithoutInvestigation(agentId, Event.ShutdownRequested);
                } else {
                    disconnectWithInvestigation(agentId, Event.PingTimeout);
                }
            }
        }
    }
Assume the failing host is a compute node. Following the call chain down, we end up in the handleDisconnectWithInvestigation method of AgentManagerImpl:
    protected boolean handleDisconnectWithInvestigation(final AgentAttache attache, Status.Event event) {
        final long hostId = attache.getId();
        HostVO host = _hostDao.findById(hostId);
        if (host != null) {
            Status nextStatus = null;
            nextStatus = host.getStatus().getNextStatus(event);
            if (nextStatus == Status.Alert) {
                Status determinedState = investigate(attache);
                if (determinedState == null) {
                    if ((System.currentTimeMillis() >> 10) - host.getLastPinged() > AlertWait.value()) {
                        determinedState = Status.Alert;
                    } else {
                        return false;
                    }
                }
                final Status currentStatus = host.getStatus();
                if (determinedState == Status.Down) {
                    event = Status.Event.HostDown;
                } else if (determinedState == Status.Up) {
                    agentStatusTransitTo(host, Status.Event.Ping, _nodeId);
                    return false;
                } else if (determinedState == Status.Disconnected) {
                    if (currentStatus == Status.Disconnected) {
                        if ((System.currentTimeMillis() >> 10) - host.getLastPinged() > AlertWait.value()) {
                            event = Status.Event.WaitedTooLong;
                        } else {
                            return false;
                        }
                    } else if (currentStatus == Status.Up) {
                        event = Status.Event.AgentDisconnected;
                    }
                }
            }
        }
        handleDisconnectWithoutInvestigation(attache, event, true, true);
        host = _hostDao.findById(hostId); // Maybe the host magically reappeared?
        if (host != null && host.getStatus() == Status.Down) {
            _haMgr.scheduleRestartForVmsOnHost(host, true);
        }
        return true;
    }
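The branching inside the nextStatus == Status.Alert block is easier to follow as a pure function. The sketch below is a simplified restatement under stated assumptions, not the CloudStack implementation: the enums are cut down to the values used in the listing, the AlertWait comparison is reduced to a boolean parameter waitedTooLong, and an empty Optional stands for the paths where the original returns false without disconnecting the host.

```java
import java.util.Optional;

class InvestigationDecisionSketch {
    enum Status { Up, Down, Disconnected, Alert }
    enum Event { PingTimeout, HostDown, AgentDisconnected, WaitedTooLong }

    // determined: result of investigate(), may be null when nothing could be determined.
    // current: the host's current status.
    // waitedTooLong: true when the time since the last ping exceeds the alert wait.
    // incoming: the event that triggered the investigation.
    // Returns the event to disconnect with, or empty when the host is left alone for now.
    static Optional<Event> decide(Status determined, Status current,
                                  boolean waitedTooLong, Event incoming) {
        if (determined == null) {
            if (!waitedTooLong) {
                return Optional.empty(); // undecided and still within the wait window
            }
            determined = Status.Alert;
        }
        switch (determined) {
            case Down:
                return Optional.of(Event.HostDown);
            case Up:
                return Optional.empty(); // host recovered, no disconnect
            case Disconnected:
                if (current == Status.Disconnected) {
                    return waitedTooLong ? Optional.of(Event.WaitedTooLong) : Optional.empty();
                }
                if (current == Status.Up) {
                    return Optional.of(Event.AgentDisconnected);
                }
                return Optional.of(incoming);
            default:
                return Optional.of(incoming); // Alert: keep the original event
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(Status.Down, Status.Up, false, Event.PingTimeout));         // Optional[HostDown]
        System.out.println(decide(Status.Up, Status.Up, false, Event.PingTimeout));           // Optional.empty
        System.out.println(decide(null, Status.Up, false, Event.PingTimeout));                // Optional.empty
        System.out.println(decide(Status.Disconnected, Status.Up, false, Event.PingTimeout)); // Optional[AgentDisconnected]
    }
}
```

Only the Down result leads to HostDown, which is the event that later puts the host into the state that triggers the VM restart.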
Look first at the final if in handleDisconnectWithInvestigation: under certain conditions, the end goal is to restart all VMs on the host, which is the real purpose of HA. Keep in mind, though, that the bar for entering this method is low. It is enough that the last successful ping to the host was more than, say, two and a half minutes ago. HA itself should only run once we are quite sure the host is really down, which is why most of the method is spent confirming the host's state, as the word Investigation in its name suggests.
Assume the host's currentStatus is Up. We know the event is PingTimeout, so nextStatus is Alert, and the next step is the investigate method:
    protected Status investigate(final AgentAttache agent) {
        final Long hostId = agent.getId();
        final HostVO host = _hostDao.findById(hostId);
        if (host != null && host.getType() != null && !host.getType().isVirtual()) {
            final Answer answer = easySend(hostId, new CheckHealthCommand());
            if (answer != null && answer.getResult()) {
                final Status status = Status.Up;
                return status;
            }
            return _haMgr.investigate(hostId);
        }
        return Status.Alert;
    }
This method first sends a CheckHealthCommand to the host. There are two possible outcomes:
1. If an answer comes back, the host is healthy and investigate returns Up. Back in handleDisconnectWithInvestigation, the task is then essentially finished: whatever triggered it was only a temporary network problem or something similar, and the host has since recovered.
2. If CheckHealthCommand gets no answer, the management server could not reach the host directly. That still does not prove the host is down, so the next step is to ask a series of investigators to find out whether it is alive:
    @Override
    public Status investigate(final long hostId) {
        final HostVO host = _hostDao.findById(hostId);
        if (host == null) {
            return Status.Alert;
        }
        Status hostState = null;
        for (Investigator investigator : investigators) {
            hostState = investigator.isAgentAlive(host);
            if (hostState != null) {
                return hostState;
            }
        }
        return hostState;
    }
If the host is a XenServer host, the most important investigator is XenServerInvestigator. Here is its isAgentAlive method:
    public Status isAgentAlive(Host agent) {
        CheckOnHostCommand cmd = new CheckOnHostCommand(agent);
        List<HostVO> neighbors = _resourceMgr.listAllHostsInCluster(agent.getClusterId());
        for (HostVO neighbor : neighbors) {
            Answer answer = _agentMgr.easySend(neighbor.getId(), cmd);
            if (answer != null && answer.getResult()) {
                CheckOnHostAnswer ans = (CheckOnHostAnswer)answer;
                if (!ans.isDetermined()) {
                    continue;
                }
                return ans.isAlive() ? null : Status.Down;
            }
        }
        return null;
    }
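The return values of isAgentAlive are easy to misread, so here is a minimal sketch of the same polling loop with the CloudStack types replaced by stand-ins. The Report class and its factory methods are hypothetical; a null list element stands for a neighbor that did not answer or whose answer was unsuccessful.

```java
import java.util.Arrays;
import java.util.List;

class NeighborCheckSketch {
    enum Status { Up, Down }

    // Hypothetical stand-in for a neighbor's CheckOnHostAnswer.
    static final class Report {
        final boolean determined;
        final boolean alive;

        private Report(boolean determined, boolean alive) {
            this.determined = determined;
            this.alive = alive;
        }

        static Report undetermined() { return new Report(false, false); }
        static Report alive() { return new Report(true, true); }
        static Report dead() { return new Report(true, false); }
    }

    // Returns Status.Down only when a neighbor is sure the host is dead.
    // A neighbor that sees the host alive yields null ("no verdict"), not Up.
    static Status isAgentAlive(List<Report> neighborReports) {
        for (Report report : neighborReports) {
            if (report == null || !report.determined) {
                continue; // no usable answer, ask the next neighbor
            }
            return report.alive ? null : Status.Down;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(isAgentAlive(Arrays.asList(null, Report.undetermined(), Report.dead()))); // Down
        System.out.println(isAgentAlive(Arrays.asList(Report.alive(), Report.dead())));              // null
        System.out.println(isAgentAlive(Arrays.asList(Report.undetermined())));                      // null
    }
}
```

The first neighbor that gives a determined answer decides the outcome, and this investigator never confirms Up on its own: a null result lets the calling investigate loop move on to the next investigator.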
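The logic is simple: if the management server cannot reach the host directly, it asks the host's neighbors in the same cluster. It sends a CheckOnHostCommand to each neighbor in turn to see whether any of them can tell what has happened to the host. The article on the official CloudStack site describes in detail how CheckOnHostCommand works: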
If the network ping investigation returns that it cannot detect the status of the host, CloudStack HA then relies on the hypervisor specific investigation. For VmWare, there is no such investigation as the hypervisor host handles its own HA. For XenServer and KVM, CloudStack HA deploys a monitoring script that writes the current timestamp on to a heartbeat file on shared storage. If the timestamp cannot be written, the hypervisor host self-fences by rebooting itself. For these two hypervisors, CloudStack HA sends a CheckOnHostCommand to a neighboring hypervisor host that shares the same storage. The neighbor then checks on the heartbeat file on shared storage and see if the heartbeat is no longer being written. If the heartbeat is still being written, the host reports that the host in question is still alive. If the heartbeat file’s timestamp is lagging behind, after an acceptable timeout value, the host reports that the host in question is down and HA is started on the VMs on that host.
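To make the heartbeat idea concrete, here is a minimal sketch of both sides of the check. It is an illustration only: the file format (a single timestamp in seconds), the class name HeartbeatSketch, and the 60-second timeout are assumptions, and the real monitoring script and its file layout are not shown in this article. A temporary file stands in for the heartbeat file on shared storage.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class HeartbeatSketch {
    // Host side: overwrite the heartbeat file with the current time in seconds.
    static void writeHeartbeat(Path file, long nowSeconds) throws IOException {
        Files.write(file, Long.toString(nowSeconds).getBytes(StandardCharsets.UTF_8));
    }

    // True when the last heartbeat is no older than the allowed timeout.
    static boolean isFresh(long lastHeartbeatSeconds, long nowSeconds, long timeoutSeconds) {
        return nowSeconds - lastHeartbeatSeconds <= timeoutSeconds;
    }

    // Neighbor side: read the timestamp written by the other host and judge it.
    static boolean isAlive(Path file, long nowSeconds, long timeoutSeconds) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return isFresh(Long.parseLong(text), nowSeconds, timeoutSeconds);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("heartbeat-", ".txt");
        try {
            writeHeartbeat(file, 1000L);
            System.out.println(isAlive(file, 1030L, 60L)); // true: written 30 s ago
            System.out.println(isAlive(file, 1200L, 60L)); // false: written 200 s ago
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The check works even when the management network to the host is down, because the neighbor only needs access to the shared storage.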
In short, CS runs a monitoring script on every XenServer and KVM host, and the script writes the current timestamp to a file on shared storage. If a host finds that it cannot write to the file, it forces itself to reboot.
The code above therefore sends a CheckOnHostCommand to other hosts that share storage with the host under investigation. On receiving the command, a neighbor checks whether that host is still updating its timestamp in the file. If it is, the neighbor answers that the host is still alive; otherwise it answers that the host is down.
As a result, determinedState in handleDisconnectWithInvestigation becomes Status.Down only when the host has really failed and cannot be reached. The event then becomes Status.Event.HostDown, and the scheduleRestartForVmsOnHost method of HighAvailabilityManagerImpl is called to restart all VMs on the host. It inserts an HaWorkVO into the database and wakes up the WorkerThread that was initialized when CS started, which brings us to another important method, restart, also in HighAvailabilityManagerImpl:
    protected Long restart(final HaWorkVO work) {
        boolean isHostRemoved = false;
        Boolean alive = null;
        if (work.getStep() == Step.Investigating) {
            if (!isHostRemoved) {
                Investigator investigator = null;
                for (Investigator it : investigators) {
                    investigator = it;
                    try {
                        // (1)
                        alive = investigator.isVmAlive(vm, host);
                        break;
                    } catch (UnknownVM e) {
                        s_logger.info(investigator.getName() + " could not find " + vm);
                    }
                }
                boolean fenced = false;
                if (alive == null) {
                    for (FenceBuilder fb : fenceBuilders) {
                        // (2)
                        Boolean result = fb.fenceOff(vm, host);
                        if (result != null && result) {
                            fenced = true;
                            break;
                        }
                    }
                }
                // (3)
                _itMgr.advanceStop(vm.getUuid(), true);
            }
        }
        vm = _itMgr.findById(vm.getId());
        // (4)
        if (!_forceHA && !vm.isHaEnabled()) {
            return null; // VM doesn't require HA
        }
        try {
            HashMap<VirtualMachineProfile.Param, Object> params = new HashMap<VirtualMachineProfile.Param, Object>();
            // (5)
            if (_haTag != null) {
                params.put(VirtualMachineProfile.Param.HaTag, _haTag);
            }
            WorkType wt = work.getWorkType();
            if (wt.equals(WorkType.HA)) {
                params.put(VirtualMachineProfile.Param.HaOperation, true);
            }
            // (6)
            try {
                _itMgr.advanceStart(vm.getUuid(), params, null);
            } catch (InsufficientCapacityException e) {
                s_logger.warn("Failed to deploy vm " + vmId + " with original planner, sending HAPlanner");
                _itMgr.advanceStart(vm.getUuid(), params, _haPlanners.get(0));
            }
        }
        return (System.currentTimeMillis() >> 10) + _restartRetryInterval;
    }
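The code above has six marked points that deserve attention. The overall flow is:
(1) Call each investigator's isVmAlive method. If the VM is alive, nothing more is done; otherwise continue.
(2) Call the fb.fenceOff method.
(3) Call the _itMgr.advanceStop method.
(4) About the _forceHA variable: I could not find it in the global settings or in the configuration table of the database, so it keeps its initial value of FALSE. This means only VMs for which vm.isHaEnabled is true continue; for all others the method returns immediately.
(5) The value of _haTag comes from the global setting ha.tag and is empty by default. If it is set, it matters a great deal when a host is later chosen for the VM. Remember this line: params.put(VirtualMachineProfile.Param.HaTag, _haTag);
(6) This should look familiar: anyone who has read the code CS uses to create a VM instance knows that this method deploys a VM. At this point most of the CS HA code path is done, and what remains is restarting the VM. Whether the VM can actually be restarted depends on several conditions, such as whether the cluster has a suitable host, whether the host has enough physical resources, whether ha.tag is set, and whether the VM uses tags. Those are not covered in detail here.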
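Two details of the restart method can be tried on their own: the fallback from the default planner to the HA planner at point (6), and the return value built from System.currentTimeMillis() >> 10. The sketch below is a simplified model with hypothetical names: Starter stands in for _itMgr.advanceStart, CapacityException is an unchecked stand-in for InsufficientCapacityException, and planners are plain strings, with null meaning the default planner as in the listing.

```java
class RestartFallbackSketch {
    // Hypothetical unchecked stand-in for InsufficientCapacityException.
    static class CapacityException extends RuntimeException {
        CapacityException(String message) { super(message); }
    }

    // Hypothetical stand-in for _itMgr.advanceStart; planner == null means the default planner.
    interface Starter {
        void start(String vmUuid, String planner);
    }

    // Tries the default planner first and falls back to the HA planner on a capacity failure.
    // Returns true when the fallback planner had to be used.
    static boolean startWithFallback(Starter starter, String vmUuid, String haPlanner) {
        try {
            starter.start(vmUuid, null);
            return false;
        } catch (CapacityException e) {
            starter.start(vmUuid, haPlanner);
            return true;
        }
    }

    // Shifting milliseconds right by 10 bits divides by 1024, giving approximate seconds.
    static long approxSeconds(long millis) {
        return millis >> 10;
    }

    // Time of the next retry, in the same approximate-seconds unit.
    static long nextRetry(long nowMillis, long retryIntervalSeconds) {
        return approxSeconds(nowMillis) + retryIntervalSeconds;
    }

    public static void main(String[] args) {
        // A cluster in which the default planner finds no capacity.
        Starter fullCluster = (vm, planner) -> {
            if (planner == null) {
                throw new CapacityException("no capacity with default planner");
            }
        };
        System.out.println(startWithFallback(fullCluster, "vm-1", "ha-planner")); // true
        System.out.println(approxSeconds(60_000L));     // 58, not 60
        System.out.println(nextRetry(1_024_000L, 600L)); // 1600
    }
}
```

The shift is cheaper than a division but runs about 2.4% slow compared with real seconds, so values computed this way should only be compared with other values computed the same way.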