高級操作系統 第7章 分布式系統_第1頁
高級操作系統 第7章 分布式系統_第2頁
高級操作系統 第7章 分布式系統_第3頁
高級操作系統 第7章 分布式系統_第4頁
高級操作系統 第7章 分布式系統_第5頁
已閱讀5頁,還剩107頁未讀 繼續免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

ConsistencyandReplicationChapter77.1.1ReasonsforReplication:1)reliability2)performanceCaveatGaininperformance;Costofincreasedbandwidthformaintainingreplication;

Introduction(7.1)Dataaregenerallyreplicatedtoenhancereliabilityorimproveperformanceindistributedsystemorparallelcomputerssystem.Howtokeepreplicasconsistentisoneofthemajorproblems.whenonecopyisupdated,weneedtoensurethattheothercopiesareupdatedaswell.Introduction(7.1)7.1.2ObjectReplication1)Adistributedremoteobjectissharedbymultipleclients-------howtoprotecttheobjectagainstsimultaneousaccessbymultipleclients.2)Adistributedremoteobjectisreplicatedatdifferentnode-------howtoensurethatconcurrentinvocationsareperformedinthecorrectorderateachofthereplicas.Introduction(7.1)ObjectsharedOrganizationofadistributedremoteobjectsharedbytwodifferentclients.ObjectsharedAremoteobjectcapableofhandlingconcurrentinvocationsonitsown.AremoteobjectforwhichanobjectadapterisrequiredtohandleconcurrentinvocationsObjectsharedFig(a):Forexample,javaobject----whichcanbeconstructedasamonitorbydeclaringtheobject’smethodtobesynchronized.publicsynchronizedvoidenter(Objectitem){ while(count==BUFFER_SIZE) Thread.yield(); ++count; buffer[in]=item; in=(in+1)%BUFFER_SIZE;}ObjectReplicationAdistributedsystemforreplication-awaredistributedobjects.(SOS,Globeetc.)Adistributedsystemresponsibleforreplicamanagement(CORBAetc.)

實現機理與一致性模型?ObjectReplication7.1.3ReplicationasScalingTechniquePlacingcopiesofdataandobjectsclosetotheprocessesusingthemcanimproveperformancethroughreductionofaccesstime,andthussolvescalabilityproblems.Replicationandcachingforperformancearewidelyappliedasscalingtechniques.Dilemmaproblem:Scalabilityproblemcanbealleviatedbyapplyingreplicationandcaching,leadingtoimprovedperformance.Tokeepallcopiesconsistentgenerallyrequiresglobalsynchronization,whichisinherentlycostlyintermsofperformance.ObjectReplication1.DATA-CENTRICCONSISTENCYMODELS.2.CLIENT-CENTRICCONSISTENCYMODELSTWOCONSISTENCYMODELSDATA-CENTRICCONSISTENCYMODELSAdatastoremaybephysicallydistributedacrossmultiplemachines:(distributed)sharedmemory,(distributed)shareddatabase,ora(distributed)filesystem.Adataoperationisclassifiedas:writeoperation;readoperation;eachprocessthatcanaccessdatafromthestoreisassumedtohavealocal(ornearby)copyavailableoftheentirestore.Writeoperationsarepropagatedtotheothercopies.Data-CentricConsistencyModelsThegeneralorganizationofalogicaldatastore,physicallydistributedandreplicatedacrossmultipleprocesses.Data-CentricConsistencyModelsAconsistencymodelisessentiallyacontractbetweenprocess(read/write)andthedatastore.Itsaysthatifprocessesagreetoobeycertainrules,thestorepromisestoworkcorrectly.Data-CentricConsistencyModelsIntheabsenceofaglobalclock,itisdifficulttodefinepreciselywhichwriteoperationisthelastone,soweneedtoprovidedefinitions,leadingtoarangeofconsistencymodels.Eachmodeleffectivelyrestricts(限定)thevaluesthatareadoperationonadataitemcanreturn.StrictConsistency(7.2.1)strictconsistency:

Anyreadonadataitemxreturnsavaluecorrespondingtotheresultofthemostrecentwriteonx.例:單一處理機遵守嚴格一致性,有如下程序:A=1;A=2;PRINT(A);打印1或2以外的任何值是編程者無法理解的。StrictConsistency(7.2.1)假設在DCS中,有如下情形:先讀后寫。NodeANode

BP1x

read(B:xatt1)P2

…t2-t1=1nsWrite(B:xatt2)嚴格一致性:A應該讀出原來x值。假設機器間距離3米,從A到B傳送讀操作并使之先于B的寫操作,則信號必須以十倍光速傳遞,與愛因斯坦相對論矛盾.1ns(納秒)=10-9秒。可理解為單/多副本方式.StrictConsistency(7.2.1)Behavioroftwoprocesses,operatingonthesamedataitem.(a)Astrictlyconsistentstore;(b)Astorethatisnotstrictlyconsistent;StrictConsistency(7.2.1)嚴格一致性存儲器,寫操作在任一時刻對所有進程都是可見的,后讀出的都是新更改值。反之,無論后面的寫操作有多快,前面的讀操作仍應是原來的值。系統維護一個絕對全局時間。嚴格一致性是理想的編程模式,在分布式系統中,這幾乎不可能實現。LinearizabilityandSequentialConsistency(7.2.2)Sequentialconsistent(順序一致性條件,Lamport1979):Theresultofanyexecutionisthesameasifthe(readandwrite)operationbyallprocessesonthedatastorewereexecutedinsomesequentialorderandtheoperationsofeachindividualprocessappearinthissequenceintheorderspecifiedbyitsprogram.

所有進程對數據執行的結果順序與每個進程各自對數據執行的結果順序一致。不同機器上并發運行的進程,任何有效的交錯是可以接受的,但所有進程必須遵守同一訪問順序。LinearizabilityandSequentialConsistency(7.2.2)Asequentiallyconsistentdatastore.Adatastorethatisnotsequentiallyconsistent.不保證讀返回值是1ns/1ms/1分鐘以前另一進程寫入的。只保證所有進程以相同的順序看見存儲器訪問(圖a),如果缺少明確的同步操作,則結果是不確定的。LinearizabilityandSequentialConsistency(2)Threeconcurrentlyexecutingprocesses.ProcessP1ProcessP2ProcessP3x=1;print(y,z);y=1;print(x,z);z=1;print(x,y);6個語句共有720(6!)種可能的執行順序,但是,只有90個是有效的。注:x=1開始的順序有120(5!)種,其中只有1/4(y=1/z=1在print()前,各1/2)即30個是有效的。另外,分別是以y=1和z=1開頭的各30個。LinearizabilityandSequentialConsistency(3)Fourvalidexecutionsequencesfortheprocessesofthepreviousslide.Theverticalaxisistime.x=1;print((y,z);y=1;print(x,z);z=1;print(x,y);Prints:001011Signature:

001011(a)x=1;y=1;print(x,z);print(y,z);z=1;print(x,y);Prints:101011Signature:

101011(b)y=1;z=1;print(x,y);print(x,z);x=1;print(y,z);Prints:010111Signature:010111(c)y=1;x=1;z=1;print(x,z);print(y,z);print(x,y);Prints:111111Signature:111111(d)LinearizabilityandSequentialConsistency(3)Linearizabilityconsistency:Theresultofanyexecutionisthesameasifthe(readandwrite)operationbyallprocessesonthedatastorewereexecutedinsomesequentialorderandtheoperationsofeachindividualprocessappearinthissequenceintheorderspecifiedbyitsprogram.Inaddition,iftsOP1(x)<tsOP2(y),thenoperationOP1(xshouldprecedeOP2(y)inthissequence.LettsOP1(x)denotethetimestampassignedtooperationOPthatisperformedondataitemx;滿足順序一致性約束的同時滿足時間戳的順序(比順序一致性強)LinearizabilityandSequentialConsistency(4)Acommonapproachtoformallyexpresssequentialconsistencyisasfollows(Ahamadetal.,1992;Mizunoetal.,1995).另一種表示。EachprocessPihasanassociatedexecutionEi,whichisasequenceofreadandwriteoperationsbyprocessPiperformedonadatastoreS.ThissequenceadherestotheprogramorderassociatedwithPi.Forexample.

E1:W1(x)a;E2:W2(x)b;E3:R3(x)b,R3(x)a;E4:R4(x)b,R4(x)a;ThenH=W1(x)b,R3(x)b,R4(x)b,W2(x)a,R3(x)a,R4(x)aCasualConsistency(7.2.3)Necessarycondition:Writesthatarepotentiallycasuallyrelatedmustbeseenbyallprocessesinthesameorder.Concurrentwritesmaybeseeninadifferentorderondifferentmachines.所有進程僅看到滿足因果關系一致性模型的寫操作結果。CasualConsistency(2)Thissequenceisallowedwithacasually-consistentstore,butnotwithsequentiallyorstrictlyconsistentstore.(W2(x)bandW1(x)careconcurrent).可以是不同進程的因果順序.比如W(x)b與W(x)b有因果關系寫無因果關系寫CasualConsistency(3)Aviolationofacasually-consistentstore(不滿足).Acorrectsequenceofeventsinacasually-consistentstore.Implement:requiringkeepingtrackofwhichprocesseshaveseenwhichwrites(實現思想:建立并維護一個依賴圖:即一個操作依賴于其它什么操作)。FIFOConsistency(orPRAM)(7.2.4)NecessaryCondition:

Writesdonebyasingleprocessareseenbyallotherprocessesintheorderinwhichtheywereissued,butwritesfromdifferentprocessesmaybeseeninadifferentorderbydifferentprocesses

一個進程的寫操作可以被其它進程以相同的順序接收到,但不同進程的寫操作在不同進程看來次序可以是不同的.強調一個進程的操作順序.

PRAM一致性由LIPTON和SANDBERG(1988)提出,PRAM代表管道RAM,一個進程的寫操作可以是流水線的,進程不必在開始下一個操作之前等待一操作結束,只要求一個進程的寫操作順序被所有進程看到。FIFOConsistencyAvalidsequenceofeventsofFIFOconsistency

保證由同一個進程的寫操作順序被所有進程看到。可通過設置序列號實現。

themodelcanbeimplementedbysimplytaggingeachwriteoperationwitha(process,sequencenumber)pair,andperformingwritesperprocessintheorderoftheirsequencenumber.

與因果一致性的區別:因果關系可以是不同進程之間,比如同步點。確定的順序FIFOConsistencyStatementexecutionasseenbythethreeprocessesfromthepreviousslide.Thestatementsinboldaretheonesthatgeneratetheoutputshown.(結果只受本進程順序影響)x=1;print(y,z);y=1;print(x,z);z=1;print(x,y);Prints:00(a)x=1;y=1;print(x,z);print(y,z);z=1;print(x,y);Prints:10(b)y=1;print(x,z);z=1;print(x,y);x=1;print(y,z);Prints:01(c)如果將三個進程的輸出順序相接,得到結果為001001FIFOConsistency在FIFO中不同進程看到的語句執行順序不同,進程P1、P2和P3分別看到的是圖(a)、(b)和(c)。對于順序一致存儲器,三個不同顯示是不允許的,比如結果“001001”在順序一致性下不可能的。yzxzxy不同之處:前者盡管未確定語句(和存儲器訪問)的執行順序,但至少所有進程都遵守共同的順序。后者就不遵守,不同進程看到的操作順序不同(不同進程寫結果可不同)。FIFOConsistencyTwoconcurrentprocesses.GOODMAN(1989)提出了一種略微不同但仍支持PRAM一致的存儲器形式。上述例子中,順序一致性只能出現三種結果:1)P1被KILL;2)P2被KILL;3)兩者都不被KILL。

FIFO一致性兩個進程都可能被KILL(不同進程的寫是并發的)。ProcessP1ProcessP2x=1;if(y==0)kill(P2);y=1;if(x==0)kill(P1);WeakConsistency(7.2.5)FIFO要求一個進程的寫操作順序被所有進程看到,實際應用中,并非所有應用程序必須要看到所有寫操作。比如:臨界區中一個進程通過循環讀寫復制數據。只需讓進程完成臨界區后的操作結果可見,中間結果不重要。Properties(Duboisetal.1986):AccessestosynchronizationvariablesassociatedwithadatastorearesequentiallyconsistentNooperationonasynchronizationvariableisallowedtobeperformeduntilallpreviouswriteshavebeencompletedeverywhereNoreadorwriteoperationondataitemsareallowedtobeperformeduntilallpreviousoperationstosynchronizationvariableshavebeenperformed.WeakConsistency(7.2.5)需要一個同步變量完成一致性操作。1.對同步變量的訪問是順序一致的;2.在先前所有的寫操作完成之前,不能訪問同步變量;3.在先前所有同步變量的訪問完成前,不能訪問(讀或寫)數據;

2和3說明了對同步變量操作的約束。弱一致性是對一組操作的一致性約束,不是單獨的讀或寫。WeakConsistencyAvalidsequenceofeventsforweakconsistency.Aninvalidsequenceforweakconsistency.WeakConsistencyAprogramfragmentinwhichsomevariablesmaybekeptinregisters.若允許另外一個進程可隨意讀取存儲器的話,只需讓編譯器寫一標志位說明存儲器沒有更新。若另一進程訪問A,它會在標志位上等待。應用實例:一個優化的編譯器可以在寄存器中計算a和b,并保存中間結果不更新存儲器,只有當調用函數f時才將a和b當前值寫回存儲器。inta,b,c,d,e,x,y; /*variables*/

int*p,*q; /*pointers*/

intf(int*p,int*q); /*functionprototype*/a=x*x; /*astoredinregister*/

b=y*y; /*baswell*/

c=a*a*a+b*b+a*b; /*usedlater*/

d=a*a*c; /*usedlater*/

p=&a; /*pgetsaddressofa*/

q=&b /*qgetsaddressofb*/

e=f(p,q) /*functioncall*/ReleaseConsistency(7.2.6)弱一致性存在的問題:訪問同步變量時,存儲器不知道是因為進程已完成對共享變量的寫操作還是要開始讀共享變量。因此引入釋放一致性(Gharachorlooetal.,1990),提供兩種操作:獲取(acquire)——訪問用于通知存儲器系統臨界區已就緒。釋放(release)——訪問表明臨界區剛退出。程序員需要在程序中編寫代碼明確說明何時做這些操作。ReleaseConsistency(7.2.6)Avalideventsequenceforreleaseconsistency.ReleaseConsistencyRules:Beforeareadorwriteoperationonshareddataisperformed,allpreviousacquiresdonebytheprocessmusthavecompletedsuccessfully.Beforeareleaseisallowedtobeperformed,allpreviousreadsandwritesbytheprocessmusthavecompletedAccessestosynchronizationvariablesareFIFOconsistent(sequentialconsistencyisnotrequired).ReleaseConsistency遵守以下規定:在訪問共享變量前,進程所有先前的獲取訪問都必須成功地完成;在允許釋放訪問前,進程先前的所有讀寫操作都必須結束;獲取訪問和釋放訪問必須是FIFO一致的。ReleaseConsistency在DSM的應用:進程A將獲取操作請求發送給同步管理者,要求在一特殊鎖上執行獲取訪問:無競爭時請求獲準,獲取訪問完成。A對共享數據在本地進行讀寫,不必同步更新副本。當執行釋放訪問時,同步更新副本完成后,同步管理者被告知可以執行釋放了。EntryConsistency(7.2.7)Conditions:Anacquireaccessofasynchronizationvariableisnotallowedtoperformwithrespecttoaprocessuntilallupdatestotheguardedshareddatahavebeenperformedwithrespecttothatprocess.Beforeanexclusivemodeaccesstoasynchronizationvariablebyaprocessisallowedtoperformwithrespecttothatprocess,nootherprocessmayholdthesynchronizationvariable,noteveninnonexclusivemode.Afteranexclusivemodeaccesstoasynchronizationvariablehasbeenperformed,anyotherprocess'snextnonexclusivemodeaccesstothatsynchronizationvariablemaynotbeperformeduntilithasperformedwithrespecttothatvariable'sowner.EntryConsistency(7.2.7)1.當某一進程的保護共享變量全部被更新以后,該進程才允許執行同步變量的獲取訪問;2.在一進程以互斥模式訪問該進程的同步變量之前,不允許其它進程持有此同步變量;3.在結束互斥模式下對一個同步變量的訪問后,任意其它進程必須與該變量的擁有者核查,才能試圖以非互斥模式訪問該同步變量。EntryConsistencyAvalideventsequenceforentryconsistency.SummaryofConsistencyModelsConsistencymodelsnotusingsynchronizationoperations.Modelswithsynchronizationoperations.ConsistencyDescriptionStrictAbsolutetimeorderingofallsharedaccessesmatters.LinearizabilityAllprocessesmustseeallsharedaccessesinthesameorder.Accessesarefurthermoreorderedaccordingtoa(nonunique)globaltimestampSequentialAllprocessesseeallsharedaccessesinthesameorder.AccessesarenotorderedintimeCausalAllprocessesseecausally-relatedsharedaccessesinthesameorder.FIFOAllprocessesseewritesfromeachotherintheordertheywereused.Writesfromdifferentprocessesmaynotalwaysbeseeninthatorder(a)ConsistencyDescriptionWeakShareddatacanbecountedontobeconsistentonlyafterasynchronizationisdoneReleaseShareddataaremadeconsistentwhenacriticalregionisexitedEntryShareddatapertainingtoacriticalregionaremadeconsistentwhenacriticalregionisentered.(b)Client-CentricConsistencyModels(7.3)Client-centricconsistencyprovidesguaranteesforasingleclientconcerningtheconsistencyofaccessestoadatastorebythatclient.Noguaranteesaregivenconcerningconcurrentaccessesbydifferentclients(Nosimultaneousupdates)Client-CentricConsistencyModels

EventualConsistency(7.3.1)Theprincipleofamobileuseraccessingdifferentreplicasofadistributeddatabase.EventualConsistencyTherearemanycasesof(large-scale)distributedandreplicateddatabasesthattoleratearelativelyhighdegreeofinconsistency.Theyhaveincommonthatifnoupdatestakeplaceforalongtime,allreplicaswillgraduallybecomeconsistent.---eventualconsistency.Forexample,DNS,WWW,etc.Monotonic-ReadConsistency(7.3.2)Condition:

Ifaprocessreadsthevalueofadataitemx,anysuccessivereadoperationonxbythatprocesswillalwaysreturnthatsamevalueoramorerecentvalue.

(toguaranteethatifaprocesshasseenavalueofxattimet,itwillneverseeanolderversionofxatalatertime).MonotonicReadsThereadoperationsperformedbyasingleprocessPattwodifferentlocalcopiesofthesamedatastore.Amonotonic-readconsistentdatastore;Adatastorethatdoesnotprovidemonotonicreads;(WS(xi[t])表示在時刻t,本地副本Li上的一系列write操作的結果,t可以省略)。Monotonic-WriteConsistence(7.3.3)Condition:

Awriteoperationbyaprocessonadataitemxiscompletedbeforeanysuccessivewriteoperationonxbythesameprocess.MonotonicWritesThewriteoperationsperformedbyasingleprocessPattwodifferentlocalcopiesofthesamedatastoreAmonotonic-writeconsistentdatastore.Adatastorethatdoesnotprovidemonotonic-writeconsistency(missingW(x1)).Read-Your-WritesConsistency(7.3.4)Condition:Theeffectofawriteoperationbyaprocessonadataitemxwillalwaysbeseenbyasuccessivereadoperationonxbythesameprocess.Read-Your-WritesConsistency(7.3.4)Adatastorethatprovidesread-your-writesconsistency.Adatastorethatdoesnot(W(x1)notbeenpropagatedtoL2).Writes-Follow-ReadsConsistency(7.3.5)Condition:Awriteoperationbyaprocessonadataitemxfollowingapreviousreadoperationonxbythesameprocess,isguaranteedtotakeplaceonthesameoramorerecentvalueofthatwasread.(Inotherwords,anysuccessivewriteoperationbyaprocessonadataitemxwillbeperformedonacopyofxthatisuptodatewiththevaluemostrecentlyreadbythatprocess)WritesFollowReadsAwrites-follow-readsconsistentdatastoreAdatastorethatdoesnotprovidewrites-follow-readsconsistencyWrites-Follow-ReadsWrites-Follow-ReadsConsistencycanbeusedtoguaranteethatusersofanetworknewsgroupseeapostingofareactiontoanarticleonlyaftertheyhaveseentheoriginalarticle.Implementation(7.3.6)Anativeimplementation:Eachwriteoperationisassignedagloballyuniqueidentifierbytheserverthatacceptstheoperationforthefirsttime(initiatedbyserver).Foreachclient,wekeeptrackoftwosetsofwriteidentifiers.Thereadset={thewriteidentifiersrelevantforthereadoperationsperformedbyaclient}.讀之前的寫操作集合。Forexample:Pid-read-set={wid1,wid2,……widn}Thewriteset={theidentifiersofthewritesperformedbytheclient}.Implementation(7.3.6)monotonic-readconsistency:

Whenaclientperformsareadoperationataserver,thatserverishandedtheclient’sreadsettocheckwhetheralltheidentifiedwriteshavetakenplacelocally.Ifnot:

1)itcontactstheotherserverstoensurethatitisbroughtuptodatebeforecarryingoutthereadoperation.or2)thereadoperationisforwardedtoaserverwherethewriteoperationsthatalreadytakenplace.Implementation(7.3.6)ClientARead-setReadAfterthereadoperationisperformed,thewriteoperationsthathavetakenplaceattheselectedserverandwhicharerelevantforthereadoperation,areaddedtotheclient’sreadset.Implementation(7.3.6)monotonic-writeconsistency:wheneveraclientinitiateanewwriteoperationataserver,itcontactstheotherserverstoensurethatitisbroughtuptodatebeforecarryingoutthewriteoperation.orthewritesetisforwardedtoaserverwherethewriteoperationsthatalreadytakenplace.Itthenensuresthattheidentifiedwriteoperationsareperformedfirstandinthecorrectorder.Implementation(7.3.6)ClientAWrite-setWriteAfterperformingthenewoperation,thatoperation’swriteidentifierisaddedtothewriteset.Implementation(7.3.6)read-your-writesconsistency:Requiringthattheserverwherethereadoperationisperformedhasseenallthewriteoperationsintheclient’swriteset.1)Thewritescansimplybefetchedfromotherserversbeforethereadoperationisperformed.2)Theclient-sidesoftwarecansearchforaserverwheretheidentifiedwriteoperationsintheclient’swritesethavealreadybeenperformed.Implementation(7.3.6)ClientAWrite-setReadImplementation(7.3.6)writes-follow-readsconsistency:bringingtheselectedserveruptodatewiththewriteoperationsintheclient’sreadset,andthenlateraddingtheidentifierofthewriteoperationtothewriteset,alongwiththeidentifiersinthereadset.Implementation(7.3.6)ClientARead-setWriteImprovingEfficiencySetssizesession:Client’sreadandwriteoperationsaregroupedintosessionsassociatedwithanapplication,whichopenedwhentheapplicationstartsandisclosedwhenitexits.會話大小的設置.Setsrepresentation:ts(WID),ts---timestampDISTRIBUTIONPROTOCOLS(7.4)Discussingthedifferentwaysofpropagating(distributingupdatestoreplicas).Decidingwhere,when,andbywhomcopiesofthedatastorearetobeplaced.ReplicaPlacement(7.4.1)Thelogicalorganizationofdifferentkindsofcopiesofadatastoreintothreeconcentricrings.TheinitialsetofreplicasthatconstituteadistributeddatastoreServer-InitiatedReplicasKnowsaspushcaches/pushmode:toreducetheloadonaserver;tobemigrated,orreplicatedtoserverplacedintheproximityofclientsthatissuemanyrequestsforthosefiles;Eachserverkeepstrackofaccesscountsperfile,andwhereaccessrequestscomefrom.Server-InitiatedReplicasCountingaccessrequestsfromdifferentclients.Server-InitiatedReplicascntQ(P,F)-allaccesscountforFatQfromC1andC2(C1andC2sharethesameclosestserverP),訪問計數.del(S,F)-deletionthresholdrep(S,F)-replicationthresholdIfcnt(S,F)<del(S,F)thenremoveFfromS;Ifcnt(S,F)>rep(S,F)thenreplicateFatS;Ifdel(S,F)<cnt(P,F)andcnt(P,F)<rep(S,F)thenonlytobemigrated.Client-InitiatedReplicasKnownasclientcaches/pullmode.Creatingacacheattheclientwhenneededandmanagingthecacheisleftentirelytothatclient.UpdatePropagation(7.4.2)Threepropagations:1.Propagateonlyanotificationofanupdate(knownasinvalidationprotocols)2.Transferdatafromonecopytoanother.3.Propagatetheupdateoperationtoothercopies.EnsuringrelevantconsistencyaccordingtoneedPullversusPushProtocolsAcomparisonbetweenpush-basedandpull-basedprotocolsinthecaseofmultipleclient,singleserversystems.IssuePush-basedPull-basedStateofserverListofclientreplicasandcachesNoneMessagessentUpdate(andpossiblyfetchupdatelater)PollandupdateResponsetimeatclientImmediate(orfetch-updatetime)Fetch-updatetimeIfinvalidationprotocolEpidemicProtocolsImplementeventual-consistentbasedonepidemics.aserverPpicksanotherserverQatrandom,andsubsequentlyexchangesupdateswithQ.Threeapproaches:1.PonlypushesitsownupdatestoQ2.PonlypullsinnewupdatesfromQ3.PandQsendupdatestoeachother(pull-pull)CONSISTENCYPROTOCOLS(7.5)Implementaspecificconsistencymodel(sequential,weakconsistency,andatomictransaction).“Globalsequence”7.5.1Primary-basedProtocolsRemote-WriteProtocols(1)Primary-basedremote-writeprotocolwithafixedservertowhichallreadandwriteoperationsareforwarded(主副本固定).BlockingornonblockingRemote-WriteProtocols(2)Theprincipleofprimary-backupprotocol(帶復制的主副本).Local-WriteProtocols(1)Primary-basedlocal-writeprotocolinwhichasinglecopyismigratedbetweenprocesses(可移動的主副本).Local-WriteProtocols(2)Primary-backupprotocolinwhichtheprimarymigratestotheprocesswantingtoperformanupdate(帶復制的可移動主副本).ActiveReplication(1)Theproblemofreplicatedinvocations(eachreplicahasanassociatedprocessthatcarriesoutupdateoperation).Replicated-WriteProtocols(7.5.2)ActiveReplication(2)Forwardinganinvocationrequestfromareplicatedobject.Returningareplytoareplicatedobject.Quorum-BasedProtocolsTwoconstraints(forNreplicas):1.NR+NW>N;2.NW

>N/2Threeexamplesofthevotingalgorithm:(a)Acorrectchoiceofreadandwriteset;(b)Achoicethatmayleadtowrite-writeconflicts;(c)Acorrectchoice,knownasROWA(readone,writeall);背景:Paxos是古希臘的一個小島,其法令通過議會表決形成,議員和信使都是兼職,且不一定在會議廳,議員通過信使傳遞消息,并且信使可能重復投遞消息,也可能一去不復返。需要一種協議:

保證在這種情況下法令仍然能夠正確產生,并且不會出現沖突!Paxos算法LeslieLamport1998在ACMTCS上發表了關于paxos的論文“ThePart-TimeParliament”,由于晦澀難懂,2001年又發表了“PaxosMadeSimple”;Paxos算法介紹Google使用該算法后使其聲名大噪,GoogleChubby的作者MikeBurrows:Thereisonlyoneconsensusprotocol,andthat‘sPaxos”-allotherapproachesarejustbrokenversionsofPaxos.(OSDI2006)在分布式系統中有一組Process,它們需要確定一個Value。每個Process都提出了一個Value,共識(consensus)就是指只能有一個Value被選中作為最后確定的值,并且當這個值被確定后,所有Processes都要被通知到.IntroductionLockistheeasiestwaytomanageconcurrencyMutexandsemaphore.Readandwritelocksin2PLfortransaction.Indistributedsystem:Nomasterforissuinglocks.Failures.Problem:

Howtoreachconsensus/dataconsistencyindistributedsystemthatcantoleratenon-maliciousfailures?RequirementsSafetyOnlyavaluethathasbeenproposedmaybechosen.只有提案才能被選定;Onlyasinglevalueischosen.只能有一個值被選定;Anodeneverlearnsthatavaluehasbeenchosenunlessitactuallyhasbeen.除非那個提案被選中,否則,某個節點不會知道有提案被選中。RequirementsLivenessSomeproposedvalueiseventuallychosen.保證最終有一個提案會被選定;Ifavaluehasbeenchosen,anodecaneventuallylearnthevalue.當提案被選定后,進程最終也能獲取到被選定的提案。Paxos‘snotationandalgorithmClassesofagentsAnodecanactasmorethanoneclients(usually3):Proposers;Acceptors;Learners;算法思想PaxosalgorithmPhase1(prepare):Aproposerselectsaproposalnumbernandsendsapreparerequestwithnumberntomajorityofacceptors.Ifanacceptorreceivesapreparerequestwithnumberngreaterthanthatofanypreparerequestitsaw,itresponsesYEStothatrequestwithapromisenottoacceptanymoreproposalsnumberedlessthannandincludethehighest-numberedproposal(ifany)thatithasaccepted.

接受者只響應目前收到的編號最大的提案請求者.PaxosalgorithmPhase2(accept):IftheproposerreceivesaresponseYEStoitspreparerequestsfromamajorityofacceptors,thenitsendsanacceptrequest

toeachofthoseacceptorsforaproposalnumberednwithavaluesv

whichisthevalueofthehighest-numberedproposalamongtheresponses.Ifanacceptorreceivesanacceptrequestforaproposalnumberedn,itacceptstheproposalunlessithasalreadyrespondedtoapreparerequesthavinganumbergreaterthann.DefinitionofchosenAvalueischosenatproposalnumberniffmajorityofacceptoracceptthatvalueinphase2oftheproposalnumber.Paxos’spropertiesP1:Anyproposalnumberisunique.P2:Anytwosetofacceptorshaveatleastoneacceptorincommon.P3:thevaluesentoutinphase2isthevalueofthehighest-numberedproposalofalltheresponsesinphase1.LearningachosenvalueTherearesomeoptions:Eachacceptor,wheneveritacceptsaproposal,informsallthelearners.Acceptorsinformsadistinguishedlearner(usuallytheproposer)andletthedistinguishedlearnerbroadcasttheresult.ClassicPaxos情形1:PaxoswithoutProposerCompetitionPreparesetabc=0abc=X(1)Response(1)&&PromiseResponse

ClassicPaxosAcceptabc=XAccept(1,setabc=0)&&Promise(1,setabc=0)Response(1)&&PromiseClassicPaxosAcceptabc=0Acceptedsetabc=0accpetedAccep

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經權益所有人同意不得將文件中的內容挪作商業或盈利用途。
  • 5. 人人文庫網僅提供信息存儲空間,僅對用戶上傳內容的表現方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論