計算機結構英文版課件:Chapter 13 Reduced Instruction Set Computers_第1頁
計算機結構英文版課件:Chapter 13 Reduced Instruction Set Computers_第2頁
計算機結構英文版課件:Chapter 13 Reduced Instruction Set Computers_第3頁
計算機結構英文版課件:Chapter 13 Reduced Instruction Set Computers_第4頁
計算機結構英文版課件:Chapter 13 Reduced Instruction Set Computers_第5頁
已閱讀5頁,還剩45頁未讀 繼續免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

1、William Stallings Computer Organization and Architecture7th EditionChapter 13Reduced Instruction Set Computers1Chapter 13Reduced Instruction Set ComputersKey terms CISC complex instruction set computerRISC reduced instruction set computerDelayed branchDelayed loadHLL high-level languageRegister file

2、Register window SPARC2Major Advances in Computers(1)The family conceptIBM System/360 1964DEC PDP-8Separates architecture from implementationMicroporgrammed control unitIdea by Wilkes 1951Produced by IBM S/360 1964Cache memoryIBM S/360 model 85 19693Major Advances in Computers(2)Solid State RAM(See m

3、emory notes)MicroprocessorsIntel 4004 1971PipeliningIntroduces parallelism into fetch execute cycleMultiple processors4The Next Step - RISCReduced Instruction Set ComputerKey featuresLarge number of general purpose registersor use of compiler technology to optimize register useLimited and simple ins

4、truction setEmphasis on optimising the instruction pipeline5Comparison of processors6Driving force for CISCSoftware costs far exceed hardware costsIncreasingly complex high level languagesSemantic gapLeads to:Large instruction setsMore addressing modesHardware implementations of HLL statementse.g. C

5、ASE (switch) on VAXSemantic: 語義的;語義學的 Gap 英音:gp 豁口,裂口 7Intention of CISCEase compiler writingImprove execution efficiencyComplex operations in microcodeSupport more complex HLLsease 減輕 HLL 縮寫詞 abbr. high-level language 【電腦】高級語言8Execution CharacteristicsOperations performedOperands usedExecution sequ

6、encingStudies have been done based on programs written in HLLsDynamic studies are measured during the execution of the program9OperationsAssignmentsMovement of dataConditional statements (IF, LOOP)Sequence controlProcedure call-return is very time consumingSome HLL instruction lead to many machine c

7、ode operationsAssignment 分配;指派,選派10Weighted Relative Dynamic Frequency of HLL Operations PATT82a Dynamic OccurrenceMachine-Instruction WeightedMemory-Reference WeightedPascalCPascalCPascalCASSIGN45%38%13%13%14%15%LOOP5%3%42%32%33%26%CALL15%12%31%33%44%45%IF29%43%11%21%7%13%GOTO3%OTHER6%1%3%1%2%1%11O

8、perandsMainly local scalar variablesOptimisation should concentrate on accessing local variablesPascalCAverageInteger Constant16%23%20%Scalar Variable58%53%55%Array/Structure26%24%25%12Procedure CallsVery time consumingDepends on number of parameters passedDepends on level of nestingMost programs do

9、 not do a lot of calls followed by lots of returnsMost variables are local(c.f. locality of reference)13ImplicationsBest support is given by optimising most used and most time consuming featuresLarge number of registersOperand referencingCareful design of pipelinesBranch prediction etc.Simplified (r

10、educed) instruction set14Large Register FileSoftware solutionRequire compiler to allocate registersAllocate based on most used variables in a given timeRequires sophisticated program analysisHardware solutionHave more registersThus more variables will be in registers15Registers for Local VariablesSt

11、ore local scalar variables in registersReduces memory accessEvery procedure (function) call changes localityParameters must be passedResults must be returnedVariables from calling programs must be restored16Register WindowsOnly few parametersLimited range of depth of callUse multiple small sets of r

12、egistersCalls switch to a different set of registersReturns switch back to a previously used set of registers17Register Windows cont.Three areas within a register setParameter registersLocal registersTemporary registersTemporary registers from one set overlap parameter registers from the nextThis al

13、lows parameter passing without moving datacont. 1.內容,所含之物 (contents) 2.繼續的;不斷的;連續的 18Overlapping Register Windows19Circular Buffer diagram主程序1子程序A2子程序B3子程序C4子程序D5子程序E6子程序F7子程序G20Operation of Circular BufferWhen a call is made, a current window pointer is moved to show the currently active register w

14、indowIf all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memoryA saved window pointer indicates where the next saved windows should restore to21Global VariablesAllocated by the compiler to memoryInefficient for frequently

15、 accessed variablesHave a set of registers for global variables22Registers v CacheLarge Register FileCacheAll local scalarsRecently-used local scalarsIndividual variablesBlocks of memoryCompiler-assigned global variablesRecently-used global variablesSave/Restore based on procedure nesting depthSave/

16、Restore based on cache replacement algorithmRegister addressingMemory addressing23Referencing a Scalar - Window Based Register File24Referencing a Scalar - Cache25Referencing a Scalar - Window Based Register File26Compiler Based Register OptimizationAssume small number of registers (16-32)Optimizing

17、 use is up to compilerHLL programs have no explicit references to registersusually - think about C - register intAssign symbolic or virtual register to each candidate variable Map (unlimited) symbolic registers to real registersSymbolic registers that do not overlap can share real registersIf you ru

18、n out of real registers some variables use memory27Graph ColoringGiven a graph of nodes and edgesAssign a color to each nodeAdjacent nodes have different colorsUse minimum number of colorsNodes are symbolic registersTwo registers that are live in the same program fragment are joined by an edgeTry to

19、 color the graph with n colors, where n is the number of real registersNodes that can not be colored are placed in memory28Graph Coloring Approach29Graph Coloring Approach30Why CISC (1)?Compiler simplification? DisputedComplex machine instructions harder to exploitOptimization more difficult (開發)Sma

20、ller programs?Program takes up less memory butMemory is now cheapMay not occupy less bits, just look shorter in symbolic formMore instructions require longer op-codesRegister references require fewer bits dispute 英音:dispju:t 爭論;爭執 simplification 1. 單純化 2. 簡單化 31Why CISC (2)?Faster programs?Bias towa

21、rds use of simpler instructionsMore complex control unitMicroprogram control store largerthus simple instructions take longer to executeIt is far from clear that CISC is the appropriate solution Bias 傾向,趨勢 32RISC CharacteristicsOne instruction per cycleRegister to register operationsFew, simple addr

22、essing modesFew, simple instruction formatsHardwired design (no microcode)Fixed instruction formatMore compile time/effort33RISC v CISCNot clear cutMany designs borrow from both philosophiese.g. PowerPC and Pentium IIphilosophies 哲學;觀點34RISC PipeliningMost instructions are register to registerTwo ph

23、ases of executionI: Instruction fetchE: ExecuteALU operation with register input and outputFor load and storeI: Instruction fetchE: ExecuteCalculate memory addressD: MemoryRegister to memory or memory to register operation35Figure 13.6cFigure 13.6c, three instructions can be overlapped, and the impr

24、ovement is as much as a factor of 3.I36Figure 13.6dE1: Register file readE2: ALU operation and register writeI37Effects of Pipelining38Optimization of PipeliningDelayed branchDelayed LoadLoop Unrolling39Optimization of Pipelining (1)Loop UnrollingReplicate body of loop a number of timesIterate loop

25、fewer timesReduces loop overheadIncreases instruction parallelismImproved register, data cache or TLB locality unroll 展開,打開(卷著的東西) replicate 英音:replikeit 折疊;復制 iterate 英音:itreit 反復,重復 overhead 英音:uvhed日常開支,額外開銷 40Optimization of Pipelining (2)Delayed branchDoes not take effect until after execution

26、of following instructionThis following instruction is the delay slot41Normal and Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOADX, rALOAD X, rA101ADD1, rAADD1, rAJUMP105102JUMP105JUMP106ADD1, rA103ADDrA, rBNOOPADDrA, rB104SUBrC, rBADDrA, rBSUBrC, rB105STORE r

27、A, ZSUBrC, rBSTORE rA, Z106STORE rA, Z42Use of Delayed BranchAddressNormal BranchDelayed BranchOptimized Delayed Branch100LOADX, rALOAD X, rALOAD X, rA101ADD 1, rAADD 1, rAJUMP 105102JUMP 105JUMP 106ADD 1, rA103ADD rA, rBNOOPADD rA, rB104SUB rC, rBADD rA, rBSUB rC, rB105STORE rA, ZSUB rC, rBSTORE rA

28、, Z106STORE rA, Z43Optimization of Pipelining (3)Delayed LoadRegister to be target is locked by processorContinue execution of instruction stream until register requiredIdle until load completeRe-arranging instructions can allow useful work whilst loading4480486 Instruction Pipeline Examples45Controversy (1)Quantitativecompare program sizes and execution speedsQualitativeexamine issues of

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經權益所有人同意不得將文件中的內容挪作商業或盈利用途。
  • 5. 人人文庫網僅提供信息存儲空間,僅對用戶上傳內容的表現方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論