Instructions: Computer Reports Insufficient Virtual Memory

Date: 2021-01-19

WANG Wei, Computer Organization and Architecture, Copyright 2004 TJU

Computer Organization and Architecture: Worked Exercises (1)

Logic Components of a Computer

2.4 Design a decimal adder cell for 8421 (BCD) code using several full adders and several AND and OR gates.

Analysis and solution:
BCD code: 0000-1001. Binary: 0000-1111.
The two digits are first added in binary; an invalid result is corrected by adding 6 (0110).
When must the result be corrected?
(a) When the binary sum is 10, 11, 12, 13, 14 or 15 (decimal).
(b) When the binary addition produces a carry.
From the Karnaugh map, condition (a) is E3E2 + E3E1, where E3 E2 E1 E0 are the bits of the binary sum.
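The correction rule above can be checked with a small behavioural model. This is only a sketch: the function name `bcd_add_digit` and the variable names are illustrative, and the model works on integers rather than on gate-level signals.

```python
def bcd_add_digit(a, b, carry_in=0):
    """Add two BCD digits (0-9); return (sum_digit, carry_out)."""
    if not (0 <= a <= 9 and 0 <= b <= 9 and carry_in in (0, 1)):
        raise ValueError("operands must be BCD digits, carry_in 0 or 1")
    binary_sum = a + b + carry_in      # first 4-bit binary adder, range 0..19
    carry = (binary_sum >> 4) & 1      # carry out of the 4-bit adder
    e3 = (binary_sum >> 3) & 1
    e2 = (binary_sum >> 2) & 1
    e1 = (binary_sum >> 1) & 1
    # Correct when the adder carries or the sum is 10..15 (E3E2 + E3E1).
    correct = carry | (e3 & e2) | (e3 & e1)
    if correct:
        binary_sum += 6                # second adder adds 0110
    return binary_sum & 0xF, correct


print(bcd_add_digit(7, 8))  # digit 5 with carry 1, i.e. 15
```

The correction signal doubles as the decimal carry into the next digit position.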
Computer Arithmetic

4.1 Represent 512 (decimal) in 32-bit two's complement.
Analysis and solution:
512 = (1000000000)_2 = (0000 0000 0000 0000 0000 0010 0000 0000)_2

4.2 Represent -1023 (decimal) in 32-bit two's complement.
Analysis and solution:
-1023 = -(1111111111)_2
Sign-magnitude:   (1000 0000 0000 0000 0000 0011 1111 1111)
Two's complement: (1111 1111 1111 1111 1111 1100 0000 0001)

4.4 Give the decimal value of the following two's-complement number: (1111 1111 1111 1111 1111 1110 0000 1100)_2
Analysis and solution:
Two's complement:                (1111 1111 1111 1111 1111 1110 0000 1100)
Invert all bits except the sign: (1000 0000 0000 0000 0000 0001 1111 0011)
Add 1 to obtain sign-magnitude:  (1000 0000 0000 0000 0000 0001 1111 0100)
= -(111110100)_2 = -500

4.8 Give the hexadecimal form of the binary number (1100 1010 1111 1110 1111 1010 1100 1110)_2.
Analysis and solution:
Each hexadecimal digit 0-F stands for four bits 0000-1111, so
1100 1010 1111 1110 1111 1010 1100 1110 = (12 10 15 14 15 10 12 14) = (CAFEFACE)_16
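The conversions in exercises 4.1 to 4.8 can be reproduced with a few lines of Python. The helper names below are illustrative, not part of the course material.

```python
def to_twos_complement(value, bits=32):
    """Return the two's-complement bit string of a signed integer."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the given width")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(bit_string):
    """Interpret a bit string as a two's-complement signed integer."""
    raw = int(bit_string, 2)
    if bit_string[0] == "1":           # sign bit set: subtract 2^width
        raw -= 1 << len(bit_string)
    return raw


print(to_twos_complement(512))
print(to_twos_complement(-1023))
print(from_twos_complement("11111111111111111111111000001100"))
print(format(int("11001010111111101111101011001110", 2), "X"))
```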
4.14 The bits of a binary number have no inherent meaning of their own. Consider the following bit string:
1000 1111 1110 1111 1100 0000 0000 0000
What does it mean when it is interpreted as each of the following?
(a) a two's-complement integer
(b) an unsigned integer
(c) a single-precision floating-point number

Analysis and solution:
(a) Two's-complement integer:
two's complement (1000 1111 1110 1111 1100 0000 0000 0000) = sign-magnitude (1111 0000 0001 0000 0100 0000 0000 0000)
= -(111 0000 0001 0000 0100 0000 0000 0000)_2 = -1880113152
(b) Unsigned integer:
(1000 1111 1110 1111 1100 0000 0000 0000)_2 = +2414854144
(c) Single-precision floating point, fields S | E | F = 1 | 00011111 | 11011111100000000000000:
S = (-1)^1 = -1
E = (00011111)_2 = 31
F' = 1 + (0.11011111100000000000000)_2
Value = S * F' * 2^(E - 127) = -(1.110111111)_2 * 2^(-96)
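A quick way to confirm the three readings of this bit string is Python's standard `struct` module. The variable names are illustrative.

```python
import struct

BITS = "10001111111011111100000000000000"
raw = int(BITS, 2).to_bytes(4, "big")      # the same 4 bytes for every reading

as_signed = struct.unpack(">i", raw)[0]    # 32-bit two's-complement integer
as_unsigned = struct.unpack(">I", raw)[0]  # 32-bit unsigned integer
as_float = struct.unpack(">f", raw)[0]     # IEEE 754 single precision

print(as_signed, as_unsigned, as_float)
```

The floating-point reading uses the biased exponent, so the power of two is 31 - 127 = -96.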
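4.26 Following the IEEE 754 standard, write the binary form of 10.5 (decimal) as a single-precision and as a double-precision floating-point number.

Analysis and solution:
Normalize: 10.5 = (1010.1)_2 = (1.0101)_2 * 2^3

Single precision (S: 1 bit, E: 8 bits, F: 23 bits), exponent bias = 127
S = 0
E' = 3 => E = 3 + 127 = 130 = (10000010)_2
F' = (1.0101)_2 => F = F' - 1 = (0101)_2, padded with zeros to 23 bits
Result: 0 10000010 01010000000000000000000

Double precision (S: 1 bit, E: 11 bits, F: 52 bits), exponent bias = 1023
S = 0
E' = 3 => E = 3 + 1023 = 1026 = (10000000010)_2
F' = (1.0101)_2 => F = F' - 1 = (0101)_2, padded with zeros to 52 bits
Result: 0 10000000010 0101000000000000...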
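The two encodings of 10.5 can be checked with the standard `struct` module, which packs a Python float into IEEE 754 single or double precision. The helper `float_bits` is an illustrative name.

```python
import struct


def float_bits(x, fmt):
    """Bit string of x packed as '>f' (single) or '>d' (double)."""
    return "".join(f"{byte:08b}" for byte in struct.pack(fmt, x))


single = float_bits(10.5, ">f")
double = float_bits(10.5, ">d")

# Split into sign | exponent | fraction fields.
print(single[0], single[1:9], single[9:])
print(double[0], double[1:12], double[12:])
```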
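Arithmetic Methods and Arithmetic Units

3.9 A machine has a 16-bit word. In fixed-point format there are 15 magnitude bits and 1 sign bit. In floating-point format the exponent has 6 bits, 1 of which is the exponent sign, and the mantissa has 10 bits, 1 of which is the mantissa sign; the exponent base is 2. Find:
1) for fixed-point sign-magnitude integers, the largest positive number and the smallest negative number;
2) for fixed-point sign-magnitude fractions, the largest positive number and the smallest negative number;
3) for sign-magnitude floating point, the largest and smallest floating-point numbers and the nonzero number of smallest absolute value. Estimate the number of significant decimal digits that can be represented.

Analysis and solution:
1) Fixed-point integer (15 magnitude bits, 1 sign bit):
-11...1 to +11...1 (fifteen 1s), that is, -(2^15 - 1) to +(2^15 - 1)
2) Fixed-point fraction (15 magnitude bits, 1 sign bit):
-0.11...1 to +0.11...1, that is, -(1 - 2^-15) to +(1 - 2^-15)
3) Floating point (6-bit exponent with 1 sign bit, 10-bit mantissa with 1 sign bit):
Largest: +2^(2^5 - 1) * (1 - 2^-9) = +2^31 * (1 - 2^-9)
Smallest: -2^(2^5 - 1) * (1 - 2^-9) = -2^31 * (1 - 2^-9)
Smallest nonzero magnitude: 2^(-(2^5 - 1)) * 2^-9 = 2^-31 * 2^-9 = 2^-40
(Figures showing the bit positions of the word formats are not reproduced.)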
3.12 Write the excess (biased) code of each of the following numbers:
+01101101, -11001101, -00010001, +00011101

Analysis and solution (9-bit codes: 1 sign bit plus 8 magnitude bits):

Number      Sign-magnitude  Ones' complement  Two's complement  Excess code
+01101101   001101101       001101101         001101101         101101101
-11001101   111001101       100110010         100110011         000110011
-00010001   100010001       111101110         111101111         011101111
+00011101   000011101       000011101         000011101         100011101
3.19 Using one-bit two's-complement multiplication, compute the product XY for X = 0.1010 and Y = -0.0110.

Analysis and solution (two sign bits are used):
X = 0.1010:   sign-magnitude 00.1010, two's complement 00.1010
Y = -0.0110:  sign-magnitude 10.0110, two's complement 11.1010
-X = -0.1010: sign-magnitude 10.1010, two's complement 11.0110

Result of the one-bit two's-complement multiplication:
[XY] in two's complement = 1.11000100
XY = -0.00111100
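The step table of the slide is not available, so the following is a generic sketch of one-bit two's-complement (Booth) multiplication on integers, not the slide's exact register layout. The fractions are scaled by 2^4, so X = 0.1010 becomes 10, Y = -0.0110 becomes -6, and the product is scaled by 2^8. The function name and the restriction that x must not be the most negative n-bit value are assumptions of this sketch.

```python
def booth_multiply(x, y, n):
    """Multiply two signed n-bit integers with Booth's one-bit algorithm.

    Assumes x is not the most negative n-bit value (-2**(n-1)).
    """
    mask = (1 << n) - 1
    m = x & mask          # multiplicand in two's complement
    a = 0                 # accumulator (high half of the product)
    q = y & mask          # multiplier (becomes the low half of the product)
    q_1 = 0               # extra bit to the right of q
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask         # start of a run of 1s: subtract x
        elif pair == (0, 1):
            a = (a + m) & mask         # end of a run of 1s: add x
        # Arithmetic right shift of the combined register a:q:q_1.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
    product = (a << n) | q
    if product >> (2 * n - 1):         # negative 2n-bit result
        product -= 1 << (2 * n)
    return product


scaled = booth_multiply(10, -6, 5)     # 0.1010 and -0.0110 scaled by 2^4
print(scaled, scaled / 2**8)
```

The scaled result -60 corresponds to -60/256 = -0.00111100 in binary, in agreement with the answer above.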
3.21 X = 0.10110, Y = 0.11111. Using one-bit two's-complement division by the add/subtract alternating (non-restoring) method, compute the quotient X/Y.

Analysis and solution (two sign bits are used):
X = 0.10110:   [X] = 00.10110
Y = 0.11111:   [Y] = 00.11111
-Y = -0.11111: sign-magnitude 10.11111, two's complement [-Y] = 11.00001

(The step table is not recoverable from the extracted slide. Each step adds [Y] or [-Y] in two's complement to the partial remainder and shifts left.)

[X/Y] in two's complement = 0.10111
X/Y = 0.10111
3.27 An arithmetic unit consists only of one adder and two edge-triggered D-type registers A and B. Both A and B can receive the adder output, and A can also receive external data, as shown in the figure (not reproduced).
Questions:
1) How can external data be transferred to B?
2) How are A+B->A and A+B->B implemented?
3) How can the addition time be estimated?
4) If A and B were latches, what problem would arise in implementing A+B->A and A+B->B?

Analysis and solution:
1) How can external data be transferred to B? (The answer is given only in the slide figure.)
2) Implementing A+B->A:
Load D, S = 1: D->A
CPA, CPB pulses: A+B->Sum
S = 0: Sum->A
A+B->B is implemented in the same way.
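3.28 A serial adder computes the sum of two n-bit operands held in registers A and B. Draw a logic diagram that implements (A)+(B)->A. Only a single one-bit adder may be used, and the computation proceeds bit by bit.

Analysis and solution:
A single one-bit adder => the bits are processed serially, so the registers need a shift function.
The carry out of bit i is added together with the operand bits of position i+1 => a full adder is required.
n-bit addition => a counter determines when the addition is finished.
(Logic diagram not reproduced.)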
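Fundamentals of Computer System Architecture

1-2. Consider a computer implemented by interpretation that can be divided by function into 4 levels. To execute one instruction, each level needs N instructions of the level below it for interpretation. If executing one level-1 instruction takes K ns, how long does it take to execute one instruction at levels 2, 3 and 4?

Analysis and solution: N*K ns, N^2*K ns and N^3*K ns respectively.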
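Instruction Set Design and Optimization: Exercises

2-13. The usage frequencies of the 14 instructions of a machine are: 0.01, 0.15, 0.12, 0.03, 0.02, 0.04, 0.02, 0.04, 0.01, 0.13, 0.15, 0.14, 0.11, 0.03. Find the average opcode length for each of three encodings: fixed-length binary code, Huffman code, and an extended opcode restricted to two code lengths.

Analysis and solution:
Fixed-length code: number of bits = ceil(log2 n).
Huffman code: average length = sum of p_i * l_i.

With 14 instructions:
Fixed-length code: ceil(log2 14) = 4 bits.
Huffman code: average length = sum of p_i * l_i = 3.38 bits.
Extended opcode (3/5 scheme):
000 to 101 (3 bits): 0.15, 0.15, 0.14, 0.13, 0.12, 0.11
110XX and 111XX (5 bits): 0.03, 0.02, 0.04, 0.02, 0.04, 0.01, 0.03, 0.01
Average length = sum of p_i * l_i = 3 * 0.8 + 5 * 0.2 = 3.4 bits.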
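The three averages can be recomputed with a short sketch. It relies on the fact that the average Huffman code length equals the sum of the weights of all internal tree nodes; frequencies are kept as integer percentages to avoid rounding noise, and the function name is illustrative.

```python
import heapq
import math

FREQ_PERCENT = [1, 15, 12, 3, 2, 4, 2, 4, 1, 13, 15, 14, 11, 3]


def huffman_weighted_length(weights):
    """Sum of p_i * l_i for a Huffman code, as the sum of internal node weights."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged                # every merge adds one bit to its leaves
        heapq.heappush(heap, merged)
    return total


fixed_length = math.ceil(math.log2(len(FREQ_PERCENT)))
huffman_average = huffman_weighted_length(FREQ_PERCENT) / 100
short = sorted(FREQ_PERCENT, reverse=True)[:6]          # six 3-bit opcodes
extended_average = (3 * sum(short) + 5 * (100 - sum(short))) / 100

print(fixed_length, huffman_average, extended_average)
```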
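2-14. A model machine has 9 instructions with the following usage frequencies:

ADD   SUB   CLA   JMP   STO   JOM   CIL   SHR   STP
0.30  0.24  0.20  0.07  0.07  0.06  0.03  0.02  0.01

Requirements: there are two instruction word lengths, both laid out in the two-operand-address format. An extended opcode is used, restricted to two code lengths. The machine has several general-purpose registers. Main memory is 16 bits wide, byte-addressed, and stores data on integer boundaries; any instruction is fetched in one main-memory cycle. Short instructions are register-register, long instructions are register-memory, and the memory address must support indexed addressing.

1) Design the Huffman opcode from the usage frequencies alone and compute the average code length.
2) Taking the other requirements into account, design the optimized opcode and compute the code length.
3) How many addressable general-purpose registers can the machine have?
4) Draw both instruction word formats and mark the width of each field.
5) What is the maximum relative displacement, in bytes, for memory operand addressing?

Analysis and solution: build the Huffman tree to solve 1). The key to the remaining parts is to determine the two instruction word formats and the width of each field.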
1) Huffman tree (figure not reproduced): the leaves are the nine frequencies above, and the internal node weights are 0.03, 0.06, 0.12, 0.14, 0.26, 0.44, 0.56 and 1.00.

Huffman encoding:

Instruction  Opcode
ADD          00
SUB          10
CLA          11
JMP          0100
STO          0101
JOM          0110
CIL          01110
SHR          011110
STP          011111

Average length = sum of p_i * l_i = 2.61 bits
2) Analysis and solution: two code lengths are required. ADD (0.30), SUB (0.24) and CLA (0.20) have relatively high frequencies, so the short code should be 2 bits long, which gives 2^2 = 4 code points. Three are used for these instructions and the remaining one serves as the extension flag. There are 6 low-frequency instructions, so 3 more bits are needed to cover them. The long opcode is therefore 5 bits, which gives the extended opcode below.

Instruction  Huffman code  Extended opcode
ADD          00            00
SUB          10            01
CLA          11            10
JMP          0100          11000
STO          0101          11001
JOM          0110          11010
CIL          01110         11011
SHR          011110        11100
STP          011111        11101

Parts 3) to 5), analysis and solution:
3) How many addressable general-purpose registers can the machine have?
Given that both instruction types are fetched in one main-memory cycle and that main memory is 16 bits wide => a long instruction is at most 16 bits.
Given byte addressing with storage on integer boundaries => a short instruction can only be 8 bits and a long instruction 16 bits.
Given that short instructions are register-register and long instructions are register-memory => instructions are laid out with two operands.

4) Instruction word formats with field widths:
Short instruction (register-register):
OP (2 bits) | register 1 (3 bits) | register 2 (3 bits)
Long instruction (register-memory, memory address with indexed addressing):
OP (5 bits) | register number (3 bits) | index register (3 bits) | relative displacement (5 bits)

5) Number of general-purpose registers allowed: 2^3 = 8. Maximum relative displacement: 2^5 = 32 bytes.
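As a check on exercise 2-14, the sketch below computes the average length of the Huffman code and of the 2/5 extended opcode and verifies that both are prefix-free. The dictionaries restate the code tables of the exercise; the helper names are illustrative, and the 2.78-bit figure for the extended code is computed here rather than quoted from the slides.

```python
FREQ = {"ADD": 0.30, "SUB": 0.24, "CLA": 0.20, "JMP": 0.07, "STO": 0.07,
        "JOM": 0.06, "CIL": 0.03, "SHR": 0.02, "STP": 0.01}

HUFFMAN = {"ADD": "00", "SUB": "10", "CLA": "11", "JMP": "0100",
           "STO": "0101", "JOM": "0110", "CIL": "01110",
           "SHR": "011110", "STP": "011111"}

EXTENDED = {"ADD": "00", "SUB": "01", "CLA": "10", "JMP": "11000",
            "STO": "11001", "JOM": "11010", "CIL": "11011",
            "SHR": "11100", "STP": "11101"}


def average_length(codes):
    """Average opcode length: sum of p_i * l_i."""
    return sum(FREQ[name] * len(code) for name, code in codes.items())


def is_prefix_free(codes):
    """True if no opcode is a prefix of another opcode."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


print(round(average_length(HUFFMAN), 2), round(average_length(EXTENDED), 2))
```

With these tables the extended opcode costs about 0.17 bits per instruction more than the Huffman code, in exchange for only two opcode lengths.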
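2-15. A machine has 16-bit instruction words and two classes of instructions, one-address and two-address. If every address field is 6 bits and there are x two-address instructions, what is the maximum number of one-address instructions?

Analysis and solution: the basis is that in an extended code no short code may be a prefix of a long code.
Two-address instructions: the format is opcode (4 bits) | address (6 bits) | address (6 bits). A 4-bit opcode gives 2^4 = 16 short opcodes; the x two-address instructions occupy x code points, which leaves 16 - x as extension flags.
One-address instructions: the opcode is 10 bits, so each extension flag expands into 2^6 opcodes through the 6 additional bits. At most (16 - x) * 2^6 one-address instructions can therefore be encoded.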
(Figure: two-address instruction format with the fields opcode | address 1 | address 2.)

Computer Organization and Architecture: Worked Exercises (2)

Main Memory

4.5 A 512K*16 memory is built from 64K*1 2164 RAM chips (each chip is organized internally as four 128*128 arrays).
(1) How many RAM chips are needed in total?
(2) With distributed refresh, if each cell must be refreshed at least every 2 ms, what is the period of the refresh signal?
(3) With concentrated refresh and a read/write cycle T = 0.1 μs, what is the minimum time needed to refresh the whole memory once?

Analysis and solution:
(1) 64K*1 => 512K*16
Bit expansion: 16/1 = 16 chips
Word expansion: 512/64 = 8 groups
Total: 16 * 8 = 128 chips
(2) Distributed refresh: each 2164 RAM consists of four 128*128 arrays, so 128 rows must be refreshed within 2 ms:
2 ms / 128 = 15.625 μs
(3) Concentrated refresh:
0.1 μs * 128 = 12.8 μs
4.6 In a certain machine a ROM region occupies the address space 0000H-1FFFH. RAM chips (8K*4) are now used to form a 16K*8 RAM region starting at address 2000H. The RAM chips have CS# and WE# control inputs. The CPU address bus is A15-A0, the data bus is D7-D0, and the control signals are R/W (read/write) and MREQ# (during a memory read or write, this signal indicates that the address on the address bus is valid). Draw the logic diagram.

Analysis and solution: ROM (0000H-1FFFH) + RAM (16K*8)
ROM capacity: 8K*8
The RAM is built from 8K*4 chips: bit expansion needs 2 chips per group and word expansion needs 2 groups.
RAM1 address space: 2000H-3FFFH
RAM2 address space: 4000H-5FFFH

Region  First address (A15-A0)  Last address (A15-A0)
ROM     0000 0000 0000 0000     0001 1111 1111 1111
RAM1    0010 0000 0000 0000     0011 1111 1111 1111
RAM2    0100 0000 0000 0000     0101 1111 1111 1111

(Logic diagram not reproduced.)
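The address ranges above differ only in A15-A13, so chip selection reduces to decoding those three bits. The following sketch models that decoding in software; the table and function names are illustrative and the model ignores MREQ# and R/W timing.

```python
# A15-A13 value -> selected region, following the address map of exercise 4.6.
REGIONS = {0b000: "ROM", 0b001: "RAM1", 0b010: "RAM2"}


def select_region(address):
    """Return the region selected by a 16-bit address, or None if unmapped."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in A15-A0")
    return REGIONS.get(address >> 13)  # the top three address bits


for addr in (0x0000, 0x1FFF, 0x2000, 0x3FFF, 0x4000, 0x5FFF, 0x6000):
    print(f"{addr:04X}H -> {select_region(addr)}")
```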
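Secondary Storage

8.3 A disk store has an average seek time t_S, rotates at r revolutions per minute, holds N words per track, and uses information blocks of n words. Derive a formula for the total time t_B needed to read or write one block.

Analysis and solution (times in seconds):
Seek time: t_S
Rotational latency: at r revolutions per minute one revolution takes 60/r, so the average wait is half a revolution, 60/(2r)
Time to read or write one word: 60/(rN) => time to read or write n words: 60n/(rN)
Total time: t_B = t_S + 60/(2r) + 60n/(rN)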
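8.5 A disk pack has 11 platters with two recording surfaces each. The storage area has an inner diameter of 2.36 inches and an outer diameter of 5.00 inches. The track density is 1250 TPI (tracks per inch), the bit density of the innermost track is 52400 bpi (bits per inch), and the rotation speed is 2400 rpm.
(1) How many recording surfaces are usable?
(2) How many cylinders are there?
(3) How many bytes does each track hold, and what is the total capacity of the pack?
(4) What is the data transfer rate?
(5) If each sector stores 2 KB of data, how is a disk address expressed in a seek command?
(6) If a file is longer than one track, should it be recorded on the same surface or in the same cylinder?

Analysis and solution:
(1) 11 platters with two surfaces each => 11 * 2 - 2 = 20 usable surfaces
(2) Number of cylinders = (5.00 - 2.36) / 2 * 1250 TPI = 1650
(3) Bytes per track = inner bit density * innermost track length / 8 = 52400 bpi * 2.36 * pi / 8, about 48.5 KB
Total capacity = bytes per track * tracks per surface * surfaces = 48.5 KB * 1650 * 20 = 1600500 KB, about 1.6 GB
(4) 2400 rpm = 40 revolutions per second
Data transfer rate = bytes per track * rotation speed = 48.5 KB * 40 per second = 1940 KB/s = 1.94 MB/s
(5) Each sector stores 2 KB of data.
Surface: 20 surfaces => 5 bits
Cylinder (track): 1650 cylinders => 11 bits
Sector: sectors per track = 48.5 KB / 2 KB = 24.25, which the slide rounds to 25 sectors => 5 bits
Total: 5 + 11 + 5 = 21 bits of disk address
(6) In the same cylinder, so that the whole file can be read or written in one access without further seeks.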
Central Processing Unit (CPU)

6.1 The CPU structure is shown in the figure (not reproduced). It has an accumulator AC, a status condition register and four other registers. The lines between the parts are data paths and the arrows show the direction of information transfer.
(1) Name the four registers a, b, c and d in the figure.
(2) Briefly describe the data path by which an instruction travels from main memory to the control unit.
(3) Briefly describe the data paths for store and load accesses between the arithmetic unit and main memory.

Analysis and solution:
(1) The four registers are PC, DR, IR and AR; their assignment to a-d is marked on the slide figure.
(2) and (3) are answered on the slide figures.
6.2 The logic diagram of a computer's arithmetic and control unit is given in Figure 6.8 and the meaning of the control signals in Table 6.1. The instruction format and microinstruction format are shown below, where bits 1-23 correspond to control signals 1-23 of Table 6.1. Write the microprogram encoding of the following three instructions:
(1) JMP (unconditional jump to (rs1)+disp)
(2) Load (fetch from the memory location addressed by (rs1)+disp and save the value in rs)
(3) Store (write the contents of rs to the memory location addressed by (rs1)+disp)

Instruction format: opcode | rs, rd | rs1 | imm or disp
Microinstruction format: bits 1 2 ... 23 (control field) | bits 24 ... 25 (next-address field)

Analysis and solution:
(The microprogram tables for JMP, LOAD and STORE are not recoverable from the extracted slides. Their columns are the control signals PC+1, rs1->GR, (rs1)->ALU, ALU->AR, PC->AB, ALU->PC, imm(disp)->ALU, DB->IR, DB->DR, DR->DB, rs,rd->GR, (rs)->ALU, DR->ALU, +, l>GR, ALU->DR, AR->AB, ADS, M/IO# and W/R#.)
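6.6 A machine has 8 microinstructions I1-I8; the micro-command control signals contained in each are listed in the table (not reproduced). The letters a-j denote 10 micro-command signals of different kinds. Assuming the control field of a microinstruction is 8 bits, lay out the format of the control field.

Analysis and solution: an 8-bit control field requires the decoding method.
Direct control (4 bits): a, c, d, g
Decoded field (2 bits): 01 = e, 10 = f, 11 = h
Decoded field (2 bits): 01 = b, 10 = i, 11 = j
The decoded signals are generated when a, c, d and g are all 0.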
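6.7 A machine uses microprogrammed control with a control store of 512*48 bits. The microinstruction word is 48 bits, the microprogram can branch anywhere in the control store, and 4 conditions (directly controlled) can steer microprogram branching. The microinstruction uses the horizontal format shown below.
(1) How many bits does each of the three fields have?
(2) Draw the block diagram of a microprogram controller built around this microinstruction format.

Microinstruction format: micro-operation field (operation control) | test field and next-address field (sequence control)

Analysis and solution:
Control store of 512*48 with a 48-bit microinstruction word => the control store has 512 locations, so a full address needs 9 bits.
4 directly controlled branch conditions => 4 bits.
Micro-operation field: 48 - 9 - 4 = 35 bits | test field: 4 bits | next-address field: 9 bits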
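6.15 A microprocessor has a 16 MHz clock. An instruction takes two machine cycles on average and each machine cycle consists of two clock pulses.
(1) With "0-wait" memory, find the machine speed.
(2) If one of every two machine cycles is a memory access cycle that needs 1 clock cycle of wait inserted, find the machine speed.
("0 wait" means that the memory completes a read or write within one machine cycle, so no wait time has to be inserted.)

Analysis and solution: an instruction takes two machine cycles on average and each machine cycle has two clock pulses.
(1) 0-wait memory: 16 MHz => 16 M pulses/s => 8 M machine cycles/s => 4 M instructions/s => 4 MIPS
(2) The slide counts the memory access cycle as 2 machine cycles; together with the other machine cycle an instruction takes 3 machine cycles => 6 clock pulses => 16/6 MIPS = 2.67 MIPS. (If exactly one clock of wait is inserted, the instruction takes 2 + 3 = 5 clock pulses, which would give 16/5 = 3.2 MIPS.)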
Memory System

7.3 A pipelined computer has a unified cache for instructions and data. The cache read/write time is 10 ns, the main memory read/write time is 100 ns, the hit rate for instruction fetches is 98% and the hit rate for data is 95%. During program execution about 1/5 of the instructions load or store one operand. For simplicity, assume the instruction pipeline never stalls. How many times faster is the computer with the cache than without it?

Analysis and solution:
With the cache:
Instruction fetch time = 10 ns * 98% + (10 ns + 100 ns) * 2% = 12 ns
Data access time = (10 ns * 95% + (10 ns + 100 ns) * 5%) * 1/5 = 3 ns
Average memory access time = instruction fetch time + data access time = 15 ns
Without the cache:
Average memory access time = 100 ns + 100 ns * 1/5 = 120 ns
Speedup = 120 ns / 15 ns = 8
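The arithmetic of exercise 7.3 follows one pattern: a hit costs the cache time and a miss costs cache time plus memory time. A minimal sketch, with illustrative function and parameter names:

```python
def access_time(hit_rate, cache_ns=10.0, memory_ns=100.0):
    """Average time of one access: hit in the cache, or miss and go to memory."""
    return hit_rate * cache_ns + (1 - hit_rate) * (cache_ns + memory_ns)


def time_with_cache(fetch_hit=0.98, data_hit=0.95, data_fraction=0.2):
    """Memory time per instruction: one fetch plus data_fraction data accesses."""
    return access_time(fetch_hit) + data_fraction * access_time(data_hit)


def time_without_cache(memory_ns=100.0, data_fraction=0.2):
    """Memory time per instruction when every access goes to main memory."""
    return memory_ns + data_fraction * memory_ns


speedup = time_without_cache() / time_with_cache()
print(time_with_cache(), time_without_cache(), speedup)
```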
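7.5 A computer's cache uses 4-way set-associative mapping. The cache capacity is 16 KB, the main memory capacity is 2 MB, each block has 8 words and each word has 32 bits.
(1) How many bits does a main-memory address have (byte addressing), and how is it divided into fields (how many bits each)?
(2) The cache is initially empty. The CPU reads 101 words from main-memory locations 0, 1, ..., 100 in order (main memory delivers one word per read) and repeats this sequence for 11 passes in total. What is the hit rate? If the cache is 5 times as fast as main memory, how much faster is the system with the cache than without it?

Analysis and solution:
(1) Main memory capacity 2 MB = 2 * 2^20 B = 2^21 B => 21-bit main-memory address.
Each word has 4 bytes = 2^2 bytes => 2-bit byte field.
Each block has 8 = 2^3 words => 3-bit word-in-block field.
Main memory is divided into 2 MB / 16 KB = 2^7 regions.
Address fields: high-order main-memory address 9 bits | set number 7 bits | word in block 3 bits | byte 2 bits
(2) The first pass misses and the following 10 passes hit => hit rate = 10/11 = 91%.
Taking one cache access as 1 time unit and one main-memory access as 5, the speedup over no cache = (11 * 5) / (10 * 1 + 1 * 5) = 55/15 = 3.67.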
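7.6 A computer uses a direct-mapped cache with a capacity of 4096 B.
(1) If the CPU fetches instructions alternately from main-memory locations 0, 1, ..., 99 and 4096, 4097, ..., 4195 and executes this loop 10 times, what is the hit rate?
(2) If the cache access time is 10 ns, the main memory access time is 100 ns and the cache hit rate is 95%, find the average access time.

Analysis and solution:
(1) Hit rate = 0. Locations that are 4096 apart map to the same cache line, so with alternating fetches each access evicts the block that the next access to that line needs.
(2) Average access time = 10 * 95% + (100 + 10) * 5% = 15 ns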
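The zero hit rate in part (1) can be reproduced with a tiny direct-mapped cache model. This is a sketch under two assumptions that the exercise does not state: "alternately" is read as the interleaved order 0, 4096, 1, 4097, ..., and the block size is a free parameter (4 bytes by default).

```python
def direct_mapped_hit_rate(addresses, cache_bytes=4096, block_bytes=4):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    lines = cache_bytes // block_bytes
    tags = [None] * lines              # tag currently stored in each line
    hits = 0
    for addr in addresses:
        block = addr // block_bytes
        index = block % lines          # line the block maps to
        tag = block // lines
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag          # miss: replace the line
    return hits / len(addresses)


trace = []
for _ in range(10):                    # the loop is executed 10 times
    for i in range(100):
        trace += [i, 4096 + i]         # interleaved fetches, 4096 apart

print(direct_mapped_hit_rate(trace))
```

Under this access order the result does not depend on the block size, because the two streams always collide on the same line.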
The Processor: Datapath and Control

5.5 We wish to add the addi (add immediate) instruction to the single-cycle datapath described in this chapter. Add the necessary datapaths and control signals to the single-cycle datapath in the figure below (not reproduced).

Analysis and solution:
addi instruction format: opcode 8 (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)
addi rt, rs, imm performs rs + imm -> rt
(The modified datapath is shown only in the slide figure.)
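Enhancing Performance with Pipelining

6.2 Using a pipeline diagram, show the forwarding paths needed to execute the following three instructions:
Add $2, $3, $4
Add $4, $5, $6
Add $5, $3, $4

Analysis and solution: (Pipeline diagram with the stages IF, ID, EX, MEM and WB not reproduced.) The only register written by one instruction and read by a later one is $4: the second instruction writes it and the third reads it, so the value must be forwarded from the ALU result of the second instruction to the EX stage of the third.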
Computer Architecture: A Quantitative Approach

1.10 Assume the two programs in Figure 1.15 each execute 100 million floating-point operations during execution on each of the three machines. If performance is expressed as a rate, then the average that tracks total execution time is the harmonic mean

n / (sum over i = 1..n of 1/Rate_i)

where Rate_i is a function of 1/Time_i, the execution time for the i-th of n programs in the workload.

Qa. Calculate the MFLOPS rating of each program.
Qb. Calculate the arithmetic, geometric, and harmonic means of MFLOPS for each machine.
Qc. Which of the three means matches the relative performance of total execution time?

Figure 1.15
                   Computer A  Computer B  Computer C
Program P1 (secs)  1           10          20
Program P2 (secs)  1000        100         20
Total time (secs)  1001        110         40

Answer a.
MFLOPS = number of floating-point operations in a program / (execution time in seconds * 10^6)
Each program executes 100 million floating-point operations.

         Computer A      Computer B      Computer C
Program  Time   MFLOPS   Time   MFLOPS   Time   MFLOPS
P1       1      100      10     10       20     5
P2       1000   0.1      100    1        20     5

Answer b.
Arithmetic mean = (sum over i = 1..n of a_i) / n
Geometric mean = (product over i = 1..n of a_i)^(1/n)
Harmonic mean = n / (sum over i = 1..n of 1/a_i)

Mean                         Computer A  Computer B  Computer C
Arithmetic                   50.1        5.5         5.0
Harmonic                     0.2         1.8         5.0
Geometric (normalized to A)  1.0         1.0         1.6
Geometric (normalized to B)  1.0         1.0         1.6
Geometric (normalized to C)  0.6         0.6         1.0

Answer c.
The arithmetic mean of MFLOPS rates trends inversely with total execution time. The geometric means, regardless of which normalization is used, do not show the differences in total execution time. The harmonic mean tracks total execution time best.
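The means of exercise 1.10 can be recomputed from the execution times of Figure 1.15. The sketch below uses illustrative names; it also shows why the harmonic mean tracks total time: with equal work per program it equals the total work divided by the total time.

```python
import math

TIMES = {"A": [1, 1000], "B": [10, 100], "C": [20, 20]}   # seconds for P1, P2
MFLOP_PER_PROGRAM = 100                                   # 100 million operations


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))


def harmonic_mean(values):
    return len(values) / sum(1 / v for v in values)


rates = {m: [MFLOP_PER_PROGRAM / t for t in ts] for m, ts in TIMES.items()}

for machine, r in rates.items():
    print(machine, round(arithmetic_mean(r), 2), round(geometric_mean(r), 2),
          round(harmonic_mean(r), 2), sum(TIMES[machine]))
```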
Appendix A.2 Use the following code fragment:

Loop: L.D    F0, 0(R2)
      L.D    F4, 0(R3)
      MUL.D  F0, F0, F4
      ADD.D  F2, F0, F2
      DADDUI R2, R2, #8
      DADDUI R3, R3, #8
      DSUBU  R5, R4, R2
      BNEZ   R5, Loop

Assume that the initial value of R4 is R2 + 792, so the loop executes 792 / 8 = 99 iterations.

For this exercise assume the standard five-stage integer pipeline and the MIPS FP pipeline as described in Section A.5. If structural hazards are due to write-back contention, assume the earliest instruction gets priority and other instructions are stalled.

Qa. Show the timing of this instruction sequence for the MIPS FP pipeline without any forwarding or bypassing hardware but assuming a register read and a write in the same clock cycle "forward" through the register file. Assume that the branch is handled by flushing the pipeline. If all memory references hit in the cache, how many cycles does this loop take to execute?

Qb. Show the timing of this instruction sequence for the MIPS FP pipeline with normal forwarding and bypassing hardware. Assume that the branch is handled by predicting it as not taken. If all memory references hit in the cache, how many cycles does this loop take to execute?

See the three kinds of hazards and forwarding: structural hazard, data hazard, control hazard.

Answer a. (without forwarding)
(Pipeline timing table not recoverable from the extracted slides. Rows: the eight loop instructions followed by the first L.D of the next iteration. Columns: clock cycles 1 to 27.)
Total loop execution time = 22 * 99 = 2178 clock cycles

Answer b. (with normal forwarding)
(Pipeline timing table not recoverable from the extracted slides. Rows: the eight loop instructions followed by the first L.D of the next iteration. Columns: clock cycles 1 to 23.)
Total loop execution time = 18 * 98 + 19 = 1783 clock cycles
Appendix A.3 Suppose the branch frequencies (as percentages of all instructions) are as follows:
Conditional branches: 15%
Jumps and calls: 1%
Conditional branches that are taken: 60%
We are examining a four-deep pipeline where the branch is resolved at the end of the second cycle for unconditional branches and at the end of the third cycle for conditional branches. Assuming that only the first pipe stage can always be done independent of whether the branch goes and ignoring other pipeline stalls, how much faster would the machine be without any branch hazards?

Answer
Pipeline CPI = ideal pipeline CPI + (structural hazard stalls + data hazard stalls + control hazard stalls)
Pipeline speedup = [1 / (1 + pipeline stalls)] * (clock cycle unpipelined / clock cycle pipelined) = pipeline depth / (1 + pipeline stalls)
No control hazards: ideal pipeline speedup = 4 / (1 + 0) = 4

With control hazards, assume 4 stages: IF, ID, EX and WB.

Handling a jump or call:
Clock cycle       1    2    3      4      5      6
Jump or call      IF   ID   EX     WB
i+1                    IF   IF     ID     EX     ...
i+2                         stall  IF     ID     ...
i+3                                stall  IF     ...

Handling a taken conditional branch:
Clock cycle       1    2    3      4      5      6
Taken branch      IF   ID   EX     WB
i+1                    IF   stall  IF     ID     ...
i+2                         stall  stall  IF     ...
i+3                                stall  stall  ...

Handling a not-taken conditional branch:
Clock cycle       1    2    3      4      5      6
Not-taken branch  IF   ID   EX     WB
i+1                    IF   stall  ID     EX     ...
i+2                         stall  IF     ID     ...
i+3                                stall  IF     ...

Summary of the three kinds of control flow instructions:
Control flow type        Frequency (per instruction)  Stall (cycles)
Jump and call            1%                           1
Conditional (taken)      15% * 60% = 9%               2
Conditional (not taken)  15% * 40% = 6%               1

Pipeline stalls (real) = (1 * 1%) + (2 * 9%) + (1 * 6%) = 0.25
Pipeline speedup (real) = 4 / (1 + 0.25) = 3.2
Speedup without control hazards = 4 / 3.2 = 1.25, that is, 25% faster
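The stall accounting of Appendix A.3 is a weighted sum over the three kinds of control flow instructions. A small sketch, with illustrative names:

```python
# (kind, frequency per instruction, stall cycles), as in the summary table.
BRANCH_CLASSES = [
    ("jump or call", 0.01, 1),
    ("conditional, taken", 0.15 * 0.60, 2),
    ("conditional, not taken", 0.15 * 0.40, 1),
]


def stall_cycles_per_instruction(classes):
    """Average control-hazard stall cycles added to each instruction."""
    return sum(freq * stall for _, freq, stall in classes)


def pipeline_speedup(depth, stalls):
    """Speedup over the unpipelined machine for a given stall rate."""
    return depth / (1 + stalls)


stalls = stall_cycles_per_instruction(BRANCH_CLASSES)
real = pipeline_speedup(4, stalls)
ideal = pipeline_speedup(4, 0)
print(stalls, real, ideal / real)
```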
5.19 Some memory systems handle TLB misses in software (as an exception), while others use hardware for TLB (Translation Lookaside Buffer) misses.
Qa. What are the trade-offs between the two methods for handling TLB misses?
Qb. Will TLB miss handling in software always be slower than TLB miss handling in hardware? Explain.
Qc. Are there page table structures that would be difficult to handle in hardware, but possible in software? Are there any such structures that would be difficult for software to handle but easy for hardware to manage?
Qd. Use the data from Figure 5.45 to calculate the penalty to CPI for TLB misses on the following workloads, assuming hardware TLB handlers require 10 cycles per miss and software TLB handlers take 30 cycles per miss: (50% gcc, 25% perl, 25% ijpeg), (30% swim, 30% wave5, 20% hydro2d, 10% gcc).
Qe. Are the TLB miss times in part (d) realistic? Discuss.
Qf. Why are TLB miss rates for floating-point programs generally higher than those for integer programs?

Answer a. Software handling is slower because of the overhead of switching to the handler code, but the replacement algorithm can be more sophisticated than in hardware, and a wider variety of virtual memory organizations can be readily accommodated. Hardware handling is faster but less flexible.

Answer b. Factors affecting the handling time include:
whether the page table is itself paged;
a more efficient page table search algorithm (software);
TLB entry prefetching (hardware).

Answer c. Page table structures that change dynamically would be difficult to handle in hardware but possible in software.

Adapted from Figure 5.45
                Cache misses per 1000 instructions  TLB misses per 1000 instr.
Program  CPI    I-Cache   L2-Cache                  I-TLB
gcc      0.63   3.43      0.25                      0.30
ijpeg    0.49   0.03      0.02                      0.10
perl     0.56   1.66      0.09                      0.26
swim     0.40   0.00      5.99                      0.10
wave5    0.74   0.17      1.72                      0.89
hydro2d  0.64   0.01      0.46                      0.19

Answer d.
First workload:
Program  Weight  TLB misses per 1000 instructions
gcc      50%     0.30
perl     25%     0.26
ijpeg    25%     0.10

Workload miss rate (WMR) = sum of Weight_i * (TLB misses per 1000 instructions)_i
= 50% * 0.3 + 25% * 0.26 + 25% * 0.1 = 0.24 per 1000 instructions
Penalty (hardware) = WMR * TLB miss handling time (10 cycles) = 2.4 cycles per 1000 instructions, CPI penalty = 0.0024 clocks per instruction
Penalty (software) = WMR * TLB miss handling time (30 cycles) = 7.2 cycles per 1000 instructions, CPI penalty = 0.0072 clocks per instruction

Second workload:
Program  Weight  TLB misses per 1000 instructions
swim     30%     0.10
wave5    30%     0.89
hydro2d  20%     0.19
gcc      10%     0.30

Workload miss rate (WMR) = 30% * 0.1 + 30% * 0.89 + 20% * 0.19 + 10% * 0.3 = 0.37 per 1000 instructions
Penalty (hardware) = WMR * TLB miss handling time (10 cycles) = 3.7 cycles per 1000 instructions, CPI penalty = 0.0037 clocks per instruction
Penalty (software) = WMR * TLB miss handling time (30 cycles) = 11.1 cycles per 1000 instructions, CPI penalty = 0.0111 clocks per instruction

Answer e. The TLB miss times are too small. Handling a TLB miss requires finding a page table entry in main memory and transferring it to the TLB. A main memory access typically takes on the order of 100 clocks, already much greater than the miss times in part (d).

Answer f. Floating-point programs often traverse large data structures and thus more often reference a large number of pages. It is thus more likely that the TLB will experience a higher rate of capacity misses.
3.2 Consider the following four MIPS code fragments, each containing two instructions:
i.   DADDI R1, R1, #4
     LD    R2, 7(R1)
ii.  DADD  R3, R1, R2
     SD    R2, 7(R1)
iii. SD    R2, 7(R1)
     S.D   F2, 200(R7)
iv.  BEZ   R1, place
     SD    R1, 7(R1)

a. For each fragment (i) to (iv) identify each type of dependence that exists or that may exist (a fragment may have no dependence) and describe what data flow, name reuse, or control structure causes or would cause the dependence. For a dependence that may exist, describe the source of the ambiguity and identify the time at which that uncertainty is resolved.
b. For each code fragment, discuss whether dynamic scheduling is, may be, or is not sufficient to allow out-of-order execution of the fragment.

Answer
Fragment i (DADDI R1, R1, #4; LD R2, 7(R1))
Data dependence: true dependence on R1.
Dynamic scheduling sufficient for out-of-order execution: No. Changing the instruction order will break program semantics.

Fragment ii (DADD R3, R1, R2; SD R2, 7(R1))
Data dependence: none.
Dynamic scheduling sufficient for out-of-order execution: Yes.

Fragment iii (SD R2, 7(R1); S.D F2, 200(R7))
Data dependence: an output dependence may exist.
Dynamic scheduling sufficient for out-of-order execution: Maybe. If the hardware computes the effective addresses early enough, then the store order may be exchanged.

Fragment iv (BEZ R1, place; SD R1, 7(R1))
Data dependence: none.
Dynamic scheduling sufficient for out-of-order execution: No. Changing the instruction order is speculative until the branch is resolved.
3.9 Increasing the size of a branch-prediction buffer means that it is less likely that two branches in a program will share the same predictor. A single predictor predicting a single branch instruction is generally more accurate than is that same predictor serving more than one branch instruction.
Qa. List a sequence of branch taken and not taken actions to show a simple example of 1-bit predictor sharing that reduces the misprediction rate.
Qb. List a sequence of branch taken and not taken actions to show a simple example of how sharing a 1-bit predictor increases the misprediction rate.

Answer a. Prediction accuracy increases: 0% -> 50%
(Tables not recoverable from the extracted slides. They compare branch B1 using predictor P alone with branches B1 and B2 sharing P, and list for each step the prediction, the outcome (T or NT) and whether the prediction was correct.)

Answer b. Prediction accuracy decreases: 100% -> 0%
(Tables not recoverable from the extracted slides; same layout as in answer a.)
3.14 Suppose we have a deeply pipelined processor, for which we implement a branch-target buffer for the conditional branches only. Assume that the misprediction penalty is always 4 cycles and the buffer miss penalty is always 3 cycles. Assume 90% hit rate and 90% accuracy and 15% branch frequency.
Q. How much faster is the processor with the branch-target buffer versus a processor that has a fixed 2-cycle branch penalty? Assume a base CPI without branch stalls of 1.

Branch-target buffer (BTB): the address of the branch is used as an index to get the prediction and the branch address (if taken).

Answer
CPI_BTB: system with a branch-target buffer
CPI_NBTB: system without a branch-target buffer
Speedup = CPI_NBTB / CPI_BTB = (CPI_base + Stall_NBTB) / (CPI_base + Stall_BTB)
CPI_base = 1 (exercise statement)
Stall = sum over the stall kinds s of Frequency_s * Penalty_s
Stall_NBTB = 15% * 2 = 0.3

BTB result  BTB prediction  Frequency (per instruction)  Penalty (cycles)
Miss        --              15% * 10% = 1.5%             3
Hit         Correct         15% * 90% * 90% = 12.1%      0
Hit         Incorrect       15% * 90% * 10% = 1.3%       4

Stall_BTB = 1.5% * 3 + 1.3% * 4 = 0.097
Speedup = (1 + 0.3) / (1 + 0.097) = 1.2, that is, about 20% faster
Scalar Processors: Exercises

5.3 Suppose the execution of an instruction is divided into three stages, "fetch", "analyze" and "execute", which take Δt, 2Δt and 3Δt respectively. For each of the following cases, write the expression for the time needed to execute n instructions in succession.
(1) Sequential execution.
(2) Only "fetch" and "execute" overlap.
(3) "Fetch", "analyze" and "execute" overlap.
(4) Advance control.

Analysis and solution:
Sequential execution: time for n instructions = n * (t_fetch + t_analyze + t_execute) = 6n Δt
"Execute" and "fetch" overlap: time for n instructions = t_fetch + n * t_analyze + (n - 1) * MAX{t_fetch, t_execute} + t_execute = (5n + 1) Δt
"Execute", "analyze" and "fetch" overlap: time for n instructions = t_fetch + MAX{t_fetch, t_analyze} + (n - 2) * MAX{t_fetch, t_analyze, t_execute} + MAX{t_analyze, t_execute} + t_execute = (3n + 3) Δt
Advance control: time for n instructions = t_fetch + t_analyze + n * t_execute = (3n + 3) Δt
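The four timing formulas of exercise 5.3 can be evaluated directly. In this sketch the times are given in units of Δt (1, 2 and 3), the function names are illustrative, and the three-stage overlap formula assumes n >= 2.

```python
def sequential(n, tf, ta, te):
    """No overlap: every instruction runs fetch, analyze, execute in turn."""
    return n * (tf + ta + te)


def overlap_fetch_execute(n, tf, ta, te):
    """Execute of instruction k overlaps fetch of instruction k+1."""
    return tf + n * ta + (n - 1) * max(tf, te) + te


def overlap_all(n, tf, ta, te):
    """Fetch, analyze and execute of consecutive instructions overlap (n >= 2)."""
    return tf + max(tf, ta) + (n - 2) * max(tf, ta, te) + max(ta, te) + te


def advance_control(n, tf, ta, te):
    """Advance control keeps the execute stage continuously busy."""
    return tf + ta + n * te


n = 10
print(sequential(n, 1, 2, 3), overlap_fetch_execute(n, 1, 2, 3),
      overlap_all(n, 1, 2, 3), advance_control(n, 1, 2, 3))
```

For these stage times the closed forms are 6n, 5n + 1, 3n + 3 and 3n + 3 in units of Δt.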
Interconnection Networks: Exercises

7.3 Sixteen processors numbered 0, 1, ..., 15 are connected by a single-stage interconnection network. To which processor is processor 13 connected when the interconnection function is (1) Cube3, (2) PM2+3, (3) PM2-0, (4) Shuffle, (5) Shuffle(Shuffle)?

Analysis and solution: the 16 processors are numbered with 4 binary digits P3P2P1P0.
(1) Cube3: P3P2P1P0 -> /P3 P2P1P0
(2) PM2+3: P(j) -> P((j + 2^3) mod 16)
(3) PM2-0: P(j) -> P((j - 2^0) mod 16)
(4) Shuffle: P3P2P1P0 -> P2P1P0P3
(5) Shuffle(Shuffle): P3P2P1P0 -> P1P0P3P2

Processor 13 is 1101:
(1) Cube3: 1101 -> 0101 => 5
(2) PM2+3: P(13) -> P((13 + 2^3) mod 16) = 5
(3) PM2-0: P(13) -> P((13 - 2^0) mod 16) = 12
(4) Shuffle: 1101 -> 1011 => 11
(5) Shuffle(Shuffle): 1101 -> 0111 => 7
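The five interconnection functions are simple bit operations on the 4-bit processor number. A sketch for 16 processors, with illustrative function names:

```python
N_BITS = 4
SIZE = 1 << N_BITS                     # 16 processors


def cube(j, k):
    """Cube_k: complement bit k of the processor number."""
    return j ^ (1 << k)


def pm2_plus(j, k):
    """PM2+k: add 2^k modulo the number of processors."""
    return (j + (1 << k)) % SIZE


def pm2_minus(j, k):
    """PM2-k: subtract 2^k modulo the number of processors."""
    return (j - (1 << k)) % SIZE


def shuffle(j):
    """Perfect shuffle: rotate the bits of the processor number left by one."""
    return ((j << 1) | (j >> (N_BITS - 1))) & (SIZE - 1)


p = 13
print(cube(p, 3), pm2_plus(p, 3), pm2_minus(p, 0), shuffle(p), shuffle(shuffle(p)))
```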
7.4 Among 16 processors numbered 0, 1, 2, ..., F, communication is required between the following pairs: (B, 1), (8, 2), (7, D), (6, C), (E, 4), (A, 0), (9, 3), (5, F). Choose the interconnection network type and control method, and draw the topology of the network and the switch states of each stage.

Analysis and solution:
Pair    Binary
(B, 1)  (1011, 0001)
(8, 2)  (1000, 0010)
(7, D)  (0111, 1101)
(6, C)  (0110, 1100)
(E, 4)  (1110, 0100)
(A, 0)  (1010, 0000)
(9, 3)  (1001, 0011)
(5, F)  (0101, 1111)

In every pair the two numbers differ exactly in bits 3 and 1, so the function is P3P2P1P0 -> /P3 P2 /P1 P0.
Use a Cube network with 4 stages and stage control.
Stages 1 and 3: exchange state. Stages 0 and 2: straight state.
Stage control signal: 0101 (from right to left the bits control stage 0 to stage 3).
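7.12 A parallel processor has 16 processing elements. It must implement the exchange function equivalent to first a 4-element exchange in 4 groups, then an 8-element exchange in 2 groups, and finally a 16-element exchange in 1 group. Write the general form of the interconnection function realized between the processors, draw the corresponding multistage network topology, and mark the switch states of each stage.

Analysis and solution:
Initial order:                      (0123 | 4567 | 89AB | CDEF)
After the 4-element exchange in 4 groups: (3210 | 7654 | BA98 | FEDC)
After the 8-element exchange in 2 groups: (4567 0123 | CDEF 89AB)
After the 16-element exchange in 1 group: (BA98 FEDC 3210 7654)

Resulting pairs:
(0, B)  (0000, 1011)
(1, A)  (0001, 1010)
(2, 9)  (0010, 1001)
(3, 8)  (0011, 1000)
(4, F)  (0100, 1111)
(5, E)  (0101, 1110)
(6, D)  (0110, 1101)
(7, C)  (0111, 1100)

Interconnection function: P3P2P1P0 -> /P3 P2 /P1 /P0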
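SIMD Computers

8.10 On a parallel processor with 16 PEs, a 16*16 two-dimensional array stored in an M-module parallel memory must allow conflict-free access to all elements of a row, a column, the main diagonal and the anti-diagonal. What is the minimum value of M? How should the array be stored in the memory? Write the general rule. Prove that this storage scheme also allows conflict-free access to the elements of any 4*4 subarray of the array.

Analysis and solution:
For a parallel processor with n PEs to access simultaneously and without conflict the elements of a row, a column, the main diagonal and the anti-diagonal of an n*n two-dimensional array, the number of memory modules M must be a prime number >= n with M = 2^(2p) + 1.
In the array, elements in adjacent rows of the same column are offset by a module distance δ1 = 2^p, and adjacent elements of the same row are offset by a module distance δ2 = 1.
In this example n = 16, so M = 17; since 17 = 2^(2*2) + 1, δ1 = 2^2 = 4 and δ2 = 1.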
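The storage rule can be checked by brute force. The sketch below assumes that element a[i][j] is placed in module (i * δ1 + j * δ2) mod M with element a[0][0] in module 0; the starting module and the helper names are assumptions of this sketch, not something stated on the slides.

```python
N, M = 16, 17          # array size and number of memory modules
D1, D2 = 4, 1          # module offsets between adjacent rows and columns


def module(i, j):
    """Memory module holding element a[i][j] under the skewed storage rule."""
    return (i * D1 + j * D2) % M


def conflict_free(cells):
    """True if all listed elements lie in different memory modules."""
    modules = [module(i, j) for i, j in cells]
    return len(set(modules)) == len(modules)


rows_ok = all(conflict_free([(i, j) for j in range(N)]) for i in range(N))
cols_ok = all(conflict_free([(i, j) for i in range(N)]) for j in range(N))
main_ok = conflict_free([(i, i) for i in range(N)])
anti_ok = conflict_free([(i, N - 1 - i) for i in range(N)])
blocks_ok = all(
    conflict_free([(i0 + a, j0 + b) for a in range(4) for b in range(4)])
    for i0 in range(N - 3) for j0 in range(N - 3)
)
print(rows_ok, cols_ok, main_ok, anti_ok, blocks_ok)
```

For a 4*4 subarray the offsets 4a + b with a, b in 0..3 take the 16 distinct values 0..15, which stay distinct modulo 17; this is the argument the exhaustive check confirms.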
