CMPEN 341 Second Midterm
10/28/2021
Question-1 [Parallelism] [28 pts] This question has five parts [a through e]
Consider two threads, T1 and T2, with the following code (‘$zero’ represents a register that is hardwired to the value 0):
T1
my_again:
lw r2, 0(r1)
lw r4, 4(r1)
add r2, r2, r8
add r4, r4, r8
sw r2, 4(r1)
sw r4, 0(r1)
sub r1, r1, -4
bne r1, $zero, my_again
T2
your_again:
lw r2, 0(r1)
add r2, r2, r9
sw r2, 0(r1)
lw r4, 4(r1)
add r4, r4, r9
sw r4, 4(r1)
sub r1, r1, -12
bne r1, $zero, your_again
Each of these threads is to be separately scheduled and executed on a two-slot VLIW machine with the goal of achieving a minimum number of cycles. A ‘bundle’ in this architecture has two instructions. The first of these instructions can be only an ALU or branch instruction, whereas the second one can be only a load or store instruction. You are allowed to reorder independent instructions and change the offset of addressing (if needed). You are not allowed to combine instructions.
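The slot restriction described above can be written down as a small legality check. The following is a minimal Python sketch and not part of the exam; the opcode sets, the `nop` filler for an empty slot, and the helper names `opcode` and `bundle_is_legal` are assumptions made for illustration.

```python
# Hypothetical helper: checks only the two-slot restriction stated in the question.
# Slot 0 may hold an ALU or branch instruction; slot 1 may hold a load or store.
# "nop" is an assumed filler for an empty slot (the exam does not mention it).
ALU_OR_BRANCH = {"add", "sub", "bne", "nop"}
LOAD_OR_STORE = {"lw", "sw", "nop"}


def opcode(instruction: str) -> str:
    """Return the mnemonic, e.g. 'lw' for 'lw r2, 0(r1)'."""
    return instruction.split()[0]


def bundle_is_legal(slot0: str, slot1: str) -> bool:
    """True if the two instructions respect the per-slot restrictions."""
    return opcode(slot0) in ALU_OR_BRANCH and opcode(slot1) in LOAD_OR_STORE


print(bundle_is_legal("add r2, r2, r8", "lw r4, 4(r1)"))  # ALU op with a load
print(bundle_is_legal("lw r2, 0(r1)", "add r2, r2, r8"))  # load placed in slot 0
```

The check covers slot types only. Whether two instructions may share a cycle also depends on the data dependences between them, which this sketch does not model.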
(a) Map T1 to this VLIW machine. Explain each instruction-to-execution-slot mapping decision you make in sufficient detail (i.e., why you decided so, and why the instruction could not be scheduled in an earlier slot (cycle)). [5 pts]
(b) Map T2 to this VLIW machine. Explain each instruction-to-execution-slot mapping decision you make in sufficient detail (i.e., why you decided so, and why the instruction could not be scheduled in an earlier slot (cycle)). [5 pts]
(c) Repeat (a), but this time assuming that any instruction can be mapped to any execution slot. [5 pts]
(d) Repeat (b), but this time assuming that any instruction can be mapped to any execution slot. [5 pts]
(e) Suppose you decided to move to a simultaneous multi-threading architecture (SMT). The SMT architecture you are considering can execute up to 4 instructions in parallel. Further, in a given cycle, any combination of independent instructions (from the same or different threads) can be executed in parallel. Show a scheduling of these two threads together on the SMT machine with the goal of improving throughput. [8 pts]
Question-2 [Branch Prediction and Stalls] [21 pts] This question has five parts [a through e]
(a) Explain the functionalities of the following hardware components: BTB (Branch Target Buffer) and BHT (Branch History Table). Discuss how these two components complement each other. [2 pts]
(b) Consider a branch that has the following outcome pattern (T for taken, N for not taken).
N T N N T T T N N T T T N T N N
How many branches are predicted correctly with a static (0-bit) always-taken branch predictor for this branch outcome pattern? [5 pts]
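An answer to this part can be checked mechanically. The sketch below is a minimal Python illustration, not part of the exam; the names `OUTCOMES` and `count_correct_static` are hypothetical. A static predictor never changes its guess, so it is correct exactly when the actual outcome equals the fixed prediction.

```python
# Outcome pattern copied from the question: T = taken, N = not taken.
OUTCOMES = "N T N N T T T N N T T T N T N N".split()


def count_correct_static(outcomes, prediction="T"):
    """Count the outcomes that match a fixed (static) prediction."""
    return sum(1 for actual in outcomes if actual == prediction)


print(count_correct_static(OUTCOMES))  # always-taken predictor
```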
(c) Using the same sequence as in part (b), how many branches are predicted correctly with a dynamic 1-bit predictor whose initial state is Taken (T)? [5 pts]
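A 1-bit predictor can be simulated in a few lines. The following is a minimal Python sketch, not part of the exam, and it assumes the usual 1-bit scheme in which the predictor simply repeats the most recent outcome; the names `OUTCOMES` and `count_correct_one_bit` are hypothetical.

```python
# Outcome pattern copied from part (b): T = taken, N = not taken.
OUTCOMES = "N T N N T T T N N T T T N T N N".split()


def count_correct_one_bit(outcomes, initial="T"):
    """Simulate a 1-bit predictor: predict the stored bit, then store the outcome."""
    state = initial
    correct = 0
    for actual in outcomes:
        if state == actual:
            correct += 1
        state = actual  # the single bit always remembers the last outcome
    return correct


print(count_correct_one_bit(OUTCOMES, initial="T"))
```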
(d) How many branches are predicted correctly with a dynamic 2-bit saturating-counter predictor for the branch outcome pattern in part (b)? Suppose the four states are strong not taken (SN), weak not taken (wn), weak taken (wt), and strong taken (ST). Assume that the initial prediction state is wt (weak taken). [5 pts]
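The four-state predictor can be modeled as a counter from 0 to 3. The sketch below is a minimal Python illustration, not part of the exam. It assumes a plain saturating counter (move one state toward ST on a taken branch, one state toward SN on a not-taken branch, and predict taken in wt or ST); the names `STATES` and `count_correct_two_bit` are hypothetical.

```python
# Outcome pattern copied from part (b): T = taken, N = not taken.
OUTCOMES = "N T N N T T T N N T T T N T N N".split()

# Counter values 0..3 map to the four states named in the question.
STATES = ["SN", "wn", "wt", "ST"]


def count_correct_two_bit(outcomes, initial="wt"):
    """Simulate a 2-bit saturating counter and count correct predictions."""
    counter = STATES.index(initial)
    correct = 0
    for actual in outcomes:
        predicted = "T" if counter >= 2 else "N"  # wt and ST predict taken
        if predicted == actual:
            correct += 1
        if actual == "T":
            counter = min(counter + 1, 3)  # saturate at ST
        else:
            counter = max(counter - 1, 0)  # saturate at SN
    return correct


print(count_correct_two_bit(OUTCOMES, initial="wt"))
```

Other 2-bit designs exist (for example, ones that jump directly between the strong states after two mispredictions), and they can give a different count for the same pattern.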
(e) What are the fundamental differences between ‘pipeline stall’ and ‘pipeline flush’? Provide examples to highlight what instructions will cause stall and what instructions will cause flush, and why. [4 pts]
Question-3 [Load/Store Queues] [25 pts] This question has three parts [a through c]
Consider the following sequence of instructions:
add r1, r6, r1
sw r1, 0(r12)
lw r7, 8(r9)
lw r6, -4(r10)
lw r8, 4(r11)
add r4, r6, r7
add r8, r8, r4
sw r8, 8(r9)
lw r2, -8(r3)
(a) Explain how a ‘store queue’ can be used to improve/optimize the performance of this code sequence. [8 pts]
(b) Discuss the relationship of the optimization in part (a) to ‘data forwarding’ (bypassing) used in pipelining. [7 pts]
(c) Explain how a ‘load queue’ can be used to improve/optimize the performance of this code sequence. [10 pts]
Question-4 [Hazards and Unrolling] [26 pts] This question has three parts [a through c]
(a) Identify all RAW, WAW and WAR dependencies in the loop shown below. Write down the dependencies within a ‘single iteration’ only. Use the following notation, for example, to indicate a dependency between Ix and Iy through r3 register: Ix- Iy (r3). (‘$zero’ represents a register that is hardwired to value 0, and ‘mul’ is opcode for multiplication): [8 pts]
loop:
I1: lw s1, 0(r1)
I2: mul s2, s1, s0
I3: add s3, s3, s2
I4: mul s2, s1, s1
I5: add s2, s2, s3
I6: sw s2, 0(r1)
I7: sub r1, r1, 8
I8: bne r1, $zero, loop
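Register dependences within one iteration can be enumerated by tracking, for each register, its most recent writer and the readers seen since that write. The following is a minimal Python sketch, not part of the exam. The `LOOP` table is a hand encoding of the instructions above (destination and source registers per instruction), and the function name `find_dependences` is hypothetical. As an assumption, only nearest pairs are reported: a read is linked to the latest earlier write, and a write is linked to the latest earlier write and to the reads since then.

```python
from collections import defaultdict

# (label, destination register or None, source registers), hand-encoded
# from the loop above. Stores and branches write no register.
LOOP = [
    ("I1", "s1", ["r1"]),
    ("I2", "s2", ["s1", "s0"]),
    ("I3", "s3", ["s3", "s2"]),
    ("I4", "s2", ["s1", "s1"]),
    ("I5", "s2", ["s2", "s3"]),
    ("I6", None, ["s2", "r1"]),
    ("I7", "r1", ["r1"]),
    ("I8", None, ["r1", "$zero"]),
]


def find_dependences(instructions):
    """Return nearest-pair RAW, WAW and WAR dependences in program order."""
    last_writer = {}             # register -> label of its latest writer
    readers = defaultdict(list)  # register -> readers since the latest write
    deps = {"RAW": [], "WAW": [], "WAR": []}
    for label, dest, sources in instructions:
        # Sources are read before the destination is written.
        for reg in dict.fromkeys(sources):  # drop duplicates, keep order
            if reg in last_writer:
                deps["RAW"].append(f"{last_writer[reg]}-{label} ({reg})")
            readers[reg].append(label)
        if dest is not None:
            if dest in last_writer:
                deps["WAW"].append(f"{last_writer[dest]}-{label} ({dest})")
            for reader in readers[dest]:
                if reader != label:  # an instruction does not conflict with itself
                    deps["WAR"].append(f"{reader}-{label} ({dest})")
            last_writer[dest] = label
            readers[dest] = []
    return deps


for kind, pairs in find_dependences(LOOP).items():
    print(kind, pairs)
```

The sketch ignores loop-carried dependences and memory dependences through 0(r1), in line with the single-iteration, register-only notation the question asks for. A convention that lists every conflicting pair, rather than only the nearest ones, would report additional entries.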
(b) Unroll the loop above once and eliminate as many dependences as you can via ‘register renaming’. What dependences remain after renaming? Why can’t they be eliminated via renaming? [8 pts]
(c) Two important techniques have emerged recently for achieving high performance in processors. One is ‘speculative execution’, where instructions (or sequences of instructions) are executed before all the information needed to commit the instruction has been nailed down. The other is ‘simultaneous multithreading’ (SMT), where the processor can issue instructions from multiple threads (or processes), potentially in the same cycle. Describe the pros and cons of each approach with respect to the other. Do these approaches capture different opportunities for parallelism, or are they different ways to achieve the same result? If both were implemented in a single system, would you expect their projected improvements to be additive? Why or why not? [10 pts]
