Overall average was 57.2% (st. dev. 26.9%). Average scores by problem were 1a: 13.2, 1b: 9.0, 2: 16.4, and 3: 16.6 points respectively (25 points for each).

When recopying code samples, be sure to indicate labels for loops. I think some of you got confused as to where loops began and ended!

Partial solutions:

Q1. To support the claim that different code is functionally equivalent, you must demonstrate that the result is identical. In this case, the result was written to an array. The result is not the value of registers at the end of execution or the number of loops executed.

In the case of code sample I, execution results in certain values being written to an array by a loop that executes 100x. In the case of code sample II, the same values are written to the same locations, though the code executes as prologue, loop (98x), and epilogue. Check to see what gets stored. If you can show it’s the same for code sample I and code sample II, then you have demonstrated that the claim is true.
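To make the "check what gets stored" idea concrete, here is a minimal sketch in Python. It does not reproduce the exam code: the loop body (x[i] = x[i] + s), the constant S, and the function names are all assumptions chosen for illustration. It only shows the shape of the argument, a plain 100x loop versus a prologue, a 98x loop, and an epilogue, compared by the contents of the array they write.

```python
# Hypothetical loop body, NOT the exam code: x[i] = x[i] + S for 100 elements.
N = 100
S = 3.0  # assumed constant, for illustration only


def sample_one(x):
    """Plain loop: load, add, and store all happen in the same iteration (100x)."""
    out = list(x)
    for i in range(N):
        loaded = out[i]
        summed = loaded + S
        out[i] = summed
    return out


def sample_two(x):
    """Software-pipelined form: prologue, steady-state loop (98x), epilogue."""
    out = list(x)
    # Prologue: load and add for element 0, load for element 1.
    loaded = out[0]
    summed = loaded + S
    loaded = out[1]
    # Steady state: store element i, add for element i+1, load element i+2.
    for i in range(N - 2):
        out[i] = summed
        summed = loaded + S
        loaded = out[i + 2]
    # Epilogue: drain the two elements still in flight.
    out[N - 2] = summed
    summed = loaded + S
    out[N - 1] = summed
    return out


if __name__ == "__main__":
    data = [float(i) for i in range(N)]
    # Equivalence is judged by what was stored, not by registers or loop counts.
    print(sample_one(data) == sample_two(data))
```

The comparison is on the stored array only, which is the point of the question: the two versions execute a different number of loop iterations and leave different temporaries behind, yet write the same values to the same locations.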

Regarding cycle calculation: The number of executions of a loop is not the same as the number of cycles. You were expected to calculate the total number of cycles that each code sample takes to execute fully. I gave partial credit so long as your answers were approximately correct; I was not nitpicky about a few cycles here and there. Your answers should have been on the order of 900 cycles, give or take, for code sample I, and on the order of 600 cycles for code sample II. Why does code sample II execute in fewer cycles? Software pipelining. In code sample II, the compiler has rearranged the code to reduce stalls during execution.
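The difference between loop executions and cycles can be sketched as a small calculation. The per-iteration cycle counts and the prologue/epilogue overhead below are assumed, illustrative numbers, picked only so the totals land on the orders of magnitude stated above; they are not the worked answer to the exam problem.

```python
def total_cycles(iterations, cycles_per_iteration, overhead=0):
    """Total cycles = cycles outside the loop + iterations * cycles per iteration."""
    return overhead + iterations * cycles_per_iteration


# ASSUMED values, for illustration only (not taken from the exam):
#   sample I:  100 iterations, 9 cycles each including stalls
#   sample II: 98 iterations, 6 cycles each, plus 12 cycles of prologue/epilogue
sample_one_cycles = total_cycles(100, 9)
sample_two_cycles = total_cycles(98, 6, overhead=12)

if __name__ == "__main__":
    print(sample_one_cycles, sample_two_cycles)
```

Note that the iteration counts (100 versus 98) barely differ; under these assumptions the saving comes almost entirely from fewer stall cycles per iteration.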

Folks who stated that sample II could be rewritten in fewer lines were missing the point.

Q2. Folks did better on this question. To show your work, you should have completed a table:

D    B1 pred    B1    New B1    B2 pred    B2    New B2
2    etc.

and indicated which were the misprediction(s), and given a total count of misprediction(s).
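A table of this shape can be generated mechanically. The sketch below is not the exam problem: it assumes 1-bit predictors that start at not-taken, a branch B1 that is taken when d != 0 (with d set to 1 when B1 falls through), a branch B2 that is taken when d != 1, and an input sequence of d values chosen only for illustration. Substitute the predictor and branch conditions from the actual question.

```python
def simulate(d_values):
    """Fill in one table row per value of d and count mispredictions.

    Assumptions (hypothetical, not from the exam): 1-bit predictors,
    both initialized to not-taken (False); B1 taken if d != 0;
    d becomes 1 when B1 is not taken; B2 taken if d != 1.
    """
    b1_pred = False
    b2_pred = False
    rows = []
    mispredictions = 0
    for d in d_values:
        b1_taken = d != 0
        d_after = d if b1_taken else 1
        b2_taken = d_after != 1
        mispredictions += int(b1_pred != b1_taken) + int(b2_pred != b2_taken)
        # A 1-bit predictor simply remembers the last outcome.
        rows.append((d, b1_pred, b1_taken, b1_taken, b2_pred, b2_taken, b2_taken))
        b1_pred, b2_pred = b1_taken, b2_taken
    return rows, mispredictions


def fmt(value):
    """Render True/False as T/NT; leave d as a number."""
    if isinstance(value, bool):
        return "T" if value else "NT"
    return str(value)


if __name__ == "__main__":
    rows, total = simulate([2, 0, 2, 0])
    print("D  B1 pred  B1  New B1  B2 pred  B2  New B2")
    for row in rows:
        print("  ".join(fmt(v) for v in row))
    print("mispredictions:", total)
```

A row is a misprediction wherever the "pred" column disagrees with the actual outcome column next to it; summing those disagreements gives the total count.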

Q3. Many people only worked out one iteration of the loop and points were deducted. Here’s a partial solution.

Label  Instruction        Issue  Exec  Read  Write  Commit
Loop   L.D f2,0(r1)       1      2     3     4      5
       ADD.D f4,f2,f0     1      5-7         8      9
       L.D f6,0(r2)
       ADD.D f6,f4,f6
       L.D f8,0(r3)
       ADD.D f10,f8,f6
       S.D f10,0(r2)
       ADDI r3,r3,-8
       ADDI r2,r2,-8
       ADDI r1,r1,-8
       BNEZ r1,Loop       6      10                 20

Label  Instruction        Issue  Exec  Read  Write  Commit
       ADDI r1,r1,-8
       BNEZ r1,Loop       12     13                 26

Some of you did not take any pipelining into consideration, much less Tomasulo. :( Maybe you can start with the partial solution here and give it another go.