Just so everyone understands: Each homework is weighted equally, but scoring methods for individual assignments will differ. Part of what determines the scoring method is which problems (if any) are assigned from the book, and if so, how points are assigned in the book. I try to use this as a starting point. Within each homework assignment, each problem is weighted equally.
With that out of the way…
Overall average was 70.2% (st. dev. 17.6%). Average scores by problem were 24.1, 18.1, 13.8, and 14.2 points respectively (25 points for each).
Q1. a. ~5.56; b. ~90.9%. No points were deducted for minor rounding. If you got part a wrong and then used that result in calculation of part b, then so long as you showed your work and the steps were correct, you received full credit for part b.
Q2. Most folks got this right, or at least got very close. Remember we are concerned about multicycle operations here. If you’re puzzled, please check Section C.5, especially Figure C34 and Figure C37. Execution of all instructions should have taken 21 cycles:
|LD F4, 0(R2)||IF||ID||EX||MM||WB|
|MUL.D F0, F4, F4||IF||ID||…||M1||M2||M3||M4||M5||M6||M7||MM||WB|
|MUL.D F8, F0, F6||IF||…||ID||…||…||…||…||…||…||M1||M2||M3||M4||M5||M6||M7||MM||WB|
|ADD.D F8, F4, F4||…||IF||…||…||…||…||…||…||ID||…||…||…||A1||A2||A3||A4||MM||WB|
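Where “MM” is short for MEM, and “…” indicates a stall. Leading blank cells are not shown: each row starts one cycle after the row above it, so the LD begins in cycle 1 and the first cell of the ADD.D row is cycle 4, which puts the final WB on cycle 21. I did not take points off if you started A1 on 14 and then stalled for three cycles after A4 before MM on 20.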
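If you want to check the cycle count yourself, here is a minimal Python sketch that replays the Q2 table. The stage strings are copied from the rows above. The one assumption (mine, not the book's) is that row i's first cell falls in cycle i, since the leading blank cells are not shown. The names `ROWS` and `stage_cycles` are just illustrative.

```python
# Stage sequences copied from the Q2 table; "…" marks a stall cycle.
ROWS = [
    ("LD F4, 0(R2)", "IF ID EX MM WB"),
    ("MUL.D F0, F4, F4", "IF ID … M1 M2 M3 M4 M5 M6 M7 MM WB"),
    ("MUL.D F8, F0, F6", "IF … ID … … … … … … M1 M2 M3 M4 M5 M6 M7 MM WB"),
    ("ADD.D F8, F4, F4", "… IF … … … … … … ID … … … A1 A2 A3 A4 MM WB"),
]


def stage_cycles(rows):
    """Map each named stage of each instruction to the cycle it occupies."""
    timeline = []
    for index, (instr, stages) in enumerate(rows):
        # Assumption: row i's first cell is cycle i (1-based).
        first_cycle = index + 1
        cycles = {
            stage: first_cycle + offset
            for offset, stage in enumerate(stages.split())
            if stage != "…"  # stalls are skipped, but still consume a cycle
        }
        timeline.append((instr, cycles))
    return timeline


timeline = stage_cycles(ROWS)
for instr, cycles in timeline:
    print(f"{instr:18s} WB in cycle {cycles['WB']}")

# The last write-back marks the end of execution.
total_cycles = max(cycles["WB"] for _, cycles in timeline)
print("Total cycles:", total_cycles)
```

Under that alignment the second MUL.D starts M1 right after the first one finishes M7, and the ADD.D reaches MM on cycle 20, which matches the note above.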
Q3. This was not a question about multicycle operations. :(
a. Without forwarding:
|LD R1, 0(R2)||IF||ID||EX||MM||WB|
|ADD R4, R1, R5||IF||ID||…||…||EX||MM||WB|
|ADD R5, R4, R4||IF||…||…||ID||…||…||EX||MM||WB|
|ADD R6, R2, R3||IF||…||…||ID||EX||MM||WB|
b. With forwarding:
|LD R1, 0(R2)||IF||ID||EX||MM||WB|
|ADD R4, R1, R5||IF||ID||…||EX||MM||WB|
|ADD R5, R4, R4||IF||…||ID||…||EX||MM||WB|
|ADD R6, R2, R3||IF||…||ID||EX||MM||WB|
Q4. This one was difficult. There are a lot of details to keep in mind with the scoreboard. After describing the CDC 6600 architecture, the book states two sentences later that we should assume two multipliers, one adder, one divider, and a single integer unit for all memory references, branches and integer ops. Elsewhere, the book gives the following EX cycle latencies for FPUs: add 2, mult 10, div 40. I assume this is what Prof Ling intended. I took a liberal approach and awarded marks if the student demonstrated that they had the gist of the scoreboard and that their answers were consistent with their assumptions (sometimes implicit), trying not to get too nit-picky about details. One common mistake that was significant: writing to F10 (ADD.D) before the DIV.D had read it as an operand.
|Instruction||Issue||Read operands||Execution complete||Write result||Notes|
|LD.D F6, 34(R2)||1||2||4||5|
|LD.D F2, 45(R3)||6||7||9||10||Assumes only one integer unit|
|MUL.D F0, F2, F8||7||11||22||23||Waits for write to F2; 10 cycles to exec|
|DIV.D F4, F0, F10||8||24||65||66||Waits for write to F0; 40 cycles to exec|
|ADD.D F10, F12, F12||9||10||12||25||Must wait to write until DIV.D has read from F10|
For the last add, some folks assumed one adder, some folks assumed two. I was OK with either as long as your answer was consistent.
From this it should be simple to ascertain the register status table at cycle 20 and at cycle 40.
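For anyone who wants to double-check their register status answers, here is a rough Python sketch built from the table above. It assumes a register is marked as pending from the cycle its instruction issues until the cycle before it writes its result. The functional unit labels (`Integer`, `Mult1`, `Divide`, `Add`) and the function name `register_status` are my own placeholders, not something the assignment specifies.

```python
# (instruction, destination register, functional unit, issue cycle, write-result cycle)
# Cycle numbers are copied from the Q4 table; unit names are illustrative labels.
SCOREBOARD = [
    ("LD.D F6, 34(R2)", "F6", "Integer", 1, 5),
    ("LD.D F2, 45(R3)", "F2", "Integer", 6, 10),
    ("MUL.D F0, F2, F8", "F0", "Mult1", 7, 23),
    ("DIV.D F4, F0, F10", "F4", "Divide", 8, 66),
    ("ADD.D F10, F12, F12", "F10", "Add", 9, 25),
]

DIV_READ_OPERANDS = 24  # cycle in which DIV.D reads F0 and F10 (from the table)
ADD_WRITE_RESULT = 25   # cycle in which ADD.D writes F10 (from the table)


def register_status(cycle):
    """Return {register: unit} for results still pending at the given cycle.

    Assumption: an entry is pending when issue <= cycle < write-result.
    """
    return {
        dest: unit
        for _, dest, unit, issue, write in SCOREBOARD
        if issue <= cycle < write
    }


for cycle in (20, 40):
    print(f"cycle {cycle}: {register_status(cycle)}")

# WAR check: ADD.D must not overwrite F10 before DIV.D has read it.
war_ok = ADD_WRITE_RESULT > DIV_READ_OPERANDS
print("ADD.D writes F10 after DIV.D reads it:", war_ok)
```

With the numbers in the table, F0, F4 and F10 are all still pending at cycle 20, and only F4 (the divide) is pending at cycle 40. If you assumed different latencies or unit counts, swap in your own cycle numbers and the same check applies.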
Interesting note: This is not the actual architecture for the 6600 (you may have already realized this). Here’s what these beasts looked like.
I can’t say this was before my time, but I can say this was before I wrote my first program. ;)