Find minimum associativity needed of level cache |Computer Science

The transpose of a matrix interchanges its rows and columns; it is illustrated at the bottom of page 345.

Here is a simple C loop to show the transpose:
for (i = 0; i < 3; i++) {
    for (j = 0; j < 3; j++) {
        output[j][i] = input[i][j];
    }
}

Assume both the input and output matrices are stored in row major order (row major order means that the row index changes fastest). Assume you are executing a 256 × 256 double-precision transpose on a processor with a 16 KB fully associative (so you don't have to worry about cache conflicts) level 1 data cache with least recently used (LRU) replacement and 64-byte blocks. Assume level 1 cache misses or prefetches require 16 cycles and always hit in the level 2 cache, and that the level 2 cache can process a request every 2 processor cycles. Assume each iteration of the inner loop above requires 4 cycles if the data is present in the level 1 cache. Assume the cache has a write-allocate, fetch-on-write policy for write misses. Unrealistically, assume writing back dirty cache blocks requires 0 cycles.
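The sizes implied by these assumptions can be worked out with a few divisions. Here is a minimal sketch in C, assuming a double occupies 8 bytes (the problem only says "double-precision"). The constant and function names are illustrative and do not come from the problem statement.

```c
/* Cache and matrix geometry from the problem statement.
   ELEM_BYTES = 8 is an assumption (typical double size). */
enum {
    CACHE_BYTES = 16 * 1024, /* 16 KB level 1 data cache */
    BLOCK_BYTES = 64,        /* cache block size */
    N           = 256,       /* matrix is N x N */
    ELEM_BYTES  = 8          /* assumed size of one double */
};

/* Number of blocks the level 1 cache can hold. */
int cache_blocks(void) { return CACHE_BYTES / BLOCK_BYTES; }

/* Number of matrix elements that fit in one cache block. */
int elems_per_block(void) { return BLOCK_BYTES / ELEM_BYTES; }

/* Number of cache blocks spanned by one line of N contiguous elements. */
int blocks_per_line(void) { return N * ELEM_BYTES / BLOCK_BYTES; }

/* Total size of one matrix in bytes. */
long matrix_bytes(void) { return (long)N * N * ELEM_BYTES; }
```

Under these assumptions the cache holds 256 blocks, each block holds 8 elements, and one matrix occupies 524288 bytes (512 KB), which is 32 times the cache capacity. Neither matrix fits in the level 1 cache, so the order of accesses matters.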

3. For the simple implementation given above, this execution order would be non-ideal for the input matrix. However, applying a loop interchange optimization would create a non-ideal order for the output matrix. Because loop interchange is not sufficient to improve performance, the transpose must be blocked instead.
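To make "blocked" concrete, here is one possible sketch of a tiled transpose in C. It is an illustration under stated assumptions, not the textbook's reference solution. The function name `transpose_blocked`, the tile-size parameter `b`, and the flat indexing `r * n + c` are all assumptions introduced for this example.

```c
#include <stddef.h>

/* Transpose an n x n matrix of doubles in b x b tiles.
   Assumption: in and out are flat arrays of n * n elements and
   element (r, c) lives at index r * n + c. */
void transpose_blocked(size_t n, size_t b, const double *in, double *out)
{
    if (b == 0) {
        return; /* a zero tile size would never advance the loops */
    }
    for (size_t ii = 0; ii < n; ii += b) {
        for (size_t jj = 0; jj < n; jj += b) {
            /* Clamp the tile so n need not be a multiple of b. */
            size_t i_end = (ii + b < n) ? ii + b : n;
            size_t j_end = (jj + b < n) ? jj + b : n;
            /* Within a tile, only b lines of each matrix are touched. */
            for (size_t i = ii; i < i_end; i++) {
                for (size_t j = jj; j < j_end; j++) {
                    out[j * n + i] = in[i * n + j];
                }
            }
        }
    }
}
```

The idea is that each tile touches only a small set of cache blocks from the input and from the output, so every fetched block can be fully used before it is evicted. With 64-byte blocks and 8-byte doubles, a tile size of 8 is a natural starting point, although the best value depends on the cache parameters.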

b. What is the minimum associativity required of the level 1 cache for consistent performance independent of both arrays’ position in memory?
