Find the minimum associativity needed for the level 1 cache | Computer Science

The transpose of a matrix interchanges its rows and columns and is illustrated at the bottom of page 345.

Here is a simple C loop to show the transpose:
for (i = 0; i < 3; i++) {
    for (j = 0; j < 3; j++) {
        output[j][i] = input[i][j];
    }
}

Assume both the input and output matrices are stored in the row major order (row major order means row index changes fastest). Assume you are executing a 256 x 256 double-precision transpose on a processor with a 16 KB fully associative (so you don’t have to worry about cache conflicts) LRU replacement level 1 data cache with 64-byte blocks. Assume level 1 cache misses or prefetches require 16 cycles, always hit in the level 2 cache, and the level 2 cache can process a request every 2 processor cycles. Assume each iteration of the inner loop above requires 4 cycles if the data is present in the level 1 cache. Assume the cache has a write-allocate fetch-on-write policy for write misses. Unrealistically assume writing back dirty cache blocks requires 0 cycles.

3. For the simple implementation given above, this execution order would be non-ideal for the input matrix. However, applying a loop interchange optimization would create a non-ideal order for the output matrix. Because loop interchange is not sufficient to improve its performance, it must be blocked instead.

b. What is the minimum associativity required of the level 1 cache for consistent performance independent of both arrays’ position in memory?
