Wednesday, December 19, 2007
※ Q1: Where can a block be placed in a cache?
1. If each block has only one place it can appear in the cache, the cache is said to be direct mapped.
This mapping is usually: (Block address) MOD (Number of blocks in cache)
2. If a block can be placed anywhere in the cache, the cache is said to be fully associative.
3. If a block can be placed in a restricted set of places in the cache, the cache is set associative.
A set is a group of blocks in the cache. A block is first mapped onto a set, and then the block can be placed anywhere within that set. The set is usually chosen by bit selection;
that is, (Block address) MOD (Number of sets in cache)
If there are n blocks in a set, the cache placement is called n-way set associative.
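The three placement policies differ only in how many blocks share a set, so one small function can illustrate all of them. This is a minimal sketch: the name `candidate_frames` and the assumption that the frames of a set are numbered contiguously are illustrative choices, not part of the original notes.

```python
def candidate_frames(block_address, num_blocks, ways):
    """Return the cache frame numbers where a block may be placed.

    ways == 1          -> direct mapped
    ways == num_blocks -> fully associative
    otherwise          -> n-way set associative
    Assumes (for illustration) that set s owns frames s*ways .. s*ways+ways-1.
    """
    num_sets = num_blocks // ways
    set_index = block_address % num_sets  # (Block address) MOD (Number of sets)
    first = set_index * ways
    return list(range(first, first + ways))


# Block 12 in an 8-block cache:
print(candidate_frames(12, 8, 1))  # direct mapped: [4]
print(candidate_frames(12, 8, 2))  # 2-way set associative: [0, 1]
print(candidate_frames(12, 8, 8))  # fully associative: [0, 1, 2, 3, 4, 5, 6, 7]
```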
※ Q2: How is a block found if it is in the cache?
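1. The block offset selects the desired data from the block.
2. The index field selects a set.
3. The tag field is compared to determine whether the access is a hit.
If the cache size stays the same, increasing associativity increases the number of blocks per set, so the index gets shorter and the tag gets longer.
When the index is 0 bits wide, the cache is fully associative.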
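The split of an address into tag, index, and offset can be sketched with shifts and masks. The function name `split_address` is hypothetical, and the sketch assumes the block size and the number of sets are powers of two.

```python
def split_address(address, block_size, num_sets):
    """Split an address into (tag, index, offset).

    Assumes block_size and num_sets are powers of two.
    """
    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    index_bits = num_sets.bit_length() - 1     # log2(num_sets)
    offset = address & (block_size - 1)
    index = (address >> offset_bits) & (num_sets - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


# 64-byte blocks, 512 sets: 6 offset bits, 9 index bits
print(split_address(0x12345, 64, 512))  # (2, 141, 5)
# Only one set (fully associative): the index is 0 bits wide, so it is always 0
print(split_address(0x12345, 64, 1))    # (1165, 0, 5)
```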
※ Q3: Which block should be replaced on a cache miss?
1. Random
2. LRU (least recently used)
3. FIFO (first in, first out)
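As an illustration of the LRU policy, here is a minimal sketch of one cache set that tracks recency with an `OrderedDict`. The class name `LRUSet` is hypothetical, and real hardware typically approximates LRU rather than keeping an exact ordering.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement (illustrative model, tags only)."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, insert the tag and return False."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)     # mark as most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[tag] = None
        return False


s = LRUSet(ways=2)
print([s.access(t) for t in [1, 2, 1, 3, 2]])
# [False, False, True, False, False]
```

Dropping the `move_to_end` call turns this into FIFO, because the eviction order then depends only on when each block was inserted.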
※ Q4: What happens on a write?
1. Write through : The information is written to both the block in the cache and to the block in the lower-level memory.
2. Write back : The information is written only to the block in the cache. The modified cache block is written to main memory only when it is replaced.
To reduce the frequency of writing back blocks on replacement, a feature called the dirty bit is commonly used. This status bit indicates whether the block is dirty (modified while in the cache) or clean (not modified). If it is clean, the block is not written back on a miss, since identical information to the cache is found in lower levels.
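The difference in lower-level traffic between the two policies can be sketched by counting memory writes for repeated stores to one cached block followed by its replacement. The function `memory_writes` and its string policy names are illustrative assumptions.

```python
def memory_writes(num_writes, policy):
    """Count lower-level writes for num_writes write hits to one block,
    followed by that block being replaced."""
    dirty = False
    writes = 0
    for _ in range(num_writes):
        if policy == "write-through":
            writes += 1   # every store also updates the lower level
        else:             # "write-back"
            dirty = True  # only mark the cached block as modified
    if policy == "write-back" and dirty:
        writes += 1       # a dirty block is written back once, on replacement
    return writes


print(memory_writes(3, "write-through"))  # 3
print(memory_writes(3, "write-back"))     # 1
print(memory_writes(0, "write-back"))     # 0 (clean block, nothing written back)
```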
※ When the CPU must wait for writes to complete during write through, the CPU is said to write stall. A common optimization to reduce write stalls is a write buffer, which allows the processor to continue as soon as the data is written to the buffer, thereby overlapping processor execution with memory updating. Write stalls can still occur even with a write buffer, for example when the buffer is full.
Since the data are not needed on a write, there are two options on a write miss:
1. Write allocate : The block is allocated on a write miss, followed by the write hit actions above. In this natural option, write misses act like read misses.
2. No-write allocate : In this apparently unusual alternative, write misses do not affect the cache. Instead, the block is modified only in the lower-level memory.
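To see how the two options change the miss count, the sketch below replays a short sequence of operations against a simplified cache. The function `count_misses` is hypothetical, and it assumes a cache large enough that nothing is ever evicted.

```python
def count_misses(ops, write_allocate):
    """Count misses for a list of (kind, block_address) operations.

    Assumes an initially empty cache with no evictions (illustrative model).
    """
    resident = set()
    misses = 0
    for kind, block in ops:
        if block in resident:
            continue                # hit
        misses += 1
        if kind == "read" or write_allocate:
            resident.add(block)     # block is brought into the cache
        # no-write allocate: a write miss leaves the cache untouched
    return misses


ops = [("write", 100), ("write", 100), ("read", 200),
       ("write", 200), ("write", 100)]
print(count_misses(ops, write_allocate=True))   # 2
print(count_misses(ops, write_allocate=False))  # 4
```

With no-write allocate, block 100 is never brought into the cache, so every write to it misses; block 200 is cached by the read, so the later write to it hits under both options.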
※ Alpha 21264 Data Cache :
Cache Size = 64K bytes = 2^16 bytes.
Block Size = 64 bytes = 2^6 bytes. (block offset = 6)
2-way associativity. Write-back and Write-allocate.
2^index = Cache Size / (Block Size × Set associativity)
2^index = 65536 / (64 × 2) = 512 = 2^9 => index field = 9 bits
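The field widths for this cache can be checked with a short calculation. This is a sketch: `field_widths` is a hypothetical helper, and the 44-bit physical address width is taken from step 1 of the walkthrough in these notes.

```python
def field_widths(cache_size, block_size, ways, address_bits):
    """Return (tag_bits, index_bits, offset_bits).

    Assumes all sizes are powers of two.
    """
    num_sets = cache_size // (block_size * ways)
    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    index_bits = num_sets.bit_length() - 1     # log2(num_sets)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


# 64 KB cache, 64-byte blocks, 2-way, 44-bit physical address
print(field_widths(64 * 1024, 64, 2, 44))  # (29, 9, 6)
```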
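Step 1 : The 21264 CPU sends a 48-bit virtual address to the cache for the tag check and, at the same time, translates the virtual address into a 44-bit physical address. => Tag field = 44 - 9 - 6 = 29 bits
Step 2 : The index selects a set. The tags of both blocks in the set are compared at the same time, and the one that matches is chosen.
Step 3 : After the two tags are read from the cache, they are compared with the tag portion of the block address sent by the CPU. To ensure that a tag holds valid information, the valid bit must be 1; otherwise the result of the comparison is ignored.
Step 4 : If one of the tags matches, the CPU is signaled to load the proper data, using the winning input of a 2:1 mux.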
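The tag comparison, valid-bit check, and 2:1 selection in the steps above can be sketched in software as follows. The dictionary layout of each way and the name `lookup` are illustrative assumptions; the hardware compares both tags in parallel rather than in a loop.

```python
def lookup(cache_set, tag):
    """Look up a tag in one 2-way set.

    cache_set is a list of two ways, each a dict with keys
    'valid', 'tag', and 'data' (hypothetical layout).
    Returns the data on a hit, or None on a miss.
    """
    for way in cache_set:
        # A match counts only if the valid bit is set.
        if way["valid"] and way["tag"] == tag:
            return way["data"]  # the 2:1 mux picks the matching way
    return None


cache_set = [
    {"valid": False, "tag": 7, "data": "stale"},
    {"valid": True, "tag": 9, "data": "block B"},
]
print(lookup(cache_set, 9))  # block B
print(lookup(cache_set, 7))  # None (tag matches but valid bit is 0)
```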
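The 21264 uses write-back and keeps one dirty bit per block to record whether the block has been written. If the block being replaced (the victim) has been modified, its data and address are sent to the victim buffer.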
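The CPU knows whether it is issuing an instruction address or a data address, so the two kinds of address can use separate ports, which doubles the bandwidth between the memory hierarchy and the CPU. Separate caches also let designers optimize each cache individually (capacity, block size, and associativity can all differ). The alternative is a unified (or mixed) cache, which can contain either instructions or data.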