Despite the potential advantages of a unified cache, which the 80486 processor uses, the Pentium microprocessor uses separate code and data caches.
The reason is that the superscalar design and branch prediction demand more bandwidth than a unified cache can provide.
First, efficient branch prediction requires that the destination of a branch be accessed simultaneously with data references of previous instructions executing in the pipeline.
Second, the parallel execution of data memory references requires simultaneous accesses for loads and stores.
Third, in the context of the overall Pentium microprocessor design, handling self-modifying code for separate code and data caches is only marginally more complex than for a unified cache.
The data and instruction caches on the Pentium processor are each 8 KB, two-way associative designs with 32-byte lines. Each cache has a dedicated translation lookaside buffer (TLB) to translate linear addresses to physical addresses.
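The geometry above fixes how an address divides into tag, set index, and byte offset. The following is a minimal sketch of that arithmetic, assuming a 32-bit address; the function name split_address is hypothetical and chosen only for illustration.

```python
# Geometry taken from the text: 8 KB, two-way associative, 32-byte lines.
CACHE_BYTES = 8 * 1024
WAYS = 2
LINE_BYTES = 32

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 8192 / (2 * 32) = 128 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1       # log2(32)  = 5
INDEX_BITS = NUM_SETS.bit_length() - 1          # log2(128) = 7


def split_address(addr):
    """Split a 32-bit address into (tag, set index, byte offset).

    Hypothetical helper: it only illustrates the bit-field arithmetic
    implied by the cache geometry.
    """
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset
```

With these numbers the offset and index together occupy the low 12 bits of the address, leaving a 20-bit tag.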
The caches can be enabled or disabled by software or hardware.
The Pentium microprocessor implements the data cache to support dual accesses by the U-pipe and V-pipe, which provides additional bandwidth and simplifies compiler instruction scheduling algorithms.
The data cache can be configured as write-back or write-through on a line-by-line basis and follows the MESI protocol.
The data cache tags are triple ported to support two data transfers and an inquire cycle in the same clock.
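For readers unfamiliar with MESI, the sketch below models the textbook state transitions for a single write-back line. It is a simplified illustration, not a description of the Pentium's exact bus behavior: write-through lines are omitted, and the function names (on_local_read, on_local_write, on_snoop) are hypothetical.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_local_read(state, others_have_copy):
    """State of a line after this processor reads it."""
    if state is State.I:
        # Miss: fill as Shared if another cache holds the line,
        # otherwise as Exclusive.
        return State.S if others_have_copy else State.E
    return state  # hits in M, E, or S leave the state unchanged


def on_local_write(state):
    """State after this processor writes a write-back line.

    Textbook behavior: the writer ends up holding the only, dirty copy
    (assumption: any other copies are invalidated on the bus).
    """
    return State.M


def on_snoop(state, invalidate):
    """State after a snooped access by another bus master.

    Returns (new_state, writeback_needed). A Modified line must be
    written back before it is shared or invalidated.
    """
    writeback = state is State.M
    if invalidate or state is State.I:
        return State.I, writeback
    return State.S, writeback
```

In this model an inquire cycle corresponds to a call to on_snoop, which is why the tags need a port of their own in addition to the two used by the pipes.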
The code cache is inherently write-protected.
The code cache tags of the Pentium processor are also triple ported to support snooping and split-line accesses.
The data path, however, is single ported with eight-way interleaving of 32-bit-wide banks.
When a bank conflict occurs, the U-pipe assumes priority, and the V-pipe stalls for a clock cycle.
The bank conflict logic also serves to eliminate data dependencies between parallel memory references to a single location.
For memory references to double-precision floating-point data, the processor accesses consecutive banks in parallel, forming a single 64-bit path.
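The bank arrangement described above can be sketched as follows. The bank number is taken from the address of each 32-bit word, modulo eight. Treating a 64-bit access as occupying two consecutive banks, and flagging a conflict whenever the two pipes touch any common bank, are assumptions made for this illustration; the function names are hypothetical.

```python
NUM_BANKS = 8    # eight-way interleaving, from the text
BANK_BYTES = 4   # each bank is 32 bits wide


def banks_touched(addr, size):
    """Return the set of bank numbers covered by `size` bytes at `addr`."""
    first_word = addr // BANK_BYTES
    last_word = (addr + size - 1) // BANK_BYTES
    return {word % NUM_BANKS for word in range(first_word, last_word + 1)}


def v_pipe_stall(u_access, v_access):
    """Return the clocks the V-pipe stalls for a pair of (addr, size) accesses.

    The U-pipe has priority, so only the V-pipe ever waits. Two accesses
    to the same location necessarily share a bank, which is how the same
    check also serializes dependent references.
    """
    u_banks = banks_touched(*u_access)
    v_banks = banks_touched(*v_access)
    return 1 if u_banks & v_banks else 0
```

For example, v_pipe_stall((0x1000, 4), (0x1004, 4)) reports no stall because the two words fall in adjacent banks, while addresses 0x1000 and 0x1020 are 32 bytes apart and map to the same bank.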
Translation lookaside buffers (TLB): Besides general-purpose caches, x86 processors include caches called translation lookaside buffers (TLBs) to speed up linear address translation. When a linear address is used for the first time, the corresponding physical address is computed through slow accesses to the page tables in RAM. The physical address is then stored in a TLB entry so that further references to the same linear address are quickly translated. When the CR3 control register is modified, the hardware automatically invalidates all entries of the TLB.
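The behavior just described can be modeled in a few lines. This is a minimal sketch under stated assumptions: pages are 4 KB, the page tables are stood in for by a plain dictionary from linear page number to physical frame number, and the TLB has unlimited capacity with no replacement policy. The class and method names are hypothetical.

```python
PAGE_SHIFT = 12  # 4 KB pages


class Tlb:
    """Toy model of a TLB in front of a page-table lookup."""

    def __init__(self, page_table):
        # Stand-in for the page tables in RAM (assumption):
        # {linear page number: physical frame number}
        self.page_table = page_table
        self.entries = {}   # cached translations
        self.walks = 0      # number of slow page-table lookups performed

    def translate(self, linear):
        page = linear >> PAGE_SHIFT
        offset = linear & ((1 << PAGE_SHIFT) - 1)
        if page not in self.entries:
            # First use of this page: do the slow lookup and cache it.
            self.walks += 1
            self.entries[page] = self.page_table[page]
        return (self.entries[page] << PAGE_SHIFT) | offset

    def write_cr3(self, new_page_table):
        # Loading CR3 switches page tables and invalidates every entry.
        self.page_table = new_page_table
        self.entries.clear()
```

Translating the same linear address twice increments walks only once; after write_cr3 the next translation has to consult the page tables again.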