1
0
rocket-chip/src/main/scala/rocket/BTB.scala

275 lines
11 KiB
Scala
Raw Normal View History

// See LICENSE.Berkeley for license details.
// See LICENSE.SiFive for license details.
2014-09-13 03:06:41 +02:00
package rocket
import Chisel._
Heterogeneous Tiles (#550) Fundamental new features: * Added tile package: This package is intended to hold components re-usable across different types of tile. Will be the future location of TL2-RoCC accelerators and new diplomatic versions of intra-tile interfaces. * Adopted [ModuleName]Params convention: Code base was very inconsistent about what to name case classes that provide parameters to modules. Settled on calling them [ModuleName]Params to distinguish them from config.Parameters and config.Config. So far applied mostly only to case classes defined within rocket and tile. * Defined RocketTileParams: A nested case class containing case classes for all the components of a tile (L1 caches and core). Allows all such parameters to vary per-tile. * Defined RocketCoreParams: All the parameters that can be varied per-core. * Defined L1CacheParams: A trait defining the parameters common to L1 caches, made concrete in different derived case classes. * Defined RocketTilesKey: A sequence of RocketTileParams, one for every tile to be created. * Provided HeterogeneousDualCoreConfig: An example of making a heterogeneous chip with two cores, one big and one little. * Changes to legacy code: ReplacementPolicy moved to package util. L1Metadata moved to package tile. Legacy L2 cache agent removed because it can no longer share the metadata array implementation with the L1. Legacy GroundTests on life support. Additional changes that got rolled in along the way: * rocket: Fix critical path through BTB for I$ index bits > pgIdxBits * coreplex: tiles connected via :=* * groundtest: updated to use TileParams * tilelink: cache cork requirements are relaxed to allow more cacheless masters
2017-02-09 22:59:09 +01:00
import Chisel.ImplicitConversions._
import config._
Heterogeneous Tiles (#550) Fundamental new features: * Added tile package: This package is intended to hold components re-usable across different types of tile. Will be the future location of TL2-RoCC accelerators and new diplomatic versions of intra-tile interfaces. * Adopted [ModuleName]Params convention: Code base was very inconsistent about what to name case classes that provide parameters to modules. Settled on calling them [ModuleName]Params to distinguish them from config.Parameters and config.Config. So far applied mostly only to case classes defined within rocket and tile. * Defined RocketTileParams: A nested case class containing case classes for all the components of a tile (L1 caches and core). Allows all such parameters to vary per-tile. * Defined RocketCoreParams: All the parameters that can be varied per-core. * Defined L1CacheParams: A trait defining the parameters common to L1 caches, made concrete in different derived case classes. * Defined RocketTilesKey: A sequence of RocketTileParams, one for every tile to be created. * Provided HeterogeneousDualCoreConfig: An example of making a heterogeneous chip with two cores, one big and one little. * Changes to legacy code: ReplacementPolicy moved to package util. L1Metadata moved to package tile. Legacy L2 cache agent removed because it can no longer share the metadata array implementation with the L1. Legacy GroundTests on life support. Additional changes that got rolled in along the way: * rocket: Fix critical path through BTB for I$ index bits > pgIdxBits * coreplex: tiles connected via :=* * groundtest: updated to use TileParams * tilelink: cache cork requirements are relaxed to allow more cacheless masters
2017-02-09 22:59:09 +01:00
import tile.HasCoreParameters
import util._
2015-10-17 04:11:57 +02:00
Heterogeneous Tiles (#550) Fundamental new features: * Added tile package: This package is intended to hold components re-usable across different types of tile. Will be the future location of TL2-RoCC accelerators and new diplomatic versions of intra-tile interfaces. * Adopted [ModuleName]Params convention: Code base was very inconsistent about what to name case classes that provide parameters to modules. Settled on calling them [ModuleName]Params to distinguish them from config.Parameters and config.Config. So far applied mostly only to case classes defined within rocket and tile. * Defined RocketTileParams: A nested case class containing case classes for all the components of a tile (L1 caches and core). Allows all such parameters to vary per-tile. * Defined RocketCoreParams: All the parameters that can be varied per-core. * Defined L1CacheParams: A trait defining the parameters common to L1 caches, made concrete in different derived case classes. * Defined RocketTilesKey: A sequence of RocketTileParams, one for every tile to be created. * Provided HeterogeneousDualCoreConfig: An example of making a heterogeneous chip with two cores, one big and one little. * Changes to legacy code: ReplacementPolicy moved to package util. L1Metadata moved to package tile. Legacy L2 cache agent removed because it can no longer share the metadata array implementation with the L1. Legacy GroundTests on life support. Additional changes that got rolled in along the way: * rocket: Fix critical path through BTB for I$ index bits > pgIdxBits * coreplex: tiles connected via :=* * groundtest: updated to use TileParams * tilelink: cache cork requirements are relaxed to allow more cacheless masters
2017-02-09 22:59:09 +01:00
case class BTBParams(
nEntries: Int = 40,
nMatchBits: Int = 14,
nPages: Int = 6,
2015-10-17 04:11:57 +02:00
nRAS: Int = 2,
updatesOutOfOrder: Boolean = false)
2015-10-06 06:48:05 +02:00
Heterogeneous Tiles (#550) Fundamental new features: * Added tile package: This package is intended to hold components re-usable across different types of tile. Will be the future location of TL2-RoCC accelerators and new diplomatic versions of intra-tile interfaces. * Adopted [ModuleName]Params convention: Code base was very inconsistent about what to name case classes that provide parameters to modules. Settled on calling them [ModuleName]Params to distinguish them from config.Parameters and config.Config. So far applied mostly only to case classes defined within rocket and tile. * Defined RocketTileParams: A nested case class containing case classes for all the components of a tile (L1 caches and core). Allows all such parameters to vary per-tile. * Defined RocketCoreParams: All the parameters that can be varied per-core. * Defined L1CacheParams: A trait defining the parameters common to L1 caches, made concrete in different derived case classes. * Defined RocketTilesKey: A sequence of RocketTileParams, one for every tile to be created. * Provided HeterogeneousDualCoreConfig: An example of making a heterogeneous chip with two cores, one big and one little. * Changes to legacy code: ReplacementPolicy moved to package util. L1Metadata moved to package tile. Legacy L2 cache agent removed because it can no longer share the metadata array implementation with the L1. Legacy GroundTests on life support. Additional changes that got rolled in along the way: * rocket: Fix critical path through BTB for I$ index bits > pgIdxBits * coreplex: tiles connected via :=* * groundtest: updated to use TileParams * tilelink: cache cork requirements are relaxed to allow more cacheless masters
2017-02-09 22:59:09 +01:00
trait HasBtbParameters extends HasCoreParameters {
val btbParams = tileParams.btb.getOrElse(BTBParams(nEntries = 0))
val matchBits = btbParams.nMatchBits max log2Ceil(p(coreplex.CacheBlockBytes) * tileParams.icache.get.nSets)
Heterogeneous Tiles (#550) Fundamental new features: * Added tile package: This package is intended to hold components re-usable across different types of tile. Will be the future location of TL2-RoCC accelerators and new diplomatic versions of intra-tile interfaces. * Adopted [ModuleName]Params convention: Code base was very inconsistent about what to name case classes that provide parameters to modules. Settled on calling them [ModuleName]Params to distinguish them from config.Parameters and config.Config. So far applied mostly only to case classes defined within rocket and tile. * Defined RocketTileParams: A nested case class containing case classes for all the components of a tile (L1 caches and core). Allows all such parameters to vary per-tile. * Defined RocketCoreParams: All the parameters that can be varied per-core. * Defined L1CacheParams: A trait defining the parameters common to L1 caches, made concrete in different derived case classes. * Defined RocketTilesKey: A sequence of RocketTileParams, one for every tile to be created. * Provided HeterogeneousDualCoreConfig: An example of making a heterogeneous chip with two cores, one big and one little. * Changes to legacy code: ReplacementPolicy moved to package util. L1Metadata moved to package tile. Legacy L2 cache agent removed because it can no longer share the metadata array implementation with the L1. Legacy GroundTests on life support. Additional changes that got rolled in along the way: * rocket: Fix critical path through BTB for I$ index bits > pgIdxBits * coreplex: tiles connected via :=* * groundtest: updated to use TileParams * tilelink: cache cork requirements are relaxed to allow more cacheless masters
2017-02-09 22:59:09 +01:00
val entries = btbParams.nEntries
val nRAS = btbParams.nRAS
val updatesOutOfOrder = btbParams.updatesOutOfOrder
val nPages = (btbParams.nPages + 1) / 2 * 2 // control logic assumes 2 divides pages
val opaqueBits = log2Up(entries)
val nBHT = 1 << log2Up(entries*2)
}
2014-04-08 00:58:49 +02:00
2015-10-06 06:48:05 +02:00
abstract class BtbModule(implicit val p: Parameters) extends Module with HasBtbParameters
abstract class BtbBundle(implicit val p: Parameters) extends ParameterizedBundle()(p)
with HasBtbParameters
class RAS(nras: Int) {
2014-04-08 00:58:49 +02:00
def push(addr: UInt): Unit = {
when (count < nras) { count := count + 1 }
val nextPos = Mux(Bool(isPow2(nras)) || pos < nras-1, pos+1, UInt(0))
2014-04-08 00:58:49 +02:00
stack(nextPos) := addr
pos := nextPos
}
def peek: UInt = stack(pos)
def pop(): Unit = when (!isEmpty) {
2014-04-08 00:58:49 +02:00
count := count - 1
pos := Mux(Bool(isPow2(nras)) || pos > 0, pos-1, UInt(nras-1))
2014-04-08 00:58:49 +02:00
}
def clear(): Unit = count := UInt(0)
2014-04-08 00:58:49 +02:00
def isEmpty: Bool = count === UInt(0)
private val count = Reg(UInt(width = log2Up(nras+1)))
private val pos = Reg(UInt(width = log2Up(nras)))
2016-01-14 22:57:45 +01:00
private val stack = Reg(Vec(nras, UInt()))
2014-04-08 00:58:49 +02:00
}
2015-10-06 06:48:05 +02:00
class BHTResp(implicit p: Parameters) extends BtbBundle()(p) {
val history = UInt(width = log2Up(nBHT).max(1))
2014-04-08 00:58:49 +02:00
val value = UInt(width = 2)
}
// BHT contains table of 2-bit counters and a global history register.
// The BHT only predicts and updates when there is a BTB hit.
// The global history:
// - updated speculatively in fetch (if there's a BTB hit).
// - on a mispredict, the history register is reset (again, only if BTB hit).
// The counter table:
// - each counter corresponds with the address of the fetch packet ("fetch pc").
2014-10-04 01:08:08 +02:00
// - updated when a branch resolves (and BTB was a hit for that branch).
// The updating branch must provide its "fetch pc".
class BHT(nbht: Int)(implicit val p: Parameters) extends HasCoreParameters {
val nbhtbits = log2Up(nbht)
def get(addr: UInt, update: Bool): BHTResp = {
2015-07-16 05:24:18 +02:00
val res = Wire(new BHTResp)
val index = addr(nbhtbits+1, log2Up(coreInstBytes)) ^ history
2014-10-04 01:08:08 +02:00
res.value := table(index)
res.history := history
val taken = res.value(0)
when (update) { history := Cat(taken, history(nbhtbits-1,1)) }
2014-04-08 00:58:49 +02:00
res
}
def update(addr: UInt, d: BHTResp, taken: Bool, mispredict: Bool): Unit = {
val index = addr(nbhtbits+1, log2Up(coreInstBytes)) ^ d.history
2014-10-04 01:08:08 +02:00
table(index) := Cat(taken, (d.value(1) & d.value(0)) | ((d.value(1) | d.value(0)) & taken))
when (mispredict) { history := Cat(taken, d.history(nbhtbits-1,1)) }
2014-04-08 00:58:49 +02:00
}
2015-09-30 23:36:26 +02:00
private val table = Mem(nbht, UInt(width = 2))
val history = Reg(UInt(width = nbhtbits))
2014-04-02 00:01:27 +02:00
}
// BTB update occurs during branch resolution (and only on a mispredict).
// - "pc" is what future fetch PCs will tag match against.
// - "br_pc" is the PC of the branch instruction.
2015-10-06 06:48:05 +02:00
class BTBUpdate(implicit p: Parameters) extends BtbBundle()(p) {
2014-04-02 00:01:27 +02:00
val prediction = Valid(new BTBResp)
val pc = UInt(width = vaddrBits)
val target = UInt(width = vaddrBits)
2014-04-02 00:01:27 +02:00
val taken = Bool()
val isValid = Bool()
2014-04-08 00:58:49 +02:00
val isJump = Bool()
2014-04-02 00:01:27 +02:00
val isReturn = Bool()
val br_pc = UInt(width = vaddrBits)
2014-04-02 00:01:27 +02:00
}
// BHT update occurs during branch resolution on all conditional branches.
// - "pc" is what future fetch PCs will tag match against.
2015-10-06 06:48:05 +02:00
class BHTUpdate(implicit p: Parameters) extends BtbBundle()(p) {
val prediction = Valid(new BTBResp)
val pc = UInt(width = vaddrBits)
val taken = Bool()
val mispredict = Bool()
}
2015-10-06 06:48:05 +02:00
class RASUpdate(implicit p: Parameters) extends BtbBundle()(p) {
val isCall = Bool()
val isReturn = Bool()
val returnAddr = UInt(width = vaddrBits)
val prediction = Valid(new BTBResp)
}
// - "bridx" is the low-order PC bits of the predicted branch (after
// shifting off the lowest log(inst_bytes) bits off).
// - "mask" provides a mask of valid instructions (instructions are
// masked off by the predicted taken branch from the BTB).
2015-10-06 06:48:05 +02:00
class BTBResp(implicit p: Parameters) extends BtbBundle()(p) {
2014-04-02 00:01:27 +02:00
val taken = Bool()
2015-10-06 06:48:05 +02:00
val mask = Bits(width = fetchWidth)
val bridx = Bits(width = log2Up(fetchWidth))
val target = UInt(width = vaddrBits)
val entry = UInt(width = opaqueBits)
2014-04-08 00:58:49 +02:00
val bht = new BHTResp
2014-04-02 00:01:27 +02:00
}
2015-10-06 06:48:05 +02:00
class BTBReq(implicit p: Parameters) extends BtbBundle()(p) {
val addr = UInt(width = vaddrBits)
}
// fully-associative branch target buffer
// Higher-performance processors may cause BTB updates to occur out-of-order,
// which requires an extra CAM port for updates (to ensure no duplicates get
// placed in BTB).
2015-10-06 06:48:05 +02:00
class BTB(implicit p: Parameters) extends BtbModule {
val io = new Bundle {
val req = Valid(new BTBReq).flip
2014-04-02 00:01:27 +02:00
val resp = Valid(new BTBResp)
val btb_update = Valid(new BTBUpdate).flip
val bht_update = Valid(new BHTUpdate).flip
val ras_update = Valid(new RASUpdate).flip
}
2016-07-15 02:10:27 +02:00
val idxs = Reg(Vec(entries, UInt(width=matchBits - log2Up(coreInstBytes))))
val idxPages = Reg(Vec(entries, UInt(width=log2Up(nPages))))
2016-07-15 02:10:27 +02:00
val tgts = Reg(Vec(entries, UInt(width=matchBits - log2Up(coreInstBytes))))
val tgtPages = Reg(Vec(entries, UInt(width=log2Up(nPages))))
2016-07-15 02:10:27 +02:00
val pages = Reg(Vec(nPages, UInt(width=vaddrBits - matchBits)))
2016-07-07 00:54:33 +02:00
val pageValid = Reg(init = UInt(0, nPages))
2014-04-02 00:01:27 +02:00
val isValid = Reg(init = UInt(0, entries))
val isReturn = Reg(UInt(width = entries))
val isJump = Reg(UInt(width = entries))
2016-07-15 02:10:27 +02:00
val brIdx = Reg(Vec(entries, UInt(width=log2Up(fetchWidth))))
private def page(addr: UInt) = addr >> matchBits
private def pageMatch(addr: UInt) = {
val p = page(addr)
pageValid & pages.map(_ === p).asUInt
}
private def idxMatch(addr: UInt) = {
val idx = addr(matchBits-1, log2Up(coreInstBytes))
idxs.map(_ === idx).asUInt & isValid
}
val r_btb_update = Pipe(io.btb_update)
val update_target = io.req.bits.addr
2014-04-02 00:01:27 +02:00
val pageHit = pageMatch(io.req.bits.addr)
val idxHit = idxMatch(io.req.bits.addr)
val updatePageHit = pageMatch(r_btb_update.bits.pc)
val (updateHit, updateHitAddr) =
if (updatesOutOfOrder) {
val updateHits = (pageHit << 1)(Mux1H(idxMatch(r_btb_update.bits.pc), idxPages))
(updateHits.orR, OHToUInt(updateHits))
} else (r_btb_update.bits.prediction.valid, r_btb_update.bits.prediction.bits.entry)
2014-04-02 00:01:27 +02:00
val useUpdatePageHit = updatePageHit.orR
val usePageHit = pageHit.orR
val doIdxPageRepl = !useUpdatePageHit
2016-07-07 00:54:33 +02:00
val nextPageRepl = Reg(UInt(width = log2Ceil(nPages)))
val idxPageRepl = Cat(pageHit(nPages-2,0), pageHit(nPages-1)) | Mux(usePageHit, UInt(0), UIntToOH(nextPageRepl))
val idxPageUpdateOH = Mux(useUpdatePageHit, updatePageHit, idxPageRepl)
2014-04-08 00:58:49 +02:00
val idxPageUpdate = OHToUInt(idxPageUpdateOH)
val idxPageReplEn = Mux(doIdxPageRepl, idxPageRepl, UInt(0))
val samePage = page(r_btb_update.bits.pc) === page(update_target)
val doTgtPageRepl = !samePage && !usePageHit
val tgtPageRepl = Mux(samePage, idxPageUpdateOH, Cat(idxPageUpdateOH(nPages-2,0), idxPageUpdateOH(nPages-1)))
val tgtPageUpdate = OHToUInt(pageHit | Mux(usePageHit, UInt(0), tgtPageRepl))
val tgtPageReplEn = Mux(doTgtPageRepl, tgtPageRepl, UInt(0))
when (r_btb_update.valid && (doIdxPageRepl || doTgtPageRepl)) {
val both = doIdxPageRepl && doTgtPageRepl
val next = nextPageRepl + Mux[UInt](both, 2, 1)
nextPageRepl := Mux(next >= nPages, next(0), next)
}
2014-04-08 00:58:49 +02:00
when (r_btb_update.valid) {
val nextRepl = Counter(r_btb_update.valid && !updateHit, entries)._1
val waddr = Mux(updateHit, updateHitAddr, nextRepl)
2016-07-07 00:54:33 +02:00
val mask = UIntToOH(waddr)
idxs(waddr) := r_btb_update.bits.pc(matchBits-1, log2Up(coreInstBytes))
2016-07-15 02:10:27 +02:00
tgts(waddr) := update_target(matchBits-1, log2Up(coreInstBytes))
idxPages(waddr) := idxPageUpdate +& 1 // the +1 corresponds to the <<1 on io.resp.valid
tgtPages(waddr) := tgtPageUpdate
isValid := Mux(r_btb_update.bits.isValid, isValid | mask, isValid & ~mask)
isReturn := Mux(r_btb_update.bits.isReturn, isReturn | mask, isReturn & ~mask)
isJump := Mux(r_btb_update.bits.isJump, isJump | mask, isJump & ~mask)
2016-07-05 08:43:25 +02:00
if (fetchWidth > 1)
2015-10-06 06:48:05 +02:00
brIdx(waddr) := r_btb_update.bits.br_pc >> log2Up(coreInstBytes)
require(nPages % 2 == 0)
val idxWritesEven = !idxPageUpdate(0)
def writeBank(i: Int, mod: Int, en: UInt, data: UInt) =
for (i <- i until nPages by mod)
when (en(i)) { pages(i) := data }
2014-05-19 04:25:43 +02:00
writeBank(0, 2, Mux(idxWritesEven, idxPageReplEn, tgtPageReplEn),
Mux(idxWritesEven, page(r_btb_update.bits.pc), page(update_target)))
writeBank(1, 2, Mux(idxWritesEven, tgtPageReplEn, idxPageReplEn),
Mux(idxWritesEven, page(update_target), page(r_btb_update.bits.pc)))
2016-07-07 00:54:33 +02:00
pageValid := pageValid | tgtPageReplEn | idxPageReplEn
}
io.resp.valid := (pageHit << 1)(Mux1H(idxHit, idxPages))
io.resp.bits.taken := true
io.resp.bits.target := Cat(pages(Mux1H(idxHit, tgtPages)), Mux1H(idxHit, tgts) << log2Up(coreInstBytes))
io.resp.bits.entry := OHToUInt(idxHit)
io.resp.bits.bridx := (if (fetchWidth > 1) Mux1H(idxHit, brIdx) else UInt(0))
2016-07-05 08:43:25 +02:00
io.resp.bits.mask := Cat((UInt(1) << ~Mux(io.resp.bits.taken, ~io.resp.bits.bridx, UInt(0)))-1, UInt(1))
2014-04-08 00:58:49 +02:00
// if multiple entries for same PC land in BTB, zap them
when (PopCountAtLeast(idxHit, 2)) {
isValid := isValid & ~idxHit
}
if (nBHT > 0) {
2014-10-04 01:08:08 +02:00
val bht = new BHT(nBHT)
val isBranch = !(idxHit & isJump).orR
2015-07-30 00:03:01 +02:00
val res = bht.get(io.req.bits.addr, io.req.valid && io.resp.valid && isBranch)
val update_btb_hit = io.bht_update.bits.prediction.valid
when (io.bht_update.valid && update_btb_hit) {
bht.update(io.bht_update.bits.pc, io.bht_update.bits.prediction.bits.bht, io.bht_update.bits.taken, io.bht_update.bits.mispredict)
}
2015-07-30 00:03:01 +02:00
when (!res.value(0) && isBranch) { io.resp.bits.taken := false }
2014-04-08 00:58:49 +02:00
io.resp.bits.bht := res
}
2014-04-02 00:01:27 +02:00
if (nRAS > 0) {
val ras = new RAS(nRAS)
val doPeek = (idxHit & isReturn).orR
2014-04-08 08:47:53 +02:00
when (!ras.isEmpty && doPeek) {
2014-04-08 00:58:49 +02:00
io.resp.bits.target := ras.peek
2014-04-02 00:01:27 +02:00
}
when (io.ras_update.valid) {
when (io.ras_update.bits.isCall) {
ras.push(io.ras_update.bits.returnAddr)
2014-04-08 08:47:53 +02:00
when (doPeek) {
io.resp.bits.target := io.ras_update.bits.returnAddr
2014-04-08 08:47:53 +02:00
}
}.elsewhen (io.ras_update.bits.isReturn && io.ras_update.bits.prediction.valid) {
ras.pop()
2014-04-08 00:58:49 +02:00
}
2014-04-02 00:01:27 +02:00
}
}
}