- Updated in fetch speculatively.
* Updates gated off by cpu.resp.fire().
* BTB direction factored into history update.
- All branches update the BHT.
- Each instruction carries history; index into BHT is recomputed by
passing in mem_reg_pc.
BTB entries reference a small number of unique pages, so we separate the
storage of pages from indices. This makes much larger BTBs feasible. It's
easy to exacerbate cycle time this way, so one-hot encoding is used as needed.