Don't examine a packet's xcpt signal if it might be invalid. In this case,
the correct fix is to not examine xcpt at all; the deleted code was vestigial.
(Note, the other use of xcpt(j+1) in this code is indeed safe.)
The following sequence would drop the first store when eccBytes=4:
sb x0, 0(t0)
nop
sb x0, 4(t0)
nop
sb x0, 1(t0)
Because the first and second store are to different ECC granules, the
hazard check correctly allowed the second one to proceed, but the third
was merged with the second, even though it conflicted with the first.
So, don't allow the third to be merged with the second, since the second
stored to a different ECC granule.
We were using it to compute the next PC on flush vs. replay (which require
PC+4 and PC, respectively). This fix gets rid of the adder altogether by
reusing the M-stage PC in the flush case, which by construction holds PC+4.
That effectively increased the miss latency by 5 cycles when there was
a store hit followed by a load miss. Since pending stores are drained
when releaseInFlight, the check I removed was redundant.
When we first implemented TL, we thought this was helpful, because
it made WidthWidgets stateless in all cases. However, it put too
much burden on all other masters and slaves, none of which benefitted
from this signal. Furthermore, even with addr_lo, WidthWidgets were
information lossy because when they widen, they have no information
about what to fill in the new high bits of addr_lo.
We are overloading the BTB-hit signal to mean that any part of the frontend
changed the control-flow, not just the BTB. That's the right thing to do for
most of the control logic, but it means the BTB sometimes won't get refilled
when we'd like it to. This commit makes the frontend use an invalid BTB entry
number when it, rather than the BTB, changes the control flow. Since the
entry number is invalid, the BTB will treat it as a miss and refill itself.
This is kind of a hack, but a more palatable fix requires reworking the RVC
IBuf, which I don't have time for right now.