AMO.aq should be implemented as AMO;FENCE, whereas AMO.rl should be
implemented as FENCE;AMO. These had been swapped. This error does
not affect cacheable accesses using the blocking D$, nor does it
affect accesses to the data scratchpad, nor does it affect accesses
to strongly ordered I/O regions (which is the default).
Cacheable accesses using the nonblocking D$ and accesses to weakly
ordered I/O regions may manifest memory-ordering violations. For
these accesses, the workaround is to use AMO.aqrl whenever AMO.aq
or AMO.rl had been used.
This proposal hasn't been adopted yet, but anything is better than the
current implementation, where clearing misa.C when the PC is misaligned
is effectively undefined.
In Rocket, debug triggers are supposed to happen before a store
occurs, rather than after. Previously, we reported the exception
on the store's PC, but the store occurred anyway. This probably
hasn't been problematic in practice because most stores are
idempotent.
The TL port can easily starve the processor, even at only 20% utilization,
because of a bad interaction with the pipeline. Giving the processor
static priority is OK in practice, since <50% of instructions are loads
and stores in typical workloads. Even if it executes 100% loads and stores,
it must eventually encounter an I$ miss, taken branch, or exception, so
even malicious code can't permanently starve the TL port.
Prior to this PR, the error device was allowed to be cached by
multiple actors despite never probing any of them. This is a
pretty unusual set of properties that has caused us trouble
several times now in the past.
Let's instead put the Error device into one of two very well
established categories: a straight-up MMIO device or a tracked
memory region.