1
0
Commit Graph

467 Commits

Author SHA1 Message Date
Andrew Waterman
82e13443b2 Merge pull request #937 from freechipsproject/critical-paths
Perform tag error detectoin/correction in same cycle as RAM
2017-08-08 15:03:28 -07:00
Andrew Waterman
7935c61c19 Don't report to the DTIM that data is cacheable
Otherwise, it will attempt to perform AMOs where they're unsupported!
2017-08-08 11:55:04 -07:00
Andrew Waterman
74d309c18e Make I vs. D a static property of TLB, not an input pin
The microarchitecture doesn't really support unified TLBs, so don't fake it.
2017-08-08 11:54:47 -07:00
Andrew Waterman
e92981b0bd DRY 2017-08-08 11:46:38 -07:00
Andrew Waterman
62ccba304c Perform tag error detectoin/correction in same cycle as RAM
The tag RAMs tend to be fast, so take up some of the slack.
This makes s2_nack faster.
2017-08-08 10:21:30 -07:00
Palmer Dabbelt
6d1d285464 Merge pull request #933 from freechipsproject/cinst
Print out the compressed instruction when executing one
2017-08-07 21:40:10 -07:00
Andrew Waterman
402907990c Revert "Remove one gate from D$ ECC check"
This reverts commit 7d94074b05, which
works fine with optimistic behavioral RAMs but not real ones.
2017-08-07 17:33:20 -07:00
Palmer Dabbelt
fc0d5fcf98 Print out the compressed instruction when executing one 2017-08-07 17:21:53 -07:00
Andrew Waterman
658e36f98b Reduce fanout on frontend io.cpu.req.valid signal 2017-08-06 17:38:51 -07:00
Andrew Waterman
7d94074b05 Remove one gate from D$ ECC check
The D$ corrects via writeback, so which word the error was in doesn't
matter, as the entire line is corrected.
2017-08-06 17:36:53 -07:00
Andrew Waterman
83875e3a0c Only flush D$ on FENCE.I if it won't always be probed on I$ miss 2017-08-05 14:22:40 -07:00
Andrew Waterman
991e16de92 Remove probe address mux from TLB response path 2017-08-05 12:57:38 -07:00
Andrew Waterman
b9b4142bb4 Get s2_nack off the critical path
We were using it to compute the next PC on flush vs. replay (which require
PC+4 and PC, respectively).  This fix gets rid of the adder altogether by
reusing the M-stage PC in the flush case, which by construction holds PC+4.
2017-08-05 00:30:36 -07:00
Andrew Waterman
6112adfbb0 Get L2 TLB tag/parity check off the D$ arbitration path 2017-08-04 17:01:51 -07:00
Andrew Waterman
8d97684555 Fix L2 TLB perfctr
It was counting conflict misses but not cold misses.
2017-08-04 17:01:31 -07:00
Andrew Waterman
df7f09b9ce Get I$ ECC check further off critical path 2017-08-04 16:59:21 -07:00
Andrew Waterman
4bfbe75d74 Avoid pipeline replays when fetch queue is full 2017-08-04 16:59:21 -07:00
Andrew Waterman
a45997d03f Separate I$ parity error from miss signal
Handle parity errors with a pipeline flush rather than a faster
frontend replay, reducing a critical path.
2017-08-04 16:59:21 -07:00
Andrew Waterman
06a831310b Shave a gate delay off I$ backpressure path
The deleted code was a holdover from Hwacha's vector fences.
2017-08-04 13:12:43 -07:00
Andrew Waterman
ecc2ee366c Shave a few gate delays off IBuf control logic
It takes a while for the pipeline to compute the stall signal, so avoid
using it until the last logic levels in the clock cycle.
2017-08-04 13:12:43 -07:00
Andrew Waterman
7937db0c84 Merge pull request #919 from freechipsproject/imiss-perf-counter
Fix I$ miss perfctr
2017-08-04 01:04:23 -07:00
Andrew Waterman
ba4eecc0f0 Use UIntToOH1 (#921)
Closes #920
2017-08-03 14:55:39 -07:00
Andrew Waterman
f483bab4aa Fix I$ miss perfctr
The old version was counting prefetches, too.
2017-08-03 00:52:12 -07:00
Andrew Waterman
1be1433f04 Merge pull request #918 from freechipsproject/icache-prefetch
Icache prefetch
2017-08-02 21:22:20 -07:00
Andrew Waterman
2537d0d54e Optionally prefetch next I$ line into L2$ on miss 2017-08-02 17:10:56 -07:00
Andrew Waterman
744cdb2f72 Make TLB report when it's safe to prefetch within a page 2017-08-02 17:09:38 -07:00
Andrew Waterman
7d2dd3769f Optimize a hazard check critical path 2017-08-02 14:27:25 -07:00
Andrew Waterman
2eb239d03f Add option to retime D$ way mux into subsequent pipeline stage 2017-08-01 23:59:20 -07:00
Andrew Waterman
9464c6db40 Mitigate(?) frontend critical path 2017-08-01 18:51:17 -07:00
Andrew Waterman
735701382f Mitigate some I$ response valid critical paths 2017-08-01 18:51:17 -07:00
Andrew Waterman
2ecea2ef60 Don't use a pipe queue on D$ TL A-channel
This cuts an I$->D$ path.
2017-08-01 15:17:07 -07:00
Andrew Waterman
5681693ccc Fix a D$ ready-valid signaling regression
I broke this in 66d06460fa.
2017-07-31 18:05:14 -07:00
Yunsup Lee
7adfd5c431 Merge pull request #906 from freechipsproject/critical-paths
Mitigate I$->D$->I$ critical path
2017-07-31 16:14:11 -07:00
Henry Cook
11332c1226 dcache: break potential combinatorial loop by making pstore_drain_on_miss more conservative 2017-07-31 14:03:30 -07:00
Andrew Waterman
d811692c3b Mitigate I$->D$->I$ critical path
This seemingly irrelevant change shaves several gate delays off the I$
tl.a.valid path.
2017-07-31 01:43:04 -07:00
Andrew Waterman
ac4339a8e7 Pass D$ backpressure to D-channel, rather than asserting 2017-07-29 11:48:36 -07:00
Andrew Waterman
edcd2c696c Avoid needless stall on E-channel back pressure 2017-07-29 11:47:58 -07:00
Andrew Waterman
fdb8935712 Improve fidelity of two perf counters 2017-07-28 13:14:04 -07:00
Andrew Waterman
4c82f6b77e Don't refill BTB on not-taken branches 2017-07-28 13:13:52 -07:00
Andrew Waterman
2e8b02e780 Merge D$ store hits when ECC is enabled
This avoids pipeline flushes due to subword WAW hazards, as with
consecutive byte stores.
2017-07-28 12:56:36 -07:00
Andrew Waterman
838864870e Bypass TLB refill signal to halve L2 TLB hit time
The 4-cycle hit time is 1 cycle too long to avoid a second
pipeline replay, so it was effectively 9 cycles instead of 4.
2017-07-28 12:56:36 -07:00
Andrew Waterman
ae1f7a95f6 Don't nack misses when there's a pending store
That effectively increased the miss latency by 5 cycles when there was
a store hit followed by a load miss.  Since pending stores are drained
when releaseInFlight, the check I removed was redundant.
2017-07-28 12:56:36 -07:00
Wesley W. Terpstra
9804bdc34e tilelink: remove obsolete addr_lo signal (#895)
When we first implemented TL, we thought this was helpful, because
it made WidthWidgets stateless in all cases. However, it put too
much burden on all other masters and slaves, none of which benefitted
from this signal. Furthermore, even with addr_lo, WidthWidgets were
information lossy because when they widen, they have no information
about what to fill in the new high bits of addr_lo.
2017-07-26 16:01:21 -07:00
Andrew Waterman
5a5b78b15e Improve L2 TLB coding style 2017-07-26 02:22:43 -07:00
Andrew Waterman
5a9c673f41 Fix L2 TLB response bug
Sometimes, it would inform the L1 TLB that the translation was for
a superpage, even though that's never the case.
2017-07-26 02:20:41 -07:00
Andrew Waterman
acca0fccf5 Fix BTB not being refilled on some indirect jumps
We are overloading the BTB-hit signal to mean that any part of the frontend
changed the control-flow, not just the BTB.  That's the right thing to do for
most of the control logic, but it means the BTB sometimes won't get refilled
when we'd like it to.  This commit makes the frontend use an invalid BTB entry
number when it, rather than the BTB, changes the control flow.  Since the
entry number is invalid, the BTB will treat it as a miss and refill itself.

This is kind of a hack, but a more palatable fix requires reworking the RVC
IBuf, which I don't have time for right now.
2017-07-26 02:13:43 -07:00
Andrew Waterman
15878d4691 Perform some control-flow transfers within the Frontend 2017-07-25 15:19:16 -07:00
Andrew Waterman
62c4080585 Add RVC instruction patterns 2017-07-25 15:19:16 -07:00
Andrew Waterman
66d06460fa Add option for acquire-before-release 2017-07-25 15:19:16 -07:00
Andrew Waterman
86ccd935fc Add method to print perf events 2017-07-25 15:19:16 -07:00