7935c61c19
Don't report to the DTIM that data is cacheable
...
Otherwise, it will attempt to perform AMOs where they're unsupported!
2017-08-08 11:55:04 -07:00
74d309c18e
Make I vs. D a static property of TLB, not an input pin
...
The microarchitecture doesn't really support unified TLBs, so don't fake it.
2017-08-08 11:54:47 -07:00
e92981b0bd
DRY
2017-08-08 11:46:38 -07:00
62ccba304c
Perform tag error detectoin/correction in same cycle as RAM
...
The tag RAMs tend to be fast, so take up some of the slack.
This makes s2_nack faster.
2017-08-08 10:21:30 -07:00
6d1d285464
Merge pull request #933 from freechipsproject/cinst
...
Print out the compressed instruction when executing one
2017-08-07 21:40:10 -07:00
402907990c
Revert "Remove one gate from D$ ECC check"
...
This reverts commit 7d94074b05
, which
works fine with optimistic behavioral RAMs but not real ones.
2017-08-07 17:33:20 -07:00
fc0d5fcf98
Print out the compressed instruction when executing one
2017-08-07 17:21:53 -07:00
658e36f98b
Reduce fanout on frontend io.cpu.req.valid signal
2017-08-06 17:38:51 -07:00
7d94074b05
Remove one gate from D$ ECC check
...
The D$ corrects via writeback, so which word the error was in doesn't
matter, as the entire line is corrected.
2017-08-06 17:36:53 -07:00
83875e3a0c
Only flush D$ on FENCE.I if it won't always be probed on I$ miss
2017-08-05 14:22:40 -07:00
991e16de92
Remove probe address mux from TLB response path
2017-08-05 12:57:38 -07:00
b9b4142bb4
Get s2_nack off the critical path
...
We were using it to compute the next PC on flush vs. replay (which require
PC+4 and PC, respectively). This fix gets rid of the adder altogether by
reusing the M-stage PC in the flush case, which by construction holds PC+4.
2017-08-05 00:30:36 -07:00
6112adfbb0
Get L2 TLB tag/parity check off the D$ arbitration path
2017-08-04 17:01:51 -07:00
8d97684555
Fix L2 TLB perfctr
...
It was counting conflict misses but not cold misses.
2017-08-04 17:01:31 -07:00
df7f09b9ce
Get I$ ECC check further off critical path
2017-08-04 16:59:21 -07:00
4bfbe75d74
Avoid pipeline replays when fetch queue is full
2017-08-04 16:59:21 -07:00
a45997d03f
Separate I$ parity error from miss signal
...
Handle parity errors with a pipeline flush rather than a faster
frontend replay, reducing a critical path.
2017-08-04 16:59:21 -07:00
06a831310b
Shave a gate delay off I$ backpressure path
...
The deleted code was a holdover from Hwacha's vector fences.
2017-08-04 13:12:43 -07:00
ecc2ee366c
Shave a few gate delays off IBuf control logic
...
It takes a while for the pipeline to compute the stall signal, so avoid
using it until the last logic levels in the clock cycle.
2017-08-04 13:12:43 -07:00
7937db0c84
Merge pull request #919 from freechipsproject/imiss-perf-counter
...
Fix I$ miss perfctr
2017-08-04 01:04:23 -07:00
ba4eecc0f0
Use UIntToOH1 ( #921 )
...
Closes #920
2017-08-03 14:55:39 -07:00
f483bab4aa
Fix I$ miss perfctr
...
The old version was counting prefetches, too.
2017-08-03 00:52:12 -07:00
1be1433f04
Merge pull request #918 from freechipsproject/icache-prefetch
...
Icache prefetch
2017-08-02 21:22:20 -07:00
2537d0d54e
Optionally prefetch next I$ line into L2$ on miss
2017-08-02 17:10:56 -07:00
744cdb2f72
Make TLB report when it's safe to prefetch within a page
2017-08-02 17:09:38 -07:00
7d2dd3769f
Optimize a hazard check critical path
2017-08-02 14:27:25 -07:00
2eb239d03f
Add option to retime D$ way mux into subsequent pipeline stage
2017-08-01 23:59:20 -07:00
9464c6db40
Mitigate(?) frontend critical path
2017-08-01 18:51:17 -07:00
735701382f
Mitigate some I$ response valid critical paths
2017-08-01 18:51:17 -07:00
2ecea2ef60
Don't use a pipe queue on D$ TL A-channel
...
This cuts an I$->D$ path.
2017-08-01 15:17:07 -07:00
5681693ccc
Fix a D$ ready-valid signaling regression
...
I broke this in 66d06460fa
.
2017-07-31 18:05:14 -07:00
7adfd5c431
Merge pull request #906 from freechipsproject/critical-paths
...
Mitigate I$->D$->I$ critical path
2017-07-31 16:14:11 -07:00
11332c1226
dcache: break potential combinatorial loop by making pstore_drain_on_miss more conservative
2017-07-31 14:03:30 -07:00
d811692c3b
Mitigate I$->D$->I$ critical path
...
This seemingly irrelevant change shaves several gate delays off the I$
tl.a.valid path.
2017-07-31 01:43:04 -07:00
ac4339a8e7
Pass D$ backpressure to D-channel, rather than asserting
2017-07-29 11:48:36 -07:00
edcd2c696c
Avoid needless stall on E-channel back pressure
2017-07-29 11:47:58 -07:00
fdb8935712
Improve fidelity of two perf counters
2017-07-28 13:14:04 -07:00
4c82f6b77e
Don't refill BTB on not-taken branches
2017-07-28 13:13:52 -07:00
2e8b02e780
Merge D$ store hits when ECC is enabled
...
This avoids pipeline flushes due to subword WAW hazards, as with
consecutive byte stores.
2017-07-28 12:56:36 -07:00
838864870e
Bypass TLB refill signal to halve L2 TLB hit time
...
The 4-cycle hit time is 1 cycle too long to avoid a second
pipeline replay, so it was effectively 9 cycles instead of 4.
2017-07-28 12:56:36 -07:00
ae1f7a95f6
Don't nack misses when there's a pending store
...
That effectively increased the miss latency by 5 cycles when there was
a store hit followed by a load miss. Since pending stores are drained
when releaseInFlight, the check I removed was redundant.
2017-07-28 12:56:36 -07:00
9804bdc34e
tilelink: remove obsolete addr_lo signal ( #895 )
...
When we first implemented TL, we thought this was helpful, because
it made WidthWidgets stateless in all cases. However, it put too
much burden on all other masters and slaves, none of which benefitted
from this signal. Furthermore, even with addr_lo, WidthWidgets were
information lossy because when they widen, they have no information
about what to fill in the new high bits of addr_lo.
2017-07-26 16:01:21 -07:00
5a5b78b15e
Improve L2 TLB coding style
2017-07-26 02:22:43 -07:00
5a9c673f41
Fix L2 TLB response bug
...
Sometimes, it would inform the L1 TLB that the translation was for
a superpage, even though that's never the case.
2017-07-26 02:20:41 -07:00
acca0fccf5
Fix BTB not being refilled on some indirect jumps
...
We are overloading the BTB-hit signal to mean that any part of the frontend
changed the control-flow, not just the BTB. That's the right thing to do for
most of the control logic, but it means the BTB sometimes won't get refilled
when we'd like it to. This commit makes the frontend use an invalid BTB entry
number when it, rather than the BTB, changes the control flow. Since the
entry number is invalid, the BTB will treat it as a miss and refill itself.
This is kind of a hack, but a more palatable fix requires reworking the RVC
IBuf, which I don't have time for right now.
2017-07-26 02:13:43 -07:00
15878d4691
Perform some control-flow transfers within the Frontend
2017-07-25 15:19:16 -07:00
62c4080585
Add RVC instruction patterns
2017-07-25 15:19:16 -07:00
66d06460fa
Add option for acquire-before-release
2017-07-25 15:19:16 -07:00
86ccd935fc
Add method to print perf events
2017-07-25 15:19:16 -07:00
5df8f0d1ea
Add L2 TLB miss counter
2017-07-25 15:19:16 -07:00