1
0
Commit Graph

388 Commits

Author SHA1 Message Date
Andrew Waterman
735701382f Mitigate some I$ response valid critical paths 2017-08-01 18:51:17 -07:00
Andrew Waterman
2ecea2ef60 Don't use a pipe queue on D$ TL A-channel
This cuts an I$->D$ path.
2017-08-01 15:17:07 -07:00
Andrew Waterman
5681693ccc Fix a D$ ready-valid signaling regression
I broke this in 66d06460fa.
2017-07-31 18:05:14 -07:00
Yunsup Lee
7adfd5c431 Merge pull request #906 from freechipsproject/critical-paths
Mitigate I$->D$->I$ critical path
2017-07-31 16:14:11 -07:00
Henry Cook
11332c1226 dcache: break potential combinatorial loop by making pstore_drain_on_miss more conservative 2017-07-31 14:03:30 -07:00
Andrew Waterman
d811692c3b Mitigate I$->D$->I$ critical path
This seemingly irrelevant change shaves several gate delays off the I$
tl.a.valid path.
2017-07-31 01:43:04 -07:00
Andrew Waterman
ac4339a8e7 Pass D$ backpressure to D-channel, rather than asserting 2017-07-29 11:48:36 -07:00
Andrew Waterman
edcd2c696c Avoid needless stall on E-channel back pressure 2017-07-29 11:47:58 -07:00
Andrew Waterman
fdb8935712 Improve fidelity of two perf counters 2017-07-28 13:14:04 -07:00
Andrew Waterman
4c82f6b77e Don't refill BTB on not-taken branches 2017-07-28 13:13:52 -07:00
Andrew Waterman
2e8b02e780 Merge D$ store hits when ECC is enabled
This avoids pipeline flushes due to subword WAW hazards, as with
consecutive byte stores.
2017-07-28 12:56:36 -07:00
Andrew Waterman
838864870e Bypass TLB refill signal to halve L2 TLB hit time
The 4-cycle hit time is 1 cycle too long to avoid a second
pipeline replay, so it was effectively 9 cycles instead of 4.
2017-07-28 12:56:36 -07:00
Andrew Waterman
ae1f7a95f6 Don't nack misses when there's a pending store
That effectively increased the miss latency by 5 cycles when there was
a store hit followed by a load miss.  Since pending stores are drained
when releaseInFlight, the check I removed was redundant.
2017-07-28 12:56:36 -07:00
Wesley W. Terpstra
9804bdc34e tilelink: remove obsolete addr_lo signal (#895)
When we first implemented TL, we thought this was helpful, because
it made WidthWidgets stateless in all cases. However, it put too
much burden on all other masters and slaves, none of which benefitted
from this signal. Furthermore, even with addr_lo, WidthWidgets were
information lossy because when they widen, they have no information
about what to fill in the new high bits of addr_lo.
2017-07-26 16:01:21 -07:00
Andrew Waterman
5a5b78b15e Improve L2 TLB coding style 2017-07-26 02:22:43 -07:00
Andrew Waterman
5a9c673f41 Fix L2 TLB response bug
Sometimes, it would inform the L1 TLB that the translation was for
a superpage, even though that's never the case.
2017-07-26 02:20:41 -07:00
Andrew Waterman
acca0fccf5 Fix BTB not being refilled on some indirect jumps
We are overloading the BTB-hit signal to mean that any part of the frontend
changed the control-flow, not just the BTB.  That's the right thing to do for
most of the control logic, but it means the BTB sometimes won't get refilled
when we'd like it to.  This commit makes the frontend use an invalid BTB entry
number when it, rather than the BTB, changes the control flow.  Since the
entry number is invalid, the BTB will treat it as a miss and refill itself.

This is kind of a hack, but a more palatable fix requires reworking the RVC
IBuf, which I don't have time for right now.
2017-07-26 02:13:43 -07:00
Andrew Waterman
15878d4691 Perform some control-flow transfers within the Frontend 2017-07-25 15:19:16 -07:00
Andrew Waterman
62c4080585 Add RVC instruction patterns 2017-07-25 15:19:16 -07:00
Andrew Waterman
66d06460fa Add option for acquire-before-release 2017-07-25 15:19:16 -07:00
Andrew Waterman
86ccd935fc Add method to print perf events 2017-07-25 15:19:16 -07:00
Andrew Waterman
5df8f0d1ea Add L2 TLB miss counter 2017-07-25 15:19:16 -07:00
Henry Cook
01ca3efc2b Combine Coreplex and System Module Hierarchies (#875)
* coreplex collapse: peripherals now in coreplex

* coreplex: better factoring of TLBusWrapper attachement points

* diplomacy: allow monitorless :*= and :=*

* rocket: don't connect monitors to tile tim slave ports

* rename chip package to system

* coreplex: only sbus has a splitter

* TLFragmenter: Continuing my spot battles on requires without explanatory strings

* pbus: toFixedWidthSingleBeatSlave

* tilelink: more verbose requires

* use the new system package for regression

* sbus: add more explicit FIFO attachment points

* delete leftover top-level utils

* cleanup ResetVector and RTC
2017-07-23 08:31:04 -07:00
Wesley W. Terpstra
4eface8a9e rocket: do not require FIFO order for memory-like regions 2017-07-12 17:39:00 -07:00
Henry Cook
4c595d175c Refactor package hierarchy and remove legacy bus protocol implementations (#845)
* Refactors package hierarchy.

Additionally:
  - Removes legacy ground tests and configs
  - Removes legacy bus protocol implementations
  - Removes NTiles
  - Adds devices package
  - Adds more functions to util package
2017-07-07 10:48:16 -07:00
Andrew Waterman
a0cbc376b4 Merge pull request #849 from freechipsproject/l2-tlb
L1 memory system improvements
2017-07-06 13:03:06 -07:00
Andrew Waterman
e1cc0a0a0e Mask debug interrupts similarly to other interrupts (#847)
This makes single-step exceptions higher-priority than debug interrupts.
2017-07-06 12:03:24 -07:00
Andrew Waterman
b2351c5fbf Use consistent casing 2017-07-06 11:16:56 -07:00
Andrew Waterman
be4eceec0d Fix stupid D$ probe bug 2017-07-06 01:20:47 -07:00
Andrew Waterman
90a7d6a343 Add L2 TLB option 2017-07-06 01:19:18 -07:00
Andrew Waterman
438abc76d2 Handle TL errors in L1 I$
Cache the error bit in the tag array; report precisely on access.
2017-07-06 01:02:11 -07:00
Andrew Waterman
0ef45fac9b Add tag ECC to D$ 2017-07-03 18:16:37 -07:00
Andrew Waterman
e9752f76ae Improve probe state machine
- Reduce reliance on s2_prb_ack_data due to future ECC changes
- Shave a cycle off valid, but clean, probes
- Code cleanup
2017-07-03 16:25:04 -07:00
Richard Xia
5b46350bc3 Make sure that DCache s2_xcpt data scratchpad case is assigned to after initial assignment. 2017-06-30 17:44:16 -07:00
Wesley W. Terpstra
5edc4546e3 rocket: add dtim and itim refs to cpus 2017-06-28 23:10:58 -07:00
Wesley W. Terpstra
7d6f8d48f2 Revert "rocket: link dtim to its cpu"
This reverts commit e6c2d446cc.
2017-06-28 23:10:57 -07:00
Wesley W. Terpstra
fbcd6f0eb2 Revert "rocket: link itim to its cpu"
This reverts commit 48390ed604.
2017-06-28 23:10:57 -07:00
Henry Cook
6e5a4c687f diplomacy: a type of connect that always disables monitors (#828) 2017-06-28 21:48:10 -07:00
Megan Wachs
992b480c74 Merge pull request #825 from freechipsproject/debug_wfi
Debug + WFI Interactions
2017-06-28 21:28:51 -07:00
Wesley W. Terpstra
ca3030cba3 dcache: fix a gender inversion bug introduced in #826 2017-06-28 15:38:53 -07:00
Wesley W. Terpstra
48390ed604 rocket: link itim to its cpu 2017-06-28 15:06:19 -07:00
Wesley W. Terpstra
e6c2d446cc rocket: link dtim to its cpu 2017-06-28 15:06:19 -07:00
Wesley W. Terpstra
3f6d5110cd rocket: dtim is not a dcache 2017-06-28 15:06:19 -07:00
Wesley W. Terpstra
84dc23c215 devices: add reg-names to most devices 2017-06-28 15:06:16 -07:00
Wesley W. Terpstra
852f03282f rocket: give itim and dtim a compatible field for drivers to match 2017-06-28 14:26:55 -07:00
Andrew Waterman
b9a934ae28 Support eccBytes > 1 2017-06-28 02:09:18 -07:00
Andrew Waterman
8e4be40efc Propagate wb_reg_rs2 for sfence ASID
This would have been a bug if we supported ASIDs.
2017-06-28 02:09:18 -07:00
Andrew Waterman
2077e4190b Make log more sensible for long-latency operations
Show only one write to the destination register, not two.
2017-06-28 02:09:18 -07:00
Andrew Waterman
6f8fdff762 Basic L1 D$ ECC support
Only supports ECC on data, not tags; only supports byte granularity.
2017-06-28 02:09:18 -07:00
Andrew Waterman
6100600179 Minor D$ code cleanup 2017-06-28 02:09:18 -07:00
Andrew Waterman
3e04a99f61 Refactor frontend exception passing
Bundle them, and leverage regularity, so that if we have to add more
exceptions in the future, we don't need to change so much code.
2017-06-28 02:09:18 -07:00
Andrew Waterman
cc2f87c214 Forbid S-mode execution from user memory
285c81746f
2017-06-28 02:09:18 -07:00
Andrew Waterman
8aa16a11f3 Reduce D$ access energy when refill width > access width 2017-06-28 02:09:18 -07:00
Andrew Waterman
25f585f2a9 Remove unused signal from TLB interface 2017-06-28 02:09:18 -07:00
Andrew Waterman
d5f80df0ae Allow speculative I$ refill to cacheable regions
Backpedaling on 27b143013f.  Shaving
four cycles off of I$ miss penalty is obviously worth the HW cost.
2017-06-28 02:09:18 -07:00
Megan Wachs
e1fe0f245b debug: Don't reset debugint register, as none of the interrupt registers are. 2017-06-27 14:10:13 -07:00
Megan Wachs
136e4b6c27 debug: use consistent coding style (Reg(init ... ) vs RegInit) 2017-06-27 13:42:38 -07:00
Megan Wachs
3b9550ede3 debug: correctly declare reg_debugint 2017-06-27 13:42:38 -07:00
Megan Wachs
56839b2c62 debug: Remove DebugInterrupt from DCSR (it is no longer part of V13 spec) 2017-06-27 13:42:38 -07:00
Megan Wachs
665c2a349c Correct Debug + WFI Interactions
1) Debug interrupt should end WFI
2) WFI should end immedately during single-step
3) WFI should act like NOP during Debug Mode
2017-06-27 13:42:38 -07:00
Zihao Yu
c9cfe46604 rocket,Rocket: fix type mismatch (#819) 2017-06-27 11:22:08 -07:00
Colin Schmidt
aced18b3bb Move RoCC interface to Diplomacy and TL2 (#807)
* Move RoCC interface to Diplomacy and TL2

* guard rocc arbiter to prevent zero-width wires
2017-06-22 12:07:09 -07:00
Henry Cook
5552f23294 tims: explictly name them for generated address map 2017-06-20 17:18:29 -07:00
Henry Cook
6b79842e66 dcache: just left shift by untagbits to get tag
Always safe because of the requirement on coreplex/RocketTiles.scala:126
2017-06-20 16:35:28 -07:00
Andrew Waterman
f396b4142d Merge pull request #806 from freechipsproject/mulh
Improve integer mul/div
2017-06-20 13:01:16 -07:00
Colin Schmidt
675f183dd2 refactor ICache to be reusable by other frontends (#808)
* refactor ICache to be reusable by other frontends

specifically one that would like to change the fetch width and number of
bytes in an instruction
2017-06-20 08:21:01 -07:00
Andrew Waterman
a6d9884cc0 Improve integer mul/div
- Signed integer multiplication latency is now deterministic (before,
it would take an extra cycle if the multiplier was negative).
- High-part multiplication is now one cycle faster.
- RV64 MULW now takes half as many cycles as MUL.
- Positive remainders are now one cycle faster.
2017-06-19 12:09:21 -07:00
Richard Xia
61c39da475 Check for rvc before declaring illegal instruction after an ebreak. 2017-06-16 10:49:36 -07:00
Andrew Waterman
8552c77972 Fix I$ reset regression FU-357
Can't rely on s2 TLB response, so mask using s2_valid.
2017-06-09 00:48:24 -07:00
Andrew Waterman
16ecbdd5b2 Reduce fanout on critical I$ miss signal 2017-06-02 20:45:50 -07:00
Andrew Waterman
27b143013f Improve ITLB QoR
- No need to check cacheability
- Remove a gate delay from PMP path
2017-06-02 20:45:50 -07:00
Wesley W. Terpstra
80c63c0da6 rocket: include hartid in cache master names 2017-06-02 15:52:23 -07:00
Wesley W. Terpstra
d25ad10592 diplomacy: require masters to have a name 2017-06-02 15:52:20 -07:00
Wesley W. Terpstra
0fe625c52f diplomacy: improve PMA circuit QoR 2017-06-01 15:30:20 -07:00
Jacob Chang
e3e77d68e6 PTW now does not require atomic memory operations, so take out the requirement (#767)
Bug fix in CSR which manifest itself when compiling a config with no extension
2017-05-26 13:11:15 -07:00
Andrew Waterman
dbc5e7c494 Add TLB miss performance counters (#762) 2017-05-23 12:52:25 -07:00
Andrew Waterman
b2b4c1abcd Separate tag ECC and data ECC options (#761) 2017-05-23 12:51:48 -07:00
Henry Cook
a19fc2549e tile: add tileBus xbar 2017-05-16 16:12:01 -07:00
Wesley W. Terpstra
18725a05b0 DTS tweaks (#740)
* rocket: do not report 's' in isa string

* rocket: report the micro-architecture of the core
2017-05-12 05:32:57 -07:00
Andrew Waterman
7eefc12705 Support vectored stvec interrupts, too
137812654e
2017-05-07 15:40:08 -07:00
Andrew Waterman
c6135a02df Revert "rocket: hard-wire UXL/SXL fields to 0"
This reverts commit ea0714bfcb.

We've waffled on this matter in the priv spec: 326bec83de
2017-05-07 15:23:21 -07:00
Andrew Waterman
dd1546fd69 Check PPN LSBs for superpage PTEs
5a32fe8782
2017-05-05 15:30:09 -07:00
Scott Johnson
1b3b228790 ITIM supports PutPartial 2017-05-04 00:57:52 -07:00
Andrew Waterman
398600d4da Interlock to prevent ITIM hazard when tl.a.valid & tl.d.valid & !tl.d.ready 2017-05-04 00:57:29 -07:00
Andrew Waterman
fa6ecdf813 Fix RVC/uncacheable instruction memory performance bug
9c1d126965 was an incomplete fix, so
sometimes we were requesting pipeline replays when they weren't
necessary.
2017-05-03 17:52:06 -07:00
Megan Wachs
f8c92d2669 Merge branch 'master' into pipeline-mmio 2017-05-03 08:37:12 -07:00
Andrew Waterman
4efcb5a139 Increase frontend decoupling (#722)
Reduce pathological RVC stalls
2017-05-03 07:54:46 -07:00
Henry Cook
4dd3345db2 Merge branch 'master' into pipeline-mmio 2017-05-02 16:23:26 -07:00
Andrew Waterman
3a1a37d41b Support PutPartial in ScratchpadSlavePort 2017-05-02 03:07:02 -07:00
Andrew Waterman
f8151ce786 Remove subword load muxing in ScratchpadSlavePort 2017-05-02 00:14:46 -07:00
Andrew Waterman
f49172b5bc ScratchpadSlavePort doesn't support byte/halfword atomics 2017-05-02 00:14:46 -07:00
Wesley W. Terpstra
3d06f01a2c rocket: turn on early ack for ITIM 2017-05-01 22:53:41 -07:00
Wesley W. Terpstra
30f1f1e7c7 rocket: turn on early ack for DTIM 2017-05-01 22:53:41 -07:00
Andrew Waterman
d6e69066a5 Fix ITIM loads (#716)
An incorrectly-set ready signal caused bad data to be read from the RAM.
2017-05-01 17:41:25 -07:00
Andrew Waterman
dd85d7e0a0 I$: Don't raise io.resp.valid if io.s1_kill was high previous cycle
@solomatnikov found the bug.  It doesn't manifest in Rocket because the
Frontend masks io.resp.valid with s2_valid.
2017-04-28 16:44:58 -07:00
Megan Wachs
d67738204f Interrupts: Less Pessimistic Synchronization (#714)
* interrupts: Less pessimistic synchronization for the different interrupt types. There are some issues with the interrupt number assignments.

* interrupts: Allow an option to NOT synchronize all the external interrupts coming into PLIC

* interrupts: ExampleRocketChipTop uses PeripheryAsyncExtInterrupts. Realized 'abstract' doesn't do what I thought in Scala.

* interrupts: use consistent async/periph/core ordering

* interrupts: Properly condition on 0 External interrupts

* interrupts: CLINT is also synchronous to periph clock
2017-04-28 14:49:24 -07:00
Andrew Waterman
7416f2a17e Unbreak groundtest 2017-04-28 02:10:33 -07:00
Andrew Waterman
8fd5ecdff8 Set io.cpu.resp.bits.addr for MMIO loads without affecting QoR 2017-04-27 19:50:38 -07:00
Henry Cook
3d0ed80ef6 new parameters ResetVectorBits, MaxHartIdBits, and MaxPriorityLevels 2017-04-27 18:17:31 -07:00
Andrew Waterman
99de42d34c Swap order of ITIM WidthWidget and Fragmenter
e99fa057ac accidentally reversed them
2017-04-27 15:30:02 -07:00