Andrew Waterman
5d1165c850
Express PMP mask generator using a carry chain
...
This allows it to be optimized like an adder, improving QoR when it
is on the critical path.
2017-03-26 14:20:16 -07:00
Andrew Waterman
bb42f3bf3b
WIP on FPU subword recoding
2017-03-26 14:20:16 -07:00
Wesley W. Terpstra
537274b645
coreplex: move buffers inside the coreplex
...
This should make hierarchical place and route easier.
2017-03-24 22:54:48 -07:00
Yunsup Lee
5bbb75e078
rename l2FrontendBus as fsb, expose bsb
2017-03-24 22:54:48 -07:00
Henry Cook
996a31364a
rocket: remove hard-coded paddrBits ( #610 )
...
Fall back on global variable but check that it is compatible with memory as seen from rocket's tilelink master port.
2017-03-24 22:30:18 -07:00
Wesley W. Terpstra
f36b1766f8
TLROM: use the smallest ROM implementation that works
...
The contents everywhere else are still zero.
2017-03-24 20:40:28 -07:00
Wesley W. Terpstra
ac205ca10a
bootrom: move to 0x10000 for more space (DTB on multicore is big)
2017-03-24 18:18:01 -07:00
Wesley W. Terpstra
34f8ce653a
bootrom: follow SBI (a0=hartid, a1=dtb)
2017-03-24 18:18:01 -07:00
Wesley W. Terpstra
9a2f0d01a1
GenerateBootROM: use compiled DTB
2017-03-24 18:18:01 -07:00
Andrew Waterman
17b1ee3037
Default to 8 PMPs; support 0 PMPs
2017-03-24 16:39:52 -07:00
Andrew Waterman
97006ab396
Don't modulate PMP privilege on passsthrough when !usingVM
2017-03-24 16:39:52 -07:00
Andrew Waterman
3f0d2fe826
Instantiate PTW unconditionally
...
This keeps the PMP datapaths intact. The PTW itself will get optimized
away for the !usingVM case.
2017-03-24 16:39:52 -07:00
Andrew Waterman
30415215b8
Don't check for exceptions on ScratchpadSlavePort accesses
2017-03-24 16:39:52 -07:00
Andrew Waterman
ccd5bc9a91
Improve QoR of PMP homogeneity checker
2017-03-24 16:39:52 -07:00
Andrew Waterman
e9cadf29d2
Improve DCache MMIO QoR
...
No need to store the cmd field. From the perspective of the cache, all
MMIO responses that have data can be treated the same as loads.
2017-03-24 16:39:52 -07:00
Andrew Waterman
fb6498f2c3
Use Reg(Vec) instead of Seq(Reg) for DCache MMIO
2017-03-24 16:39:52 -07:00
Andrew Waterman
0538dc77ce
For D$, use source 0 through N-1 for MMIO, not 1 through N
...
This makes the code a bit cleaner.
2017-03-24 16:39:52 -07:00
Andrew Waterman
3951e57789
Force each TLB entry into its own clock-gate group
...
This ameliorates a PMP critical path.
I can't figure out how to do this without asUInt/asTypeOf.
2017-03-24 16:39:52 -07:00
Andrew Waterman
8d7f1d777e
Fix an embarrassing typo in the PMPHeterogeneityChecker
2017-03-24 16:39:52 -07:00
Andrew Waterman
10c39cb8d6
Disable mprv in D-mode
2017-03-24 16:39:52 -07:00
Andrew Waterman
d3bda9574c
Put page homogeneity checker in PMP
...
Avoids redundancy between ITLB and DTLB
2017-03-24 16:39:52 -07:00
Andrew Waterman
9e05200e51
Don't require that PMP ranges be aligned to access size
...
e.g., if a range permits access to 0x0-0xb, allow 8-byte accesses 0x0-0x7.
2017-03-24 16:39:52 -07:00
Andrew Waterman
29e67279ba
add comments
2017-03-24 16:39:52 -07:00
Andrew Waterman
4c8be13a4d
Improve homogeneity circuit QoR
2017-03-24 16:39:52 -07:00
Andrew Waterman
59d6afa132
mideleg/medeleg not present without less-privileged traps
2017-03-24 16:39:52 -07:00
Andrew Waterman
38808f55d5
Share PMP mask gen between I$ and D$
2017-03-24 16:39:52 -07:00
Andrew Waterman
86d84959cf
More WIP on PMP
2017-03-24 16:39:52 -07:00
Andrew Waterman
2888779422
Flush pipeline from WB stage, not MEM
...
Fixes sptbr write -> instruction translation hazard.
2017-03-24 16:39:52 -07:00
Andrew Waterman
44ca3b60ab
Retime PTW response valid bits
...
It's not just to save the gate delay; it also reduces wire delay by
allowing the flops to be closer to their respective TLBs.
2017-03-24 16:39:52 -07:00
Andrew Waterman
a03556220c
Default TLB size = 32
...
@davidbiancolin
2017-03-24 16:39:52 -07:00
Andrew Waterman
1875407316
Get TLB permission checks off D$ clock gating critical path
2017-03-24 16:39:52 -07:00
Andrew Waterman
a4164348b4
Expose MXR to S-mode
2017-03-24 16:39:52 -07:00
Andrew Waterman
0380aed329
PUM -> SUM
2017-03-24 16:39:52 -07:00
Andrew Waterman
2a413e4496
Remove fruitless debug()
2017-03-24 16:39:52 -07:00
Andrew Waterman
29414f3a23
Simplify interrupt-stack discipline
...
f2ed45b179
2017-03-24 16:39:52 -07:00
Andrew Waterman
723352c3e2
Mitigate some more PMP critical paths
2017-03-24 16:39:52 -07:00
Andrew Waterman
7484f27ed3
Don't gate exception-cause pipeline registers separately
...
They are too narrow to justify gating separately from the other pipeline
registers (and one of the clock gates was on the PMP critical path).
2017-03-24 16:39:52 -07:00
Andrew Waterman
3ea822c2cf
Make blocking L1 D$ the default
...
The nonblocking cache is overdesigned for most Rocket-class cores, so
the blocking cache is the more appropriate default.
2017-03-24 16:39:52 -07:00
Andrew Waterman
487b8db5ef
Address some PMP critical paths
2017-03-24 16:39:52 -07:00
Andrew Waterman
03fb334c4c
Take mprv calculation off critical path
2017-03-24 16:39:52 -07:00
Andrew Waterman
f0796f0509
Pass correct access size information to PMP checker
2017-03-24 16:39:52 -07:00
Andrew Waterman
a6874c03f7
Remove DecoupledTLB
2017-03-24 16:39:52 -07:00
Andrew Waterman
78f9f6b9ef
When SFENCE.VMA has rs2 != x0, don't flush global mappings
2017-03-24 16:39:52 -07:00
Andrew Waterman
1b950128e1
PTW should always use S-mode privilege
...
If an exception occurs while a page-table walk is coincidentally in
progress (e.g., an illegal instruction executes during data TLB refill),
then the processor might enter M-mode. However, the PTW's accesses
should proceed without M privilege, to avoid bypassing PMPs.
Note, the same argument doesn't apply to the nonblocking cache's replay
queues, because those accesses have already been checked against the PMPs.
The cache correctly ignores access exceptions reported on replays,
provided no exceptions were reported on the initial access.
2017-03-24 16:39:52 -07:00
Andrew Waterman
aace526857
WIP on PMP
2017-03-24 16:39:52 -07:00
Andrew Waterman
b1b405404d
Set PRV=M when entering debug mode
...
Debug mode mostly behaves like M-mode, so this approach avoids having
to check the debug bit in most permission checks.
2017-03-24 16:39:52 -07:00
Andrew Waterman
cf168e419b
Support SFENCE.VMA rs1 argument
...
This one's a little invasive. To flush a specific entry from the TLB, you
need to reuse its CAM port. Since the TLB lookup can be on the critical
path, we wish to avoid muxing in another address.
This is simple on the data side, where the datapath already carries rs1 to
the TLB (it's the same path as the AMO address calculation). It's trickier
for the I$, where the TLB lookup address comes from the fetch stage PC.
The trick is to temporarily redirect the PC to rs1, then redirect the PC
again to the instruction after SFENCE.VMA.
2017-03-24 16:39:52 -07:00
Henry Cook
797c18b8db
Make some requirement failures more verbose ( #608 )
...
* tilelink: verbose requires in xbar
* diplomacy: verbose requires
2017-03-23 21:55:11 -07:00
Wesley W. Terpstra
bd08f10816
tilelink2: make sink ids optional ( #607 )
...
* tilelink2: make sink ids optional
* CacheCork: add a special-case for 1 sink id
2017-03-23 18:19:04 -07:00
Wesley W. Terpstra
19eb9b6906
l1tol2: put a flow Q on the exits ( #606 )
...
This Xbar connects the largest components in the design; the cores
and the L2 banks. We already have a full buffer on the core side.
However, the valid path going to the L2 comes back as a ready path.
Putting a flow Q also on the outputs of the l1tol2 cuts this path
in half at no cost to IPC.
2017-03-23 16:28:32 -07:00