I am reading the book "Computer Organization and Design RISC-V Edition", and I came across the encoding for the S-B and U-J instruction types.
Those types I hav
The chosen encodings line up very nicely with other encodings, simplifying the hardware at the expense of software that has to generate instructions, software that has to decode instructions, and programmers learning or working with RISC-V ;).
The S-Format breaks up the immediate into imm[11:5] and imm[4:0]. The reason this immediate is broken up is to keep the other fields, namely the register fields rs2 and rs1, in the same positions as the two source register fields in R-Type instructions. (Compared with MIPS, which did something similar but not as completely, this obviates a register-name-width (e.g. 5-bit-wide) mux and several extra wirings, as well as a control signal.)
The S-Format allows for a 12-bit immediate.
The (S)B-Type for branches, by contrast, uses a 13-bit immediate, but the least significant bit of that immediate is always zero, so it is not stored! It therefore needs to encode only 12 bits, just like the S-Format, but because they are shifted in actual use (left by one, i.e. *2), all the bits are off by one bit position compared with the S-Format immediate. (Shifting is not hard or slow, but it costs silicon real estate. Typically, a shift by a constant amount is done by simply wiring the input bits to offset output bit positions rather than with a dedicated shifter such as we would see in an ALU; still, this is immediate- and datapath-sized wiring, so ~12 to 32+ extra wires.)
In order not to have to shift the stored part of the immediate (as far as possible), and so as to line up nicely with the immediate in the S-Format, the unstored LSB position (from the S-Format) is used to store bit 11 of the SB-Format immediate. This way bits 10:1 line up exactly with the S-Format immediate.
But why not put bit 12 of the branch immediate there instead, which would keep one more bit in alignment (i.e. 11:1) with the S-Format? Because the highest bit encoded in the immediate of the instruction is used to sign extend the immediate to 32-bits (for RV32, or 64-bits for RV64, 128 for RV128, lots of wires!). So, by keeping the sign bit in the same place as with the S-Format 12 bit immediate, the same sign extension hardware can be shared (with the same first-described-above pros and cons ;-). Hence, the choice to store bit 11, the next most significant bit of the SB-Type immediate, in the 0 bit position (relative to S-Format).
The cost for SB (given S already) is only two or so (1-bit) wires and one 1-bit mux and a 1-bit control signal — minimal compared to alternatives.
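To make the alignment concrete, here is a minimal Python sketch of how the S and B immediates can be reassembled from a 32-bit instruction word. The function names (`sign_extend`, `imm_s`, `imm_b`) are made up for illustration; the bit positions follow the layout described above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def imm_s(inst):
    """S-Format: 12-bit immediate split into imm[11:5] and imm[4:0]."""
    imm = ((inst >> 25) & 0x7F) << 5    # inst[31:25] -> imm[11:5]
    imm |= (inst >> 7) & 0x1F           # inst[11:7]  -> imm[4:0]
    return sign_extend(imm, 12)

def imm_b(inst):
    """SB-Format: 13-bit immediate whose bit 0 is implied to be zero."""
    imm = ((inst >> 31) & 0x1) << 12    # inst[31]    -> imm[12] (sign bit)
    imm |= ((inst >> 7) & 0x1) << 11    # inst[7]     -> imm[11] (the S-Format LSB slot)
    imm |= ((inst >> 25) & 0x3F) << 5   # inst[30:25] -> imm[10:5], same as S
    imm |= ((inst >> 8) & 0xF) << 1     # inst[11:8]  -> imm[4:1], same as S
    return sign_extend(imm, 13)

print(imm_s(0x00512423))  # sw x5, 8(x2)
print(imm_b(0xFE000EE3))  # beq x0, x0, -4
```

Only the sign bit's destination and the one bit taken from inst[7] differ between the two functions; bits 10:1 of the result come from the same instruction bits in both.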
See the following presentation, slide 46, titled "RISC-V Immediate Encoding", and subtitled: "Why is it so confusing?!?!"
The UJ-Type does something similar, keeping the sign bit in the same bit position as the sign bit of the other instruction formats, while aligning as many of the other bits as possible with other formats.
See slide 60 of the same presentation.
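The same idea can be sketched for the U and J immediates. Again this is only an illustration with made-up function names (`imm_u`, `imm_j`), using the standard RV32 bit positions; note that bits 19:12 of both immediates come straight from inst[19:12].

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def imm_u(inst):
    """U-Format: inst[31:12] is already in place as imm[31:12]; low 12 bits are zero."""
    return sign_extend(inst & 0xFFFFF000, 32)

def imm_j(inst):
    """UJ-Format: 21-bit immediate whose bit 0 is implied to be zero."""
    imm = ((inst >> 31) & 0x1) << 20    # inst[31]    -> imm[20] (sign bit)
    imm |= ((inst >> 12) & 0xFF) << 12  # inst[19:12] -> imm[19:12], same as U
    imm |= ((inst >> 20) & 0x1) << 11   # inst[20]    -> imm[11]
    imm |= ((inst >> 21) & 0x3FF) << 1  # inst[30:21] -> imm[10:1]
    return sign_extend(imm, 21)

print(hex(imm_u(0x123452B7)))  # lui x5, 0x12345
print(imm_j(0xFFDFF06F))       # jal x0, -4
```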
The official RISC-V spec does an excellent job of explaining every design choice in the instruction set and why something is done in that specific way. When in doubt, you just need to have a look at it.
The rationale for the instruction encoding is described in chapter 2.2 - Base Instruction Formats. It's all about making instruction decoding simpler and faster:
The RISC-V ISA keeps the source (rs1 and rs2) and destination (rd) registers at the same position in all formats to simplify decoding. Except for the 5-bit immediates used in CSR instructions (Chapter 9), immediates are always sign-extended, and are generally packed towards the leftmost available bits in the instruction and have been allocated to reduce hardware complexity. In particular, the sign bit for all immediates is always in bit 31 of the instruction to speed sign-extension circuitry.
Decoding register specifiers is usually on the critical paths in implementations, and so the instruction format was chosen to keep all register specifiers at the same position in all formats at the expense of having to move immediate bits across formats (a property shared with RISC-IV aka. SPUR [11]).
Look at the instruction encoding and you'll see that just a single decoder is needed for each of rs1, rs2 and rd in any instruction format that needs them, and that bit 31 is always the sign bit of the immediate regardless of its length, for fast sign extension.
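As a rough software analogue of that "single decoder", the register specifiers and the sign bit can be sliced out of the instruction word before the format is even known. This is a sketch with a hypothetical helper name (`fixed_fields`); whether each field is meaningful still depends on the format (for example, inst[11:7] holds part of the immediate rather than rd in S-Format).

```python
def fixed_fields(inst):
    """Slice the fields whose positions never change across formats."""
    return {
        "rd":   (inst >> 7) & 0x1F,    # inst[11:7]
        "rs1":  (inst >> 15) & 0x1F,   # inst[19:15]
        "rs2":  (inst >> 20) & 0x1F,   # inst[24:20]
        "sign": (inst >> 31) & 0x1,    # inst[31], sign bit of any immediate
    }

print(fixed_fields(0x002081B3))  # add x3, x1, x2 (R-Format)
print(fixed_fields(0x00512423))  # sw x5, 8(x2)   (S-Format, no rd)
```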
Now focus on the immediates and you'll see that they're arranged in "weird" orders, but those orders also allow decoders to be shared between formats. For example, bits 10:5 of the immediate are always at the same place (inst[30:25]) in the I, S, B and J formats. The same goes for bits 19:12 in U/J and bits 4:1 in S/B. Each of those two pairs is almost the same format, with the immediate shifted left by one bit in J and B. By interleaving bits that way, most of the hard work of shifting is left to the assembler, simplifying the hardware even more.
2.3 Immediate Encoding Variants
The only difference between the S and B formats is that the 12-bit immediate field is used to encode branch offsets in multiples of 2 in the B format. Instead of shifting all bits in the instruction-encoded immediate left by one in hardware as is conventionally done, the middle bits (imm[10:1]) and sign bit stay in fixed positions, while the lowest bit in S format (inst[7]) encodes a high-order bit in B format.
Similarly, the only difference between the U and J formats is that the 20-bit immediate is shifted left by 12 bits to form U immediates and by 1 bit to form J immediates. The location of instruction bits in the U and J format immediates is chosen to maximize overlap with the other formats and with each other.
Sign-extension is one of the most critical operations on immediates (particularly for XLEN>32), and in RISC-V the sign bit for all immediates is always held in bit 31 of the instruction to allow sign-extension to proceed in parallel with instruction decoding.
Although more complex implementations might have separate adders for branch and jump calculations and so would not benefit from keeping the location of immediate bits constant across types of instruction, we wanted to reduce the hardware cost of the simplest implementations. By rotating bits in the instruction encoding of B and J immediates instead of using dynamic hardware muxes to multiply the immediate by 2, we reduce instruction signal fanout and immediate mux costs by around a factor of 2. The scrambled immediate encoding will add negligible time to static or ahead-of-time compilation. For dynamic generation of instructions, there is some small additional overhead, but the most common short forward branches have straightforward immediate encodings.
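The "scrambling" that the assembler or JIT has to do can be sketched as follows. This is an illustrative Python helper with a made-up name (`encode_b_imm`), not code from any real assembler; it places a branch offset into the B-format immediate bit positions and leaves the opcode, funct3 and register fields to the caller.

```python
def encode_b_imm(offset):
    """Return the B-format immediate bits for a byte offset (other fields zero)."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError("branch offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF                # 13-bit two's-complement view
    word = ((imm >> 12) & 0x1) << 31     # imm[12]   -> inst[31]
    word |= ((imm >> 5) & 0x3F) << 25    # imm[10:5] -> inst[30:25]
    word |= ((imm >> 1) & 0xF) << 8      # imm[4:1]  -> inst[11:8]
    word |= ((imm >> 11) & 0x1) << 7     # imm[11]   -> inst[7]
    return word

# beq x0, x0, -4: opcode 0x63, funct3 and both registers are zero
print(hex(encode_b_imm(-4) | 0x63))
# A short forward branch: the field is simply the offset shifted left by 7
print(hex(encode_b_imm(8)))
```

For even forward offsets below 32, only imm[4:1] is nonzero, so the encoded field equals `offset << 7`, which matches the spec's remark that short forward branches have straightforward encodings.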
If you're interested, you can find more discussion on the official GitHub page.