RISC-V instruction helper
I (integer)
lui
LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format. LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.
lui rd, immediate    # x[rd] = sext(immediate[31:12] << 12)
lui t0, 1    # t0 = 1 << 12 = 0x1000
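The LUI semantics above can be modeled in a few lines of Python. This is only a sketch: `sign_extend` and `lui` are hypothetical helper names, registers are represented as unsigned Python integers, and the `xlen` parameter is an assumption used to show the RV64 sign-extension case.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def lui(imm20: int, xlen: int = 32) -> int:
    """Model `lui rd, imm20`; returns x[rd] as an unsigned XLEN-bit integer."""
    upper = (imm20 & 0xFFFFF) << 12  # U-immediate in bits 31..12, low 12 bits zero
    return sign_extend(upper, 32) & ((1 << xlen) - 1)


print(hex(lui(1)))                 # 0x1000
print(hex(lui(0x80000, xlen=64)))  # 0xffffffff80000000
```

The second call shows that on a 64-bit register the 32-bit result is sign-extended, which only matters when bit 31 of the result is set.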
auipc
AUIPC (add upper immediate to pc) is used to build pc-relative addresses and uses the U-type format. AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the address of the AUIPC instruction, then places the result in register rd.
auipc rd, immediate    # x[rd] = pc + sext(immediate[31:12] << 12)
0x80000028: auipc t0, 1    # t0 = 0x80000028 + (1 << 12) = 0x80001028
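As a rough check of the arithmetic, AUIPC can be modeled the same way. The function name `auipc` and the `xlen` parameter are assumptions made for this sketch; the pc value is the one from the example above.

```python
def auipc(pc: int, imm20: int, xlen: int = 32) -> int:
    """Model `auipc rd, imm20` executed at address pc; returns x[rd]."""
    offset = (imm20 & 0xFFFFF) << 12  # 32-bit offset, low 12 bits zero
    if offset & 0x80000000:           # sign-extend the 32-bit offset
        offset -= 1 << 32
    return (pc + offset) & ((1 << xlen) - 1)


print(hex(auipc(0x80000028, 1)))  # 0x80001028
print(hex(auipc(0x80000028, 0)))  # 0x80000028
```

With an immediate of 0, AUIPC simply copies the current pc into rd, which is how the later `auipc t0, 0` examples are used.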
jal
The jump and link (JAL) instruction uses the J-type format, where the J-immediate encodes a signed offset in multiples of 2 bytes. The offset is sign-extended and added to the address of the jump instruction to form the jump target address. Jumps can therefore target a ±1 MiB range. JAL stores the address of the instruction following the jump (pc+4) into register rd. The standard software calling convention uses x1 as the return address register and x5 as an alternate link register.
jal rd, offset    # x[rd] = pc+4; pc += sext(offset)
0x80000028: jal ra, 1f    # ra = 0x8000002c, pc = pc + 4
JAL is typically used for function calls. The offset is usually given as a symbol, so the assembler and linker can compute the offset themselves.
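The link-and-jump behavior, together with the range limits stated above, can be sketched as follows. `jal` here is a hypothetical Python function, not an assembler API, and the offset of 4 assumes that the target label is the very next instruction.

```python
def jal(pc: int, offset: int, xlen: int = 32) -> tuple[int, int]:
    """Model `jal rd, offset`; returns (value written to rd, next pc)."""
    if offset % 2 != 0:
        raise ValueError("the J-immediate encodes multiples of 2 bytes")
    if not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset is outside the +/-1 MiB range of JAL")
    mask = (1 << xlen) - 1
    return (pc + 4) & mask, (pc + offset) & mask


link, target = jal(0x80000028, 4)
print(hex(link), hex(target))  # 0x8000002c 0x8000002c
```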
jalr
The indirect jump instruction JALR (jump and link register) uses the I-type encoding. The target address is obtained by adding the sign-extended 12-bit I-immediate to the register rs1, then setting the least-significant bit of the result to zero. The address of the instruction following the jump (pc+4) is written to register rd.
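jalr rd, offset(rs1)    # t = pc+4; pc = (x[rs1] + sext(offset)) & ~1; x[rd] = t
pc+4 is written to rd; the offset is added to the value of rs1, the least-significant bit of the sum is cleared, and the result is written to pc.
0x80000028: auipc t0, 0    # t0 = pc = 0x80000028
Clearing the least-significant bit keeps pc aligned. Because the value in t0 may be unaligned, the immediate also carries bit 0 (imm[11:0]), so the sum can still end up aligned.
The ra operand can also be omitted.
0x80000028: auipc t0, 0    # t0 = pc = 0x80000028
jr rs1    # pc = x[rs1]; expands to jalr x0, 0(rs1)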
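A small model makes the bit-clearing step concrete. The function name `jalr` and the sample values are assumptions chosen for illustration; the odd offset is there only to show bit 0 being dropped.

```python
def jalr(pc: int, rs1_value: int, offset: int, xlen: int = 32) -> tuple[int, int]:
    """Model `jalr rd, offset(rs1)`; returns (value written to rd, next pc)."""
    if not -2048 <= offset <= 2047:
        raise ValueError("the I-immediate is a signed 12-bit value")
    mask = (1 << xlen) - 1
    link = (pc + 4) & mask
    target = ((rs1_value + offset) & mask) & ~1  # clear the least-significant bit
    return link, target


link, target = jalr(pc=0x8000002C, rs1_value=0x80000028, offset=0x11)
print(hex(link), hex(target))  # 0x80000030 0x80000038
```

Here 0x80000028 + 0x11 is 0x80000039, and clearing bit 0 gives 0x80000038.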
beq/bne/blt/bltu/bge/bgeu
BEQ and BNE take the branch if registers rs1 and rs2 are equal or unequal respectively. BLT and BLTU take the branch if rs1 is less than rs2, using signed and unsigned comparison respectively. BGE and BGEU take the branch if rs1 is greater than or equal to rs2, using signed and unsigned comparison respectively. Note that BGT, BGTU, BLE, and BLEU can be synthesized by reversing the operands to BLT, BLTU, BGE, and BGEU, respectively.
beq rs1, rs2, offset    # if (rs1 == rs2) pc += sext(offset)
beqz rs1, offset    # if (rs1 == 0) pc += sext(offset); expands to beq rs1, x0, offset
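The difference between the signed and unsigned comparisons, and the operand-swapping trick for BGT, can be sketched in Python. `to_signed`, `branch_taken`, and `bgt` are hypothetical helpers; register values are passed as unsigned integers.

```python
def to_signed(value: int, xlen: int = 32) -> int:
    """Reinterpret an unsigned XLEN-bit value as two's complement."""
    value &= (1 << xlen) - 1
    return value - (1 << xlen) if value >> (xlen - 1) else value


def branch_taken(op: str, rs1: int, rs2: int, xlen: int = 32) -> bool:
    """Return True if the conditional branch `op` would be taken."""
    mask = (1 << xlen) - 1
    u1, u2 = rs1 & mask, rs2 & mask
    s1, s2 = to_signed(u1, xlen), to_signed(u2, xlen)
    conditions = {
        "beq": u1 == u2, "bne": u1 != u2,
        "blt": s1 < s2, "bge": s1 >= s2,
        "bltu": u1 < u2, "bgeu": u1 >= u2,
    }
    return conditions[op]


def bgt(rs1: int, rs2: int, xlen: int = 32) -> bool:
    """bgt rs1, rs2 is synthesized as blt rs2, rs1."""
    return branch_taken("blt", rs2, rs1, xlen)


print(branch_taken("blt", 0xFFFFFFFF, 1))   # True: 0xFFFFFFFF is -1 when signed
print(branch_taken("bltu", 0xFFFFFFFF, 1))  # False: it is the largest unsigned value
print(bgt(5, 3))                            # True
```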
lb/lh/lw/lbu/lhu/lwu/ld/ sb/sh/sw/sd
lb (load byte) reads one byte from address x[rs1] + sign-extend(offset), sign-extends it, and writes the result to x[rd].
lbu (load byte unsigned) reads one byte from address x[rs1] + sign-extend(offset), zero-extends it, and writes the result to x[rd].
lb rd, offset(rs1)    # x[rd] = sext(M[x[rs1] + sext(offset)][7:0])
sb rs2, offset(rs1)    # M[x[rs1] + sext(offset)] = x[rs2][7:0]
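The sign-extension versus zero-extension behavior is easiest to see with a byte whose top bit is set. In this sketch memory is a plain `bytearray`, and `sb`, `lb`, and `lbu` are hypothetical helper names; the offset is assumed to be an ordinary (already sign-extended) Python integer.

```python
def sb(mem: bytearray, base: int, offset: int, rs2_value: int) -> None:
    """Model `sb rs2, offset(rs1)`: store the low 8 bits of rs2."""
    mem[base + offset] = rs2_value & 0xFF


def lbu(mem: bytearray, base: int, offset: int) -> int:
    """Model `lbu rd, offset(rs1)`: zero-extended byte load."""
    return mem[base + offset]


def lb(mem: bytearray, base: int, offset: int, xlen: int = 32) -> int:
    """Model `lb rd, offset(rs1)`: sign-extended byte load."""
    byte = mem[base + offset]
    if byte & 0x80:          # top bit set: the byte is negative
        byte -= 0x100
    return byte & ((1 << xlen) - 1)


memory = bytearray(16)
sb(memory, 4, -1, 0x12345680)   # writes 0x80 to address 3
print(hex(lbu(memory, 0, 3)))   # 0x80
print(hex(lb(memory, 0, 3)))    # 0xffffff80
```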
addi/slti/sltiu/xori/ori/andi/ addiw
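SLTI (set less than immediate) compares x[rs1] with the sign-extended immediate as signed numbers; if x[rs1] is smaller, it writes 1 to x[rd], otherwise it writes 0.
addiw adds the sign-extended immediate to x[rs1], truncates the result to 32 bits, and writes the sign-extended 32-bit result to x[rd].
Arithmetic overflow is ignored.
addi rd, rs1, immediate    # x[rd] = x[rs1] + sext(immediate)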
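The truncate-then-sign-extend step of addiw is where it differs from addi on a 64-bit register. The following sketch assumes RV64 and uses hypothetical helpers `sext`, `addi`, `slti`, and `addiw`.

```python
MASK64 = (1 << 64) - 1


def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def addi(rs1: int, imm: int) -> int:
    """x[rd] = x[rs1] + sext(imm), overflow ignored (wraps at 64 bits)."""
    return (rs1 + sext(imm, 12)) & MASK64


def slti(rs1: int, imm: int) -> int:
    """x[rd] = 1 if x[rs1] < sext(imm) as signed numbers, else 0."""
    return 1 if sext(rs1, 64) < sext(imm, 12) else 0


def addiw(rs1: int, imm: int) -> int:
    """Add, truncate to 32 bits, then sign-extend the 32-bit result."""
    return sext((rs1 + sext(imm, 12)) & 0xFFFFFFFF, 32) & MASK64


print(hex(addi(0x7FFFFFFF, 1)))   # 0x80000000
print(hex(addiw(0x7FFFFFFF, 1)))  # 0xffffffff80000000
print(slti(MASK64, 0))            # 1, because all-ones is -1 when signed
```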
slli/srli/srai/ slliw/srliw/sraiw
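SLLI is a logical left shift (zeros are shifted into the lower bits).
slli shifts register x[rs1] left by shamt bits, fills the vacated positions with zeros, and writes the result to x[rd]. For RV32I, the instruction is valid only when shamt[5] = 0.
SRLI is a logical right shift (zeros are shifted into the upper bits).
SRAI is an arithmetic right shift (the original sign bit is copied into the vacated upper bits).
srai shifts register x[rs1] right by shamt bits, fills the vacated positions with the most-significant bit of x[rs1], and writes the result to x[rd]. For RV32I, the instruction is valid only when shamt[5] = 0.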
SLLIW, SRLIW, and SRAIW are RV64I-only instructions that are analogously defined but operate on 32-bit values and produce signed 32-bit results. SLLIW, SRLIW, and SRAIW encodings with imm[5] ≠ 0 are reserved.
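slliw shifts register x[rs1] left by shamt bits, fills the vacated positions with zeros, truncates the result to 32 bits, sign-extends it, and writes it to x[rd]. The instruction is valid only when shamt[5] = 0.
slli rd, rs1, shamt    # x[rd] = x[rs1] << shamt
li a0, 1    # a0 = 1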
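The three shift flavors and the 32-bit `W` variant can be compared with a short model. All function names are hypothetical, register values are unsigned integers, and the shift-amount checks mirror the validity rules quoted above.

```python
def to_signed(value: int, xlen: int) -> int:
    """Reinterpret an unsigned XLEN-bit value as two's complement."""
    value &= (1 << xlen) - 1
    return value - (1 << xlen) if value >> (xlen - 1) else value


def check_shamt(shamt: int, width: int) -> None:
    if not 0 <= shamt < width:
        raise ValueError(f"shamt must be in 0..{width - 1}")


def slli(rs1: int, shamt: int, xlen: int = 64) -> int:
    check_shamt(shamt, xlen)
    return (rs1 << shamt) & ((1 << xlen) - 1)


def srli(rs1: int, shamt: int, xlen: int = 64) -> int:
    check_shamt(shamt, xlen)
    return (rs1 & ((1 << xlen) - 1)) >> shamt      # zeros enter from the top


def srai(rs1: int, shamt: int, xlen: int = 64) -> int:
    check_shamt(shamt, xlen)
    # Python's >> on a negative int is an arithmetic shift
    return (to_signed(rs1, xlen) >> shamt) & ((1 << xlen) - 1)


def slliw(rs1: int, shamt: int) -> int:
    check_shamt(shamt, 32)                          # shamt[5] must be 0
    low = (rs1 << shamt) & 0xFFFFFFFF               # truncate to 32 bits
    return to_signed(low, 32) & ((1 << 64) - 1)     # sign-extend to 64 bits


print(hex(srli(0x80000000, 4, xlen=32)))  # 0x8000000
print(hex(srai(0x80000000, 4, xlen=32)))  # 0xf8000000
print(hex(slli(1, 31)))                   # 0x80000000
print(hex(slliw(1, 31)))                  # 0xffffffff80000000
```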
add/sub/sll/slt/sltu/xor/srl/sra/or/and
addw/subw/sllw/srlw/sraw
add rd, rs1, rs2    # x[rd] = x[rs1] + x[rs2]
ecall/ebreak
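ebreak requests the debugger by raising a breakpoint exception.
ecall makes a request to the execution environment by raising an environment-call exception.
ebreak    # RaiseException(Breakpoint)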
fence
fence.i    # Fence(Store, Fetch)
Fence Instruction Stream. I-type, RV32I and RV64I. Makes earlier stores to instruction memory visible to subsequent instruction fetches.
csr
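CSRRW (Control and Status Register Read and Write): let t be the value of control and status register csr. Write the value of x[rs1] to csr, then write t to x[rd].
CSRRS (Control and Status Register Read and Set): let t be the value of control and status register csr. Write the bitwise OR of t and x[rs1] to csr, then write t to x[rd].
CSRRC (Control and Status Register Read and Clear): let t be the value of control and status register csr. Write the bitwise AND of t and the complement of x[rs1] to csr (clearing the bits that are set in x[rs1]), then write t to x[rd].
csrrw rd, csr, rs1    # t = CSRs[csr]; CSRs[csr] = x[rs1]; x[rd] = t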
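The read-modify-write pattern shared by the three instructions can be sketched with a dictionary standing in for the CSR file. The helper names and the dictionary layout are assumptions of this sketch, and `mscratch` is used only as a convenient key; side effects of real CSRs are not modeled.

```python
XLEN_MASK = (1 << 32) - 1  # assuming a 32-bit CSR for illustration


def csrrw(csrs: dict, csr: str, rs1_value: int) -> int:
    """t = CSRs[csr]; CSRs[csr] = x[rs1]; return t (the value for x[rd])."""
    t = csrs[csr]
    csrs[csr] = rs1_value & XLEN_MASK
    return t


def csrrs(csrs: dict, csr: str, rs1_value: int) -> int:
    """t = CSRs[csr]; CSRs[csr] = t | x[rs1]; return t."""
    t = csrs[csr]
    csrs[csr] = (t | rs1_value) & XLEN_MASK
    return t


def csrrc(csrs: dict, csr: str, rs1_value: int) -> int:
    """t = CSRs[csr]; CSRs[csr] = t & ~x[rs1]; return t."""
    t = csrs[csr]
    csrs[csr] = (t & ~rs1_value) & XLEN_MASK
    return t


csrs = {"mscratch": 0b1100}
old = csrrs(csrs, "mscratch", 0b0011)
print(bin(old), bin(csrs["mscratch"]))  # 0b1100 0b1111
old = csrrc(csrs, "mscratch", 0b0101)
print(bin(old), bin(csrs["mscratch"]))  # 0b1111 0b1010
```

Passing 0 as the rs1 value to `csrrs` leaves the register unchanged, which is the idea behind the `csrr rd, csr` pseudo-instruction.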
M
pseudo-instruction
symbol
pseudo-instruction | base instruction | meaning |
---|---|---|
la rd, symbol (non-PIC) | auipc rd, delta[31:12] + delta[11]; addi rd, rd, delta[11:0] | load absolute address, where delta = symbol - pc |
la rd, symbol (PIC) | auipc rd, delta[31:12] + delta[11]; l{w/d} rd, delta[11:0](rd) | load absolute address, where delta = GOT[symbol] - pc |
Example:
la t0, 1f
=>
80000054: auipc t0, 0x0
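The `delta[31:12] + delta[11]` term in the table compensates for the sign extension of the low 12 bits. A sketch of the split, assuming a 32-bit address space and using made-up symbol addresses (the function name `split_pcrel` is hypothetical):

```python
def split_pcrel(pc: int, symbol: int) -> tuple[int, int]:
    """Split delta = symbol - pc into (auipc immediate, signed 12-bit low part)."""
    delta = (symbol - pc) & 0xFFFFFFFF
    lo = delta & 0xFFF
    if lo & 0x800:
        lo -= 0x1000                     # the low part is sign-extended when added
    hi = ((delta >> 12) + ((delta >> 11) & 1)) & 0xFFFFF  # delta[31:12] + delta[11]
    return hi, lo


def rebuild(pc: int, hi: int, lo: int) -> int:
    """Recompute the address the auipc + addi pair would produce."""
    return (pc + (hi << 12) + lo) & 0xFFFFFFFF


hi, lo = split_pcrel(0x80000054, 0x80000060)   # nearby symbol, assumed address
print(hex(hi), lo)                             # 0x0 12
hi, lo = split_pcrel(0x80000000, 0x80000800)   # bit 11 of delta is set
print(hex(hi), lo)                             # 0x1 -2048
print(hex(rebuild(0x80000000, hi, lo)))        # 0x80000800
```

When bit 11 of delta is set, the low part becomes negative, so the upper part has to be one larger for the sum to come out right.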
li
In RV32I, li is equivalent to lui and/or addi. In RV64I, it can expand to an instruction sequence such as lui, addi, slli, addi, slli, addi, slli, addi.
pseudo-instruction | base instruction | meaning |
---|---|---|
li rd, immediate | Myriad sequences | Load immediate |
Example
li a0, 0x00001800
=>
lui a0, 0x2
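Why the expansion starts with `lui a0, 0x2` rather than `0x1` follows from the signed 12-bit immediate of addi. The sketch below shows one possible RV32 expansion of a 32-bit constant; `expand_li32` is a hypothetical helper, and a real assembler may pick a different but equivalent sequence.

```python
def expand_li32(rd: str, value: int) -> list[str]:
    """Return one possible lui/addi sequence that loads a 32-bit constant on RV32."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo & 0x800:
        lo -= 0x1000                       # addi sign-extends its immediate
    hi = ((value - lo) >> 12) & 0xFFFFF    # upper part, rounded up when lo < 0
    if hi == 0:
        return [f"addi {rd}, x0, {lo}"]
    sequence = [f"lui {rd}, {hex(hi)}"]
    if lo != 0:
        sequence.append(f"addi {rd}, {rd}, {lo}")
    return sequence


print(expand_li32("a0", 0x00001800))  # ['lui a0, 0x2', 'addi a0, a0, -2048']
print(expand_li32("t6", 8))           # ['addi t6, x0, 8']
print(expand_li32("a1", 0x12345000))  # ['lui a1, 0x12345']
```

For 0x1800 the low 12 bits are 0x800, which addi would treat as -2048, so the upper part is 0x2000 and 0x2000 - 2048 gives 0x1800.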
In RV64I, LUI (load upper immediate) uses the same opcode as in RV32I. LUI places the 20-bit U-immediate into bits 31–12 of register rd and places zero in the lowest 12 bits. The 32-bit result is sign-extended to 64 bits.
li t6, 8
=>
[00800f93] addi t6,x0,8
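The machine word shown in brackets can be reproduced by packing the I-type fields by hand. This is a sketch with a hypothetical `encode_addi` helper; registers are given by number, and t6 is x31.

```python
def encode_addi(rd: int, rs1: int, imm: int) -> int:
    """Encode `addi rd, rs1, imm` as a 32-bit I-type instruction word."""
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register numbers must be in 0..31")
    if not -2048 <= imm <= 2047:
        raise ValueError("the I-immediate is a signed 12-bit value")
    opcode = 0b0010011   # OP-IMM
    funct3 = 0b000       # ADDI
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


print(f"{encode_addi(rd=31, rs1=0, imm=8):08x}")  # 00800f93
```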
csr
pseudo-instruction | base instruction | meaning |
---|---|---|
rdinstret[h] rd | csrrs rd, instret[h], x0 | read instruction-retired counter |
rdcycle[h] rd | csrrs rd, cycle[h], x0 | read cycle counter |
rdtime[h] rd | csrrs rd, time[h], x0 | read real-time clock |
csrr rd, csr | csrrs rd, csr, x0 | read csr |
csrw csr, rs | csrrw x0, csr, rs | write csr |
csrs csr, rs | csrrs x0, csr, rs | set bits in csr |
csrc csr, rs | csrrc x0, csr, rs | clear bits in csr |
csrwi csr, imm | csrrwi x0, csr, imm | write csr, immediate |
csrsi csr, imm | csrrsi x0, csr, imm | set bits in csr, immediate |
csrci csr, imm | csrrci x0, csr, imm | clear bits in csr, immediate |
frcsr rd | csrrs rd, fcsr, x0 | read FP control/status register |
fscsr rd, rs | csrrw rd, fcsr, rs | swap FP control/status register |
fscsr rs | csrrw x0, fcsr, rs | write FP control/status register |
frrm rd | csrrs rd, frm, x0 | read FP rounding mode |
fsrm rd, rs | csrrw rd, frm, rs | swap FP rounding mode |
fsrm rs | csrrw x0, frm, rs | write FP rounding mode |
frflags rd | csrrs rd, fflags, x0 | read FP exception flags |
fsflags rd, rs | csrrw rd, fflags, rs | swap FP exception flags |
fsflags rs | csrrw x0, fflags, rs | write FP exception flags |