Index Registers
Table of Contents
- Index Registers
- 1. Core Logic: Source and Destination
- 2. Deep Integration: A C and Assembly Side-by-Side Experiment
- 3. Advanced Usage: String Instructions
- 4. Memory Tricks Summary
The best way to understand index registers is to imagine them as a mover’s “GPS coordinates”. In the x86-64 architecture, RSI and RDI are like a pair of partners: one tells the CPU “where the item is” (source), and the other tells it “where to send it” (destination).
1. Core Logic: Source and Destination
We can quickly build intuition through a simple comparison table:
| Register | Full Name | Core Role | Real-World Analogy |
|---|---|---|---|
| RSI | Source Index | Source operand pointer | The address of the “supply location” |
| RDI | Destination Index | Destination operand pointer | The address of the “delivery location” |
[!NOTE]
Changes in modern x86-64: under the System V AMD64 ABI used by 64-bit Linux, RDI and RSI also have another extremely important role: passing function arguments.
RDI: holds the 1st integer or pointer argument.
RSI: holds the 2nd integer or pointer argument.
2. Deep Integration: A C and Assembly Side-by-Side Experiment
To make this concrete, let’s look at one of the most classic scenarios: a memory copy.
C Code
This code implements a simple character copy function: it copies the single character that src points to into the location that dest points to.

    void manual_copy(char *dest, const char *src) {
        *dest = *src;  // Copy the byte at the source address to the destination address
    }

Corresponding Assembly Code (x86-64)
When you call manual_copy(buffer, message), the compiler arranges the registers like this:

    ; On entry to the function:
    ;   RDI = address of dest (1st argument)
    ;   RSI = address of src  (2nd argument)
    manual_copy:
        mov al, [rsi]   ; Pick up at the source: load 1 byte from the address in RSI into AL
        mov [rdi], al   ; Deliver to the destination: store AL at the address in RDI
        ret             ; Return

3. Advanced Usage: String Instructions
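They are called “index registers” because x86 provides “fully automatic” string instructions that use them implicitly, with no operands written out. For example, movsb (Move String Byte) copies one byte from the address in RSI to the address in RDI, then advances both registers.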
Bulk Transfer Example
If we want to move 10 bytes at once:
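- C:

      memcpy(dest, src, 10);

- Assembly instruction:

      lea rsi, [src_buffer]   ; RSI = start address of the source buffer
      lea rdi, [dest_buffer]  ; RDI = start address of the destination buffer
      mov rcx, 10             ; Set the counter to 10
      rep movsb               ; Automatic transfer:
                              ;   1. Copy the byte at [RSI] to [RDI]
                              ;   2. RSI++, RDI++ (advance to the next byte)
                              ;   3. Decrement RCX and repeat until RCX is 0

Here, the “indexing” nature of RSI and RDI is demonstrated vividly: they do not just hold addresses, they also advance automatically as the transfer proceeds, like conveyor belts in a warehouse moving forward on their own. They move forward here because the direction flag is clear, which the System V AMD64 ABI guarantees on function entry.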
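The register behaviour of rep movsb can also be modelled in plain C, which may make the "pointers that advance on their own" idea easier to see. The following is only a minimal sketch, not how the CPU or any memcpy implementation actually works: struct regs, movsb and rep_movsb are hypothetical names chosen to mirror the registers and instructions, and the model assumes the direction flag is clear so both pointers move forward.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three registers that `rep movsb` uses. */
struct regs {
    const unsigned char *rsi; /* source pointer */
    unsigned char *rdi;       /* destination pointer */
    size_t rcx;               /* number of bytes still to copy */
};

/* One `movsb` step (direction flag assumed clear):
 * copy one byte from [RSI] to [RDI], then advance both pointers by 1. */
static void movsb(struct regs *r) {
    *r->rdi = *r->rsi;
    r->rsi++;
    r->rdi++;
}

/* `rep movsb`: repeat the step, decrementing RCX, until RCX reaches 0. */
static void rep_movsb(struct regs *r) {
    assert(r != NULL);
    while (r->rcx != 0) {
        movsb(r);
        r->rcx--;
    }
}
```

If rsi and rdi point at the start of two 10-byte buffers and rcx is 10, then after rep_movsb returns the destination holds a copy of the source, both pointers sit 10 bytes past where they started, and rcx is 0. That matches the three steps listed in the assembly comments.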
4. Memory Tricks Summary
- SI = Start / Source (where it starts / the source)
- DI = Destination / Done (the destination / where it ends)
When doing reverse engineering or debugging, if you see rep movs or a large number of [rsi]/[rdi] operations, your first reaction should be: “The CPU is doing a big relocation!”