The “Unwritten Rule” of RDI as the First Argument

By Vesper Vei

Table of Contents

  The “Unwritten Rule” of RDI as the First Argument
    1. Hardware Legacy: Enabling “Zero-Cost” Calls to System Instructions
    2. Performance Revolution: Evolving from the “Memory Era” to the “Register Era”
    3. Practical Significance: Impact on Pwn and Reverse Engineering
      Summary Comparison Table

The “Unwritten Rule” of RDI as the First Argument

This question gets to the core of low-level design. Why does the calling convention define RDI as the first argument register and RSI as the second, rather than some other pair such as R8 and R9?

Two reasons stand out: the continuation of hardware legacy and performance optimization.


1. Hardware Legacy: Enabling “Zero-Cost” Calls to System Instructions

In C, many fundamental functions follow a “destination ← source” pattern. The most typical example is memcpy(dest, src, size).

The brilliance of the design:

The x86 architecture has some very powerful built-in “string instructions” (such as movsb). Their operands are implicit and fixed by the instruction set: in 64-bit mode, movsb always reads from the address in RSI (the source) and writes to the address in RDI (the destination), and with the rep prefix it repeats RCX times.

If the calling convention (ABI) specifies that the first argument goes into RDI and the second into RSI, then on entry to memcpy(dest, src, n):

  1. dest is already in RDI.

  2. src is already in RSI.

  3. Only the count needs to move: n arrives in RDX (the third argument register), while rep movsb takes its count from RCX. After that single mov, one rep movsb instruction can perform the whole copy, with no shuffling of the two pointers.

Conclusion: this design lets frequently called C functions (such as string processing and memory copying) map onto the CPU’s string instructions with almost no register shuffling.
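As a rough illustration of the point above, here is a minimal C sketch, assuming GCC or Clang on an x86-64 System V target. The helper name copy_bytes is hypothetical, and this is not how any particular libc implements memcpy. It only shows that the pointer operands of rep movsb line up with the first two argument registers.

```c
#include <stddef.h>

/* Hypothetical helper (not a libc function): copy n bytes using "rep movsb"
   where available, returning dest the way memcpy does. */
void *copy_bytes(void *dest, const void *src, size_t n)
{
    void *ret = dest; /* saved first: the asm below advances dest */
#if defined(__x86_64__) && defined(__GNUC__)
    /* Constraint "D" pins dest to RDI, "S" pins src to RSI, "c" pins n to RCX.
       On a System V target, dest and src already arrive in RDI and RSI;
       only n has to be moved from RDX into RCX.
       The ABI guarantees the direction flag is clear on function entry,
       so the copy runs forward. */
    __asm__ volatile("rep movsb"
                     : "+D"(dest), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    /* Portable fallback so the sketch still builds on other targets. */
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

Calling copy_bytes(dst, src, len) behaves like memcpy for non-overlapping buffers. Looking at the compiler's assembly output for this function is one way to check how few register moves are needed before the rep movsb.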


2. Performance Revolution: Evolving from the “Memory Era” to the “Register Era”

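In the old 32-bit (x86) era, the common calling conventions placed function arguments on the stack.

By the 64-bit (x86-64) era, the number of general-purpose registers had doubled from 8 to 16. Engineers established a set of rules (the System V ABI, used on Linux and other Unix-like systems) under which the first 6 integer or pointer arguments are passed through registers: RDI, RSI, RDX, RCX, R8, and R9, in that order. Any further arguments go on the stack.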
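To make the mapping concrete, here is a small sketch in C, assuming a System V x86-64 target. The function name digits8 is hypothetical; the comments describe where the ABI places each argument for a normal, non-inlined call.

```c
#include <stdint.h>

/* Hypothetical function with eight integer arguments.
   Under the System V x86-64 convention, a non-inlined call passes:
     a -> RDI, b -> RSI, c -> RDX, d -> RCX, e -> R8, f -> R9
     g, h -> on the stack (the six argument registers are used up)
   The return value comes back in RAX. */
int64_t digits8(int64_t a, int64_t b, int64_t c, int64_t d,
                int64_t e, int64_t f, int64_t g, int64_t h)
{
    /* Build a decimal number so each argument's position is visible
       in the result. */
    int64_t r = a;
    r = r * 10 + b;
    r = r * 10 + c;
    r = r * 10 + d;
    r = r * 10 + e;
    r = r * 10 + f;
    r = r * 10 + g;
    r = r * 10 + h;
    return r;
}
```

A call such as digits8(1, 2, 3, 4, 5, 6, 7, 8) makes it easy to spot each argument in a debugger or in the generated assembly, since the first six values show up in the registers listed above and the last two on the stack.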

Why this particular order?

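It works like a “standard protocol.” Without such a rule, a caller built by one compiler and a callee built by another could disagree about where each argument lives.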

The purpose of uniformly specifying RDI, RSI, … is to ensure that compilers, operating systems, and code written in different languages can communicate with each other seamlessly. It is like everyone agreeing to drive on the right side of the road.


3. Practical Significance: Impact on Pwn and Reverse Engineering

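For students learning binary security (Pwn), understanding this is crucial: in a disassembly, the values loaded into RDI and RSI just before a call are that function’s first and second arguments, and an exploit that needs to call a function with chosen arguments must first get those values into the same registers.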


Summary Comparison Table

| Dimension | Old Era (32-bit/x86) | New Era (64-bit/x86-64) | Benefits |
| --- | --- | --- | --- |
| Storage location | Memory stack (Stack) | Registers (RDI, RSI…) | Fast (register access is on the nanosecond scale) |
| Instruction count | Requires multiple push/pop operations | Simple mov is enough | Smaller code size and higher execution efficiency |
| Logical consistency | Arbitrary | Strictly follows Dest/Source logic | Hardware-level optimization, reducing the cost of data movement |


