阅读更多
1 Intel and AT&T Syntax
Assembly language has 2 different syntaxes, namely Intel Syntax
and AT&T Syntax
. They are roughly similar in form, but there are significant differences in detail. Very easy to confuse.
Intel | AT&T | |
---|---|---|
Comment | ; |
# |
Instruction | No suffix, e.g., add |
With suffix, indicating operand size, e.g., addq |
Register | eax , ebx , etc. |
%eax , %ebx , etc. |
Immediate Value | 0x100 |
$0x100 |
Direct Addressing | [eax] |
(%eax) |
Indirect Addressing | [base + reg + reg * scale + displacement] |
displacement(base, reg, scale) |
1.1 Memory Reference
The format for indirect memory reference in Intel Syntax
is: section:[base + index*scale + displacement]
The format for indirect memory reference in AT&T Syntax
is: section:displacement(base, index, scale)
- Here,
base
andindex
are any32-bit
base
andindex
registers. scale
can be1
,2
,4
, or8
. If thescale
is not specified, the default value is1
.section
can specify any segment register as the segment prefix. The default segment register varies depending on the situation.
Some examples:
-4(%ebp)
base
:%ebp
displacement
:-4
section
: not specifiedindex
: not specified, defaults to 0scale
: not specified, defaults to 1
2 Instructions
The characteristics of the Intel Syntax
are as follows:
1 | mnemonic DestinationOperand sourceOperand |
The characteristics of the AT&T Syntax
are as follows:
1 | mnemonic SourceOperand DestinationOperand |
Below are common instructions in the form of AT&T Syntax
:
Data Transfer Instructions:
Instruction Format | Description |
---|---|
movl src, dst |
Transfer doubleword |
movw src, dst |
Transfer word |
movb src, dst |
Transfer byte |
movsbl src, dst |
Sign-extend src (byte) to dst (doubleword) |
movzbl src, dst |
Zero-extend src (byte) to dst (doubleword) |
pushl src |
PushR[%esp] -= 4 M[R[%esp]] = src |
popl dst |
Popdst = M[R[%esp]] R[%esp] += 4 |
xchg mem/reg mem/reg |
Exchange the contents between two registers or between a register and memory (at least one must be a register)Both operands must be of the same data type, e.g., if one is byte, the other must be byte |
lea src, dst |
Load Effective Address DoubleWords, the instruction computes the effective address of a memory location and then places this address into a specified general-purpose register. |
leaq src, dst |
Load Effective Address Quadwords, similar to lea. |
Arithmetic and Logical Operations Instructions:
Instruction Format | Description |
---|---|
leal src, dst |
dst = &src , dst can only be a register |
incl dst |
dst += 1 |
decl dst |
dst -= 1 |
negl dst |
dst = -dst |
notl dst |
dst = ~dst |
addl src, dst |
dst += src |
subl src, dst |
dst -= src |
imull src, dst |
dst *= src |
xorl src, dst |
dst ^= src |
orl src, dst |
`dst |
andl src, dst |
dst &= src |
sall k dst |
dst << k |
shll k dst |
dst << k (same as sall ) |
sarl k dst |
dst >> k |
shrl k dst |
dst >> k (same as sarl ) |
Comparison Instructions:
Instruction Format | Description |
---|---|
cmpb s1, s2 |
s2 - s1 , compare byte, difference relationship |
testb s1, s2 |
s2 & s1 , compare byte, and relationship |
cmpw s1, s2 |
s2 - s1 , compare word, difference relationship |
testw s1, s2 |
s2 & s1 , compare word, and relationship |
cmpl s1, s2 |
s2 - s1 , compare doubleword, difference relationship |
testl s1, s2 |
s2 & s1 , compare doubleword, and relationship |
Jump Instructions:
Instruction Format | Description |
---|---|
jmp label |
Direct jump |
jmp *operand |
Indirect jump |
je label |
Jump if equal |
jne label |
Jump if not equal |
jz label |
Jump if zero |
jnz label |
Jump if not zero |
js label |
Jump if negative |
jns label |
Jump if not negative |
jg label |
Jump if greater |
jnle label |
Jump if greater |
jge label |
Jump if greater or equal |
jnl label |
Jump if greater or equal |
jl label |
Jump if less |
jnge label |
Jump if less |
jle label |
Jump if less or equal |
jng label |
Jump if less or equal |
SIMD
-related Instruction Set
- Advanced Vector Extensions
- 一文读懂SIMD指令集 目前最全SSE/AVX介绍
- Instruction Sets
Instruction Set | Description |
---|---|
MMX |
Introduced 8 new 64-bit vector registers MM0 to MM7 |
SSE |
Building on MMX , introduced 8 new 128-bit vector registers XMM0 to XMM7 . Only supports 128-bit floating point |
SSE2 |
Supports 128-bit integer |
SSE3 |
|
SSSE3 |
|
SSE4.1 |
|
SSE4.2 |
|
AVX |
Introduced 16 new 256-bit vector registers YMM0 to YMM15 . Floating point supports 256-bit, while integer only supports 128-bit |
AVX2 |
Integer now also supports 256-bit |
AVX512 |
Introduced 32 new 512-bit vector registers ZMM0 to ZMM31 |
- Data Types
Data Type | Description |
---|---|
__m128 |
Vector containing 4 float numbers |
__m128d |
Vector containing 2 double numbers |
__m128i |
Vector containing several integer numbers |
__m256 |
Vector containing 8 float numbers |
__m256d |
Vector containing 4 double numbers |
__m256i |
Vector containing several integer numbers |
Others:
Instruction | Description |
---|---|
enter size, nesting level |
Prepares the current stack frame. Where nesting level ranges from 0-31 which indicates the number of stack frame pointers copied from the previous frame to the new frame, typically 0 . enter $size, $0 is equivalent to push %rbp + mov %rsp, %rbp + sub $size, %rsp |
leave |
Restores the previous stack frame, equivalent to mov %rbp, %rsp , pop %rbp |
ret |
Returns after a function call |
cli |
Disables interrupts, ring0 |
sti |
Enables interrupts, ring0 |
lgdt src |
Loads the global descriptor |
lidt src |
Loads the interrupt descriptor |
cmov |
Conditional move instructions, used to eliminate branching |
2.1 How to Check Instructions
- x86 and amd64 instruction reference
- 汇编语言在线帮助
- X86 Opcode and Instruction Reference
- Intel x86/x64 开发者手册 卷1
- Intel x86/x64 开发者手册 卷2
- Intel® 64 and IA-32 Architectures Software Developer Manuals
- Good reference for x86 assembly instructions
- 汇编指令速查
- cgasm
1
2
3
4
5
6
7
8
9go get github.com/bnagy/cgasm
# 查看 GOPATH
go env | grep GOPATH
# 将 GOPATH 添加到环境变量 PATH 中
export PATH=${PATH}:$(go env | grep GOPATH | awk -F '=' '{print $2}' | sed -e 's/"//g')/bin
cgasm -f push
2.2 Memory Barriers
In the x86
architecture, there are several assembly instructions that can serve as memory barriers. Their role is to ensure that the order of memory accesses is not reordered, thereby ensuring the correctness and reliability of the program. These instructions include:
MFENCE
: All memory accesses before the execution of theMFENCE
instruction must complete before theMFENCE
instruction, and all memory accesses after theMFENCE
instruction must start after the execution of theMFENCE
instruction. This instruction acts as a full barrier, meaning it prevents all memory accesses on all processors.SFENCE
:SFENCE
ensures that all write operations before its execution have been committed to memory. Any write operations after theSFENCE
instruction cannot be reordered to occur before theSFENCE
instruction.LFENCE
:LFENCE
ensures that all read operations prior to its execution are complete. Any read operations after theLFENCE
instruction cannot be reordered to happen before theLFENCE
instruction.LOCK
instruction prefix: TheLOCK
prefix can be applied to specific instructions, such asLOCK ADD
,LOCK DEC
,LOCK XCHG
, etc. They guarantee that when executing an instruction with theLOCK
prefix, access to shared memory is serialized.
3 Register
For more information, please refer toSystem-Architecture-Register
64-bit register | Lower 32 bits | Lower 16 bits | Lower 8 bits |
---|---|---|---|
rax | eax | ax | al |
rbx | ebx | bx | bl |
rcx | ecx | cx | cl |
rdx | edx | dx | dl |
rsi | esi | si | sil |
rdi | edi | di | dil |
rbp | ebp | bp | bpl |
rsp | esp | sp | spl |
rip | eip | ? | ? |
r8 | r8d | r8w | r8b |
r9 | r9d | r9w | r9b |
r10 | r10d | r10w | r10b |
r11 | r11d | r11w | r11b |
r12 | r12d | r12w | r12b |
r13 | r13d | r13w | r13b |
r14 | r14d | r14w | r14b |
r15 | r15d | r15w | r15b |
In this context, rbp
is the base pointer register pointing to the bottom of the stack; rsp
is the stack pointer register pointing to the top of the stack; rip
is the instruction pointer register pointing to the next instruction to be executed.
Vector-related Registers (Refert toAdvanced Vector Extensions)
Register Name | Bit Size |
---|---|
xmm | 128 |
ymm | 256 |
zmm | 512 |
3.1 The registers used for parameters
The specific register used to store function parameters depends on the architecture and the calling convention being used. I’ll provide details for the x86-64 architecture using the System V AMD64 ABI calling convention, which is common for Unix-like systems (including Linux).
For the x86-64 System V AMD64 ABI:
- First parameter:
%rdi
- Second parameter:
%rsi
- Third parameter:
%rdx
- Fourth parameter:
%rcx
- Fifth parameter:
%r8
- Sixth parameter:
%r9
- Seventh parameter: Placed on the stack.
- Eighth parameter: Placed on the stack after the seventh parameter.
- … and so on.
4 Assembly Syntax
Reference:
4.1 Intel Syntax
4.1.1 Comment
1 | ; this is comment |
4.2 AT&T Syntax
4.2.1 Assembler Commands
Assembler directives (Assembler Directives
) start with an English period (‘.’) followed by letters for the rest of the command name. Typically, these are in lowercase. Below are some common commands:
Command | Description |
---|---|
.abort |
This command immediately terminates the assembly process. This is for compatibility with other assemblers. The initial idea was that the assembly language source would be fed into the assembler. If the program sending the source wanted to exit, it could use this command to notify as to exit. Use of .abort might not be supported in the future. |
.align abs-expr, abs-expr, abs-expr |
Increase the position counter (in the current subsection) to point to the specified storage boundary. The first expression argument (mandatory) represents the boundary base; the second expression argument represents the value of the filler byte, which is used to fill places passed by the position counter; the third expression argument (optional) indicates the maximum number of bytes this alignment command is allowed to cross. |
.ascii "str"... |
.ascii can have no arguments or several strings separated by commas. It saves each assembled string (without automatically appending a null byte at the end) in consecutive addresses. |
.asciz "str"... |
.asciz is similar to .ascii , but it automatically appends a null byte at the end of each string. |
.byte |
.byte can have no arguments or multiple expression arguments separated by commas. Each expression argument is assembled into the next byte. |
.data subsection |
.data informs as to append subsequent statements to the data section ending in subsection (which must be a pure expression). If the subsection argument is omitted, the default is 0. |
.def name |
Starts defining debug information for the symbol name . The definition area extends to the .endef command encountered. |
.end |
.end marks the end of the assembly file. as doesn’t process any statements after the .end command. |
.err |
If as assembles a .err command, it will print an error message. |
.float flonums |
Assemble 0 or more floating point numbers, separated by commas. |
.global symbol |
.global makes the symbol symbol visible to the linker ld . |
.int intnums |
Assemble 0 or more integers, separated by commas. |
.long |
Same as .int . |
.macro |
.macro and .endm are used to define macros. Macros can be used to generate assembly output. |
.quad bignums |
Assemble 0 or more long integers, separated by commas. |
.section name |
The .section command assembles subsequent code into a segment named name . |
.short shortnums |
Assemble 0 or more short integers, separated by commas. |
.single flonums |
Same as .float . |
.size |
This command is usually generated by compilers to add auxiliary debugging information in the symbol table. |
.string "str" |
Copies characters from the argument str into the target file. Multiple strings can be specified for copying, separated by commas. |
.text subsection |
Instructs as to assemble subsequent statements to the end of the text subsection identified by subsection , which is a pure expression. If the subsection argument is omitted, the default subsection is 0. |
.title "heading" |
When generating an assembly listing, use heading as the title. |
.word |
Same as .short . |
4.2.2 Symbol
A Symbol
is composed of letters and underscores, ending with a colon :
.
1 | <symbol_name>: |
4.2.3 Comment
1 | /* this is comment */ |
5 Practical Application
本小节转载摘录自不吃油条针对汇编语言的系列文章
5.1 Setting Up the Environment
Assembly Tools:
nasm
: Stands forNetwide Assembler
. It’s a generic assembler that usesIntel Syntax
.- Lacks the
PTR
keyword, somov DWORD PTR [rbp-0xc], edi
should be written asmov DWORD [rbp-0xc], edi
.
- Lacks the
masm
: Stands forMicrosoft Macro Assembler
. It’s specifically written by Microsoft for assembly underwindows
.gas
: Stands forGNU Assembler
and usesAT&T Syntax
.
1 | wget http://mirror.centos.org/centos/7/os/x86_64/Packages/nasm-2.10.07-7.el7.x86_64.rpm |
5.2 First Program
Objective of this section: Write assembly code equivalent to the following C++ program
1 | int main() { |
5.2.1 Intel Version
first.asm
is as follows:
1 | global main |
Compile and execute:
1 | nasm -o first.o -f elf64 first.asm |
5.2.2 AT&T Version
first.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o first.o first.asm |
5.3 Use Memory
Objective of this section: Calculate the value of 1+2 using memory.
5.3.1 Intel Version
use_memory.asm
is as follows:
1 | global main |
Compile and execute:
1 | nasm -o use_memory.o -f elf64 use_memory.asm |
The result shows a core dump
. This is because in the Linux operating system, memory is controlled by the operating system and cannot be read or written arbitrarily. You can change it to the following form:
1 | global main |
Compile and execute:
1 | nasm -o use_memory.o -f elf64 use_memory.asm |
In this version of the code, in addition to using [sui_bian_xie]
instead of a memory address, there are also two additional lines:
- The first line indicates that the following content, after compilation, will be placed in the data section of the executable file and will be allocated corresponding memory when the program starts.
- The second line is crucial as it describes the actual data. This line means that a 4-byte space is allocated and filled with zeros. The
dw
(double word) here represents 4 bytes. Thesui_bian_xie
in front is just a name, which means you can write anything here for ease of distinction when writing code. Thissui_bian_xie
will be processed by the compiler into a specific address during compilation. We don’t need to worry about the exact address; we just need to know that thesui_bian_xie
before and after represents the same thing.
1 | section .data |
Here’s another example, use_memory2.asm
, as follows:
1 | global main |
Compile and execute:
1 | nasm -o use_memory2.o -f elf64 use_memory2.asm |
5.3.2 AT&T Version
use_memory.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o use_memory.o use_memory.asm |
use_memory2.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o use_memory2.o use_memory2.asm |
5.4 Translate the first C program:
For the following C program:
1 | int x = 0; |
5.4.1 Intel Version
first_c.asm
is as follows:
1 | global main |
Compile and execute:
1 | nasm -o first_c.o -f elf64 first_c.asm |
5.4.2 AT&T Version
first_c.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o first_c.o first_c.asm |
5.5 Translating C Language if Statements
For the following C program:
1 | int main() { |
5.5.1 Intel Version
if_c.asm
is as follows:
1 | global main |
Compile and execute:
1 | nasm -o if_c.o -f elf64 if_c.asm |
5.5.2 AT&T Version
if_c.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o if_c.o if_c.asm |
5.6 Translating C Language Loop Statements
For the following C program:
1 | int main() { |
Or the for
version as follows:
1 | int main() { |
The above two forms of loops, with slight adjustments, can be simplified to the following equivalent program:
1 | int main() { |
5.6.1 Intel Version
loop_c.asm
is as follows:
1 | global main: |
Compile and execute:
1 | nasm -o loop_c.o -f elf64 loop_c.asm |
5.6.2 AT&T Version
loop_c.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o loop_c.o loop_c.asm |
5.7 Translating C Language Function Calls
1 | int fibonacci(int num) { |
To facilitate conversion to assembly, the program above has been rewritten into the following equivalent program:
1 | int fibonacci(int num) { |
Since registers are global resources, for recursive calls to work, before making a call, it’s necessary to push each register used by the current function onto the stack, and after the call returns, restore the registers. The instructions involved include:
call
: Initiates a function call.ret
: Returns from a function.push
: Pushes onto the stack.pop
: Pops from the stack.rsp
: Stack pointer.rbp
: Base pointer.
5.7.1 Intel Version
fibonacci_c.asm
is as follows:
1 | global main: |
Compile and execute:
1 | nasm -o fibonacci_c.o -f elf64 fibonacci_c.asm |
Among these, preparing the current stack frame and restoring the parent stack frame can be replaced with enter
and leave
.
1 | push rbp ; 保存上一级的栈底指针 |
Can be replaced with
1 | enter 0xc, 0x0 |
5.7.2 AT&T Version
fibonacci_c.asm
is as follows:
1 | .text |
Compile and execute:
1 | as -o fibonacci_c.o fibonacci_c.asm |
Among these, preparing the current stack frame and restoring the parent stack frame can be replaced with enter
and leave
.
1 | push %rbp # 保存上一级的栈底指针 |
Can be replaced with
1 | enter $0xc, $0x0 |
6 Tips
6.1 How to generate readable assembly code
Approach 1: (Not easy to understand)
1 | # Generate |
Approach 2:
1 | # 生成目标文件 |
7 参考
- Assembly Programming Tutorial
- x86 instruction listings
- 汇编语言入门一:环境准备
- 汇编语言入门二:环境有了先过把瘾
- 汇编语言入门三:是时候上内存了
- 汇编语言入门四:打通C和汇编语言
- 汇编语言入门五:流程控制(一)
- 汇编语言入门六:流程控制(二)
- 汇编语言入门七:函数调用(一)
- 汇编语言入门八:函数调用(二)
- 汇编语言入门九:总结与后续(闲扯)
- What is the difference between “mov (%rax),%eax” and “mov %rax,%eax”?
- Using GCC to produce readable assembly?
- 汇编语言入门教程:汇编语言程序设计指南(精讲版)
- AT&T ASM Syntax详解
- 汇编语言–x86汇编指令集大全
- x86 Assembly Guide