1 What is GDB
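- A debugger that supports many languages, including C and C++
- It lets you inspect what a program is doing at a given point during its execution
- It can pinpoint the specific cause of errors such as segmentation faults
To debug a C/C++ program, compile it with the -g option:
    gcc -g hello.c -o hello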
2 How to use GDB
gdb provides an interactive shell: press ↑ to recall previous commands, press Tab to complete commands, and run help [command] to read the help documentation.
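There are several ways to enter the gdb interactive shell:
- gdb <binary_with_-g>: debug an executable
- gdb <binary_with_-g> core.xxx: analyze a core dump
- gdb <binary_with_-g> <pid_without_-g>: debug the given process, using the executable as the source of symbol metadata
  - <binary> must be compiled with -g; otherwise there is no point in specifying it
  - The process behind <pid> may come from a build without -g, as long as it was built from the same source code
- gdb -p <pid_with_-g>: debug the given process
  - If the process behind <pid> was compiled with -g, this is much like gdb <binary_with_-g> followed by run, except that gdb attaches to the already-running process and pauses it wherever it currently is
The following demonstrates entering the gdb shell via gdb <binary_with_-g> <pid_without_-g>:
    # Source code
    # Run the non-debug build
    # Look up the process ID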
2.1 Symbol Mismatch
Symbol mismatch can occur when you use a binary compiled on one machine and attempt to run it on a different machine, especially if the two machines have different configurations or architectures. Here’s why this can happen:
- Library Dependencies: Binaries often rely on dynamic link libraries (shared libraries) or other system libraries. If the target machine doesn’t have the same versions of these libraries or they are missing altogether, you can encounter symbol mismatch errors.
- Architecture Differences: If the two machines have different CPU architectures (e.g., x86 vs. ARM), binaries compiled for one architecture may not run on the other. This is a fundamental incompatibility.
- Operating System Differences: Even if two machines have the same architecture, they may have different operating systems with different system calls and ABI (Application Binary Interface) specifications. This can lead to symbol mismatches.
- Compiler and Compiler Options: The compiler used to build the binary can affect symbol compatibility. Different compiler versions or options might generate different symbol names or behaviors.
- Bitness: Some operating systems and architectures have both 32-bit and 64-bit versions. Trying to run a binary compiled for one bitness on a machine of a different bitness can result in symbol mismatches.
To avoid symbol mismatch issues when moving binaries between machines:
- Use Static Linking: Consider statically linking libraries into your binary when compiling. This bundles the necessary libraries into the binary, reducing dependencies on external libraries.
- Build on the Target Machine: Whenever possible, compile your code on the machine where you intend to run it. This ensures that the binary is built with the correct dependencies and configurations.
- Cross-Compilation: If you must build on one machine and run on another, use cross-compilation tools to generate binaries specifically tailored for the target machine’s architecture and operating system.
- Package Managers: If you’re working with package managers (e.g., apt, yum, brew), use them to manage library dependencies and ensure compatibility between systems.
- Containerization: Consider using containerization technologies like Docker to package your application along with its dependencies, ensuring portability across different environments.
When a core dump file (usually named “core”) is generated on one machine and debugged with GDB on that same machine, symbol mismatch should not be a significant issue. Here’s why:
- Binary Compatibility: The core dump records the state of the program at the moment it crashed or was interrupted, including its memory contents and register values. Because GDB runs on the same machine where the binary was executed, the binary and the shared libraries it loaded are the same files GDB reads, so no mismatch is introduced by another machine’s architecture or libraries.
- GDB Compatibility: GDB is designed to analyze a core dump together with the binary that produced it. It uses the debugging information (symbols) embedded in that binary to interpret the core dump. As long as the binary and the core dump match in architecture, compiler options, and library versions, you should be able to use GDB without significant issues.
- Symbol Resolution: GDB uses the symbol information present in the binary (if it was compiled with debugging symbols) to resolve symbols during debugging. On the machine where the program ran, the shared libraries referenced by the core dump are still available at their original paths, so no additional symbol files or libraries need to be copied over.
3 Command
3.1 Run Program
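After entering the gdb shell via gdb <binary>, the program does not start running immediately; use run or start to launch it:
- run: start the program and run until the first breakpoint is hit or the program exits
- start: start the program and stop at the first line of main
If the program fails (for example, with a segmentation fault), gdb reports useful information such as the line number where the fault occurred and the arguments of the function involved.
    # C++ source file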
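As a rough illustration of the kind of bug this helps with, here is a minimal sketch of a function that faults when misused. The name name_length is hypothetical and is not taken from the original source file. Assuming it is built with -g and called with nullptr from main, run would typically stop with SIGSEGV and report the marked line together with name=0x0.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original post): counts characters up to
// the terminating '\0'. It never checks for a null pointer, so calling it
// with nullptr reads through address 0 and the process normally dies with
// SIGSEGV on the marked line.
std::size_t name_length(const char* name) {
    std::size_t n = 0;
    while (name[n] != '\0') {  // faulting line when name == nullptr
        ++n;
    }
    return n;
}
```

With a valid string the function behaves normally, so the crash only appears on the bad call path, which is exactly the situation where the line number and argument values printed by gdb are useful.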
You can also set the program's arguments with set args. For example:
- set args -l a -C abc
- set args --gtest_filter=TestXxx.caseX
3.2 Attach Program
gdb -p 12345
3.3 Break Point
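- break: set a breakpoint
  - break <line_num>
  - break <func_name>
  - break <file_name>:<line_num>
  - break <file_name>:<func_name>
- info break: list breakpoints
- delete: delete breakpoints
  - delete <break_id>: delete the specified breakpoint
  - delete: delete all breakpoints
- enable: enable a breakpoint
  - enable <break_id>
- disable: disable a breakpoint
  - disable <break_id>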
3.3.1 demo
    # C++ source file
    1 #include <iostream>
    # Press Enter to print the next 10 lines
    11
    # Press Enter to print the next 10 lines
    21
    # Set a breakpoint at line 8
3.4 Debugging
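- continue: resume execution until the program exits or the next breakpoint is hit
- step: source-level single step that enters function calls (step into)
- next: source-level single step that does not enter function calls; a call counts as one step (step over)
- stepi: instruction-level single step that enters function calls (step into)
- nexti: instruction-level single step that does not enter function calls; a call counts as one step (step over)
- until: run until a source line past the current one is reached; commonly used to get out of a loop
- finish: run until the current function returns
- display <variable>: track a variable and show its value every time the program stops
- undisplay <display_id>: stop tracking
- watch: set a watchpoint; when the watched variable is modified, execution stops and the old and new values are printed
- thread <id>: switch the thread being debugged to the given thread
- up [<n>]: move up the stack by one frame, or by n frames
- down [<n>]: move down the stack by one frame, or by n frames
- frame: show the current stack frame, including the current source line
- frame <n>: jump to the given stack frame
- attach <pid>: attach (or re-attach) to a process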
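To try the stepping commands above, a small loop is enough. The function below is a hypothetical practice target (sum_to is not part of the original post); the comments suggest where each command might be tried, assuming an unoptimized build with -g.

```cpp
// Hypothetical practice target: sums the integers 1..n.
int sum_to(int n) {
    int sum = 0;                    // candidate for `watch sum` or `display sum`
    for (int i = 1; i <= n; ++i) {
        sum += i;                   // `next` revisits this line on every iteration
    }                               // `until` at the loop's end helps leave the loop
    return sum;                     // `finish` returns to the caller and prints this value
}
```

Setting break sum_to and then alternating next with display sum makes the per-iteration change visible without adding any print statements to the code.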
3.5 Display Information
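- bt, backtrace, where: show the current call stack
  - bt 3: the innermost 3 frames
  - bt -3: the outermost 3 frames
- disassemble: show assembly instructions
  - disassemble: assembly of the current function
  - disassemble <function>: assembly of the given function
  - set disassembly-flavor intel: use Intel syntax
  - set disassembly-flavor att: use AT&T syntax, which is the default
- list: show source code
  - list: print the 10 lines following the last output
  - list -: print the 10 lines preceding the last output
  - list <linenumber>: print 10 lines centered on the given line of the current file
  - list <function>: print 10 lines centered on the beginning of the given function
  - list <filename:linenum>: print 10 lines centered on the given line of the given file
  - list <filename:function>: print 10 lines centered on the beginning of the given function in the given file
  - set substitute-path /root/starrocks /other/path/starrocks: remap the source path. When a binary is built on machine A or inside Docker but the core file is analyzed on machine B, the source paths usually do not match, so use this command to adjust them
- info: show various kinds of debugging information
  - info break: list breakpoints
  - info reg: show registers
  - info all-reg: show all registers, including floating-point and vector registers
  - info stack: show the call stack
  - info thread: show threads
  - info locals: show local variables
  - info args: show function arguments
  - info symbol <address>: show the symbol at the given memory address, if there is one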
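- print: inspect variables
  - print <variable>
  - print <variable>.<field>
  - print (<type>)*<address>: view the object an address points to; a cast is required
  - print *(<type>*)<address>: view the object an address points to; a cast is required
  - print <array>[0]@5: view the 5 elements starting at index 0
  - View or change print settings: show print <property>, set print <property> on/off. Some commonly used properties are listed below (see the full list with show print [tab][tab] or help show print)
    - address: show addresses when printing function information; on by default
    - array: print arrays with one element per line; off by default
    - elements: maximum number of array elements to print; elements beyond the limit are not shown, and 0 means unlimited
    - raw-values: print raw contents. On GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1, standard library objects are printed in a pretty-printed form by default (the container's elements) rather than the container's own fields
    - pretty: print in a human-friendly layout (line breaks, indentation, and so on); off by default
  - std::vector debugging tips
    - print sizeof(*v._M_impl._M_start): size of one element
    - print v._M_impl._M_finish - v._M_impl._M_start: number of elements
    - print v._M_impl._M_start[i]: the i-th element
    - print &v._M_impl._M_start[i]: address of the i-th element
    - print v._M_impl._M_start[i]@j: print j elements starting from the i-th element
  - std::shared_ptr debugging tips
    - print *p._M_ptr: reveal the concrete type; the output includes something like <vtable for Derive+16>, which tells you the actual type
    - print *(<type>*)p._M_ptr: show the details of that type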
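- x/<count><format><size> <addr>: print memory contents in the given format
  - <count>: a positive integer, the number of memory units to display, counting forward from the given address; the size of one unit is set by the third parameter, <size>
  - <format>: the output format for the memory that addr points to
    - o: octal
    - x: hex
    - d: decimal
    - u: unsigned decimal
    - t: binary
    - f: float
    - a: address
    - i: instruction
    - c: char
    - s: string
    - z: hex, zero padded on the left
  - <size>: the number of bytes in one memory unit; the default is 4
    - b: 1 byte
    - h: 2 bytes
    - w: 4 bytes
    - g: 8 bytes
  - Examples:
    - x/1ug $rbp-0x4: show the contents at the address obtained by subtracting 0x4 from the value in register rbp
    - x/1xg $rsp: show the contents at the address stored in register rsp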
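To make the count/format/size triple more concrete, the sketch below formats memory the way x/<count>xb would: count units, hex format, one byte per unit. The helper dump_bytes is hypothetical and only loosely imitates gdb's output (gdb also prints an address prefix, which is omitted here).

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats `count` one-byte units starting at `addr` as hex, loosely
// imitating the values shown by `x/<count>xb <addr>`.
std::string dump_bytes(const void* addr, std::size_t count) {
    const unsigned char* p = static_cast<const unsigned char*>(addr);
    std::string out;
    char buf[8];
    for (std::size_t i = 0; i < count; ++i) {
        // One memory unit: a single byte rendered as 0xNN.
        std::snprintf(buf, sizeof(buf), "0x%02x", static_cast<unsigned>(p[i]));
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Switching the unit size from b to h, w, or g in gdb corresponds to reading 2, 4, or 8 bytes per unit instead of one, which on a little-endian machine also changes the order in which the bytes appear within each printed value.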
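info reg prints register contents in two columns: the first is hexadecimal (raw format) and the second is the register's natural format. The natural format depends on the register's type; the size and type of each register are shown below
- Registers of type int64: the natural format is decimal
- Registers of type data_ptr or code_ptr: the natural format is still hexadecimal, so both columns show exactly the same value
    <reg name="rax" bitsize="64" type="int64"/>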
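Returning to the std::vector tips listed under print: they work because a libstdc++ vector stores its elements between two raw pointers. The sketch below is a simplified stand-in for that layout, not the real implementation; VecImpl and its member names are illustrative, and the real fields live in v._M_impl.

```cpp
#include <cstddef>

// Simplified stand-in for the three pointers a libstdc++ std::vector<int>
// keeps in _M_impl. The member names here are illustrative only.
struct VecImpl {
    int* start;           // first element             (cf. _M_start)
    int* finish;          // one past the last element (cf. _M_finish)
    int* end_of_storage;  // one past the allocation   (cf. _M_end_of_storage)
};

// Same arithmetic as `print v._M_impl._M_finish - v._M_impl._M_start`.
std::size_t element_count(const VecImpl& v) {
    return static_cast<std::size_t>(v.finish - v.start);
}

// Same access as `print v._M_impl._M_start[i]`.
int element_at(const VecImpl& v, std::size_t i) {
    return v.start[i];
}
```

Because the pointers are typed, pointer subtraction already yields an element count rather than a byte count, which is why the gdb expression needs no division by the element size.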
3.5.1 demo of print
    # C++ source file
    1 struct Person {
    # Set a breakpoint
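Another possible practice target for print is a std::shared_ptr whose static type differs from the object it owns. The types below are hypothetical and only echo the Derive name from the shared_ptr tips. Assuming libstdc++ and a build with -g, print *p._M_ptr on the returned pointer would be expected to show a vtable entry for Derive, and print *(Derive*)p._M_ptr would then expose the extra field.

```cpp
#include <memory>
#include <string>

// Hypothetical types: the pointer's static type is Base,
// but the object it owns is a Derive.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derive : Base {
    int extra = 42;  // only visible in gdb after casting to Derive*
    std::string name() const override { return "Derive"; }
};

// Returns a Base pointer that actually owns a Derive object.
std::shared_ptr<Base> make_object() {
    return std::make_shared<Derive>();
}
```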
3.6 Load Symbol Table
symbol-file /path/to/binary_file.debuginfo
3.7 Execute outside commands
Format: !<command> [params]
    (gdb) !pwd
3.8 Tips
3.8.1 Redirect source file path
    (gdb) set substitute-path
3.8.2 Redirect Thread Info to File
    (gdb) set pagination off
4 Tips
4.1 How to Analyze a Core File
Here are some tips:
- info threads: The default thread may not be where the crash occurred.
- thread <n>: Switch to the specific thread where something may have gone wrong.
- bt <n>: List the call stack.
- frame <n>: Go to the specific call frame.
- info locals, info args: See local variables and arguments.
- print: Inspect a specific value in detail.
4.2 Debugging an x86 application in Rosetta for Linux
I’m running a centos7.9/amd64 Docker container on my Mac (M3), and debugging a core file fails with the error message below:
    $ gdb main core
And there’s a mechanism called Rosetta:
This is a software bridge that allows applications compiled for one instruction set architecture (such as Intel x86) to run on a different architecture (like Apple’s ARM-based processors). Apple has used two versions: Rosetta for the transition from PowerPC to Intel processors, and Rosetta 2 for the transition from Intel to Apple Silicon.
According to Debugging an x86 application in Rosetta for Linux, I can debug the program by:
    # this will hang
5 gdb-dashboard
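gdb-dashboard builds on gdb to provide a friendlier, formatted interface
- help dashboard: show the help manual
- dashboard thread: enable/disable the thread information (in large projects with many threads, this is usually disabled)
- dashboard: refresh; after inspecting variables with print, you usually need to refresh to redisplay the details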
6 LLDB
LLDB is similar to GDB in most operations, but there are some differences (see help for more details):
- There is no start command
- frame select <id>: select a frame
- lldb -c <core> <binary>: analyze a core file