Reading the AFL Source Code (Part 1): Setting Out

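The author is due to finish a master's degree at the end of next year, yet still has no thesis topic. After some investigation, fuzzing looked like an area with real work left to do, so the plan is to dig deeper into it.

It is well known that AFL-family fuzzers have several weaknesses:

They find too few "valuable vulnerabilities". Many of the bugs a fuzzer reports are inconsequential, for example a missing validation that leads to a division by zero and makes the program die with SIGFPE. Such bugs are either worthless in themselves or so hard to exploit that one has to invent a story to construct an exploitation scenario. On the other hand, AFL overlooks logic bugs, because only crashes are reported.
Their search ability is too weak. For example, AFL struggles to construct inputs that pass magic-number or checksum checks, and in practice researchers often have to comment those checks out by hand.

As for the first difficulty, the author sees no good solution for now. Ultimately, whether a vulnerability is valuable depends on a lot of outside knowledge. For example, an input that makes the program run one extra second is a serious problem for nginx but irrelevant for a small tool like checkpng. A buffer over-read is fatal for openssl, but for a program like gimp it is just one more crash bug, and nobody worries much about its security impact. Since the value of a vulnerability is hard to measure, the author will not pursue this question and will keep using the number of crashes as the primary measure of a fuzzer's quality.

The second difficulty is where improvement looks feasible. First, no fuzzer can find every bug. Argue by contradiction: if such a fuzzer existed, we would obtain an oracle that solves arbitrary constraint problems. For example, give it the following program: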

// N is a fixed, large composite constant
cin >> p;
if (1 < p && p < N && N % p == 0)
    abort();

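If a fuzzer really could find every bug in a program, it would have to report some $p$ with $p \mid N$ that crashes this program. Once trivial answers such as $p = 1$ and $p = N$ are ruled out, that amounts to factoring $N$. This sounds like a fantasy: no solver that powerful exists today.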
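
To make the cost concrete, here is a minimal sketch of what a fuzzer would effectively have to do for the program above, written as plain trial division. The name find_crashing_input is made up for illustration, and a real fuzzer mutates bytes rather than enumerating integers; the only point is that the number of candidates grows with N.

```c
#include <stdint.h>

/* Hypothetical helper: returns the smallest p with 1 < p < n and n % p == 0,
   i.e. an input that would reach abort() in the example program.
   Returns 0 when no such p exists (n is prime or n < 4). */
uint64_t find_crashing_input(uint64_t n) {
  for (uint64_t p = 2; p <= n / p; p++) {   /* same as p * p <= n, without overflow */
    if (n % p == 0) return p;
  }
  return 0;
}
```

For a 64-bit N this loop can take on the order of billions of iterations, and for the key sizes used in cryptography it is hopeless, which is the intuition behind the argument.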

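Since no perfect solver exists, we might instead give the fuzzer finer-grained solving ability. That ability, however, tends to be negatively correlated with fuzzing throughput: hunting for bugs with symbolic execution does not, in practice, do better than AFL. One idea in AFL's design philosophy is that when a path is hard to trigger, we can fuzz faster and hope to stumble on an easier way to reach it. Following that idea, AFL keeps tight control over the time each execution of the target takes.

Clearly, a fuzzer has to trade execution speed against solving ability. Some approaches try to strengthen AFL's solving ability without hurting performance much. EMS, for example, argues that experience gained while fuzzing one program can help when fuzzing others. Intuitively, fuzzing libpng teaches us many PNG-related magic numbers, and reusing that knowledge to fuzz imagemagick is likely to pay off. EMS introduces no constraint solver; it learns mutation patterns of the form "$A\to B$" from samples of other programs, and at a macro level this does improve AFL's search ability.

The author also plans to start from this trade-off between execution speed and solving ability when optimizing AFL. In other words, the goal is to find ways of improving the fuzzer's search ability that cost little execution speed. That will inevitably require modifying AFL, so the author decided to read through the AFL source code.

Frankly, this is the author's first time reading the source of a mid-sized project in full, so the process is bound to be unorthodox in places. Corrections in the comments are welcome.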

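0x00 Project File Analysis

The author chose to read the AFL source rather than AFL++, because AFL++ has added many features and optimizations and its code is presumably more complicated than AFL's. Besides, the author intends to build modifications on top of AFL, so the original AFL is the best choice.

The AFL repository is at https://github.com/google/AFL and is no longer under development. People who read the Linux source say that a good way in is to start from the first few commits, since there are few extra features at that point and the core insight stands out. However, GitHub does not carry the 2013 version of AFL, so that plan was dropped in favor of reading the latest code directly.

Import the project into the Understand source browser and look at the Metrics Treemap:

afl-fuzz.c has a little over 8,000 lines and is the largest source file. The others, such as afl-tmin.c, afl-analyze.c, and afl-showmap.c, are around 1,000 lines each. Now look at the code dependencies:

Next, let us look at the project's CI configuration, the .travis.yml file.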

language: c

env:
  - AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1 AFL_NO_UI=1 AFL_STOP_MANUALLY=1
  - AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1 AFL_NO_UI=1 AFL_EXIT_WHEN_DONE=1
 # TODO: test AFL_BENCH_UNTIL_CRASH once we have a target that crashes
  - AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1 AFL_NO_UI=1 AFL_BENCH_JUST_ONE=1

before_install:
  - sudo apt update
  - sudo apt install -y libtool libtool-bin automake bison libglib2.0

# TODO: Look into splitting off some builds using a build matrix.
# TODO: Move this all into a bash script so we don't need to write bash in yaml.
script:
  - make
  - ./afl-gcc ./test-instr.c -o test-instr-gcc
  - mkdir seeds
  - echo "" > seeds/nil_seed
  - if [ -z "$AFL_STOP_MANUALLY" ];
    then ./afl-fuzz -i seeds -o out/ -- ./test-instr-gcc;
    else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-gcc;
    fi
  - .travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 3
  - rm -r out/*
  - ./afl-clang ./test-instr.c -o test-instr-clang
  - if [ -z "$AFL_STOP_MANUALLY" ];
    then ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang;
    else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang;
    fi
  - .travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 2
  - make clean
  - CC=clang CXX=clang++ make
  - cd llvm_mode
  # TODO: Build with different versions of clang/LLVM since LLVM passes don't
  # have a stable API.
  - CC=clang CXX=clang++ LLVM_CONFIG=llvm-config make
  - cd ..
  - rm -r out/*
  - ./afl-clang-fast ./test-instr.c -o test-instr-clang-fast
  - if [ -z "$AFL_STOP_MANUALLY" ];
    then ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang-fast;
    else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang-fast;
    fi
  - .travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 3
  # Test fuzzing libFuzzer targets and trace-pc-guard instrumentation.
  - clang -g -fsanitize-coverage=trace-pc-guard ./test-libfuzzer-target.c -c
  - clang -c -w llvm_mode/afl-llvm-rt.o.c
  - wget https://raw.githubusercontent.com/llvm/llvm-project/main/compiler-rt/lib/fuzzer/afl/afl_driver.cpp
  - clang++ afl_driver.cpp afl-llvm-rt.o.o test-libfuzzer-target.o -o test-libfuzzer-target
  - timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-libfuzzer-target
  - cd qemu_mode
  - ./build_qemu_support.sh
  - cd ..
  - gcc ./test-instr.c -o test-no-instr
  - if [ -z "$AFL_STOP_MANUALLY" ];
    then ./afl-fuzz -Q -i seeds -o out/ -- ./test-no-instr;
    else timeout --preserve-status 5s ./afl-fuzz -Q -i seeds -o out/ -- ./test-no-instr;
    fi
  - .travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 12 -p 9

This CI flow builds AFL first and then runs a few tests. Let us go through the script piece by piece:

# Build AFL
make

# Compile the target program with afl-gcc
./afl-gcc ./test-instr.c -o test-instr-gcc

# Prepare the seed corpus
mkdir seeds
echo "" > seeds/nil_seed

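# Run a fuzz session: capped at 5 s by timeout when AFL_STOP_MANUALLY is set, otherwise afl-fuzz runs until it exits by itself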
if [ -z "$AFL_STOP_MANUALLY" ];
then ./afl-fuzz -i seeds -o out/ -- ./test-instr-gcc;
else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-gcc;
fi

# Check the fuzzer_stats file that afl-fuzz wrote (it records the fuzzer's current status); here the peak_rss_mb key is compared against an expected value
.travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 3

# Clean up
rm -r out/*

# Repeat the fuzz run with afl-clang; the steps mirror the afl-gcc ones
./afl-clang ./test-instr.c -o test-instr-clang
if [ -z "$AFL_STOP_MANUALLY" ];
then ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang;
else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang;
fi
.travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 2
make clean

# Repeat the fuzz run in llvm_mode
CC=clang CXX=clang++ make
cd llvm_mode
CC=clang CXX=clang++ LLVM_CONFIG=llvm-config make
cd ..
rm -r out/*
./afl-clang-fast ./test-instr.c -o test-instr-clang-fast
if [ -z "$AFL_STOP_MANUALLY" ];
then ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang-fast;
else timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-instr-clang-fast;
fi
.travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 1 -p 3

# Test compatibility with libFuzzer targets
clang -g -fsanitize-coverage=trace-pc-guard ./test-libfuzzer-target.c -c
clang -c -w llvm_mode/afl-llvm-rt.o.c
wget https://raw.githubusercontent.com/llvm/llvm-project/main/compiler-rt/lib/fuzzer/afl/afl_driver.cpp
clang++ afl_driver.cpp afl-llvm-rt.o.o test-libfuzzer-target.o -o test-libfuzzer-target
timeout --preserve-status 5s ./afl-fuzz -i seeds -o out/ -- ./test-libfuzzer-target

# Test qemu mode
cd qemu_mode
./build_qemu_support.sh
cd ..
gcc ./test-instr.c -o test-no-instr
if [ -z "$AFL_STOP_MANUALLY" ];
  then ./afl-fuzz -Q -i seeds -o out/ -- ./test-no-instr;
  else timeout --preserve-status 5s ./afl-fuzz -Q -i seeds -o out/ -- ./test-no-instr;
  fi
.travis/check_fuzzer_stats.sh -o out -k peak_rss_mb -v 12 -p 9
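
The check_fuzzer_stats.sh calls read one key out of out/fuzzer_stats and compare it with an expected value. The following is a rough C sketch of that idea, not the script's actual logic. It assumes that fuzzer_stats is a text file of "key : value" lines and that -p acts as a tolerance around -v; the names stats_lookup and within are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find "key   : value" in a fuzzer_stats-style text and
   store the numeric value in *out. Returns 1 on success, 0 if key is absent. */
int stats_lookup(const char *text, const char *key, double *out) {
  size_t klen = strlen(key);
  const char *line = text;

  while (line && *line) {
    if (!strncmp(line, key, klen)) {
      const char *p = line + klen;
      while (*p == ' ') p++;               /* padding between key and ':' */
      if (*p == ':') {
        *out = strtod(p + 1, NULL);        /* strtod skips leading blanks */
        return 1;
      }
    }
    line = strchr(line, '\n');             /* advance to the next line */
    if (line) line++;
  }
  return 0;
}

/* Assumed meaning of "-v V -p P": accept values within P of V. */
int within(double value, double expected, double tolerance) {
  double d = value - expected;
  return (d < 0 ? -d : d) <= tolerance;
}
```

Under these assumptions, the first CI check corresponds to calling stats_lookup(text, "peak_rss_mb", &v) and then within(v, 1, 3).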

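0x01 Deciding the Reading Order

AFL is not a simple CRUD project; it is packed with fine-grained optimizations. We need an entry point into the source. Reading afl-fuzz.c directly would likely trap us in a loop of "see a function -> look up its definition -> find it calls yet another function ...", so the author will read other parts of the project first, build up some understanding, and then come back to afl-fuzz.c. It is like reading a small program that uses an image-processing library before reading the library's core: once the purpose of each API is familiar, the core becomes easier to follow.

Our main goals in reading the source should be:

Understand static instrumentation (gcc, clang, llvm mode)
Understand the fuzzing process: how inputs are mutated, how an input is passed to the program, and how coverage information is collected
Understand instrumentation and execution in qemu mode

This gives the following reading order:

Read afl-gcc.c and afl-as.c, the static instrumentation code
Read afl-tmin.c. This tool shrinks an input case while keeping the same coverage as the original input. It demonstrates the whole process of collecting a program's coverage without involving the rest of afl-fuzz.c, which makes it an excellent cross-section for studying how AFL collects coverage
Read afl-fuzz.c

So we start with afl-gcc.c.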

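0x02 afl-gcc: the Compile Command Generator

Since reading will involve a fair amount of dynamic debugging, we connect VS Code remotely to a Linux virtual machine. After importing the source, the clangd extension cannot find the AFL_PATH constant.

A look at the Makefile: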

PREFIX     ?= /usr/local
BIN_PATH    = $(PREFIX)/bin
HELPER_PATH = $(PREFIX)/lib/afl
DOC_PATH    = $(PREFIX)/share/doc/afl
MISC_PATH   = $(PREFIX)/share/afl

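So these constants are supplied by make. According to a Stack Overflow answer, clangd can pick up compile flags from compile_commands.json, and according to the clangd website, the following commands generate that JSON file: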

make clean
bear -- make

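After trying it, the problem was indeed solved. Now for the code.

According to the comments, afl-gcc is a wrapper around gcc, g++, clang, and clang++. Its job is to set some compile options and then invoke the underlying compiler. In fact, the built afl-clang, afl-g++, and similar files are all symlinks to afl-gcc.

afl-gcc needs to know the path of afl-as. afl-as is the instrumenter; its logic is analyzed later in this article. By default afl-as lives in /usr/local/lib/afl/, but the location can also be set through AFL_PATH.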

📓

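Some details:
If AFL_HARDEN is set, afl-gcc passes a few switches to the downstream compiler (-D_FORTIFY_SOURCE=2, -fstack-protector-all) so that memory bugs surface and reproduce more easily. If AFL_USE_ASAN is set, ASan is used.

In addition, afl-gcc lets the user pick a non-standard downstream compiler (that is, something other than gcc or clang) by setting AFL_CC and AFL_CXX.

Start with the main function: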

/* Main entry point */

int main(int argc, char** argv) {

  // Unless in quiet mode, print the banner with the author's address
  if (isatty(2) && !getenv("AFL_QUIET")) {
    SAYF(cCYA "afl-cc " cBRI VERSION cRST " by <lcamtuf@google.com>\n");
  } else be_quiet = 1;

  // If invoked without arguments, print the help text and exit
  if (argc < 2) {
    SAYF("\n"
         "This is a helper application for afl-fuzz. It serves as a drop-in replacement\n"
         "for gcc or clang, letting you recompile third-party code with the required\n"
         "runtime instrumentation. A common use pattern would be one of the following:\n\n"

         "  CC=%s/afl-gcc ./configure\n"
         "  CXX=%s/afl-g++ ./configure\n\n"

         "You can specify custom next-stage toolchain via AFL_CC, AFL_CXX, and AFL_AS.\n"
         "Setting AFL_HARDEN enables hardening optimizations in the compiled code.\n\n",
         BIN_PATH, BIN_PATH);

    exit(1);
  }

  // Locate afl-as
  find_as(argv[0]);

  // Edit the compiler arguments
  edit_params(argc, argv);

  // Execute the downstream compiler
  execvp(cc_params[0], (char**)cc_params);

  FATAL("Oops, failed to execute '%s' - check your PATH", cc_params[0]);

  return 0;

}

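The logic is clear: call find_as(argv[0]) to locate afl-as, then edit the compiler arguments and execute the downstream compiler. Going in order, first look at find_as, the function responsible for locating afl-as: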

/* Try to find our "fake" GNU assembler in AFL_PATH or at the location derived
   from argv[0]. If that fails, abort. */

static void find_as(u8* argv0) {
  // Read $AFL_PATH from the environment
  u8 *afl_path = getenv("AFL_PATH");
  u8 *slash, *tmp;

  // If $AFL_PATH is set and $AFL_PATH/as is executable, we are done
  if (afl_path) {

    tmp = alloc_printf("%s/as", afl_path);

    if (!access(tmp, X_OK)) {
      as_path = afl_path;
      ck_free(tmp);
      return;
    }

    ck_free(tmp);

  }

  // Look for afl-as in the directory that contains argv[0]
  slash = strrchr(argv0, '/');

  if (slash) {

    u8 *dir;

    *slash = 0;
    dir = ck_strdup(argv0);
    *slash = '/';

    tmp = alloc_printf("%s/afl-as", dir);

    if (!access(tmp, X_OK)) {
      as_path = dir;
      ck_free(tmp);
      return;
    }

    ck_free(tmp);
    ck_free(dir);

  }

  // Fallback: if neither location works, try the AFL_PATH that was defined when afl-gcc was compiled
  // By default the Makefile defines AFL_PATH as "/usr/local/lib/afl"
  if (!access(AFL_PATH "/as", X_OK)) {
    as_path = AFL_PATH;
    return;
  }

  FATAL("Unable to find AFL wrapper binary for 'as'. Please set AFL_PATH");
 
}

We normally invoke afl-gcc as /work/afl/afl-gcc, so in the usual case the file found is /work/afl/afl-as.
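
The three-step search order can be condensed into a small sketch that is easy to exercise without touching the filesystem. This is not AFL's code: pick_as_dir, the exists callback (standing in for access(path, X_OK) == 0), and the demo stub only_work_afl are names invented here for illustration, and fallback plays the role of the compiled-in AFL_PATH.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Sketch of find_as()'s search order. Writes the chosen directory into out
   and returns 1, or returns 0 when all three probes fail. */
int pick_as_dir(const char *env_path, const char *argv0, const char *fallback,
                int (*exists)(const char *), char *out, size_t out_len) {
  char probe[512];

  /* 1. $AFL_PATH/as */
  if (env_path) {
    snprintf(probe, sizeof probe, "%s/as", env_path);
    if (exists(probe)) { snprintf(out, out_len, "%s", env_path); return 1; }
  }

  /* 2. <directory of argv[0]>/afl-as */
  const char *slash = strrchr(argv0, '/');
  if (slash) {
    int dir_len = (int)(slash - argv0);
    snprintf(probe, sizeof probe, "%.*s/afl-as", dir_len, argv0);
    if (exists(probe)) { snprintf(out, out_len, "%.*s", dir_len, argv0); return 1; }
  }

  /* 3. compiled-in default */
  snprintf(probe, sizeof probe, "%s/as", fallback);
  if (exists(probe)) { snprintf(out, out_len, "%s", fallback); return 1; }

  return 0;
}

/* Demo stub: pretend that only /work/afl/afl-as is installed. */
int only_work_afl(const char *path) {
  return strcmp(path, "/work/afl/afl-as") == 0;
}
```

With the stub above, invoking the wrapper as /work/afl/afl-gcc selects /work/afl, while a bare afl-gcc resolved through PATH (no slash in argv[0]) falls through to the compiled-in default and fails if nothing is installed there.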

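Next, read edit_params to see how the arguments for the downstream compiler are produced.

First, it inspects its own argv[0] to decide which downstream compiler to call. For example, if argv[0] is /work/afl/afl-clang++, the downstream compiler is clang++. Also, as mentioned above, AFL lets the user specify the downstream compiler: if AFL_CC or AFL_CXX is set, it overrides the default.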
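
As a condensed illustration of that dispatch, here is a sketch of the Linux (non-Apple) branch only. The function name pick_compiler is hypothetical; afl_cc and afl_cxx stand in for getenv("AFL_CC") and getenv("AFL_CXX"), and the afl-gcj case is left out for brevity.

```c
#include <string.h>

/* Map the name this wrapper was invoked as to the downstream compiler.
   afl_cc / afl_cxx may be NULL, meaning "environment variable not set". */
const char *pick_compiler(const char *argv0, const char *afl_cc,
                          const char *afl_cxx) {
  const char *name = strrchr(argv0, '/');
  name = name ? name + 1 : argv0;           /* strip the directory part */

  if (!strncmp(name, "afl-clang", 9)) {     /* afl-clang or afl-clang++ */
    if (!strcmp(name, "afl-clang++")) return afl_cxx ? afl_cxx : "clang++";
    return afl_cc ? afl_cc : "clang";
  }

  if (!strcmp(name, "afl-g++")) return afl_cxx ? afl_cxx : "g++";
  return afl_cc ? afl_cc : "gcc";           /* afl-gcc and anything else */
}
```

Because the decision depends only on the basename of argv[0], a single binary plus a handful of symlinks is enough to provide all the wrapper commands.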

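Next, it copies its own argv[], which is later handed to the downstream compiler essentially unchanged. This step is what allows afl-gcc to be dropped in wherever gcc was used. A few adjustments are made along the way:

The -integrated-as and -pipe switches are dropped.
Any existing -B option is removed and replaced with -B as_path. From the gcc documentation: This option specifies where to find the executables, libraries, include files, and data files of the compiler itself.
In clang mode, -no-integrated-as is added.
If AFL_HARDEN is set, -fstack-protector-all and -D_FORTIFY_SOURCE=2 are added (the latter only if the command line does not already mention FORTIFY_SOURCE).
If the incoming arguments already contain -fsanitize=address or -fsanitize=memory, the environment variable AFL_USE_ASAN is set to 1;
otherwise:
- if AFL_USE_ASAN is set, -U_FORTIFY_SOURCE and -fsanitize=address are added
- if AFL_USE_MSAN is set, -U_FORTIFY_SOURCE and -fsanitize=memory are added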
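
The filtering part of the copy loop can be sketched on its own. The helper below is illustrative only (filter_args is not a function in AFL); it covers just the three kinds of arguments that are dropped and ignores the sanitizer and FORTIFY_SOURCE bookkeeping.

```c
#include <string.h>

/* Copies argv[1..argc-1] into out, dropping "-B <dir>" / "-B<dir>",
   "-integrated-as" and "-pipe". Returns the number of arguments kept.
   out must have room for at least argc entries. */
int filter_args(int argc, char **argv, const char **out) {
  int n = 0;
  for (int i = 1; i < argc; i++) {
    const char *cur = argv[i];
    if (!strncmp(cur, "-B", 2)) {
      if (!cur[2] && i + 1 < argc) i++;   /* "-B dir": skip the dir as well */
      continue;
    }
    if (!strcmp(cur, "-integrated-as") || !strcmp(cur, "-pipe")) continue;
    out[n++] = cur;
  }
  return n;
}
```

For the command line afl-gcc -pipe -B /opt/x -O2 t.c, only -O2 and t.c survive; the wrapper then appends its own -B and the remaining flags described above.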

By default (that is, when AFL_DONT_OPTIMIZE is not set), the following optimization and marker flags are added:

-g -O3 -funroll-loops -D__AFL_COMPILER=1 -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1

If the AFL_NO_BUILTIN environment variable is set, these switches are added:

-fno-builtin-strcmp
-fno-builtin-strncmp
-fno-builtin-strcasecmp
-fno-builtin-strncasecmp
-fno-builtin-memcmp
-fno-builtin-strstr
-fno-builtin-strcasestr

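▲ For the purpose of these no-builtin switches, see the Stack Overflow discussion

That is the whole of edit_params. Note that the key step is adding the -B as_path flag, which makes the downstream compiler use afl-as in place of the native assembler during the assembly stage. The instrumentation itself is implemented by afl-as, whose source we read in the next section.

The source of edit_params, as read in this section, is as follows: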

/* Copy argv to cc_params, making the necessary edits. */

static void edit_params(u32 argc, char** argv) {
  u8 fortify_set = 0, asan_set = 0;
  u8 *name;

#if defined(__FreeBSD__) && defined(__x86_64__)
  u8 m32_set = 0;
#endif

  // Decide which downstream compiler to use
  cc_params = ck_alloc((argc + 128) * sizeof(u8*));

  name = strrchr(argv[0], '/');
  if (!name) name = argv[0]; else name++;

  if (!strncmp(name, "afl-clang", 9)) {

    clang_mode = 1;

    setenv(CLANG_ENV_VAR, "1", 1);

    if (!strcmp(name, "afl-clang++")) {
      u8* alt_cxx = getenv("AFL_CXX");
      cc_params[0] = alt_cxx ? alt_cxx : (u8*)"clang++";
    } else {
      u8* alt_cc = getenv("AFL_CC");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"clang";
    }

  } else {

    /* With GCJ and Eclipse installed, you can actually compile Java! The
       instrumentation will work (amazingly). Alas, unhandled exceptions do
       not call abort(), so afl-fuzz would need to be modified to equate
       non-zero exit codes with crash conditions when working with Java
       binaries. Meh. */

#ifdef __APPLE__

    if (!strcmp(name, "afl-g++")) cc_params[0] = getenv("AFL_CXX");
    else if (!strcmp(name, "afl-gcj")) cc_params[0] = getenv("AFL_GCJ");
    else cc_params[0] = getenv("AFL_CC");

    if (!cc_params[0]) {

      SAYF("\n" cLRD "[-] " cRST
           "On Apple systems, 'gcc' is usually just a wrapper for clang. Please use the\n"
           "    'afl-clang' utility instead of 'afl-gcc'. If you really have GCC installed,\n"
           "    set AFL_CC or AFL_CXX to specify the correct path to that compiler.\n");

      FATAL("AFL_CC or AFL_CXX required on MacOS X");

    }

#else

    if (!strcmp(name, "afl-g++")) {
      u8* alt_cxx = getenv("AFL_CXX");
      cc_params[0] = alt_cxx ? alt_cxx : (u8*)"g++";
    } else if (!strcmp(name, "afl-gcj")) {
      u8* alt_cc = getenv("AFL_GCJ");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"gcj";
    } else {
      u8* alt_cc = getenv("AFL_CC");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"gcc";
    }

#endif /* __APPLE__ */

  }


  // Copy our own argv
  while (--argc) {
    u8* cur = *(++argv);

    if (!strncmp(cur, "-B", 2)) {

      if (!be_quiet) WARNF("-B is already set, overriding");

      if (!cur[2] && argc > 1) { argc--; argv++; }
      continue;

    }

    if (!strcmp(cur, "-integrated-as")) continue;

    if (!strcmp(cur, "-pipe")) continue;

#if defined(__FreeBSD__) && defined(__x86_64__)
    if (!strcmp(cur, "-m32")) m32_set = 1;
#endif

    if (!strcmp(cur, "-fsanitize=address") ||
        !strcmp(cur, "-fsanitize=memory")) asan_set = 1;

    if (strstr(cur, "FORTIFY_SOURCE")) fortify_set = 1;

    cc_params[cc_par_cnt++] = cur;

  }

  cc_params[cc_par_cnt++] = "-B";
  cc_params[cc_par_cnt++] = as_path;

  if (clang_mode)
    cc_params[cc_par_cnt++] = "-no-integrated-as";

  if (getenv("AFL_HARDEN")) {

    cc_params[cc_par_cnt++] = "-fstack-protector-all";

    if (!fortify_set)
      cc_params[cc_par_cnt++] = "-D_FORTIFY_SOURCE=2";

  }

  // If ASan or MSan was already enabled in the incoming arguments, turn on $AFL_USE_ASAN as well
  if (asan_set) {

    /* Pass this on to afl-as to adjust map density. */

    setenv("AFL_USE_ASAN", "1", 1);

  } else if (getenv("AFL_USE_ASAN")) {

    if (getenv("AFL_USE_MSAN"))
      FATAL("ASAN and MSAN are mutually exclusive");

    if (getenv("AFL_HARDEN"))
      FATAL("ASAN and AFL_HARDEN are mutually exclusive");

    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=address";

  } else if (getenv("AFL_USE_MSAN")) {

    if (getenv("AFL_USE_ASAN"))
      FATAL("ASAN and MSAN are mutually exclusive");

    if (getenv("AFL_HARDEN"))
      FATAL("MSAN and AFL_HARDEN are mutually exclusive");

    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=memory";


  }

  if (!getenv("AFL_DONT_OPTIMIZE")) {

#if defined(__FreeBSD__) && defined(__x86_64__)

    /* On 64-bit FreeBSD systems, clang -g -m32 is broken, but -m32 itself
       works OK. This has nothing to do with us, but let's avoid triggering
       that bug. */

    if (!clang_mode || !m32_set)
      cc_params[cc_par_cnt++] = "-g";

#else

      cc_params[cc_par_cnt++] = "-g";

#endif

    cc_params[cc_par_cnt++] = "-O3";
    cc_params[cc_par_cnt++] = "-funroll-loops";

    /* Two indicators that you're building for fuzzing; one of them is
       AFL-specific, the other is shared with libfuzzer. */

    cc_params[cc_par_cnt++] = "-D__AFL_COMPILER=1";
    cc_params[cc_par_cnt++] = "-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1";

  }

  if (getenv("AFL_NO_BUILTIN")) {

    cc_params[cc_par_cnt++] = "-fno-builtin-strcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-memcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strstr";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasestr";

  }

  cc_params[cc_par_cnt] = NULL;

}

0x03 afl-as: the Static Instrumenter

afl-as is a wrapper around the native GNU as. Start with its main function:

/* Main entry point */

int main(int argc, char** argv) {

  s32 pid;
  u32 rand_seed;
  int status;
  u8* inst_ratio_str = getenv("AFL_INST_RATIO");

  struct timeval tv;
  struct timezone tz;

  clang_mode = !!getenv(CLANG_ENV_VAR);

  if (isatty(2) && !getenv("AFL_QUIET")) {

    SAYF(cCYA "afl-as " cBRI VERSION cRST " by <lcamtuf@google.com>\n");
 
  } else be_quiet = 1;

  if (argc < 2) {

    SAYF("\n"
         "This is a helper application for afl-fuzz. It is a wrapper around GNU 'as',\n"
         "executed by the toolchain whenever using afl-gcc or afl-clang. You probably\n"
         "don't want to run this program directly.\n\n"

         "Rarely, when dealing with extremely complex projects, it may be advisable to\n"
         "set AFL_INST_RATIO to a value less than 100 in order to reduce the odds of\n"
         "instrumenting every discovered branch.\n\n");

    exit(1);

  }

  gettimeofday(&tv, &tz);

  rand_seed = tv.tv_sec ^ tv.tv_usec ^ getpid();

  srandom(rand_seed);

  edit_params(argc, argv);

  if (inst_ratio_str) {

    if (sscanf(inst_ratio_str, "%u", &inst_ratio) != 1 || inst_ratio > 100) 
      FATAL("Bad value of AFL_INST_RATIO (must be between 0 and 100)");

  }

  if (getenv(AS_LOOP_ENV_VAR))
    FATAL("Endless loop when calling 'as' (remove '.' from your PATH)");

  setenv(AS_LOOP_ENV_VAR, "1", 1);

  /* When compiling with ASAN, we don't have a particularly elegant way to skip
     ASAN-specific branches. But we can probabilistically compensate for
     that... */

  if (getenv("AFL_USE_ASAN") || getenv("AFL_USE_MSAN")) {
    sanitizer = 1;
    inst_ratio /= 3;
  }

  if (!just_version) add_instrumentation();

  if (!(pid = fork())) {

    execvp(as_params[0], (char**)as_params);
    FATAL("Oops, failed to execute '%s' - check your PATH", as_params[0]);

  }

  if (pid < 0) PFATAL("fork() failed");

  if (waitpid(pid, &status, 0) <= 0) PFATAL("waitpid() failed");

  if (!getenv("AFL_KEEP_ASSEMBLY")) unlink(modified_file);

  exit(WEXITSTATUS(status));

}

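So the workflow is:

Seed the random number generator
Edit the arguments for as
Instrument the assembly instruction stream
Invoke as to produce the object file, then clean up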
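
One detail of main that the list above skips is how the instrumentation ratio is derived. The sketch below mirrors the checks visible in main; it is not AFL's code. The name effective_inst_ratio is invented, ratio_str stands in for getenv("AFL_INST_RATIO"), sanitizer stands for "AFL_USE_ASAN or AFL_USE_MSAN is set", and the starting value of 100 is assumed to be the default that afl-as.c uses.

```c
#include <stdio.h>

/* Returns the instrumentation ratio in percent, or -1 for an invalid setting. */
int effective_inst_ratio(const char *ratio_str, int sanitizer) {
  unsigned ratio = 100;                  /* assumed default: instrument everything */

  if (ratio_str && (sscanf(ratio_str, "%u", &ratio) != 1 || ratio > 100))
    return -1;                           /* afl-as calls FATAL() at this point */

  if (sanitizer) ratio /= 3;             /* compensate for sanitizer-added branches */
  return (int)ratio;
}
```

So a default build instruments every candidate location, while an ASan or MSan build ends up at roughly one third of the requested ratio, presumably applied as a per-location probability when the stubs are emitted.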

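First, see how afl-as reads and edits the arguments for as. The source contains a lot of macOS-specific material; since we only care about Linux x86, those passages are removed. The trimmed code is: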

/* Examine and modify parameters to pass to 'as'. Note that the file name
   is always the last parameter passed by GCC, so we exploit this property
   to keep the code simple. */

static void edit_params(int argc, char** argv) {

  u8 *tmp_dir = getenv("TMPDIR"), *afl_as = getenv("AFL_AS");
  u32 i;

  /* Although this is not documented, GCC also uses TEMP and TMP when TMPDIR
     is not set. We need to check these non-standard variables to properly
     handle the pass_thru logic later on. */

  if (!tmp_dir) tmp_dir = getenv("TEMP");
  if (!tmp_dir) tmp_dir = getenv("TMP");
  if (!tmp_dir) tmp_dir = "/tmp";

  as_params = ck_alloc((argc + 32) * sizeof(u8*));

  as_params[0] = afl_as ? afl_as : (u8*)"as";

  as_params[argc] = 0;

  for (i = 1; i < argc - 1; i++) {

    if (!strcmp(argv[i], "--64")) use_64bit = 1;
    else if (!strcmp(argv[i], "--32")) use_64bit = 0;

    as_params[as_par_cnt++] = argv[i];

  }

  input_file = argv[argc - 1];

  if (input_file[0] == '-') {

    if (!strcmp(input_file + 1, "-version")) {
      just_version = 1;
      modified_file = input_file;
      goto wrap_things_up;
    }

    if (input_file[1]) FATAL("Incorrect use (not called through afl-gcc?)");
    else input_file = NULL;

  } else {

    /* Check if this looks like a standard invocation as a part of an attempt
       to compile a program, rather than using gcc on an ad-hoc .s file in
       a format we may not understand. This works around an issue compiling
       NSS. */

    if (strncmp(input_file, tmp_dir, strlen(tmp_dir)) &&
        strncmp(input_file, "/var/tmp/", 9) &&
        strncmp(input_file, "/tmp/", 5)) pass_thru = 1;

  }

  modified_file = alloc_printf("%s/.afl-%u-%u.s", tmp_dir, getpid(),
                               (u32)time(NULL));

wrap_things_up:

  as_params[as_par_cnt++] = modified_file;
  as_params[as_par_cnt]   = NULL;

}

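This is quite similar to how afl-gcc edits its arguments:

First it determines the name of the as program; the default is GNU as, but the user can override it with AFL_AS
It sets the path of the temporary file modified_file to <tmp_dir>/.afl-<pid>-<timestamp>.s, where tmp_dir defaults to /tmp
It copies its own argv to as unchanged, except that the last argument (the input file) is replaced by modified_file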
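
Two small pieces of edit_params are worth isolating: the check that decides whether the input looks like a compiler-generated temporary file (when it does not, afl-as sets pass_thru, which, going by the source comment, is meant for ad-hoc .s files it may not understand), and the construction of the temporary file name. The helper names looks_like_tmp_file and modified_path are made up for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Mirrors the three prefix checks in edit_params(): returns 1 when the
   input file lives under tmp_dir, /var/tmp/ or /tmp/. */
int looks_like_tmp_file(const char *input_file, const char *tmp_dir) {
  return !strncmp(input_file, tmp_dir, strlen(tmp_dir)) ||
         !strncmp(input_file, "/var/tmp/", 9) ||
         !strncmp(input_file, "/tmp/", 5);
}

/* Builds the path of the instrumented copy: "<tmp_dir>/.afl-<pid>-<time>.s". */
void modified_path(char *out, size_t out_len, const char *tmp_dir,
                   unsigned pid, unsigned now) {
  snprintf(out, out_len, "%s/.afl-%u-%u.s", tmp_dir, pid, now);
}
```

In afl-as itself the pid and timestamp come from getpid() and time(NULL); they are parameters here only so that the result is deterministic.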

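So the overall logic of afl-as is: read the original assembly, produce a new instrumented assembly file (stored in the temporary directory), and call GNU as to turn the new assembly into machine code.

Next, we read the core of the instrumentation process.

add_instrumentation is a function of over 200 lines with fairly involved logic. It helps to actually run the as step once first and see how it instruments.

Write a simple piece of code: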

#include <stdio.h>

void work() {
    for(int i=1; i<=10; i++) {
        printf("Hello, world %d\n", i);
    }
}

int main(void) {
    work();
    return 0;
}

Compile and instrument:

AFL_DONT_OPTIMIZE=1 ../afl-gcc target.c -o target -O0 -fno-asynchronous-unwind-tables

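▲ -fno-asynchronous-unwind-tables is used to get rid of the .cfi directives. See Stack Overflow

We modified the afl-as source so that it keeps both the pre-instrumentation and post-instrumentation assembly on disk. The assembly before instrumentation is: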

	.file	"target.c"
	.text
	.section	.rodata
.LC0:
	.string	"Hello, world %d\n"
	.text
	.globl	work
	.type	work, @function
work:
	pushq	%rbp
	movq	%rsp, %rbp
	subq	$16, %rsp
	movl	$1, -4(%rbp)
	jmp	.L2
.L3:
	movl	-4(%rbp), %eax
	movl	%eax, %esi
	leaq	.LC0(%rip), %rax
	movq	%rax, %rdi
	movl	$0, %eax
	call	printf@PLT
	addl	$1, -4(%rbp)
.L2:
	cmpl	$10, -4(%rbp)
	jle	.L3
	nop
	nop
	leave
	ret
	.size	work, .-work
	.globl	main
	.type	main, @function
main:
	pushq	%rbp
	movq	%rsp, %rbp
	movl	$0, %eax
	call	work
	movl	$0, %eax
	popq	%rbp
	ret
	.size	main, .-main
	.ident	"GCC: (Debian 12.2.0-14) 12.2.0"
	.section	.note.GNU-stack,"",@progbits

The instrumented code is:

	.file	"target.c"
	.text
	.section	.rodata
.LC0:
	.string	"Hello, world %d\n"
	.text
	.globl	work
	.type	work, @function
work:

/* --- AFL TRAMPOLINE (64-BIT) --- */

.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x00000af3, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp

/* --- END --- */

	pushq	%rbp
	movq	%rsp, %rbp
	subq	$16, %rsp
	movl	$1, -4(%rbp)
	jmp	.L2
.L3:

/* --- AFL TRAMPOLINE (64-BIT) --- */

.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x00003d9b, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp

/* --- END --- */

	movl	-4(%rbp), %eax
	movl	%eax, %esi
	leaq	.LC0(%rip), %rax
	movq	%rax, %rdi
	movl	$0, %eax
	call	printf@PLT
	addl	$1, -4(%rbp)
.L2:

/* --- AFL TRAMPOLINE (64-BIT) --- */

.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x0000d5a5, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp

/* --- END --- */

	cmpl	$10, -4(%rbp)
	jle	.L3

/* --- AFL TRAMPOLINE (64-BIT) --- */

.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x000078e0, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp

/* --- END --- */

	nop
	nop
	leave
	ret
	.size	work, .-work
	.globl	main
	.type	main, @function
main:

/* --- AFL TRAMPOLINE (64-BIT) --- */

.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x0000c35b, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp

/* --- END --- */

	pushq	%rbp
	movq	%rsp, %rbp
	movl	$0, %eax
	call	work
	movl	$0, %eax
	popq	%rbp
	ret
	.size	main, .-main
	.ident	"GCC: (Debian 12.2.0-14) 12.2.0"
	.section	.note.GNU-stack,"",@progbits

/* --- AFL MAIN PAYLOAD (64-BIT) --- */
/* more than 300 lines omitted here */

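As we can see, afl-as inserted a piece of code at the entry of every basic block. In addition, it appended an AFL main payload of more than 300 lines at the very end. Setting the main payload aside for now, we first analyze the code inserted at the start of each branch, which looks like this: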

/* --- AFL TRAMPOLINE (64-BIT) --- */
.align 4

leaq -(128+24)(%rsp), %rsp
movq %rdx,  0(%rsp)
movq %rcx,  8(%rsp)
movq %rax, 16(%rsp)
movq $0x000078e0, %rcx
call __afl_maybe_log
movq 16(%rsp), %rax
movq  8(%rsp), %rcx
movq  0(%rsp), %rdx
leaq (128+24)(%rsp), %rsp
/* --- END --- */

According to the AFL whitepaper, this code essentially implements the following logic:

cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++; 
prev_location = cur_location >> 1;

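One point deserves attention: why is cur_location shifted right by one bit before being assigned to prev_location? Consider an example. Let the random values of two entry points be $A$ and $B$. Without the shift, because XOR is commutative, both $A\to B$ and $B \to A$ would perform shared_mem[A ^ B]++, and the direction would be lost. With the shift, $A\to B$ actually performs shared_mem[(A>>1)^B]++ while $B\to A$ performs shared_mem[(B>>1)^A]++, which neatly tells the two directions apart.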
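
A quick numeric check makes the effect of the shift visible. The two helpers below are illustrative names, not AFL functions, and MAP_SIZE is assumed to be AFL's default 64 KiB map. The sample IDs 0x78e0 and 0x3d9b are taken from the instrumented listing earlier in this article.

```c
#include <stdint.h>

#define MAP_SIZE 65536u   /* assumed: AFL's default 64 KiB map */

/* Map index touched when control moves from block `from` to block `to`,
   without the shift (naive) and with it (as in the whitepaper pseudocode). */
uint32_t edge_naive(uint32_t from, uint32_t to) {
  return (from ^ to) % MAP_SIZE;
}

uint32_t edge_shifted(uint32_t from, uint32_t to) {
  return ((from >> 1) ^ to) % MAP_SIZE;
}
```

With A = 0x78e0 and B = 0x3d9b, the naive scheme maps both directions to the same slot, whereas the shifted scheme gives 0x01eb for A to B and 0x662d for B to A. The shift also keeps tight self-loops apart: without it, A to A and B to B would both land on index 0.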

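Now let us analyze the assembly and see how it implements the pseudocode above.

Lower rsp by 128+24 bytes (128 bytes to step over the red zone that the x86-64 System V ABI reserves below rsp, plus 24 bytes for three saved registers)
Save the values of rdx, rcx, and rax on the stack
Load rcx with an immediate value (generated randomly by afl-as)
Call __afl_maybe_log
Restore rdx, rcx, rax, and rsp

This explains where cur_location comes from: it is randomly generated and now sits in the rcx register. The call to __afl_maybe_log that follows presumably has to do two things: increment the shared_mem entry and save prev_location.

Step into __afl_maybe_log: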

__afl_maybe_log:

  lahf
  seto  %al

  /* Check if SHM region is already mapped. */

  movq  __afl_area_ptr(%rip), %rdx
  testq %rdx, %rdx
  je    __afl_setup

__afl_store:

  /* Calculate and store hit for the code location specified in rcx. */

  xorq __afl_prev_loc(%rip), %rcx
  xorq %rcx, __afl_prev_loc(%rip)
  shrq $1, __afl_prev_loc(%rip)

  incb (%rdx, %rcx, 1)

__afl_return:

  addb $127, %al
  sahf
  ret

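First, the two lines lahf and seto %al. They save the eflags register: lahf copies its low 8 bits into ah, and seto stores the OF bit in al. When the stub exits, addb $127, %al and sahf restore the state: al is 0 or 1, so adding 127 overflows the signed 8-bit range exactly when the saved OF was 1, which sets OF again, and sahf then reloads the low 8 flag bits from ah. This makes the stub transparent to the original program.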
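
The addb $127 trick is easy to verify in C by modelling the signed 8-bit addition. The function name of_after_addb_127 is invented for this sketch; it returns the OF value that the addb instruction would produce for a given saved al.

```c
#include <stdint.h>

/* Model of "seto %al ... addb $127, %al": al holds the saved OF bit (0 or 1).
   OF is set when the signed 8-bit result does not fit into [-128, 127]. */
int of_after_addb_127(uint8_t al) {
  int sum = (int8_t)al + 127;          /* signed interpretation of the 8-bit add */
  return sum > INT8_MAX || sum < INT8_MIN;
}
```

For al = 0 the sum is 127 and fits, so OF stays clear; for al = 1 the sum is 128, which does not fit into a signed byte, so OF is set. The other flags that addb disturbs are then overwritten by sahf, which does not touch OF.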

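__afl_maybe_log first checks whether the shared memory region is already mapped. If not, it jumps to __afl_setup to initialize it; otherwise execution continues into __afl_store with rdx pointing to the shared memory block.

__afl_store proceeds as follows:

XOR prev_loc into rcx, which currently holds cur_loc, so rcx becomes cur_loc ^ prev_loc
Set prev_loc to cur_loc (using the fact that XOR is its own inverse: prev_loc ^ (cur_loc ^ prev_loc) = cur_loc)
Shift prev_loc right by one bit
Increment the hit count at shared_mem[rcx]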
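
The four steps translate almost line for line into C. This is a model for reasoning about the assembly, not code from AFL: shared_mem and prev_loc stand in for the SHM region and __afl_prev_loc, MAP_SIZE is assumed to be the default 64 KiB, and block IDs are assumed to be smaller than MAP_SIZE.

```c
#include <stdint.h>

#define MAP_SIZE 65536u               /* assumed: AFL's default map size */

static uint8_t  shared_mem[MAP_SIZE]; /* stands in for the SHM region (rdx) */
static uint64_t prev_loc;             /* stands in for __afl_prev_loc */

/* C model of __afl_store; cur plays the role of rcx on entry. */
void afl_store(uint64_t cur) {
  cur      ^= prev_loc;   /* xorq __afl_prev_loc, %rcx : rcx = cur ^ prev     */
  prev_loc ^= cur;        /* xorq %rcx, __afl_prev_loc : prev = original cur  */
  prev_loc >>= 1;         /* shrq $1, __afl_prev_loc                          */
  shared_mem[cur]++;      /* incb (%rdx, %rcx, 1)                             */
}
```

Calling afl_store(0x78e0) and then afl_store(0x3d9b) on a fresh map bumps shared_mem[0x78e0] (the first block, with prev_loc still 0) and shared_mem[0x01eb] (the edge between the two blocks), and leaves prev_loc at 0x1ecd.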

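Once these steps are done, eflags is restored and the function returns. We now understand the stub inserted at the start of each basic block. The AFL MAIN PAYLOAD, those several hundred lines of assembly, has to do with the fork server and is left for the next article.

With the behavior of afl-as understood, look back at add_instrumentation. It turns out to be a parser: it scans the assembly one line at a time, writes each line out unchanged, and inserts the stub wherever instrumentation is needed. The stubs themselves are defined in afl-as.h in 32-bit and 64-bit versions; the 64-bit one was explained in detail above.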
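
To give a feel for the shape of such a line-oriented pass, here is a heavily simplified sketch. It is not the real add_instrumentation: it only tracks .text versus .section, treats "name:" lines and conditional jumps as instrumentation sites, works on an in-memory string, and writes a one-line placeholder where AFL would print the trampoline. The function name instrument and all of these simplifications are assumptions made for illustration; the real function handles many more cases.

```c
#include <stdio.h>
#include <string.h>

/* Copies src to dst line by line and adds a placeholder after each site.
   Returns the number of sites, or -1 if dst is too small. */
int instrument(const char *src, char *dst, size_t dst_len) {
  int in_text = 0, sites = 0;
  size_t used = 0;
  if (dst_len) dst[0] = '\0';

  while (*src) {
    size_t len = strcspn(src, "\n");            /* line length without newline */
    int n = snprintf(dst + used, dst_len - used, "%.*s\n", (int)len, src);
    if (n < 0 || (size_t)n >= dst_len - used) return -1;
    used += (size_t)n;                          /* line copied verbatim */

    const char *line = src;
    src += len + (src[len] == '\n');            /* advance to the next line */

    if (!strncmp(line, "\t.text", 6))    { in_text = 1; continue; }
    if (!strncmp(line, "\t.section", 9)) { in_text = 0; continue; }
    if (!in_text || len == 0) continue;         /* only instrument code */

    int is_label  = line[0] != '\t' && line[len - 1] == ':';
    int is_branch = line[0] == '\t' && line[1] == 'j' &&
                    strncmp(line + 1, "jmp", 3) != 0;   /* jcc but not jmp */

    if (is_label || is_branch) {
      n = snprintf(dst + used, dst_len - used,
                   "/* trampoline #%d goes here */\n", ++sites);
      if (n < 0 || (size_t)n >= dst_len - used) return -1;
      used += (size_t)n;
    }
  }
  return sites;
}
```

Fed a skeleton of the target.c assembly shown earlier, this sketch marks five sites (work, .L3, .L2, the fall-through after jle, and main) and skips .LC0 because it sits in .rodata, which matches the five trampolines in the instrumented listing.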

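After the scan finishes, the main payload is written at the end of the file. It too comes in 32-bit and 64-bit versions.

That is how instrumentation works. As for the implementation details of add_instrumentation, like the lexers we are all familiar with, it is fiddly: most of the code is a case analysis over various tokens (for Linux, macOS, and OpenBSD), and there is no need to study it closely.

In this article we read the source of afl-gcc and afl-as and got a first understanding of how AFL compiles and instruments. Only the stub inserted at basic-block entries was analyzed; the main payload appended to the end of the assembly file is left for the next article.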


Ruan Xingzhi
