Fuzzing GPicView in Practice

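  GPicView is the default image viewer of the LXDE desktop environment, though its code base is fairly old. The upstream source lives on SourceForge; in what follows we use the version maintained by the Debian community.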

Debian LXDE packaging team / gpicview · GitLab (Debian Salsa)

Finding a fuzz target

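  GPicView is a GUI program. If every execution had to start the GUI, fuzzing throughput would be unacceptable. We therefore need to find logic that has little to do with the GUI and use it as the fuzz target. In the best case the GUI never starts at all; the next best case is that the GUI must start, but persistent mode lets us start it only once per several thousand executions.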

  The project has very few source files, and the most complex one is exif.c.

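  The code there is quite tangled and hardcodes all sorts of magic numbers. Intuitively, it very likely contains cases the developers did not anticipate, which would lead to bugs. Tracing the callers of the ProcessExifDir function takes a few minutes and yields the following call chain: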

  • When the user presses the "save" button, the on_save handler is called
  • on_save calls ExifRotate(file_name, mw->rotation_angle)
  • ExifRotate calls ReadJpegFile( fname, READ_ALL)
  • ReadJpegFile calls ReadJpegSections(infile, ReadMode)
  • ReadJpegSections calls process_EXIF(Data, itemlen)
  • process_EXIF calls ProcessExifDir(ExifSection+8+FirstOffset, ExifSection+8, length-8, 0)
  • ProcessExifDir runs, and we have reached the target

  We therefore focus on the on_save handler, which is triggered when the user presses the save button. Its code is as follows:

void on_save( GtkWidget* btn, MainWin* mw )
{
    cancel_slideshow(mw);
    if( ! mw->pix )
        return;

    char* file_name = g_build_filename( image_list_get_dir( mw->img_list ),
                                        image_list_get_current( mw->img_list ), NULL );
    GdkPixbufFormat* info;
    info = gdk_pixbuf_get_file_info( file_name, NULL, NULL );
    char* type = gdk_pixbuf_format_get_name( info );

    /* Confirm save if requested. */
    if ((pref.ask_before_save) && ( ! save_confirm(mw, file_name)))
        return;

    if(strcmp(type,"jpeg")==0)
    {
        if(!pref.rotate_exif_only || ExifRotate(file_name, mw->rotation_angle) == FALSE)
        {
            // hialan notes:
            // ExifRotate retrun FALSE when
            //   1. Can not read file
            //   2. Exif do not have TAG_ORIENTATION tag
            //   3. Format unknown
            // And then we apply rotate_and_save_jpeg_lossless() ,
            // the result would not effected by EXIF Orientation...
#ifdef HAVE_LIBJPEG
            int status = rotate_and_save_jpeg_lossless(file_name,mw->rotation_angle);
	    if(status != 0)
            {
                main_win_show_error( mw, g_strerror(status) );
            }
#else
            main_win_save( mw, file_name, type, pref.ask_before_save );
#endif
        }
    } else
        main_win_save( mw, file_name, type, pref.ask_before_save );
    mw->rotation_angle = 0;
    g_free( file_name );
    g_free( type );
}

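  In other words, it first checks whether the current image is in jpeg format, and then calls ExifRotate inside that if condition (ExifRotate in turn calls ReadJpegFile). Note that the call can be short-circuited by the preceding condition !pref.rotate_exif_only, so to reach ExifRotate, rotate_exif_only must be set to true.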
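
  The short-circuit behaviour can be illustrated with a small Python sketch. The names below (exif_rotate, on_save_condition) are hypothetical stand-ins for the C code and are not part of GPicView; Python's `or` short-circuits the same way C's `||` does.

```python
calls = []

def exif_rotate():
    """Hypothetical stand-in for ExifRotate(); records that it was reached."""
    calls.append("ExifRotate")
    return True  # pretend the EXIF orientation tag was updated

def on_save_condition(rotate_exif_only):
    """Mirror of `!pref.rotate_exif_only || ExifRotate(...) == FALSE`."""
    return (not rotate_exif_only) or (exif_rotate() is False)

on_save_condition(rotate_exif_only=False)
reached_when_false = list(calls)   # left operand is already true, right side skipped

on_save_condition(rotate_exif_only=True)
reached_when_true = list(calls)    # left operand is false, so ExifRotate is evaluated

print(reached_when_false, reached_when_true)
```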

  Fortunately, a quick read of the code shows that the logic of ExifRotate is independent of the GUI. Its code is as follows:

int ExifRotate(const char * fname, int new_angle)
{
    int fail = FALSE;
    int exif_angle = 0;
    int a;
    
    if(new_angle == 0)
      return TRUE;
    
    // use jhead functions
    ResetJpgfile();

    // Start with an empty image information structure.
    memset(&ImageInfo, 0, sizeof(ImageInfo));

    if (!ReadJpegFile( fname, READ_ALL)) return FALSE;
    
    if (NumOrientations != 0)
    {
        if(new_angle == 0) 		new_angle = 1;
        else if(new_angle == 90)	new_angle = 6;
        else if(new_angle == 180)	new_angle = 3;
        else if(new_angle == 270)	new_angle = 8;
        else if(new_angle == -45)	new_angle = 7;
        else if(new_angle == -90)	new_angle = 2;
        else if(new_angle == -135)	new_angle = 5;
        else if(new_angle == -180)	new_angle = 4;
        
        exif_angle = ExifRotateFlipMapping[ImageInfo.Orientation][new_angle];

        for (a=0;a<NumOrientations;a++){
            switch(OrientationNumFormat[a]){
                case FMT_SBYTE:
                case FMT_BYTE:      
                    *(uchar *)(OrientationPtr[a]) = (uchar) exif_angle;
                    break;

                case FMT_USHORT:    
                    Put16u(OrientationPtr[a], exif_angle);
                    break;

                case FMT_ULONG:     
                case FMT_SLONG:     
                    memset(OrientationPtr, 0, 4);
                    // Can't be bothered to write  generic Put32 if I only use it once.
                    if (MotorolaOrder){
                        ((uchar *)OrientationPtr[a])[3] = exif_angle;
                    }else{
                        ((uchar *)OrientationPtr[a])[0] = exif_angle;
                    }
                    break;

                default:
                    fail = TRUE;
                    break;
            }
        }
    }
    
    if(fail == FALSE)
    {
        WriteJpegFile(fname);
    }
    
    // free jhead structure
    DiscardData();
    
    return (NumOrientations != 0) ? TRUE : FALSE;
}

  We can therefore write a harness that tests ReadJpegFile directly, without opening the GUI.

Writing the harness

  Reading ExifRotate, we can see that it does the following:

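  1. Calls ResetJpgfile() for initialization (zeroing variables, allocating memory, and so on)
  2. Calls ReadJpegFile(), the main target we want to fuzz
  3. Does some rotation-related bookkeeping
  4. Calls WriteJpegFile() to overwrite the image file
  5. Calls DiscardData() to free memory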

  To build a harness on top of this, we need to remove the "overwrite the image" logic above. The harness code is as follows:

int harness(const char * fname, int new_angle)
{
    int fail = FALSE;
    int exif_angle = 0;
    int a;
    
    if(new_angle == 0)
      return TRUE;
    
    // use jhead functions
    ResetJpgfile();

    // Start with an empty image information structure.
    memset(&ImageInfo, 0, sizeof(ImageInfo));

    if (!ReadJpegFile( fname, READ_ALL)) return FALSE;
    
    if (NumOrientations != 0)
    {
        if(new_angle == 0) 		new_angle = 1;
        else if(new_angle == 90)	new_angle = 6;
        else if(new_angle == 180)	new_angle = 3;
        else if(new_angle == 270)	new_angle = 8;
        else if(new_angle == -45)	new_angle = 7;
        else if(new_angle == -90)	new_angle = 2;
        else if(new_angle == -135)	new_angle = 5;
        else if(new_angle == -180)	new_angle = 4;
        
        exif_angle = ExifRotateFlipMapping[ImageInfo.Orientation][new_angle];

        for (a=0;a<NumOrientations;a++){
            switch(OrientationNumFormat[a]){
                case FMT_SBYTE:
                case FMT_BYTE:      
                    *(uchar *)(OrientationPtr[a]) = (uchar) exif_angle;
                    break;

                case FMT_USHORT:    
                    Put16u(OrientationPtr[a], exif_angle);
                    break;

                case FMT_ULONG:     
                case FMT_SLONG:     
                    memset(OrientationPtr, 0, 4);
                    // Can't be bothered to write  generic Put32 if I only use it once.
                    if (MotorolaOrder){
                        ((uchar *)OrientationPtr[a])[3] = exif_angle;
                    }else{
                        ((uchar *)OrientationPtr[a])[0] = exif_angle;
                    }
                    break;

                default:
                    fail = TRUE;
                    break;
            }
        }
    }
    
    // if(fail == FALSE)
    // {
    //     WriteJpegFile(fname);
    // }
    
    // free jhead structure
    DiscardData();
    
    return (NumOrientations != 0) ? TRUE : FALSE;
}

  In main, instead of calling gtk_main() to start the GUI, we call the harness. We use AFL++'s persistent mode, written as follows:

extern int harness(const char * fname, int new_angle);

int main(int argc, char *argv[])
{
    #ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
    #endif

    while (__AFL_LOOP(10000)) 
    {
        harness(argv[1], 90);
    }

    return 0;
}

  With this, we can fuzz efficiently.

💡
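Persistent mode can greatly improve execution throughput. The code above first uses a deferred fork server (of little value for this program, but it helps when a program does work such as parsing argv before reaching the target). In addition, the process forks only once per 10000 executions, which is a large speedup. The code executed inside the loop should, however, be kept as free of side effects as possible.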

Fuzzing

  We fuzz on a machine with two E5-2650 v4 CPUs. The server has 48 logical cores, so we write a script to fuzz in parallel:

import os
import time
import random

os.system('tmux kill-server')
time.sleep(.5)
os.system('mkdir -p /dev/shm/afl-current-work')

cmd = '/work/aflpp/afl-fuzz -i corpus -o /dev/shm/afl-current-work %s -- ../gpicview @@'

import libtmux
server = libtmux.Server()
server.new_session(session_name="fuzzer-master", attach=False, window_command=(cmd % '-M master'))

for x in range(40):
    opt = f'-S slave{x+1} '

    if random.randint(1, 10) <= 1:
        opt += '-Z '
    if random.randint(1, 10) <= 4:
        opt += '-p explore '
    elif random.randint(1, 10) < 4:
        opt += '-p exploit '
    
    
    server.new_session(session_name=f"fuzzer-slave-{x} " + opt, attach=False, window_command=(cmd % opt))


server.new_session(session_name=f"fuzzer-slave-asan", attach=False, window_command=(cmd % '-S slave-asan').replace('gpicview', 'gpicview.asan'))

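  The script above creates the working directory under /dev/shm to reduce writes to the SSD and spare the disk. It starts 1 master, 40 ordinary slaves (with some randomly chosen options), and one more slave that fuzzes the ASan build. According to the documentation, a single ASan fuzzer instance is enough, which avoids wasting time. The fuzzer processes are managed by tmux.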
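
  As a rough check of what the random option selection in the script produces, the sketch below computes the per-instance probabilities exactly. It relies only on the three randint thresholds shown in the script; the variable names are illustrative.

```python
from fractions import Fraction

# random.randint(1, 10) is uniform over the ten values 1..10.
p_z = Fraction(1, 10)          # `<= 1` adds -Z
p_explore = Fraction(4, 10)    # `<= 4` adds -p explore
# The elif draws a fresh number and only runs when explore was not chosen;
# `< 4` accepts the values 1..3.
p_exploit = (1 - p_explore) * Fraction(3, 10)
p_default = 1 - p_explore - p_exploit   # neither -p option is added

instances = 40
expected = {
    "-Z": p_z * instances,
    "-p explore": p_explore * instances,
    "-p exploit": p_exploit * instances,
    "no -p option": p_default * instances,
}
for name, count in expected.items():
    print(f"{name}: {float(count):.1f} of {instances} instances on average")
```

  So on average fewer than half of the 40 secondary instances keep the default schedule, and the exact mix changes on every launch because nothing seeds the random generator.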

💡
/tmp is not necessarily a tmpfs, whereas on Linux /dev/shm normally is.

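  Within seconds of starting, the fuzzer reported crashes. A few minutes later the crash count had grown to 21:

  After one night:

  A quick look at the ASan results shows out-of-bounds memory reads, heap overflows, and so on.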

Analyzing the results

  First, we reproduce the bug in the GPicView binary built by the Debian community. The steps to trigger it are: open the image, rotate it, save.

  So the test cases obtained through our fuzz harness do trigger in the real program. Next, we analyze all the crashing test cases:

import os
import shutil

workdir = 'bak-work-20231209'
os.makedirs('crashes', exist_ok=True)  # destination directory must exist before shutil.copy

for afl_sync_id in os.listdir(workdir):
    crash_dir = os.path.join(workdir, afl_sync_id, 'crashes')
    
    for crash_file in os.listdir(crash_dir):
        if not crash_file.startswith('id:'):
            continue
            
        new_filename = '{}-{}'.format(afl_sync_id, crash_file.split(',')[0].split(':')[-1])
        shutil.copy(os.path.join(crash_dir, crash_file), os.path.join('crashes', new_filename))
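
  To make the renaming in the collection script concrete, here is the same expression applied to a single file name. The sample name follows the usual id:...,sig:...,src:... pattern of AFL++ crash files, but its specific values are made up for illustration.

```python
def short_name(afl_sync_id, crash_file):
    """Build '<instance>-<crash id>' with the same expression as the script above."""
    crash_id = crash_file.split(',')[0].split(':')[-1]
    return '{}-{}'.format(afl_sync_id, crash_id)

# Hypothetical crash file name, for illustration only.
example = 'id:000015,sig:11,src:000123,time:4567,execs:89012,op:havoc,rep:4'
print(short_name('master', example))
```

  Prefixing the instance name matters because every fuzzer instance numbers its own crashes from 000000, so the bare ids collide across instances.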

  The 42 fuzzer instances produced 1618 crashes in total. Going through them one by one is clearly infeasible, so we use ASan to look at the call stacks.

import os
from tqdm import tqdm

os.makedirs('report', exist_ok=True)  # the shell redirect below fails if the directory is missing

for filename in tqdm(os.listdir('crashes')):
    os.system(f'./asan_gpicview crashes/{filename} 2> report/{filename}')

  Analyzing the ASan reports:

import os
import re
from tqdm import tqdm

import sqlite3

db = sqlite3.connect(':memory:')

db.execute('CREATE TABLE info (id INTEGER PRIMARY KEY, filename TEXT, content TEXT, summary TEXT)')
db.commit()

for f in tqdm(os.listdir('report')):
    content = open(os.path.join('report', f)).read()

    info = re.findall(r'SUMMARY:([\S\s]*?)\n', content)
    assert len(info) in [0, 1]

    if len(info) == 0:
        assert len(content) == 0
        info.append('')

    db.execute('INSERT INTO info (filename, content, summary) VALUES (?, ?, ?)', [f, content, info[0]])

db.commit()

print(db.execute('SELECT COUNT(*) AS cnt, summary, MIN(filename) FROM info GROUP BY summary ORDER BY cnt DESC').fetchall())

  The results are as follows:

count  crash file         ASan summary
912    master-000000      AddressSanitizer: SEGV ./exif.c in Get16u
258    master-000015      AddressSanitizer: heap-buffer-overflow ./exif.c:1702:57 in harness
195    master-000012      AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0x1583e1) (BuildId: 0f2b39b572b576eb49c92fd0cc6dfa2bc9904500)
107    master-000003      exited normally
75     master-000006      AddressSanitizer: SEGV ./exif.c:320:18 in Get32s
39     master-000014      AddressSanitizer: SEGV ./exif.c:323:18 in Get32s
5      slave-asan-000009  AddressSanitizer: SEGV ./exif.c:425:37 in ConvertAnyFormat
4      slave-asan-000020  AddressSanitizer: heap-buffer-overflow ./jpgfile.c:28:13 in Get16m
4      master-000021      AddressSanitizer: SEGV ./exif.c:424:45 in ConvertAnyFormat
3      slave-asan-000017  AddressSanitizer: heap-buffer-overflow ./jpgfile.c:77:22 in process_SOFn
3      slave-asan-000023  AddressSanitizer: heap-buffer-overflow ./jpgfile.c:51:27 in process_COM
3      slave-asan-000036  AddressSanitizer: heap-buffer-overflow ./jpgfile.c:28:41 in Get16m
3      slave23-000027     AddressSanitizer: SEGV ./exif.c:401:37 in ConvertAnyFormat
2      slave-asan-000004  AddressSanitizer: heap-buffer-overflow ./exif.c in Get16u
2      slave28-000007     AddressSanitizer: SEGV ./exif.c:631:39 in ProcessExifDir
1      slave-asan-000061  AddressSanitizer: heap-buffer-overflow crtstuff.c in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
1      slave-asan-000039  AddressSanitizer: heap-buffer-overflow ./jpgfile.c:80:22 in process_SOFn
1      slave32-000023     AddressSanitizer: SEGV ./exif.c:400:37 in ConvertAnyFormat

  107 of the crashes cannot be reproduced, which indicates that our harness is not good enough: some code inside the __AFL_LOOP(10000) loop has side effects.
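
  How leftover state produces crashes that do not reproduce can be shown with a toy model. Everything in the sketch below (ToyParser, the limit of 4, the inputs) is hypothetical and purely illustrative; this article does not identify which state in the real harness leaks between iterations.

```python
class ToyParser:
    """Toy stand-in for a parser that keeps state between calls."""

    def __init__(self):
        self.sections = []          # state that should be cleared per input

    def parse(self, data, reset_first):
        if reset_first:
            self.sections = []      # what a side-effect-free harness would do
        self.sections.extend(data)
        if len(self.sections) > 4:  # "crash": state grew past a fixed limit
            raise OverflowError("section table overflow")

def run_persistent(inputs, reset_first):
    """Feed all inputs to one parser, as a persistent loop does within one process."""
    parser = ToyParser()
    crashed = []
    for index, data in enumerate(inputs):
        try:
            parser.parse(data, reset_first)
        except OverflowError:
            crashed.append(index)
            parser = ToyParser()    # the fuzzer would start a fresh process here
    return crashed

inputs = [b"abc", b"de", b"f"]
leaky = run_persistent(inputs, reset_first=False)   # second input gets blamed
clean = run_persistent(inputs, reset_first=True)    # no input crashes on its own
print(leaky, clean)
```

  In the leaky run the fuzzer would save the second input as a crash, yet replaying that input alone in a fresh process exits normally, which matches the symptom in the table.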

  Now we switch on AFL's crash exploration mode and run it for a few dozen minutes to obtain more crashes:

  The new crashes are summarized below:

count  crash file         memory write?  ASan summary
2529   0                  0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c in Get16u
380    100                0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:320:18 in Get32s
307    1011               1              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/exif.c:1702:57 in harness
257    1031               0              AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0x1583e1) (BuildId: 0f2b39b572b576eb49c92fd0cc6dfa2bc9904500)
129    1014               0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:631:39 in ProcessExifDir
125    1019               0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:323:18 in Get32s
122    1000               0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/exif.c in Get16u
107    master-000003      0              exited normally
49     1033               0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/jpgfile.c:28:13 in Get16m
46     1012               0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/jpgfile.c:51:27 in process_COM
44     1020               0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/jpgfile.c:80:22 in process_SOFn
8      1226               0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:425:37 in ConvertAnyFormat
5      1933               0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:424:45 in ConvertAnyFormat
4      2257               0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/jpgfile.c:28:41 in Get16m
3      slave-asan-000017  0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/jpgfile.c:77:22 in process_SOFn
3      slave23-000027     0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:401:37 in ConvertAnyFormat
2      426                0              AddressSanitizer: SEGV /home/blue/Desktop/work/programs/gpicview/src/exif.c:400:37 in ConvertAnyFormat
1      slave-asan-000061  0              AddressSanitizer: heap-buffer-overflow crtstuff.c in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
1      498                0              AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/exif.c:323:18 in Get32s
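
  The two tables print source paths differently (./exif.c in the first, absolute paths in the second), so grouping by the raw SUMMARY string treats the same crash site as two different ones across runs. A minimal sketch of one way to normalize a summary before grouping follows; the regular expression and the normalize helper are our own assumptions, not part of ASan or of the scripts above.

```python
import os
import re

SUMMARY_RE = re.compile(
    r'AddressSanitizer: (?P<kind>\S+) '
    r'(?P<path>\S+?)(?::(?P<line>\d+):(?P<col>\d+))? in (?P<func>.+)$'
)

def normalize(summary):
    """Return (kind, file[:line], function), or None when there is no 'in <function>' part."""
    match = SUMMARY_RE.search(summary.strip())
    if match is None:
        return None
    location = os.path.basename(match.group('path'))  # drop the directory part
    if match.group('line'):
        location += ':' + match.group('line')
    return (match.group('kind'), location, match.group('func'))

relative = 'AddressSanitizer: SEGV ./exif.c:320:18 in Get32s'
absolute = ('AddressSanitizer: SEGV '
            '/home/blue/Desktop/work/programs/gpicview/src/exif.c:320:18 in Get32s')
print(normalize(relative))
print(normalize(relative) == normalize(absolute))
```

  Summaries that only name a module offset, such as the libc.so.6 rows, return None here and would still need the full stack trace to be classified.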

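  Most of the bugs above stem from unvalidated offset fields in the EXIF data, which give an arbitrary address read and hence a crash. The source also contains some operations that write to a base+offset address, however, and that kind of bug is more dangerous than an arbitrary read.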

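  The "write to a user-controlled address" logic I spotted by reading the source is this: after the image is rotated, the new orientation has to be saved into the EXIF data. That value is written directly to the address OrientationPtr[0] points to, and that address is computed as base + offset from the original image's EXIF data. Since offset is attacker-controlled, this is an arbitrary address write.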
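
  The missing check amounts to validating base + offset against the bounds of the EXIF segment before using it. A minimal Python sketch of that idea follows; read_u16_checked and the sample bytes are hypothetical, and this is not a patch for GPicView.

```python
import struct

def read_u16_checked(segment, offset, big_endian):
    """Read a 16-bit value at `offset`, refusing offsets outside the segment."""
    if offset < 0 or offset + 2 > len(segment):
        raise ValueError(
            f"offset {offset} outside segment of {len(segment)} bytes")
    fmt = ">H" if big_endian else "<H"
    return struct.unpack_from(fmt, segment, offset)[0]

segment = bytes([0x00, 0x06, 0xAA, 0xBB])   # made-up 4-byte segment
print(read_u16_checked(segment, 0, big_endian=True))      # in bounds
try:
    read_u16_checked(segment, 0x10000, big_endian=True)   # attacker-style offset
except ValueError as exc:
    print("rejected:", exc)
```

  The same bound applies to writes: a pointer recorded for later modification is only safe if offset plus the field size stays inside the segment that was actually read.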

  Consider the report for crash file 1011 above:

=================================================================
==47591==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x624000000000 at pc 0x56477cdb013b bp 0x7ffea01cab00 sp 0x7ffea01caaf8
WRITE of size 1 at 0x624000000000 thread T0
    #0 0x56477cdb013a in harness /home/blue/Desktop/work/programs/gpicview/src/exif.c:1702:57
    #1 0x56477cdb013a in main /home/blue/Desktop/work/programs/gpicview/src/gpicview.c:57:9
    #2 0x7f2c831b81c9  (/lib/x86_64-linux-gnu/libc.so.6+0x271c9) (BuildId: 0f2b39b572b576eb49c92fd0cc6dfa2bc9904500)
    #3 0x7f2c831b8284 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27284) (BuildId: 0f2b39b572b576eb49c92fd0cc6dfa2bc9904500)
    #4 0x56477ccee0a0 in _start (/home/blue/Desktop/work/programs/gpicview/fuzz/asan_gpicview+0x6d0a0) (BuildId: e2195e8b4fadfc4d)

Address 0x624000000000 is a wild pointer inside of access range of size 0x000000000001.
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/blue/Desktop/work/programs/gpicview/src/exif.c:1702:57 in harness
Shadow bytes around the buggy address:
  0x0c487fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c487fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c487fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c487fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c487fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c487fff8000:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c487fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c487fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c487fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c487fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c487fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==47591==ABORTING

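  So the arbitrary-write path found by the fuzzer is the same one I found by reading the source: the arbitrary address write happens while updating the EXIF orientation.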

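  An attacker can supply a malicious offset to control the value of OrientationPtr[0] and, at the same time, set OrientationNumFormat[0] to one of FMT_SBYTE, FMT_BYTE, or FMT_USHORT; the corresponding address is then written when ExifRotate() runs. Because of constraints in the program, however, the written value can only lie in the range 0x00-0x08, and at most two writes are possible. The impact of the bug is therefore limited.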
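
  To see how little the attacker controls, the sketch below lists the bytes each format would store for the smallest and largest possible value. It assumes that Put16u stores two bytes in the file's byte order, with MotorolaOrder meaning big-endian, as the jhead helper of that name does; bytes_written is a hypothetical helper written for this illustration.

```python
import struct

def bytes_written(fmt_name, value, motorola_order):
    """Bytes stored at the target address for one orientation entry (sketch)."""
    if fmt_name in ("FMT_BYTE", "FMT_SBYTE"):
        return struct.pack("B", value)              # single byte
    if fmt_name == "FMT_USHORT":
        # Assumed Put16u behaviour: two bytes in the file's byte order.
        return struct.pack(">H" if motorola_order else "<H", value)
    raise ValueError("format not covered by this sketch")

for value in (0, 8):
    print(value,
          bytes_written("FMT_BYTE", value, True).hex(),
          bytes_written("FMT_USHORT", value, True).hex(),
          bytes_written("FMT_USHORT", value, False).hex())
```

  Under these assumptions the attacker chooses where one or two bytes land, but one of the two bytes of a FMT_USHORT write is always zero and the other never exceeds 0x08.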