Proj THUDBFuzz Paper Reading: Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

zoukankan html css js c++ java

Proj THUDBFuzz Paper Reading: Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing
Abstract

文章认为: search-based testing does not generate highly-structured data; grammar-based fuzzing does not generate test case structures
本文: G-EVOSUITE
实验: an empirical study
对象: 3个Json parser的20个java classes
效果: json相关的类branch coverage + 15%，其他的至少负面影响不太大

1. Intro

文章认为: 对于grammar-based fuzzing，用户不但需要指定entry points，还需要人工创建为每个method都要test structure
G-EVOSUITE用演化算法确定句子顺序，以此保证test case structure，而使用grammar-based fuzzing保证input data符合语法。
Evolutionary algorithms create and evolve the test case structure（statement sequence) while grammar-based fuzzing is used to evolve parts of the input data.
Fuzzer令演化算法的初始化population为structured JSON inputs并让变异出的数据结构保持JSON结构
The fuzzer injects structured JSON inputs in the initial population of EA with some prob and manipulates this data to maintain a well-formed JSON structure.
作者过去曾经实现了一个前沿的JAVA test case generator-EVOSUITE。
目前的实验细节:3个JSON parser库，20个类，其中16个需要JSON input，4个non-JSON。16个JSON input用于验证本文的方法增加了code coverage，4个non-JSON用于验证没有影响到EVOSUITE在非json输入类上面的性能。
对每一组，本文分别用了60s, 120s, 180s搜索时间来做对比。
结果: 更高的code coverage

2. Background and Related Work

Search-based Test Case Generation

已有的方法常依赖于：
1. test adequacy criteria: 常用来设置搜索目标。例如: application level，branch distance等
2. 演化算法: 例如: EVOSUITE，是前沿的演化算法test case工具，内置多种遗传算法，包括monotonic genetic alogo, local solvers和many objective algos
- 这里为了支持同时优化多个目标，本文选择了DynaMOSA
DynaMOSA

将test case本身视为染色体，那么，每个语句都是一个gene。算法采用single point crossover的方式，通过重组语句来生成新的test case，并使用uniform mutation进一步修改子代

Grammar-based fuzzing

开发者往往需要指定application entry point，而且不同的entry point可能对应着不同的语法。
这些软件的目标往往是: allopw satisfying/cover different program conditions
常用策略包括:
1. 符号执行
2. 元信息启发
3. 语法
4. hybrid
Reasons for combining

开发者往往需要指定application entry point，而且不同的entry point可能对应着不同的语法。

3. Approach

G-EVOSUITE
1. 以EVOSUITE作为test case generator
- 在此基础上加了一个json fuzzer
- 稍微修改了DynaMOSA算法
1. 以SNODGE为grammer fuzzer
Initialization
1. 创建初代池
- 目的: 激活目标class的不同方法
- 步骤：
  
  随机挑选若干method calls
  
  创建对应的objects
  
  填入参数
  
  随机
  
  从常量池中抽取
- 本文的改进
  
  需要json参数的地方能传入json参数
Selection

用perference sorting algo结合多项指标(line, branch, exception和weak mutation)为test case打分，最好的是tournament selection。test case根据是否打开了新branch，会有机会复制自己传到下一代。

Grammar-based Mutation

使得test case不会快速困在局部最优点
1. uniform mutation
2. add new <k,v>
3. add new objs
4. remove <k,v>
5. modify <k,v>
6. reording <k,v>
7. 检查测试用例T中所有input data, 确定string inputs，使用JSON parser重新验证是否有效
4. Empirical Study

主要回答两个问题
1. To what extend does grammer-based fuzzing improve the effectiveness of test case generation in EVOSUITE?
2. What is the effectiveness of combining grammar-based fuzzing and search-based testing over different search budgets?
5. Conclusion
查看全文

相关阅读:
ABP WebApi的请求类型
 对服务器的文件或文件夹压缩
 VS运行遇到的那些坑
 asp.net core Api集成Swagger
通过文件路径获取文件名
 读取服务器或者本地的文件夹下面的文件
 计算出 3 至 1000 范围内最大的十个素数，放入数组中，并计算出其累加和。
LINUX 无法登入系统（2017-1-16）
zynq里面的AXI总线（2017-1-11）
学习随笔（2017-1-10）

原文地址：https://www.cnblogs.com/xuesu/p/14491541.html

Proj THUDBFuzz Paper Reading: Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

Abstract

1. Intro

2. Background and Related Work

Search-based Test Case Generation

DynaMOSA

Grammar-based fuzzing

Reasons for combining

3. Approach

Initialization

Selection

Grammar-based Mutation

4. Empirical Study

5. Conclusion