[Benchmark] Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools

Basic Information
- Publication: ICSE'17
- Authors: Shin Hwei Tan, Jooyong Yi, Yulis, Sergey Mechtaev, Abhik Roychoudhury
- Language: C
- Source: Codeforces programming contests (rejected/accepted submission pairs)
- Description: a set of 3902 defects from 7436 programs automatically classified across 39 defect classes
- Dataset Homepage
Summary

Existing benchmarks for automated program repair (such as ManyBugs and IntroClass) do not allow a thorough investigation of the relationship between fault types and the effectiveness of repair tools.
Five criteria for a benchmark that allows extensive evaluation of repair tools:
- C1: Diverse types of real defects.
- C2: Large number of defects.
- C3: Large number of programs.
- C4: Programs that are algorithmically complex.
- C5: Large held-out test suite for patch correctness verification.
Overall, the authors crawled over 10,000 webpages from Codeforces programming contests. For each rejected submission r, they looked in the crawled data for an accepted submission a by the same user for the same programming problem. Each defect is represented by the submission pair (r, a). This yielded 5544 defects in total. They then excluded 924 defects due to inadequate held-out tests, 677 defects due to non-reproducible bugs, and 41 defects due to a known CIL bug in handling variable-sized multidimensional arrays, which leaves the 3902 defects in the final dataset (5544 - 924 - 677 - 41 = 3902).
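The pairing step described above can be sketched as follows. This is only an illustration, not the authors' crawler: the `Submission` record, its field names, the verdict strings, and the user name `alice` are all hypothetical, and picking the first accepted submission encountered is an assumption, since the summary does not say which one is chosen when a user has several.

```python
from typing import List, NamedTuple, Tuple


class Submission(NamedTuple):
    """Hypothetical record for one crawled submission."""
    submission_id: int
    user: str
    contest_id: int
    problem: str
    verdict: str  # assumed to be "ACCEPTED" or "REJECTED"


def pair_defects(subs: List[Submission]) -> List[Tuple[Submission, Submission]]:
    """Pair each rejected submission r with an accepted submission a
    by the same user for the same problem, giving (r, a) pairs."""
    accepted = {}
    for s in subs:
        if s.verdict == "ACCEPTED":
            # Assumption: keep the first accepted submission seen per key.
            accepted.setdefault((s.user, s.contest_id, s.problem), s)

    pairs = []
    for s in subs:
        if s.verdict == "REJECTED":
            a = accepted.get((s.user, s.contest_id, s.problem))
            if a is not None:  # rejected submissions with no fix are dropped
                pairs.append((s, a))
    return pairs


crawled = [
    Submission(18353198, "alice", 1, "A", "REJECTED"),
    Submission(18353306, "alice", 1, "A", "ACCEPTED"),
    Submission(18353400, "bob", 1, "A", "REJECTED"),  # never fixed, no pair
]
defects = pair_defects(crawled)
```

Here only the `alice` pair survives, because `bob` has no accepted submission for the same problem in the crawled data.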

All defects are divided into 39 classes by applying GumTree to the AST-level syntactic differences between the buggy program and the patched program.

Structure

codeflaws
|> 1-A-bug-18353198-18353306 (<contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>)
|===> 1-A-18353198.c (<contestid>-<problem>-<buggy-submissionid>.c)
|===> 1-A-18353306.c (<contestid>-<problem>-<accepted-submissionid>.c)
|===> input-neg1 (Test input files: input[0-9]+ file used by Test suite (i))
|===> output-neg1 (Test output files: output[0-9]+ file used by Test suite (i))
|===> heldout-input-pos1 (heldout-input[0-9]+ file used by Test suite (ii))
|===> heldout-output-pos1 (heldout-output[0-9]+ file used by Test suite (ii))
|===> 1-A-18353198.c.revlog (Test configuration for SPR that specifies the names of the passing/failing tests: <contestid>-<problem>-<buggy-submissionid>.c.revlog)
|===> test-genprog.sh (Repair test script, i.e. the test suite given to repair tools for generating a repair; used by the search-based repair tools GenProg, SPR, and Prophet)
|===> test-angelix.sh (Repair test script for Angelix, which requires inserting special instrumentation)
|===> test-valid.sh (Test script for validating the correctness of patches against the held-out test suite)
|===> Makefile (Makefile for compiling the buggy submission. This contains the CFLAGS options recommended by Codeforces. To compile the accepted submission, use the command make FILENAME=10-A-13543524)
|===> Makefile.genprog (Makefile.genprog for compiling the buggy submission using cilly. This is for GenProg experiments as GenProg works on CIL representation.)
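The naming convention in the tree above lends itself to a small helper that splits a defect directory name into its parts and derives the file names from it. The sketch below is not part of the dataset: the `Defect` class, the `parse_defect_dir` function, and the regular expression are hypothetical, and the pattern assumes that problem identifiers are alphanumeric and contain no hyphen.

```python
import re
from dataclasses import dataclass

# <contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>
# Assumption: the problem identifier is alphanumeric, with no hyphen.
_DEFECT_DIR = re.compile(r"^(\d+)-([A-Za-z0-9]+)-bug-(\d+)-(\d+)$")


@dataclass(frozen=True)
class Defect:
    contest_id: str
    problem: str
    buggy_id: str
    accepted_id: str

    @property
    def buggy_source(self) -> str:
        return f"{self.contest_id}-{self.problem}-{self.buggy_id}.c"

    @property
    def accepted_source(self) -> str:
        return f"{self.contest_id}-{self.problem}-{self.accepted_id}.c"

    @property
    def accepted_make_command(self) -> str:
        # Mirrors the "make FILENAME=..." form quoted in the Makefile entry.
        return f"make FILENAME={self.contest_id}-{self.problem}-{self.accepted_id}"


def parse_defect_dir(name: str) -> Defect:
    """Split a defect directory name into its four components."""
    match = _DEFECT_DIR.match(name)
    if match is None:
        raise ValueError(f"not a defect directory name: {name!r}")
    return Defect(*match.groups())


defect = parse_defect_dir("1-A-bug-18353198-18353306")
```

For the example directory, `defect.buggy_source` is `1-A-18353198.c` and `defect.accepted_source` is `1-A-18353306.c`, matching the two source files listed in the tree.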

Original article: https://www.cnblogs.com/XBWer/p/9195057.html
