SHF: Small: Supporting Regular Expression Testing, Search, Repair, Comprehension, and Maintenance

Date:

Abstract: Many aspects of our economy rely heavily on software working correctly. However, software errors are common, routinely cause security breaches, and cost our economy billions of dollars annually. Despite the well-known high costs of software errors, the software industry struggles to overcome this challenge, as new errors are reported faster than they can be fixed. Recent research has demonstrated the potential of automated program repair techniques to address this challenge. In this research, we develop new techniques to fix software errors and implement new features automatically. The challenge is to fix code while not breaking other functionality, and to work toward repairing code of increasing complexity.

The approach takes advantage of the high availability of open-source code that already implements many functions required for a new software project. The approach is to search for relevant code in open-source projects, adapt that code to its new context using automated software repair and generation techniques, and then validate the changed software. A key component of the approach is semantic code search, which queries large databases of code to find code snippets that satisfy a behavioral specification. The project develops novel techniques that (1) encode large, searchable bodies of code as behavioral profiles, (2) localize bugs and features to code blocks, modules, and components, (3) extract the desired behavioral profiles of those blocks, modules, and components, (4) use the extracted profiles to search the database for potential patches, (5) adapt the potential patches to fit into the code context, and (6) validate the potential patches. The project focuses on producing high-quality code, verifying that the injected code does not break existing functionality. The broader impacts come mainly from goal of radically improving software productivity through reuse and adaptation of existing code.