Kathryn T. Stolee

Students: George Mathew (PhD in progress), Justin Middleton (PhD in progress), Joshua Kayani (REU 2016), James Saylor (Ugrad, 2014)

Behavioral code analysis seeks to understand the relationship of code to a description of that code. In semantic code search, the code description is a behavioral input-output example. In cross-language code analysis, the description is code in one language, with the goal of identifying code in another language that behaves the same (or nearly so).

Semantic code search: Searching for code is a common task among programmers, with the ultimate goal of reuse. While the process of searching for code (issuing a query and selecting a relevant match) is straightforward, several costs must be balanced, including the costs of specifying the query, examining the results to find desired code, and not finding a relevant result. For syntactic searches the query cost is quite low, but the results are often irrelevant, so the examination cost is high and matches may be missed. Semantic searches may return more relevant results, but current techniques that involve writing complex specifications or executing code against test cases are costly to the developer, and close matches cannot be easily identified. We have developed an approach for semantic search in which developers provide lightweight specifications and an SMT solver identifies matching programs from a repository. The program repository is automatically encoded offline so the search for programs is efficient. Programs are also encoded at various abstraction levels to enable partial matches when few or no exact matches exist.
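As a rough illustration of querying by input-output example, the sketch below ranks a tiny in-memory repository against a lightweight specification. It is not the approach described above: it substitutes direct execution for the SMT-based encoding so that it stays self-contained, and its partial-match score (the fraction of examples satisfied) is only a stand-in for matching at different abstraction levels. The repository contents and the names `REPOSITORY` and `search` are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

# An example pairs a tuple of input arguments with the expected output.
Example = Tuple[Tuple[Any, ...], Any]

# Hypothetical repository of candidate code, keyed by a descriptive name.
REPOSITORY: Dict[str, Callable[..., Any]] = {
    "upper": lambda s: s.upper(),
    "strip": lambda s: s.strip(),
    "strip_upper": lambda s: s.strip().upper(),
}


def search(examples: List[Example]) -> List[Tuple[str, float]]:
    """Rank candidates by the fraction of examples they satisfy.

    A score of 1.0 is an exact match; lower scores are partial matches.
    Candidates that satisfy no example are dropped.
    """
    ranked = []
    for name, fn in REPOSITORY.items():
        passed = 0
        for args, expected in examples:
            try:
                if fn(*args) == expected:
                    passed += 1
            except Exception:
                # A candidate that crashes on an input simply fails that example.
                pass
        score = passed / len(examples)
        if score > 0:
            ranked.append((name, score))
    # Best score first; ties broken alphabetically for determinism.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Query: "trim whitespace and upper-case", given as two examples.
results = search([((" abc ",), "ABC"), (("x",), "X")])
print(results)
```

Here `strip_upper` satisfies both examples, while `upper` satisfies only the second and is reported as a partial match.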

Cross-language analysis: This project will develop techniques to automatically identify incompatibilities and potential misconceptions between two programming languages. It investigates two main research tasks. The first task is to develop an approach for automatically identifying clusters of similar code based on dynamic behavior, likely invariants, observed side effects, and performance. Behavioral clusters are formed from snippets in multiple languages that produce the same outputs on the same inputs. Likely invariants from observed behavior are used to describe similarities and differences. The second task is to develop a technique to identify misconceptions that emerge when a programmer assumes code should behave the same but it does not. To identify misconceptions, the technique leverages the behavioral clusters and characterizations from the code similarity analysis; code snippets that look similar but behave differently in overt (behavior) or insidious (performance, side effects) ways are candidates. The technique will rank misconceptions based on probability of appearing and likely impact. It will also use invariants, behavior, side effects, and performance to form automated explanations of behavioral similarities and differences. Finally, these techniques and explanations will be applied for the benefit of two groups of real programmers: transfer students who know one language and need to learn a new one, and data scientists who work with many programming languages to complete their tasks. For programmers learning a new language, in student, professional, or hobby capacities, this work aims to increase the speed and reliability with which they acquire knowledge of the programming language.
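A minimal sketch of the clustering idea, under simplifying assumptions: snippets are grouped by the outputs they produce on a shared set of inputs. Because the example must run in a single language, the Java and JavaScript snippets are modeled by Python stand-ins that mimic their integer-division semantics (Java's `int / int` truncates toward zero; `Math.floor` rounds down). The names `SNIPPETS`, `INPUTS`, and `cluster_by_behavior` are hypothetical, and only input-output behavior is considered, not invariants, side effects, or performance.

```python
import math
from collections import defaultdict

# Each key is (language, source text); each value is a Python stand-in
# that models how that snippet behaves on integer arguments.
SNIPPETS = {
    ("Python", "a // b"): lambda a, b: a // b,                 # floors
    ("Java", "a / b"): lambda a, b: math.trunc(a / b),         # truncates toward zero
    ("JavaScript", "Math.floor(a / b)"): lambda a, b: math.floor(a / b),
}

# Shared inputs; the negative dividend is what separates floor from truncate.
INPUTS = [(7, 2), (-7, 2), (8, 4)]


def cluster_by_behavior(snippets, inputs):
    """Group snippets whose outputs agree on every shared input."""
    clusters = defaultdict(list)
    for label, fn in snippets.items():
        signature = tuple(fn(*args) for args in inputs)
        clusters[signature].append(label)
    return dict(clusters)


clusters = cluster_by_behavior(SNIPPETS, INPUTS)
for signature, members in clusters.items():
    print(signature, members)
```

The Python and JavaScript snippets land in one cluster, while the Java snippet, which looks most like the Python one, falls into another. That kind of look-alike pair is the sort of candidate misconception the second task targets.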

Students: Kai Presler-Marshall (PhD in progress), Andrew Hill (M.S. 2018), Yalin Ke (M.S. 2015)

Automated program repair can potentially reduce debugging costs and improve software quality, but recent studies have drawn attention to shortcomings in the quality of automatically generated repairs. We propose a new kind of repair that uses the large body of existing open-source code to find potential fixes. The key challenges lie in efficiently finding code semantically similar (but not identical) to defective code and then appropriately integrating that code into a buggy program. We present SearchRepair, a repair technique that addresses these challenges by (1) encoding a large database of human-written code fragments as SMT constraints on input-output behavior, (2) localizing a given defect to likely buggy program fragments and deriving the desired input-output behavior for code to replace those fragments, (3) using state-of-the-art constraint solvers to search the database for fragments that satisfy that behavior and replacing the likely buggy code with these potential patches, and (4) validating that the patches repair the bug against program test suites.

Funding: SHF: Small: Supporting Regular Expression Testing, Search, Repair, Comprehension, and Maintenance

Students: Peipei Wang (PhD in progress), Carl Chapman (M.S. 2016)

Due to the popularity and pervasive use of regular expressions, researchers have created tools to support their creation, validation, and use. However, little is known about the context in which regular expressions are used, the features that are most common, and how behaviorally similar regular expressions are to one another. In this project, we explore the context in which regular expressions are used through a combination of developer surveys and repository analysis. This is the first rigorous examination of regex usage and it provides empirical evidence to support design decisions by regex tool builders.
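One simple way to make "behaviorally similar" concrete, offered as an assumption rather than as the measure used in this project, is to compare the sets of strings two regular expressions accept from a common corpus. The sketch below computes a Jaccard similarity over those sets; `match_set`, `behavioral_similarity`, and `CORPUS` are hypothetical names.

```python
import re


def match_set(pattern, corpus):
    """Return the corpus strings that the pattern matches in full."""
    compiled = re.compile(pattern)
    return {s for s in corpus if compiled.fullmatch(s)}


def behavioral_similarity(p1, p2, corpus):
    """Jaccard similarity of the two patterns' match sets over the corpus."""
    m1, m2 = match_set(p1, corpus), match_set(p2, corpus)
    union = m1 | m2
    # Two patterns that match nothing in the corpus behave identically on it.
    return len(m1 & m2) / len(union) if union else 1.0


CORPUS = ["7", "42", "007", "a1", "", "3.14"]

# Syntactically different, behaviorally identical on this corpus.
sim_same = behavioral_similarity(r"\d+", r"[0-9]+", CORPUS)
# Syntactically close, behaviorally different: only "7" is a single digit.
sim_diff = behavioral_similarity(r"\d+", r"\d", CORPUS)
print(sim_same, sim_diff)
```

The result depends entirely on the corpus chosen, so a measure like this only approximates similarity over the strings actually sampled.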

Students: Gina R. Bai (PhD in progress), Kai Presler-Marshall (PhD in progress)

We are exploring a variety of topics in computer science education, including human studies on student behavior during various software engineering tasks and the potential role of program repair of SQL statements in education. In broader diversity efforts to improve retention of underrepresented individuals in computing, we have drawn from social-psychological research to create a lightweight intervention that provides positive, granular exam feedback. Results from two 100-level computing courses show the intervention increased all students’ self-assessments of CS ability but only increased women’s CS persistence intentions.

Past Students: Peng Sun

Crowdsourcing is a compelling approach for accomplishing tasks that require opinions or work from a large number of people. I am interested in techniques and approaches to help researchers and practitioners to best leverage crowdsourcing to conduct software engineering tasks and to evaluate software engineering research.

One of the most popular end-user programming domains is mashups. Mashup programming environments are emerging to help end users create mashups that tailor and individualize data streams, which puts the power of creation in the hands of the end user. However, the mashups created by end users often contain deficiencies that make them error-prone and hard to understand. Further, users often 'reinvent the wheel' by creating mashups that have the same functionality as mashups created by other users.

Our work with web mashups deals with refactoring techniques to reduce complexity, increase abstraction, update broken or dated components, and standardize the programs to fit community development patterns. Results from our empirical study refactoring 8,051 Yahoo! Pipes programs and details regarding the manipulation infrastructure can be found here.

End-user programmers outnumber professional programmers, write software that matters to an increasingly large number of users, and face software engineering challenges that are similar to those of their professional counterparts. Yet, we know little about how these end-user programmers create and share artifacts as part of a community. To gain a better understanding of these issues, we perform an artifact-based community analysis of 32,887 mashups from the Yahoo! Pipes repository. We observed that, as with other online communities, there is a great deal of attrition, but authors that persevere tend to improve over time, creating pipes that are more configurable, diverse, complex, and popular. We also discovered, however, that end-user programmers employ the repository in different ways than professionals, do not effectively reuse existing programs, and in most cases do not have an awareness of the community.

By observing the clipboard as a mode of data transfer in the desktop environment, we are searching for patterns in end users' usage history for the purposes of finding areas in which users are inefficient in transferring data and could benefit from automation and validation of copy and paste activities.

Using automatically-generated assertions to improve the robustness of Web macros. For details, visit the website.