Posts by Collection

SHF: EAGER: Collaborative Research: Demonstrating the Feasibility of Automatic Program Repair Guided by Semantic Code Search

Published: July 01, 2014

Abstract: Software is an integral part of our everyday lives, and our economy relies heavily on software working correctly. However, bugs in software cause security breaches, and cost our economy billions of dollars annually. While these high costs of bugs are well known, the software industry struggles to remedy the situation because the inherent complexity of the software makes bugs so common that new bugs are typically reported faster than developers can fix them. The goal of this project is to develop a technique that fixes bugs automatically, greatly reducing the cost of fixing the bugs, improving quality of software, and reducing the negative effects on the economy and society.

SHF: Medium: Collaborative Research: Semi and Fully Automated Program Repair and Synthesis via Semantic Code Search

Published: July 01, 2016

Abstract: Many aspects of our economy rely heavily on software working correctly. However, software errors are common, routinely cause security breaches, and cost our economy billions of dollars annually. Despite the well-known high costs of software errors, the software industry struggles to overcome this challenge, as new errors are reported faster than they can be fixed. Recent research has demonstrated the potential of automated program repair techniques to address this challenge. In this research, we develop new techniques to fix software errors and implement new features automatically. The challenge is to fix code while not breaking other functionality, and to work toward repairing code of increasing complexity.

SHF: Small: Supporting Regular Expression Testing, Search, Repair, Comprehension, and Maintenance

Published: August 15, 2017

Abstract: In software development, regular expressions are a common programming construct used for many purposes, including querying databases, searching documents, validating user input, and parsing files. Most programming languages have standard libraries or built-in support for regular expression processing. Despite their frequent appearance in software development activities, regular expressions are prone to programming errors. When a regular expression is responsible for a software bug, the impact can be severe, possibly resulting in corrupted data, security vulnerabilities, denial of service attacks, or website outages. This research develops new techniques to test, understand, reuse, and maintain regular expressions, in an effort to improve developer comprehension and reduce related bugs.

CAREER: On the Foundations of Semantic Code Search

Published: August 01, 2018

During software development, programmers frequently use search to find code to reuse. At minimum, code reuse requires that the behavior of the found code satisfies the needs of the programmer. Current search tools consume a textual description of the desired code as a query, which ignores the behavior of the source code. Semantic code search finds code based on behavior, and recent research has demonstrated its potential to find source code for reuse as well as to repair software faults. Challenges arise when 1) the desired code does not exist; 2) there are too many results to navigate efficiently; or 3) it is difficult to differentiate between similar code snippets. These challenges are especially pronounced for programmers in languages that are less supported, such as those used by end-user programmers.
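As a rough illustration of searching by behavior rather than by text, the sketch below treats a query as a set of input/output examples and returns every candidate snippet whose observed behavior matches all of them. The corpus, the snippet names, and the `search_by_behavior` function are hypothetical and exist only for this example; this is not the project's method, and real semantic search approaches may rely on program analysis rather than simply executing candidates.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical corpus of candidate snippets, keyed by an illustrative name.
CORPUS: Dict[str, Callable[[str], str]] = {
    "to_upper": lambda s: s.upper(),
    "strip_spaces": lambda s: s.strip(),
    "reverse": lambda s: s[::-1],
    "capitalize": lambda s: s.capitalize(),
}


def search_by_behavior(examples: List[Tuple[str, str]]) -> List[str]:
    """Return the names of snippets whose output matches every example."""
    matches = []
    for name, func in CORPUS.items():
        try:
            if all(func(inp) == out for inp, out in examples):
                matches.append(name)
        except Exception:
            # A snippet that fails on an example input does not match.
            continue
    return sorted(matches)


# A precise example pins down a single snippet.
print(search_by_behavior([("abc", "cba")]))  # ['reverse']

# An under-specified example matches several snippets (challenges 2 and 3).
print(search_by_behavior([("a", "a")]))  # ['reverse', 'strip_spaces']

# No snippet in the corpus has the desired behavior (challenge 1).
print(search_by_behavior([("abc", "xyz")]))  # []
```

The second and third queries mirror the challenges listed above: a weak specification returns results that are hard to tell apart, and a specification that no snippet satisfies returns nothing.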

SHF: SMALL: Automated Discovery of Cross-Language Program Behavior Inconsistency

Published: August 01, 2020

Abstract: In the software industry, hundreds of programming languages exist, many of which programmers are expected to be proficient in. The common assumption has been that once a programmer knows one language, they can leverage concepts and knowledge already learned and easily pick up another programming language. Unfortunately, empirical studies find this process to be error-prone and ineffective due to imprecise mismatches between concepts and expressions across programming languages. This project develops techniques to ease the acquisition of knowledge for new programming languages by identifying and explaining how the behaviors of code in different languages relate. The anticipated result is that programmers will learn new languages faster and write code with fewer bugs. Beyond the general benefit of better-educated programmers, techniques for teaching computer programming are important in particular because programming is a crucial skill for a digitally literate society.

Improving Software Testing Education through Lightweight Explicit Testing Strategies and Feedback

Published: July 01, 2022

Abstract: This project aims to serve the national interest by designing tools and pedagogy that will help students learn how to write better software tests. Software testing is a critical skill for computer science graduates when they enter the workforce. A software professional with better software testing skills is likely to produce more secure, more robust, and less error-prone software. Writing effective software tests is an open-ended problem that students find very challenging. The use of checklists has been shown to be an effective practice for helping students learn other software engineering skills. This project will investigate the use of checklists to provide a procedure that students can use to write software tests. Students will learn how to write software tests and use them to evaluate the quality of software they develop. The project team will develop a software tool that will run students' tests on software and provide real-time hints and feedback on the quality of their tests. The project will also examine the benefits of testing education for students in terms of their learning outcomes and code quality.

Paper Title Number 1

Published in Journal 1, 2009

This paper is about the number 1. The number 2 is left for future work.

Recommended citation: Your Name, You. (2009). "Paper Title Number 1." Journal 1. 1(1). http://academicpages.github.io/files/paper1.pdf

Paper Title Number 2

Published in Journal 1, 2010

This paper is about the number 2. The number 3 is left for future work.

Recommended citation: Your Name, You. (2010). "Paper Title Number 2." Journal 1. 1(2). http://academicpages.github.io/files/paper2.pdf

Paper Title Number 3

Published in Journal 1, 2015

This paper is about the number 3. The number 4 is left for future work.

Recommended citation: Your Name, You. (2015). "Paper Title Number 3." Journal 1. 1(3). http://academicpages.github.io/files/paper3.pdf

Paper Title Number 4

Published in GitHub Journal of Bugs, 2024

This paper is about fixing template issue #693.

Recommended citation: Your Name, You. (2024). "Paper Title Number 4." GitHub Journal of Bugs. 1(3). http://academicpages.github.io/files/paper3.pdf

Searching for Source Code with Constrained Semantic Search

Published: November 19, 2013

Programmers are resourceful. This talk explores how search supports their development practices.

I get by with a little help from my friends: crowdsourcing program repair

Published: January 10, 2017

Regular expressions are commonly used in source code, yet developers find them hard to read, hard to write, and hard to compose. Motivated by the prevalence of regular expression usage in practice and the number of bug reports related to regular expressions, I propose several future directions for studying regular expressions, including error classification, test coverage, test input generation, reuse, and automated program repair. The repair strategies can work in the presence or absence of fault localization, and with or without test cases. I conclude by discussing the potential impact of integrating regex support into automated program repair approaches. Dagstuhl Seminar

Making software development easier, one search at a time

Published: May 04, 2018

Search is an embedded process in modern software development, and it’s not going anywhere. Search is used to help developers find code to reuse, find API examples, and explore documentation. This talk explores my recent research in code search for software engineering, including understanding why developers search, how we can make search more precise, and how behavioral code search can enable automated program repair.

Repairing Programs with Semantic Code Search

Published: June 28, 2018

All software has bugs. This talk presents an approach to patching bugs that is not prone to overfitting (at least compared to prior approaches).

How Software Engineering Became My Career

Published: September 07, 2018

Software engineering is not about programming. It’s about people.

Program Analysis Fueled Search

Published: February 01, 2020

Developers search the web for code, but web search was not built for code. Here are the challenges and the strides we’ve made to make search for code easier and more precise.

Nevertheless, She Persisted: An intervention to increase women’s persistence intentions in CS

Published: August 27, 2021

Gender stereotypes about women’s computing ability contribute to the dearth of women in computing by causing women to experience gender bias. These gender stereotypes are doubly disadvantaging to women because they create gender differences in self-assessments of computing ability, decreasing the likelihood that women will persist in Computer Science (CS). This is because students need to believe they have sufficient ability in a field in order to pursue it as a career. This talk presents an intervention to increase women’s persistence intentions in CS.

To Search or Not To Search, Depends on the Question

Published: April 27, 2022

This talk looks at when it is worth searching, and for what, while programming.

How Code Search Drives Software Engineering

Published: July 10, 2023

Code search describes the process of retrieving source code from a repository, where that source code matches a query. Whether a developer is looking for where an error was thrown, learning how to use a new-to-them API, learning a new programming language, or browsing their team’s directory to familiarize themselves with the codebase, search underpins all these activities. Beyond those human-driven software engineering processes, search is also a component in automated software engineering, such as automated program repair, code example recommendation, and clone detection. This talk presents my research group’s work on how code search shapes modern software development processes, and looks ahead to future challenges and opportunities (especially in the context of generative AI).

Katie Stolee, PhD