### Top 10 arXiv Papers Today in Software Engineering

##### #1. Influence of Technical and Social Factors for Introducing Bugs
###### Filipe Falcão, Caio Barbosa, Baldoino Fonseca, Alessandro Garcia, Márcio Ribeiro
As the modern open-source paradigm makes it easier to contribute to software projects, the number of developers involved in these projects keeps increasing. This growth in the number of developers makes it more difficult to deal with harmful contributions. Recent studies have found that technical and social factors can predict the success of contributions to open-source projects on GitHub. However, these studies do not examine the relation between these factors and the introduction of bugs. Our study aims at investigating the influence of technical (such as developers' experience) and social (such as number of followers) factors on the introduction of bugs, using information from 14 projects hosted on GitHub. Understanding the influence of these factors may be useful to developers, code reviewers and researchers. For instance, code reviewers may want to double-check commits from developers that present bug-related factors. We found that technical factors have a consistent influence on the introduction of bugs. On the other...
###### Tweets
ComputerPapers: Influence of Technical and Social Factors for Introducing Bugs. https://t.co/dvxX2CoKNM
###### Other stats
Sample Sizes : None.
Authors: 5
Total Words: 8500
Unique Words: 1848

##### #2. Fault Localization for Declarative Models in Alloy
###### Kaiyuan Wang, Allison Sullivan, Darko Marinov, Sarfraz Khurshid
Fault localization is a popular research topic and many techniques have been proposed to locate faults in imperative code, e.g., C and Java. In this paper, we focus on the problem of fault localization for declarative models in Alloy -- a first-order relational logic with transitive closure. We introduce AlloyFL, the first set of fault localization techniques for faulty Alloy models which leverages multiple test formulas. AlloyFL is also the first set of fault localization techniques at the AST node granularity. We implement in AlloyFL both spectrum-based and mutation-based fault localization techniques, as well as techniques that are based on Alloy's built-in unsat core. We introduce new metrics to measure the accuracy of AlloyFL and systematically evaluate AlloyFL on 38 real faulty models and 9000 mutant models. The results show that the mutation-based fault localization techniques are significantly more accurate than other types of techniques.
###### Other stats
Sample Sizes : None.
Authors: 4
Total Words: 13079
Unique Words: 3028

##### #3. On the Use of Emoticons in Open Source Software Development
###### Maëlick Claes, Mika Mäntylä, Umar Farooq
Background: Using sentiment analysis to study software developers' behavior comes with challenges such as the presence of a large amount of technical discussion unlikely to express any positive or negative sentiment. However, emoticons provide information about developer sentiments that can easily be extracted from software repositories. Aim: We investigate how software developers use emoticons differently in issue trackers in order to better understand the differences between developers and determine to what extent emoticons can be used in place of sentiment analysis. Method: We extract emoticons from 1.3M comments from Apache's issue tracker and 4.5M from Mozilla's issue tracker using regular expressions built from a list of emoticons used by SentiStrength and Wikipedia. We check for statistical differences using Mann-Whitney U tests and determine the effect size with Cliff's delta. Results: Overall, Mozilla developers rely more on emoticons than Apache developers. While the overall rate of comments with emoticons is of...
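The abstract above names Cliff's delta as its effect-size measure. As a rough illustration only (this is not the authors' code, and the sample values are invented for the example), Cliff's delta can be sketched in Python as the share of cross-group pairs where the first value is larger minus the share where it is smaller:

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Return Cliff's delta: P(x > y) - P(x < y) over all (x, y) pairs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical per-developer emoticon rates, invented for illustration only.
group_a = [0.10, 0.20, 0.30, 0.40]
group_b = [0.05, 0.15, 0.25, 0.35]
delta = cliffs_delta(group_a, group_b)
print(delta)
```

The value ranges from -1 (every value in the first group is smaller) to 1 (every value is larger), with 0 meaning the two groups overlap completely. This pairwise version is quadratic in the sample sizes, so it is a sketch for small inputs rather than something suited to millions of comments.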
###### Tweets
arxiv_cshc: On the Use of Emoticons in Open Source Software Development https://t.co/JM2Sq2ZwaP
ComputerPapers: On the Use of Emoticons in Open Source Software Development. https://t.co/9FTLiDyNIg
###### Github

List of emoticons and emotions used for the ESEM paper "On the Use of Emoticons in Open Source Software Development"

Repository: ESEM2018-Emoticons-Emotions-List
User: M3SOulu
Language: None
Stargazers: 0
Subscribers: 4
Forks: 0
Open Issues: 0
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 3767
Unique Words: 1343

##### #4. Learning from Mutants: Using Code Mutation to Learn and Monitor Invariants of a Cyber-Physical System
###### Yuqi Chen, Christopher M. Poskitt, Jun Sun
Cyber-physical systems (CPS) consist of sensors, actuators, and controllers all communicating over a network; if any subset becomes compromised, an attacker could cause significant damage. With access to data logs and a model of the CPS, the physical effects of an attack could potentially be detected before any damage is done. Manually building a model that is accurate enough in practice, however, is extremely difficult. In this paper, we propose a novel approach for constructing models of CPS automatically, by applying supervised machine learning to data traces obtained after systematically seeding their software components with faults ("mutants"). We demonstrate the efficacy of this approach on the simulator of a real-world water purification plant, presenting a framework that automatically generates mutants, collects data traces, and learns an SVM-based model. Using cross-validation and statistical model checking, we show that the learnt model characterises an invariant physical property of the system. Furthermore, we...
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 10656
Unique Words: 2895

##### #5. Summary of a Literature Review in Scalability of QoS-aware Service Composition
###### Leticia Duboc, Faisal Alrebeish, Vivek Nallur, Rami Bahasoon
This paper shows that authors have no consistent way to characterize the scalability of their solutions, and so consider only a limited number of scaling characteristics. This review aimed at establishing the evidence that the route for designing and evaluating the scalability of dynamic QoS-aware service composition mechanisms has been lacking systematic guidance, and has been informed by a very limited set of criteria. To this end, we analyzed 47 papers, from 2004 to 2018.
###### Tweets
ComputerPapers: Summary of a Literature Review in Scalability of QoS-aware Service Composition. https://t.co/HYpZyI8oaw
###### Other stats
Sample Sizes : None.
Authors: 4
Total Words: 2545
Unique Words: 1045

##### #6. Moving Beyond the Mean: Analyzing Variance in Software Engineering Experiments
###### Adrian Santos, Markku Oivo, Natalia Juristo
Software Engineering (SE) experiments are traditionally analyzed with statistical tests (e.g., $t$-tests, ANOVAs, etc.) that assume equally spread data across treatments (i.e., the homogeneity of variances assumption). Differences across treatments' variances in SE are not seen as an opportunity to gain insights on technology performance, but instead, as a hindrance to analyzing the data. We have studied the role of variance in mature experimental disciplines such as medicine. We illustrate the extent to which variance may inform on technology performance by means of simulation. We analyze a real-life industrial experiment on Test-Driven Development (TDD) where variance may impact technology desirability. Evaluating the performance of technologies just based on means (as traditionally done in SE) may be misleading. Technologies that make developers resemble each other more (i.e., technologies with smaller variances) may be more suitable if the aim is minimizing the risk of adopting them in real practice.
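The point about means being misleading can be made concrete with a small simulation. The sketch below is not taken from the paper: the two treatments, their parameters, the score scale, and the threshold of 40 are all hypothetical. It draws two samples with the same true mean but different spreads and compares how often each falls below the threshold:

```python
import random
import statistics


def simulate(mean, sd, n, seed):
    """Draw n normally distributed scores with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def share_below(scores, threshold):
    """Fraction of scores that fall strictly below the threshold."""
    return sum(s < threshold for s in scores) / len(scores)


# Hypothetical developer scores: same true mean, very different spread.
treatment_a = simulate(mean=50.0, sd=2.0, n=1000, seed=1)
treatment_b = simulate(mean=50.0, sd=15.0, n=1000, seed=2)

for name, scores in (("A", treatment_a), ("B", treatment_b)):
    print(
        name,
        round(statistics.mean(scores), 2),
        round(statistics.stdev(scores), 2),
        share_below(scores, 40.0),
    )
```

Under these assumptions a comparison of means alone would rate the two treatments as roughly equivalent, while the wider spread of treatment B puts a much larger share of simulated developers below the threshold, which is the kind of adoption risk the abstract refers to.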
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 6062
Unique Words: 1605

##### #7. FuzzerGym: A Competitive Framework for Fuzzing and Learning
###### William Drozd, Michael D. Wagner
Fuzzing is a commonly used technique designed to test software by automatically crafting program inputs. Currently, the most successful fuzzing algorithms emphasize simple, low-overhead strategies with the ability to efficiently monitor program state during execution. Through compile-time instrumentation, these approaches have access to numerous aspects of program state including coverage, data flow, and heterogeneous fault detection and classification. However, existing approaches utilize blind random mutation strategies when generating test inputs. We present a different approach that uses this state information to optimize mutation operators using reinforcement learning (RL). By integrating OpenAI Gym with libFuzzer we are able to simultaneously leverage advancements in reinforcement learning as well as fuzzing to achieve deeper coverage across several varied benchmarks. Our technique connects the rich, efficient program monitors provided by LLVM Sanitizers with a deep neural net to learn mutation selection strategies directly...
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 9355
Unique Words: 2925

##### #8. Reduction of Redundant Rules in Association Rule Mining-Based Bug Assignment
###### Meera Sharma, Abhishek Tandon, Madhu Kumari, V B Singh
Bug triaging is the process of deciding what to do with incoming bug reports. In this paper, we have mined association rules for the prediction of the assignee of a newly reported bug using different bug attributes, namely, severity, priority, component and operating system. To deal with the problem of large data sets, we have taken subsets of the data set by dividing it using the K-means clustering algorithm. We have used the Apriori algorithm in MATLAB to generate association rules. We have extracted the association rules for the top 5 assignees in each cluster. The proposed method has been empirically validated on 14,696 bug reports from Mozilla open-source software projects, namely SeaMonkey, Firefox and Bugzilla. The proposed method provides an improvement over the existing techniques for the bug assignment problem.
###### Other stats
Sample Sizes : None.
Authors: 4
Total Words: 6219
Unique Words: 1776

##### #9. Automating Requirements Traceability: Two Decades of Learning from KDD
###### Alex Dekhtyar, Jane Huffman Hayes
This paper summarizes our experience with using Knowledge Discovery in Data (KDD) methodology for automated requirements tracing, and discusses our insights.
###### Tweets
ComputerPapers: Automating Requirements Traceability: Two Decades of Learning from KDD. https://t.co/y0FFmicQJs
###### Other stats
Sample Sizes : None.
Authors: 2
Total Words: 2832
Unique Words: 1265

##### #10. Lemma Functions for Frama-C: C Programs as Proofs
###### Grigoriy Volkov, Mikhail Mandrykin, Denis Efremov
This paper describes the development of an auto-active verification technique in the Frama-C framework. We outline the lemma functions method and present the corresponding ACSL extension, its implementation in Frama-C, and evaluation on a set of string-manipulating functions from the Linux kernel. We illustrate the benefits our approach can bring concerning the effort required to prove lemmas, compared to the approach based on interactive provers such as Coq. Current limitations of the method and its implementation are discussed.
###### Tweets
ComputerPapers: Lemma Functions for Frama-C: C Programs as Proofs. https://t.co/wM3xxDSdTW
arxiv_cslo: Lemma Functions for Frama-C: C Programs as Proofs https://t.co/H1JNyNgEf8
###### Other stats
Sample Sizes : None.
Authors: 3
Total Words: 6115
Unique Words: 1904

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on Twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 72,893 papers.
