Commit d0ab250

Merge branch 'splc23-views' into develop
2 parents d802fb9 + 3475a21 commit d0ab250

195 files changed

Lines changed: 3888 additions & 901 deletions

Dockerfile

Lines changed: 0 additions & 75 deletions
This file was deleted.

README.md

Lines changed: 52 additions & 119 deletions
@@ -1,125 +1,58 @@
-<img padding="10" align="right" src="https://www.acm.org/binaries/content/gallery/acm/publications/artifact-review-v1_1-badges/artifacts_evaluated_reusable_v1_1.png" alt="ACM Artifacts Evaluated Reusable" width="114" height="113"/>
 
 ![Maven](https://github.com/VariantSync/DiffDetective/actions/workflows/maven.yml/badge.svg)
 [![Documentation](https://img.shields.io/badge/Documentation-Read-purple)][documentation]
-[![Install](https://img.shields.io/badge/Install-Instructions-blue)](INSTALL.md)
 [![GitHubPages](https://img.shields.io/badge/GitHub%20Pages-online-blue.svg?style=flat)][website]
 [![License](https://img.shields.io/badge/License-GNU%20LGPLv3-blue)](LICENSE.LGPL3)
-[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7110095.svg)](https://doi.org/10.5281/zenodo.7110095)
-
-# Classifying Edits to Variability in Source Code
-
-This is the replication package for our paper _Classifying Edits to Variability in Source Code_ accepted at the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022).
-
-This replication package consists of four parts:
-
-1. **DiffDetective**: For our validation, we built _DiffDetective_, a java library and command-line tool to classify edits to variability in git histories of preprocessor-based software product lines.
-2. **Appendix**: The appendix of our paper is given in PDF format in the file [appendix.pdf][appendix].
-3. **Haskell Formalization**: We provide an extended formalization in the Haskell programming language as described in our appendix. Its implementation can be found in the Haskell project in the [proofs](proofs) directory.
-4. **Dataset Overview**: We provide an overview of the 44 inspected datasets with updated links to their repositories in the file [docs/datasets/all.md][dataset].
-
-## 1. DiffDetective
-DiffDetective is a java library and command-line tool to parse and classify edits to variability in git histories of preprocessor-based software product lines by creating [variation diffs][variationdiff_class] and operating on them.
-
-We offer a [Docker](https://www.docker.com/) setup to easily __replicate__ the validation performed in our paper.
-In the following, we provide a quickstart guide for running the replication.
-You can find detailed information on how to install Docker and build the container in the [INSTALL](INSTALL.md) file, including detailed descriptions of each step and troubleshooting advice.
-
-### 1.1 Build the Docker container
-Start the docker deamon.
-Clone this repository.
-Open a terminal and navigate to the root directory of this repository.
-To build the Docker container you can run the `build` script corresponding to your operating system.
-#### Windows:
-`.\build.bat`
-#### Linux/Mac (bash):
-`./build.sh`
-
-### 1.2 Start the replication
-To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument.
-
-#### Windows:
-`.\execute.bat replication`
-#### Linux/Mac (bash):
-`./execute.sh replication`
-
-> WARNING!
-> The replication will at least require an hour and might require up to a day depending on your system.
-> Therefore, we offer a short verification (5-10 minutes) which runs DiffDetective on only four of the datasets.
-> You can run it by providing "verification" as argument instead of "replication" (i.e., `.\execute.bat verification`, `./execute.sh verification`).
-> If you want to stop the execution, you can call the provided script for stopping the container in a separate terminal.
-> When restarted, the execution will continue processing by restarting at the last unfinished repository.
-> #### Windows:
-> `.\stop-execution.bat`
-> #### Linux/Mac (bash):
-> `./stop-execution.sh`
-
-You might see warnings or errors reported from SLF4J like `Failed to load class "org.slf4j.impl.StaticLoggerBinder"` which you can safely ignore.
-Further troubleshooting advice can be found at the bottom of the [Install](INSTALL.md) file.
-
-### 1.3 View the results in the [results][resultsdir] directory
-All raw results are stored in the [results][resultsdir] directory.
-The aggregated results can be found in the following files.
-(Note that the links below only have a target _after_ running the replication or verification.)
-- [speed statistics][resultsdir_speed_statistics]: contains information about the total runtime, median runtime, mean runtime, and more.
-- [classification results][resultsdir_classification_results]: contains information about how often each class was found, and more.
-
-Moreover, the results comprise the (LaTeX) tables that are part of our paper and appendix.
-
-### Documentation
-
-DiffDetective is documented with javadoc. The documentation can be accessed on this [website][documentation]. Notable classes of our library are:
-- [VariationDiff](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/VariationDiff.html) and [DiffNode](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/DiffNode.html) implement variation diffs from our paper. A variation diff is represented by an instance of the `VariationDiff` class. It stores the root node of the diff and offers various methods to parse, traverse, and analyze variation diffs. `DiffNode`s represent individual nodes within a variation diff.
-- [EditClassValidation](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/validation/EditClassValidation.html) contains the main method for our validation.
-- [ProposedEditClasses](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/editclass/proposed/ProposedEditClasses.html) holds the catalog of the nine edit classes we proposed in our paper. It implements the interface [EditClassCatalogue](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/editclass/EditClassCatalogue.html), which allows to define custom edit classifications.
-- [BooleanAbstraction](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/feature/BooleanAbstraction.html) contains data and methods for boolean abstraction of higher-order logic formulas. We use this for macro parsing.
-- [GitDiffer](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/diff/GitDiffer.html) may parse the history of a git repository to variation diffs.
-- The [datasets](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/datasets/package-summary.html) package contains various classes for describing and loading datasets.
-
-## 2. Appendix
-
-Our [appendix][appendix] consists of:
-1. An extended formalization of our concepts in the [Haskell][haskell] programming language. The corresponding source code is also part of this replication package (see below).
-2. The proofs for (a) the completeness of variation diffs to represent edits to variation trees, and (b) the completeness and unambiguity of our edit classes.
-3. An inspection of edit patterns from related work to show that existing patterns are either composite patterns built from our edit classes or similar to one of our edit classes. The used diffs of these patterns can also be found in [docs/compositepatterns](docs/compositepatterns).
-4. The complete results of our validation for all 44 datasets.
-
-## 3. Haskell Formalization
-The extended formalization is a [Haskell][haskell] library in the [`proofs`](proofs) subdirectory.
-Since the `proofs` library is its own software project, we provide a separate documentation of requirements and installation instructions within the projects subdirectory.
-Requirements and instructions for setting up the build environment (Stack) are given in [proofs/REQUIREMENTS.md](proofs/REQUIREMENTS.md).
-How to build our library and how to run the example is described in the [proofs/INSTALL.md](proofs/INSTALL.md).
-
-
-## 4. Dataset Overview
-### 4.1 Open-Source Repositories
-We provide an overview of the used 44 open-source preprocessor-based software product lines in the [docs/datasets/all.md][dataset] file.
-As described in our paper in Section 5.1, this list contains all systems that were studied by Liebig et al., extended by four new subject systems (Busybox, Marlin, LibSSH, Godot).
-We provide updated links for each system's repository.
-
-### 4.2 Forked Repositories for Replication
-To guarantee the exact replication of our validation, we created forks of all 44 open-source repositories at the state we performed the validation for our paper.
-The forked repositories are listed in the [replication datasets](docs/datasets/esecfse22-replication.md.md) and are located at the Github user profile [DiffDetective](https://github.com/DiffDetective?tab=repositories).
-These repositories are used when running the replication as described under `1.2` and in the [INSTALL](INSTALL.md).
-
-## 5. Running DiffDetective on Custom Datasets
-You can also run DiffDetective on other datasets by providing the path to the dataset file as first argument to the execution script:
-
-#### Windows:
-`.\execute.bat path\to\custom\dataset.md`
-#### Linux/Mac (bash):
-`./execute.sh path/to/custom/dataset.md`
-
-The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](docs/datasets/all.md) in the [docs/datasets](docs/datasets) folder.
-
-[variationdiff_class]: https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/VariationDiff.html
-[haskell]: https://www.haskell.org/
-[dataset]: docs/datasets/all.md
-[appendix]: appendix.pdf
-
-[documentation]: https://variantsync.github.io/DiffDetective/docs/javadoc/
-[website]: https://variantsync.github.io/DiffDetective/
 
-[resultsdir]: results
-[resultsdir_classification_results]: results/validation/current/ultimateresult.metadata.txt
-[resultsdir_speed_statistics]: results/validation/current/speedstatistics.txt
+# DiffDetective - Analysing Edits to Preprocessor-Based Variability
+
+DiffDetective is a research software to study the evolution of configurable and variational software projects, also known as software product lines.
+
+DiffDetective reads the Git history of a C-preprocessor-based software product line to analyze patches in terms of _variation diffs_ [1].
+A variation diff is a variability-aware diff that depicts changes to source code as well as to variability annotations (e.g., C-preprocessor macros such as `#if` and `#ifdef`).
+
+![DiffDetectiveTeaser](docs/teaser.png)
+
+This figure outlines the parsing process within DiffDetective.
+Given two states of a C-preprocessor annotated source code file (left), for example before and after a commit, DiffDetective constructs a variation diff (right) that describes the differences of the code as well as the involved variability.
+DiffDetective can construct a variation diff either from a text-based diff between both file versions (center path),
+or by first parsing both versions to an abstract representation, a variation tree (center top and bottom), and constructing a variation diff using a tree matching algorithm in a second step.
+
+## Publications
+
+### [2] Views on Edits to Variational Software (SPLC 2023)
+
+[![Replication Package](https://img.shields.io/badge/Replication-Package-blue)](replication/splc23-views/README.md)
+[![Artifact DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.8027920.svg)](https://doi.org/10.5281/zenodo.8027920)
+
+> P. M. Bittner, A. Schultheiß, S. Greiner, B. Moosherr, S. Krieter, C. Tinnes, T. Kehrer, T. Thüm. _Views on Edits to Variational Software_. Conditionally Accepted at the 27th ACM International Systems and Software Product Line Conference (SPLC 2023)
+
+In this work, we used DiffDetective for a feasibility study of creating views on edits to C-preprocessor based software.
+The idea of a view is to act as a filter on relevant parts of a system.
+For instance, a piece of source code may be deemed relevant if it implements a certain feature.
+
+Views on edits extend views to software changes.
+A view on an edit thus is a simplified form of an edit that, for example, contains only changes to a certain feature.
+We implemented views on edits for variational systems in terms of views on variation diffs.
+
+Our replication package and further information can be found in the [README](replication/splc23-views/README.md) file in the respective directory (`replication/splc23-views`).
+
+### [1] Classifying Edits to Variability in Source Code (ESEC/FSE 2022)
+
+[![Preprint](https://img.shields.io/badge/Preprint-Read-purple)](https://github.com/SoftVarE-Group/Papers/raw/main/2022/2022-ESECFSE-Bittner.pdf)
+[![Paper](https://img.shields.io/badge/Paper-Read-purple)](https://dl.acm.org/doi/10.1145/3540250.3549108)
+[![Talk](https://img.shields.io/badge/Talk-Watch-purple)](https://www.youtube.com/watch?v=EnDx1AWxD24)
+[![Replication Package](https://img.shields.io/badge/Replication-Package-blue)](replication/esecfse22/README.md)
+[![Artifact DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7110095.svg)](https://doi.org/10.5281/zenodo.7110095)
+
+> P. M. Bittner, C.Tinnes, A. Schultheiß, S. Viegener, T. Kehrer, T. Thüm. _Classifying Edits to Variability in Source Code_. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022), ACM, New York, NY, November 2022
+
+<img padding="10" align="right" src="https://www.acm.org/binaries/content/gallery/acm/publications/artifact-review-v1_1-badges/artifacts_evaluated_reusable_v1_1.png" alt="ACM Artifacts Evaluated Reusable" width="114" height="113"/>
+
+In this work, we used DiffDetective to classify the effect of edits on the variability of the edited source code in the change histories of 44 open-source C-preprocessor-based software projects.
+
+Our replication package and further information can be found in the [README](replication/esecfse22/README.md) file in the respective directory (`replication/esecfse22`).
+
+
+[documentation]: https://htmlpreview.github.io/?https://github.com/VariantSync/DiffDetective/blob/splc23-views/docs/javadoc/index.html
+[website]: https://variantsync.github.io/DiffDetective/

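The README text added in this commit describes a variation diff as a diff that covers both source code and the preprocessor annotations around it, and a view as a filter that keeps only the changes relevant to one feature. The following Python sketch is a rough, line-level illustration of those two ideas. It is not DiffDetective's implementation (DiffDetective is a Java library that works on tree-shaped variation diffs). The names `annotate`, `line_diff`, and `view` are hypothetical, and the sketch assumes that only `#if`, `#ifdef`, `#ifndef`, and `#endif` occur (no `#else` or `#elif`).

```python
import difflib
import re

# Assumption: only #if/#ifdef/#ifndef/#endif are recognized; #else/#elif are not handled.
IF_RE = re.compile(r"^\s*#\s*(ifdef|ifndef|if)\b\s*(.*)$")
ENDIF_RE = re.compile(r"^\s*#\s*endif\b")


def annotate(lines):
    """Pair every line with the tuple of enclosing conditions, outermost first."""
    stack, out = [], []
    for line in lines:
        opener = IF_RE.match(line)
        if opener:
            # The annotation line itself sits outside the condition it opens.
            out.append((line, tuple(stack)))
            kind, cond = opener.group(1), opener.group(2).strip()
            stack.append(f"!({cond})" if kind == "ifndef" else cond)
        elif ENDIF_RE.match(line):
            if stack:
                stack.pop()
            out.append((line, tuple(stack)))
        else:
            out.append((line, tuple(stack)))
    return out


def line_diff(before, after):
    """Return (tag, line, conditions_before, conditions_after) records.

    tag is "-" (removed), "+" (added), or " " (textually unchanged).
    A missing side is represented by None.
    """
    a, b = annotate(before), annotate(after)
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            for (line, cond_a), (_, cond_b) in zip(a[i1:i2], b[j1:j2]):
                result.append((" ", line, cond_a, cond_b))
        else:  # "delete", "insert", or "replace"
            for line, cond_a in a[i1:i2]:
                result.append(("-", line, cond_a, None))
            for line, cond_b in b[j1:j2]:
                result.append(("+", line, None, cond_b))
    return result


def view(diff, feature):
    """Keep only records whose before- or after-conditions mention the feature."""
    pattern = re.compile(rf"\b{re.escape(feature)}\b")

    def mentions(conds):
        return conds is not None and any(pattern.search(c) for c in conds)

    return [rec for rec in diff if mentions(rec[2]) or mentions(rec[3])]


if __name__ == "__main__":
    before = ["int x = 0;", "foo();"]
    after = ["int x = 0;", "#ifdef FEATURE_A", "foo();", "#endif"]
    for record in line_diff(before, after):
        print(record)
    print(view(line_diff(before, after), "FEATURE_A"))
```

In this example `foo();` is textually unchanged, yet its enclosing condition changes from none to `FEATURE_A`. A plain text diff would report only the two added annotation lines, whereas the records above also expose the change in variability, and the view on `FEATURE_A` retains exactly that record.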
appendix/appendix-splc23-views.pdf

460 KB
Binary file not shown.

build.bat

Lines changed: 0 additions & 2 deletions
This file was deleted.

build.sh

Lines changed: 0 additions & 2 deletions
This file was deleted.

default.nix

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ pkgs.stdenv.mkDerivation rec {
 dontConfigure = true;
 outputHashAlgo = "sha256";
 outputHashMode = "recursive";
-outputHash = "sha256-EdDXq537nTeCEY25RE5mtoFdwhWfuY3Shzz2+DA+u7M=";
+outputHash = "sha256-gmbyhqgMMZxt3+7ov/Zgm1EGdZBhn4WfAj8yphhg2CA=";
 };
 
 buildPhase = ''

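The only change to `default.nix` is the `outputHash` value. Both the old and the new value are written as `sha256-` followed by a base64-encoded digest. The Python sketch below shows how a string of that shape decodes to a 32-byte SHA-256 digest and how one is built from raw bytes. The helper names `to_sri` and `sri_digest` are hypothetical. The sketch hashes a plain byte string only; with `outputHashMode = "recursive"`, Nix hashes a serialization of the whole output directory, which this example does not reproduce.

```python
import base64
import hashlib


def to_sri(data: bytes) -> str:
    """Format the SHA-256 digest of data as 'sha256-<base64>'."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def sri_digest(sri: str) -> bytes:
    """Decode a 'sha256-<base64>' string back into its raw 32-byte digest."""
    algo, _, encoded = sri.partition("-")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo!r}")
    digest = base64.b64decode(encoded, validate=True)
    if len(digest) != hashlib.sha256().digest_size:
        raise ValueError("digest has the wrong length for SHA-256")
    return digest


if __name__ == "__main__":
    # The new hash from the diff decodes to a digest of the expected size.
    new_hash = "sha256-gmbyhqgMMZxt3+7ov/Zgm1EGdZBhn4WfAj8yphhg2CA="
    print(len(sri_digest(new_hash)), sri_digest(new_hash).hex())
    print(to_sri(b"example bytes"))
```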
docker/DOCKER.md

Lines changed: 0 additions & 3 deletions
@@ -8,6 +8,3 @@ To fix permission issues that occur in Docker environments under Linux, two file
 These files should remain unaltered and are automatically copied to the Docker container.
 
 > Make sure to also set the required [.gitattributes](../.gitattributes) in your replication package. They are required to assure that the Docker container can be executed correctly under Windows.
-
-## Execution
-The [`execute.sh`](execute.sh) script can be adjusted to run the program that should be executed by the Docker container.

docker/entrypoint.sh

Lines changed: 0 additions & 1 deletion
@@ -1,5 +1,4 @@
 #!/bin/sh
-ls -l
 if [ "$(id -u)" = "0" ]; then
 # running on a developer laptop as root
 fix-perms -r -u user -g user /home/user

docs/datasets/emacs.md

Lines changed: 3 additions & 3 deletions
@@ -1,3 +1,3 @@
-Project name | Domain | Source code available (**y**es/**n**o)? | Is it a git repository (**y**es/**n**o)? | Repository URL | Clone URL | Estimated number of commits
----|-------------------------|-----------------------------------------|-----------------------------------|--------------------------------------------------------------|------------------------------------------------------------------|---
-emacs | text editor | y | y | https://github.com/emacs-mirror/emacs | https://github.com/emacs-mirror/emacs.git | 153,926
+Project name | Domain | Source code available (**y**es/**n**o)? | Is it a git repository (**y**es/**n**o)? | Repository URL | Clone URL | Estimated number of commits
+---|-------------------------|----------------------------------------|---------------------------------|-------------------------------------------------------------|------------------------------------------------------------------|---
+emacs | text editor | y | y | https://github.com/emacs-mirror/emacs | https://github.com/DiffDetective/emacs.git | 153,926

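The dataset file changed above lists repositories in a pipe-separated Markdown table, and the commit switches the emacs clone URL to the DiffDetective fork. As a minimal sketch of how such a table could be read, the following Python snippet parses the rows into dictionaries keyed by the header cells. The function `parse_dataset_table` is hypothetical and is not DiffDetective's own dataset loader (which is part of its Java code base). The sketch assumes a single header row followed by a separator row made of dashes.

```python
def parse_dataset_table(text):
    """Parse a pipe-separated Markdown table into a list of dicts (one per data row)."""

    def cells(line):
        # Tolerate optional leading/trailing pipes and surrounding whitespace.
        return [cell.strip() for cell in line.strip().strip("|").split("|")]

    def is_separator(line):
        # A separator row contains only dashes and optional alignment colons.
        return all(cell and set(cell) <= set("-:") for cell in cells(line))

    rows = [line for line in text.splitlines() if line.strip()]
    if not rows:
        return []
    header = cells(rows[0])
    return [
        dict(zip(header, cells(line)))
        for line in rows[1:]
        if not is_separator(line)
    ]


if __name__ == "__main__":
    table = """\
Project name | Domain | Source code available (**y**es/**n**o)? | Is it a git repository (**y**es/**n**o)? | Repository URL | Clone URL | Estimated number of commits
---|---|---|---|---|---|---
emacs | text editor | y | y | https://github.com/emacs-mirror/emacs | https://github.com/DiffDetective/emacs.git | 153,926
"""
    for dataset in parse_dataset_table(table):
        commits = int(dataset["Estimated number of commits"].replace(",", ""))
        print(dataset["Project name"], dataset["Clone URL"], commits)
```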