Commit c9d0e5d

Merge branch 'main' of github.com:VariantSync/DiffDetective into main
2 parents 6a99a9a + c3f3848 commit c9d0e5d

59 files changed

Lines changed: 737 additions & 61 deletions

.gitattributes

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+# Set the default behavior, in case people don't have core.autocrlf set.
+* text=auto
+
+# Explicitly declare text files you want to always be normalized and converted
+# to native line endings on checkout.
+*.sh text eol=lf
+*.bat text eol=crlf
+
+# Denote all files that are truly binary and should not be modified.
+*.png binary
+*.jpg binary

.gitignore

Lines changed: 1 addition & 2 deletions
@@ -121,12 +121,11 @@ patches/*
 /input
 /error
 examples/
-results/
 
 /log.txt
-/local-maven-repo
 
 ### Eclipse ###
 .settings/
 .classpath
 .project
+plotting/__pycache__

Dockerfile

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+# syntax=docker/dockerfile:1
+# ----------------------------
+# This template sets up a Docker container using a multi-stage build and Alpine Linux.
+# Alpine is a Linux-based OS for embedded systems, making it one of the smallest Docker images. It should be used if possible.
+# The multi-stage build consists of two stages: The compile stage, and the environment preparation stage.
+#
+# The compile stage only installs the packages required to compile the source files of the prototype. The generated binaries
+# are then copied to the environment during the environment preparation stage.
+#
+# The environment preparation stage is responsible for installing all dependencies and copying all files that are required
+# to execute the Docker container.
+#
+# This template contains the commands that are required to compile and execute a Java prototype.
+#
+# Lines commented as `REQUIRED` should only be altered/removed if you know what you are doing.
+# Lines commented as `EXAMPLE` contain commands that probably have to be adjusted
+# ----------------------------
+
+
+FROM alpine:3.15
+# PACKAGE STAGE
+
+# EXAMPLE: Prepare the compile environment. JDK is automatically installed
+RUN apk add maven
+
+# REQUIRED: Create and navigate to a working directory
+WORKDIR /home/user
+
+COPY local-maven-repo ./local-maven-repo
+
+# EXAMPLE: Copy the source code
+COPY src ./src
+# EXAMPLE: Copy the pom.xml if Maven is used
+COPY pom.xml .
+# EXAMPLE: Execute the maven package process
+RUN mvn package || exit
+
+FROM alpine:3.15
+
+# Create a user
+RUN adduser --disabled-password --home /home/sherlock --gecos '' sherlock
+
+RUN apk add --no-cache --upgrade bash
+RUN apk add --update openjdk17
+
+# REQUIRED: Change into the home directory
+WORKDIR /home/sherlock
+
+# Copy the compiled JAR file from the first stage into the second stage
+# Syntax: COPY --from=STAGE_ID SOURCE_PATH TARGET_PATH
+WORKDIR /home/sherlock/holmes
+COPY --from=0 /home/user/target/DiffDetectiveRunner.jar .
+WORKDIR /home/sherlock
+
+# Copy the setup
+COPY docs holmes/docs
+
+# Copy the docker resources
+COPY docker/* ./
+RUN mkdir DiffDetectiveMining
+
+# Adjust permissions
+RUN chown sherlock:sherlock /home/sherlock -R
+RUN chmod +x execute.sh
+RUN chmod +x entrypoint.sh
+RUN chmod +x fix-perms.sh
+
+# EXAMPLE: List the content in the work dir
+RUN ls -l
+
+# REQUIRED: Set the entrypoint
+ENTRYPOINT ["./entrypoint.sh", "./execute.sh"]
+
+# REQUIRED: Set the user
+USER sherlock

INSTALL.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
+# Installation
+## Installation Instructions
+In the following, we describe how to build the Docker image and run the experiments in Docker containers.
+
+### Install Docker (if required)
+How to install Docker depends on your operating system.
+
+#### Windows or Mac
+You can find download and installation instructions [here](https://www.docker.com/get-started).
+
+#### Linux Distributions
+How to install Docker on your system depends on your distribution. However, chances are high that Docker is part of your distribution's package database.
+Docker's [documentation](https://docs.docker.com/engine/install/) contains instructions for common distributions.
+
+### Open a Suitable Terminal
+```
+# Windows Command Prompt:
+- Press 'Windows Key + R' on your keyboard
+- Type in 'cmd'
+- Click 'OK' or press 'Enter' on your keyboard
+
+# Windows PowerShell:
+- Open the search bar (Default: 'Windows Key') and search for 'PowerShell'
+- Start the PowerShell
+
+# Linux:
+- Press 'ctrl + alt + T' on your keyboard
+```
+
+### Build the Docker Container
+To build the Docker container, you can run the build script corresponding to your OS:
+```
+# Windows:
+.\build.bat
+# Linux/Mac (bash):
+./build.sh
+```
+
+## Validation & Expected Output
+
+### Running the Validation
+To run the validation, you can run the script corresponding to your OS with `validation` as the first argument. The validation should take about 10-20 minutes depending on your hardware.
+```
+# Windows:
+.\execute.bat validation
+# Linux/Mac (bash):
+./execute.sh validation
+```
+The results of the validation will be stored in the [results](results) directory.
+
+### Expected Output of the Validation
+The aggregated results of the validation can be found in the following files.
+
+- The [speed statistics](results/difftrees/speedstatistics.txt) contain information about the total runtime, median runtime, mean runtime, and more:
+```
+#Commits: 14527
+Total commit process time is: 12.427866666666667min
+Fastest commit process time is: df4a1fa9c5cc5d54a9347a2bf4843cae87a942f1___xorg-server___0ms
+Slowest commit process time is: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___14578ms
+Median commit process time is: 6dc71f6b2c7ff49adb504426b4cd206e4745e1e3___xorg-server___19ms
+Average commit process time is: 51.330075032697735ms
+```
+- The [classification results](results/difftrees/ultimateresult.metadata.txt) contain information about how often each pattern was found, and more.
+```
+repository: <NONE>
+total commits: 18046
+filtered commits: 593
+failed commits: 0
+empty commits: 2926
+processed commits: 14527
+tree diffs: 55008
+fastestCommit: df4a1fa9c5cc5d54a9347a2bf4843cae87a942f1___xorg-server___0ms
+slowestCommit: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___14578ms
+runtime in seconds: 747.5400000000001
+runtime with multithreading in seconds: 137.22
+treeformat: diff.difftree.serialize.treeformat.CommitDiffDiffTreeLabelFormat
+nodeformat: mining.formats.ReleaseMiningDiffNodeFormat
+edgeformat: mining.formats.DirectedEdgeLabelFormat with mining.formats.ReleaseMiningDiffNodeFormat
+analysis: mining.strategies.PatternValidation
+#NON nodes: 0
+#ADD nodes: 0
+#REM nodes: 0
+filtered because not (is not empty): 132
+AddToPC: { total = 260536; commits = 12703 }
+AddWithMapping: { total = 27720; commits = 1447 }
+RemFromPC: { total = 235017; commits = 11830 }
+RemWithMapping: { total = 15381; commits = 1361 }
+Specialization: { total = 4662; commits = 624 }
+Generalization: { total = 7397; commits = 564 }
+Reconfiguration: { total = 2231; commits = 258 }
+Refactoring: { total = 5769; commits = 921 }
+Untouched: { total = 0; commits = 0 }
+#Error[#else after #else]: 2
+#Error[#endif without #if]: 8
+#Error[#else or #elif without #if]: 9
+#Error[not all annotations closed]: 6
+```
+
+(Note that the above links only have a target after running the validation.)
+The processing times might deviate.
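
As a quick sanity check on the sample classification results, the commit counts appear to be consistent with each other: subtracting the filtered, failed, and empty commits from the total yields the number of processed commits. The sketch below assumes exactly that relationship. It matches the sample numbers, but the documentation does not state it, and `processed_commits` is a hypothetical helper that is not part of DiffDetective.

```shell
#!/bin/bash
# Counts copied from the sample ultimateresult.metadata.txt output.
total=18046
filtered=593
failed=0
empty=2926

# Hypothetical helper (assumption, not part of DiffDetective):
# commits that remain after removing filtered, failed, and empty ones.
processed_commits() {
    echo $(( $1 - $2 - $3 - $4 ))
}

processed_commits "$total" "$filtered" "$failed" "$empty"   # prints 14527
```

If your own run reports counts that do not add up this way, the assumption above may simply not hold for your configuration.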
+
+## Troubleshooting
+
+### 'Got permission denied while trying to connect to the Docker daemon socket'
+`Problem:` This is a common problem under Linux if the user trying to execute Docker commands does not have the permissions to do so.
+
+`Fix:` You can fix this problem by either following the [post-installation instructions](https://docs.docker.com/engine/install/linux-postinstall/), or by executing the scripts in the replication package with elevated permissions (i.e., `sudo`).
+
+### 'Unable to find image 'replication-package:latest' locally'
+`Problem:` The Docker container could not be found. This either means that the name of the container that was built does not fit the name of the container that is being executed (this only happens if you changed the provided scripts), or that the Docker container was not built yet.
+
+`Fix:` Follow the instructions described above in the section `Build the Docker Container`.
+
+### No results after validation, or 'cannot create directory '../results/difftrees': Permission denied'
+`Problem:` This problem can occur due to how permissions are managed inside the Docker container. More specifically, it appears if Docker is executed with elevated permissions (i.e., `sudo`) and there is no [results](results) directory because it was deleted manually. In this case, Docker will create the directory with elevated permissions, and the Docker user has no permissions to access the directory.
+
+`Fix:` If there is a _results_ directory, delete it with elevated permissions (e.g., `sudo rm -r results`).
+Then, create a new _results_ directory without elevated permissions, or execute `git restore .` to restore the deleted directory.
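
The check behind this last fix can be sketched in a few lines of bash. This is only an illustration under the assumption that the problem shows up as an existing but non-writable _results_ directory; `prepare_results_dir` is a hypothetical helper and is not shipped with the replication package.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not part of the replication package):
# make sure DIR exists and is writable by the current user.
prepare_results_dir() {
    local dir="$1"
    if [ -d "$dir" ] && [ ! -w "$dir" ]; then
        # Typically the case when Docker created the directory as root.
        echo "'$dir' is not writable; remove it with: sudo rm -r $dir" >&2
        return 1
    fi
    # Create the directory as the current (non-elevated) user if it is missing.
    mkdir -p "$dir"
}
```

A possible use is `prepare_results_dir results && ./execute.sh validation`, so that the validation only starts once the directory is usable.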

README.md

Lines changed: 66 additions & 42 deletions
@@ -1,42 +1,66 @@
-# DiffDetective
-
-[![Thesis](https://img.shields.io/badge/Thesis-Read-blue)][thesis]
-
-This is the tool accompanying the bachelor's thesis [**Empirical Evaluation of Feature Trace Recording on the Edit History of Marlin**][thesis] by Sören Viegener.
-(The version of DiffDetective described in and submitted with the thesis can be found on branch `thesis-sv`).
-
-DiffDetective is a library to analyse the evolution of variability in source code in preprocessor-based software product lines.
-It serves two main purposes:
-1. DiffDetective parses diffs on preprocessor annotated source code to so called `DiffTrees`. For example, the following diff in which
-the annotation `DEBUG` gets removed and an `else` case is added
-```diff
-#if A
-x = 0;
-- #if DEBUG
-print(x);
-- #endif
-+ #else
-+ x = 1;
-#endif
-```
-can be parsed to the following graph structure to analyse the diff:
-
-![difftreeshowcase](docs/showcase/examplediff.png)
-
-3. DiffDetective takes a preprocessor-based software product line repository as input and matches edit patterns in its commit history.
-It can detect an extensible variety of different edit patterns and reverse engineers feature contexts known from feature trace recording.
-The output of the tool consists of all pattern matches found and different metrics relevant for the evaluation of feature trace recording.
-
-## Related Work
-
-**Feature Trace Recording**.
-Paul Maximilian Bittner, Alexander Schultheiß, Thomas Thüm, Timo Kehrer, Jeffrey Young, and Lukas Linsbauer.
-*ESEC/FSE'21. ACM, New York, NY, USA. August 2021*:
-https://pmbittner.github.io/FeatureTraceRecording/
-
-### Edit Patterns
-**Concepts, Operations, and Feasibility of a Projection-Based Variation Control System**.
-Stefan Stănciulescu, Thorsten Berger, Eric Walkingshaw, and Andrzej Wąsowski.
-https://ieeexplore.ieee.org/document/7816478
-
-[thesis]: https://oparu.uni-ulm.de/xmlui/handle/123456789/38679
+# Classifying Edits to Variability in Source Code
+
+This is the replication package for our submission _Classifying Edits to Variability in Source Code_ submitted to the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) in March 2022.
+
+This replication package consists of four parts:
+
+1. **Appendix**: The appendix of our paper is given in PDF format in the file [appendix.pdf](appendix.pdf).
+2. **DiffDetective**: For our validation, we built DiffDetective, a Java library and command-line tool to classify edits to variability in Git histories of preprocessor-based software product lines.
+3. **Haskell Formalization**: We provide an extended formalization in the Haskell programming language as described in our appendix. Its implementation can be found in the Haskell project in the [proofs](proofs) directory.
+4. **Dataset Overview**: We provide an overview of the 44 inspected datasets with updated links to their repositories in the file [docs/datasets.md](docs/datasets.md).
+
+## Appendix
+
+Our appendix consists of:
+1. An extended formalization of our concepts in the [Haskell][haskell] programming language. The corresponding source code is also part of this replication package (see below).
+2. The proofs for (a) the completeness of variation tree diffs to represent edits to variation trees, and (b) the completeness and unambiguity of our elementary edit patterns.
+3. An inspection of edit patterns from related work to show that existing patterns are either composite patterns built from our elementary patterns or similar to our elementary patterns.
+4. The complete results of our validation for all 44 datasets.
+
+## DiffDetective
+We offer a [Docker](https://www.docker.com/) setup to easily __replicate__ our validation with _DiffDetective_.
+You can find detailed information on how to install Docker and build the container in the [INSTALL](INSTALL.md) file.
+In the following, we provide instructions for running the replication.
+
+### 1. Build the Docker container
+To build the Docker container, you can run the _build_ script corresponding to your OS.
+#### Windows:
+`.\build.bat`
+#### Linux/Mac (bash):
+`./build.sh`
+
+### 2. Start the replication
+To execute the replication, you can run the _execute_ script corresponding to your OS with `replication` as the first argument.
+
+> ! The replication will require at least several hours and might require up to a few days depending on your system.
+> Therefore, we offer a short validation (5-10 minutes) which runs _DiffDetective_ on only four of the datasets.
+> You can run it by providing "validation" as argument instead of "replication" (i.e., ./execute.sh validation).
+> If you want to stop the replication, you can call the provided script for stopping the container. Note that you will have to restart the entire replication if you stop it at any point.
+> #### Windows:
+> `.\stop-execution.bat`
+> #### Linux/Mac (bash):
+> `./stop-execution.sh`
+
+#### Windows:
+`.\execute.bat replication`
+#### Linux/Mac (bash):
+`./execute.sh replication`
+
+
+
+### 3. View the results in the [results](results) directory
+All raw results are stored in the [results](results) directory. The aggregated results can be found in the following files:
+- [speed statistics](results/difftrees/speedstatistics.txt): contains information about the total runtime, median runtime, mean runtime, and more.
+- [classification results](results/difftrees/ultimateresult.metadata.txt): contains information about how often each pattern was found, and more.
+
+(Note that the above links only have a target _after_ running the replication.)
+
+Moreover, the results comprise the (LaTeX) tables that are part of our paper and appendix.
+
+## Haskell Formalization
+The extended formalization in Haskell is a library using the _Stack_ build system.
+
+Instructions for manually installing Stack are given in [proofs/REQUIREMENTS.md](proofs/REQUIREMENTS.md).
+How to build our library and how to run the example is described in [proofs/INSTALL.md](proofs/INSTALL.md).
+
+[haskell]: https://www.haskell.org/

appendix.pdf

2.16 MB
Binary file not shown.

build.bat

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+docker build -t replication-package .
+@pause

build.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+#! /bin/bash
+docker build -t replication-package .

deploy_libs.sh

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+#!/bin/bash
+
+# FeatureIDE
+mvn deploy:deploy-file -DgroupId=de.ovgu -DartifactId=featureide.lib.fm -Dversion=3.8.1 -Durl=file:./local-maven-repo/ -DrepositoryId=local-maven-repo -DupdateReleaseInfo=true -Dfile=./lib/de.ovgu.featureide.lib.fm-v3.8.1.jar
+
+# Functjonal
+mvn deploy:deploy-file -DgroupId=anonymized -DartifactId=Functjonal -Dversion=1.0-SNAPSHOT -Durl=file:./local-maven-repo/ -DrepositoryId=local-maven-repo -DupdateReleaseInfo=true -Dfile=./lib/Functjonal-1.0-SNAPSHOT.jar
+
+# sat4j
+mvn deploy:deploy-file -DgroupId=org.sat4j -DartifactId=core -Dversion=2.3.5 -Durl=file:./local-maven-repo/ -DrepositoryId=local-maven-repo -DupdateReleaseInfo=true -Dfile=./lib/org.sat4j.core.jar

docker/DOCKER.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# Docker Files
+
+This directory contains the files that are required to run the Docker container.
+
+## Permission Fix
+To fix permission issues that occur in Docker environments under Linux, two files are required: [`fix-perms.sh`](fix-perms.sh) and [`entrypoint.sh`](entrypoint.sh).
+
+These files should remain unaltered and are automatically copied to the Docker container.
+
+> Make sure to also set the required [.gitattributes](../.gitattributes) in your replication package. They are required to ensure that the Docker container can be executed correctly under Windows.
+
+## Execution
+The [`execute.sh`](execute.sh) script can be adjusted to run the program that should be executed by the Docker container.
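
The README in this commit describes calling the execute script with `replication` or `validation` as the first argument. An adjusted script might validate that argument before starting the program. The following is a minimal sketch under that assumption: `select_mode` and its usage message are hypothetical and are not taken from the actual `execute.sh`.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not taken from the real execute.sh):
# accept only the two modes named in the README and echo the chosen one.
select_mode() {
    case "$1" in
        replication|validation)
            echo "$1"
            ;;
        *)
            echo "usage: execute.sh {replication|validation}" >&2
            return 1
            ;;
    esac
}
```

A script could then start with `mode="$(select_mode "$1")" || exit 1` and branch on `$mode`, so that a mistyped argument fails immediately instead of after the container has started working.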
