|
1 | | -<img padding="10" align="right" src="https://www.acm.org/binaries/content/gallery/acm/publications/artifact-review-v1_1-badges/artifacts_evaluated_reusable_v1_1.png" alt="ACM Artifacts Evaluated Reusable" width="114" height="113"/> |
2 | 1 |
|
3 | 2 |  |
4 | 3 | [][documentation] |
5 | | -[](INSTALL.md) |
6 | 4 | [][website] |
7 | 5 | [](LICENSE.LGPL3) |
8 | | -[](https://doi.org/10.5281/zenodo.7110095) |
9 | | - |
10 | | -# Classifying Edits to Variability in Source Code |
11 | | - |
12 | | -This is the replication package for our paper _Classifying Edits to Variability in Source Code_ accepted at the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022). |
13 | | - |
14 | | -This replication package consists of four parts: |
15 | | - |
16 | | -1. **DiffDetective**: For our validation, we built _DiffDetective_, a java library and command-line tool to classify edits to variability in git histories of preprocessor-based software product lines. |
17 | | -2. **Appendix**: The appendix of our paper is given in PDF format in the file [appendix.pdf][appendix]. |
18 | | -3. **Haskell Formalization**: We provide an extended formalization in the Haskell programming language as described in our appendix. Its implementation can be found in the Haskell project in the [proofs](proofs) directory. |
19 | | -4. **Dataset Overview**: We provide an overview of the 44 inspected datasets with updated links to their repositories in the file [docs/datasets/all.md][dataset]. |
20 | | - |
21 | | -## 1. DiffDetective |
22 | | -DiffDetective is a java library and command-line tool to parse and classify edits to variability in git histories of preprocessor-based software product lines by creating [variation diffs][variationdiff_class] and operating on them. |
23 | | - |
24 | | -We offer a [Docker](https://www.docker.com/) setup to easily __replicate__ the validation performed in our paper. |
25 | | -In the following, we provide a quickstart guide for running the replication. |
26 | | -You can find detailed information on how to install Docker and build the container in the [INSTALL](INSTALL.md) file, including detailed descriptions of each step and troubleshooting advice. |
27 | | - |
28 | | -### 1.1 Build the Docker container |
29 | | -Start the Docker daemon.
30 | | -Clone this repository. |
31 | | -Open a terminal and navigate to the root directory of this repository. |
32 | | -To build the Docker container you can run the `build` script corresponding to your operating system. |
33 | | -#### Windows: |
34 | | -`.\build.bat` |
35 | | -#### Linux/Mac (bash): |
36 | | -`./build.sh` |
37 | | - |
38 | | -### 1.2 Start the replication |
39 | | -To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument. |
40 | | - |
41 | | -#### Windows: |
42 | | -`.\execute.bat replication` |
43 | | -#### Linux/Mac (bash): |
44 | | -`./execute.sh replication` |
45 | | - |
46 | | -> WARNING! |
47 | | -> The replication will at least require an hour and might require up to a day depending on your system. |
48 | | -> Therefore, we offer a short verification (5-10 minutes) which runs DiffDetective on only four of the datasets. |
49 | | -> You can run it by providing "verification" as argument instead of "replication" (i.e., `.\execute.bat verification`, `./execute.sh verification`). |
50 | | -> If you want to stop the execution, you can call the provided script for stopping the container in a separate terminal. |
51 | | -> When restarted, the execution will continue processing by restarting at the last unfinished repository. |
52 | | -> #### Windows: |
53 | | -> `.\stop-execution.bat` |
54 | | -> #### Linux/Mac (bash): |
55 | | -> `./stop-execution.sh` |
56 | | -
|
57 | | -You might see warnings or errors reported from SLF4J like `Failed to load class "org.slf4j.impl.StaticLoggerBinder"` which you can safely ignore. |
58 | | -Further troubleshooting advice can be found at the bottom of the [Install](INSTALL.md) file. |
59 | | - |
60 | | -### 1.3 View the results in the [results][resultsdir] directory |
61 | | -All raw results are stored in the [results][resultsdir] directory. |
62 | | -The aggregated results can be found in the following files. |
63 | | -(Note that the links below only have a target _after_ running the replication or verification.) |
64 | | -- [speed statistics][resultsdir_speed_statistics]: contains information about the total runtime, median runtime, mean runtime, and more. |
65 | | -- [classification results][resultsdir_classification_results]: contains information about how often each class was found, and more. |
66 | | - |
67 | | -Moreover, the results comprise the (LaTeX) tables that are part of our paper and appendix. |
68 | | - |
69 | | -### Documentation |
70 | | - |
71 | | -DiffDetective is documented with javadoc. The documentation can be accessed on this [website][documentation]. Notable classes of our library are: |
72 | | -- [VariationDiff](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/VariationDiff.html) and [DiffNode](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/DiffNode.html) implement variation diffs from our paper. A variation diff is represented by an instance of the `VariationDiff` class. It stores the root node of the diff and offers various methods to parse, traverse, and analyze variation diffs. `DiffNode`s represent individual nodes within a variation diff. |
73 | | -- [EditClassValidation](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/validation/EditClassValidation.html) contains the main method for our validation. |
74 | | -- [ProposedEditClasses](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/editclass/proposed/ProposedEditClasses.html) holds the catalog of the nine edit classes we proposed in our paper. It implements the interface [EditClassCatalogue](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/editclass/EditClassCatalogue.html), which allows to define custom edit classifications. |
75 | | -- [BooleanAbstraction](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/feature/BooleanAbstraction.html) contains data and methods for boolean abstraction of higher-order logic formulas. We use this for macro parsing. |
76 | | -- [GitDiffer](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/diff/GitDiffer.html) may parse the history of a git repository to variation diffs. |
77 | | -- The [datasets](https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/datasets/package-summary.html) package contains various classes for describing and loading datasets. |
78 | | - |
79 | | -## 2. Appendix |
80 | | - |
81 | | -Our [appendix][appendix] consists of: |
82 | | -1. An extended formalization of our concepts in the [Haskell][haskell] programming language. The corresponding source code is also part of this replication package (see below). |
83 | | -2. The proofs for (a) the completeness of variation diffs to represent edits to variation trees, and (b) the completeness and unambiguity of our edit classes. |
84 | | -3. An inspection of edit patterns from related work to show that existing patterns are either composite patterns built from our edit classes or similar to one of our edit classes. The used diffs of these patterns can also be found in [docs/compositepatterns](docs/compositepatterns). |
85 | | -4. The complete results of our validation for all 44 datasets. |
86 | | - |
87 | | -## 3. Haskell Formalization |
88 | | -The extended formalization is a [Haskell][haskell] library in the [`proofs`](proofs) subdirectory. |
89 | | -Since the `proofs` library is its own software project, we provide separate documentation of requirements and installation instructions within the project's subdirectory.
90 | | -Requirements and instructions for setting up the build environment (Stack) are given in [proofs/REQUIREMENTS.md](proofs/REQUIREMENTS.md). |
91 | | -How to build our library and how to run the example is described in the [proofs/INSTALL.md](proofs/INSTALL.md). |
92 | | - |
93 | | - |
94 | | -## 4. Dataset Overview |
95 | | -### 4.1 Open-Source Repositories |
96 | | -We provide an overview of the used 44 open-source preprocessor-based software product lines in the [docs/datasets/all.md][dataset] file. |
97 | | -As described in our paper in Section 5.1, this list contains all systems that were studied by Liebig et al., extended by four new subject systems (Busybox, Marlin, LibSSH, Godot). |
98 | | -We provide updated links for each system's repository. |
99 | | - |
100 | | -### 4.2 Forked Repositories for Replication |
101 | | -To guarantee the exact replication of our validation, we created forks of all 44 open-source repositories at the state we performed the validation for our paper. |
102 | | -The forked repositories are listed in the [replication datasets](docs/datasets/esecfse22-replication.md.md) and are located at the GitHub user profile [DiffDetective](https://github.com/DiffDetective?tab=repositories).
103 | | -These repositories are used when running the replication as described under `1.2` and in the [INSTALL](INSTALL.md). |
104 | | - |
105 | | -## 5. Running DiffDetective on Custom Datasets |
106 | | -You can also run DiffDetective on other datasets by providing the path to the dataset file as first argument to the execution script: |
107 | | - |
108 | | -#### Windows: |
109 | | -`.\execute.bat path\to\custom\dataset.md` |
110 | | -#### Linux/Mac (bash): |
111 | | -`./execute.sh path/to/custom/dataset.md` |
112 | | - |
113 | | -The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](docs/datasets/all.md) in the [docs/datasets](docs/datasets) folder. |
114 | | - |
115 | | -[variationdiff_class]: https://variantsync.github.io/DiffDetective/docs/javadoc/org/variantsync/diffdetective/variation/diff/VariationDiff.html |
116 | | -[haskell]: https://www.haskell.org/ |
117 | | -[dataset]: docs/datasets/all.md |
118 | | -[appendix]: appendix.pdf |
119 | | - |
120 | | -[documentation]: https://variantsync.github.io/DiffDetective/docs/javadoc/ |
121 | | -[website]: https://variantsync.github.io/DiffDetective/ |
122 | 6 |
|
123 | | -[resultsdir]: results |
124 | | -[resultsdir_classification_results]: results/validation/current/ultimateresult.metadata.txt |
125 | | -[resultsdir_speed_statistics]: results/validation/current/speedstatistics.txt |
| 7 | +# DiffDetective - Analysing Edits to Preprocessor-Based Variability |
| 8 | + |
| 9 | +DiffDetective is research software for studying the evolution of configurable and variational software projects, also known as software product lines.
| 10 | + |
| 11 | +DiffDetective reads the Git history of a C-preprocessor-based software product line to analyze patches in terms of _variation diffs_ [1]. |
| 12 | +A variation diff is a variability-aware diff that depicts changes to source code as well as to variability annotations (e.g., C-preprocessor macros such as `#if` and `#ifdef`). |
| 13 | + |
| 14 | + |
| 15 | + |
| 16 | +This figure outlines the parsing process within DiffDetective. |
| 17 | +Given two states of a C-preprocessor annotated source code file (left), for example before and after a commit, DiffDetective constructs a variation diff (right) that describes the differences of the code as well as the involved variability. |
| 18 | +DiffDetective can construct a variation diff either directly from a text-based diff between both file versions (center path),
| 19 | +or in two steps: first parsing each version into an abstract representation, a variation tree (center top and bottom), and then constructing the variation diff from both trees with a tree-matching algorithm.
| 20 | + |
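The text-based path can be illustrated with a small sketch. The Python code below is a simplified illustration under stated assumptions, not DiffDetective's implementation (DiffDetective is a Java library). The names `Node` and `parse_variation_diff` are hypothetical, the input is assumed to be the body of a unified diff whose lines start with `+`, `-`, or a space, and only `#if`-style directives and `#endif` are handled (no `#else` or `#elif`). Each node records its parent before and after the edit, so code and variability annotations live in one structure.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, simplified model of a variation diff node (not DiffDetective's API).
@dataclass(eq=False)
class Node:
    kind: str                               # "root", "mapping" (#if...) or "artifact" (code)
    diff_type: str                          # "add", "rem" or "non" (unchanged)
    label: str                              # the source line, or "true" for the root
    before_parent: Optional["Node"] = None  # parent in the old version, if the node exists there
    after_parent: Optional["Node"] = None   # parent in the new version, if the node exists there

DIFF_TYPES = {"+": "add", "-": "rem", " ": "non"}

def parse_variation_diff(diff_lines: List[str]) -> List[Node]:
    """Build nodes from unified-diff body lines, tracking #if nesting per version."""
    root = Node("root", "non", "true")
    before_stack, after_stack = [root], [root]  # open annotations before / after the edit
    nodes = [root]
    for raw in diff_lines:
        if not raw:
            continue  # assumption: skip empty lines that lack a diff marker
        diff_type = DIFF_TYPES[raw[0]]
        text = raw[1:].strip()
        in_before = diff_type != "add"  # added lines do not exist in the old version
        in_after = diff_type != "rem"   # removed lines do not exist in the new version
        if text.startswith("#endif"):
            # Close the innermost annotation in every version this line belongs to.
            if in_before:
                before_stack.pop()
            if in_after:
                after_stack.pop()
            continue
        is_mapping = text.startswith("#if")  # also matches #ifdef and #ifndef
        node = Node(
            "mapping" if is_mapping else "artifact",
            diff_type,
            text,
            before_parent=before_stack[-1] if in_before else None,
            after_parent=after_stack[-1] if in_after else None,
        )
        nodes.append(node)
        if is_mapping:
            # An annotation becomes the parent of subsequent lines in its versions.
            if in_before:
                before_stack.append(node)
            if in_after:
                after_stack.append(node)
    return nodes
```

For a diff that replaces `foo();` under `#ifdef A` with `bar();` wrapped in a new `#ifdef B`, the node for `bar();` has no parent before the edit and `#ifdef B` as its parent after the edit, while the node for `foo();` only has a parent before the edit.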
| 21 | +## Publications |
| 22 | + |
| 23 | +### [2] Views on Edits to Variational Software (SPLC 2023) |
| 24 | + |
| 25 | +[](replication/splc23-views/README.md) |
| 26 | +[](https://doi.org/10.5281/zenodo.8027920) |
| 27 | + |
| 28 | +> P. M. Bittner, A. Schultheiß, S. Greiner, B. Moosherr, S. Krieter, C. Tinnes, T. Kehrer, T. Thüm. _Views on Edits to Variational Software_. Conditionally Accepted at the 27th ACM International Systems and Software Product Line Conference (SPLC 2023) |
| 29 | +
|
| 30 | +In this work, we used DiffDetective for a feasibility study of creating views on edits to C-preprocessor-based software.
| 31 | +The idea of a view is to act as a filter on relevant parts of a system. |
| 32 | +For instance, a piece of source code may be deemed relevant if it implements a certain feature. |
| 33 | + |
| 34 | +Views on edits extend views to software changes. |
| 35 | +A view on an edit is thus a simplified form of that edit that, for example, contains only changes to a certain feature.
| 36 | +We implemented views on edits for variational systems in terms of views on variation diffs. |
| 37 | + |
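The idea of a view as a filter can be sketched in a few lines. The Python code below is only an illustration under stated assumptions and does not reproduce the definitions from the paper or DiffDetective's API: the names `EditedLine`, `view`, and `implements` are hypothetical, an edit is flattened to a list of lines, and the set of features each line depends on is assumed to be precomputed.

```python
from typing import Callable, FrozenSet, List, NamedTuple

# Hypothetical flat model of an edit: one entry per line of a diff.
class EditedLine(NamedTuple):
    diff_type: str            # "add", "rem" or "non" (unchanged)
    features: FrozenSet[str]  # features the line depends on (assumed precomputed)
    text: str

def view(edit: List[EditedLine],
         is_relevant: Callable[[EditedLine], bool]) -> List[EditedLine]:
    """Keep only the parts of the edit that the relevance predicate selects."""
    return [line for line in edit if is_relevant(line)]

def implements(feature: str) -> Callable[[EditedLine], bool]:
    """Relevance predicate: a line is relevant if it depends on the given feature."""
    return lambda line: feature in line.features
```

Calling `view(edit, implements("B"))` then yields a smaller edit that contains only the lines depending on feature `B`. Other relevance predicates, for example one that selects only changed lines, can be passed in the same way.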
| 38 | +Our replication package and further information can be found in the [README](replication/splc23-views/README.md) file in the respective directory (`replication/splc23-views`). |
| 39 | + |
| 40 | +### [1] Classifying Edits to Variability in Source Code (ESEC/FSE 2022) |
| 41 | + |
| 42 | +[](https://github.com/SoftVarE-Group/Papers/raw/main/2022/2022-ESECFSE-Bittner.pdf) |
| 43 | +[](https://dl.acm.org/doi/10.1145/3540250.3549108) |
| 44 | +[](https://www.youtube.com/watch?v=EnDx1AWxD24) |
| 45 | +[](replication/esecfse22/README.md) |
| 46 | +[](https://doi.org/10.5281/zenodo.7110095) |
| 47 | + |
| 48 | +> P. M. Bittner, C. Tinnes, A. Schultheiß, S. Viegener, T. Kehrer, T. Thüm. _Classifying Edits to Variability in Source Code_. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022), ACM, New York, NY, November 2022
| 49 | +
|
| 50 | +<img padding="10" align="right" src="https://www.acm.org/binaries/content/gallery/acm/publications/artifact-review-v1_1-badges/artifacts_evaluated_reusable_v1_1.png" alt="ACM Artifacts Evaluated Reusable" width="114" height="113"/> |
| 51 | + |
| 52 | +In this work, we used DiffDetective to classify the effect of edits on the variability of the edited source code in the change histories of 44 open-source C-preprocessor-based software projects. |
| 53 | + |
| 54 | +Our replication package and further information can be found in the [README](replication/esecfse22/README.md) file in the respective directory (`replication/esecfse22`). |
| 55 | + |
| 56 | + |
| 57 | +[documentation]: https://htmlpreview.github.io/?https://github.com/VariantSync/DiffDetective/blob/splc23-views/docs/javadoc/index.html |
| 58 | +[website]: https://variantsync.github.io/DiffDetective/ |