This central interface allows users to add loaders and writers for further or custom data types as well as to replace existing loaders.
46
46
Currently, `Resources` support IO of CSV files, feature models (KernelHaven `json`, and FeatureIDE `dimacs`, `xml`), variant configurations (FeatureIDE `xml`), and presence conditions of product lines and variants.
47
47
48
-
From the loaded `dataset`, we can obtain the available evolution step.
49
-
An evolution step describes a commit-sized change to the input software product line, and is defined by the (child) commit performing a change to a previous (parent) commit.
48
+
From the loaded `dataset`, we can obtain the available evolution steps.
49
+
An evolution step describes a commit-sized change to the input software product line, and is defined by a (child) commit performing a change to a previous (parent) commit.
50
50
Note that the evolution steps are not ordered because commits in the input product-line repository might not have been ordered as the commits might have been extracted from different branches.
51
-
Alternatively, we can also request a continuous history of evolution steps instead of an unordered set.
51
+
If we require an order, we can request a continuous history of evolution steps instead of an unordered set.
52
52
Therefore, a `SequenceExtractor` is used to determine how the successfully extracted commits should be ordered.
53
53
In this example, we use the `LongestNonOverlappingSequences` extractor to sort the commits into one single continuous history.
54
54
Nevertheless, merge commits and error commits (where VEVOS/Extraction failed) are excluded from the history and thus, the returned list of commits has gaps.
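To illustrate what sorting commit pairs into continuous histories means, here is a minimal, self-contained sketch. It does not use the VEVOS API: the `Step` record and the `chain` method are hypothetical names, and the simple chaining below is an illustration only, not the algorithm behind `LongestNonOverlappingSequences`. It assumes an acyclic history in which each commit is the parent of at most one step. A missing commit (for example, an error commit) shows up as a gap that splits the history into several sequences.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an evolution step: a child commit changing a parent commit.
record Step(String parent, String child) {}

final class HistorySketch {
    // Chains unordered steps into continuous sequences in which each step's
    // child is the parent of the following step.
    // Assumptions: the history is acyclic and no commit is the parent of two steps.
    static List<List<Step>> chain(List<Step> steps) {
        Map<String, Step> byParent = new HashMap<>();
        Set<String> children = new HashSet<>();
        for (Step s : steps) {
            byParent.put(s.parent(), s);
            children.add(s.child());
        }

        List<List<Step>> sequences = new ArrayList<>();
        for (Step s : steps) {
            // A sequence can only start at a step whose parent is not the child of another step.
            if (children.contains(s.parent())) {
                continue;
            }
            List<Step> sequence = new ArrayList<>();
            for (Step cur = s; cur != null; cur = byParent.get(cur.child())) {
                sequence.add(cur);
            }
            sequences.add(sequence);
        }
        return sequences;
    }
}
```

Calling `HistorySketch.chain` on the unordered steps `c→d`, `a→b`, `b→c`, and `x→y` yields two sequences: `a→b, b→c, c→d` and `x→y`.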
@@ -77,17 +77,18 @@ In particular, the `VariabilityDataset` provides:
77
77
78
78
To generate variants, we have to specify which variants should be generated.
79
79
Therefore, a `Sampler` is used that returns the set of variants to use for a certain feature model.
80
+
The set of desired variants is encapsulated in samplers because the set of valid variants of the input product line may change when the feature model changes over time (i.e., commits).
81
+
Thus, the sampler can be invoked during each step of the variant simulation.
80
82
Apart from the possibility of introducing custom samplers, VEVOS/Simulation comes with two built-in ways for sampling:
81
83
Random configuration sampling using the FeatureIDE library, and constant sampling.
82
-
Random sampling returns a random set of valid configuration from a given feature model.
83
-
Constant sampling uses a pre-defined set of variants to generate ignoring the feature model.
84
-
The set of desired variants is encapsulated in samplers because the set of valid variants of the input product line may change when the feature model changes.
85
-
Thus, the sampler can be invoked during each step of the variant simulation.
84
+
85
+
[Random sampling](src/main/java/vevos/feature/sampling/FeatureIDESampler.java) returns a random set of valid configurations from a given feature model:
[Constant sampling](src/main/java/vevos/feature/sampling/ConstSampler.java) uses a pre-defined set of variants and ignores the feature model (it can easily be extended though to for example crash if a configuration violates a feature model at any commit):
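As a rough illustration of the two strategies, independent of the actual VEVOS classes linked above, the following self-contained sketch may help. All types in it (`ToyFeatureModel`, `ToySampler`, `ToyConstSampler`, `ToyRandomSampler`) are hypothetical and not part of VEVOS or FeatureIDE. In particular, the toy feature model simply enumerates its valid configurations, which a real feature model would not do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical stand-in for a feature model: the list of its valid configurations.
record ToyFeatureModel(List<Set<String>> validConfigurations) {}

// Hypothetical sampler interface: decides which variants to generate for a feature model.
interface ToySampler {
    List<Set<String>> sample(ToyFeatureModel model);
}

// Constant sampling: always returns the same pre-defined variants and ignores the model.
final class ToyConstSampler implements ToySampler {
    private final List<Set<String>> variants;

    ToyConstSampler(List<Set<String>> variants) {
        this.variants = List.copyOf(variants);
    }

    @Override
    public List<Set<String>> sample(ToyFeatureModel model) {
        return variants;
    }
}

// Random sampling: draws up to `size` distinct valid configurations from the model.
final class ToyRandomSampler implements ToySampler {
    private final int size;
    private final Random random;

    ToyRandomSampler(int size, long seed) {
        this.size = size;
        this.random = new Random(seed);
    }

    @Override
    public List<Set<String>> sample(ToyFeatureModel model) {
        List<Set<String>> shuffled = new ArrayList<>(model.validConfigurations());
        Collections.shuffle(shuffled, random);
        return List.copyOf(shuffled.subList(0, Math.min(size, shuffled.size())));
    }
}
```

Because the sampler receives the feature model on every call, a simulation loop can invoke it once per commit: the random sampler then adapts to the model valid at that commit, while the constant sampler keeps returning the same variants.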
Note that Busybox has a special subclass called `BusyboxRepository` that performs some necessary pre- and postprocessing on the product lines source code.
115
+
Note that Busybox has a special subclass called `BusyboxRepository` that performs some necessary pre- and postprocessing on the product line's source code.
115
116
116
117
We are now ready to traverse the evolution history to generate variants:
117
118
```java
@@ -126,7 +127,7 @@ However, both types of data are not directly accessible but have to be loaded fi
126
127
This is what the `Lazy` type is used for: It defers the loading of data until it is actually required.
127
128
This makes accessing the possibly huge (93GB for 13k commits of Linux, yikes!) ground truth dataset faster and memory-friendly as only required data is loaded into memory.
128
129
We can start the loading process by invoking `Lazy::run` that returns a value of the loaded type (i.e., `Optional<IFeatureModel>` or `Optional<Artefact>`).
129
-
A `Lazy` caches its loaded value so loading is only performed once.
130
+
A `Lazy` caches its loaded value, so loading is only performed once: Subsequent calls to `Lazy::run` return the cached value directly.
130
131
(Loaded data that is not required anymore can and should be freed by invoking `Lazy::forget`.)
131
132
As the extraction of feature model or presence condition might have failed, both types are again wrapped in an `Optional` that contains a value if extraction was successful.
132
133
Let's assume the extraction succeeded by just invoking `orElseThrow` here.
@@ -142,14 +143,14 @@ In case the `variantsSampler` is actually a `ConstSampler` (see above), it will
142
143
```
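The deferred loading and caching behavior of the `Lazy` type described earlier can be pictured with a minimal sketch. `LazySketch` is a hypothetical, simplified class written for illustration; the real `Lazy` in VEVOS may differ in its API and behavior, and only the method names `run` and `forget` are taken from the text. The sketch is not thread-safe.

```java
import java.util.function.Supplier;

// Hypothetical, simplified lazy value; not the VEVOS implementation.
final class LazySketch<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded = false;

    LazySketch(Supplier<T> loader) {
        this.loader = loader;
    }

    // Loads the value on the first call and returns the cached value afterwards.
    T run() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    // Drops the cached value so its memory can be reclaimed; the next run() loads again.
    void forget() {
        value = null;
        loaded = false;
    }
}
```

With a loader that returns an `Optional`, a caller could write `lazy.run().orElseThrow()` to obtain the value, and call `lazy.forget()` once the data is no longer needed.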
143
144
Optionally, we might want to filter which files of a variant to generate.
144
145
For example, a study on evolution of code in variable software systems could be interested only in generating the changed files of a commit.
145
-
In our case, let's just generate all variants.
146
+
In our case, let's just generate the entire code base of each variant.
146
147
Moreover, `VariantGenerationOptions` allow to configure some parameters for the variant generation.
147
148
Here, we just instruct the generation to exit in case an error happens but we could for example also instruct it to ignore errors and proceed.
To generate variants, we have to access the source code of the input software product line, at the currently inspected commit.
153
+
To generate variants, we have to access the source code of the input software product line at the currently inspected commit.
153
154
We thus checkout the current commit in the product line's repository:
154
155
```java
155
156
try {
@@ -171,8 +172,8 @@ Finally, we may indeed generate our variants:
171
172
The generation returns a `Result` that either represents the ground truth for the generated variant, or contains an exception if something went wrong.
172
173
In case the generation was successful, we can inspect the `groundTruth` of the variant.
173
174
The `groundTruth` consists of
174
-
- the presence conditions and feature mappings of the variant (which are different from the software product lines presence conditions, for example because line numbers shifted),
175
-
- and a block matching that for each source code file (key of the map) tells us which blocks of source code in the variant steam from which blocks of source code in the software product line.
175
+
- the presence conditions and feature mappings of the variant (which are different from the presence conditions of the software product line, for example because line numbers shifted),
176
+
- and a block matching that for each source code file (key of the map) tells us which blocks of source code in the variant stem from which blocks of source code in the software product line.
176
177
We may also export ground truth data to disk for later usage.
177
178
178
179
(Here it is important to export the ground truth as `.variant.csv` as this suffix is used by our `Resources` to correctly load the ground truth.
@@ -189,7 +190,7 @@ In contrast, the suffix is `.spl.csv` for ground truth presence conditions of th
189
190
}
190
191
}
191
192
```
192
-
In case we use Busybox as our input product line, we have to clean its repository as a last step:
193
+
In case we use Busybox as our input product line, we have to clean its repository as a last step before we can proceed to the next `SPLCommit`:
193
194
```java
194
195
if (splRepository instanceof BusyboxRepository b) {
195
196
try {
@@ -199,7 +200,7 @@ In case we use Busybox as our input product line, we have to clean its repositor
199
200
}
200
201
}
201
202
```
202
-
This was round-trip about the major features of VEVOS/Simulation. Further features and convencience methods can be found in our documentation.
203
+
This was round-trip about the major features of VEVOS/Simulation.
203
204
204
205
## Project Structure
205
206
@@ -208,15 +209,15 @@ The project is structured into the following packages:
208
209
- [`vevos.feature`](src/main/java/vevos/feature) contains our representation for `Variant`s and their `Configuration`s as well as sampling of configurations and variants
209
210
- [`vevos.io`](src/main/java/vevos/io) contains our `Resources` service and default implementations for loading `CSV` files, ground truth, feature models, and configurations
210
211
- [`vevos.repository`](src/main/java/vevos/repository) contains classes for representing git repositories and commits
211
-
- [`vevos.sat`](src/main/java/vevos/sat) contains an interface for SAT solving (currently only used for annotation simplification on demand)
212
+
- [`vevos.sat`](src/main/java/vevos/sat) contains an interface for SAT solving (currently only used for annotation simplification, which is deactivated by default)
212
213
- [`vevos.util`](src/main/java/vevos/util) is the conventional utils package with helper methods for interfacing with FeatureIDE, name generation, logging, and others.
213
214
- [`vevos.variability`](src/main/java/vevos/variability) contains the classes for representing evolution histories and the ground truth dataset.
214
215
The package is divided into:
215
-
- [`vevos.variability.pc`](src/main/java/vevos/variability/pc) contains classes for representing , and annotations (i.e., presence conditions and feature mappings). We store annotations in `Artefact`s that follow a tree structure similar to the annotations in preprocessor based software product lines.
216
-
- [`vevos.variability.pc.groundtruth`](src/main/java/vevos/variability/pc/groundtruth) contains datatypes for the ground truth of generated variants
217
-
- [`vevos.variability.pc.options`](src/main/java/vevos/variability/pc/options) contains the options for the variant generation process
218
-
- [`vevos.variability.pc.visitor`](src/main/java/vevos/variability/pc/visitor) contains an implementation of the visitor pattern for traversing and inspecting `ArtefactTree`s. Some visitors for querying a files or a line's presence condition, as well as a pretty printer can be found in `vevos.variability.pc.visitor.common`.
219
-
- [`vevos.variability.sequenceextraction`](src/main/java/vevos/variability/pc/sequenceextraction) contains default implementation for `SequenceExtractor`. These are algorithms for sorting pairs of commits into continuous histories (see example above).
216
+
- [`vevos.variability.pc`](src/main/java/vevos/variability/pc) contains classes for representing annotations (i.e., presence conditions and feature mappings). We store annotations in `Artefact`s that follow a tree structure similar to the annotations in preprocessor based software product lines.
217
+
- [`vevos.variability.pc.groundtruth`](src/main/java/vevos/variability/pc/groundtruth) contains datatypes for the ground truth of generated variants.
218
+
- [`vevos.variability.pc.options`](src/main/java/vevos/variability/pc/options) contains the options for the variant generation process.
219
+
- [`vevos.variability.pc.visitor`](src/main/java/vevos/variability/pc/visitor) contains an implementation of the visitor pattern for traversing and inspecting `ArtefactTree`s. Some visitors for querying a files or a line's presence condition, as well as a pretty printer can be found in [`vevos.variability.pc.visitor.common`](src/main/java/vevos/variability/pc/visitor/common).
220
+
- [`vevos.variability.sequenceextraction`](src/main/java/vevos/variability/sequenceextraction) contains default implementations for `SequenceExtractor`. These are algorithms for sorting pairs of commits into continuous histories (see example above).