Commit cd2068f

revised readme
1 parent a998606 commit cd2068f

1 file changed

Lines changed: 23 additions & 22 deletions

README.md

@@ -33,7 +33,7 @@ final CaseSensitivePath splRepositoryPath = CaseSensitivePath.of("path", "to", "
3333
final CaseSensitivePath groundTruthDatasetPath = CaseSensitivePath.of("path", "to", "datasets");
3434
final CaseSensitivePath variantsGenerationDir = CaseSensitivePath.of("directory", "to", "put", "generated", "variants");
3535
```
36-
We can now load the extraced ground truth dataset:
36+
We can now load the extracted ground truth dataset:
3737
```java
3838
final VariabilityDataset dataset = Resources.Instance()
3939
.load(VariabilityDataset.class, groundTruthDatasetPath.path());
@@ -45,10 +45,10 @@ Internally, `Resources` stores `ResourceLoader` and `ResourceWriter` objects tha
4545
This central interface allows users to add loaders and writers for further or custom data types as well as to replace existing loaders.
4646
Currently, `Resources` supports IO of CSV files, feature models (KernelHaven `json` as well as FeatureIDE `dimacs` and `xml`), variant configurations (FeatureIDE `xml`), and presence conditions of product lines and variants.
4747

48-
From the loaded `dataset`, we can obtain the available evolution step.
49-
An evolution step describes a commit-sized change to the input software product line, and is defined by the (child) commit performing a change to a previous (parent) commit.
48+
From the loaded `dataset`, we can obtain the available evolution steps.
49+
An evolution step describes a commit-sized change to the input software product line, and is defined by a (child) commit performing a change to a previous (parent) commit.
5050
Note that the evolution steps are not ordered: the commits in the input product-line repository might not have been ordered, as they might have been extracted from different branches.
51-
Alternatively, we can also request a continuous history of evolution steps instead of an unordered set.
51+
If we require an order, we can request a continuous history of evolution steps instead of an unordered set.
5252
For this purpose, a `SequenceExtractor` is used to determine how the successfully extracted commits should be ordered.
5353
In this example, we use the `LongestNonOverlappingSequences` extractor to sort the commits into one single continuous history.
5454
Nevertheless, merge commits and error commits (where VEVOS/Extraction failed) are excluded from the history, so the returned list of commits has gaps.
@@ -77,17 +77,18 @@ In particular, the `VariabilityDataset` provides:
7777

7878
To generate variants, we have to specify which variants should be generated.
7979
For this purpose, a `Sampler` is used that returns the set of variants to use for a certain feature model.
80+
The set of desired variants is encapsulated in samplers because the set of valid variants of the input product line may change when the feature model changes over time (i.e., commits).
81+
Thus, the sampler can be invoked during each step of the variant simulation.
8082
Apart from the possibility of introducing custom samplers, VEVOS/Simulation comes with two built-in ways for sampling:
8183
Random configuration sampling using the FeatureIDE library, and constant sampling.
82-
Random sampling returns a random set of valid configuration from a given feature model.
83-
Constant sampling uses a pre-defined set of variants to generate ignoring the feature model.
84-
The set of desired variants is encapsulated in samplers because the set of valid variants of the input product line may change when the feature model changes.
85-
Thus, the sampler can be invoked during each step of the variant simulation.
84+
85+
[Random sampling](src/main/java/vevos/feature/sampling/FeatureIDESampler.java) returns a random set of valid configurations from a given feature model:
8686
```java
8787
/// Either use random sampling, ...
8888
final int numberOfVariantsToGenerate = 42;
8989
Sampler variantsSampler = FeatureIDESampler.CreateRandomSampler(numberOfVariantsToGenerate);
9090
```
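To illustrate why the set of variants is hidden behind a sampler, here is a minimal, self-contained sketch. It does not use the VEVOS or FeatureIDE API: `SamplerSketch`, `ConstSamplerSketch`, and `ValidatingConstSamplerSketch` are hypothetical names, a configuration is simplified to a set of selected feature names, and a feature model is simplified to a predicate that accepts valid configurations. The sketch contrasts a constant sampler that ignores the feature model with one that rejects configurations that are invalid for the given model.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical, simplified sampler: maps a feature model to the configurations to generate. */
interface SamplerSketch {
    List<Set<String>> sample(Predicate<Set<String>> featureModel);
}

/** Always returns the same predefined configurations and ignores the feature model. */
final class ConstSamplerSketch implements SamplerSketch {
    private final List<Set<String>> configurations;

    ConstSamplerSketch(List<Set<String>> configurations) {
        this.configurations = List.copyOf(configurations);
    }

    @Override
    public List<Set<String>> sample(Predicate<Set<String>> featureModel) {
        return configurations;
    }
}

/** Like the constant sampler, but fails if a configuration is invalid for the given model. */
final class ValidatingConstSamplerSketch implements SamplerSketch {
    private final List<Set<String>> configurations;

    ValidatingConstSamplerSketch(List<Set<String>> configurations) {
        this.configurations = List.copyOf(configurations);
    }

    @Override
    public List<Set<String>> sample(Predicate<Set<String>> featureModel) {
        for (Set<String> configuration : configurations) {
            if (!featureModel.test(configuration)) {
                throw new IllegalStateException("Invalid configuration: " + configuration);
            }
        }
        return configurations;
    }
}

final class SamplerSketchDemo {
    public static void main(String[] args) {
        // Assumed toy feature model: feature "B" requires feature "A".
        Predicate<Set<String>> model = c -> !c.contains("B") || c.contains("A");
        List<Set<String>> wanted = List.of(Set.of("A"), Set.of("A", "B"));

        // The sampler can be invoked again whenever the feature model changes.
        System.out.println(new ValidatingConstSamplerSketch(wanted).sample(model).size());
    }
}
```

Because the sampler receives the feature model on every invocation, it can react to a feature model that changes from commit to commit.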
91+
[Constant sampling](src/main/java/vevos/feature/sampling/ConstSampler.java) uses a pre-defined set of variants and ignores the feature model (though it can easily be extended to, for example, crash if a configuration violates the feature model at any commit):
9192
```java
9293
/// ... or use a predefined set of variants.
9394
final Sample variantsToGenerate = new Sample(List.of(
@@ -111,7 +112,7 @@ final SPLRepository splRepository = new SPLRepository(splRepositoryPath.path());
111112
/// for Busybox:
112113
final SPLRepository splRepository = new BusyboxRepository(splRepositoryPath.path());
113114
```
114-
Note that Busybox has a special subclass called `BusyboxRepository` that performs some necessary pre- and postprocessing on the product lines source code.
115+
Note that Busybox has a special subclass called `BusyboxRepository` that performs some necessary pre- and postprocessing on the product line's source code.
115116

116117
We are now ready to traverse the evolution history to generate variants:
117118
```java
@@ -126,7 +127,7 @@ However, both types of data are not directly accessible but have to be loaded fi
126127
This is what the `Lazy` type is used for: It defers the loading of data until it is actually required.
127128
This makes accessing the possibly huge (93GB for 13k commits of Linux, yikes!) ground truth dataset faster and memory-friendly as only required data is loaded into memory.
128129
We can start the loading process by invoking `Lazy::run` that returns a value of the loaded type (i.e., `Optional<IFeatureModel>` or `Optional<Artefact>`).
129-
A `Lazy` caches its loaded value so loading is only performed once.
130+
A `Lazy` caches its loaded value, so loading is only performed once: Subsequent calls to `Lazy::run` return the cached value directly.
130131
(Loaded data that is not required anymore can and should be freed by invoking `Lazy::forget`.)
131132
As the extraction of the feature model or the presence conditions might have failed, both types are again wrapped in an `Optional` that contains a value if extraction was successful.
132133
Let's assume the extraction succeeded by just invoking `orElseThrow` here.
@@ -142,14 +143,14 @@ In case the `variantsSampler` is actually a `ConstSampler` (see above), it will
142143
```
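The deferred loading and caching behaviour described for `Lazy` can be pictured with the following minimal sketch. It is not the VEVOS implementation: `LazySketch` is a hypothetical class that only mirrors the two operations named above, `run` and `forget`, and it uses a plain `Supplier` as an assumed stand-in for the actual loading of data from disk.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in for the idea behind Lazy; not the VEVOS class. */
final class LazySketch<T> {
    private final Supplier<T> loader;
    private T cached;
    private boolean loaded = false;

    LazySketch(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Loads the value on the first call and returns the cached value afterwards. */
    T run() {
        if (!loaded) {
            cached = loader.get();
            loaded = true;
        }
        return cached;
    }

    /** Drops the cached value so that its memory can be reclaimed. */
    void forget() {
        cached = null;
        loaded = false;
    }
}

final class LazySketchDemo {
    public static void main(String[] args) {
        final int[] loads = {0};
        // The Optional models that the extraction of the data might have failed.
        LazySketch<Optional<String>> featureModel = new LazySketch<>(() -> {
            loads[0]++; // count how often the (expensive) loading happens
            return Optional.of("feature model");
        });

        featureModel.run().orElseThrow();
        featureModel.run().orElseThrow();
        System.out.println("loads after two runs: " + loads[0]);

        featureModel.forget();
        featureModel.run();
        System.out.println("loads after forget and run: " + loads[0]);
    }
}
```

Note that in this sketch a call to `run` after `forget` triggers the loader again, which is the price for freeing the memory in between.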
143144
Optionally, we might want to filter which files of a variant to generate.
144145
For example, a study on evolution of code in variable software systems could be interested only in generating the changed files of a commit.
145-
In our case, let's just generate all variants.
146+
In our case, let's just generate the entire code base of each variant.
146147
Moreover, `VariantGenerationOptions` allows us to configure some parameters for the variant generation.
147148
Here, we just instruct the generation to exit in case an error happens, but we could, for example, also instruct it to ignore errors and proceed.
148149
```java
149150
final ArtefactFilter<SourceCodeFile> artefactFilter = ArtefactFilter.KeepAll();
150151
final VariantGenerationOptions generationOptions = VariantGenerationOptions.ExitOnError(artefactFilter);
151152
```
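The idea of filtering which files of a variant to generate, for instance only the files changed by a commit, can be sketched as a predicate over file paths. The following example is self-contained and deliberately does not use the `ArtefactFilter` API of VEVOS, whose methods other than `KeepAll` are not shown here. The names `ChangedFilesFilterSketch`, `keepOnly`, `keepAll`, and `select` are hypothetical.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch of file filtering; not the VEVOS ArtefactFilter. */
final class ChangedFilesFilterSketch {
    /** Keeps every file, analogous in spirit to keeping the entire code base. */
    static Predicate<Path> keepAll() {
        return file -> true;
    }

    /** Keeps only the files contained in the given set of changed files. */
    static Predicate<Path> keepOnly(Set<Path> changedFiles) {
        return changedFiles::contains;
    }

    /** Returns the files accepted by the filter, in their original order. */
    static List<Path> select(List<Path> files, Predicate<Path> filter) {
        return files.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up file names for illustration only.
        List<Path> allFiles = List.of(
                Path.of("src", "a.c"),
                Path.of("src", "b.c"),
                Path.of("include", "a.h"));
        Set<Path> changedInCommit = Set.of(Path.of("src", "b.c"));

        System.out.println(select(allFiles, keepAll()).size());
        System.out.println(select(allFiles, keepOnly(changedInCommit)).size());
    }
}
```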
152-
To generate variants, we have to access the source code of the input software product line, at the currently inspected commit.
153+
To generate variants, we have to access the source code of the input software product line at the currently inspected commit.
153154
We thus check out the current commit in the product line's repository:
154155
```java
155156
try {
@@ -171,8 +172,8 @@ Finally, we may indeed generate our variants:
171172
The generation returns a `Result` that either represents the ground truth for the generated variant, or contains an exception if something went wrong.
172173
In case the generation was successful, we can inspect the `groundTruth` of the variant.
173174
The `groundTruth` consists of
174-
- the presence conditions and feature mappings of the variant (which are different from the software product lines presence conditions, for example because line numbers shifted),
175-
- and a block matching that for each source code file (key of the map) tells us which blocks of source code in the variant steam from which blocks of source code in the software product line.
175+
- the presence conditions and feature mappings of the variant (which are different from the presence conditions of the software product line, for example because line numbers shifted),
176+
- and a block matching that for each source code file (key of the map) tells us which blocks of source code in the variant stem from which blocks of source code in the software product line.
176177
We may also export ground truth data to disk for later usage.
177178

178179
(Here it is important to export the ground truth as `.variant.csv` as this suffix is used by our `Resources` to correctly load the ground truth.
@@ -189,7 +190,7 @@ In contrast, the suffix is `.spl.csv` for ground truth presence conditions of th
189190
}
190191
}
191192
```
192-
In case we use Busybox as our input product line, we have to clean its repository as a last step:
193+
In case we use Busybox as our input product line, we have to clean its repository as a last step before we can proceed to the next `SPLCommit`:
193194
```java
194195
if (splRepository instanceof BusyboxRepository b) {
195196
try {
@@ -199,7 +200,7 @@ In case we use Busybox as our input product line, we have to clean its repositor
199200
}
200201
}
201202
```
202-
This was round-trip about the major features of VEVOS/Simulation. Further features and convencience methods can be found in our documentation.
203+
This concludes our round trip through the major features of VEVOS/Simulation.
203204

204205
## Project Structure
205206

@@ -208,15 +209,15 @@ The project is structured into the following packages:
208209
- [`vevos.feature`](src/main/java/vevos/feature) contains our representation for `Variant`s and their `Configuration`s as well as sampling of configurations and variants
209210
- [`vevos.io`](src/main/java/vevos/io) contains our `Resources` service and default implementations for loading `CSV` files, ground truth, feature models, and configurations
210211
- [`vevos.repository`](src/main/java/vevos/repository) contains classes for representing git repositories and commits
211-
- [`vevos.sat`](src/main/java/vevos/sat) contains an interface for SAT solving (currently only used for annotation simplification on demand)
212+
- [`vevos.sat`](src/main/java/vevos/sat) contains an interface for SAT solving (currently only used for annotation simplification, which is deactivated by default)
212213
- [`vevos.util`](src/main/java/vevos/util) is the conventional utils package with helper methods for interfacing with FeatureIDE, name generation, logging, and others.
213214
- [`vevos.variability`](src/main/java/vevos/variability) contains the classes for representing evolution histories and the ground truth dataset.
214215
The package is divided into:
215-
- [`vevos.variability.pc`](src/main/java/vevos/variability/pc) contains classes for representing , and annotations (i.e., presence conditions and feature mappings). We store annotations in `Artefact`s that follow a tree structure similar to the annotations in preprocessor based software product lines.
216-
- [`vevos.variability.pc.groundtruth`](src/main/java/vevos/variability/pc/groundtruth) contains datatypes for the ground truth of generated variants
217-
- [`vevos.variability.pc.options`](src/main/java/vevos/variability/pc/options) contains the options for the variant generation process
218-
- [`vevos.variability.pc.visitor`](src/main/java/vevos/variability/pc/visitor) contains an implementation of the visitor pattern for traversing and inspecting `ArtefactTree`s. Some visitors for querying a files or a line's presence condition, as well as a pretty printer can be found in `vevos.variability.pc.visitor.common`.
219-
- [`vevos.variability.sequenceextraction`](src/main/java/vevos/variability/pc/sequenceextraction) contains default implementation for `SequenceExtractor`. These are algorithms for sorting pairs of commits into continuous histories (see example above).
216+
- [`vevos.variability.pc`](src/main/java/vevos/variability/pc) contains classes for representing annotations (i.e., presence conditions and feature mappings). We store annotations in `Artefact`s that follow a tree structure similar to the annotations in preprocessor-based software product lines.
217+
- [`vevos.variability.pc.groundtruth`](src/main/java/vevos/variability/pc/groundtruth) contains datatypes for the ground truth of generated variants.
218+
- [`vevos.variability.pc.options`](src/main/java/vevos/variability/pc/options) contains the options for the variant generation process.
219+
- [`vevos.variability.pc.visitor`](src/main/java/vevos/variability/pc/visitor) contains an implementation of the visitor pattern for traversing and inspecting `ArtefactTree`s. Some visitors for querying a file's or a line's presence condition, as well as a pretty printer, can be found in [`vevos.variability.pc.visitor.common`](src/main/java/vevos/variability/pc/visitor/common).
220+
- [`vevos.variability.sequenceextraction`](src/main/java/vevos/variability/sequenceextraction) contains default implementations for `SequenceExtractor`. These are algorithms for sorting pairs of commits into continuous histories (see example above).
220221

221222
## Setup
222223
