|
| 1 | +--- |
| 2 | +title: 'EMB: A Curated Corpus of Web/Enterprise Applications for Scientific Research in Software Engineering' |
| 3 | +tags: |
| 4 | + - JVM |
| 5 | + - Java |
| 6 | + - Kotlin |
| 7 | + - NodeJS |
| 8 | + - JavaScript |
| 9 | + - TypeScript |
| 10 | + - .Net |
| 11 | + - C# |
| 12 | + - SBST |
| 13 | + - search-based software engineering |
| 14 | + - test generation |
| 15 | + - system testing |
| 16 | + - fuzzing |
| 17 | + - REST |
| 18 | + - GraphQL |
| 19 | + - benchmark |
| 20 | + |
| 21 | +authors: |
| 22 | + - name: Andrea Arcuri |
| 23 | + orcid: 0000-0003-0799-2930 |
| 24 | + affiliation: 1 |
| 25 | + - name: Man Zhang |
| 26 | + orcid: 0000-0003-1204-9322 |
| 27 | + affiliation: 1 |
| 28 | + - name: Amid Golmohammadi |
| 29 | + orcid: TODO |
| 30 | + affiliation: 1 |
| 31 | + - name: Asma Belhadi |
| 32 | + orcid: TODO |
| 33 | + affiliation: 1 |
| 34 | + - name: Juan Pablo Galeotti |
| 35 | + orcid: 0000-0002-0747-8205 |
| 36 | + affiliation: 2 |
| 37 | +affiliations: |
| 38 | + - name: Kristiania University College, Department of Technology, Oslo, Norway |
| 39 | + index: 1 |
| 40 | + - name: FCEyN-UBA, and ICC, CONICET-UBA, Depto. de Computaci\'on, Buenos Aires, Argentina |
| 41 | + index: 2 |
| 42 | +date: February 2022 |
| 43 | + |
| 44 | +[//]: # (bibliography: paper.bib) |
| 45 | +--- |
| 46 | + |
| 47 | +# Summary |
| 48 | + |
| 49 | +This repository collects several systems |
| 50 | +written in different programming languages, such as |
| 51 | +Java, Kotlin, JavaScript and C#. |
| 52 | +In this documentation, we refer to each of these projects as a System Under Test (SUT). |
| 53 | +Currently, the SUTs are either _REST_ or _GraphQL_ APIs. |
| 54 | + |
| 55 | +For each SUT, we implemented _driver_ classes, which can programmatically _start_ and _stop_ the SUT, and _reset_ its state (e.g., data in SQL databases). |
| 56 | +The drivers also allow setting up different properties in a _uniform_ way, such as choosing the TCP port numbers for the HTTP servers. |
| 57 | +If a SUT uses any external services (e.g., a SQL database), these are started automatically via Docker by the driver classes. |
| 58 | + |
| 59 | +# Statement of Need |
| 60 | + |
| 61 | +This collection of SUTs was originally assembled to ease experimentation with the fuzzer [EvoMaster](http://evomaster.org). |
| 62 | +Finding applications of this type among open-source projects is not trivial. |
| 63 | +Furthermore, it is not simple to sort out all the technical details needed to set these applications up and start them in a simple, uniform way. |
| 64 | + |
| 65 | +Therefore, the main contribution of this repository is to provide all the necessary scripts and software libraries for researchers who need this kind of case study. |
| 66 | + |
| 67 | + |
| 68 | +# Tool Summary |
| 69 | + |
| 70 | +The projects were selected through keyword searches on the GitHub APIs, using convenience sampling. |
| 71 | +We looked at several candidate SUTs, and discarded the ones that would not compile, would crash at startup, used obscure/unpopular libraries with no documentation on how to get them started, were too trivial, were student projects, etc. |
| 72 | +Where possible, we prioritized projects based on their number of _stars_ on GitHub. |
| 73 | + |
| 74 | + |
| 75 | +Note that some of these open-source projects might no longer be supported, whereas others are still developed and updated. |
| 76 | +Once a system is added to EMB, we neither modify it nor keep it in sync with its current version under development. |
| 77 | +The reason is that we want to keep an easy-to-use, constant set of case studies for experimentation that can be reliably used throughout the years. |
| 78 | + |
| 79 | +The SUTs called _NCS_ (Numerical Case Study) and _SCS_ (String Case Study) are artificial, and were developed by us. |
| 80 | +They are based on numerical and string-based functions previously used in the literature on unit test generation. |
| 81 | +We re-implemented these functions in different languages and put them behind a web service. |
| 82 | + |
| 83 | + |
| 84 | +For several reasons, the software in this repository is not published as a library (e.g., on Maven or NPM). |
| 85 | +To use EMB, you need to clone this repository: |
| 86 | + |
| 87 | +``` |
| 88 | +git clone https://github.com/EMResearch/EMB.git |
| 89 | +``` |
| 90 | + |
| 91 | +There are 2 main use cases for EMB: |
| 92 | + |
| 93 | +* Run experiments with _EvoMaster_ |
| 94 | + |
| 95 | +* Run experiments with other tools |
| 96 | + |
| 97 | +Everything can be set up by running the script `scripts/dist.py`. |
| 98 | +Note that you need at least JDK 8, JDK 11, NPM and .Net 3.x installed, as well as Docker. |
| 99 | +You also need to set up environment variables like `JAVA_HOME_8` and `JAVA_HOME_11`. |
| 100 | +The script will issue error messages if any prerequisite is missing. |
| 101 | +Once the script completes, all the SUTs are available under the `dist` folder, and a `dist.zip` is created as well (if `scripts/dist.py` is run with `True` as input). |
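
The prerequisite check described above could look roughly like the following. This is only a sketch and not the actual logic of `scripts/dist.py`: the function name `missing_prerequisites` is hypothetical, and detecting NPM, .Net and Docker by looking up the `npm`, `dotnet` and `docker` executables on the `PATH` is an assumption.

```python
import os
import shutil

# Environment variables named in the text above.
REQUIRED_ENV_VARS = ["JAVA_HOME_8", "JAVA_HOME_11"]
# Assumption: the tools are detected through their usual executable names.
REQUIRED_COMMANDS = ["npm", "dotnet", "docker"]


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of messages, one per missing prerequisite.

    `env` and `which` can be overridden, which makes the check easy to test.
    """
    env = os.environ if env is None else env
    problems = []
    for var in REQUIRED_ENV_VARS:
        if not env.get(var):
            problems.append(f"environment variable {var} is not set")
    for cmd in REQUIRED_COMMANDS:
        if which(cmd) is None:
            problems.append(f"command '{cmd}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("ERROR:", problem)
```

Running such a check before `scripts/dist.py` gives a quick overview of what is still missing on a fresh machine. It does not verify tool versions.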
| 102 | + |
| 103 | +Note that the script builds the drivers as well as the SUTs. For JavaScript and .Net, it also builds an instrumented version of each SUT, which _EvoMaster_ uses for its white-box testing heuristics. For JVM, instrumentation is instead done at runtime, via an attached JavaAgent. |
| 104 | + |
| 105 | +In the built `dist` folder, the files will be organized as follows: |
| 106 | + |
| 107 | +* For JVM: `<name>-sut.jar` is the non-instrumented SUT, whereas its executable driver is called `<name>-evomaster-runner.jar`. |
| 108 | + Instrumentation can be done at runtime by attaching the `evomaster-agent.jar` JavaAgent. If you run experiments with `exp.py` (available in the EvoMaster repository), the agent is attached automatically. Otherwise, it can be attached manually with the JVM option `-Devomaster.instrumentation.jar.path=evomaster-agent.jar` when starting the driver. |
| 109 | +* For NodeJS: under the folder `<name>` (for each NodeJS SUT), the SUT is available under `src`, whereas the instrumented version is under `build`. |
| 110 | +* For .Net: currently only the instrumented version is available (WORK IN PROGRESS). |
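
As an illustration of the JVM naming scheme above, the following sketch derives the artifact paths for one SUT and assembles a command line that attaches the agent manually. The helper names `jvm_artifacts` and `driver_command` are hypothetical, `my-sut` is a placeholder rather than a real SUT in EMB, and starting the driver with `java -jar` is an assumption based on the driver jars being described as executable.

```python
from pathlib import Path


def jvm_artifacts(dist_dir, name):
    """Paths of the non-instrumented SUT jar and of its driver jar."""
    dist = Path(dist_dir)
    return {
        "sut": dist / f"{name}-sut.jar",
        "driver": dist / f"{name}-evomaster-runner.jar",
    }


def driver_command(dist_dir, name, agent_jar="evomaster-agent.jar"):
    """Command line that starts a driver with the agent attached manually.

    Assumption: the driver jar is launched with `java -jar`.
    """
    driver = jvm_artifacts(dist_dir, name)["driver"]
    return [
        "java",
        f"-Devomaster.instrumentation.jar.path={agent_jar}",
        "-jar",
        str(driver),
    ]


if __name__ == "__main__":
    # "my-sut" is a placeholder name, not an actual SUT in EMB.
    print(" ".join(driver_command("dist", "my-sut")))
```

The resulting list can be passed to `subprocess.run` to start the driver from a script.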
| 111 | + |
| 112 | + |
| 113 | + |
| 114 | +To run experiments with EvoMaster, you can also "start" each driver directly from an IDE (e.g., IntelliJ). |
| 115 | +Each of these drivers has a "main" method that starts a REST API (binding by default on port 40100), through which EvoMaster can invoke each operation (like start/stop/reset of the SUT) via HTTP messages. |
| 116 | +For JavaScript, you need to use the `em-main.js` files. |
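
When a driver is started from a script rather than from an IDE, it can be handy to wait until its REST API is reachable before launching a tool against it. The sketch below only checks that something accepts TCP connections on the default port 40100 mentioned above. It does not call any driver endpoint, because the endpoint paths are not described here. The function name `wait_for_port` and the timeout values are illustrative assumptions.

```python
import socket
import time

DEFAULT_DRIVER_PORT = 40100  # default driver port stated in the text above


def wait_for_port(port=DEFAULT_DRIVER_PORT, host="localhost",
                  timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    print("driver reachable:", wait_for_port(timeout=5.0))
```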
| 117 | + |
| 118 | + |
| 119 | +You can also build (and install) each module separately, as needed. |
| 120 | +For example, a Maven module can be installed with: |
| 121 | + |
| 122 | +``mvn clean install -DskipTests`` |
| 123 | + |
| 124 | +To navigate this repository effectively, it is important to understand how it is structured. |
| 125 | +Each folder represents a set of SUTs (and drivers) that can be built using the same tools. |
| 126 | +For example, the folder `jdk_8_maven` contains all the SUTs that need JDK 8 and are built with Maven. |
| 127 | +On the other hand, the SUTs in the folder `jdk_11_gradle` require JDK 11 and Gradle. |
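
The folder naming convention can also be read programmatically, for example to decide which JDK to use for a given folder. The sketch below assumes that JVM folders follow the pattern `jdk_<version>_<tool>`, as in the two examples above. The function name `parse_module_folder` is hypothetical, and mapping the version to a `JAVA_HOME_<version>` environment variable is an inference from the variables mentioned earlier, not something the repository is known to do in this way.

```python
import re

# Assumption: JVM folders are named jdk_<version>_<maven|gradle>.
_FOLDER_PATTERN = re.compile(r"jdk_(\d+)_(maven|gradle)")


def parse_module_folder(folder):
    """Split a folder name such as 'jdk_8_maven' into its components.

    Returns None for folders that do not follow the assumed pattern.
    """
    match = _FOLDER_PATTERN.fullmatch(folder)
    if match is None:
        return None
    jdk, tool = match.groups()
    return {
        "jdk": int(jdk),
        "build_tool": tool,
        "java_home_var": f"JAVA_HOME_{jdk}",
    }


if __name__ == "__main__":
    for name in ["jdk_8_maven", "jdk_11_gradle"]:
        print(name, "->", parse_module_folder(name))
```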
| 128 | + |
| 129 | +For JVM and .Net, each module has 2 submodules, called `cs` (short for "Case Study") and `em` (short for "EvoMaster"). |
| 130 | +`cs` contains all the source code of the different SUTs, whereas `em` contains all the drivers. |
| 131 | +Note: building a top-level module also builds all of its submodules. |
| 132 | + |
| 133 | +Regarding JavaScript, unfortunately NodeJS does not handle multi-module projects well. |
| 134 | +Each SUT therefore has to be built separately. |
| 135 | +However, for each SUT, we put its source code under a folder called `src`, whereas all the code related to the drivers is under `em`. |
| 136 | + |
| 137 | +The driver classes for Java and .Net are called `EmbeddedEvoMasterController`. |
| 138 | +For JavaScript, they are in a script file called `app-driver.js`. |
| 139 | +Note that Java also has a different kind of driver, called `ExternalEvoMasterController`. |
| 140 | +The difference is that with `ExternalEvoMasterController` the SUT is started in a separate process, instead of running in the same JVM as the driver itself. |
| 141 | + |
| 142 | + |
| 143 | +# Published Results |
| 144 | + |
| 145 | +# Related Work |
| 146 | + |
| 147 | + |
| 148 | + |
| 149 | +# Acknowledgements |
| 150 | + |
| 151 | +This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 864972), and |
| 152 | +partially by UBACyT 2020 20020190100233BA, PICT-2019-01793. |
| 153 | + |
| 154 | +# References |
| 155 | + |