HSCC 2020 - Repeatability Evaluation

Submission Guidelines

The Repeatability Evaluation Package (REP) consists of three components:

If you would like to submit software and/or data in another format, please contact the RE committee chair in advance to discuss options.

The REP submission website: RE submission link.  

When preparing your REP, keep in mind that other conferences have reported that installation problems are the most common cause of repeatability failures.  We recommend that an independent member of your lab test your installation instructions and REP on a clean machine before final submission.
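
One lightweight way to exercise those installation instructions, for example inside a fresh virtual machine or container, is a small script that runs the documented setup step and then checks that the claimed dependencies actually import. The sketch below is purely illustrative: the requirements file and module names are placeholders, not part of any particular REP.

  """Hypothetical installation check for a REP (all names are placeholders)."""
  import importlib
  import subprocess
  import sys

  # Installation command copied from the (hypothetical) instructions document.
  INSTALL_COMMAND = [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]

  # Modules the instructions claim will be usable after installation (made up here).
  REQUIRED_MODULES = ["numpy", "reachtool"]

  def main() -> int:
      if subprocess.run(INSTALL_COMMAND).returncode != 0:
          print("Installation command failed.")
          return 1
      importlib.invalidate_caches()  # pick up packages installed above
      for name in REQUIRED_MODULES:
          try:
              importlib.import_module(name)
          except ImportError as exc:
              print(f"Could not import {name}: {exc}")
              return 1
      print("Installation completed and all required modules import cleanly.")
      return 0

  if __name__ == "__main__":
      sys.exit(main())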

The repeatability evaluation process uses anonymous reviews so as to solicit honest feedback.  Authors of REPs should make a genuine effort to avoid learning the identity of the reviewers.  This may require turning off analytics, or only using systems with high enough traffic that accesses by the Repeatability Evaluation Committee (REC) will not be apparent.  In all cases where tracing is unavoidable, the authors should provide warnings in the documentation so that reviewers can take the necessary precautions to maintain anonymity.

REPs are considered confidential material in the same sense as initial paper submissions: committee members agree not to share REP contents and to delete them after evaluation. REPs remain the property of the authors, and there is no requirement to post them publicly (although we encourage you to do so).

Note for tool/case study paper authors on double-blind process:
Tool and case study paper authors are strongly encouraged to submit a repeatability package, due by October 28th. While we expect the tool/case study papers submitted to the main conference to follow the double-blind instructions, the RE packages do not need to be anonymized. Therefore, please remove from the paper any links to a repository that could reveal your identity, but include these links in your RE package submission. RE packages and papers will be evaluated by different committees, and the RE committee's findings will be communicated to the regular reviewers of the paper before final decisions are made.

Second round of repeatability evaluation:
Authors of accepted papers will be encouraged to submit a repeatability package, due early January. Among the submitted RE packages, papers whose results are deemed repeatable will receive a repeatability badge on the first page of the published version. These papers will also be highlighted on the conference website. The timeline for the first round is as follows:
  1. RE package submission for tool/case study papers: October 28th, 2019, AOE
  2. RE notification for tool/case study papers: December 23rd, 2019, AOE (tentative)

Background and Goals

HSCC has a rich history of publishing strong papers emphasizing computational contributions; however, subsequent re-creation of these computational elements is often challenging because details of the implementation are unavoidably absent from the paper. Some authors post their code and data on their websites, but there is little formal incentive to do so and no easy way to determine whether others can actually use the result.  As a consequence, computational results often become non-reproducible -- even by the research group that originally produced them -- after just a few years.

The goal of the HSCC repeatability evaluation process is to improve the reproducibility of computational results in the papers selected for the conference. 

Benefits for Authors

We hope that this process will provide the following benefits to authors:

While creating a repeatability package will require some work from the authors, we believe the cost of that extra work is outweighed by a direct benefit to members of the authors' research lab: if an independent reviewer can replicate the results with a minimum of effort, it is much more likely that future members of the lab will also be able to do so, even if the primary author has departed.

The repeatability evaluation process for HSCC draws upon several similar efforts at other conferences (SIGMOD, SAS, CAV, ECOOP, OOPSLA), and a first experimental run was held at HSCC14.

Repeatability Evaluation Criteria

Each member of the Repeatability Evaluation Committee assigned to review a Repeatability Evaluation Package (REP) will judge it based on three criteria -- coverage, instructions, and quality -- where each criterion is assessed on a numeric scale on which a score of 1 indicates a missing element and a score of 3 indicates "meets expectations".

In order to be judged "repeatable", a REP must "meet expectations" (an average score of 3) and must not have any missing elements (no scores of 1).  Each REP is evaluated independently according to the objective criteria.  The higher scores ("exceeds expectations" or "significantly exceeds expectations") should be considered aspirational goals, not requirements for acceptance.

Coverage

What fraction of the appropriate figures and tables is reproduced by the REP?  Note that some figures and tables should not be included in this calculation; for example, figures generated in a drawing program, or tables listing only parameter values.  The focus is on those figures and tables in the paper that contain computationally generated or processed experimental evidence supporting the claims of the paper.

Note that satisfying this criterion does not require that the corresponding figures or tables be recreated in exactly the same format as appears in the paper, merely that the data underlying those figures or tables be generated in a recognizable format.

A repeatable element is one for which the computation can be rerun by following the instructions in the REP in a suitably equipped environment.  An extensible element is one for which variations of the original computation can be run by modifying elements of the code and/or data.  Consequently, necessary conditions for extensibility include that the modifiable elements be identified in the instructions or documentation, and that all source code be available and/or involve calls to commonly available and trusted software (e.g., Windows, Linux, the C or Python standard libraries, Matlab).
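
As a purely illustrative sketch of these ideas, the script below regenerates the data behind a fictional figure using only the Python standard library and gathers its modifiable elements at the top, as the extensibility criterion suggests.  The figure number, parameter values, and output file name are assumptions made up for this example.

  """Hypothetical example of a repeatable and extensible element.

  Regenerates the data behind a fictional Figure 3: the settling time of a
  simple first-order system for a sweep of gain values.  Only the Python
  standard library is used, and the modifiable elements are collected at
  the top so a reviewer can vary the experiment.
  """
  import csv
  import math

  # --- Modifiable elements (identified per the extensibility criterion) ---
  GAINS = [0.5, 1.0, 2.0, 4.0]   # controller gains to sweep
  TOLERANCE = 0.02               # settling threshold (2% of initial value)
  OUTPUT_CSV = "figure3_data.csv"

  def settling_time(gain: float, tolerance: float) -> float:
      """Settling time of x' = -gain * x starting from x(0) = 1."""
      # x(t) = exp(-gain * t); solve exp(-gain * t) = tolerance for t.
      return -math.log(tolerance) / gain

  def main() -> None:
      with open(OUTPUT_CSV, "w", newline="") as f:
          writer = csv.writer(f)
          writer.writerow(["gain", "settling_time"])
          for g in GAINS:
              writer.writerow([g, settling_time(g, TOLERANCE)])
      print(f"Wrote {OUTPUT_CSV}: underlying data for the (hypothetical) Figure 3.")

  if __name__ == "__main__":
      main()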

The categories for this criterion are:

Instructions

This criterion focuses on the instructions that will allow another user to recreate the computational results from the paper.

Quality

This criterion explores the documentation and trustworthiness of the software and its results.  While a set of scripts that exactly recreates, for example, the figures from the paper certainly aids repeatability, without well-documented code it is hard to understand how the data in those figures were processed; without well-documented data it is hard to determine whether the input is correct; and without testing it is hard to determine whether the results can be trusted.

If there are tests in the REP that are not described in the paper, they should at least be mentioned in the instructions document.  Documentation of test details can be placed in the instructions document or in a separate document within the REP.
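
For instance, a minimal regression test might recompute a published quantity and compare it against reference values stored in the package.  The sketch below is hypothetical and reuses the fictional settling-time computation from the earlier coverage example; the reference numbers do not come from any real paper.

  """Hypothetical regression test for a REP (illustrative only)."""
  import math
  import unittest

  def settling_time(gain: float, tolerance: float = 0.02) -> float:
      """Same computation used to generate the fictional Figure 3 data."""
      return -math.log(tolerance) / gain

  class TestFigure3Data(unittest.TestCase):
      def test_matches_reference_values(self):
          # Reference values as they would appear in the published figure.
          reference = {0.5: 7.824, 1.0: 3.912, 2.0: 1.956}
          for gain, expected in reference.items():
              self.assertAlmostEqual(settling_time(gain), expected, places=3)

  if __name__ == "__main__":
      unittest.main()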

The categories for this criterion are:

Note that tests are a form of documentation, so it is not really possible to have testing without documentation.


Sample Repeatability Package
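
As a rough, hypothetical illustration (not the official sample), a small REP might be organized around a single driver script that maps each computationally generated element of the paper to the script that regenerates its data, so that a reviewer can rerun everything with one command.  All file and script names below are made up for this sketch.

  """Illustrative top-level driver for a hypothetical REP (not the official sample)."""
  import subprocess
  import sys

  # Hypothetical mapping from paper element to the script that regenerates it.
  ELEMENTS = {
      "Figure 3": ["python", "generate_figure3_data.py"],
      "Table 2":  ["python", "generate_table2_data.py"],
  }

  def main() -> int:
      failures = []
      for element, command in ELEMENTS.items():
          print(f"Regenerating {element}: {' '.join(command)}")
          if subprocess.run(command).returncode != 0:
              failures.append(element)
      if failures:
          print("Failed to regenerate:", ", ".join(failures))
          return 1
      print("All listed elements regenerated.")
      return 0

  if __name__ == "__main__":
      sys.exit(main())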



[Thanks to Ian M. Mitchell for the content of this page]