INTERNET-DRAFT                                    Pushpita Bhattacharjee
Intended Status: Proposed Standard                          Abel Sanchez
Expires: July 11 2019                                                MIT
                                                          January 7 2019


                    Reproduce Object Framework (ROF)
                           draft-aspb-rof-00

Abstract

   This document proposes a standard for describing computational
   models to help with reproducibility. It defines a lightweight,
   programming-language-agnostic framework, based on JavaScript Object
   Notation (JSON), to describe the environment and behavior of a
   computational model.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Pushpita B                  Expires 7/11/2019                   [Page 1]

INTERNET DRAFT          Reproduce Object Framework (ROF)

Copyright and License Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1 Introduction
     1.1 Terminology
   2 Grammar
   3 Values
   4 Example
   5 Parser
   6 Generator
   7 IANA Considerations
   8 References
     8.1 Normative References
     8.2 Informative References
   Authors' Addresses

1 Introduction

   Reproducibility in computational research is a complex, unsolved
   challenge faced by numerous researchers today. The complexity stems
   from the various stakeholders involved in the end-to-end research
   process and from the multiple components of that process, such as
   the computational model, software platform, data sets, and
   publication methods.

   Scientific research, like any other field, has gone through a major
   digital transformation in the past couple of decades with the rise
   of the World Wide Web. Many types of research now depend on
   developing complex computational models, using various software
   platforms, and analyzing large data sets. But when it comes to
   publishing results, we still use the traditional text-only
   publication process. This process makes it increasingly difficult to
   reproduce research results from a published study.

   Although research has become increasingly multidisciplinary and
   dependent on software models and large data sets, the traditional
   form of scientific publication has not changed much over time.
   Various studies have found that the text-based analysis and
   supporting material presented in a paper are not enough to reproduce
   the research fully. This has been one of the factors leading to the
   reproducibility crisis.

   The Reproduce Object Framework takes a digital automation approach
   to the reproducibility challenge. It is a lightweight standard,
   based on JavaScript Object Notation (JSON), that defines the
   configuration of a computational model, the inputs to the model, its
   results, and the environment needed to reproduce it.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2 Grammar

   A Reproduce Object follows basic JSON syntax: each entry pairs a
   fixed (immutable) key string with a user-defined value permitted by
   the JSON grammar, and the key and value are separated by a colon
   (:). The object starts with a left curly bracket ({) and ends with a
   right curly bracket (}). The text to the left of the colon is the
   key identifier of the entry, and the text to the right is the value
   for that identifier. A Reproduce Object contains 7 mandatory
   elements:

   1. code_repository - this key corresponds to the location of the
      model's code in the publicly accessible cloud.

   2. language - this key identifies the programming language used to
      develop the model.

   3. language_version - this key declares the version of the
      programming language used to develop the model.

   4. input_file - this key points to the relative location of the
      input file that the computational model requires for the
      analysis.
      The path is determined from the root of the code base folder
      structure and includes the name and format of the file.

   5. output_file - this key points to the relative location where the
      results file that the computational model produces as the output
      of the analysis is saved. The path is determined from the root of
      the code base folder structure and includes the name and format
      of the file.

   6. main_file - this key gives the location of the executable file
      that triggers the execution. The path is determined from the root
      of the code base folder structure and includes the name and
      format of the file.

   7. read_me - this key points to the location of the file that
      describes the model. The path is determined from the root of the
      code base folder and includes the name and format of the file.

   The order of the keys may depend upon the implementation by the
   platform.

3 Values

   The Reproduce Object must contain only one permissible value for
   each key. This specification allows the following values for each
   key:

   1. code_repository - must contain a single location where all the
      source code for the model is uploaded. It can be on any cloud
      source code sharing platform, such as GitHub, Google Drive, or
      Dropbox. The path should follow the convention of the hosting
      platform and be freely accessible without any special
      authorization.

   2. language - must contain a single language name, using the
      official name declared by the language's official distribution.
      It can have values like python, R, node_js, etc. The compiler or
      interpreter for this language should be publicly available for
      download and use.

   3. language_version - must contain a single specific version number
      of the programming language used to develop the computational
      model. This value is formatted as three dot-separated numbers
      denoting the major, minor, and patch version
      (major.minor.patch), as officially declared by the programming
      language.
      This version should be publicly available for download and use.

   4. input_file - must contain a single path location relative to the
      root folder of the source code. If the file is in the root folder
      itself, the declaration must be ./filename.extension. The
      location path must include the file name and extension of the
      file.

   5. output_file - must contain the location of the results file
      generated by the computational model, defined relative to the
      root of the source code folder. The location path must include
      the file name and extension of the file.

   6. main_file - must contain a single file path location relative to
      the root folder of the source code. This key refers to the file
      that triggers the execution of the model.

   7. read_me - must contain a single file path location relative to
      the root folder of the source code. This file contains details
      regarding the author of the model, copyright restrictions if any,
      and the model description.

   An implementation can limit the length of string values for memory
   and other considerations.

4 Example

   A brief, single-entry Reproduce Object:

      {
        "code_repository": "git@github.com:pushpitab/ML-Demo.git",
        "language": "node_js",
        "language_version": "10.9.0",
        "input_file": "./input/the_input_file.json",
        "output_file": "./results/the_output_file.json",
        "main_file": "./src/main.js",
        "read_me": "./README.md"
      }

   It describes a computational model with the following details:

   The model's source code is shared in the public cloud repository
   GitHub. The code can be accessed freely without special
   authorization. The code_repository value uses GitHub's notation for
   sharing and for identifying the user. The programming language used
   in the model is node_js. The version of node_js used is declared as
   10.9.0.
   The input file required to run the model is one level below the root
   folder, in a folder named input. The name of the file is
   the_input_file.json. After the model executes, the result is saved
   in the results folder, under the root of the source folder, as
   the_output_file.json. The main executable file is located in the src
   folder and is named main.js. The last value refers to the read_me
   document, which holds the description, author, and copyright
   details. It is located in the root of the source folder.

5 Parser

   A Reproduce Object parser transforms the object into the required
   actionable elements. Any standard JSON parser can be used to parse
   the Reproduce Object.

6 Generator

   To create a Reproduce Object, the generating method MUST follow the
   JSON grammar and the Reproduce Object grammar.

7 IANA Considerations

   This memo includes no request to IANA.

8 References

8.1 Normative References

   [RFC7159]  Bray, T., Ed., "The JavaScript Object Notation (JSON)
              Data Interchange Format", RFC 7159,
              DOI 10.17487/RFC7159, March 2014.

8.2 Informative References

Authors' Addresses

   Abel Sanchez
   Massachusetts Institute of Technology

   EMail: doval@mit.edu


   Pushpita Bhattacharjee
   Massachusetts Institute of Technology

   EMail: pushpital@mit.edu
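
   The parser and generator behavior described in Sections 5 and 6 can
   be sketched as follows. This is a minimal, non-normative
   illustration in Python, not part of the specification. The function
   names parse_reproduce_object and generate_reproduce_object are
   hypothetical, and two checks are assumptions beyond the text: every
   value is required to be a non-empty string, and a "relative" path is
   taken to mean one that does not start with a slash.

```python
import json
import re

# The seven mandatory keys listed in Section 2.
REQUIRED_KEYS = (
    "code_repository", "language", "language_version",
    "input_file", "output_file", "main_file", "read_me",
)
# Keys whose values are paths relative to the root of the source code.
PATH_KEYS = ("input_file", "output_file", "main_file", "read_me")
# major.minor.patch, as Section 3 describes for language_version.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def parse_reproduce_object(text):
    """Parse JSON text and check it against Sections 2 and 3 (hypothetical helper)."""
    obj = json.loads(text)  # any standard JSON parser works here
    if not isinstance(obj, dict):
        raise ValueError("a Reproduce Object must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    for key in REQUIRED_KEYS:
        # Assumption: each key holds exactly one non-empty string value.
        if not isinstance(obj[key], str) or not obj[key]:
            raise ValueError(key + " must be a non-empty string")
    if not VERSION_RE.fullmatch(obj["language_version"]):
        raise ValueError("language_version must be major.minor.patch")
    for key in PATH_KEYS:
        # Assumption: "relative" is read as "does not start with a slash".
        if obj[key].startswith("/"):
            raise ValueError(key + " must be relative to the source root")
    return obj


def generate_reproduce_object(**values):
    """Serialize the values as JSON, validating the result (hypothetical helper)."""
    text = json.dumps(values, indent=2)
    parse_reproduce_object(text)  # raises ValueError if the grammar is violated
    return text
```

   Passing the seven values from the example in Section 4 to
   generate_reproduce_object returns JSON text that
   parse_reproduce_object accepts; omitting a key, or giving a
   two-part version such as 10.9, raises ValueError.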