Team:Heidelberg/pages/igemathome/implementation

From 2014.igem.org

Revision as of 13:43, 14 October 2014 by Maexlich

Introduction

Most programming languages widely used in science depend on runtime components that have to be installed on the executing system. Python, for example, one of the most prominent programming languages in science, requires the Python runtime, which contains the standard library and the Python interpreter that reads the program source code and executes its instructions. Without these components a Python program cannot be run. As our aim is the distributed execution of our software, these requirements are a serious obstacle. We therefore had to add components that automatically extract the runtime dependencies and pass through access to the essential functions of the BOINC API. Since this adds a layer of complexity that is independent of the scientific computation itself, we decided to implement the portability functionality in a dedicated piece of software, the loader application. This application resolves all dependencies required for the scientific application to run, so that Python and Java applications can be distributed without having to care about language-specific runtime dependencies. As both programming languages are designed for platform independence, the same source code can be used on all main target platforms of BOINC without adaptation.

General Concepts of the BOINC Platform

Figure 2) File references in the BOINC storage model

Diagram of the directory structure used by the BOINC Platform
Source: BOINC Wiki

BOINC applications are run via the BOINC client, which takes care of the communication with the server, the execution and monitoring of the distributed applications, the resource requirement checks for the scientific application and the management of completed jobs. It also exposes some of these functions through the BOINC Application Programming Interface, which scientific applications can call by linking against the BOINC API library. The most important concepts for programming BOINC applications are described in the following sections.

Encapsulating multiple processes via Slots

The BOINC client builds up a file structure that stores the executables, input files and result files in a project directory, separating them from the temporary files of the application, which are stored in slot directories. The working directory of the application is set to a slot directory, so that files written via a relative path are not overwritten when multiple instances of an application run at the same time. To allow access to the project files, these must be described in the template files of the BOINC system, which associate them with a logical filename (see BOINC Input and output templates).

Resolving file names [BOINC Resolving Files]

The storage model of the BOINC Platform is based on the requirement that files downloaded from or uploaded to the server are immutable. To allow applications to access different input files without a change in the program code, files can be resolved via a logical name. As an example, imagine an application that takes two input files, "input1.txt" and "input2.txt", and generates an output file, "output.txt", from them. As the input files are different for each job, the immutability requirement means they cannot all be stored under the same filename. By defining these files in the BOINC Platform, they can be associated with a logical name (input1.txt, input2.txt and output.txt), while the file that is actually accessed has a physical name that is completely independent of the logical one. This functionality is provided by the boinc_resolve_filename API function, which returns a path similar to ../../projects/igemathome/name_used_for_staging, referring to the corresponding input file in the project directory (see Figure 2).
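To illustrate the idea of logical file names, the following Java sketch mimics the lookup without calling the BOINC API. The class LogicalFileResolver is hypothetical, and the link-file format it parses (a small file in the slot directory containing <soft_link>PATH</soft_link>) is our assumption about how the client stores the mapping; it should be checked against the BOINC source before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of BOINC: mimics the effect of
// boinc_resolve_filename for a logical name in the slot directory.
class LogicalFileResolver {
    private static final String OPEN = "<soft_link>";
    private static final String CLOSE = "</soft_link>";

    // Returns the physical path behind a logical name.
    // Assumption: the slot directory holds a small link file whose content
    // is <soft_link>PHYSICAL_PATH</soft_link>. Anything else (missing file,
    // large file, ordinary content) resolves to the logical name itself.
    static String resolve(Path slotDir, String logicalName) throws IOException {
        Path link = slotDir.resolve(logicalName);
        if (!Files.isRegularFile(link) || Files.size(link) > 4096) {
            return logicalName;
        }
        String content = new String(Files.readAllBytes(link), StandardCharsets.UTF_8).trim();
        if (!content.startsWith(OPEN)) {
            return logicalName;
        }
        int end = content.indexOf(CLOSE);
        if (end < 0) {
            end = content.length();
        }
        return content.substring(OPEN.length(), end).trim();
    }
}
```

The application code only ever mentions input1.txt; which staged file it ends up reading is decided per job by the content of the link file.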

Bundling

This section explains different concepts used and steps required to enable the bundling of a scientific application written in Java or Python with the corresponding runtime dependencies.

Java Applications

Java applications require the installation of the Java Virtual Machine, which interprets the byte-code created by the Java compiler and executes the appropriate functions. It also delivers the Java Class Library, which provides the standard library of Java and implements platform abstraction. To allow these applications to run, we wrote a Java application launcher based on the javafxpackager. It extracts the Java runtime into the project folder of the BOINC application, initializes the Java Virtual Machine and launches the specified jar containing the scientific application. As Java offers methods for calling native C and C++ code via the Java Native Interface, we used this functionality to provide access to the Application Programming Interface of the BOINC Platform.

JNI (Java Native Interface)

The Java Native Interface allows the execution of Java code initiated from a native application as well as the execution of native code called from Java applications. The former is called the Invocation API and was used to allow the loader to set up a complete Java Virtual Machine. To access the Invocation API, the main library (jvm.dll on Windows or libjvm.so on Linux) is dynamically loaded by the loader application. This makes it possible to change the Java Runtime version without recompiling the loader, which keeps the process flexible and modular. The parameters required for the initialization and launch of the JVM are read from a file called "package.cfg" with the following structure:

app.mainjar=LinkerEvaluator.jar
app.version=1.0
app.mainclass=org/igemathome/linker/evaluator/Main
app.classpath=
app.preferences.id=org/igemathome/linker/evaluator/Main
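The loader itself is native code, but the same key=value lookup can be sketched in Java. The class PackageConfig below is a hypothetical illustration, not the loader's implementation; which keys are mandatory is our assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative only: reads the keys shown in package.cfg with
// java.util.Properties. The real loader parses the file in native code.
class PackageConfig {
    final String mainJar;
    final String mainClass;
    final String classPath;

    PackageConfig(Properties p) {
        // Assumption: main jar and main class are mandatory, class path is optional.
        mainJar = require(p, "app.mainjar");
        mainClass = require(p, "app.mainclass");
        classPath = p.getProperty("app.classpath", "").trim();
    }

    private static String require(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("missing key: " + key);
        }
        return value.trim();
    }

    static PackageConfig parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return new PackageConfig(p);
    }

    // Dotted form of the main class, as used on the java command line.
    String mainClassDotted() {
        return mainClass.replace('/', '.');
    }
}
```

The main class is written with slashes in the file because that is the form the JNI function FindClass expects; the dotted form is only needed when the class is named on the command line.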

The latter ability, running native C/C++ code called by Java applications, was used to implement an interface to the BOINC Platform API, which eases integration into the distributed computation system. To accomplish this, the native program components have to be compiled as shared libraries, which are then dynamically loaded by the JVM when native functions are used. The source code of these libraries has to follow a specific structure generated by the program "javah", based on a Java class containing method declarations marked with the "native" keyword. The generated header files can then be used to write implementations of the functions in C/C++, converting objects to and from Java via the JNIEnv structure [JNI Design Overview].
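As a small illustration of the naming structure mentioned above, the following Java sketch declares a native method and computes the C function name that the JNI name-mangling rules assign to a native method without overloads. The class NativeDemo and its method are hypothetical; that the wrapper declares a method called boinc_init is our assumption, used only as an example input.

```java
// Hypothetical example class; nothing here is taken from the project sources.
class NativeDemo {
    // Declared in Java, implemented in a shared library. Calling it without
    // a loaded library throws UnsatisfiedLinkError, so the sketch never calls it.
    static native int nativeAnswer();

    // Builds the C function name for a non-overloaded native method,
    // following the JNI rules for resolving native method names.
    static String jniFunctionName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    private static String mangle(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');            // package separator
            } else if (c == '_') {
                sb.append("_1");           // escaped underscore
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if (c < 128 && Character.isLetterOrDigit(c)) {
                sb.append(c);              // plain ASCII letter or digit
            } else {
                sb.append(String.format("_0%04x", (int) c)); // any other character
            }
        }
        return sb.toString();
    }
}
```

For the class org.igemathome.wrapper.BoincApiWrapper and a method named boinc_init, jniFunctionName returns Java_org_igemathome_wrapper_BoincApiWrapper_boinc_1init. The underscore in the method name is escaped as _1, which is why the generated headers look unusual for BOINC-style names.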

BoincAPIWrapper

The BoincAPIWrapper allows the most important BOINC API functions to be called from Java applications and translates these calls directly into the corresponding C/C++ function calls. The BOINC API functions can be accessed via the class org.igemathome.wrapper.BoincApiWrapper, whose methods are named exactly like their counterparts in the C/C++ BOINC API and can therefore be used simply by reading the BOINC API documentation.
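The typical call sequence of a BOINC application can be sketched as follows. Because the native library is not available here, the sketch uses a hypothetical interface BoincCalls with a recording stub instead of the real wrapper class. The three method names exist in the C/C++ BOINC API, but their Java signatures and their presence in the wrapper are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the wrapper. Method names follow the C/C++
// BOINC API; the Java signatures are assumed for illustration.
interface BoincCalls {
    int boinc_init();
    int boinc_fraction_done(double fraction);
    int boinc_finish(int status);
}

// Stub that records calls so the sketch runs without the native library.
class RecordingBoinc implements BoincCalls {
    final List<String> log = new ArrayList<>();

    public int boinc_init() { log.add("init"); return 0; }

    public int boinc_fraction_done(double fraction) {
        log.add("fraction_done " + fraction);
        return 0;
    }

    // In the C API boinc_finish ends the process; the stub only records it.
    public int boinc_finish(int status) { log.add("finish " + status); return 0; }
}

class WorkUnit {
    // Initialise, report progress after each step, then finish.
    static void run(BoincCalls boinc, int steps) {
        if (boinc.boinc_init() != 0) {
            throw new IllegalStateException("BOINC initialisation failed");
        }
        for (int i = 1; i <= steps; i++) {
            // ... one slice of the scientific computation would go here ...
            boinc.boinc_fraction_done((double) i / steps);
        }
        boinc.boinc_finish(0);
    }
}
```

Coding against an interface like this also lets the scientific application be tested outside of a BOINC client, with the real wrapper substituted only in the packaged build.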

Problems

After implementing the main methods of the BOINC API via the Java Native Interface, we encountered some serious problems. Most notably, the applications segfaulted inside the JVM after initializing the BOINC API. After several days of searching we realized that the problem was caused by the redirection of standard output and standard error by the boinc_init_diagnostics() function, which rewrote the file descriptors for these streams. As the JVM did not notice that the descriptors had been rewritten, it tried to access the old address of the file descriptor. This resulted in access to memory regions not allocated for the program and thus in segmentation faults. We were able to work around these errors by removing all calls to the freopen() function, yet realized later on that the problem only occurred if the BoincAPIWrapper was called outside of the launcher context, via the classic Java launcher using the command java -jar FILENAME. It seems that the standard output and standard error streams are handled differently when the embedded version of the JVM is used.

Compilation

This section explains how to compile your own version of the Java launcher and package it as a portable binary. The build system of the Java launcher is implemented as Makefiles for compilation under UNIX-like operating systems and as Microsoft Visual Studio project files for Windows operating systems. The BOINC source code and an installed Java Development Kit are required for the compilation of the software. For the generation of portable binaries under Linux, the use of the provided change root is recommended (for further information read the section Linux Binary Compatibility).

Windows

The project files for the Windows version of the launcher and the BoincAPIWrapper can be found under src/cpp/launcher/win and src/cpp/libboincAPIWrapper/win in the BOINC Java Wrapper source code. You will probably have to change the paths of the referenced project files in Visual Studio so that they point to your copy of the BOINC source code, and edit the include path of the project so that it refers to the include and include/win32 subfolders of the JDK installation path.

Linux

On Linux the process of compiling is a bit more complex, as a change root should be used for building the program in order to obtain portable binaries that run on multiple Linux distributions. Please download the change root appropriate for your platform (32bit or 64bit) via the download link below, extract it and follow these instructions, skipping the steps involving debootstrap and using the provided chroot instead. The Makefiles for the launcher can be found under src/cpp/launcher, those for the libBoincAPIWrapper under src/cpp/libboincAPIWrapper. For the Makefiles to run correctly, the following two environment variables have to be set before make is run:

export BOINC_DIR=PATH_TO_BOINC_SOURCE_CODE
export JDK=PATH_TO_JAVA_DEVELOPMENT_KIT
make

Packaging the Java Runtime

The loader expects the Java Runtime to be located inside a runtime.zip archive with the following structure:

runtime.zip
    └── runtime
        └── jre
            ├── bin
            │   ├── attach.dll
            │   ├── awt.dll
            │   ├── boincAPIWrapper.dll
            │  ...
            ├── COPYRIGHT
            ├── lib
            │   ├── accessibility.properties
            │   ├── applet
            │  ...
            │   ├── ext
            │   │   ├── access-bridge-32.jar
            │   │   ├── cldrdata.jar
            │   │   ├── dnsns.jar
            │   │   ├── igemathome.jar
            │  ... ...   
            │   └── flavormap.properties
            ├── LICENSE
            ├── README.txt
            ├── THIRDPARTYLICENSEREADME.txt
            ├── THIRDPARTYLICENSEREADME-JAVAFX.txt
            └── Welcome.html
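How an archive with this layout could be unpacked into the project folder is sketched below in Java. The actual loader performs the extraction in native code; the class RuntimeUnpacker and the check against entries that point outside the target directory are our own illustration, not a description of the loader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Illustrative only: unpacks an archive such as runtime.zip below a target directory.
class RuntimeUnpacker {
    // Extracts every entry below targetDir and returns the number of files written.
    static int unpack(InputStream zipData, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        int files = 0;
        try (ZipInputStream zin = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Reject entries such as ../../evil that would leave the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only; the stream stays open.
                    Files.copy(zin, out, StandardCopyOption.REPLACE_EXISTING);
                    files++;
                }
            }
        }
        return files;
    }
}
```

After unpacking, the files appear under runtime/jre relative to the target directory, matching the layout shown above.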

References

[JNI Design Overview] http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html

[BOINC Resolving Files] http://boinc.berkeley.edu/trac/wiki/BoincFiles