The Problem
Researchers struggle to compare and evaluate bioinformatics tools reliably and improve their quality
Life sciences research depends heavily on bioinformatics software: tools, workflows and web services that analyse complex biological data. As the number of available methods grows, researchers and developers face persistent challenges:
- Lack of objective comparison between tools performing similar tasks
- Difficulty assessing software quality beyond published claims
- Limited reproducibility of performance evaluations
- Fragmented benchmarking efforts, often confined to individual projects or short-lived challenges
Without shared, transparent benchmarking events, users struggle to choose appropriate tools, developers lack systematic feedback, and communities miss opportunities for collective improvement. OpenEBench addresses these challenges by providing an open, community-driven benchmarking and monitoring platform for bioinformatics software. OpenEBench brings together scientific benchmarking and technical monitoring to help users compare and improve software.
It enables scientific communities to:
- Define reference datasets and evaluation metrics.
- Run standardised, reproducible benchmarking events.
- Compare tools fairly and transparently.
- Track software performance and quality over time.
Beyond performance benchmarking, OpenEBench also contributes to improving research software quality through its OpenEBench Software Observatory. The observatory aggregates metadata from multiple sources and evaluates software against FAIR principles (Findable, Accessible, Interoperable, Reusable), providing:
- FAIRness indicators and scores.
- Visibility into software maturity and maintenance practices.
- Actionable insights for developers to improve their software.
This dual focus on performance and quality distinguishes OpenEBench from traditional benchmarking efforts.
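To make the idea of per-principle FAIRness scores concrete, the following minimal sketch scores each principle as the fraction of its indicators that pass. The indicator names and the scoring rule are assumptions made purely for illustration; they are not the indicators or the weighting used by the OpenEBench Software Observatory.

```python
from typing import Dict, Mapping

# Hypothetical indicators grouped by FAIR principle. These names are
# illustrative assumptions, not the Observatory's actual indicators.
Indicators = Mapping[str, Mapping[str, bool]]


def fair_scores(indicators: Indicators) -> Dict[str, float]:
    """Score each principle as the fraction of its indicators that pass."""
    scores: Dict[str, float] = {}
    for principle, checks in indicators.items():
        # A principle with no indicators gets 0.0 rather than dividing by zero.
        scores[principle] = sum(checks.values()) / len(checks) if checks else 0.0
    return scores


example = {
    "Findable": {"has_name": True, "has_persistent_identifier": False},
    "Accessible": {"has_public_download": True},
    "Interoperable": {"documents_input_formats": True, "documents_output_formats": True},
    "Reusable": {"has_license": True, "has_documentation": False,
                 "has_version": True, "has_citation": False},
}

if __name__ == "__main__":
    for principle, score in fair_scores(example).items():
        print(f"{principle}: {score:.2f}")
```

A real scoring scheme would likely weight indicators differently and draw them from harvested metadata; the point here is only that a score per principle can be traced back to individual, checkable indicators.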
User Community
Developers and researchers collaborate to improve bioinformatics tools
OpenEBench serves a diverse and interconnected user base:
- Bioinformatics tool developers, who want objective feedback on performance and visibility for their tools.
- Scientific communities, who define domain-specific benchmarking events aligned with real research needs.
- Researchers and end-users, who need trustworthy information to select appropriate tools.
- Infrastructure providers and funders, who seek insight into software quality, sustainability and impact.
The platform is designed to be community-led: benchmarking events are defined and governed by the communities that use them.
Technical Aspects
OpenEBench is built as a modular, service-oriented platform designed to support reproducible benchmarking and software quality monitoring through its Software Observatory.
Languages and Codebase
The OpenEBench codebase is primarily developed using:
- Java and Python for the data processing and benchmarking workflow infrastructure.
- Python (FastAPI for the REST API) and JavaScript (Node.js/Express.js) for the Observatory backend.
- Java for the REST and GraphQL services of the scientific benchmarking, and Python for accessory REST services.
- PHP for OpenEBench VRE, which uses Bash and Nextflow to orchestrate benchmarking pipelines.
- JavaScript, based on the Nuxt framework, for most of the frontends.
The platform is organised into independent but interoperable components, enabling reuse across different benchmarking communities and facilitating long-term maintenance and extension. All core components are open source and developed following collaborative good practices.
Architecture and Deployment
OpenEBench follows a cloud-ready, container-based architecture:
- OpenEBench provides RESTful APIs for data access and integration.
- Most services and tools are packaged as Docker containers to ensure reproducibility and portability.
- Benchmarking workflows are executed within a Virtual Research Environment (VRE).
- MongoDB is deployed as a core backend service supporting scalable data storage and querying.
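As a rough illustration of how a client might consume such a RESTful API, here is a minimal sketch using only the Python standard library. The base URL, the `Dataset` resource name and the `challenge` query parameter are placeholders invented for this example, not real OpenEBench endpoints; the actual paths are described in the OpenEBench documentation.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Mapping, Optional

# Placeholder base URL (an assumption for illustration only).
BASE_URL = "https://example.org/api"


def build_url(base: str, resource: str,
              params: Optional[Mapping[str, str]] = None) -> str:
    """Join a base URL and a resource path, adding an encoded query string."""
    url = f"{base.rstrip('/')}/{resource.lstrip('/')}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def fetch_json(url: str,
               opener: Callable[..., Any] = urllib.request.urlopen) -> Any:
    """GET a URL and decode the JSON body.

    The opener is injectable so the function can be exercised without
    network access.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "Dataset" and "challenge" are hypothetical names.
    print(build_url(BASE_URL, "Dataset", {"challenge": "EXAMPLE-1"}))
```

Keeping URL construction separate from the HTTP call makes it straightforward to test the client against canned responses before pointing it at a live service.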
Interoperability and Integration
The platform integrates with external registries, repositories and monitoring services to harvest software metadata and usage signals.
Software Practices
OpenEBench follows open-source and FAIR-aligned practices
OpenEBench is developed as an open-source platform with code repositories hosted on GitHub and BSC’s GitLab. Contributors follow community standards for version control, code review, and integration.
- Version control: Repositories are hosted on both BSC’s GitLab and GitHub, with clear topic tagging for discoverability.
- Code review: Pull requests are used to ensure changes align with project objectives and maintain quality standards.
- Testing and validation: Benchmarking workflows, maintained by the different scientific communities (usually on GitHub), are encouraged to include validation and verification of results.
- Community feedback: Issues and discussions capture user feedback, drive prioritisation, and support continuous improvement across releases.
Developer Community
OpenEBench is developed and maintained at the Technologies for Biomedical Research Laboratory (TechBioLab) at the Barcelona Supercomputing Center (BSC) as part of the ELIXIR Tools Platform. New contributors can get started by exploring the different git repositories and participating in community discussions. The platform encourages collaborative development and feedback to continuously improve benchmarking events.
Tools
OpenEBench uses tools to be FAIR-compliant
OpenEBench relies on a combination of development and deployment tools that streamline scientific benchmarking workflow creation, testing, and containerisation. These tools ensure the software remains stable and reproducible across environments. Key tools include:
- Git / GitHub: version control and code collaboration.
- Docker: containerising workflows for reproducibility.
- Workflow languages: Nextflow for standardised benchmarking pipelines from the different scientific communities.
FAIR & Open
OpenEBench is fully aligned with FAIR and open science principles
OpenEBench promotes openness and FAIR compliance across its code, workflows, and benchmarking results:
Findable:
- OpenEBench codebases and data models are publicly available through multiple repositories hosted on GitHub and BSC’s GitLab.
- OpenEBench uses persistent identifiers (PIDs) to uniquely identify benchmarking-related digital objects (e.g. challenges, datasets, metrics, and results), supporting findability.
Accessible:
- Users can access software, documentation, and benchmark data freely, and contribute via GitHub or BSC’s GitLab.
- The different git repositories holding both the code and the data model are publicly hosted both on GitHub and BSC’s GitLab under open-source licenses.
Interoperable:
- APIs and workflows use either standard or documented formats to support interoperability. Some examples:
  - Metrics workflows provided by the communities must be written in Nextflow. Their steps must depend on Docker images, which should be public and distributable, with public recipes and build processes. When the Docker images are not distributable, the Docker recipes must still be public. All metrics workflows from the different communities must support the same minimal set of named parameters, and their main outputs must list the metrics assigned to the assessed participants, following a minimal dataset data model. An example of a dataset following the model is available here.
  - The REST API both returns and accepts documents following the OEB benchmarking data model, described in JSON Schema with several relational extensions (see the 1.0 model, examples of the 1.0 model and a graphical representation of the 1.0 model).
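The requirement that every metrics workflow lists the metrics assigned to each assessed participant in a common shape can be pictured with a small structural check. The sketch below is illustrative only: the field names (`_id`, `challenge_id`, `participant_id`, `metrics.metric_id`, `metrics.value`) are assumptions chosen for the example, and the authoritative definition is the OEB benchmarking data model expressed in JSON Schema.

```python
from typing import Any, Dict, List

# Assumed field names for one assessment entry (illustration only; the
# authoritative definition is the OEB benchmarking data model).
REQUIRED_FIELDS = ("_id", "challenge_id", "participant_id", "metrics")


def validate_assessment(entry: Dict[str, Any]) -> List[str]:
    """Return the problems found in one assessment entry (empty if none)."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in entry]
    metrics = entry.get("metrics")
    if isinstance(metrics, dict):
        if "metric_id" not in metrics:
            problems.append("metrics.metric_id is missing")
        value = metrics.get("value")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append("metrics.value must be a number")
    elif metrics is not None:
        problems.append("metrics must be an object")
    return problems


if __name__ == "__main__":
    entry = {
        "_id": "EXAMPLE_assessment_1",   # hypothetical identifier
        "challenge_id": "EXAMPLE_challenge",
        "participant_id": "tool_A",
        "metrics": {"metric_id": "precision", "value": 0.91},
    }
    print(validate_assessment(entry))
```

In practice a JSON Schema validator applied to the published model would replace a hand-written check like this one; the sketch only shows why a shared minimal shape makes results from different communities comparable.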
Reusable:
- The different git repositories holding both the code and the data model are publicly hosted both on GitHub and BSC’s GitLab under open-source licenses (Apache-2.0, CC-BY-SA-4.0, LGPLv2, etc.).
- Metrics workflows provided by the communities must be written in Nextflow and depend on Docker images, enabling reuse.
- The development process is transparent, with the scientific and technical benchmarking communities involved in decisions and feature development.
Documentation
OpenEBench provides multiple documentation resources to help users navigate the platform and contribute to development:
- Main entry point: OpenEBench ReadTheDocs
- README files for each repository explaining setup and usage
- Templates and examples for workflows and benchmarking pipelines
- Step-by-step guides for developers and administrators
Sustainability
As mentioned above, OpenEBench is maintained by the TechBioLab at BSC and supported by the ELIXIR Tools Platform. Governance combines core team oversight with community input. Funding from ELIXIR and institutional support ensures ongoing development, maintenance, and future-proofing. The project emphasises modular design, open standards, and containerisation to reduce risks and facilitate long-term sustainability.
References
- OpenEBench main entry point
- OpenEBench technical monitoring
- OpenEBench scientific benchmarking
- OpenEBench Software Observatory
- OpenEBench ReadTheDocs
- OpenEBench Scientific benchmarking data model
- OpenEBench Scientific benchmarking REST and graphQL code
- Quest for Orthologs benchmarking metrics computation workflow
- OpenEBench Scientific benchmarking tools
- OpenEBench Scientific level2 ingestion tools
- OpenEBench Scientific frontend code
- OpenEBench Technical monitoring repository metadata enricher
Tools and resources on this page
| Tool or resource | Description | Related pages | More about tool on TechRadar |
|---|---|---|---|
| Apache-2.0 | A permissive license that lets you use, modify, and distribute code (even commercially) with attribution and patent protection. | ||
| CC-BY-SA-4.0 | A Creative Commons license that allows any use (including commercial) as long as proper credit is given and adaptations are shared under the same license. | ||
| Docker | Docker is a tool for creating isolated environments, called containers, that let software run consistently across platforms. Docker allows developers to build, share, run and verify applications easily. Docker Hub is a repository for sharing and managing container images. | APICURON - The platfor... DOME Registry Research Software Stor... Archiving software Continuous Integration... Creating a good README Packaging software Reproducible software ... Using containers | View on TechRadar ↗ |
| Git | Distributed version control system designed to handle everything from small to very large projects with speed and efficiency | Research Software Stor... Research Software Stor... APICURON - The platfor... Research Software Stor... DOME Registry Research Software Stor... Research Software Stor... Research Software Stor... Using version control | View on TechRadar ↗ |
| GitHub | GitHub is a platform that allows developers to create, store, manage, and share their code. It uses Git to provide distributed version control. GitHub provides access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. | Research Software Stor... APICURON - The platfor... DOME Registry Research Software Stor... Archiving software Performing a code review Computational workflows Documenting code Documenting software p... Documenting software u... Adopting FAIR research... Packaging software Releasing software Using version control | |
| GitLab | DevOps platform that enables teams to collaborate, plan, develop, test, and deploy software using an integrated toolset for version control, CI/CD, and project management. | Phoenix2 Archiving software Performing a code review Computational workflows Documenting code Documenting software p... Documenting software u... Adopting FAIR research... Packaging software Releasing software Using version control | View on TechRadar ↗ |
| Java | Java is a versatile, object-oriented programming language designed to be platform-independent, with its “write once, run anywhere” capability enabled by the Java Virtual Machine (JVM). | Creating a good README | |
| JavaScript | JavaScript is a lightweight, high-level programming language primarily used to create interactive and dynamic content on web pages, running seamlessly in browsers. Known for its versatility, JavaScript supports event-driven, functional, and object-oriented programming, making it essential for modern web development and widely used alongside HTML and CSS. | Choosing languages, to... | |
| LGPLv2 | A copyleft license allowing linking in proprietary software, but modifications to the LGPL code itself must stay open. | ||
| MongoDB | MongoDB is a document-oriented NoSQL database used for high performance, high availability, and easy scalability. | APICURON - The platfor... | |
| Nextflow | Nextflow is a workflow management system that enables scalable, reproducible, and portable scientific workflows for research and production use cases. | Computational workflows Reproducible software ... | |
| OpenEBench | An open infrastructure for benchmarking bioinformatics tools and workflows, enabling continuous evaluation and comparison of scientific software. | ||
| OpenEBench Software Observatory | A platform component that monitors and analyses bioinformatics software quality, usage and sustainability within the OpenEBench ecosystem. | ||
| OpenEBench VRE | A virtual research environment that provides integrated tools and workflows for running, managing and analyzing bioinformatics benchmarks within OpenEBench. | ||
| PHP | A widely used open-source scripting language designed for web development that runs on the server side. | ||
| Python | Python is an interpreted, high-level, object-oriented programming language known for its dynamic typing, readability, and extensive standard library, making it ideal for rapid development and modular, reusable code. | Research Software Stor... Creating a good README Adopting FAIR research... Choosing languages, to... | |