Analysis Tools
Overview
This chapter introduces the analysis tools that are integrated into the Application Quality BB or are candidates for integration.
Each tool analyses and validates application code or execution artefacts from a quality, security, or performance perspective.
The analysis tools are pre-loaded into the database at deployment time from an embedded fixture document. They may be edited directly in the database through the backend (Django) administration interface; however, this requires a proper understanding of how pipelines and tools are defined.
The available tools are listed with their details in the Application Quality Web interface. The same interface allows authenticated users to integrate the tools in custom pipelines. You will find the instructions in the Application Quality User Manual.
How Analysis Tools Are Implemented and Executed
The tools in the Application Quality BB are implemented using the Common Workflow Language (CWL) and executed by Calrissian and the reference implementation cwltool. The BB uses the pycalrissian Python library to set up and execute the analysis pipelines in the local Kubernetes cluster (possibly in a virtual cluster, if this option is activated).
Each tool is defined as a CWL CommandLineTool (CLT). These tools are typically invoked within a CWL Workflow, which orchestrates multiple CommandLineTool components. Some additional tools in these workflows handle auxiliary tasks such as filtering files from a cloned git repository or saving the analysis results to the database.
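As a rough illustration of this structure, the following Python sketch assembles a minimal CWL Workflow document in which a file-filtering step feeds an analysis step. The step names, the referenced files (`filter_files.cwl`, `pylint_tool.cwl`) and the input identifiers are hypothetical and chosen only for the example; they are not the definitions stored in the Application Quality database.

```python
import json


def build_workflow() -> dict:
    """Return a minimal CWL v1.2 Workflow as a plain dictionary.

    All identifiers below are hypothetical placeholders.
    """
    return {
        "cwlVersion": "v1.2",
        "class": "Workflow",
        "inputs": {
            "source_directory": "Directory",
            "pattern": "string",
        },
        "outputs": {
            "report": {"type": "File", "outputSource": "pylint_step/report"},
        },
        "steps": {
            # Auxiliary step: keep only the files that match the pattern.
            "filter_step": {
                "run": "filter_files.cwl",
                "in": {
                    "source_directory": "source_directory",
                    "pattern": "pattern",
                },
                "out": ["filtered"],
            },
            # Analysis step: a CommandLineTool that consumes the filtered files.
            "pylint_step": {
                "run": "pylint_tool.cwl",
                "in": {"source_directory": "filter_step/filtered"},
                "out": ["report"],
            },
        },
    }


workflow = build_workflow()
# CWL documents may be written as JSON as well as YAML.
print(json.dumps(workflow, indent=2))
```

The output of the first step is wired to the input of the second through the `step_name/output_name` source notation, which is how a CWL Workflow chains its CommandLineTool components.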
How Tools are Parameterised
Most of the integrated tools support many command-line parameters that control their behaviour.
To allow using these parameters in analysis pipelines (through the UI or the API), the CLT in which the tool is integrated must define the corresponding inputs and use these inputs in the embedded scripts.
The sections below describe the parameters that have been implemented in the CLTs and can thus be used to parameterise the pipeline executions.
More parameters may be exposed as necessary by extending the definition of the CLTs in the database.
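To make the relation between CLT inputs and tool parameters more concrete, here is a minimal sketch of a CommandLineTool that exposes three Pylint options as inputs. It is an assumption-laden illustration: it binds the inputs directly to command-line flags with `inputBinding`, whereas the CLTs in the BB use embedded scripts, and the input named `source_directory` and the output file name are hypothetical.

```python
import json


def build_pylint_clt() -> dict:
    """Return a simplified CWL v1.2 CommandLineTool wrapping Pylint."""
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        "baseCommand": ["pylint"],
        "inputs": {
            # Optional booleans: the flag is emitted only when the value is true.
            "errors_only": {
                "type": "boolean?",
                "inputBinding": {"prefix": "--errors-only"},
            },
            "verbose": {
                "type": "boolean?",
                "inputBinding": {"prefix": "--verbose"},
            },
            # separate=False joins prefix and value: --disable=C0114,C0116
            "disable": {
                "type": "string?",
                "inputBinding": {"prefix": "--disable=", "separate": False},
            },
            # Hypothetical input holding the code to analyse.
            "source_directory": {
                "type": "Directory",
                "inputBinding": {"position": 1},
            },
        },
        "outputs": {"report": "stdout"},
        "stdout": "pylint_report.txt",  # hypothetical report file name
    }


clt = build_pylint_clt()
# Parameter values for one execution, keyed by the input names defined above.
job_order = {"errors_only": True, "disable": "C0114,C0116"}

print(json.dumps(clt, indent=2))
print(json.dumps(job_order, indent=2))
```

Exposing a further parameter then amounts to adding one more entry under `inputs` and making the tool invocation use it.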
Available tools
The following table organises the analysis tools by application type and by check type.
As can be seen, a number of tools are readily available. Feedback and requirements will be necessary to determine which tools will be integrated next.
| | Best Practices | Application Quality | Application Performance |
|---|---|---|---|
| Python scripts | Pylint, Ruff, Flake8, SonarQube ¹ | Bandit | Pytest ² |
| Jupyter Notebooks | Ruff, SonarQube ¹, Notebook Best Practices Validator | | Papermill |
| AP CWL | Application Package Validator | | Calrissian ² |
| Docker | | Trivy | Kaniko ² |
| openEO | | | |

¹ Implementation in progress
² Candidate tool
Tools description
Development Best Practices
Pylint
Pylint is a static code analyser for Python 2 or 3. Pylint analyses your code without actually running it.
It checks for errors, enforces a coding standard, looks for code smells, and can make suggestions about how the code could be refactored.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| errors_only | boolean | In error mode, messages with a category besides ERROR or FATAL are suppressed, and no reports are done by default. Error mode is compatible with disabling specific errors. |
| verbose | boolean | In verbose mode, extra non-checker-related info will be displayed. |
| disable | string | Disable the message, report, category or checker with the given id(s). |
Ruff
An extremely fast Python linter and code formatter, written in Rust.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| verbose | boolean | Enable verbose logging. |
Flake8
Flake8 is a wrapper around these tools: PyFlakes, pycodestyle, Ned Batchelder’s McCabe script.
Flake8 runs all the tools by launching the single flake8 command. It displays the warnings in a per-file, merged output.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| verbose | boolean | Increase the verbosity of Flake8’s output. |
Application Package Validator
This tool verifies the compliance of CWL files for EOEPCA Application Packages (AP CWL) against the requirements specified in the OGC Best Practice for Earth Observation Application Package document.
Each CWL file matching the filter is validated in a separate job, leading to the creation of individual analysis reports.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| detail | string | Output detail: none, errors, hints, or all. Default: “hints”. |
| entry_point | string | Name of the entry point (Workflow or CommandLineTool). |
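
The one-job-per-file behaviour described above can be pictured with the following minimal sketch. It is not the BB's implementation: the `*.cwl` pattern, the file paths and the job dictionary layout are hypothetical, and the matching relies on Python's standard `fnmatch` module.

```python
from fnmatch import fnmatch


def select_files(paths, pattern):
    """Return the paths that match a shell-style pattern."""
    return [path for path in paths if fnmatch(path, pattern)]


def plan_jobs(paths, pattern):
    """Create one job description per matching file."""
    selected = select_files(paths, pattern)
    return [
        {"job": index, "cwl_file": path}
        for index, path in enumerate(selected, start=1)
    ]


# Hypothetical content of a cloned repository.
repository_files = [
    "README.md",
    "workflows/app.cwl",
    "workflows/helper.cwl",
    "src/main.py",
]

jobs = plan_jobs(repository_files, "*.cwl")
for job in jobs:
    print(job)
```

Each entry in `jobs` stands for a separate validation run, and therefore for a separate analysis report.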
Jupyter Notebook Best Practices Validator
This tool aims at validating the notebooks against the CEOS Jupyter Notebook Best Practice v1.1 document.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| abspath | boolean | Uses absolute paths in output. |
| schema | string | Supported values: ‘eumetsat’ or ‘schema.org’. |
SonarQube
SonarQube Server is an on-premise analysis tool designed to detect coding issues in 30+ languages, frameworks, and IaC platforms.
Exposed parameters
| Name | Type | Description |
|---|---|---|
Application Quality
Bandit
Bandit is a tool designed to find common security issues in Python code. To do this, Bandit processes each file, builds an AST from it, and runs appropriate plugins against the AST nodes. Once Bandit has finished scanning all the files, it generates a report.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| verbose | boolean | Output extra information like excluded and included files. |
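
The AST-based approach described above can be illustrated with Python's standard `ast` module. The sketch below is not Bandit's code and uses none of its plugins; it is a simplified, hypothetical check that parses a source string, walks the tree, and records calls to the built-in `eval`.

```python
import ast

# Hypothetical source code to scan.
SOURCE = '''
def run(expression):
    return eval(expression)
'''


class EvalCallFinder(ast.NodeVisitor):
    """Collect the line numbers of direct calls to eval()."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        # A direct call such as eval(...) has an ast.Name as its function.
        if isinstance(node.func, ast.Name) and node.func.id == "eval":
            self.findings.append(node.lineno)
        # Keep walking so that nested calls are inspected too.
        self.generic_visit(node)


def find_eval_calls(source):
    """Parse the source into an AST and return the lines calling eval()."""
    tree = ast.parse(source)
    finder = EvalCallFinder()
    finder.visit(tree)
    return finder.findings


findings = find_eval_calls(SOURCE)
print(findings)
```

Because the check operates on the syntax tree rather than on raw text, the word `eval` appearing in a comment or a string literal is not reported.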
Trivy
The all-in-one open source security scanner.
Trivy is used to find vulnerabilities (CVE) & misconfigurations (IaC) in container images.
Note: Trivy may also be applied to code repositories, binary artifacts, Kubernetes clusters, and more. This is not yet possible in the Application Quality BB.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| image | string | The name and tag of the remote image to scan. |
Application Performance
Papermill
Papermill is a tool for parameterizing and executing Jupyter Notebooks.
Each Jupyter notebook matching the filter is executed in a separate job, leading to the generation of individual reports.
Exposed parameters
| Name | Type | Description |
|---|---|---|
| extract_requirements | boolean | Tell the tool to extract Python requirements from the notebook content. |
| extra_requirements | string | Python libraries to install before executing the notebook. |
| execution_parameters | string | Parameter values required to execute the notebook. |
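
As a hedged illustration of how execution parameters might reach Papermill, the sketch below builds a `papermill` command line in which each parameter is passed with the `-p` option of the Papermill CLI. The assumption that `execution_parameters` holds a JSON object is made only for this example, since the expected string format is not specified here; the notebook names are hypothetical as well.

```python
import json
import shlex


def build_papermill_command(notebook, output, execution_parameters=""):
    """Build the argument list for one Papermill execution.

    Assumption: execution_parameters is a JSON object encoded as a string.
    """
    command = ["papermill", notebook, output]
    parameters = json.loads(execution_parameters) if execution_parameters else {}
    for name, value in parameters.items():
        # Papermill's CLI takes "-p <name> <value>" for each parameter.
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command(
    "analysis.ipynb",
    "analysis.out.ipynb",
    '{"year": 2024, "region": "alps"}',
)
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

Building the command as a list, rather than as a single string, avoids quoting problems when a parameter value contains spaces.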
Pytest
Exposed parameters
| Name | Type | Description |
|---|---|---|
Calrissian
Calrissian allows executing a CWL runner inside a Kubernetes cluster. Its goal is to be highly efficient and scalable, taking advantage of high capacity clusters to run many steps in parallel.
Exposed parameters
| Name | Type | Description |
|---|---|---|