Running Elyra in an air-gapped environment
Elyra requires access to resources that are commonly located in remote environments. Deployment of Elyra in an air-gapped environment (a network environment that is physically separated from other environments, public or private) therefore requires additional considerations. This document identifies resources that must be made available to successfully install and run Elyra in such an environment.
These dependencies are in addition to any dependencies you wish to include in the installation, such as JupyterLab extensions that are not distributed with Elyra.
When you use Elyra’s features to build, export, or run pipelines, additional runtime dependencies must be accessible from two places: the environment where JupyterLab is installed, and the Kubernetes cluster where the pipeline runtime environment (Kubeflow Pipelines or Apache Airflow) is installed.
In the chart above, the arrows indicate whether read access, write access, or both are required.
JupyterLab environment dependencies
Elyra requires access to the following dependencies when you build, export, or submit a pipeline:
- Runtime environment: Elyra requires access to the Kubernetes cluster where Kubeflow Pipelines or Apache Airflow is running.
- GitHub repository or GitLab project: For Apache Airflow, Elyra requires access to the GitHub repository or GitLab project that is specified in the runtime configuration.
- Component definitions for custom components: Elyra utilizes catalog connectors to locate and load component definitions. The connectors must be able to communicate with the configured catalog, or you will not be able to submit or export pipelines. For example, if a pipeline utilizes a component that is stored in a URL component catalog, the component’s URL must be accessible via an anonymous HTTP request.
- S3-compatible cloud storage for generic components: During pipeline export or submission Elyra uploads pipeline artifacts to an S3 bucket. These artifacts are downloaded to the pipeline runtime environment when the pipeline is executed.
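The anonymous-access requirement for URL-based component definitions can be verified before a pipeline is submitted or exported. The following is a minimal sketch, not part of Elyra: the helper name `is_anonymously_readable` is hypothetical, and the check only confirms that a plain HTTP `GET` without credentials returns a 2xx status from the machine where it runs.

```python
import http.client
import urllib.error
import urllib.request


def is_anonymously_readable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an unauthenticated HTTP GET for `url` yields a 2xx status."""
    try:
        # No Authorization header or cookie is attached to this request.
        request = urllib.request.Request(url, method="GET")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers HTTP 4xx/5xx responses, connection failures, timeouts,
        # and malformed URLs.
        return False
```

Run the check from the environment where JupyterLab is installed, for example `is_anonymously_readable("http://catalog.example.internal/components/filter_text.yaml")`, where the host and path are placeholders for your own component URL.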
Runtime environment dependencies
During pipeline execution in the Kubeflow Pipelines or Apache Airflow environment, access to the following dependencies is required:
Container registry: All pipeline nodes are executed in containers. The runtime environment must have read access to every registry (e.g. Docker Hub) that stores a container image referenced by a generic or custom component in the pipeline.
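To find out which registries the runtime environment needs read access to, it can help to derive the registry host from each image reference used in a pipeline. The sketch below is an illustration only, with hypothetical helper names and example image names. It assumes the common Docker convention that the first path segment is a registry host only if it contains a `.` or `:` or equals `localhost`, and that every other reference resolves to Docker Hub (`docker.io`).

```python
from typing import Iterable, List


def registry_of(image_ref: str) -> str:
    """Return the registry host an image reference points to."""
    first, separator, _rest = image_ref.partition("/")
    # A leading segment is treated as a registry host only if it looks like one.
    if separator and ("." in first or ":" in first or first == "localhost"):
        return first
    # References such as "python:3.9" or "org/image:tag" default to Docker Hub.
    return "docker.io"


def required_registries(image_refs: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated registry hosts for a set of images."""
    return sorted({registry_of(ref) for ref in image_refs})


# Example image references (illustrative, not taken from a real pipeline).
images = [
    "python:3.9",
    "tensorflow/tensorflow:2.8.0",
    "quay.io/org/image:1.0",
    "registry.internal:5000/team/image:latest",
]
print(required_registries(images))
```

Each host in the result must be reachable from the Kubernetes cluster, either directly or through a mirror registry inside the air-gapped network.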
Elyra runtime artifacts: When processing pipeline nodes that are implemented using generic components, Elyra downloads a few dependencies to the container. By default these dependencies are located in a release-specific branch in the Elyra GitHub repository:
- https://raw.githubusercontent.com/elyra-ai/elyra/main/etc/kfp/pip.conf
- https://raw.githubusercontent.com/elyra-ai/elyra/main/elyra/kfp/bootstrapper.py
- https://raw.githubusercontent.com/elyra-ai/elyra/main/elyra/airflow/bootstrapper.py
- https://raw.githubusercontent.com/elyra-ai/elyra/main/etc/generic/requirements-elyra.txt
In air-gapped environments you must store a copy of these files in a location that is accessible via an anonymous HTTP GET request and configure the following environment variables in the environment where JupyterLab is running:
- For Kubeflow Pipelines:
- For Apache Airflow:
- For Kubeflow Pipelines:
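When planning the internal copies of the Elyra runtime artifacts, it is useful to work out up front which internal URL replaces each default URL. The sketch below assumes a hypothetical internal web server at `http://mirror.example.internal/elyra` and hypothetical helper names. It only computes the URL mapping; copying the files and assigning the resulting URLs to the environment variables for your runtime remain separate steps.

```python
from typing import Dict

# Default locations of the Elyra runtime artifacts.
UPSTREAM_PREFIX = "https://raw.githubusercontent.com/elyra-ai/elyra/main/"
UPSTREAM_URLS = [
    UPSTREAM_PREFIX + "etc/kfp/pip.conf",
    UPSTREAM_PREFIX + "elyra/kfp/bootstrapper.py",
    UPSTREAM_PREFIX + "elyra/airflow/bootstrapper.py",
    UPSTREAM_PREFIX + "etc/generic/requirements-elyra.txt",
]


def mirror_url(upstream_url: str, mirror_base: str) -> str:
    """Map a default artifact URL to its location under `mirror_base`."""
    if not upstream_url.startswith(UPSTREAM_PREFIX):
        raise ValueError(f"not an Elyra runtime artifact URL: {upstream_url}")
    # Keep the repository-relative path so that the two bootstrapper.py
    # files (kfp and airflow) do not overwrite each other on the mirror.
    relative_path = upstream_url[len(UPSTREAM_PREFIX):]
    return mirror_base.rstrip("/") + "/" + relative_path


def mirror_plan(mirror_base: str) -> Dict[str, str]:
    """Return a mapping of every default URL to its mirrored URL."""
    return {url: mirror_url(url, mirror_base) for url in UPSTREAM_URLS}


# Hypothetical mirror location; replace with your own server.
for source, target in mirror_plan("http://mirror.example.internal/elyra").items():
    print(source, "->", target)
```

Preserving the relative path is a design choice of this sketch; any layout works as long as each file ends up at a distinct URL that answers an anonymous HTTP GET request.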
S3-compatible cloud storage for generic components: When processing pipeline nodes that are implemented using generic components, Elyra downloads the pipeline artifacts that were uploaded when the pipeline was exported or submitted.