Deploying Open Data Hub with Elyra

In this example we will show how to deploy the Elyra image to be used with Open Data Hub.
The installation consists of deploying both the ODH Operator and Kubeflow in the same cluster.

Requirements

  • An OpenShift Cluster
    • Since we will be installing both ODH and Kubeflow, the preferred resource requirements are:
    • 16 GB of memory, 6 CPUs, and 45 GB of disk space
  • oc (OpenShift Command Line Interface)
  • kfctl (Kubeflow Deployment Tool)
    • The latest releases can be found in the kfctl GitHub repository
    • Extract the kfctl binary from the tar archive into a directory on your PATH (see the sketch below)
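
For example, on Linux the binary can be fetched and placed on your PATH roughly as follows (the release version and tarball name are examples only; check the releases page for the current ones):

# Download a kfctl release tarball (version/filename are examples)
curl -LO https://github.com/kubeflow/kfctl/releases/download/v1.2.0/kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
# Extract the kfctl binary and move it onto your PATH
tar -xzf kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
sudo mv kfctl /usr/local/bin/
# Verify the installation
kfctl version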

Install Kubeflow on OpenShift
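
Follow the ODH Kubeflow installation documentation linked under Additional Resources below. As a rough sketch, a kfctl-based install looks like the following; the KfDef manifest URL here is only a placeholder assumption, so use the one given in the official instructions:

# Create a working directory for the Kubeflow deployment
mkdir kubeflow-install && cd kubeflow-install
# Apply an OpenShift-specific KfDef manifest with kfctl (URL is a placeholder)
kfctl apply -V -f https://raw.githubusercontent.com/opendatahub-io/manifests/v1.0-branch-openshift/kfdef/kfctl_openshift.yaml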

Accessing the Kubeflow Pipelines Main Dashboard

  • After deploying Kubeflow to your OpenShift cluster, set up port forwarding to your Kubeflow installation
  • On your workstation, run oc port-forward svc/istio-ingressgateway -n istio-system 8080:80 & (see the sketch after this list)
  • You should be able to reach the Kubeflow Pipelines Dashboard by navigating to: http://localhost:8080/pipeline/#/pipelines
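
Put together, the port-forward and a quick reachability check might look like this (the curl check is just an optional sanity test):

# Forward the Istio ingress gateway to localhost:8080 (runs in the background)
oc port-forward svc/istio-ingressgateway -n istio-system 8080:80 &
# Optional: confirm the Pipelines UI responds before opening it in a browser
curl -sI http://localhost:8080/pipeline/ | head -n 1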

Installing the Open Data Hub Operator on OpenShift

After installing the requirements on our local workstation, we want to install the ODH Operator in our OpenShift cluster.

  • Open your web browser to the OpenShift Dashboard and navigate to the Projects page under the Home dropdown in the left-side menu
  • Click on Create Project and give it a name, e.g. elyra or odh
  • After creating the Project navigate to the OperatorHub page and search for opendatahub
  • Open the Open Data Hub tile, click on Install, and leave all options at their default settings.
  • Once the Operator has finished installing, navigate to the Installed Operators page under the Operators dropdown and click on ‘Open Data Hub’
  • Click on Create KfDef, then select ‘YAML View’. You should now see a default configuration. Replace it with the following:
    NOTE: Make sure to fill in the namespace field with the Project name you created earlier
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  annotations:
    kfctl.kubeflow.io/force-delete: 'false'
  name: opendatahub
  namespace: INSERT YOUR PROJECT NAME HERE
spec:
  applications:
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: odh-common
      name: odh-common
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: radanalyticsio/spark/cluster
      name: radanalyticsio-cluster
    - kustomizeConfig:
        repoRef:
          name: manifests
          path: radanalyticsio/spark/operator
      name: radanalyticsio-spark-operator
    - kustomizeConfig:
        parameters:
          - name: s3_endpoint_url
            value: s3.odh.com
        repoRef:
          name: manifests
          path: jupyterhub/jupyterhub
      name: jupyterhub
    - kustomizeConfig:
        overlays:
          - additional
        repoRef:
          name: manifests
          path: jupyterhub/notebook-images
      name: notebook-images
  repos:
    - name: kf-manifests
      uri: >-
        https://github.com/opendatahub-io/manifests/tarball/v1.0-branch-openshift
    - name: manifests
      uri: 'https://github.com/opendatahub-io/odh-manifests/tarball/v0.8.0'
  version: v0.8.0
status: {} 
  • This KfDef configuration will install a minimal deployment of Open Data Hub.
    NOTE: The Argo Workflow controller is explicitly not included in the KfDef due to a conflict with the Argo controller that comes with a standard installation of Kubeflow. The removal is only required when both Open Data Hub and Kubeflow are installed in the same namespace.


  • Click on ‘Create’
  • The Open Data Hub Operator should now be installing a basic deployment of JupyterHub on ODH.
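
To watch the rollout from the command line, something like the following can be used (the project name elyra is an assumption; substitute your own):

# Check the KfDef resource created through the Operator
oc get kfdef -n elyra
# Watch the JupyterHub and related pods come up
oc get pods -n elyra -w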

Accessing the ODH JupyterHub Landing/Spawner Page

  • There are many ways to access the landing page. In this example we assume a default installation with no modifications to the network services for the application, e.g. using Istio or opening up NodePorts.
  • In your OpenShift Dashboard, click on your username in the upper right corner and then on Copy Login Command. A new page should open; click on Display Token and copy the oc login command.
  • On your local workstation, paste the oc login command; this will allow you to control your cluster from your workstation.
  • Open a proxy to your cluster with oc proxy &; this will run in the background (the full sequence is sketched after this list).
  • Navigate to the landing page in your browser. NOTE: Replace the project name in the URL with your own: http://localhost:8001/api/v1/namespaces/INSERT_PROJECT_NAME/services/http:jupyterhub:8080/proxy
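
The full sequence on a workstation would look roughly like this (the token, server, and project name are placeholders):

# Log in to the cluster with the copied command (token and server are placeholders)
oc login --token=<YOUR_TOKEN> --server=https://<YOUR_CLUSTER_API>:6443
# Start a local proxy to the cluster API in the background
oc proxy &
# Then open the JupyterHub landing page, substituting your project name:
#   http://localhost:8001/api/v1/namespaces/<PROJECT_NAME>/services/http:jupyterhub:8080/proxy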

Using Elyra with Open Data Hub

  • In the JupyterHub landing/spawner page, ensure you set the following environment variable before starting a notebook
NAME: COS_BUCKET
VALUES: A-Z, a-z, 0-9, -, .
DESCRIPTION: The bucket that your artifacts will be sent to after notebook execution. This can be modified in the Elyra Metadata Editor at runtime. Default value: 'default'

Accessing Default Object Storage

  • When using the default metadata runtime, pipeline artifacts will be sent to the Minio S3 object storage instance that is installed as part of Kubeflow Pipelines
  • Set up port forwarding to Minio with the following command:
oc port-forward svc/minio-service -n kubeflow 9000:9000 &
  • You should be able to reach the Minio Dashboard in your web browser by navigating to localhost:9000
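
If you prefer the command line, bucket contents can also be listed over the forwarded port, for example with the AWS CLI. The credentials below are the usual defaults for the Minio instance bundled with Kubeflow Pipelines, but verify them against the secret in the kubeflow namespace:

# Forward Minio locally (as above), then list buckets via the S3 API
oc port-forward svc/minio-service -n kubeflow 9000:9000 &
# Default credentials are typically minio / minio123; confirm with:
#   oc get secret mlpipeline-minio-artifact -n kubeflow -o yaml
AWS_ACCESS_KEY_ID=minio AWS_SECRET_ACCESS_KEY=minio123 \
  aws --endpoint-url http://localhost:9000 s3 ls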

Additional Resources and Documentation

ODH Installation Docs
ODH Kubeflow Installation Docs