GabrielFeodorov/kubeflow1.8.0

Kubeflow on OCI OKE

This quickstart template deploys Kubeflow on Oracle Kubernetes Engine (OKE).

Pre-Requisites

Please read the following prerequisites sections thoroughly prior to deployment.

Instance Principals & IAM Policy

Deployment relies on Instance Principals: the OCI CLI uses them to generate the kubeconfig for kubectl. Create a dynamic group for the compartment where you are deploying Kubeflow. This example uses a Default Tag, applied to all resources in the target compartment, as the matching rule for the Dynamic Group:

tag.Kubeflow.InstancePrincipal.value='Enabled'

After creating the group, you should set specific IAM policies for OCI service interaction:

Allow dynamic-group Kubeflow to manage cluster-family in compartment Kubeflow
Allow dynamic-group Kubeflow to manage object-family in compartment Kubeflow
Allow dynamic-group Kubeflow to manage virtual-network-family in compartment Kubeflow
Allow dynamic-group Kubeflow to manage file-family in compartment Kubeflow

This allows interaction with the OKE cluster using instance principals, and lets Kubeflow interact with Object Storage, File Systems, and OCI Vaults.
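As a rough sketch, the policy statements above and the kubeconfig step they enable could be scripted as follows. The function names are hypothetical and are not part of this template, and the cluster OCID and region are placeholders you must supply. The sketch assumes the OCI CLI is installed on an instance that belongs to the dynamic group.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not taken from this template.

# Print the four policy statements as a JSON array, the format accepted by
# `oci iam policy create --statements`. Group and compartment names default
# to the "Kubeflow" names used in this README.
kubeflow_policy_statements() {
    group=${1:-Kubeflow}
    compartment=${2:-Kubeflow}
    first=1
    printf '['
    for family in cluster-family object-family virtual-network-family file-family; do
        [ "$first" -eq 1 ] || printf ','
        first=0
        printf '"Allow dynamic-group %s to manage %s in compartment %s"' \
            "$group" "$family" "$compartment"
    done
    printf ']\n'
}

# Generate a kubeconfig using instance principal authentication.
# $1 = cluster OCID (placeholder), $2 = region identifier (placeholder).
kubeflow_kubeconfig() {
    cluster_ocid=$1
    region=$2
    # Exported so the token command embedded in the kubeconfig also uses
    # instance principals when kubectl invokes it later.
    export OCI_CLI_AUTH=instance_principal
    oci ce cluster create-kubeconfig \
        --cluster-id "$cluster_ocid" \
        --region "$region" \
        --file "$HOME/.kube/config" \
        --token-version 2.0.0 \
        --auth instance_principal
}
```

The JSON array could then be passed to `oci iam policy create` together with `--compartment-id`, `--name`, and `--description` values of your choosing; those values are not specified by this README.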

Kubeflow access and Oracle IDCS Authentication

Reserved Public IP

Deployment depends on a reserved public IP for the Load Balancer. This IP is used to create the certificates and, if you decide to use Oracle IDCS, to configure authentication in the IDCS application. See Create a Reserved Public IP.

Authentication using Oracle IDCS

  1. Create an Oracle IDCS Integration Application
  • Navigate to Oracle Identity Domains in the OCI Console and click on your current domain.
  • Select Integrated Applications from the left-side menu and click Add application.
  • Select Confidential Application and click Launch workflow.
  • Add a name and description and click on Next.
  • Configure OAuth
    • Under Resource server configuration, select Skip for later.
    • Under Client configuration, select Configure this application as a client now.
    • Check the boxes for Client credentials and Authorization code.
    • Redirect URL: add https://kubeflow.<reserved_public_ip>.nip.io/dex/callback
    • Scroll down to Client IP address and select Anywhere.
    • Under Token issuance policy, for Authorized resources, select All.
    • Click on Next.
  • Under Web tier policy, select Skip and do later, then click Finish.
  • Click on Activate to activate your application.
  • On the left side of your Application select Users or Groups to authorize users or groups to authenticate using this Application.
  2. Collect your application information for the deployment. You will need the application's Client ID and Client secret, and your OCI Domain URL.
  • Client ID and Client secret
    • On your Application page, select OAuth configuration from the left side.
    • Under General Information
      • Note down Client ID
      • Under Client secret click on Show secret and note it down.
  • OCI Domain URL
    • Go to Oracle Identity Domains and click on your current domain.
    • Under Domain Information you will find Domain URL. Note it down.
  3. Enable Oracle authentication when deploying the ORM stack for Kubeflow.
  • In the Configure variables page of the stack
  • Under Kubeflow Configuration
  • Check the box for Configure authentication with Oracle IDCS
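
The Redirect URL entered in the OAuth configuration is derived from the reserved public IP. nip.io resolves any hostname of the form `<name>.<ip>.nip.io` to `<ip>`, so no DNS record is needed. A minimal sketch for building the value to paste into the form, with hypothetical function names and an address from the documentation range used as a stand-in:

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of this template.

# Loose IPv4 shape check (four dot-separated groups of 1-3 digits).
# It does not verify that each group is <= 255.
is_ipv4() {
    echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Print the Dex callback URL for a reserved public IP.
kubeflow_redirect_url() {
    is_ipv4 "$1" || { echo "not an IPv4 address: $1" >&2; return 1; }
    printf 'https://kubeflow.%s.nip.io/dex/callback\n' "$1"
}

# Example with a placeholder address:
kubeflow_redirect_url 203.0.113.10
# prints https://kubeflow.203.0.113.10.nip.io/dex/callback
```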

Deployment

This deployment uses Oracle Resource Manager and consists of a VCN, a Mount Target, an OKE Cluster with a Node Pool, and an Edge node. The Edge node installs the OCI CLI, kubectl, and Kustomize. Kustomize is used to build the Kubeflow manifests, which are then deployed to OKE using kubectl. This is done using cloud-init, and the build process is logged in /var/log/OKE-kubeflow-initialize.log.
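
The cloud-init script itself is not reproduced in this README. As an illustration only, a build-and-apply step of the kind described above is often wrapped in a retry loop, because custom resources can fail to apply until their CRDs are registered. The function name, the `example` manifest directory, and the retry defaults below are assumptions, not values taken from this template.

```shell
#!/bin/sh
# Sketch only: a hypothetical helper, not the script this template runs.

# Build manifests with Kustomize and apply them with kubectl, retrying on failure.
# $1 = manifest directory (assumed default: example)
# $2 = maximum attempts, $3 = seconds to sleep between attempts
apply_kubeflow_manifests() {
    manifest_dir=${1:-example}
    max_attempts=${2:-10}
    delay=${3:-10}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Capture the build output first so a failed build is not hidden
        # by the exit status of the pipe into kubectl.
        if manifests=$(kustomize build "$manifest_dir") &&
           printf '%s\n' "$manifests" | kubectl apply -f -; then
            return 0
        fi
        echo "Attempt $attempt failed; retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}
```

To follow the real build while it runs on the Edge node, `tail -f /var/log/OKE-kubeflow-initialize.log` shows the log as it is written.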

Note that you should select shapes and scale your node pool as appropriate for your workload.

This template deploys the following by default:

  • Virtual Cloud Network
    • Public (Edge) Subnet
    • Private Subnet
    • File System subnet
    • Internet Gateway
    • NAT Gateway
    • Service Gateway
    • Route tables
    • Security Lists
      • TCP 22 for Edge SSH on public subnet
      • Ingress to both subnets from VCN CIDR
      • Egress to Internet for both subnets
  • Mount Target
  • OCI Virtual Machine Edge Node
  • OKE Cluster and Node Pool
  • Load Balancer

Click the Deploy to OCI button to create an ORM stack, then walk through the menu-driven deployment. Once the stack is created, use the menu to Plan and Apply the template.

Deploy to Oracle Cloud

OKE post-deployment

Allow 10-12 minutes for the cloud-init script to install and configure everything.

You can check the status of the Kubeflow components on the OKE cluster using the following kubectl commands:

kubectl get pods -n cert-manager
kubectl get pods -n istio-system
kubectl get pods -n auth
kubectl get pods -n knative-eventing
kubectl get pods -n knative-serving
kubectl get pods -n kubeflow
kubectl get pods -n kubeflow-user-example-com
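
To avoid reading through every pod list, the same namespaces can be checked in a loop that shows only pods that are neither Running nor Succeeded. This is a sketch with a hypothetical function name. It filters on pod phase only, and a pod in CrashLoopBackOff can still report phase Running, so treat an empty result as a first pass rather than proof of health.

```shell
#!/bin/sh
# Sketch only: the function name is illustrative.

# Namespaces taken from the kubectl commands listed in this README.
KUBEFLOW_NAMESPACES="cert-manager istio-system auth knative-eventing knative-serving kubeflow kubeflow-user-example-com"

# For each namespace, list pods whose phase is neither Running nor Succeeded.
pending_kubeflow_pods() {
    for ns in $KUBEFLOW_NAMESPACES; do
        echo "== $ns =="
        kubectl get pods -n "$ns" \
            --field-selector=status.phase!=Running,status.phase!=Succeeded
    done
}
```

Call `pending_kubeflow_pods` from the Edge node once the kubeconfig is in place.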

Kubeflow Access

To find the Kubeflow URL, SSH to the Edge node and search the initialization log:

ssh -i ~/.ssh/PRIVATE_KEY opc@EDGE_NODE_IP
grep -i "Point your browser to" /var/log/OKE-kubeflow-initialize.log

Note: The certificate created for this deployment is self-signed, so the browser will issue a warning that you need to accept.

Log in with either the default user's credentials or Oracle IDCS. The default email address is user@example.com, and the password is the one provided in ORM (the default is Kubeflow54321).
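
If you want the URL on its own, for example to pass it to another command, the log search can be narrowed to just the address. This sketch assumes the "Point your browser to" line contains a URL starting with http:// or https://. The exact log format is not shown in this README, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes the log line contains an http(s) URL.

# Print the first URL found on a "Point your browser to" line.
# $1 = log file (defaults to the path given in this README)
kubeflow_url_from_log() {
    log_file=${1:-/var/log/OKE-kubeflow-initialize.log}
    grep -i "Point your browser to" "$log_file" |
        grep -Eo 'https?://[^[:space:]]+' |
        head -n 1
}
```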

Destroying the Stack

Because the deployment includes an SSL Load Balancer, you need to remove the istio-ingressgateway service before you destroy the stack; otherwise the destroy job will fail with an error.

kubectl delete svc istio-ingressgateway -n istio-system

This removes the service, after which you can destroy the stack without errors.
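
A slightly more defensive variant of the delete step is sketched below. It tolerates the service already being gone, waits for the deletion to finish, and then lists what remains in the namespace so you can confirm no LoadBalancer service is left before running Destroy. The function name and the 120s default timeout are assumptions.

```shell
#!/bin/sh
# Sketch only: the function name and default timeout are illustrative.

# Delete the ingress gateway service and show the remaining services.
# $1 = how long to wait for the deletion to complete (default: 120s)
remove_istio_ingress() {
    kubectl delete svc istio-ingressgateway -n istio-system \
        --ignore-not-found --wait=true --timeout="${1:-120s}" || return 1
    # List what is left so you can confirm no LoadBalancer service remains.
    kubectl get svc -n istio-system
}
```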
