This architecture uses OCI Streaming to ingest continuous, high-volume streams of data and deliver them to Object Storage through Oracle's Service Connector Hub.
The stored data can then be used with Oracle Data Science, which builds, trains, and manages machine learning (ML) models in Oracle Cloud Infrastructure.
Data Flow runs your Apache Spark applications.
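To make the ingestion side more concrete: the Streaming service expects each message's key and value to be base64-encoded. The following is a minimal sketch of building such a payload in the shell. The helper names `b64` and `stream_payload`, the sample key and value, and the environment variable names in the trailing comment are illustrative assumptions, not part of this repo.

```shell
#!/bin/sh
# b64: base64-encode a string without a trailing newline or line wrapping.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# stream_payload: hypothetical helper that builds a one-message JSON array
# with a base64-encoded key and value.
stream_payload() {
  printf '[{"key":"%s","value":"%s"}]\n' "$(b64 "$1")" "$(b64 "$2")"
}

# Sample key and value (made up for illustration).
stream_payload "sensor-1" '{"temperature": 21.5}'

# The payload could then be sent with the OCI CLI, for example (assumed
# invocation; verify the flags with `oci streaming stream message put --help`):
#   oci streaming stream message put --stream-id "$STREAM_OCID" \
#     --endpoint "$MESSAGES_ENDPOINT" --messages "$(stream_payload k v)"
```

Once messages reach the stream, Service Connector Hub moves them to the bucket without further client code.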
- Permission to manage the following types of resources in your Oracle Cloud Infrastructure tenancy: vcns, nat-gateways, route-tables, subnets, service-gateways, security-lists, stream, stream-pull, stream-push, stream-pools, serviceconnectors, dataflow-family, and functions-family.
- Quota to create the following resources: 1 VCN, 1 subnet, 1 Internet Gateway, 1 NAT Gateway, 1 Service Gateway, 2 route rules, 1 stream/stream pool, 1 Fn App, 1 Fn Function, 2 Buckets, 1 Data Flow App, 1 Data Science Project/Notebook and 1 Service Connector Hub.
If you don't have the required permissions and quota, contact your tenancy administrator. See Policy Reference, Service Limits, and Compartment Quotas.
Note: This Resource Manager stack creates a set of policies and two dynamic groups so that an administrator can deploy the solution. They are listed in the policies.tf file, which you can use as a reference when adapting this deployment to your own IAM configuration.
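If you need to write the prerequisite permissions by hand, OCI policy statements follow the pattern `Allow group <group> to manage <resource-type> in compartment <compartment>`. The sketch below prints one statement per resource type from the prerequisites. The group name `StreamingDeployers` and the compartment name `ml-streaming` are hypothetical placeholders, and the policies in policies.tf remain the authoritative reference.

```shell
#!/bin/sh
# Hypothetical names: replace with your own IAM group and compartment.
GROUP="StreamingDeployers"
COMPARTMENT="ml-streaming"

# Resource types copied from the prerequisites list.
RESOURCE_TYPES="vcns nat-gateways route-tables subnets service-gateways security-lists stream stream-pull stream-push stream-pools serviceconnectors dataflow-family functions-family"

# Print one policy statement per resource type, using the "manage" verb
# to mirror the wording of the prerequisite.
policy_statements() {
  for rt in $RESOURCE_TYPES; do
    printf 'Allow group %s to manage %s in compartment %s\n' "$GROUP" "$rt" "$COMPARTMENT"
  done
}

policy_statements
```

The output can be pasted into a policy in the Console. Your administrator may prefer narrower verbs for some of these resource types.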
- If you aren't already signed in, enter the tenancy and user credentials when prompted.
- Review and accept the terms and conditions.
- Select the region where you want to deploy the stack.
- Follow the on-screen prompts and instructions to create the stack.
- After creating the stack, click Terraform Actions, and select Plan.
- Wait for the job to be completed, and review the plan.
  To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.
- If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
- Navigate to your Service Connector Hub Instance (Analytics & AI -> Messaging -> Service Connector Hub).
- Scroll down the page and click the three create buttons for the policies required for the Service Connector Hub.
The OCI Terraform Provider is now available for automatic download through the Terraform Provider Registry. For more information on how to get started, view the documentation and setup guide.
Now, you'll want a local copy of this repo. You can make that with the commands:
git clone https://github.com/oracle-quickstart/oci-arch-adw-oac
cd oci-arch-adw-oac/odi-data-science
ls
First off, you'll need to do some pre-deploy setup. That's all detailed here.
Additionally, you'll need to set up Docker and Fn Project on your machine:
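# Become root for the package installation.
sudo su -
yum update
yum install yum-utils
# Quote the pattern so the shell does not expand it against local file names.
yum-config-manager --enable '*addons'
yum install docker-engine
# Create the docker group, restart the daemon, and let the opc user run Docker.
groupadd docker
service docker restart
usermod -a -G docker opc
chmod 666 /var/run/docker.sock
# Leave the root shell.
exit
# Install the Fn Project CLI.
curl -LSs https://raw.githubusercontent.com/fnproject/cli/master/install | sh
# Log out; the new docker group membership applies from the next login.
exit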
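After logging back in, it can help to confirm that the tools this guide relies on are on your PATH. The following is a small sketch using `command -v`. The helper name `require_tools` is illustrative and not part of the repo, and the check only reports what it finds without installing anything.

```shell
#!/bin/sh
# require_tools: hypothetical helper that reports which commands are missing.
# Returns 0 when every named command is found on PATH, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Tools used in this guide.
require_tools docker fn terraform || echo "Install the missing tools before deploying."
```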
Next, create a terraform.tfvars file and populate it with the following information:
# Authentication
tenancy_ocid = "<tenancy_ocid>"
user_ocid = "<user_ocid>"
fingerprint = "<finger_print>"
private_key_path = "<pem_private_key_path>"
# Region
region = "<oci_region>"
# Compartment
compartment_ocid = "<compartment_ocid>"
# Object Storage
bucket_namespace = "<enter_tenancy_name_here>"
# OCIR
ocir_user_name = "<ocir_user_name>"
ocir_user_password = "<ocir_user_password>"
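Every value in the template above is an angle-bracket placeholder, so a quick check before deploying can catch entries that were left unfilled. This is a minimal sketch that assumes the file is named terraform.tfvars and sits in the current directory. The function name `unfilled_vars` is hypothetical.

```shell
#!/bin/sh
# unfilled_vars: hypothetical helper that lists variables in a tfvars file
# whose value is still an angle-bracket placeholder such as "<tenancy_ocid>".
# Prints the variable names; returns 1 if any are found, 0 otherwise.
unfilled_vars() {
  file="$1"
  # Match lines of the form: name = "<...>"
  names=$(sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*=[[:space:]]*"<[^"]*>".*$/\1/p' "$file")
  if [ -n "$names" ]; then
    printf '%s\n' "$names"
    return 1
  fi
  return 0
}

# Run the check only if the file exists in the current directory.
if [ -f terraform.tfvars ]; then
  unfilled_vars terraform.tfvars || echo "Fill in the variables listed above before running terraform."
fi
```

The check only looks for the placeholder pattern. It does not validate that an OCID, fingerprint, or key path is actually correct.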
Deploy:
terraform init
terraform plan
terraform apply
Follow steps 8-10 listed above under "Deploy Using Oracle Resource Manager" to create the Service Connector Hub policies.
When you no longer need the deployment, you can run this command to destroy it:
terraform destroy