Commit b2b3ad4

clubandersonvMaroon
authored and committed
implementing vuln scanner and h100 cluster deployment
1 parent e74f1d1 commit b2b3ad4

File tree

3 files changed

+145
-30
lines changed

.tekton/README.md

+26-12
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,10 @@
11
## 🛠️ CI/CD Pipeline Overview – Your Project
22

3-
This pipeline is designed to support safe, efficient, and traceable development and deployment workflows using OpenShift Pipelines-as-Code, GitHub, and Quay.io.
3+
<!-- NOTE TO CONTRIBUTORS: every repo in the hc4ai organization is intended to have the same contents in this file. The origin is the copy in https://github.ibm.com/mspreitz/hc4ai-hello-neural/blob/dev/.tekton/README.md; submit PRs against that one -->
4+
5+
This pipeline is designed to support safe, efficient, and traceable development and deployment workflows using [OpenShift Pipelines-as-Code](https://pipelinesascode.com/), [Tekton](https://tekton.dev/), [buildah](https://buildah.io/), GitHub, and Quay.io.
6+
7+
This pipeline is used for CI/CD of the `dev` and `main` branches. This pipeline runs from source through container image build to deployment and testing in the hc4ai cluster.
48

59
---
610

@@ -24,19 +28,28 @@ Each repo includes a `.version.json` file at its root. This file controls:
2428

2529
#### 🔑 Fields:
2630
- **dev-version**: Current version of the dev branch. Used to tag dev images.
27-
- **dev-registry**: Container registry location for development image pushes.
31+
- **dev-registry**: Container repository location for development image pushes.
2832
- **prod-version**: Managed by automation. Updated during promotion to match the dev-version.
29-
- **prod-registry**: Container registry for production image pushes. The promoted dev image is re-tagged and pushed here.
33+
- **prod-registry**: Container repository for production image pushes. The promoted dev image is re-tagged and pushed here.
3034

3135
The pipeline reads this file to:
3236
- Extract the appropriate version tag
33-
- Determine the correct registry for image pushes
37+
- Determine the correct repository for image pushes
3438
- Promote and tag dev images for prod
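
The way the pipeline uses `.version.json` can be illustrated with a minimal Python sketch. It only assumes the four documented fields (`dev-version`, `dev-registry`, `prod-version`, `prod-registry`). The function name `image_refs` and the example values are hypothetical, and the real pipeline performs this step in a Tekton task rather than in Python.

```python
import json


def image_refs(version_json_text):
    """Build the dev and prod image references from .version.json content."""
    cfg = json.loads(version_json_text)
    # dev builds are pushed to <dev-registry>:<dev-version>
    dev_ref = f"{cfg['dev-registry']}:{cfg['dev-version']}"
    # promotion copies the dev image to <prod-registry>:<prod-version>
    prod_ref = f"{cfg['prod-registry']}:{cfg['prod-version']}"
    return dev_ref, prod_ref


# Hypothetical file contents; these values are placeholders, not project settings.
example = json.dumps({
    "dev-version": "0.0.1",
    "dev-registry": "quay.io/example/app-dev",
    "prod-version": "0.0.1",
    "prod-registry": "quay.io/example/app",
})

dev_ref, prod_ref = image_refs(example)
print(dev_ref)
print(prod_ref)
```

Keeping both references derived from one file is what lets promotion copy an already-built image instead of rebuilding it.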
3539

3640
---
3741

42+
### Container Repositories
43+
44+
This pipeline maintains two container repositories for this GitHub repository, as follows.
45+
46+
- `quay.io/vllm-d/<repoName>-dev`. Holds builds from the `dev` branch as described below.
47+
- `quay.io/vllm-d/<repoName>`. Holds promotions to prod, as described below.
48+
49+
---
50+
3851
### ⚙️ Pipeline Triggers
39-
Triggered on `push` and `pull_request` events targeting the `dev` or `main` branches.
52+
Triggered on `push` and `pull_request` events targeting the `dev` or `main` branches. The following workflows are the two behaviors of this pipeline.
4053

4154
### 🔧 dev Branch Workflow
4255
1. Checkout repository
@@ -47,20 +60,20 @@ Triggered on `push` and `pull_request` events targeting the `dev` or `main` bran
4760
- prod-version
4861
- prod-registry
4962
4. Build and push container image to:
50-
→ `<dev-registry>:<dev-version>`
63+
→ `<dev-repository>:<dev-version>`
5164
5. Tag the Git commit using the `dev-version`
52-
6. Optionally redeploy objects to OpenShift in `hc4ai-operator-dev`
65+
6. Optionally redeploy objects to OpenShift in the `hc4ai-operator-dev` namespace.
5366

5467
✅ This process ensures that all code merged into dev is validated and deployed for testing.
5568

5669
### 🚀 main Branch Workflow
5770
1. Checkout, lint, test, and parse `.version.json`
5871
2. Skip image rebuild
5972
3. Promote image by copying from:
60-
→ `<dev-registry>:<dev-version>` → `<prod-registry>:<prod-version>`
73+
→ `<dev-repository>:<dev-version>` → `<prod-repository>:<prod-version>`
6174
4. Tag the Git commit using the `prod-version`
6275
5. Update the upstream repo's submodule to reference the new tag
63-
6. Redeploy to OpenShift in `hc4ai-operator`
76+
6. Redeploy to OpenShift in the `hc4ai-operator` namespace.
6477

6578
✅ No image rebuilds occur on main. Only validated dev images are promoted, ensuring reproducibility.
6679

@@ -84,8 +97,8 @@ Tags are created using the configured Git credentials and pushed to the remote r
8497

8598
### ☸️ OpenShift Deployment
8699
The pipeline includes automated deployment:
87-
- On `dev`: Deploys to `hc4ai-operator-dev`
88-
- On `main`: Deploys to `hc4ai-operator`
100+
- On `dev`: Deploys to the `hc4ai-operator-dev` namespace. The Pod is named `<repoName>-major-minor`, using the `dev-version` from `.version.json`.
101+
- On `main`: Deploys to the `hc4ai-operator` namespace. The Pod is named `<repoName>-major-minor`, using the `prod-version` from `.version.json`.
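
The `<repoName>-major-minor` naming rule can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual code: it assumes versions look like `major.minor.patch` with an optional leading `v`, and that the Pod name joins the parts with hyphens. The function name `pod_name` and the repository name `my-repo` are hypothetical.

```python
def pod_name(repo_name: str, version: str) -> str:
    """Derive a Pod name of the form <repoName>-<major>-<minor>."""
    # Assumption: an optional leading "v" is not part of the version numbers.
    cleaned = version[1:] if version.startswith("v") else version
    parts = cleaned.split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected major.minor[.patch], got {version!r}")
    return f"{repo_name}-{parts[0]}-{parts[1]}"


print(pod_name("my-repo", "0.1.3"))
print(pod_name("my-repo", "v2.5.0"))
```

On `dev` the version argument would come from `dev-version`, and on `main` from `prod-version`.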
89102

90103
Using `make uninstall-openshift` and `make install-openshift`, resources are cleanly reset.
91104

@@ -112,6 +125,7 @@ After deployment, the pipeline:
112125

113126
### 🧠 Why `.version.json` Matters
114127
- Decouples versioning from Git commit hashes
115-
- Provides a single source of truth for version and registry info
128+
- Provides a single source of truth for version and repository info
116129
- Enables deterministic builds and controlled releases
117130
- Simplifies debugging and auditing across environments
131+

.tekton/pipelinerun.yaml

+89-10
Original file line numberDiff line numberDiff line change
@@ -12,14 +12,6 @@ metadata:
1212
(!has(body.ref) || body.ref == 'refs/heads/main' || body.ref == 'refs/heads/dev') &&
1313
(!has(body.head_commit) || !has(body.head_commit.author) || !body.head_commit.author.name.matches("(?i).*ci-tag-bot.*")) &&
1414
(!has(body.pull_request) || (body.pull_request.base.ref == 'main' || body.pull_request.base.ref == 'dev'))
15-
results.tekton.dev/columns: |
16-
[
17-
{
18-
"name": "Vulnerabilities",
19-
"type": "string",
20-
"jsonPath": ".status.pipelineResults[?(@.name==\"vulnerabilities\")].value"
21-
}
22-
]
2315
spec:
2416
podTemplate:
2517
serviceAccountName: pipeline
@@ -39,6 +31,10 @@ spec:
3931
- name: source_branch
4032
value: "{{ source_branch }}"
4133
pipelineSpec:
34+
results:
35+
- description: The common vulnerabilities and exposures (CVE) result
36+
name: SCAN_OUTPUT
37+
value: $(tasks.vulnerability-scan.results.SCAN_OUTPUT)
4238
params:
4339
- name: repo_url
4440
- name: revision
@@ -162,6 +158,90 @@ spec:
162158
- name: source
163159
workspace: source
164160

161+
- name: openshift-redeploy-h100
162+
when:
163+
- input: "$(params.runOptional)"
164+
operator: in
165+
values: ["true"]
166+
- input: "$(params.source_branch)"
167+
operator: in
168+
values: ["dev", "main"]
169+
- input: "$(tasks.read-cluster-name.results.cluster-name)"
170+
operator: notin
171+
values: ["cluster-platform-eval"]
172+
taskRef:
173+
name: openshift-redeploy-task
174+
params:
175+
- name: source-branch
176+
value: "$(params.source_branch)"
177+
- name: prod-version
178+
value: "$(tasks.extract-version-and-registry.results.prod-version)"
179+
- name: dev-version
180+
value: "$(tasks.extract-version-and-registry.results.dev-version)"
181+
- name: prod_image_tag_base
182+
value: "$(tasks.extract-version-and-registry.results.prod-image-tag-base)"
183+
- name: dev_image_tag_base
184+
value: "$(tasks.extract-version-and-registry.results.dev-image-tag-base)"
185+
runAfter:
186+
- extract-version-and-registry
187+
workspaces:
188+
- name: source
189+
workspace: source
190+
191+
- name: go-test-post-deploy-h100
192+
when:
193+
- input: "$(params.runOptional)"
194+
operator: in
195+
values: ["true"]
196+
- input: "$(params.source_branch)"
197+
operator: in
198+
values: ["dev", "main"]
199+
taskRef:
200+
name: go-test-post-deploy-task
201+
params:
202+
- name: source-branch
203+
value: "$(params.source_branch)"
204+
- name: prod-version
205+
value: "$(tasks.extract-version-and-registry.results.prod-version)"
206+
- name: dev-version
207+
value: "$(tasks.extract-version-and-registry.results.dev-version)"
208+
- name: prod_image_tag_base
209+
value: "$(tasks.extract-version-and-registry.results.prod-image-tag-base)"
210+
- name: dev_image_tag_base
211+
value: "$(tasks.extract-version-and-registry.results.dev-image-tag-base)"
212+
runAfter:
213+
- openshift-redeploy-h100
214+
workspaces:
215+
- name: source
216+
workspace: source
217+
218+
- name: benchmark-h100
219+
when:
220+
- input: "$(params.source_branch)"
221+
operator: in
222+
values: ["dev"]
223+
continueOn:
224+
errors: true
225+
params:
226+
- name: openshift_host
227+
value: "https://api.fmaas-vllm-d.fmaas.res.ibm.com:6443"
228+
- name: openshift_namespace
229+
value: "hc4ai-operator-dev"
230+
taskRef:
231+
name: benchmark-task
232+
runAfter:
233+
- go-test-post-deploy-h100
234+
235+
- name: pipeline-complete-dev-h100
236+
when:
237+
- input: "$(params.source_branch)"
238+
operator: in
239+
values: ["dev"]
240+
runAfter:
241+
- benchmark-h100
242+
taskRef:
243+
name: noop-task
244+
165245
- name: promote-to-prod
166246
when:
167247
- input: "$(params.runOptional)"
@@ -234,7 +314,7 @@ spec:
234314
- name: IMAGE_URL
235315
value: "$(tasks.buildah-build.results.image-url)"
236316
- name: SEVERITY
237-
value: "CRITICAL,HIGH"
317+
value: "CRITICAL,HIGH,MEDIUM,LOW"
238318
- name: ARGS
239319
value: "--exit-code 0"
240320
workspaces:
@@ -250,7 +330,6 @@ spec:
250330
values: ["cluster-platform-eval"]
251331
runAfter:
252332
- promote-to-prod
253-
# - buildah-build
254333
- vulnerability-scan
255334
taskRef:
256335
name: noop-task
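
The `when` guards on the `openshift-redeploy-h100` task added in this diff can be read as a conjunction: Tekton runs a task only if every `when` expression holds. The Python sketch below models that rule for the `in` and `notin` operators used here. The helper names `when_passes` and `redeploy_h100_runs` are hypothetical and exist only for illustration.

```python
def when_passes(expressions):
    """Return True only if every when-expression evaluates to true."""
    for expr in expressions:
        hit = expr["input"] in expr["values"]
        if expr["operator"] == "in":
            ok = hit
        elif expr["operator"] == "notin":
            ok = not hit
        else:
            raise ValueError(f"unsupported operator: {expr['operator']}")
        if not ok:
            return False
    return True


def redeploy_h100_runs(run_optional, source_branch, cluster_name):
    """Mirror the three guards of the openshift-redeploy-h100 task."""
    return when_passes([
        {"input": run_optional, "operator": "in", "values": ["true"]},
        {"input": source_branch, "operator": "in", "values": ["dev", "main"]},
        {"input": cluster_name, "operator": "notin",
         "values": ["cluster-platform-eval"]},
    ])


print(redeploy_h100_runs("true", "dev", "some-other-cluster"))
print(redeploy_h100_runs("true", "dev", "cluster-platform-eval"))
```

Read this way, the H100 redeploy is skipped whenever the cluster name is `cluster-platform-eval`, even if the branch and `runOptional` guards pass.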

.tekton/vuln-scan-trivy.yaml

+30-8
Original file line numberDiff line numberDiff line change
@@ -2,26 +2,30 @@ apiVersion: tekton.dev/v1
22
kind: Task
33
metadata:
44
name: trivy-scan
5+
annotations:
6+
task.output.location: results
7+
task.results.format: application/json
8+
task.results.key: SCAN_OUTPUT
59
spec:
610
params:
711
- name: IMAGE_URL
812
type: string
913
description: Full image URL (e.g., quay.io/org/image:tag)
1014
- name: SEVERITY
1115
type: string
12-
default: "CRITICAL,HIGH"
16+
default: "CRITICAL,HIGH,MEDIUM"
1317
description: Comma-separated severity levels
1418
- name: ARGS
1519
type: string
1620
default: ""
1721
description: Additional Trivy arguments
22+
results:
23+
- name: SCAN_OUTPUT
24+
description: CVE result format
1825
workspaces:
1926
- name: registry-secret
2027
description: Workspace with Docker config.json (auth for private registries)
2128
- name: output
22-
results:
23-
- name: vulnerabilities
24-
type: string
2529
steps:
2630
- name: trivy-scan
2731
image: docker:20.10.24-dind
@@ -57,11 +61,29 @@ spec:
5761
$(params.ARGS) \
5862
"$IMAGE" > /workspace/output/trivy-results.json; then
5963
echo "❌ Trivy scan failed"
60-
echo -n "-1" > $(results.vulnerabilities.path)
64+
echo -n "-1" > /tekton/results/vulnerabilities
6165
exit 1
6266
fi
6367
64-
echo "📊 Counting vulnerabilities..."
65-
vuln_count=$(jq '[.Results[].Vulnerabilities[]?] | length' /workspace/output/trivy-results.json)
68+
echo "📋 Trivy scan result:"
69+
cat /workspace/output/trivy-results.json
70+
71+
echo "📊 Parsing vulnerabilities..."
72+
73+
vuln_count=$(jq '[.Results[].Vulnerabilities[]?] | length // 0' /workspace/output/trivy-results.json)
6674
echo "📊 Found $vuln_count vulnerabilities"
67-
echo -n "$vuln_count" > /tekton/results/vulnerabilities
75+
76+
if [ "$vuln_count" -gt 0 ]; then
77+
# Parse the vulnerabilities and ensure that missing categories are assigned zero count
78+
jq -rce '
79+
{
80+
vulnerabilities: {
81+
critical: ([.Results[].Vulnerabilities[]? | select(.Severity == "CRITICAL")] | length),
82+
high: ([.Results[].Vulnerabilities[]? | select(.Severity == "HIGH")] | length),
83+
medium: ([.Results[].Vulnerabilities[]? | select(.Severity == "MEDIUM")] | length),
84+
low: ([.Results[].Vulnerabilities[]? | select(.Severity == "LOW")] | length)
85+
}
86+
}' /workspace/output/trivy-results.json > /tekton/results/SCAN_OUTPUT
87+
else
88+
echo "📊 No vulnerabilities found, skipping parsing."
89+
fi
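
The per-severity aggregation performed by the `jq` filter above can be mirrored in a short Python sketch. It assumes only the report shape the filter itself relies on: a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field. The function name `summarize` and the sample report are hypothetical. Unlike the shell step, which skips parsing when the count is zero, this sketch emits an all-zero summary in that case.

```python
import json

SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def summarize(report):
    """Count vulnerabilities per severity; missing categories stay at zero."""
    counts = {sev.lower(): 0 for sev in SEVERITIES}
    for result in report.get("Results") or []:
        # Some results carry no "Vulnerabilities" key, like the jq `[]?` case.
        for vuln in result.get("Vulnerabilities") or []:
            key = str(vuln.get("Severity", "")).lower()
            if key in counts:
                counts[key] += 1
    return {"vulnerabilities": counts}


# Hypothetical report fragment for illustration only.
sample = {
    "Results": [
        {"Vulnerabilities": [{"Severity": "HIGH"}, {"Severity": "LOW"},
                             {"Severity": "HIGH"}]},
        {"Target": "no-findings-here"},
    ]
}

# Compact JSON, comparable to the output of `jq -c`.
print(json.dumps(summarize(sample), separators=(",", ":")))
```

Writing the summary unconditionally would keep a `SCAN_OUTPUT` value available for clean images as well. Whether the pipeline needs that is a design choice left to the maintainers.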
