GH action to generate report #199

Draft
wants to merge 94 commits into
base: main
Changes from 10 commits
Commits
d16abb2
Inital commit to add GH action to generate report
asmacdo Sep 25, 2024
713d64c
Assume Jupyterhub Provisioning Role
asmacdo Sep 25, 2024
519360c
Fixup: indent
asmacdo Sep 25, 2024
e6f4814
Rename job
asmacdo Sep 25, 2024
72496f4
Add assumed role to update-kubeconfig
asmacdo Sep 25, 2024
8428d3a
No need to add ProvisioningRole to masters
asmacdo Sep 25, 2024
e170b59
Deploy a pod to the cluster, and schedule with Karpenter
asmacdo Sep 25, 2024
bfce046
Fixup: correct path to pod manifest
asmacdo Sep 25, 2024
0993129
Fixup again ugh, rename file
asmacdo Sep 25, 2024
87027d2
Delete Pod even if previous step times out
asmacdo Sep 25, 2024
686f686
Hack out initial du
asmacdo Oct 11, 2024
ff52971
tmp comment out job deployment, test dockerhub build
asmacdo Nov 8, 2024
ca6db89
Fixup hyphens for image name
asmacdo Nov 8, 2024
d228f9d
Write file to output location
asmacdo Nov 8, 2024
68f707f
use kubectl cp to retrieve report
asmacdo Nov 8, 2024
ad6b589
Combine run blocks to use vars
asmacdo Nov 8, 2024
f18e8b7
Mount efs and pass arg to du script
asmacdo Nov 8, 2024
387cfc1
Comment out repo pushing, lets see if the report runs
asmacdo Nov 8, 2024
04b4193
Restrict job to asmacdo for testing
asmacdo Nov 8, 2024
a443081
Sanity check. Just list the directories
asmacdo Nov 8, 2024
99ac264
Job was deployed, but never assigned to node, back to sanity check
asmacdo Nov 8, 2024
6ee89b2
change from job to pod
asmacdo Nov 8, 2024
a8f6ed3
deploy pod to same namespace as pvc
asmacdo Nov 8, 2024
664853b
Use ns in action
asmacdo Nov 8, 2024
e35c974
increase timeout to 60s
asmacdo Nov 8, 2024
a8af5f2
fixup: image name in manifest
asmacdo Nov 8, 2024
024cf6e
increase timeout to 150
asmacdo Nov 8, 2024
49c346e
override entrypoint so i can debug with exec
asmacdo Nov 8, 2024
0191c85
bound /home actually meant path was /home/home/asmacdo
asmacdo Nov 8, 2024
3eb9157
Create output dir prior to writing report
asmacdo Nov 8, 2024
676a00e
pod back to job
asmacdo Nov 11, 2024
c085751
Fixup use the correct job api
asmacdo Nov 11, 2024
3e18a37
Add namespace to pod retrieval
asmacdo Nov 11, 2024
0fa5ece
write directly to pv to test job
asmacdo Nov 11, 2024
e1ecbc3
fixup script fstring
asmacdo Nov 11, 2024
082d3cc
no retry on failure, we were spinning up 5 pods, lets just fail 1 time
asmacdo Nov 11, 2024
d46ea44
Fixup backup limit job not template
asmacdo Nov 11, 2024
965a81e
Initial report
asmacdo Nov 11, 2024
7366d2d
disable report
asmacdo Nov 11, 2024
747f0a4
deploy ec2 instance directly
asmacdo Dec 2, 2024
6156e21
Update AMI image
asmacdo Dec 2, 2024
588892c
update sg and subnet
asmacdo Dec 2, 2024
958630b
terminate even if job fails
asmacdo Dec 2, 2024
e24a666
debug: print public ip
asmacdo Dec 2, 2024
0e58f10
explicitly allocate public ip for ec2 instance
asmacdo Dec 2, 2024
5c28c0e
Add WIP scripts
asmacdo Dec 6, 2024
21811dd
rm old unused
asmacdo Dec 6, 2024
97de713
initial commit of scripts
asmacdo Dec 6, 2024
644f8c3
clean up launch script
asmacdo Dec 6, 2024
e176592
make scripe executable
asmacdo Dec 6, 2024
a101f18
fixup cleanup script
asmacdo Dec 6, 2024
615baf2
add a name to elastic ip (for easier manual cleanup)
asmacdo Dec 6, 2024
bb8f25a
Exit on fail
asmacdo Dec 6, 2024
a8a615a
Add permission for aws ec2 wait instance-status-ok
asmacdo Dec 6, 2024
8157a12
Upload scripts to instance
asmacdo Dec 6, 2024
d3f6f52
explicitly return
asmacdo Dec 6, 2024
f1f687f
output session variables to file
asmacdo Dec 11, 2024
a10bc2a
modify cleanup script to retrieve instance from temporary file
asmacdo Dec 11, 2024
7fd340d
All ec2 persmissions granted
asmacdo Dec 11, 2024
1649b35
Add EFS mount (hardcoded)
asmacdo Dec 11, 2024
30aa60c
No pager for termination
asmacdo Dec 11, 2024
cc845d4
force pseudo-terminal, otherwise hangs after yum install
asmacdo Dec 11, 2024
7854124
Add doublequotes to variable usage for proper expansion
asmacdo Dec 11, 2024
9fbad37
Fixup -t goes on ssh, not scp
asmacdo Dec 11, 2024
4fc9dde
Mount as a single command, since we dont have access to pty
asmacdo Dec 11, 2024
86e645e
add todos for manual steps
asmacdo Dec 11, 2024
c614004
Disable job for now
asmacdo Dec 11, 2024
5a207bc
Update AMI to ubuntu
asmacdo Dec 12, 2024
8ce97ee
Roll back to AL 2023
asmacdo Dec 12, 2024
f7fe412
drop gzip, just write json
asmacdo Dec 13, 2024
e9904c8
include target dir in relative paths
asmacdo Dec 13, 2024
7da2aae
Second script will not produce user report, but directory stats json
asmacdo Dec 13, 2024
41a65ed
inital algorithm hackout
asmacdo Dec 13, 2024
8eb0f06
Clean up and refactor for simplicity
asmacdo Dec 13, 2024
40947ef
Add basic tests
asmacdo Dec 13, 2024
0e9c065
test multiple directories in root
asmacdo Dec 13, 2024
e4794de
comment about [:-1]
asmacdo Dec 13, 2024
ee2c3b1
support abspaths
asmacdo Dec 14, 2024
e1dcd63
[DATALAD RUNCMD] blacken
asmacdo Dec 14, 2024
541f1f3
test propagation with files in all dirs
asmacdo Dec 14, 2024
ac364fb
Write files to disk as they are inspected
asmacdo Dec 15, 2024
05609a1
Comment out column headers in output
asmacdo Dec 15, 2024
7560db2
Write all fields for every file
asmacdo Dec 15, 2024
502ff76
Convert to reading tsv
asmacdo Dec 15, 2024
639f279
Fixup: update test to match tsv-read data
asmacdo Dec 15, 2024
96490e5
update for renamed script
asmacdo Dec 15, 2024
64d69a7
install pip
asmacdo Dec 15, 2024
6f3dae5
install parallel
asmacdo Dec 15, 2024
b38be7d
install dependencies in launch script
asmacdo Dec 15, 2024
f0b0709
Output to tmp, accept only 1 arg, target dir
asmacdo Dec 15, 2024
326bb55
add up sizes
asmacdo Dec 16, 2024
c881287
print useful info as index is created
asmacdo Dec 16, 2024
a3505f9
dont fail if output dir exists
asmacdo Dec 16, 2024
fcd9531
Create a report dict with only relevant stats
asmacdo Dec 16, 2024

20 changes: 20 additions & 0 deletions .github/manifests/hello-world-pod.yaml
@@ -0,0 +1,20 @@
# manifests/hello-world-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-world-pod
spec:
  containers:
  - name: hello
    image: busybox
    command: ['sh', '-c', 'echo Hello, World! && sleep 30']
  nodeSelector:
    NodeGroupType: default
    NodePool: default
    hub.jupyter.org/node-purpose: user
  tolerations:
  - key: "hub.jupyter.org/dedicated"
    operator: "Equal"
    value: "user"
    effect: "NoSchedule"

64 changes: 64 additions & 0 deletions .github/workflows/report.yaml
@@ -0,0 +1,64 @@
name: Generate Data Usage Report

on:
  pull_request:
    branches:
      - main
Member

Should we also run this on a weekly basis?

Member Author

It's on PR push just so I can test easily, but yes, 1/week sounds good to me. @kabilar Do you have a preference for what day/time?

Member

@kabilar kabilar Sep 26, 2024

Great, thank you. How about Mondays at 6am EST? We can then review the report on Monday mornings.
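
A weekly trigger along the lines discussed in this thread could be sketched as follows. GitHub Actions evaluates `schedule` cron expressions in UTC, so 6am EST (UTC-5) corresponds to 11:00 UTC; while daylight saving time is in effect the same expression fires at 7am Eastern. Keeping the `pull_request` trigger next to `schedule` is an assumption here, made only so the workflow stays easy to test from a PR.

```yaml
on:
  # Assumption: kept so the workflow can still be exercised from pull requests.
  pull_request:
    branches:
      - main
  schedule:
    # Fields: minute hour day-of-month month day-of-week (evaluated in UTC).
    # Mondays at 11:00 UTC, i.e. 06:00 EST (UTC-5).
    - cron: "0 11 * * 1"
```

Scheduled workflows run from the default branch, so a trigger like this would only take effect once the workflow file is merged into `main`.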


jobs:
  generate_data_usage_report:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v3
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          # TODO param region
          aws-region: us-east-2

      - name: Assume JupyterhubProvisioningRole
        # TODO param ProvisioningRoleARN and name ^
        run: |
          ROLE_ARN="arn:aws:iam::278212569472:role/JupyterhubProvisioningRole"
          CREDS=$(aws sts assume-role --role-arn $ROLE_ARN --role-session-name "GitHubActionsSession")
          export AWS_ACCESS_KEY_ID=$(echo $CREDS | jq -r '.Credentials.AccessKeyId')
          export AWS_SECRET_ACCESS_KEY=$(echo $CREDS | jq -r '.Credentials.SecretAccessKey')
          export AWS_SESSION_TOKEN=$(echo $CREDS | jq -r '.Credentials.SessionToken')


      - name: Configure kubectl with AWS EKS
        # TODO param name, region role-arn
        run: |
          aws eks update-kubeconfig --name eks-dandihub --region us-east-2 --role-arn arn:aws:iam::278212569472:role/JupyterhubProvisioningRole

      - name: Sanity check
        run: |
          kubectl get pods -n jupyterhub

      # Step 4: Deploy Hello World Pod from manifest
      - name: Deploy Hello World Pod
        run: |
          kubectl apply -f .github/manifests/hello-world-pod.yaml

      # Step 5: Wait for Pod to Complete
      - name: Wait for Hello World Pod to complete
        run: |
          kubectl wait --for=condition=Ready pod/hello-world-pod --timeout=300s  # 5 minutes
        continue-on-error: true  # Allow the workflow to continue even if this step fails

      # Step 6: Get Pod Logs to verify it ran successfully, only if Step 5 succeeds
      - name: Get Hello World Pod logs
        run: |
          kubectl logs hello-world-pod
        if: ${{ success() }}  # Only run this step if the previous step was successful

      # Step 7: Cleanup - Always run this step, even if previous steps fail
      - name: Delete Hello World Pod
        run: |
          kubectl delete pod hello-world-pod
        if: ${{ always() }}  # Always run this step, even if other steps fail
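
Two GitHub Actions behaviours are relevant to the steps in this diff. Each `run` step starts its own shell, so variables set with `export` in "Assume JupyterhubProvisioningRole" are not visible to later steps; in the workflow as written, it is the `--role-arn` flag on `aws eks update-kubeconfig` that makes `kubectl` use the role. In addition, `success()` still evaluates to true after a step marked `continue-on-error: true` has failed, so the log step is not actually gated on the wait. The fragment below is a minimal sketch of one way to handle both, not the PR's implementation. The step id `wait_pod` and the intermediate shell variable names are hypothetical, and it assumes `jq` is available on the runner.

```yaml
      - name: Assume JupyterhubProvisioningRole
        run: |
          ROLE_ARN="arn:aws:iam::278212569472:role/JupyterhubProvisioningRole"
          CREDS=$(aws sts assume-role --role-arn "$ROLE_ARN" --role-session-name "GitHubActionsSession")
          ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
          SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
          SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')
          # Mask the temporary secrets in the logs before passing them on.
          echo "::add-mask::$SECRET_ACCESS_KEY"
          echo "::add-mask::$SESSION_TOKEN"
          # Lines appended to $GITHUB_ENV become environment variables in later steps.
          {
            echo "AWS_ACCESS_KEY_ID=$ACCESS_KEY_ID"
            echo "AWS_SECRET_ACCESS_KEY=$SECRET_ACCESS_KEY"
            echo "AWS_SESSION_TOKEN=$SESSION_TOKEN"
          } >> "$GITHUB_ENV"

      - name: Wait for Hello World Pod to become Ready
        id: wait_pod  # hypothetical id, referenced by the next step
        run: kubectl wait --for=condition=Ready pod/hello-world-pod --timeout=300s
        continue-on-error: true

      - name: Get Hello World Pod logs
        # `outcome` is the step result before continue-on-error is applied.
        if: ${{ steps.wait_pod.outcome == 'success' }}
        run: kubectl logs hello-world-pod

      - name: Delete Hello World Pod
        if: ${{ always() }}
        run: kubectl delete pod hello-world-pod
```

If the assumed-role credentials are persisted this way, the `--role-arn` flag on `update-kubeconfig` is probably redundant, and keeping both would have the role try to assume itself; one of the two mechanisms is likely sufficient.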