Plug for users custom building new images! #3133

Open
balajialg opened this issue Jan 14, 2022 · 6 comments

balajialg commented Jan 14, 2022

Summary

I was recently in a Datahub onboarding meeting with an instructor who teaches an advanced class called Urban Informatics and Visualization. He made a request that I think deserves the team's attention from a product direction perspective. He felt that isolating students from git-related stuff works well at an introductory level, but it does not help once they move to a professional setting where they are expected to set up their own local development environment. So he is interested in using Datahub only as a fallback, for the edge cases where students cannot set up their environment locally.

In addition, he was curious whether students could make custom modifications to the conda environment in their own instance of Datahub without the infra team's intervention. It reminded me of the justice innovation hub use case that @yuvipanda is working on. It looks like instructors from a few advanced courses would benefit from this capability. I will watch out for other instructors who express interest in such a use case.

User Stories

  • As an instructor, I want students to create their own custom environment (like conda) in Datahub without the intervention of the infra team so that they learn the nitty-gritty of setting up their own environment locally!

fperez commented Jan 26, 2022

I'd flip this around: our hubs should be a perfect environment for full-time work, in-depth. And then we can teach students how to replicate the experience locally as well.

That's the path I'm pursuing in 159, and I will explore it in depth. Happy to share my reasoning and approach with other campus faculty.

balajialg commented Jan 26, 2022

@fperez That will be a fantastic use case! We will highlight your class as a case study to any faculty interested in doing something similar, and we can also connect you with them (maybe at the end of the semester, if you are interested and have the time).

On another note, do you have a summary of the different activities students will perform using Datahub in this course? I have been trying to create a Datahub feature matrix and classify the features according to the complexity of the use case. I just started on this draft, and your input will guide my thinking about what constitutes an advanced use case.

fperez commented Jan 27, 2022

Certainly @balajialg, happy to!

Activities:

  • The base environment will be JupyterLab, pinned at 3.2.8 (unless we find a critical bug or major desired feature).
  • Using git/github extensively - all assignments are done via Github Classroom.
  • Collaborating as teams through Github repos, with issues, PRs, etc.
  • Using the JupyterLab RTC features for teamwork, combined with syncthing for file sharing.
  • Running notebooks, much like D8/100 for data analysis and programming tasks.
  • Writing pure python code in scripts.
  • Possibly compiling some Cython/C code at the command line.
  • Developing documentation and building it with jupyterbook/sphinx.
  • Manipulating documents with LaTeX and Pandoc.
  • Developing small python packages, perhaps even up to posting on pypi (not sure yet).
  • Creating and sharing new conda environments for projects.
  • Working with hdf5 and/or netcdf data files.
  • Building code in public repos with binder support.
  • [If time permits] Doing some simple distributed computing tasks with Dask/xarray.
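
For the item above about creating and sharing new conda environments, here is a minimal sketch (not taken from the course materials) of how a shareable `environment.yml` could be generated. The helper `render_environment_yml`, the environment name `stat159-project`, and the `conda-forge` channel are assumptions made for illustration; only the JupyterLab 3.2.8 pin comes from the list above.

```python
def render_environment_yml(name, dependencies, channels=("conda-forge",)):
    """Return the text of a conda environment file.

    Hypothetical helper: it only formats the three standard top-level
    keys (name, channels, dependencies) as plain YAML text.
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in dependencies]
    return "\n".join(lines) + "\n"


# Assumed project name and channel; the JupyterLab pin mirrors the list above.
env_text = render_environment_yml(
    "stat159-project",
    ["python", "jupyterlab=3.2.8"],
)
print(env_text)
```

A teammate could then save this text as `environment.yml` and recreate the environment with `conda env create -f environment.yml`.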

So far I'm teaching them how to work fully in JupyterLab + terminal, but I am starting to seriously consider the benefit of adding the virtual desktop package to our image to let them access some GUI applications occasionally. Not sure yet, but I'm starting to lean heavily towards yes, as I see more and more the value of them finding that the hub is a "suitable home for everything", and sometimes you do need GUI apps for certain tasks on the hub itself (QGIS is a good example of that need, but there are many more).

If we decide to do more virtual desktop work, I think this would raise the pressure on updating the base image to something more recent than 18.04 (I understand the desire to stick to LTS images, but it would be even nicer to have the system be flexible enough to pick say a 21.04 or 21.10 image for a 2022 course, and simply refresh a year later). Having a very old base image isn't a big deal if all you're doing is running notebooks, but it's much more of a hindrance if you start using a bunch of Linux GUI tools and they are all horribly outdated.

balajialg commented Jan 28, 2022

@fperez Wow, these are fascinating use cases! Thank you so much for taking the time to write the summary of student activities. It is super inspiring, and I will definitely use this info to highlight the hub's possibilities to interested faculty. I recently heard that the Dean of Civil Engineering reached out to multiple teams to gather solutions in case they choose to move their labs to the cloud. They also have a few GUI-based applications that students will work on during the lab. Your class would be an ideal case study for them.

[You may already know this, but still] the EECS hub already has the Linux Desktop Environment enabled, as you can observe through this PR. It seems likely that the Stat 159 hub will have a configuration similar to the EECS hub (with reduced compute, considering the number of students).

Tagging @yuvipanda @felder regarding @fperez's point about upgrading the base image from 18.04 to 21.04/21.10: will this upgrade have any impact on the deployment of other hubs?

fperez commented Jan 28, 2022

Thanks @balajialg, happy to connect with others on campus about this, even before the end of the term.

And yes, in fact I used the EECS apt.txt file as the starting point of my PR :) Yuvi merged/deployed it today and I'm already using it (and filing issues about it ;).

This is becoming a fantastic setup!

balajialg commented Jan 29, 2022

Thanks a lot, @fperez! Will keep you posted.
