Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added openscope docker #208

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

rcpeene
Copy link

@rcpeene rcpeene commented Nov 26, 2024

I made the changes discussed here;
#202

I think I did everything required, but it is not exactly clear. Let me know if any more changes are required

Comment on lines +1 to +22
FROM ubuntu:22.04
# base requirements
RUN apt-get update
RUN apt-get install -y coreutils
RUN apt-get install -y libgl1-mesa-glx
RUN apt-get install -y libglib2.0-0
RUN apt-get install -y python3 python3-pip
RUN apt-get install -y git

RUN git config --global --add safe.directory /__w/openscope_databook/openscope_databook

# copy databook setup files
COPY requirements.txt ./openscope_databook/requirements.txt
COPY setup.py ./openscope_databook/setup.py
COPY README.md ./openscope_databook/README.md
COPY LICENSE.txt ./openscope_databook/LICENSE.txt
COPY databook_utils ./openscope_databook/databook_utils

# for reasons I don't understand, these must be installed before the rest the requirements
RUN pip install numpy cython
# set up databook dependencies
RUN pip install -e ./openscope_databook[dev]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

providing this suggestion as something that could simplify this. i'm not sure those root installs are necessary, but perhaps worth building and trying out this image

Suggested change
FROM ubuntu:22.04
# base requirements
RUN apt-get update
RUN apt-get install -y coreutils
RUN apt-get install -y libgl1-mesa-glx
RUN apt-get install -y libglib2.0-0
RUN apt-get install -y python3 python3-pip
RUN apt-get install -y git
RUN git config --global --add safe.directory /__w/openscope_databook/openscope_databook
# copy databook setup files
COPY requirements.txt ./openscope_databook/requirements.txt
COPY setup.py ./openscope_databook/setup.py
COPY README.md ./openscope_databook/README.md
COPY LICENSE.txt ./openscope_databook/LICENSE.txt
COPY databook_utils ./openscope_databook/databook_utils
# for reasons I don't understand, these must be installed before the rest the requirements
RUN pip install numpy cython
# set up databook dependencies
RUN pip install -e ./openscope_databook[dev]
FROM quay.io/jupyter/datascience-notebook:2024-02-06
USER root
RUN apt-get update && apt-get install -y coreutils \
libgl1-mesa-glx libglib2.0-0 \
&& rm -rf /var/lib/apt/lists/* && rm -rf /tmp/*
# apt-get may result in root-owned directories/files under $HOME
RUN mkdir /opt/extras && chown -R $NB_UID:$NB_GID $HOME /opt/extras
USER $NB_USER
RUN cd /tmp \
&& git clone https://github.com/AllenInstitute/openscope_databook.git \
&& pip install --no-cache-dir -e './openscope_databook[dev]'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @satra @rcpeene. This work for me for the most part. There is just a dependency conflict with the openscope_databook requirements.

Error message

133.0 ERROR: Cannot install None and databook_utils because these package versions have conflicting dependencies.
133.0 
133.0 The conflict is caused by:
133.0     databook-utils 1.1.0 depends on torch
133.0     suite2p 0.12.1 depends on torch==1.13.1
133.0 
133.0 To fix this you could try to:
133.0 1. loosen the range of package versions you've specified
133.0 2. remove package versions to allow pip attempt to solve the dependency conflict
133.0 
133.0 ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
------
openscope-docker.Dockerfile:14
--------------------
  13 |     
  14 | >>> RUN cd /tmp \
  15 | >>>     && git clone https://github.com/AllenInstitute/openscope_databook.git \
  16 | >>>     && pip install --no-cache-dir -e './openscope_databook[dev]'
--------------------
ERROR: failed to solve: process "/bin/bash -o pipefail -c cd /tmp     && git clone https://github.com/AllenInstitute/openscope_databook.git     && pip install --no-cache-dir -e './openscope_databook[dev]'" did not complete successfully: exit code: 1

@rcpeene Could we unpin suite2p from 0.12.1, which itself pins torch==1.13.1? The latest version of suite2p 0.14.4 uses torch>=1.13.1.

I was going to submit a pull request to the openscope_databook repo with the above suggestion but there were some other issues installing the requirements, so I couldn't fully test this change.

Thanks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one issue is with git cloneing the repo that @satra included to retrieve the necessary files for the environment. Firstly, this interferes with The Databooks ability to copy over a particular branch's notebooks when launching the databook, and with using the docker to run our test suite on a particular branch within Github workflows.

I can test the rest of these changes and update PR if works

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcpeene - we can encode a specific committish

&& git checkout <committish> \

right after the clone statement.

i would actually highly recommend that. this way you preserve the exact state of the repo that's used to generate the docker image, and updating the committish would retrigger the docker build.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a way to pull the container from a registry rather then maintain it in the repo? If I make changes to the container used for my other platforms, this one may have to be manually updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can. it's just that we want to make sure we can trust it (as the tools in the container have access to the user's environment - could contain secrets, keys, etc.,.).

also that it has to work in a jupyterhub environment.

as long as we have a good way of determining that we can simply point to a container from a different registry.

@asmacdo
Copy link
Member

asmacdo commented Dec 3, 2024

+1 These images should all be based on a jupyter image

I'll add some docs to the dandi handbook for how these should be created and tested.
#223

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants