Reattaching to Dev Containers host agent never succeeds if install directory involves a symlink. #245248

Open
JohnBossom opened this issue Mar 31, 2025 · 4 comments

@JohnBossom

JohnBossom commented Mar 31, 2025

  • VSCode Version: 1.98.2
  • Local OS Version: macOS Sequoia Version 15.3.1
  • Remote OS Version: Red Hat Enterprise Linux release 8.9
  • Remote Extension/Connection Type: Dev Containers/WSL/Remote - Tunnels/VS Code Server
  • Logs:

We have discovered a bug in the client-side extension.js code that manages the reuse of an already running host agent. The bug only occurs if the default install path involves a symlink. With a default install, the Dev Containers host agent runs from

~/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78

However, due to limited home directory space, it is normal practice here to redirect the vscode install
directories away from one's home directory to a larger directory. For example:

~/.vscode-server -> /other/location/vscode-server/.vscode-server
~/.vscode-remote-containers -> /other/location/vscode-server/.vscode-remote-containers

This practice long predates the introduction of remote.SSH.serverInstallPath.
In addition, Dev Containers does not use this setting.

What this scenario has uncovered is a bug in the client-side extension.js code:

~/.vscode/extensions/ms-vscode-remote.remote-containers-0.401.0/dist/extension/extension.js

There is a function called Jue that obtains the list of already running host agents. The list is determined dynamically by reading /proc/<pid>/..., including the cwd of the main shell process group for the host agent (more on that later). The value of cwd is set to the output of readlink, so all symlinks in it are expanded.
To determine whether an already running host agent can be reused, its cwd is compared to two values labelled installFolder and sharedInstallFolder.
However, installFolder holds the original, unexpanded install location, ~/.vscode-server/bin/dd...

Since the values of cwd and installFolder do not match, the code decides it cannot reuse the host agent and attempts to end the entire process group (aka tree) by issuing

kill -9 -<pgrp>
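
The mismatch described above can be reproduced locally. The following is a minimal sketch, not the extension's code: it assumes a POSIX system with Node.js, uses a throwaway temp directory in place of ~/.vscode-server, and uses `process.cwd()` as a stand-in for reading /proc/<pid>/cwd. The names `realDir`, `naiveMatch`, and `resolvedMatch` are made up for the illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway layout: <base>/real is the actual install location and
// <base>/link is a symlink to it (standing in for ~/.vscode-server).
const base = fs.mkdtempSync(path.join(fs.realpathSync(os.tmpdir()), 'symlink-demo-'));
const realDir = path.join(base, 'real');
const installFolder = path.join(base, 'link');
fs.mkdirSync(realDir);
fs.symlinkSync(realDir, installFolder, 'dir');

// Enter the directory through the symlink, then ask for the working
// directory. The reported cwd has the symlink expanded.
const previousCwd = process.cwd();
process.chdir(installFolder);
const cwd = process.cwd();
process.chdir(previousCwd);

// Comparing against the configured path fails; comparing against the
// symlink-expanded path succeeds.
const naiveMatch = cwd === installFolder;
const resolvedMatch = cwd === fs.realpathSync(installFolder);

console.log({ cwd, installFolder, naiveMatch, resolvedMatch });
fs.rmSync(base, { recursive: true, force: true });
```

Here `naiveMatch` is false and `resolvedMatch` is true, which mirrors why the unpatched comparison never finds a reusable host agent.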

I was able to fix the reuse of the host agent by modifying the client-side `extension.js` to perform the comparison against the output of `readlink -f installFolder` or `readlink -f sharedInstallFolder`.

I ran extension.js through a JavaScript pretty printer first, so the line numbers in the diff refer to the formatted file:

diff -r /Users/jebossom/extension.js /Users/jebossom/.vscode/extensions/ms-vscode-remote.remote-containers-0.401.0/dist/extension/extension.js
39386a39387,39412
> 
>    if ( l ) {
>          let {
>               stdout : l_real
>          } = await d.exec(`readlink -f ${l}`);
>          if ( l_real ) {
>               l_real = l_real.trim();
>               if ( l !== l_real ) {
>                    e.output.write(`Using readlink path ${JSON.stringify(l_real,void 0,"  ")}`, 1);
>                    l = l_real;
>               }
>          }
>    }
>    if ( u ) {
>          let {
>               stdout : u_real
>          } = await d.exec(`readlink -f ${u}`);
>          if ( u_real ) {
>               u_real = u_real.trim();
>               if ( u !== u_real ) {
>                    e.output.write(`Using readlink path ${JSON.stringify(u_real,void 0,"  ")}`, 1);
>                    u = u_real;
>               }
>          }
>    }
> 

Without the fix above, the host agent is always terminated and restarted every time you close its client window on your laptop.
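
The patch repeats the same logic for both folders and interpolates the path into the shell command unquoted. A possible cleanup is sketched below. It is not the extension's code: `shellQuote` and `resolveInstallFolder` are hypothetical helpers, and `exec` is assumed to be an async function that takes a command string and returns an object with a `stdout` property, as `d.exec` appears to in the diff.

```javascript
// Quote a string so it is passed to a POSIX shell as a single argument.
function shellQuote(arg) {
  return `'${String(arg).replace(/'/g, `'\\''`)}'`;
}

// Resolve symlinks in `folder` with `readlink -f`. Falls back to the
// original value when the folder is unset or the command prints nothing.
// `exec` (assumed): async (command) => ({ stdout })
// `log` (assumed): optional callback for the trace message
async function resolveInstallFolder(exec, folder, log = () => {}) {
  if (!folder) return folder;
  const { stdout } = await exec(`readlink -f ${shellQuote(folder)}`);
  const resolved = (stdout || '').trim();
  if (resolved && resolved !== folder) {
    log(`Using readlink path ${JSON.stringify(resolved)}`);
    return resolved;
  }
  return folder;
}
```

With such a helper, the two blocks in the diff would reduce to something like `l = await resolveInstallFolder((cmd) => d.exec(cmd), l, (msg) => e.output.write(msg, 1))` and the same call for `u`. Quoting matters only if an install path can contain spaces or shell metacharacters, which is an assumption on my part.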

This is reproducible with the off-the-shelf podman container, ubi8/ubi-init
Steps to Reproduce:

  1. On your remote host, obtain an off-the-shelf podman container:
podman pull registry.access.redhat.com/ubi8/ubi-init
  2. Set up your install folders for vscode to be symlinks to some other location:
~/.vscode-server -> /nobackup/jebossom/vscode-server/.vscode-server
~/.vscode-remote-containers -> /nobackup/jebossom/vscode-server/.vscode-remote-containers
  3. Tailor the following bash script to use this RHEL container and provide mounts for the locations:
#!/bin/bash
USER=jebossom
podman run \
    --init \
    -it \
    --rm \
    --name vscode-test-runc \
    --userns keep-id \
    -v /users/${USER}:/users/${USER} \
    -v /nobackup/${USER}:/nobackup/${USER} \
    --runtime=/usr/bin/runc \
    registry.access.redhat.com/ubi8/ubi-init \
    /bin/bash

NOTE:

  • /users/${USER} is our home directory and
  • /nobackup/${USER} is where we redirect the vscode installation.
  • We explicitly choose the OCI runtime runc here... I will be opening another bug regarding the use of the OCI runtime crun.
  4. Run the above bash script on the server. It will start an instance of the podman container called vscode-test-runc.
  5. On your laptop, set dev.containers.logLevel to trace.
  6. On the laptop, attach to your remote host.
  7. From your attached window, start a Dev Containers window with Dev Containers: Attach to Running Container...
  8. Once it completes successfully, look at the process tree in the container. From the server:
ps x -o pid,ppid,pgid,sess,args
    PID    PPID    PGID    SESS COMMAND
      1       0       1       1 /run/podman-init -- /bin/bash
      7       1       7       1 /bin/bash
     46       0      46      46 /bin/sh
    125       0     125     125 /bin/sh
    145     125     125     125 /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node /tmp/vscode-remote-containers-server-b03495d6-7d6b-4e63-a3c7-4df4d1895029.js
    252       0     252     252 sh /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/bin/code-server --log trace --force-disable-user-env --server-data-dir /users/jebossom/.vscode-server --telemetry-level off --accept-server-license-
    263     252     252     252 /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/out/server-main.js --log trace --for
    275       0     275     275 /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node -e  ....const net = require('net'); ....const fs = require('fs'); ....process.stdin.pause(); ....const client = net.createConnection({ host: '127.
    290       0     290     290 /bin/sh -c # Watch installed extensions ...trap "exit 0" 15 ...old=`ls -A --full-time` ...counter=0 ...while [ $counter -lt 60 ] ...do ....sleep 1 ....new=`ls -A --full-time` ....if [ "$new" != "$old" ] ....then .....exit 1 ...
    315       0     315     315 /bin/sh -c # Watch machine settings ...trap "exit 0" 15 ...old=`ls -A --full-time settings.json 2>/dev/null || true` ...counter=0 ...while [ $counter -lt 60 ] ...do ....sleep 1 ....new=`ls -A --full-time settings.json 2>/dev/nu
    325       0     325     325 /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node -e  ....const net = require('net'); ....const fs = require('fs'); ....process.stdin.pause(); ....const client = net.createConnection({ host: '127.
    337     263     252     252 /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/out/bootstrap-fork --type=ptyHost --
    348     263     252     252 /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node --dns-result-order=ipv4first /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/out/boo
    459     348     252     252 /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/node /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/extensions/json-language-features/se
    498     315     315     315 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1
    505     290     290     290 /usr/bin/coreutils --coreutils-prog-shebang=sleep /usr/bin/sleep 1
    506       7     506       1 ps x -o pid,ppid,pgid,sess,args
  9. Now, close the client-side Dev Containers window and attempt to reconnect to the same container. Looking through the logs, you will find:
[14839 ms] Running extension host agents:
[
  {
    "pid": "252",
    "ppid": "0",
    "pgrp": "252",
    "cwd": "/nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78",
    "mntNS": "mnt:[4026532866]",
    "cmd": "sh /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/bin/code-server --log trace --force-disable-user-env --server-data-dir /users/jebossom/.vscode-server --telemetry-level off --accept-server-license-terms --host 127.0.0.1 --port 0 --connection-token-file /users/jebossom/.vscode-server/data/Machine/.connection-token-ddc367ed5c8936efe395cffeec279b04ffd7db78 --extensions-download-dir /users/jebossom/.vscode-server/extensionsCache --start-server  --skip-requirements-check ",
    "env": {
      "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "TERM": "xterm",
      "container": "oci",
      "SHELL": "/bin/sh",
      "HOME": "/users/jebossom"
    }
  }
]
[14839 ms] Terminating old extension host agent.
[14839 ms] Start: Run in container: kill -9 -252
[15037 ms] 
[15037 ms] 
[15037 ms] Stop (198 ms): Run in container: kill -9 -252

Notice that the value of cwd is the full path to the host agent's install directory with symlinks expanded.
Notice that the code determines it cannot re-attach to this host agent and issues kill -9 -252 to kill the entire process tree.
  10. Now, close the Dev Containers window again. We will reuse the same running container, but this time swap out the client-side extension.js for the patched version that expands symlinks in installFolder before comparing it to cwd.
You will notice a new log message, Using readlink path..., indicating that the patch expanded the symlinks in installFolder before the comparison. You will also notice that the code now successfully reuses the container's host agent: Extension host agent is already running.

[14949 ms] Running extension host agents:
[
  {
    "pid": "1999",
    "ppid": "0",
    "pgrp": "1999",
    "cwd": "/nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78",
    "mntNS": "mnt:[4026532866]",
    "cmd": "sh /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78/bin/code-server --log trace --force-disable-user-env --server-data-dir /users/jebossom/.vscode-server --telemetry-level off --accept-server-license-terms --host 127.0.0.1 --port 0 --connection-token-file /users/jebossom/.vscode-server/data/Machine/.connection-token-ddc367ed5c8936efe395cffeec279b04ffd7db78 --extensions-download-dir /users/jebossom/.vscode-server/extensionsCache --start-server  --skip-requirements-check ",
    "env": {
      "PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
      "TERM": "xterm",
      "container": "oci",
      "SHELL": "/bin/sh",
      "HOME": "/users/jebossom"
    }
  }
]
[14949 ms] Start: Run in container: readlink -f /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78
[15139 ms] /nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78
[15139 ms] 
[15139 ms] Stop (190 ms): Run in container: readlink -f /users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78
[15139 ms] Using readlink path "/nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78"
[15139 ms] Extension host agent is already running.
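
The two log excerpts above can be summarized as a small decision function. This is a reconstruction from the behavior described in this issue, not the actual (minified) logic of Jue. `planForAgent` and `resolve` are hypothetical names, the agent entry is taken from the first log, and `resolve` is a hard-coded stand-in for `readlink -f` using the symlink from this report.

```javascript
// Decide what to do with a discovered host agent.
// `resolve` maps a path to its symlink-expanded form; the default leaves
// paths untouched, which corresponds to the unpatched behavior.
function planForAgent(agent, installFolder, sharedInstallFolder, resolve = (p) => p) {
  const candidates = [installFolder, sharedInstallFolder]
    .filter(Boolean)
    .map(resolve);
  if (candidates.includes(agent.cwd)) {
    return { action: 'reuse', pid: agent.pid };
  }
  // A negative target tells kill to signal the whole process group.
  return { action: 'terminate', command: `kill -9 -${agent.pgrp}` };
}

const agent = {
  pid: '252',
  pgrp: '252',
  cwd: '/nobackup/jebossom/vscode-server/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78',
};
const installFolder =
  '/users/jebossom/.vscode-server/bin/ddc367ed5c8936efe395cffeec279b04ffd7db78';

// Stand-in for `readlink -f`, hard-coded to the symlink from this report.
const resolve = (p) =>
  p.replace(
    '/users/jebossom/.vscode-server',
    '/nobackup/jebossom/vscode-server/.vscode-server'
  );

console.log(planForAgent(agent, installFolder, undefined));
console.log(planForAgent(agent, installFolder, undefined, resolve));
```

Without `resolve` the plan is to terminate with `kill -9 -252`, as in the first log. With it, the plan is to reuse the agent, as in the second.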

Does this issue occur when you try this locally?: N/A
Does this issue occur when you try this locally and all extensions are disabled?: N/A

@JohnBossom
Author

I will be opening a subsequent bug that I discovered when the OCI runtime for podman on RHEL is switched from the older runc (Go) implementation to the newer crun implementation. That is, if you repeat the above without my patch, you will not be able to re-attach to the existing container unless you manually kill all of the host agent's processes in the container. This is because the extension code assumes it can set up a process group in order to kill the entire process tree; however, all processes in the host agent's process tree have PGID and SESS set to 0.

@JohnBossom
Author

Please feel free to reach out to me at jebossom@cisco.com should you require further information. I had to use my personal email address since I couldn't log in with my work email for some reason.

@JohnBossom
Author

Here is the second bug report…
https://github.com/microsoft/vscode-remote-release/issues/10833
It is uncovered by the first bug and can be avoided (for now) if the first bug is fixed.
I believe this one will be more involved to fix.

@JohnBossom
Author

JohnBossom commented Apr 4, 2025

Our rollout of an internal solution leveraging containers is contingent on a solution to this. If there is anything I can provide to assist you, please let me know.
