Troubleshooting Dev Containers

This guide covers common issues and their solutions when working with MongoDB dev containers.

Container Build Issues

Build Fails with "No Space Left on Device"

Symptoms:

Error: failed to solve: write /var/lib/docker/...: no space left on device

Solutions:

  1. Clean up Docker resources:

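    # Remove unused containers, networks, images, and volumes.
    # WARNING: this can delete volumes that no container currently uses,
    # which may include workspace or cache volumes; check `docker volume ls` first.
    docker system prune -a --volumes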
    
    # Check disk usage
    docker system df
    
  2. Increase Docker disk allocation:

    • Docker Desktop/Rancher Desktop: Settings → Resources → Disk
    • Increase to at least 60 GB for comfortable MongoDB development
  3. Remove old dev containers:

    # List all containers
    docker ps -a
    
    # Remove specific container
    docker rm <container_id>
    
    # Remove all stopped containers
    docker container prune
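
Before pruning, it can help to confirm how much space is actually free. The sketch below uses a hypothetical helper, `has_free_gb` (not part of the MongoDB repository), to compare the free space on a path against a threshold, with the 60 GB figure from step 2 as the example value. It measures the filesystem it is run on, so it reports on the host or the container rather than on the Docker Desktop disk allocation setting itself.

```shell
# Hypothetical helper: succeed if <path> has at least <min_gb> GiB free.
# Usage: has_free_gb <path> <min_gb>
has_free_gb() {
  local path="$1" min_gb="$2" avail_kb
  # -P forces single-line POSIX output; -k reports 1024-byte blocks.
  avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
  [ "$avail_kb" -ge $(( min_gb * 1024 * 1024 )) ]
}

if has_free_gb / 60; then
  echo "At least 60 GB free on /"
else
  echo "Less than 60 GB free on /; consider pruning or growing the disk"
fi
```

Inside the container, pointing the helper at `/workspaces/mongo` instead of `/` checks the filesystem that holds the source tree.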
    

Build Fails with Toolchain Download Error

Symptoms:

Error: curl: (22) The requested URL returned error: 404
Error: Failed to download toolchain

Solutions:

  1. Check internet connection: Ensure you can access S3:

    curl -I https://s3.amazonaws.com/boxes.10gen.com/
    
  2. Verify toolchain URL:

    # Check what's configured
    cat .devcontainer/toolchain_config.env
    
    # Try downloading manually to test
    curl -I "$(grep TOOLCHAIN_URL .devcontainer/toolchain_config.env | cut -d'"' -f2)"
    
  3. If toolchain URL is broken, report it to the MongoDB team. This is a devcontainer configuration issue that needs to be fixed upstream.
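
The URL check in step 2 can be wrapped in two small functions. This is a sketch under the same assumption the `grep | cut` pipeline above makes: the config file holds a line of the form `NAME="https://..."`. The function names `toolchain_url` and `http_status` are hypothetical.

```shell
# Hypothetical helper: print the first double-quoted TOOLCHAIN_URL value
# found in an env file (assumes lines shaped like NAME="https://...").
toolchain_url() {
  grep -m1 'TOOLCHAIN_URL' "$1" | cut -d'"' -f2
}

# Hypothetical helper: print only the HTTP status code for a URL.
# -I sends a HEAD request, so the toolchain archive is not downloaded.
http_status() {
  curl -s -o /dev/null -I -w '%{http_code}' "$1"
}

# Example (run from the repository root; needs network access):
#   http_status "$(toolchain_url .devcontainer/toolchain_config.env)"
```

A status of 404 matches the error shown under Symptoms and suggests the configured URL itself is stale rather than a local network problem.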

Build Fails with Checksum Mismatch

Symptoms:

Error: SHA256 checksum mismatch
Expected: abc123...
Got: def456...

This typically indicates the toolchain was updated but the config file wasn't.

Solutions:

  1. Pull latest changes from the repository (the maintainers may have already fixed this):

    git pull
    # Then rebuild container
    
  2. Clear Docker cache and rebuild:

    # Command Palette → "Dev Containers: Rebuild Container Without Cache"
    
  3. If problem persists, this is likely a devcontainer configuration issue - report it to the MongoDB team.
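
To confirm a mismatch independently of the container build, the digest of a manually downloaded archive can be compared against the expected value. The helper below is a minimal sketch; `verify_sha256` is a hypothetical name, and it assumes `sha256sum` from GNU coreutils is available (on a macOS host, `shasum -a 256` prints output in the same layout).

```shell
# Hypothetical helper: succeed if <file> has the SHA-256 digest <expected>.
# Usage: verify_sha256 <file> <expected_hex_digest>
verify_sha256() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}

# Example:
#   verify_sha256 toolchain.tar.gz "<expected digest from the config file>"
```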

Container Fails to Start

Symptoms:

  • VS Code shows "Container failed to start"
  • No error message visible

Solutions:

  1. Check Docker logs:

    # Find container ID
    docker ps -a
    
    # View logs
    docker logs <container_id>
    
  2. Rebuild container:

    • Command Palette → "Dev Containers: Rebuild Container"
    • Or: "Dev Containers: Rebuild Container Without Cache"
  3. Check Docker daemon status:

    docker info
    docker version
    

Performance Problems

Slow Build Times

Symptoms:

  • Bazel builds taking 30+ minutes for incremental changes
  • File operations feel sluggish

Solutions:

  1. Verify you're using a named volume (not bind mount):

    # Inside container
    df -h /workspaces/mongo
    
    # Should NOT show a mount from host filesystem
    # Should be part of container's internal filesystem
    

    If you are using a bind mount, migrate to a named volume (see the Getting Started guide).

  2. Increase Docker resources:

    • CPUs: 6+ cores recommended
    • Memory: 16 GB recommended
    • Swap: 2-4 GB
  3. Check cache volume is mounted:

    # Inside container
    ls -la ~/.cache/bazel
    
    # Should have bazel cache directory
    
  4. Verify no antivirus scanning Docker:

    • Exclude Docker Desktop directory from antivirus
    • Exclude devcontainer volumes
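
The named-volume check in step 1 can be made less ambiguous by looking at the filesystem type rather than reading `df -h` output by eye. The sketch below is a heuristic only: the function names are hypothetical, and the list of type names is an assumption based on common Docker Desktop and Rancher Desktop setups. On a Linux host a bind mount usually reports the host's native filesystem type, so this heuristic will not flag it.

```shell
# Hypothetical heuristic: classify a filesystem type name as a host file
# share (typical of bind mounts on macOS/Windows) or container-local storage.
# The type names below are assumptions, not an exhaustive list.
classify_fs_type() {
  case "$1" in
    virtiofs|fakeowner|fuse.grpcfuse|fuse.osxfs|osxfs|9p|fuse.sshfs)
      echo "host-share" ;;
    *)
      echo "container-local" ;;
  esac
}

# Print the filesystem type backing a path (needs GNU df, as in the container).
fs_type_of() {
  df --output=fstype "$1" | awk 'NR==2 {print $1}'
}

# Example (inside the container):
#   classify_fs_type "$(fs_type_of /workspaces/mongo)"
```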

Slow File Operations on macOS

Symptoms:

  • git status takes 5+ seconds
  • File save is delayed
  • Terminal autocomplete is slow

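Root Cause: Bind mounts on macOS pass through a file-sharing layer between the host and Docker's Linux VM (osxfs, gRPC FUSE, or VirtioFS, depending on the Docker Desktop version and settings), which adds latency to filesystem operations.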

Solution: Use named volumes instead of bind mounts (see Getting Started guide)

High CPU Usage

Symptoms:

  • Docker process using 100%+ CPU
  • System becomes unresponsive

Solutions:

  1. Check for runaway processes:

    # Inside container
    top
    htop  # If available
    
  2. Check for file watcher issues:

    # Raise the inotify watch limit (Linux)
    echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
    sudo sysctl -p
    

VS Code Issues

Extensions Not Installing

Symptoms:

  • Recommended extensions don't install automatically
  • Extension list is empty

Solutions:

  1. Manually install extensions:

    • View → Extensions
    • Search for each recommended extension
    • Click Install in Container
  2. Check extension compatibility:

    • Some extensions don't support containers
    • Look for "This extension is enabled globally" message
  3. Reinstall extensions:

    # Command Palette
    > Developer: Reinstall Extension...
    

clangd Not Working

Symptoms:

  • No C++ IntelliSense
  • "clangd: Server not running" error
  • Red squiggles everywhere

Solutions:

  1. Generate compile_commands.json:

    bazel build compiledb --config=local
    
    # Verify it exists
    ls -lh compile_commands.json
    
  2. Check clangd path:

    # Verify the wrapper script exists
    ls -l buildscripts/clangd_vscode.sh
    
    # Test it
    ./buildscripts/clangd_vscode.sh --version
    
  3. Restart clangd:

    • Command Palette → "clangd: Restart language server"
  4. Check clangd output:

    • Output → clangd (dropdown)
    • Look for errors
  5. Clear clangd cache:

    rm -rf ~/.cache/clangd
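
The `ls -lh` check in step 1 only shows that the file exists. A slightly stronger sanity check is sketched below; `check_compiledb` is a hypothetical helper. It relies on the fact that a compilation database is a JSON array, and it does not validate the JSON beyond its first character.

```shell
# Hypothetical helper: check that a compilation database exists, is
# non-empty, and starts with '[' (it should be a JSON array of entries).
# Usage: check_compiledb [file]   (defaults to ./compile_commands.json)
check_compiledb() {
  local file="${1:-compile_commands.json}" first
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # Strip whitespace and take the first remaining character.
  first="$(tr -d '[:space:]' < "$file" | head -c 1)"
  [ "$first" = "[" ] || { echo "not a JSON array: $file" >&2; return 1; }
  echo "looks plausible: $file"
}

# Example (run from the repository root):
#   check_compiledb
```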
    

Python Extension Not Finding Interpreter

Symptoms:

  • "Select Python Interpreter" notification
  • Python imports not recognized
  • Linting/formatting not working

Solutions:

  1. Verify venv exists:

    ls -la python3-venv/bin/python
    source python3-venv/bin/activate
    which python
    
  2. Select interpreter in VS Code:

    • Command Palette → "Python: Select Interpreter"
    • Choose python3-venv/bin/python
  3. Rebuild venv:

    rm -rf python3-venv
    /opt/mongodbtoolchain/v5/bin/python3 -m venv python3-venv
    source python3-venv/bin/activate
    poetry install --no-root --sync
    
  4. Check settings.json:

    {
      "python.defaultInterpreterPath": "python3-venv/bin/python"
    }
    

Format on Save Not Working

Symptoms:

  • Files don't format when saved
  • Manual format works

Solutions:

  1. Check settings:

    {
      "editor.formatOnSave": true,
      "[cpp]": {
        "editor.defaultFormatter": "xaver.clang-format",
        "editor.formatOnSave": true
      }
    }
    
  2. Verify formatter is installed:

    • clang-format: Check extension is active
    • Ruff: Check extension is active
    • Prettier: Verify path in settings
  3. Test formatter manually:

    • Right-click → Format Document
    • Check for errors in Output panel

Git and SSH Issues

SSH Clone Fails: Permission Denied

Symptoms:

git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Solutions:

  1. Verify SSH keys exist on host:

    # On your HOST machine (not in container)
    ls -la ~/.ssh/id_*.pub
    
    # Should see id_ed25519.pub, id_rsa.pub, or similar
    
  2. Test SSH connection to GitHub:

    # On HOST machine
    ssh -T git@github.com
    
    # Should see: "Hi username! You've successfully authenticated..."
    # If this fails, your SSH key isn't added to GitHub
    
  3. Add SSH key to GitHub:

    # Copy your public key
    cat ~/.ssh/id_ed25519.pub  # or id_rsa.pub
    
    # Go to https://github.com/settings/keys
    # Click "New SSH key" and paste
    
  4. Ensure ssh-agent has your key:

    # On HOST machine
    ssh-add -l
    
    # If it prints "The agent has no identities." or
    # "Could not open a connection to your authentication agent."
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_ed25519  # or id_rsa
    

See Getting Started - SSH Setup for detailed instructions.

SSH Works on Host But Not in Container

Symptoms:

  • Can clone/push from host machine
  • Same operations fail inside devcontainer
  • "Permission denied" or asks for password

Root Cause: SSH agent forwarding isn't working properly.

Solutions:

  1. Verify agent forwarding requirements:

    # On HOST machine (before opening container)
    # SSH agent must be running
    echo $SSH_AUTH_SOCK
    # Should show a path, not empty
    
    # Agent must have keys loaded
    ssh-add -l
    # Should list your SSH keys
    
  2. Add keys to agent if missing:

    # On HOST machine
    ssh-add ~/.ssh/id_ed25519  # or id_rsa
    
    # Verify
    ssh-add -l
    
  3. Restart VS Code and rebuild container:

    • Close VS Code completely
    • Restart VS Code
    • Command Palette → "Dev Containers: Rebuild Container"
  4. Check SSH config (macOS):

    # On HOST machine
    # Add to ~/.ssh/config
    Host *
      AddKeysToAgent yes
      UseKeychain yes
      IdentityFile ~/.ssh/id_ed25519
    
  5. Start ssh-agent automatically (Linux):

    # Add to ~/.bashrc or ~/.zshrc on HOST
    if [ -z "$SSH_AUTH_SOCK" ]; then
      eval "$(ssh-agent -s)"
      ssh-add ~/.ssh/id_ed25519
    fi
    
  6. Windows: Ensure ssh-agent service is running:

    # In PowerShell as Administrator (on HOST)
    Get-Service ssh-agent | Set-Service -StartupType Automatic
    Start-Service ssh-agent
    
    # Then add your key
    ssh-add $env:USERPROFILE\.ssh\id_ed25519
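
When comparing the host and the container, the exit status of `ssh-add -l` is more reliable than its text output: OpenSSH documents 0 for success, 1 when the command fails (for `-l`, no identities loaded), and 2 when the agent cannot be contacted. The helper below, with the hypothetical name `describe_agent_status`, turns that status into a short message.

```shell
# Hypothetical helper: translate the exit status of `ssh-add -l`.
describe_agent_status() {
  case "$1" in
    0) echo "agent reachable, keys loaded" ;;
    1) echo "agent reachable, no keys loaded" ;;
    2) echo "agent not reachable" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Example: run on the HOST and again inside the container, then compare.
#   ssh-add -l >/dev/null 2>&1; describe_agent_status "$?"
```

If the host reports loaded keys but the container reports that the agent is not reachable, forwarding is the likely culprit, which points at steps 3 to 6 above.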
    

Git Push Asks for Username/Password

Symptoms:

Username for 'https://github.com':
Password for 'https://user@github.com':

Causes:

  1. Repository was cloned with HTTPS instead of SSH
  2. SSH agent forwarding not working

Solutions:

Option 1: Switch to SSH (recommended):

# Check current remote URL
git remote -v

# If using HTTPS, switch to SSH
git remote set-url origin <ssh url>

# Verify
git remote -v

Option 2: Use Personal Access Token (for HTTPS):

# Generate token at https://github.com/settings/tokens
# Use token as password when prompted

# Or configure a credential helper
# (note: "store" saves credentials unencrypted in ~/.git-credentials)
git config --global credential.helper store
# The next time you enter credentials, they will be saved

Option 3: Fix SSH agent forwarding: See "SSH Works on Host But Not in Container" section above.
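
For Option 1, the SSH form of a GitHub remote can be derived from the HTTPS form mechanically. The function below is a small sketch with a hypothetical name, `https_to_ssh_url`; it only handles `github.com` URLs and leaves anything else unchanged.

```shell
# Hypothetical helper: rewrite an HTTPS GitHub URL to its SSH form.
# Handles an optional "user@" prefix; other URLs pass through unchanged.
https_to_ssh_url() {
  printf '%s\n' "$1" | sed -E 's#^https://([^/@]+@)?github\.com/#git@github.com:#'
}

# Example:
#   git remote set-url origin "$(https_to_ssh_url "$(git remote get-url origin)")"
```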

Multiple SSH Keys (Personal + Work)

Problem: You have multiple GitHub accounts or SSH keys.

Solution: Use SSH config to manage multiple keys:

# On HOST machine, edit ~/.ssh/config
Host github.com-work
  HostName github.com
  User git
  IdentityFile ~/.ssh/id_ed25519_work

Host github.com-personal
  HostName github.com
  User git
  IdentityFile ~/.ssh/id_ed25519_personal

# Add both keys to agent
ssh-add ~/.ssh/id_ed25519_work
ssh-add ~/.ssh/id_ed25519_personal

# Clone using specific host alias
git clone git@github.com-work:<repo>

Cannot Sign Commits with GPG

Symptoms:

error: gpg failed to sign the data
fatal: failed to write commit object

Solution:

GPG signing requires additional setup in dev containers.

A simpler alternative is SSH signing, which GitHub supports:

# Configure git to use SSH for signing
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true

Build System Issues

Bazel Fails with "Server terminated abruptly"

Symptoms:

ERROR: Bazel server terminated abruptly

Solutions:

  1. Clean Bazel cache:

    bazel clean --expunge
    
  2. Check disk space:

    df -h
    
  3. Rebuild container:

    • Command Palette → "Dev Containers: Rebuild Container"

Bazel Build Fails with Toolchain Errors

Symptoms:

ERROR: No matching toolchains found
ERROR: Cannot find compiler

Solutions:

  1. Verify toolchain installation:

    ls -la /opt/mongodbtoolchain/revisions/
    
    # Check compiler
    /opt/mongodbtoolchain/v5/bin/gcc --version
    
  2. Source toolchain environment:

    source /opt/mongodbtoolchain/revisions/*/activate
    
  3. Rebuild container to reinstall toolchain

EngFlow Authentication Fails

Symptoms:

ERROR: Failed to authenticate with EngFlow
ERROR: Build Event Service upload failed

Solutions:

  1. Check if credentials exist:

    ls -la ~/.config/engflow_auth/
    
  2. Re-authenticate with EngFlow:

    rm -r ~/.config/engflow_auth/*
    bazel run engflow_auth
    
  3. Build without EngFlow:

    bazel build --config=local install-mongod
    

Python Environment Issues

Poetry Install Fails

Symptoms:

ERROR: Failed to install packages
KeyringError: ...

Solutions:

  1. Set keyring backend:

    export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
    poetry install --no-root --sync
    
  2. Clear Poetry cache:

    poetry cache clear --all pypi
    poetry install --no-root --sync
    
  3. Verify Poetry version:

    poetry --version
    # Should be version specified in poetry_requirements.txt
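
The version comparison in step 3 can be scripted. This sketch assumes, without having checked the file, that `poetry_requirements.txt` pins Poetry on a line beginning with `poetry==`; the helper name `first_semver` is hypothetical.

```shell
# Hypothetical helper: print the first x.y.z version number read from stdin.
first_semver() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Example (the requirements file path and its "poetry==" line are assumptions):
#   installed="$(poetry --version | first_semver)"
#   pinned="$(grep -iE '^poetry==' <path to poetry_requirements.txt> | first_semver)"
#   [ "$installed" = "$pinned" ] || echo "Poetry $installed installed, $pinned pinned"
```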
    

Virtual Environment Not Activating

Symptoms:

  • (python3-venv) not in prompt
  • which python shows system Python

Solutions:

  1. Manually activate:

    source python3-venv/bin/activate
    
  2. Check shell config:

    grep python3-venv ~/.bashrc
    grep python3-venv ~/.zshrc
    
  3. Re-source config:

    source ~/.bashrc  # or ~/.zshrc
    

Import Errors in Python Scripts

Symptoms:

ModuleNotFoundError: No module named 'pymongo'

Solutions:

  1. Ensure venv is activated:

    which python
    # Should show: /workspaces/mongo/python3-venv/bin/python
    
  2. Reinstall dependencies:

    source python3-venv/bin/activate
    poetry install --no-root --sync
    
  3. Check Poetry lock file:

    poetry check
    poetry lock --check  # on Poetry 2.x, use: poetry check --lock
    

Volume and Persistence Issues

Data Lost After Container Restart

Symptoms:

  • Bazel cache gone
  • History cleared
  • Python venv empty

Root Cause: Volumes not mounting correctly

Solutions:

  1. Check volumes are mounted:

    docker inspect <container_id> | grep -A 10 Mounts
    
  2. Verify volumes exist:

    docker volume ls | grep mongo
    
  3. Check devcontainer.json mounts:

    "mounts": [
      {
        "source": "mongo-cache",
        "target": "/home/youruser/.cache",
        "type": "volume"
      }
    ]
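
Grepping the full `docker inspect` output works, but a format template gives a shorter answer. The sketch below uses hypothetical helper names (`list_mounts`, `has_volume_at`) and the `/home/youruser/.cache` target from the example above; substitute the path from your own devcontainer.json.

```shell
# Hypothetical helper: print "<type> <destination>" for each mount of a container.
list_mounts() {
  docker inspect --format '{{range .Mounts}}{{println .Type .Destination}}{{end}}' "$1"
}

# Hypothetical helper: succeed if a named volume is mounted at <target>.
# Usage: has_volume_at <container_id> <target>
has_volume_at() {
  list_mounts "$1" | grep -qxF "volume $2"
}

# Example:
#   has_volume_at <container_id> /home/youruser/.cache || echo "cache volume not mounted"
```

A mount that shows up with type `bind` instead of `volume` at the expected target would explain data disappearing when the container is recreated from a different host path.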
    

Cannot Access Files from Host

Symptoms:

  • Can't open files in host OS
  • Need to copy files out of container

Solution:

Files in named volumes live in Docker's own storage (on macOS and Windows, inside Docker's VM), so they are not directly accessible from the host filesystem.

To access:

# Copy file from container to host
docker cp <container_id>:/workspaces/mongo/file.txt ~/Downloads/

# Or use VS Code
# Right-click file → Download...

To edit with external tools: Use bind mounts instead of named volumes, at the cost of slower file operations.

Volume Fills Up Disk

Symptoms:

docker system df
# Shows huge SIZE for volumes

Solutions:

  1. Clean Bazel cache:

    # Inside container
    bazel clean --expunge
    
  2. Remove old volumes:

    # List volumes
    docker volume ls
    
    # Remove specific volume if needed (WARNING: loses data!)
    docker volume rm old-cache-volume
    
  3. Limit Bazel cache size:

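    # Add to ~/.bazelrc. Assumes Bazel 7.4 or newer, where disk cache garbage
    # collection is available through --experimental_disk_cache_gc_max_size.
    echo "build --disk_cache=~/.cache/bazel --experimental_disk_cache_gc_max_size=10G" >> ~/.bazelrc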
    

Platform-Specific Issues

macOS: "Docker Desktop Is Not Running"

Solutions:

  1. Start Docker Desktop/Rancher Desktop:

    • Check menu bar for Docker icon
    • Launch the application
  2. Reset Docker:

    • Rancher Desktop → Troubleshooting → Reset Kubernetes
    • Docker Desktop → Troubleshoot → Reset to factory defaults
  3. Check Docker context:

    docker context ls
    docker context use default
    

macOS: M1/M2 ARM Issues

Symptoms:

  • "exec format error"
  • Build fails with architecture mismatch

Solutions:

  1. Verify base image supports ARM:

    docker pull quay.io/mongodb/bazel-remote-execution:ubuntu24-...
    docker inspect --format='{{.Architecture}}' <image_id>
    
  2. Use platform flag if needed:

    FROM --platform=linux/amd64 <base_image>
    
  3. Check Rosetta 2 is enabled (Rancher Desktop)
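
An "exec format error" usually means the image architecture differs from the machine's. `uname -m` and Docker use different names for the same architectures, so the hypothetical helper below (`docker_arch_for`) maps one to the other before comparing. Only the two architectures relevant here are covered.

```shell
# Hypothetical helper: map a `uname -m` value to Docker's architecture name.
docker_arch_for() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)             echo "unknown" ;;
  esac
}

# Example (on the host; <image_id> is a placeholder):
#   host_arch="$(docker_arch_for "$(uname -m)")"
#   image_arch="$(docker inspect --format='{{.Architecture}}' <image_id>)"
#   [ "$host_arch" = "$image_arch" ] || echo "host is $host_arch, image is $image_arch"
```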

Windows: WSL2 Integration Issues

Symptoms:

  • Container won't start on Windows
  • File permission errors

Solutions:

  1. Enable WSL2 integration:

    • Docker Desktop → Settings → Resources → WSL Integration
    • Enable integration for your WSL distro
  2. Use WSL2 terminal:

    • Open Ubuntu (or other WSL distro)
    • Clone and work from WSL filesystem, not /mnt/c/
  3. Check WSL version:

    wsl --list --verbose
    # Should show VERSION 2
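
Step 2 advises working from the WSL filesystem rather than `/mnt/c/`. A quick way to check where a clone lives is sketched below; `is_windows_mount_path` is a hypothetical helper that treats any `/mnt/<drive letter>/` path as a Windows drive mount, which is the default WSL layout.

```shell
# Hypothetical helper: succeed if <path> is under a Windows drive mount
# such as /mnt/c (the default WSL location for Windows drives).
is_windows_mount_path() {
  case "$1" in
    /mnt/[a-z]|/mnt/[a-z]/*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_windows_mount_path "$PWD"; then
  echo "This directory is on a Windows drive; clone into the WSL home directory instead"
fi
```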
    

Linux: Permission Denied Errors

Symptoms:

permission denied while trying to connect to Docker daemon

Solutions:

  1. Add user to docker group:

    sudo usermod -aG docker $USER
    newgrp docker  # Or logout/login
    
  2. Check Docker socket permissions:

    ls -l /var/run/docker.sock
    # Temporary workaround only: mode 666 lets every local user control Docker
    sudo chmod 666 /var/run/docker.sock
    

Docker Provider Issues

Rancher Desktop: Container Engine Not dockerd

Symptoms:

  • Cannot build devcontainer
  • Unexpected behavior

Solution:

  • Rancher Desktop → Preferences → Container Engine
  • Select "dockerd (moby)"
  • Restart Rancher Desktop

Docker Desktop: Resource Limits Too Low

Symptoms:

  • Slow builds
  • Out of memory errors

Solution: Go to Docker Desktop → Settings → Resources and allocate generously:

  • CPUs: Allocate as many as possible (leave 1-2 for host OS)
  • Memory: Allocate as much as possible (leave ~4-8 GB for host OS)
  • Swap: Optional but can help (2-4 GB if you have disk space)
  • Disk: Ensure plenty available (60+ GB recommended)

Note: MongoDB builds are resource-intensive. More resources = significantly faster builds.

OrbStack: Features Not Working

Symptoms:

  • Docker-outside-of-docker doesn't work
  • Volume mounts fail

Solution: OrbStack has some limitations with devcontainer features. Try:

  1. Update to latest OrbStack version
  2. Check OrbStack documentation for devcontainer compatibility
  3. Consider switching to Rancher Desktop for full feature support

Debugging Tips

Enable Verbose Logging

VS Code Dev Container logs:

  1. Command Palette → "Dev Containers: Show Container Log"
  2. Check for errors during build/start

Docker logs:

# Container logs
docker logs <container_id>

# Follow logs in real-time
docker logs -f <container_id>

Bazel verbose:

bazel build --verbose_failures --sandbox_debug install-mongod

Inspect Running Container

# Get container ID
docker ps

# Exec into container
docker exec -it <container_id> /bin/bash

# Check processes
docker exec <container_id> ps aux

# Check environment
docker exec <container_id> env

Check Resource Usage

# Inside container
df -h          # Disk usage
free -h        # Memory
top            # CPU/Memory by process

# From host
docker stats   # Live resource usage
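
To compare the memory visible inside the container with the 16 GB recommended earlier in this guide, the total can be read from `/proc/meminfo`. The helper name `mem_total_gib` is hypothetical; the value is rounded down, and the kernel reports slightly less than the amount allocated, so a 16 GB allocation may print 15.

```shell
# Hypothetical helper: print total memory in whole GiB, rounded down.
# Usage: mem_total_gib [meminfo_file]   (defaults to /proc/meminfo)
mem_total_gib() {
  # MemTotal is reported in kB (1024-byte units).
  awk '/^MemTotal:/ {printf "%d\n", $2 / 1024 / 1024}' "${1:-/proc/meminfo}"
}

# Example (inside the container):
#   echo "Container sees $(mem_total_gib) GiB of memory"
```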

Rebuild from Scratch

Sometimes the best fix is a clean rebuild:

# Stop and remove container
docker stop <container_id>
docker rm <container_id>

# Rebuild without cache
# Command Palette → "Dev Containers: Rebuild Container Without Cache"

Test Outside Devcontainer

To isolate whether an issue is devcontainer-specific:

# Clone locally
git clone git@github.com:mongodb/mongo.git
cd mongo

# Try building without devcontainer
# (Requires local toolchain setup)

Getting More Help

If your issue isn't covered here:

  1. Check VS Code Docs: code.visualstudio.com/docs/devcontainers
  2. Search Issues: MongoDB GitHub repository issues
  3. Ask the Team: MongoDB developers Slack/chat
  4. File a Bug: Include:
    • Error messages
    • Container logs
    • Steps to reproduce
    • OS and Docker version
    • devcontainer.json and Dockerfile (if modified)
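
Gathering the OS and Docker version for a bug report can be scripted. The function below is a minimal sketch with a hypothetical name, `collect_diagnostics`; it covers only the version details from the list above and leaves logs and reproduction steps to you.

```shell
# Hypothetical helper: print OS and Docker version details for a bug report.
collect_diagnostics() {
  echo "OS: $(uname -srm)"
  if command -v docker >/dev/null 2>&1; then
    # The server query fails if the daemon is not running; report that instead.
    echo "Docker client: $(docker version --format '{{.Client.Version}}' 2>/dev/null || echo unavailable)"
    echo "Docker server: $(docker version --format '{{.Server.Version}}' 2>/dev/null || echo unavailable)"
  else
    echo "Docker: not installed or not on PATH"
  fi
}

# Example (on the host):
#   collect_diagnostics > diagnostics.txt
```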
