Nvidia driver issue after updating PVE
Fixing the Nvidia driver issue on my Proxmox VE (PVE) host
Summary
A Proxmox kernel update to the experimental Linux kernel 7.0 broke compilation of the Nvidia kernel module because of structural changes in the Linux virtual memory manager (VMA_LOCK_OFFSET). My Nvidia Quadro P1000 (Pascal architecture) requires the 580.xx driver series, which is incompatible with kernel 7.0, so both servers were set to boot the stable kernel 6.17.13-9-pve. This restores full functionality and keeps the nodes consistent for a future cluster.
Phase 1: Fixing the Primary Host (pve)
  • Identified the Root Cause: The system attempted to build the Nvidia driver against Kernel 7.0.2-4-pve, failing with compilation errors in nv-mmap.c.
  • Rolled Back the Kernel: Booted the Lenovo ThinkStation P520 into the GRUB advanced menu and selected the stable 6.17.13-9-pve kernel.
  • Installed Dependencies: Downloaded and installed the matching kernel headers:
bash
apt update && apt install -y pve-headers-6.17.13-9-pve
  • Installed Host Driver: Successfully installed the latest Pascal-supported driver (580.159.04):
bash
./NVIDIA-Linux-x86_64-580.159.04.run --no-questions --ui=none
  • Pinned the Kernel: Locked the bootloader to this kernel version to prevent Proxmox from automatically booting back into Kernel 7.0:
bash
proxmox-boot-tool kernel pin 6.17.13-9-pve
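The host-side steps above can be double-checked after the reboot. What follows is a minimal sketch, not part of the original fix: `headers_pkg_for`, `kernel_matches` and `check_host` are hypothetical helper names, and the expected kernel string is simply the one named in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original fix) for checking the host
# after the rollback. EXPECTED_KERNEL is the version named in this post.
EXPECTED_KERNEL="6.17.13-9-pve"

# Build the headers package name for a given kernel release,
# following the pve-headers-<release> pattern used above.
headers_pkg_for() {
    printf 'pve-headers-%s\n' "$1"
}

# Succeed only if the given release string equals the expected kernel.
kernel_matches() {
    [ "$1" = "$EXPECTED_KERNEL" ]
}

# Report on the running host; call this manually on the PVE node.
check_host() {
    local running
    running="$(uname -r)"
    if kernel_matches "$running"; then
        echo "OK: running $running"
    else
        echo "WARNING: running $running, expected $EXPECTED_KERNEL"
    fi
    # dpkg -s exits non-zero when the package is not installed.
    if dpkg -s "$(headers_pkg_for "$running")" >/dev/null 2>&1; then
        echo "OK: headers installed for $running"
    else
        echo "WARNING: headers missing for $running"
    fi
}
```

Sourcing the file and running `check_host` on the host should show whether the pin took effect and whether the headers for the running kernel are present before the driver is rebuilt.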
Phase 2: Fixing the Plex LXC Container (ID 111)
  • Identified Driver Mismatch: Inside the container, nvidia-smi failed with a libnvidia-ml.so error because the container's user-space libraries did not match the newly updated host driver.
  • Updated Container Libraries: Installed the exact matching 580.159.04 driver version inside the container without compiling kernel modules:
bash
./NVIDIA-Linux-x86_64-580.159.04.run --no-kernel-module --no-questions --ui=none
  • Restored Hardware Acceleration: Verified that nvidia-smi successfully detects the Quadro P1000 inside the container and restarted the Plex service:
bash
systemctl restart plexmediaserver
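Since the container only works when its user-space libraries match the host driver exactly, a quick comparison can catch a mismatch after future updates. This is a rough sketch under assumptions: the helper names are hypothetical, the container ID 111 is the one from this post, and the comparison is meant to be run on the PVE host where `pct` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if both version strings are
# non-empty and identical, e.g. "580.159.04" on host and container.
driver_versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Extract the version from an installer file name of the form
# NVIDIA-Linux-x86_64-<version>.run (the naming used in this post).
version_from_installer() {
    local name="${1##*/}"               # strip any leading directory
    name="${name#NVIDIA-Linux-x86_64-}" # strip the fixed prefix
    printf '%s\n' "${name%.run}"        # strip the .run suffix
}

# Query both sides; intended to be run manually on the PVE host.
# Assumes container ID 111, as in this post.
compare_host_and_container() {
    local host ct
    host="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
    ct="$(pct exec 111 -- nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
    if driver_versions_match "$host" "$ct"; then
        echo "OK: host and container both on $host"
    else
        echo "MISMATCH: host=$host container=$ct"
    fi
}
```

`version_from_installer` is useful for confirming that the `.run` file copied into the container is the same release as the one installed on the host before running it with `--no-kernel-module`.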
Phase 3: Aligning the Secondary Host (pve1) for Possible Clustering
  • Checked Environment: Verified available kernels on the second node (pve1) to ensure optimal compatibility for live-migration and cluster stability.
  • Updated and Matched Kernel: After running system updates on pve1, the exact same 6.17.13-9-pve kernel became available.
  • Pinned and Rebooted: Made the matching kernel the permanent default in the bootloader (systemd-boot or GRUB) and rebooted the host:
bash
proxmox-boot-tool kernel pin 6.17.13-9-pve
reboot
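To confirm that both nodes really ended up on the same kernel, the releases can be compared side by side. The following is only a sketch: `all_same_kernel` and `compare_nodes` are hypothetical helpers, the host names pve and pve1 come from this post, and root SSH access between the nodes is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if every kernel release passed in is
# identical and at least one non-empty release was given.
all_same_kernel() {
    local first="$1" k
    for k in "$@"; do
        [ "$k" = "$first" ] || return 1
    done
    [ -n "$first" ]
}

# Collect `uname -r` from each node over SSH; run manually.
# Assumes root SSH access to both nodes (not stated in the post).
compare_nodes() {
    local a b
    a="$(ssh root@pve uname -r)"
    b="$(ssh root@pve1 uname -r)"
    if all_same_kernel "$a" "$b"; then
        echo "OK: both nodes on $a"
    else
        echo "MISMATCH: pve=$a pve1=$b"
    fi
}
```

Because `all_same_kernel` accepts any number of arguments, the same check would still work if a third node were added to the cluster later.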
Current Status
  • pve (Node 1): Running Kernel 6.17.13-9-pve with Nvidia Driver 580.159.04. Plex hardware transcoding (hw) is fully functional.
  • pve1 (Node 2): Running Kernel 6.17.13-9-pve.
  • Result: Both nodes are now stable and run the identical kernel version.
Ad de Jonge