I'm struggling to get GPU acceleration working with tensorflow on my system while using the [icode]extra/python-tensorflow-opt-cuda[/icode] package. Ultimately, tensorflow complains of a mismatch in lib files (expecting one version older), and symlinking them as a workaround does not work (kernel crashes).
System Info
uname:
Linux ghostdog 6.18.21-1-lts #1 SMP PREEMPT_DYNAMIC Thu, 02 Apr 2026 15:44:36 +0000 x86_64 GNU/Linux
Nvidia Driver Version: 595.58.03
Nvidia CUDA Version: 13.2
Device: NVIDIA GeForce RTX 3060
List of all my packages
My lspci data:
2b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3060] [10de:2487] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device [1458:407b]
Flags: bus master, fast devsel, latency 0, IRQ 78, IOMMU group 16
Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=32M]
I/O ports at f000 [size=128]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities:
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
2b:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1) (prog-if 00 [HDA compatible])
Subsystem: Gigabyte Technology Co., Ltd Device [1458:407b]
Flags: bus master, fast devsel, latency 0, IRQ 79, IOMMU group 16
Memory at fc080000 (32-bit, non-prefetchable) [size=16K]
Capabilities:
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
The (to my knowledge) relevant packages are installed:
core/linux-firmware-nvidia 20260309-1 [installed]
extra/cuda 13.2.0-1 [installed]
extra/cudnn 9.20.0.48-1 [installed]
extra/egl-gbm 1.1.3-1 [installed]
extra/egl-wayland 4:1.1.21-1 [installed]
extra/egl-wayland2 1.0.1-1 [installed]
extra/egl-x11 1.0.5-1 [installed]
extra/ffnvcodec-headers 13.0.19.0-1 [installed]
extra/libnvidia-container 1.19.0-1 [installed]
extra/libvdpau 1.5-4 [installed]
extra/libxnvctrl 595.58.03-1 [installed]
extra/nccl 2.29.7-1 [installed]
extra/nvidia-container-toolkit 1.19.0-1 [installed]
extra/nvidia-open-lts 1:595.58.03-3 [installed]
extra/nvidia-settings 595.58.03-1 [installed]
extra/nvidia-utils 595.58.03-1 [installed]
extra/nvtop 3.3.2-1 [installed]
extra/opencl-nvidia 595.58.03-1 [installed]
extra/python-pycuda 2026.1-2 [installed]
extra/python-tensorflow-opt-cuda 2.20.0-5 [installed]
multilib/lib32-icu 78.3-1 [installed]
multilib/lib32-nvidia-utils 595.58.03-1 [installed]
Since (afaik) python-tensorflow-opt-cuda is an official extra package, meant to let developers use tensorflow with CUDA within Arch's rolling environment, is there a "neat" or "official" way to get this working that I am missing?
Please see below for my full install process, rationale, and code samples.
First, I installed the python-tensorflow-opt-cuda package.
Then I created a new venv for my project that can see that package:
python -m venv --system-site-packages training_venv
conda deactivate
source ./training_venv/bin/activate
Then I added it as a kernel to run my code in:
ipython kernel install --user --name training_venv --display-name "Python (training)"
And made sure to select it.
Then I added "/opt/cuda/lib64" to "/etc/ld.so.conf.d/cuda.conf" so Jupyter can find the CUDA libraries, and ran "ldconfig" to update the cache.
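To verify that ldconfig actually picked up the new path, `ldconfig -p` prints the loader cache without needing root; a quick check (a generic sketch, not from my original process) looks like:

```python
import shutil
import subprocess

# Print every loader-cache entry mentioning libcudart; if no /opt/cuda/lib64
# line appears, ldconfig did not pick up the new config file.
ldconfig = shutil.which("ldconfig") or "/sbin/ldconfig"
try:
    cache = subprocess.run([ldconfig, "-p"], capture_output=True, text=True).stdout
except FileNotFoundError:
    cache = ""
hits = [line.strip() for line in cache.splitlines() if "libcudart" in line]
print("\n".join(hits) if hits else "libcudart not in loader cache")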
At this point, trying to initialize tensorflow produced multiple errors about libraries it could not find, which I diagnosed with this snippet:
import os
import ctypes

# Note: the dynamic loader only reads LD_LIBRARY_PATH at process start, so
# setting it here has no effect on library resolution; TF_CUDA_PATHS is
# read by tensorflow itself at import time.
os.environ['TF_CUDA_PATHS'] = '/opt/cuda'
os.environ['LD_LIBRARY_PATH'] = '/opt/cuda/lib64:/usr/lib'
os.environ['TF_CPP_MAX_VLOG_LEVEL'] = "3"

# Manually try to load the runtime library to see the error
try:
    ctypes.CDLL("/opt/cuda/lib64/libcudart.so")
    print("CUDA Runtime library found and loaded!")
except OSError as e:
    print(f"Failed to load CUDA library: {e}")

import tensorflow as tf
print("Physical Devices:", tf.config.list_physical_devices('GPU'))
To try to fix this, I attempted to symlink the .so files, which is hacky, but I thought it might work... Unfortunately, while tensorflow then imports correctly, it cannot run even the most basic example, e.g.
import tensorflow as tf
## 1. Create two constants
a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0, 4.0]])
## 2. Add them
c = a + b
## 3. Print the result and the device it lives on
print("Result:", c)
print("Device used:", c.device)
Spits out:
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1775783044.890444 150458 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9437 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:2b:00.0, compute capability: 8.6
Failed to initialize GPU device #0: shared object symbol not found
And more complex examples crash the entire python kernel.
What finally made progress was running:
sudo downgrade cuda cudnn
and downgrading to the versions specified in the PKGBUILD, which at the time of writing are cuda-13.0.2-3-x86_64 and cudnn-9.16.0.29-1-x86_64.
This does not require downgrading the nvidia drivers (which is a massive pain and borderline breaks my system).
After that I ran `source /etc/profile` as per the hints provided via pacman.
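To stop the next routine -Syu from immediately pulling the downgraded packages forward again, pacman can be told to skip them (the downgrade tool also offers to add this entry for you). In /etc/pacman.conf:

```
[options]
IgnorePkg = cuda cudnn
```

Remember to remove the entry once the packaged tensorflow catches up, or you will silently stay behind on CUDA.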
Running a log-level-3 snippet (TF_CPP_MAX_VLOG_LEVEL=3) that simply loads tensorflow and lists GPUs now generates the following output:
CUDA Runtime library found and loaded!
2026-04-13 07:17:45.056147: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:861] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0
2026-04-13 07:17:45.056168: I external/local_xla/xla/tsl/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled
2026-04-13 07:17:45.056172: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:901] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
2026-04-13 07:17:45.056174: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:931] GCS additional header DISABLED. No environment variable set.
2026-04-13 07:17:45.056178: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:310] GCS RetryConfig: init_delay_time_us = 1000000 ; max_delay_time_us = 32000000 ; max_retries = 10
2026-04-13 07:17:45.056180: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:310] GCS RetryConfig: init_delay_time_us = 1000000 ; max_delay_time_us = 32000000 ; max_retries = 10
2026-04-13 07:17:45.056418: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcudart.so.12
2026-04-13 07:17:46.655123: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcudart.so.12
2026-04-13 07:17:46.656659: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:861] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0
2026-04-13 07:17:46.656664: I external/local_xla/xla/tsl/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled
2026-04-13 07:17:46.656667: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:901] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
2026-04-13 07:17:46.656669: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:931] GCS additional header DISABLED. No environment variable set.
2026-04-13 07:17:46.656672: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:310] GCS RetryConfig: init_delay_time_us = 1000000 ; max_delay_time_us = 32000000 ; max_retries = 10
2026-04-13 07:17:46.656674: I external/local_xla/xla/tsl/platform/cloud/gcs_file_system.cc:310] GCS RetryConfig: init_delay_time_us = 1000000 ; max_delay_time_us = 32000000 ; max_retries = 10
Physical Devices: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2026-04-13 07:17:47.646524: I external/local_xla/xla/parse_flags_from_env.cc:214] For env var TF_XLA_FLAGS found arguments:
2026-04-13 07:17:47.646555: I external/local_xla/xla/parse_flags_from_env.cc:216] argv[0] = <argv[0]>
2026-04-13 07:17:47.646563: I external/local_xla/xla/parse_flags_from_env.cc:214] For env var TF_JITRT_FLAGS found arguments:
2026-04-13 07:17:47.646565: I external/local_xla/xla/parse_flags_from_env.cc:216] argv[0] = <argv[0]>
2026-04-13 07:17:47.647426: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcuda.so.1
2026-04-13 07:17:47.765859: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcudart.so.12
2026-04-13 07:17:48.279746: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcublas.so.12
2026-04-13 07:17:48.279823: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcublasLt.so.12
2026-04-13 07:17:48.281415: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcufft.so.11
2026-04-13 07:17:48.287271: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcusolver.so.11
2026-04-13 07:17:48.287312: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcusparse.so.12
2026-04-13 07:17:48.287405: I external/local_xla/xla/tsl/platform/default/dso_loader.cc:76] Successfully opened dynamic library libcudnn.so.9
Note the lack of errors when loading the .so files, which is new.
After that, if I attempt to do some GPU operations as per the testing script in the package repo:
import tensorflow as tf
import os

os.environ['TF_CPP_MAX_VLOG_LEVEL'] = "3"  # Enable verbose logging to spot any issues... (only takes effect if set before `import tensorflow`)

with tf.device("/GPU:0"):
    a = tf.random.normal([1, 2])

def temp(x):
    return tf.shape(x)[0]

tf.autograph.to_graph(temp)
It crashes the kernel specifying:
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1776029064.971957 381157 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9813 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:2b:00.0, compute capability: 8.6
Failed to initialize GPU device #0: shared object symbol not found
I'll keep trying to figure out further issues, but it seems like this is the most promising path yet...