NumPy Bug: ASIMDDP CPU Feature On ARM Not Wired
Hey everyone,
I've run into a bit of a snag while working with NumPy on ARM, and I wanted to share it with you all to see if anyone else has experienced something similar or has any insights.
Issue Description
It seems the ASIMDDP CPU feature, as defined in NumPy's Meson build configuration, isn't actually wired into ARM_FEATURES. This means that if you try to specify ASIMDDP in your cpu-baseline, you'll run into a build failure. Let's dive into the specifics.
Diving Deep into the ASIMDDP Issue
When we talk about optimizing code for different CPU architectures, we often look at specific features that can boost performance. In the ARM world, one such feature is ASIMDDP (Advanced SIMD Dot Product). It adds dot-product instructions that multiply several small integer elements and accumulate the results in a single operation. NumPy, being a powerhouse for numerical computations in Python, naturally tries to take advantage of these features where available.
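To make "dot product" concrete, here is a rough pure-Python emulation of one 4-way signed 8-bit dot-product step: sixteen 8-bit inputs per operand are grouped in fours, and each group's products are summed into one accumulator lane. The function name `sdot_emulated` is made up for this post, and the sketch ignores 32-bit wraparound. It only illustrates the arithmetic and is not how NumPy or the hardware implements it.

```python
# Illustrative emulation of a 4-way signed 8-bit dot-product step.
# Plain Python for clarity; real hardware does this in one instruction
# and wraps accumulators at 32 bits, which this sketch does not model.

def sdot_emulated(acc, a, b):
    """Return 4 new accumulators after a 4-way int8 dot product.

    acc  -- 4 integers standing in for 32-bit accumulator lanes
    a, b -- 16 integers each, all within the signed 8-bit range
    """
    if len(acc) != 4 or len(a) != 16 or len(b) != 16:
        raise ValueError("expected 4 accumulators and 16 elements per input")
    for value in (*a, *b):
        if not -128 <= value <= 127:
            raise ValueError("inputs must fit in a signed 8-bit integer")

    out = []
    for lane in range(4):
        # Each lane consumes its own group of four elements from a and b.
        chunk_a = a[4 * lane: 4 * lane + 4]
        chunk_b = b[4 * lane: 4 * lane + 4]
        out.append(acc[lane] + sum(x * y for x, y in zip(chunk_a, chunk_b)))
    return out


if __name__ == "__main__":
    print(sdot_emulated([0, 0, 0, 0], list(range(16)), [1] * 16))
```

Sixteen multiplications and twelve additions collapse into one step here, which is where the speedup for int8-heavy workloads comes from.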
Now, let's get a bit technical. If you peek into NumPy's build scripts, specifically the meson.build files, you'll find sections dedicated to defining and enabling CPU features. The issue arises because while ASIMDDP is defined as a feature, it's not correctly linked into the list of features that NumPy uses during the build process. This is like having a shiny new tool but forgetting to put it in your toolbox: it's there, but you can't use it!
To illustrate, the original issue report links to two places in the source. The first is where ASIMDDP is defined in NumPy's Meson build configuration. The second is where the available ARM features are listed. The problem is that ASIMDDP, despite being defined, isn't included in that list. As a result, when you try to build NumPy with ASIMDDP as a baseline, the build system throws an error because it doesn't recognize this feature as a valid option.
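The failure mode can be modeled in a few lines of Python. This is only a sketch of the idea (a feature that is defined but never registered gets rejected by the option parser). The names `DEFINED_FEATURES` and `parse_cpu_baseline` and the feature subset are assumptions made up for illustration, not NumPy's actual Meson code.

```python
# Hypothetical model of a feature registry and baseline-token validation.
# Names echo the post (ARM_FEATURES, ASIMDDP), but this is NOT NumPy's
# real build logic, and the feature set is an illustrative subset.

DEFINED_FEATURES = {"NEON", "ASIMD", "ASIMDHP", "ASIMDDP"}

# The registry the option parser consults. ASIMDDP is defined above but
# was never added here, which models the bug described in the text.
ARM_FEATURES = {"NEON", "ASIMD", "ASIMDHP"}


def parse_cpu_baseline(option, registry):
    """Split a baseline option string and reject unregistered tokens."""
    tokens = option.replace(",", " ").split()
    for token in tokens:
        if token.upper() not in registry:
            raise ValueError(
                f'Invalid token "{token}" within option --cpu-baseline'
            )
    return [token.upper() for token in tokens]


if __name__ == "__main__":
    try:
        parse_cpu_baseline("ASIMDDP", ARM_FEATURES)
    except ValueError as exc:
        print("before fix:", exc)

    # Modeling the fix: register the already-defined feature.
    fixed_registry = ARM_FEATURES | {"ASIMDDP"}
    print("after fix:", parse_cpu_baseline("ASIMDDP", fixed_registry))
```

In this toy model the fix is a one-line change to the registry, which matches the shape of the problem described in the issue: the definition exists, only the registration is missing.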
This might seem like a small detail, but it has significant implications. If ASIMDDP isn't properly enabled, NumPy can't fully utilize the potential performance gains offered by this feature on ARM CPUs. This is particularly relevant for applications that heavily rely on numerical computations, such as scientific simulations, data analysis, and machine learning. In these scenarios, every bit of performance improvement counts.
So, what's the solution? Well, the first step is to ensure that ASIMDDP is correctly wired into the ARM_FEATURES list in NumPy's build scripts. This involves modifying the meson.build files to include ASIMDDP in the list of recognized features. Once this is done, the build system will be able to identify and enable ASIMDDP when specified in the cpu-baseline. This, in turn, will allow NumPy to take full advantage of the ASIMDDP capabilities on ARM CPUs, leading to faster and more efficient computations.
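Separately from the build-system fix, it can be handy to check whether a given machine reports the feature at all before raising the baseline. On Linux, aarch64 kernels list CPU capabilities on the Features line of /proc/cpuinfo, where the dot-product extension shows up as asimddp. The helper below is a small sketch under that assumption. `cpu_reports_flag` is a name invented for this post, and it will simply report False on platforms that format this file differently.

```python
# Sketch: look for a capability flag in /proc/cpuinfo-style text.
# Assumes the Linux aarch64 layout, i.e. a "Features : ..." line with
# space-separated lowercase flags. Other platforms will differ.

def cpu_reports_flag(cpuinfo_text, flag="asimddp"):
    """Return True if any Features line in the text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if flag in value.split():
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print("asimddp reported:", cpu_reports_flag(fh.read()))
    except OSError:
        print("no readable /proc/cpuinfo on this platform")
```

Note that this check is independent of the bug itself: the "Invalid token" error comes from the build configuration, so it appears whether or not the CPU supports the feature.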
Reproducing the Issue
To reproduce this issue, you can use the following command:
python -m build -w -Csetup-args=-Dcpu-baseline=ASIMDDP
This command attempts to build a NumPy wheel with ASIMDDP set as the CPU baseline. If the issue is present, the build will fail and you'll see an error message in the output.
Error Message
The error message you'll likely encounter looks something like this:
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- Cython>=3.0.6
- meson-python>=0.18.0
* Getting build dependencies for wheel...
* Installing packages in isolated environment:
- patchelf >= 0.11.0
* Building wheel...
+ /tmp/build-env-gqg69ed9/bin/python /home/mgorny/numpy/vendored-meson/meson/meson.py setup /home/mgorny/numpy /home/mgorny/numpy/.mesonpy-z4l3id1a -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md -Dcpu-baseline=ASIMDDP --native-file=/home/mgorny/numpy/.mesonpy-z4l3id1a/meson-python-native-file.ini
The Meson build system
Version: 1.8.3
Source dir: /home/mgorny/numpy
Build dir: /home/mgorny/numpy/.mesonpy-z4l3id1a
Build type: native build
Project name: NumPy
Project version: 2.4.0.dev0+git20250814.8b89d0f
C compiler for the host machine: ccache cc (gcc 14.3.0 "cc (Gentoo Hardened 14.3.0 p8) 14.3.0")
C linker for the host machine: cc ld.bfd 2.44
C++ compiler for the host machine: ccache c++ (gcc 14.3.0 "c++ (Gentoo Hardened 14.3.0 p8) 14.3.0")
C++ linker for the host machine: c++ ld.bfd 2.44
Cython compiler for the host machine: cython (cython 3.1.3)
Host machine cpu family: aarch64
Host machine cpu: aarch64
Program python found: YES (/tmp/build-env-gqg69ed9/bin/python)
Found pkg-config: YES (/usr/bin/pkg-config) 2.4.3
Run-time dependency python found: YES 3.13
Has header "Python.h" with dependency python-3.13: YES
Compiler for C supports arguments -fno-strict-aliasing: YES
../meson_cpu/meson.build:202:6: ERROR: Problem encountered: Invalid token "ASIMDDP" within option --cpu-baseline
A full log can be found at /home/mgorny/numpy/.mesonpy-z4l3id1a/meson-logs/meson-log.txt
ERROR Backend subprocess exited when trying to invoke build_wheel
The key part of this error is the line `ERROR: Problem encountered: Invalid token