Python bindings for ggml
Python bindings for the ggml tensor library for machine learning.
⚠️ Neither this project nor ggml currently guarantees backwards compatibility. If you are using this library in other applications, I strongly recommend pinning to specific releases in your requirements.txt file.
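One way to produce such a pin is to read the version that is already installed and emit the matching requirements.txt line. This is a minimal sketch using the standard library; `pin_line` is a hypothetical helper name, not part of ggml-python.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_line(package, installed_version=None):
    """Return a requirements.txt line that pins `package` to an exact version."""
    if installed_version is None:
        # Look up whatever version is installed in the current environment.
        installed_version = version(package)
    return f"{package}=={installed_version}"


try:
    # Prints a line you can paste into requirements.txt.
    print(pin_line("ggml-python"))
except PackageNotFoundError:
    print("ggml-python is not installed in this environment")
```

Pinning with `==` keeps an upgrade from silently pulling in a release with breaking changes.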
Requirements
- Python 3.8+
- C compiler (gcc, clang, msvc, etc)
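Before building from source, it can help to check both requirements from Python. The sketch below is only a rough heuristic: the list of compiler executable names is an assumption, and a compiler may be installed without being on PATH (MSVC, for example, is often only visible from a developer prompt).

```python
import shutil
import sys

# Assumed executable names to probe; adjust for your platform.
CANDIDATE_COMPILERS = ("cc", "gcc", "clang", "cl")


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def find_compilers(candidates=CANDIDATE_COMPILERS):
    """Return the candidate compiler names that resolve on PATH."""
    return [name for name in candidates if shutil.which(name)]


print("Python 3.8+:", python_is_supported())
print("Compilers on PATH:", find_compilers() or "none found")
```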
You can install ggml-python using pip:
pip install ggml-python

It is also possible to install a pre-built wheel with basic CPU support:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/cpu

Pre-built CUDA wheels are available for CUDA 11.8, 12.1, 12.2, 12.3, 12.4, 12.5, 13.0, and 13.2:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/<cuda-version>

Where <cuda-version> is one of the following:
- cu118: CUDA 11.8
- cu121: CUDA 12.1
- cu122: CUDA 12.2
- cu123: CUDA 12.3
- cu124: CUDA 12.4
- cu125: CUDA 12.5
- cu130: CUDA 13.0
- cu132: CUDA 13.2
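The tags above follow a simple pattern: "cu" followed by the CUDA version with the dot removed. A small sketch of that mapping is shown here; the helper names `cuda_tag` and `pip_install_args` are hypothetical, and the version list is copied from the list above.

```python
BASE_INDEX_URL = "https://abetlen.github.io/ggml-python/whl"
SUPPORTED_CUDA = ("11.8", "12.1", "12.2", "12.3", "12.4", "12.5", "13.0", "13.2")


def cuda_tag(cuda_version):
    """Map a CUDA version such as '12.1' to its wheel index tag ('cu121')."""
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError(f"no pre-built wheel listed for CUDA {cuda_version}")
    return "cu" + cuda_version.replace(".", "")


def pip_install_args(cuda_version):
    """Build the pip argument list for the given CUDA version."""
    return [
        "pip", "install", "ggml-python",
        "--extra-index-url", f"{BASE_INDEX_URL}/{cuda_tag(cuda_version)}",
    ]


print(" ".join(pip_install_args("12.1")))
```

Rejecting unlisted versions up front avoids pointing pip at an index path that may not exist.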
For example, to install the CUDA 12.1 wheel:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/cu121

Pre-built Metal wheels are available for Apple Silicon macOS:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/metal

Pre-built Vulkan wheels are available for Linux and Windows:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/vulkan

Pre-built ROCm wheels are available for Linux x86_64 with ROCm 7.2:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/rocm72

Pre-built HIP Radeon wheels are available for Windows x86_64:
pip install ggml-python \
--extra-index-url https://abetlen.github.io/ggml-python/whl/hip-radeon

When installing from source, pip compiles ggml with CMake and requires a C compiler installed on your system.
To build ggml with specific features (e.g. OpenBLAS or GPU support), pass the corresponding CMake options through the cmake.args pip install configuration setting. For example, to install ggml-python with cuBLAS support, run:
pip install --upgrade pip
pip install ggml-python --config-settings=cmake.args='-DGGML_CUDA=ON'

| Option | Description | Default |
|---|---|---|
| GGML_CUDA | Enable cuBLAS support | OFF |
| GGML_HIP | Enable HIP / ROCm support | OFF |
| GGML_OPENCL | Enable OpenCL support | OFF |
| GGML_BLAS | Enable BLAS support | OFF |
| GGML_BLAS_VENDOR | Select BLAS vendor, for example OpenBLAS | unset |
| GGML_METAL | Enable Metal support | OFF |
| GGML_VULKAN | Enable Vulkan support | OFF |
| GGML_RPC | Enable RPC support | OFF |
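When several options are needed at once, the pip command can be assembled programmatically. The sketch below only builds the argument list and does not run it. It assumes that multiple CMake definitions can be passed in one cmake.args value separated by semicolons, which is how scikit-build-core style list settings are commonly written. Treat that separator as an assumption and verify it against your build backend. The helper names are hypothetical.

```python
import sys


def cmake_define(name, value):
    """Format one CMake cache definition, mapping booleans to ON/OFF."""
    if isinstance(value, bool):
        value = "ON" if value else "OFF"
    return f"-D{name}={value}"


def pip_install_command(options):
    """Build a pip argument list that forwards `options` through cmake.args."""
    args = [sys.executable, "-m", "pip", "install", "ggml-python"]
    if options:
        # Assumption: a semicolon separates multiple CMake arguments.
        defines = ";".join(cmake_define(k, v) for k, v in options.items())
        args.append(f"--config-settings=cmake.args={defines}")
    return args


command = pip_install_command({"GGML_BLAS": True, "GGML_BLAS_VENDOR": "OpenBLAS"})
print(command[-1])
```

Because the result is a list, it can be handed to `subprocess.run(command, check=True)` without any shell quoting around the semicolon.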
import ggml
import ctypes
# Allocate a new context with 16 MB of memory
params = ggml.ggml_init_params(mem_size=16 * 1024 * 1024, mem_buffer=None)
ctx = ggml.ggml_init(params)
# Instantiate tensors
x = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)
a = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)
b = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)
# Use ggml operations to build a computational graph
x2 = ggml.ggml_mul(ctx, x, x)
f = ggml.ggml_add(ctx, ggml.ggml_mul(ctx, a, x2), b)
gf = ggml.ggml_new_graph(ctx)
ggml.ggml_build_forward_expand(gf, f)
# Set the input values
ggml.ggml_set_f32(x, 2.0)
ggml.ggml_set_f32(a, 3.0)
ggml.ggml_set_f32(b, 4.0)
# Compute the graph
ggml.ggml_graph_compute_with_ctx(ctx, gf, 1)
# Get the output value
output = ggml.ggml_get_f32_1d(f, 0)
assert output == 16.0
# Free the context
ggml.ggml_free(ctx)

If you are having trouble installing ggml-python or activating specific features, please try installing it with the --verbose and --no-cache-dir flags to get more information about any issues:
pip install ggml-python --verbose --no-cache-dir --force-reinstall --upgrade

This project is licensed under the terms of the MIT license.