Releases: KernelTuner/kernel_tuner

Version 1.4.0

18 Jun 14:54

This release marks a large step for Kernel Tuner in terms of new functionality and changes. The most important additions and changes are:

  • Parallel tuning on multiple GPUs and GPU clusters using Ray
  • Multi-objective optimization through Pymoo
  • Change of default CUDA backend from PyCuda to cuda-python

And of course many other smaller additions and fixes. Below is a more detailed overview of all merged pull requests:

What's Changed

New Contributors

Full Changelog: 1.3.3...1.4.0

Version 1.3.3

22 Apr 10:40
945d24c

Forgot to bump the version on the 1.3.2 release, fixing this by making a new release.

Version 1.3.2

22 Apr 10:05
40577e0

This is a minor release that mostly fixes a few bugs and warnings. The largest addition is a set of optimization strategies based on scikit-optimize.

What's Changed

Full Changelog: 1.3.1...1.3.2

Version 1.3.1

21 Jan 16:22

This release brings together several optimizations and smaller bug fixes. It also adds support for cuda-python versions 13 and higher.

What's Changed

  • Fix issue 335 ValueError: (...) is not in list during Bayesian optimization by @stijnh in #336
  • Remove error_message_searchspace_fully_observed message in BO by @stijnh in #338
  • fix issue #332 by @benvanwerkhoven in #334
  • Recalculate metrics for each configuration in simulation runner by @stijnh in #307
  • Bump urllib3 from 2.5.0 to 2.6.0 in /doc by @dependabot[bot] in #341
  • Change test-python-package.yml to use macos-latest by @stijnh in #344
  • Bump filelock from 3.18.0 to 3.20.1 in /doc by @dependabot[bot] in #346
  • Bump urllib3 from 2.6.0 to 2.6.3 in /doc by @dependabot[bot] in #347
  • Bump virtualenv from 20.30.0 to 20.36.1 in /doc by @dependabot[bot] in #349
  • Bump filelock from 3.20.1 to 3.20.3 in /doc by @dependabot[bot] in #350
  • Clarify contribution guidelines regarding AI-generated code by @benvanwerkhoven in #353
  • Use the new cuda-python modules by @isazi in #345
  • Optimized searchspace operations by @fjwillemsen in #354
  • Add new Hamming-adjacent neighborhood method by @stijnh in #313
  • This fixes issue #333 on backwards compatibility with the old restrictions function by @fjwillemsen in #337
  • Fix bug in simulated annealing when dealing with negative objectives by @stijnh in #331
  • Replace bfloat16 dtype from bfloat16 package by one from ml_dtypes package by @stijnh in #330
  • Fix evaluation count in PyATF search strategies by @stijnh in #342

Full Changelog: 1.3.0...1.3.1

Version 1.3.0

03 Sep 14:42
5a48c0b

This release is another major step forward, in particular for hyperparameter tuning of the optimization strategies in Kernel Tuner. In addition, many of the optimization strategies have been made aware of constraints. This means they initialize with only valid configurations, use the search space object to query only valid neighbors, and, when needed, repair invalid configurations to valid neighboring ones.
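The three constraint-aware behaviors described above (valid initialization, valid-neighbor queries, and repair) can be illustrated with a minimal, library-independent sketch. It does not use Kernel Tuner's search space object; the parameter names, the restriction, and the helper functions `is_valid`, `valid_neighbors`, and `repair` are all hypothetical, and "neighbor" is assumed here to mean a configuration that differs in exactly one parameter.

```python
import itertools
import random

# Hypothetical tunable parameters, for illustration only.
tune_params = {
    "block_size_x": [32, 64, 128, 256],
    "block_size_y": [1, 2, 4, 8],
}


def is_valid(config):
    """Hypothetical restriction: at most 512 threads per block."""
    return config["block_size_x"] * config["block_size_y"] <= 512


def all_valid_configs():
    """Enumerate every configuration that satisfies the restriction."""
    keys = list(tune_params)
    for values in itertools.product(*tune_params.values()):
        config = dict(zip(keys, values))
        if is_valid(config):
            yield config


def valid_neighbors(config):
    """Valid configurations that differ from `config` in exactly one parameter."""
    result = []
    for key, values in tune_params.items():
        for value in values:
            if value != config[key]:
                candidate = {**config, key: value}
                if is_valid(candidate):
                    result.append(candidate)
    return result


def repair(config):
    """Return `config` if valid, otherwise a valid neighbor (or any valid config)."""
    if is_valid(config):
        return config
    neighbors = valid_neighbors(config)
    if neighbors:
        return neighbors[0]
    return next(all_valid_configs())


# Initialize a population from valid configurations only.
valid = list(all_valid_configs())
rng = random.Random(0)
population = rng.sample(valid, 4)

# Repair an invalid configuration (256 * 8 = 2048 threads).
repaired = repair({"block_size_x": 256, "block_size_y": 8})
```

A real strategy would typically pick among the valid neighbors by some criterion rather than taking the first one; the sketch only shows where validity checks enter the loop.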

In addition, the Differential Evolution strategy previously relied on scipy.optimize.differential_evolution. It has now been replaced with a brand new implementation that is better suited to discrete search spaces, including those with strings as parameter values, and that is also constraint-aware.
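One common way to adapt differential evolution to discrete, possibly string-valued parameters is to run the arithmetic on value indices rather than on the values themselves. The sketch below shows that idea only; it is an assumption for illustration and not necessarily how Kernel Tuner's new implementation works. The parameter names and the helpers `decode`, `mutate`, and `crossover` are hypothetical, and constraint handling is left out.

```python
import random

# Hypothetical parameters, including a string-valued one.
tune_params = {
    "block_size": [32, 64, 128, 256],
    "precision": ["half", "float", "double"],
}
sizes = [len(values) for values in tune_params.values()]


def decode(indices):
    """Map a vector of value indices back to a configuration."""
    return {key: values[i] for (key, values), i in zip(tune_params.items(), indices)}


def mutate(a, b, c, F=0.8):
    """DE/rand/1 mutation on index vectors, rounded and clipped to valid indices."""
    return [
        min(max(round(ai + F * (bi - ci)), 0), n - 1)
        for ai, bi, ci, n in zip(a, b, c, sizes)
    ]


def crossover(target, mutant, rng, CR=0.9):
    """Binomial crossover; one position is always taken from the mutant."""
    forced = rng.randrange(len(target))
    return [
        m if (rng.random() < CR or j == forced) else t
        for j, (t, m) in enumerate(zip(target, mutant))
    ]


rng = random.Random(1)
population = [[rng.randrange(n) for n in sizes] for _ in range(5)]

# Build one trial vector for the first population member.
a, b, c = rng.sample(population, 3)
trial = crossover(population[0], mutate(a, b, c), rng)
config = decode(trial)
```

Working on indices assumes the value lists have a meaningful order; for unordered string parameters, the index arithmetic behaves more like a random jump than a directed step.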

Finally, Kernel Tuner now also allows users to pass their own optimization algorithms as search strategies for auto-tuning. For this purpose, kernel_tuner.strategies.wrapper implements an OptAlgWrapper class that can wrap an existing optimizer.
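The general idea of wrapping a user-supplied optimizer can be sketched without the library. The code below is a generic adapter pattern under stated assumptions: `RandomSearch`, `SketchWrapper`, `tune`, and `fake_cost` are hypothetical names, and the calling convention shown (an optimizer called with a cost function and a list of configurations) is not taken from Kernel Tuner's `OptAlgWrapper`. Consult the Kernel Tuner documentation for the interface that class actually expects.

```python
import random


class RandomSearch:
    """Hypothetical user-supplied optimizer: samples configurations at random."""

    def __init__(self, budget=20, seed=0):
        self.budget = budget
        self.rng = random.Random(seed)

    def __call__(self, cost_func, searchspace):
        best_config, best_cost = None, float("inf")
        for _ in range(self.budget):
            config = self.rng.choice(searchspace)
            cost = cost_func(config)
            if cost < best_cost:
                best_config, best_cost = config, cost
        return best_config, best_cost


class SketchWrapper:
    """Hypothetical adapter; NOT the signature of Kernel Tuner's OptAlgWrapper."""

    def __init__(self, optimizer):
        self.optimizer = optimizer
        self.evaluations = []

    def tune(self, searchspace, cost_func):
        # Intercept every evaluation so the tuner can record results.
        def recorded(config):
            cost = cost_func(config)
            self.evaluations.append((config, cost))
            return cost

        return self.optimizer(recorded, searchspace)


# Hypothetical search space of (block_size_x, block_size_y) pairs.
searchspace = [(bx, by) for bx in (32, 64, 128) for by in (1, 2, 4)]


def fake_cost(config):
    """Stand-in for a measured kernel time."""
    bx, by = config
    return abs(bx * by - 128)


wrapper = SketchWrapper(RandomSearch(budget=20, seed=0))
best_config, best_cost = wrapper.tune(searchspace, fake_cost)
```

The wrapper owns the cost function, so the optimizer never needs to know how a configuration is compiled or benchmarked; it only sees a callable that returns a number.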

What's Changed

New Contributors

Full Changelog: 1.2...1.3.0

Version 1.2

17 Jul 08:23
8ce5847

This release includes many fixes and upgrades in different areas, in particular search space construction and OpenMP support. Bugs were fixed related to optimizing using maximization instead of minimization, which affected all strategies and Firefly in particular. Smaller improvements have been made to user-friendliness, documentation, Python 3.13 compatibility, the HIP backend, and support for string-valued tunable parameters for mixed-precision tuning.

What's Changed

Full Changelog: 1.1.3...1.2

Version 1.1.3

21 May 12:21

This release contains a number of small bugfixes and enables support on Nvidia Blackwell GPUs.

What's Changed

New Contributors

Full Changelog: 1.1.2...1.1.3

Version 1.1.2

08 Apr 08:48

This release would not have been necessary if I had not forgotten to increment the version number on the previous release that I made 20 minutes ago. Alas, we all make mistakes sometimes.

Version 1.1.1

08 Apr 08:25
ece0719

The sole purpose of this release is to support NumPy 2.0 and newer. The main motivation is to make the examples and tutorial notebooks work again on Google Colab.

What's Changed

Full Changelog: 1.1.0...1.1.1

Version 1.1.0

04 Apr 10:10
85da990

This release integrates many smaller changes that have been made over the past year.

The most significant new features are:

  • The NCUObserver to include performance metrics from the Nvidia Profiler during tuning
  • TegraObserver to read/set clock frequencies, power and temperature on Nvidia Jetson GPUs

In addition, a lot of work has been put into several backends, including OpenACC, the compiler backend, and the HIP backend.

Thanks to everyone who contributed to Kernel Tuner in the past year!

What's Changed

New Contributors

Full Changelog: 1.0...1.1.0