Add support for building a cuda + dml package #1600


Open · baijumeswani wants to merge 2 commits into main

Conversation

baijumeswani (Collaborator)

Add support for a CUDA + DML package. The Python package will still be called onnxruntime-genai-cuda, but if --use_dml was passed as a build-time flag, DML will also be available.

natke self-requested a review on July 1, 2025.

natke (Contributor) left a comment:

Does this mean we could have packages called onnxruntime-genai-cuda that contain different binaries?

Can we create a different package name for the combined binary?

I think this could be a source of confusion.

baijumeswani (Collaborator, Author) commented Jul 1, 2025:

I agree it can cause some confusion.

If we build with the command:

python build.py --use_cuda --use_dml

It would build a package called onnxruntime-genai-cuda with support for DML as well as CUDA. I think adding yet another package name is hard to maintain, considering that we may not publish this package. ONNX Runtime built with similar flags produces a package called onnxruntime-gpu (not onnxruntime-directml or some other name). If we keep creating a new package name for every combination of supported device types, that will become difficult to maintain.

For Python, in the medium term, I think we should combine all our packages into onnxruntime-genai and add the dependencies via pip install onnxruntime-genai[dml], pip install onnxruntime-genai[cuda], and so on; users who want to install their own onnxruntime dependencies would just pip install onnxruntime-genai.
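(For illustration only: a minimal sketch of how such extras could be declared with setuptools. The extras names, the version, and the mapping to ONNX Runtime wheels are placeholders, not the project's actual packaging configuration.)

```python
# setup.py: illustrative sketch, not the project's actual packaging config
from setuptools import setup

setup(
    name="onnxruntime-genai",
    version="0.0.0",  # placeholder version
    # The base package carries no ONNX Runtime dependency; each extra pulls
    # in the execution-provider-specific ONNX Runtime wheel instead.
    extras_require={
        "cuda": ["onnxruntime-gpu"],
        "dml": ["onnxruntime-directml"],
        "qnn": ["onnxruntime-qnn"],
    },
)
```

With something like this, pip install onnxruntime-genai[cuda] would pull in onnxruntime-gpu, while a plain pip install onnxruntime-genai would leave the ONNX Runtime dependency up to the user.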

natke (Contributor) commented Jul 1, 2025:

> For Python, in the medium term, I think we should combine all our packages into onnxruntime-genai and add the dependencies via pip install onnxruntime-genai[dml], pip install onnxruntime-genai[cuda], and so on; users who want to install their own onnxruntime dependencies would just pip install onnxruntime-genai.

I think that's a great option for Python, but we would still have the confusion for NuGet packages. How does a user know whether their package has DML in it?

baijumeswani (Collaborator, Author):

As long as we don't publish these kinds of packages (with support for multiple compile-time providers), it should be OK. If we do publish them, then we should change the name to something more meaningful.

Or maybe we can bubble this up to ORT itself. We could curate a list of EPs we want to support in the default package, build both ORT and ORT GenAI with support for all of those EPs, and simply call the packages onnxruntime and onnxruntime-genai (or Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntimeGenAI). That would solve a lot of the problems.

ajindal1 (Collaborator) commented Jul 1, 2025:

> For Python, in the medium term, I think we should combine all our packages into onnxruntime-genai and add the dependencies via pip install onnxruntime-genai[dml], pip install onnxruntime-genai[cuda], and so on; users who want to install their own onnxruntime dependencies would just pip install onnxruntime-genai.

I like this idea, but one downside is that it adds an extra step and a slightly worse user experience. One option is to keep the same package name (onnxruntime-genai) but host the wheels at different locations for the different variants: the standard version lives on PyPI, and the CUDA/DML versions are hosted elsewhere. This is the approach PyTorch uses.
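(For illustration only: what that PyTorch-style install flow could look like. The variant index URL below is a hypothetical placeholder, not a real index.)

pip install onnxruntime-genai

pip install onnxruntime-genai --index-url https://example.com/whl/cuda

The first command installs the standard build from PyPI; the second resolves the CUDA variant from a separate, variant-specific index, analogous to PyTorch's download.pytorch.org wheel indexes.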

baijumeswani (Collaborator, Author):

> I like this idea, but one downside is that it adds an extra step and a slightly worse user experience. One option is to keep the same package name (onnxruntime-genai) but host the wheels at different locations for the different variants: the standard version lives on PyPI, and the CUDA/DML versions are hosted elsewhere. This is the approach PyTorch uses.

Ideally, one package would support multiple device types (DML + CUDA + WebGPU + CPU + others); then we wouldn't need multiple packages. We can technically do this right now. The only problem is that we list specific ORT packages as dependencies (onnxruntime-gpu for CUDA, onnxruntime-directml for DML, and onnxruntime-qnn for QNN). Since this is only a dependency issue, it can be resolved with pip install onnxruntime-genai[gpu] or pip install onnxruntime-genai[qnn]. Hopefully this process can be simplified in the future.

But since we are not publishing any new package through this PR, I think we should discuss this in our scrum and decouple it from this PR for now.

natke (Contributor) commented Jul 2, 2025:

> Ideally, one package would support multiple device types (DML + CUDA + WebGPU + CPU + others); then we wouldn't need multiple packages. We can technically do this right now. The only problem is that we list specific ORT packages as dependencies (onnxruntime-gpu for CUDA, onnxruntime-directml for DML, and onnxruntime-qnn for QNN). Since this is only a dependency issue, it can be resolved with pip install onnxruntime-genai[gpu] or pip install onnxruntime-genai[qnn]. Hopefully this process can be simplified in the future.
>
> But since we are not publishing any new package through this PR, I think we should discuss this in our scrum and decouple it from this PR for now.

I would like to see a different name for the package if it bundles different binaries. We will be building and shipping integration drops to IHVs, and I can see this getting very confusing and causing a lot of churn. For each deliverable I would like to have a manifest that shows dependencies and versions, from which it would be clear which binaries shipped (i.e., it's not just about packaging).
