Add support for building a cuda + dml package #1600
base: main
Conversation
Does this mean we could have packages called onnxruntime-genai-cuda that contain different binaries? Can we create a different package name for the combined binary? I think this could be a source of confusion.
I agree it can cause some confusion. If we build with the command:
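(For example, a build invocation along these lines; only the --use_dml flag is confirmed by this PR, while the build.py entry point and the --use_cuda flag are assumptions:)

# Hypothetical build command: enable both the CUDA and DML providers in a single build.
python build.py --use_cuda --use_dml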
It would build a package called onnxruntime-genai-cuda.

I think for Python, in the medium term, let's combine all our packages into a single package.
I think that's a great option for Python, but we would still have the confusion for NuGet packages. How does a user know that their package has DML in it?
As long as we don't publish these kinds of packages (with support for multiple compile-time providers), it should be ok. If we publish them, then we should change the name to something more meaningful. Or maybe we can even bubble this up to ORT: we could curate a list of EPs we want to support in the default package and then build both ORT and ORT GenAI with support for all of the default-package EPs. Then we simply call the packages onnxruntime and onnxruntime-genai (or Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntimeGenAI). That would solve a lot of the problems.
I like this idea, but one downside is that it would take some iterations and give a slightly worse user experience in the meantime. One option is to keep the same package name.
Ideally it would be nice if one package could support multiple device types (DML + CUDA + WebGPU + CPU + others). Then we wouldn't need multiple packages. We can technically do this right now; the only problem is that we list specific ORT packages as dependencies (onnxruntime-gpu for CUDA, onnxruntime-dml for DML, and onnxruntime-qnn for QNN). Since this is only a dependency issue, it can be resolved. But since we are not publishing any new package through this PR, I think we should discuss this in our scrum and decouple it from this PR for now.
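(To illustrate the dependency point, a hedged sketch rather than anything this PR does: since the execution provider flavor only comes in through the declared ORT dependency, a user could in principle skip that dependency and install the ORT flavor they need themselves.)

# Hypothetical workaround: install the GenAI wheel without its pinned ORT dependency,
# then pick the ORT flavor manually (onnxruntime-directml is the DirectML flavor on PyPI).
pip install --no-deps onnxruntime-genai-cuda
pip install onnxruntime-directml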
I would like to see a different name for the package if it is bundling different binaries. We will be building and shipping integration drops to IHVs, and I can see this getting very confusing and causing a lot of churn. For each deliverable I would like to have a manifest that shows dependencies and versions, so that it is clear which binaries shipped (i.e., it's not just about packaging).
Add support for a CUDA + DML package. The Python package will still be called onnxruntime-genai-cuda, but if --use_dml was passed as a build-time flag, DML will be available.