Optimize QLinearSoftmax Transpose #22849

yihonglyu · 2024-11-15T06:38:39Z

Description

Improved transpose handling around QLinearSoftmax in the Level 3 NHWC Transformer.
Removed the redundant HandleQLinearConcat and HandleQLinearBinaryOp code.
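
For readers unfamiliar with how a Transpose can be moved across a softmax-like node, here is a minimal sketch. It is not the code in this PR. It assumes single-axis (opset 13 style) softmax semantics and the ONNX Transpose convention `output.shape[i] = input.shape[perm[i]]`. `RemapSoftmaxAxis` is a hypothetical helper name, not an onnxruntime API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not part of onnxruntime).
// Under the assumptions stated above:
//   Softmax(Transpose(x, perm), axis) == Transpose(Softmax(x, perm[axis]), perm)
// so when the Transpose is pushed below the softmax, the softmax axis
// must be remapped to perm[axis].
int64_t RemapSoftmaxAxis(const std::vector<int64_t>& perm, int64_t axis) {
  const int64_t rank = static_cast<int64_t>(perm.size());
  if (axis < 0) axis += rank;  // normalize a negative axis
  assert(axis >= 0 && axis < rank);
  return perm[static_cast<std::size_t>(axis)];
}
```

With an NHWC-to-NCHW Transpose (`perm = {0, 3, 1, 2}`) feeding a softmax over the channel axis (`axis = 1`), the remapped axis is 3, that is, the last axis of the NHWC input.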

Motivation and Context

By merging and eliminating redundant transposes, the Image Segmentation i8 model (MobileNetv2 + DeepLabv3) achieves a 2.34X speedup.
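
As a rough illustration of what "merging and eliminating" transposes means, two back-to-back Transpose nodes can be merged into one permutation, and the pair can be dropped entirely when that permutation is the identity. The sketch below assumes the ONNX Transpose convention `output.shape[i] = input.shape[perm[i]]`. `ComposePerms` and `IsIdentityPerm` are hypothetical names for illustration and are not taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: permutation equivalent to applying `first`, then `second`.
// Transpose(Transpose(x, first), second) == Transpose(x, combined),
// where combined[i] = first[second[i]].
std::vector<int64_t> ComposePerms(const std::vector<int64_t>& first,
                                  const std::vector<int64_t>& second) {
  assert(first.size() == second.size());
  std::vector<int64_t> combined(second.size());
  for (std::size_t i = 0; i < second.size(); ++i) {
    combined[i] = first[static_cast<std::size_t>(second[i])];
  }
  return combined;
}

// Hypothetical helper: true when the permutation leaves the layout unchanged,
// in which case the merged Transpose is redundant and can be removed.
bool IsIdentityPerm(const std::vector<int64_t>& perm) {
  for (std::size_t i = 0; i < perm.size(); ++i) {
    if (perm[i] != static_cast<int64_t>(i)) return false;
  }
  return true;
}
```

For example, an NHWC-to-NCHW Transpose (`{0, 3, 1, 2}`) followed by an NCHW-to-NHWC Transpose (`{0, 2, 3, 1}`) composes to `{0, 1, 2, 3}`, so both nodes can be eliminated.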

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/test/optimizer/transpose_optimizer_test.cc

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

yihonglyu added 3 commits November 14, 2024 21:27

Optimize QLinearSoftmax Transpose

146fad6

- Improved transpose around QLinearSoftmax in Level 3 NHWC Transformer.
- Removed redundant code HandleQLinearConcat, HandleQLinearBinaryOp.

Cleanup code

a23d2cc

Add unit tests

def160e

yihonglyu requested a review from skottmckay November 15, 2024 06:38

Cleanup code

8c030e4

github-actions bot reviewed Nov 15, 2024

onnxruntime/test/optimizer/transpose_optimizer_test.cc (outdated)

Update onnxruntime/test/optimizer/transpose_optimizer_test.cc

d23a7d7

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
