Enable Stable Diffusion with Microsoft Olive under Automatic1111 (Xformers) for a speedup via DirectML on AMD GPUs

Marees

Microsoft and AMD have been working together to optimize the Olive path on AMD hardware, accelerated via the Microsoft DirectML platform API and the AMD User Mode Driver’s ML (Machine Learning) layer for DirectML allowing users access to the power of the AMD GPU’s AI (Artificial Intelligence) capabilities.

https://community.amd.com/t5/gaming...tic1111-stable-diffusion-webui-on/ba-p/625585

Running on the default PyTorch path, the AMD Radeon RX 7900 XTX delivers 1.87 iterations/second.

Running on the optimized model with Microsoft Olive, the AMD Radeon RX 7900 XTX delivers 18.59 iterations/second.
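That works out to roughly a 9.9x speedup, which a quick calculation confirms:

```python
# Speedup of the Olive-optimized model over the default PyTorch path,
# using the iterations/second figures quoted above.
baseline_its = 1.87    # default PyTorch path
optimized_its = 18.59  # Microsoft Olive-optimized model

speedup = optimized_its / baseline_its
print(f"Speedup: {speedup:.1f}x")  # → Speedup: 9.9x
```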



Overview of Microsoft Olive
Microsoft Olive is a Python tool that can be used to convert, optimize, quantize, and auto-tune models for optimal inference performance with ONNX Runtime execution providers like DirectML. Olive greatly simplifies model processing by providing a single toolchain to compose optimization techniques, which is especially important with more complex models like Stable Diffusion that are sensitive to the ordering of optimization techniques. The DirectML sample for Stable Diffusion applies the following techniques:

  • Model conversion: translates the base models from PyTorch to ONNX.
  • Transformer graph optimization: fuses subgraphs into multi-head attention operators and eliminates inefficiencies introduced by the conversion.
  • Quantization: converts most layers from FP32 to FP16 to reduce the model's GPU memory footprint and improve performance.
Combined, the above optimizations enable DirectML to leverage AMD GPUs for greatly improved performance when performing inference with transformer models like Stable Diffusion.
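To see why the FP16 quantization step matters, Python's standard struct module can illustrate the memory/precision trade-off (a minimal sketch; the actual Olive pass operates on ONNX graph weights, not individual values like this):

```python
import struct

# An FP32 value occupies 4 bytes; the same value in FP16 occupies 2,
# which is where the roughly halved GPU memory footprint comes from.
print(struct.calcsize("f"))  # 4 bytes (FP32)
print(struct.calcsize("e"))  # 2 bytes (FP16)

# The trade-off is precision: round-tripping pi through FP16 keeps
# only ~3 decimal digits, which is tolerable for most model weights.
pi32 = 3.14159265
pi16 = struct.unpack("e", struct.pack("e", pi32))[0]
print(pi16)  # 3.140625
```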

Create Optimized Model
(Following the instructions from Olive, we can generate an optimized Stable Diffusion model using Olive.)

  1. Open Anaconda/Miniconda Terminal
  2. Create a new environment by entering the following commands into the terminal one at a time, pressing Enter after each. Note that Python 3.9 is required.
    • conda create --name olive python=3.9
    • conda activate olive
    • pip install olive-ai[directml]==0.2.1
    • git clone https://github.com/microsoft/olive --branch v0.2.1
    • cd olive\examples\directml\stable_diffusion
    • pip install -r requirements.txt
    • pip install pydantic==1.10.12
  3. Generate an ONNX model and optimize it for run-time. This may take a long time.
    • python stable_diffusion.py --optimize
The optimized model will be stored in the following directory; keep this open for later: olive\examples\directml\stable_diffusion\models\optimized\runwayml. The model folder will be called “stable-diffusion-v1-5”. Use the following command to see which other models are supported: python stable_diffusion.py --help
 
Anyone had any luck getting this to work? The Automatic1111 DirectML fork takes a dump on me after I have stepped through this
 
I pretty much moved entirely to ComfyUI for my Stable Diffusion needs when SDXL released, but I also have an Nvidia GPU, so I haven't had to really look much further.
 