python (12.9k questions)
javascript (9.2k questions)
reactjs (4.7k questions)
java (4.2k questions)
c# (3.5k questions)
html (3.3k questions)
onnx.load() | ALBert throws DecodeError: Error parsing message
Goal: re-develop this BERT Notebook to use textattack/albert-base-v2-MRPC.
Kernel: conda_pytorch_p36. PyTorch 1.8.1+cpu.
I convert a PyTorch / HuggingFace Transformers model to ONNX and store it. Deco...
DanielBell99
Votes: 0
Answers: 1
Converting PyTorch to ONNX model increases file size for ALBert
Goal: Use this Notebook to perform quantisation on albert-base-v2 model.
Kernel: conda_pytorch_p36.
Outputs in Sections 1.2 & 2.2 show that:
converting vanilla BERT from PyTorch to ONNX stays th...
DanielBell99
Votes: 0
Answers: 1
ONNX Runtime error: node->GetOutputEdgesCount() == 0 was false. Can't remove node
I have a simple Keras RNN model, composed of embedding, LSTM, and linear layers:
loaded_model.layers
Out[23]:
[<keras.layers.embeddings.Embedding at 0x2275dc1f6a0>,
<keras.layers.recurrent...
MattS
Votes: 0
Answers: 0
onnxruntime inference is way slower than pytorch on GPU
I was comparing inference times for a single input using PyTorch and ONNX Runtime, and I found that ONNX Runtime is actually slower on GPU while being significantly faster on CPU.
I was trying this on Window...
sn710
Votes: 0
Answers: 2
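A fair GPU-vs-CPU comparison needs warm-up iterations (the first GPU run typically pays one-off initialisation costs) and a median over many runs rather than a single timing. A minimal, stdlib-only harness sketching this; it works for any zero-argument callable, e.g. a wrapped `session.run(...)` or `model(...)` call (the wrapper itself is assumed, not shown):

```python
import time
import statistics

def benchmark(fn, warmup=5, runs=50):
    """Median wall-clock time of fn() in milliseconds, after warm-up."""
    for _ in range(warmup):
        fn()  # discarded: covers one-off initialisation and caching
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(times)
```

Comparing medians from this harness for the PyTorch call and the ONNX Runtime call, on the same input and device, removes the warm-up effect from the comparison; whether it changes the conclusion in this question is not something the excerpt settles.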