I am trying to implement a simple autoencoder using PyTorch. My dataset consists of 256 x 256 x 3 images. I have built a torch.utils.data.dataloader.DataLoader for these images.
Whenever you see:

RuntimeError: size mismatch, m1: [a x b], m2: [c x d]

the only thing you have to check is that b = c, and you are done:

m1 is [a x b], i.e. [batch size x in_features]
m2 is [c x d], i.e. [in_features x out_features]
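For instance, here is a minimal sketch (shapes chosen purely for illustration, not taken from your model) showing when this error does and does not occur:

import torch
import torch.nn as nn

layer = nn.Linear(784, 128)           # weight acts as m2: [784 x 128]
ok = layer(torch.randn(32, 784))      # m1: [32 x 784], b = c = 784 -> works
# layer(torch.randn(32, 256))         # m1: [32 x 256], 256 != 784 -> size mismatch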
If your input is 3 x 256 x 256, then you need to reshape it to B x N before passing it through the linear layer nn.Linear(3*256*256, 128), where B is the batch size and N is the number of input features of the linear layer (here 3*256*256 = 196608).
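For a whole batch, a minimal sketch (the 128-dimensional code size is just an assumption for illustration) is to flatten inside the model:

import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Flatten(),                      # [B, 3, 256, 256] -> [B, 196608]
    nn.Linear(3 * 256 * 256, 128),     # N = 196608 input features
    nn.ReLU(),
)

batch = torch.randn(8, 3, 256, 256)    # B = 8 random images
codes = encoder(batch)                 # shape: [8, 128]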
If you feed one image at a time, you can reshape your input tensor of shape 3 x 256 x 256 to 1 x (3*256*256) as follows:
img = img.view(1, -1)   # [3 x 256 x 256] -> [1 x 196608]
output = model(img)
Your error:

size mismatch, m1: [76800 x 256], m2: [784 x 128]

says that the output of the previous step does not match the input shape the next layer expects (256 != 784):

m1: [76800 x 256], m2: [784 x 128]   # Incorrect: 256 != 784
m1: [76800 x 256], m2: [256 x 128]   # Correct: inner dimensions match
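As a hedged illustration (the layer sizes below are assumptions, not your actual model), the shapes line up once the linear layer's in_features equals the second dimension of what you feed it:

import torch
import torch.nn as nn

x = torch.randn(76800, 256)        # plays the role of m1: [76800 x 256]

bad = nn.Linear(784, 128)          # m2: [784 x 128] -> 256 != 784, size mismatch
good = nn.Linear(256, 128)         # m2: [256 x 128] -> inner dimensions match

out = good(x)                      # shape: [76800, 128]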