How to properly implement data reorganization using PyTorch?

好久不见. Submitted on 2020-01-23 12:36:25

Question


It's going to be a long post, sorry in advance...

I'm working on a denoising algorithm and my goal is to:

  • Use PyTorch to design / train the model
  • Convert the PyTorch model into a CoreML model

The denoising algorithm consists of the following 3 parts:

    1. A "down-sampling" + noise level map
    1. A regular convnet
    1. An "up-sampling"

The first part is quite simple in its idea, but not so easy to explain. Given, for instance, an input color image and an input value "sigma" that represents the standard deviation of the image noise, the "down-sampling" part is in fact a space-to-depth. In short, for each channel and each 2x2 block of pixels, the space-to-depth creates a single pixel composed of 4 channels. The number of channels is multiplied by 4 while the height and width are divided by 2; the data is simply reorganized. The noise level map consists of creating 3 channels containing the standard deviation value so that the convnet knows how to properly denoise the input image. This will perhaps be clearer with some code:

def downsample_and_noise_map(input, sigma):

    # Input tensor size (batch, channels, height, width)
    in_n, in_c, in_h, in_w = input.size()

    # Output tensor size
    out_h = in_h // 2
    out_w = in_w // 2
    sigma_c = in_c      # nb of channels of the standard deviation tensor
    image_c = in_c * 4  # nb of channels of the image tensor

    # Standard deviation tensor
    output_sigma = sigma.view(1, 1, 1, 1).repeat(in_n, sigma_c, out_h, out_w)

    # Image tensor
    output_image = torch.zeros((in_n, image_c, out_h, out_w))
    output_image[:, 0::4, :, :] = input[:, :, 0::2, 0::2]
    output_image[:, 1::4, :, :] = input[:, :, 0::2, 1::2]
    output_image[:, 2::4, :, :] = input[:, :, 1::2, 0::2]
    output_image[:, 3::4, :, :] = input[:, :, 1::2, 1::2]

    # Concatenate standard deviation and image tensors
    return torch.cat((output_sigma, output_image), dim=1)

This function is then called as the first step in the model's forward function:

def forward(self, x, sigma):
    x = downsample_and_noise_map(x, sigma)
    x = self.convnet(x)
    x = upsample(x)
    return x

Let's consider an input tensor of size 1x3x100x100 (PyTorch standard: batch, channels, height, width) and a sigma value of 0.1. The output tensor has the following properties:

  • The tensor's shape is 1x15x50x50
  • Channels 0, 1 and 2 are all equal to sigma = 0.1
  • Channels 3, 4, 5 and 6 contain the values of input channel 0
  • Channels 7, 8, 9 and 10 contain the values of input channel 1
  • Channels 11, 12, 13 and 14 contain the values of input channel 2

If this code is not clear enough, I can post an even more naive version.
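As a quick sanity check of these properties (an illustrative snippet only, using the example above):

import torch

x = torch.rand(1, 3, 100, 100)
sigma = torch.tensor(0.1)

out = downsample_and_noise_map(x, sigma)

print(out.shape)                                    # torch.Size([1, 15, 50, 50])
print(torch.all(out[:, 0:3] == sigma))              # tensor(True): noise level channels
print(torch.equal(out[:, 3], x[:, 0, 0::2, 0::2]))  # True: top-left sub-pixels of input channel 0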

The up-sampling part is the reciprocal function of the downsampling one.
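A minimal sketch of that up-sampling, assuming its input keeps the same layout as above (3 noise-level channels followed by the 12 reorganized image channels; the noise-level channels are dropped and the 2x2 blocks are put back in place):

def upsample(input):

    # Input tensor size (batch, channels, height, width)
    in_n, in_c, in_h, in_w = input.size()

    # Drop the noise level channels, keep the reorganized image channels
    image = input[:, 3:, :, :]

    # Output tensor size (depth-to-space)
    out_c = (in_c - 3) // 4
    out_h = in_h * 2
    out_w = in_w * 2

    output = torch.zeros((in_n, out_c, out_h, out_w))
    output[:, :, 0::2, 0::2] = image[:, 0::4, :, :]
    output[:, :, 0::2, 1::2] = image[:, 1::4, :, :]
    output[:, :, 1::2, 0::2] = image[:, 2::4, :, :]
    output[:, :, 1::2, 1::2] = image[:, 3::4, :, :]

    return output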

I was able to use these functions for training and testing in PyTorch.

Then, I tried to convert the model to CoreML with ONNX as an intermediate step. The conversion to ONNX generated "TracerWarning" messages, and the conversion from ONNX to CoreML failed (TypeError: 1.0 has type numpy.float64, but expected one of: int, long). The problem came from the down-sampling + noise level map (and from the up-sampling too).
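For context, the conversion chain looks roughly like this (a sketch only; the ONNX -> CoreML step is shown with the onnx-coreml package, and the exact arguments are omitted):

import torch
from onnx_coreml import convert

# model is the full nn.Module, x / sigma are the example inputs above

# PyTorch -> ONNX
torch.onnx.export(model, (x, sigma), "model.onnx",
                  input_names=["input", "sigma"], output_names=["output"])

# ONNX -> CoreML (this is the step that raises the TypeError above)
coreml_model = convert(model="model.onnx")
coreml_model.save("model.mlmodel")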

When I removed the down-sampling + noise level map and up-sampling layers, I was able to convert to ONNX and to CoreML very easily since only a simple convnet remained. This means I have a solution to my problem: implement these 2 layers using 2 shaders on the mobile side. But I'm not satisfied with this solution as I want my model to contain all layers ^^

Before considering writing a post here, I crawled the Internet to find an answer and I was able to write a better version of the previous function using reshape and permute. This version removed all ONNX warnings, but the CoreML conversion still failed...

def downsample_and_noise_map(input, sigma):

    # Input image size
    in_n, in_c, in_h, in_w = input.size()

    # Output tensor size
    out_n = in_n
    out_h = in_h // 2
    out_w = in_w // 2

    # Create standard deviation tensor
    output_sigma = sigma.view(out_n, 1, 1, 1).repeat(out_n, in_c, out_h, out_w)

    # Split RGB channels
    channels_rgb = torch.split(input, 1, dim=1)

    # Reshape (space-to-depth) each image channel
    channels_reshaped = []
    for channel in channels_rgb:
        channel = channel.reshape(1, out_h, 2, out_w, 2)
        channel = channel.permute(2, 4, 0, 1, 3)
        channel = channel.reshape(1, 4, out_h, out_w)
        channels_reshaped.append(channel)

    # Concatenate all reshaped image channels together
    output_image = torch.cat(channels_reshaped, dim=1)

    # Concatenate standard deviation and image tensors
    output = torch.cat([output_sigma, output_image], dim=1)

    return output

So here are (some of) my questions:

  • What is the preferred PyTorch way to implement a function such as downsample_and_noise_map within a model?
  • Same question, but when the conversion to ONNX and then to CoreML is part of the equation?
  • Is PyTorch -> ONNX -> CoreML still the best path to deploy the model for iOS production?

Thanks for your help (and your patience) ^^


Answer 1:


Disclaimer: I'm not familiar with CoreML or deploying to iOS, but I do have experience deploying PyTorch models to TensorRT and OpenVINO via ONNX.

The main issue I've faced when deploying to other frameworks is that operations like slicing and repeating tensors tend to have limited support. Often we can construct equivalent conv or transposed-conv operations that achieve the desired behavior.

In order to ensure we don't export the logic used to construct the conv weights, I've separated the weight initialization from the application of the weights. This makes the ONNX export much more straightforward since all it sees is some constant tensors being applied.

class DownsampleAndNoiseMap():
    def __init__(self):
        self.initialized = False
        self.weight = None
        self.zeros = None

    def init_weights(self, input):
        with torch.no_grad():
            in_n, in_c, in_h, in_w = input.size()

            out_h = int(in_h // 2)
            out_w = int(in_w // 2)
            sigma_c = in_c
            image_c = in_c * 4

            # conv weights used for downsampling
            self.weight = torch.zeros(image_c, in_c, 2, 2).to(input)
            for c in range(in_c):
                self.weight[4 * c, c, 0, 0] = 1
                self.weight[4 * c + 1, c, 0, 1] = 1
                self.weight[4 * c + 2, c, 1, 0] = 1
                self.weight[4 * c + 3, c, 1, 1] = 1

            # zeros used to replace repeat
            self.zeros = torch.zeros(in_n, sigma_c, out_h, out_w).to(input)

        self.initialized = True

    def __call__(self, input, sigma):
        assert self.initialized
        output_sigma = self.zeros + sigma
        output_image = torch.nn.functional.conv2d(input, self.weight, stride=2)
        return torch.cat((output_sigma, output_image), dim=1)

class Upsample():
    def __init__(self):
        self.initialized = False
        self.weight = None

    def init_weights(self, input):
        with torch.no_grad():
            in_n, in_c, in_h, in_w = input.size()

            image_c = in_c * 4

            self.weight = torch.zeros(in_c + image_c, in_c, 2, 2).to(input)
            for c in range(in_c):
                self.weight[in_c + 4 * c, c, 0, 0] = 1
                self.weight[in_c + 4 * c + 1, c, 0, 1] = 1
                self.weight[in_c + 4 * c + 2, c, 1, 0] = 1
                self.weight[in_c + 4 * c + 3, c, 1, 1] = 1

        self.initialized = True

    def __call__(self, input):
        assert self.initialized
        return torch.nn.functional.conv_transpose2d(input, self.weight, stride=2)

I made the assumption that upsample was the reciprocal of downsample in the sense that x == upsample(downsample_and_noise_map(x, sigma)) (correct me if I'm wrong in this assumption). I also verified that my version of downsample agrees with yours.

# consistency checking code
x = torch.randn(1, 3, 100, 100)
sigma = torch.randn(1)

# OP downsampling
y1 = downsample_and_noise_map(x, sigma)

ds = DownsampleAndNoiseMap()
ds.init_weights(x)
y2 = ds(x, sigma)

print('downsample diff:', torch.sum(torch.abs(y1 - y2)).item())

us = Upsample()
us.init_weights(x)
x_recov = us(ds(x, sigma))

print('recovery error:', torch.sum(torch.abs(x - x_recov)).item())

which results in

downsample diff: 0.0
recovery error: 0.0

Exporting to ONNX

When exporting, we need to invoke init_weights for the new classes before calling torch.onnx.export. For example:

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.downsample = DownsampleAndNoiseMap()
        self.upsample = Upsample()
        self.convnet = lambda x: x  # placeholder

    def init_weights(self, x):
        self.downsample.init_weights(x)
        self.upsample.init_weights(x)

    def forward(self, x, sigma):
        x = self.downsample(x, sigma)
        x = self.convnet(x)
        x = self.upsample(x)
        return x

x = torch.randn(1, 3, 100, 100)
sigma = torch.randn(1)

model = Model()
# ... load state dict here
model.init_weights(x)
torch.onnx.export(model, (x, sigma), 'deploy.onnx', verbose=True, input_names=["input", "sigma"], output_names=["output"])

which gives the ONNX graph

graph(%input : Float(1, 3, 100, 100)
      %sigma : Float(1)) {
  %2 : Float(1, 3, 50, 50) = onnx::Constant[value=<Tensor>](), scope: Model
  %3 : Float(1, 3, 50, 50) = onnx::Add(%2, %sigma), scope: Model
  %4 : Float(12, 3, 2, 2) = onnx::Constant[value=<Tensor>](), scope: Model
  %5 : Float(1, 12, 50, 50) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[2, 2], pads=[0, 0, 0, 0], strides=[2, 2]](%input, %4), scope: Model
  %6 : Float(1, 15, 50, 50) = onnx::Concat[axis=1](%3, %5), scope: Model
  %7 : Float(15, 3, 2, 2) = onnx::Constant[value=<Tensor>](), scope: Model
  %output : Float(1, 3, 100, 100) = onnx::ConvTranspose[dilations=[1, 1], group=1, kernel_shape=[2, 2], pads=[0, 0, 0, 0], strides=[2, 2]](%6, %7), scope: Model
  return (%output);
}

As for the last question about the recommended way to deploy on iOS, I can't answer that since I don't have experience in that area.



Source: https://stackoverflow.com/questions/59177052/how-to-properly-implement-data-reorganization-using-pytorch
