                                                                        MXNet author / Amazon Principal Scientist  Mu Li

     

    PyTorch is a purely imperative deep learning framework. It is widely liked for its simple, easy-to-understand programming interface, and it is spreading fast; Caffe2, for example, was recently merged into PyTorch.

    What is perhaps less well known is that MXNet offers a very similar programming interface through its ndarray and gluon modules. This article briefly compares how the same algorithm is written in the two frameworks.

    Multi-dimensional arrays

    For multi-dimensional arrays, PyTorch keeps the Torch name tensor, while MXNet follows NumPy and calls them ndarray. Below we create a two-dimensional matrix with every element initialized to 1, then add 1 to every element and print it.

    • PyTorch:
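    A minimal sketch of this example (the 5×3 shape is an assumption):

    import torch

    # Create a matrix of ones, add 1 to every element, then print it.
    x = torch.ones(5, 3)
    y = x + 1
    print(y)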

     

    • MXNet:
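    The MXNet equivalent of the same example:

    from mxnet import nd

    # MXNet takes the shape as a tuple, NumPy-style.
    x = nd.ones((5, 3))
    y = x + 1
    print(y)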

     

    Ignoring the difference in package names, the main difference is that MXNet's shape argument must be wrapped in parentheses (a tuple), just as in NumPy.

     

    Model training ......
  • pytorch/mxnet model TensorRT deployment


    This post records the whole workflow of running pytorch/mxnet models through TensorRT, and the pitfalls hit along the way.
    TensorRT supports accelerated inference for TensorFlow's UFF format, for ONNX, and for custom models. For PyTorch there is the third-party torch2trt project, but it needs the model defined and loaded alongside the converter, so the model cannot be separated from TensorRT:

    import torch
    from torch2trt import torch2trt
    from torchvision.models.alexnet import alexnet
    
    # create some regular pytorch model...
    model = alexnet(pretrained=True).eval().cuda()
    
    # create example data
    x = torch.ones((1, 3, 224, 224)).cuda()
    
    # convert to TensorRT feeding sample data as input
    model_trt = torch2trt(model, [x])
    

    Since deployment would then still depend on the PyTorch environment, I did not try it.

    MXNet officially has an interface that converts directly to TensorRT:

    # sym, arg_params, aux_params come from mx.model.load_checkpoint;
    # batch_shape and input are the input shape and the data batch.
    arg_params.update(aux_params)
    all_params = dict([(k, v.as_in_context(mx.gpu(0))) for k, v in arg_params.items()])
    executor = mx.contrib.tensorrt.tensorrt_bind(sym, ctx=mx.gpu(0), all_params=all_params,
                                                 data=batch_shape, grad_req='null', force_rebind=True)
    y_gen = executor.forward(is_train=False, data=input)
    y_gen[0].wait_to_read()
    

    I did not try this either. The main goal was to keep deployment separate, using only a TensorRT environment rather than installing the whole deep-learning stack. A rough sketch of that ONNX-only deployment flow follows.
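    A minimal sketch with the TensorRT 7 Python API, assuming an ONNX file like the resnet-50.onnx produced below:

    import tensorrt as trt

    # Build an engine straight from an ONNX file, so deployment needs only
    # the TensorRT environment.
    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open("resnet-50.onnx", "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 28   # 256 MiB
    engine = builder.build_engine(network, config)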

    Both PyTorch and MXNet have official interfaces and documentation for converting to ONNX, and they are easy to use:

    # mxnet to onnx
    import numpy as np
    from mxnet.contrib import onnx as onnx_mxnet

    sym = './resnet-50-symbol.json'
    params = './resnet-50-0000.params'
    input_shape = (1, 3, 224, 224)
    onnx_file = './resnet-50.onnx'
    converted_model_path = onnx_mxnet.export_model(sym, params, [input_shape], np.float32, onnx_file)
    
    # pytorch to onnx
    import torch
    import torchvision
    
    dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
    model = torchvision.models.alexnet(pretrained=True).cuda()
    
    # Providing input and output names sets the display names for values
    # within the model's graph. Setting these does not change the semantics
    # of the graph; it is only for readability.
    #
    # The inputs to the network consist of the flat list of inputs (i.e.
    # the values you would pass to the forward() method) followed by the
    # flat list of parameters. You can partially specify names, i.e. provide
    # a list here shorter than the number of inputs to the model, and we will
    # only set that subset of names, starting from the beginning.
    input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
    output_names = [ "output1" ]
    
    torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)
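    After exporting, the file can be sanity-checked with the onnx package; a quick sketch:

    import onnx

    # Load the exported file, validate the graph, and print it.
    onnx_model = onnx.load("alexnet.onnx")
    onnx.checker.check_model(onnx_model)
    print(onnx.helper.printable_graph(onnx_model.graph))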
    

    Issues hit while converting to ONNX

    1. Custom layer SegmentConsensus not recognized
      A custom layer needs custom handling when exporting PyTorch to ONNX, and again when converting ONNX to TRT. For layers like this it is better to understand the underlying computation and rebuild it from basic ops; this one is simple and was re-implemented with mean and index_select operations (a sketch follows below).
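    What such a re-implementation can look like, assuming the consensus is an average over the segment dimension (the function name and shapes are assumptions, not the original code):

    import torch

    # Hypothetical stand-in for SegmentConsensus: an average over segments,
    # built only from ops the ONNX exporter can trace.
    def segment_consensus(x):               # x: (batch, segments, features)
        return x.mean(dim=1, keepdim=True)  # consensus over the segment axis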

    2. TracerWarning: There are 2 live references to the data region being modified when tracing in-place operator copy_ (possibly due to an assignment). This might cause the trace to be incorrect, because all other views that also reference this data will not reflect this change in the trace! On the other hand, if all other views use the same memory chunk, but are disjoint (e.g. are outputs of torch.split), this might still be safe
      The warning means the data being modified in place has two live references, so it cannot be traced correctly. The offending code:

    out[:, :-1, :fold] = x[:, 1:, :fold] # shift left
    out[:, 1:, fold: 2 * fold] = x[:, :-1, fold: 2 * fold]  # shift right
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:] # not shift
    

    From the references I found, the left-hand side of the assignment holds one reference and the slice holds another, and with two live references the operation cannot be traced. So drop the in-place slice writes and assemble the shifted tensor from whole pieces with torch.cat instead:

    left_side = torch.cat((x[:, 1:, :fold], torch.zeros(1, 1, fold, h, w)), dim=1)
    middle_side = torch.cat((torch.zeros(1, 1, fold, h, w), x[:, :n_segment - 1, fold: 2 * fold]), dim=1)
    out = torch.cat((left_side, middle_side, x[:, :, 2 * fold:]), dim=2)
    
    3. Converting only part of a model to ONNX
      The saved checkpoint may be a full pretrained model when only some of its layers are needed. With MXNet you can cut the symbol graph at the desired output layer and then convert:
    sym, arg_params, aux_params = mx.model.load_checkpoint(pretrained, epoch)
    sym = get_output_sym(sym, 'fc1_output')   # cut the graph at fc1_output
    arg_params.update(aux_params)
    onnx_mx.export_model(sym, arg_params, input_shape, onnx_file_path=onnx_file_path, verbose=True)
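    get_output_sym is not shown in the post; a plausible one-liner built on MXNet's get_internals (an assumption, not the author's code):

    import mxnet as mx

    def get_output_sym(sym, output_name):
        # Truncate the symbol graph at the named internal output, e.g. 'fc1_output'.
        return sym.get_internals()[output_name]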
    

    With PyTorch you can subclass torch.nn.Module, pass the model in, and customize it yourself:

    class ExtractFeature(torch.nn.Module):
        def __init__(self, cnn, frames=16):
            super().__init__()
            self.model = cnn
            self.num_segments = frames
            self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

        def forward(self, data):
            n = self.model
            pool = torch.nn.MaxPool2d(3, 2)
            with torch.no_grad():
                # fold the segment dim into the batch: (batch*segments, 3, h, w)
                x = data.view((-1, 3) + data.size()[-2:]).to(self.device)
                x = n.conv1(x)
                x = n.bn1(x)
                x = n.relu(x)
                x = n.maxpool(x)
                x = n.layer1(x)
                x = n.layer2(x)
                x = n.layer3(x)
                x = n.layer4(x)
                x = pool(x)
                x = x.flatten(start_dim=1)
            # regroup into (batch, segments, features)
            return x.view((-1, self.num_segments) + x.size()[1:])
    
    4. The model is not called through the default forward
      Models subclass torch.nn.Module, whose __call__ method lets an instance be called like a function; __call__ eventually dispatches to forward. So how do you export to ONNX when the model is never used through forward? For example,
      out = net.forward_features(x)
      calls forward_features explicitly. My first thought was to subclass and wire forward through to forward_features, but you can simply re-point the method directly (a fuller sketch follows below):
      OCR.forward = OCR.forward_ocr
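    A minimal self-contained sketch of the pattern (Net and forward_features are placeholders for the real model):

    import torch

    class Net(torch.nn.Module):
        def forward_features(self, x):
            return x * 2

    # Re-point forward at class level, then export as usual.
    Net.forward = Net.forward_features
    net = Net().eval()
    torch.onnx.export(net, torch.randn(1, 4), "net.onnx", opset_version=11)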

    5. Exporting the operator GatherElements to ONNX opset version 9 is not supported
      opset 9 does not support this op. Raise the opset version: opset_version defaults to 9, the highest at the time was 12, and higher opsets support more ops.

    torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names, opset_version=11)
    
    6. Dynamic input
      Dynamic input covers the batch size as well as a variable h and w; mark the variable dimensions with the dynamic_axes argument:
    torch.onnx.export(OCR, dummy_input ,onnx_ocr_forword_ocr_path, 
                         input_names=['input'], 
                         output_names=['segm_pred', 'segm_pred2', 'rbox', 'rbox2', 'angle', 'angle2', 'x'],
                         opset_version=11,
                         dynamic_axes={"input": {0: 'batch',2:'h', 3:'w'}})
    
    7. Testing with onnxruntime
      Once conversion finishes, check that the ONNX model and the original model produce the same outputs. Running the model with onnxruntime first failed with "can't load culib 10.1" (the CUDA library was not found); the code and official docs state that only CUDA 10.1 is supported, so installing that exact version fixes it. A minimal output comparison might look like the sketch below.
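    This sketch reuses the alexnet export from above (input name actual_input_1, fixed batch size 10); it is an illustration, not the author's test code:

    import numpy as np
    import onnxruntime as ort
    import torch
    import torchvision

    # Run the same input through the original model and the exported ONNX
    # model, then compare outputs.
    model = torchvision.models.alexnet(pretrained=True).eval()
    x = torch.randn(10, 3, 224, 224)   # the export above used batch size 10
    with torch.no_grad():
        torch_out = model(x).numpy()

    sess = ort.InferenceSession("alexnet.onnx")
    onnx_out = sess.run(None, {"actual_input_1": x.numpy()})[0]
    print("max abs diff:", np.max(np.abs(torch_out - onnx_out)))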

    Issues hit with TensorRT

    1. Downloaded the latest TensorRT 7.1 from the official site, installed it, and set the environment variables; the package contains only .so libraries and some C files, and import tensorrt fails. The official docs explain why: TensorRT's Python API is not supported on Windows, so the Python interface cannot be used there.

    2. [TensorRT] ERROR: …/rtSafe/cuda/caskConvolutionRunner.cpp (290) - Cask Error in checkCaskExecError: 7 (Cask Convolution execution)
      [TensorRT] ERROR: FAILED_EXECUTION: std::exception
      This happens when the engine is created in one thread and executed in another. With multithreading in play, put engine creation and execution in the same thread.

    3. [TensorRT] ERROR: …/rtSafe/cuda/cudaConvolutionRunner.cpp (303) - Cudnn Error in execute: 7 (CUDNN_STATUS_MAPPING_ERROR)
      [TensorRT] ERROR: FAILED_EXECUTION: std::exception
      Once the engine is created, do not call .to(device) or .cuda(): the PyTorch and MXNet code that moves the model and data onto CUDA must be removed.

    4. [TensorRT] WARNING: Explicit batch network detected and batch size specified, use execute without batch size instead.
      [TensorRT] ERROR: Parameter check failed at: engine.cpp::resolveSlots::1024, condition: allInputDimensionsSpecified(routine)
      With a dynamic batch size TensorRT cannot build the engine directly; you have to set up an optimization profile:

    # builder, network and config come from the usual TensorRT setup, e.g.
    # builder = trt.Builder(logger); config = builder.create_builder_config()
    profile = builder.create_optimization_profile()
    profile.set_shape(ModelData.INPUT_NAME,
                      ModelData.MIN_INPUT_SHAPE,
                      ModelData.OPT_INPUT_SHAPE,
                      ModelData.MAX_INPUT_SHAPE)
    config.add_optimization_profile(profile)
    engine = builder.build_engine(network, config)
    
    5. [TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
      This looks like GPU memory running out, but nvidia-smi -l showed plenty of memory still free. Debugging showed the buffer size being allocated was negative: with a dynamic batch size the first dimension of the binding shape is -1, so the computed size is negative and allocation fails. Flip the sign and multiply by the batch size before allocating the buffer:
    size = trt.volume(engine.get_binding_shape(binding)) * batch_size
    if size < 0:
       size *= -1
    dtype = trt.nptype(engine.get_binding_dtype(binding))
    # Allocate host and device buffers
    host_mem = cuda.pagelocked_empty(size, dtype)
    
    6. Dynamic input again: for object detection the input width and height are not fixed. Just like the batch size in issue 4, the minimum, typical, and maximum H/W must be set in the profile. On top of that, w and h must be passed into allocate_buffers; otherwise they default to -1 and the computed size comes out far too small. The bindings are the input and the output; the loop computes a size and allocates a buffer for each, so the input and output h_ and w_ have to be passed in separately:
    for binding in engine:
        # size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        if binding == 'input':
            size = trt.volume(engine.get_binding_shape(binding)) * batch_size * h_ * w_
        else:
            size = trt.volume(engine.get_binding_shape(binding)) * batch_size * math.ceil(h_ / 4) * math.ceil(w_ / 4)
        if size < 0:
            size *= -1

        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
    
    7. [TensorRT] ERROR: instance normalization doesn't support dynamic input
      Instance normalization does not support dynamic input, and the detection model uses it, so there is no way around it. This can also be filed under unsupported ops. You can either define a custom op in ONNX and a matching plugin in TensorRT, or rewrite the op. Here the op was rewritten: in torch/onnx/symbolic_opset9.py, the original instance_norm function was changed to the following
    @parse_args('v', 'v', 'v', 'v', 'v', 'i', 'f', 'f', 'i')
    def instance_norm(g, input, weight, bias, running_mean, running_var, use_input_stats, momentum, eps,
                      cudnn_enabled):
                      
        axes = [-i for i in range(2, 0, -1)]
    
        two_cst = g.op("Constant", value_t=torch.tensor(2.))
        eps_cst = g.op("Constant", value_t=torch.tensor(eps))
    
        mean = g.op("ReduceMean", input, axes_i=axes)
        numerator = sub(g, input, mean)
        # variance = e((x - e(x))^2), and (x - e(x)) is the numerator in the layer_norm formula
        variance = g.op("ReduceMean", pow(g, numerator, two_cst), axes_i=axes)
        denominator = sqrt(g, add(g, variance, eps_cst))
    
        inst_norm = div(g, numerator, denominator)
        if not (weight is None or weight.node().mustBeNone()):
            inst_norm = mul(g, inst_norm, weight)
        if not (bias is None or bias.node().mustBeNone()):
            inst_norm = add(g, inst_norm, bias)
    
        return inst_norm
    
    8. After the instance_norm rewrite: mul elementwise dimension mismatch [1,256,4,8] and [1,1,1,256]
      The multiply fails when scaling by the γ parameter after the mean and variance have been computed. Under PyTorch's broadcasting rules, tensors of different rank reach a common shape through expand and repeat, aligned from the trailing dimension; the dimensions added to the lower-rank tensor all go in front. Here the weight becomes [1,1,1,256]: its last dimension is 256 rather than 1, so it cannot broadcast against [1,256,4,8] and the shapes mismatch. Simply Unsqueeze two trailing dimensions so the channel count lands in the second dimension, and broadcasting works:
    @parse_args('v', 'v', 'v', 'v', 'v', 'i', 'f', 'f', 'i')
    def instance_norm(g, input, weight, bias, running_mean, running_var, use_input_stats, momentum, eps,
                      cudnn_enabled):
                      
        axes = [-i for i in range(2, 0, -1)]
    
        two_cst = g.op("Constant", value_t=torch.tensor(2.))
        eps_cst = g.op("Constant", value_t=torch.tensor(eps))
    
        mean = g.op("ReduceMean", input, axes_i=axes)
        numerator = sub(g, input, mean)
        # variance = e((x - e(x))^2), and (x - e(x)) is the numerator in the layer_norm formula
        variance = g.op("ReduceMean", pow(g, numerator, two_cst), axes_i=axes)
        denominator = sqrt(g, add(g, variance, eps_cst))
    
    inst_norm = div(g, numerator, denominator)
        if not (weight is None or weight.node().mustBeNone()):
            # move the channel dim into broadcast position: (C,) -> (C,1,1)
            weight = g.op("Unsqueeze", weight, axes_i=[-1])
            weight = g.op("Unsqueeze", weight, axes_i=[-1])
            inst_norm = mul(g, inst_norm, weight)
        if not (bias is None or bias.node().mustBeNone()):
            bias = g.op("Unsqueeze", bias, axes_i=[-1])
            bias = g.op("Unsqueeze", bias, axes_i=[-1])
            inst_norm = add(g, inst_norm, bias)
    
        return inst_norm
    
  • Learning MXNet alongside PyTorch


    Learning MXNet alongside PyTorch

    This is a translation and summary of the MXNet tutorial site, for readers who already have some PyTorch experience and are moving over to MXNet.

    1. Runtime performance

    According to NVIDIA performance benchmarks from April 2019, Apache MXNet beats PyTorch by ~77% when training ResNet-50: 10,925 images per second vs. 6,175.

    2. Data manipulation

    PyTorch:
    import torch
    x = torch.ones(5,3)
    y = x + 1
    y
    
    MXNet:

    When creating a tensor, MXNet takes the shape as a tuple:

    from mxnet import nd
    
    x = nd.ones((5,3))
    y = x + 1
    y
    

    3. Building a model

    PyTorch:
    import torch.nn as pt_nn
    
    pt_net = pt_nn.Sequential(
        pt_nn.Linear(28*28, 256),
        pt_nn.ReLU(),
        pt_nn.Linear(256, 10))
    
    MXNet:
    1. Dense does not need an input size; MXNet infers it during the first forward pass.
    2. The activation function can be passed straight into fully connected and convolutional layers, e.g. activation='relu'
    import mxnet.gluon.nn as mx_nn
    
    mx_net = mx_nn.Sequential()
    mx_net.add(mx_nn.Dense(256, activation='relu'),
               mx_nn.Dense(10))
    mx_net.initialize()
    

    4. Loss function and optimizer

    PyTorch:
    pt_loss_fn = pt_nn.CrossEntropyLoss()
    pt_trainer = torch.optim.SGD(pt_net.parameters(), lr=0.1)
    
    MXNet:
    1. The Trainer class takes an optimization algorithm as an argument, e.g. 'sgd'
    2. Fetch the parameters from the network with .collect_params()
    from mxnet import gluon

    mx_loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
    mx_trainer = gluon.Trainer(mx_net.collect_params(),
                               'sgd', {'learning_rate': 0.1})
    

    5. Training

    PyTorch:
    import time
    
    for epoch in range(5):
        total_loss = .0
        tic = time.time()
        for X, y in pt_train_data:
            pt_trainer.zero_grad()
            loss = pt_loss_fn(pt_net(X.view(-1, 28*28)), y)
            loss.backward()
            pt_trainer.step()
            total_loss += loss.mean()
        print('epoch %d, avg loss %.4f, time %.2f' % (
            epoch, total_loss/len(pt_train_data), time.time()-tic))
    
    MXNet:
    1. Computation has to run inside an autograd.record() scope so it can be differentiated automatically in the backward pass
    2. No need to call optimizer.zero_grad() every step as in PyTorch; MXNet writes new gradients by default instead of accumulating them
    3. When updating the weights, pass the update size, usually the batch size
    4. Call asscalar() to turn the one-element ndarray into a Python scalar
    from mxnet import autograd
    
    for epoch in range(5):
        total_loss = .0
        tic = time.time()
        for X, y in mx_train_data:
            with autograd.record():
                loss = mx_loss_fn(mx_net(X), y)
            loss.backward()
            mx_trainer.step(batch_size=128)
            total_loss += loss.mean().asscalar()
        print('epoch %d, avg loss %.4f, time %.2f' % (
            epoch, total_loss/len(mx_train_data), time.time()-tic))
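    The loaders pt_train_data and mx_train_data are not defined in this excerpt; loaders along these lines would fit (MNIST with batch size 128, matching the step(batch_size=128) call; an assumption):

    import torch
    import torchvision
    from mxnet import gluon

    # PyTorch loader (assumed dataset: MNIST, batch size 128).
    pt_train_data = torch.utils.data.DataLoader(
        torchvision.datasets.MNIST(root='.', train=True, download=True,
                                   transform=torchvision.transforms.ToTensor()),
        batch_size=128, shuffle=True)

    # MXNet loader for the same dataset.
    mx_train_data = gluon.data.DataLoader(
        gluon.data.vision.MNIST(train=True).transform_first(
            gluon.data.vision.transforms.ToTensor()),
        batch_size=128, shuffle=True)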
    
  • Converting a PyTorch model to MXNet


    Introduction

    Gluon is a further wrapper around MXNet, and the style of the wrapper is very close to PyTorch.

    The benefit of Gluon is that it makes porting a PyTorch model to MXNet very easy.

    The only problem is that the Gluon wrapper is not yet mature: it does not ship many ready-made layers, and common ones such as concat and upsampling are missing.

    This post looks at how to quickly convert a PyTorch model into an MXNet network built on symbol and executor.

    pytorch to mxnet module

    Key points:

    • When designing the MXNet network, the symbol names must match the layer names used in the PyTorch model's initialization
    • torch.load() reads the PyTorch checkpoint as a dict; take its 'state_dict' entry, which is itself a dict
    • In the PyTorch state_dict, each key is the name of a layer parameter and each value is the parameter array
    • PyTorch organizes parameter names the same way MXNet does, but with a different separator: PyTorch uses '.' while MXNet uses '_' (see the toy sketch after this list), for example:

    pytorch '0.conv1.0.weight'

    mxnet  '0_conv1_0_weight'

    • The PyTorch parameter array and the MXNet parameter array are exactly the same; once the names line up, direct assignment initializes the MXNet model
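    The renaming is nothing more than a separator swap; a toy sketch:

    # Map PyTorch parameter names to MXNet-style names.
    state_dict = {'0.conv1.0.weight': None}   # toy stand-in for real tensors
    mx_named = {k.replace('.', '_'): v for k, v in state_dict.items()}
    print(list(mx_named))                     # ['0_conv1_0_weight']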

    What needs to be done:

    • Design an MXNet network that mirrors the PyTorch network
    • Load the PyTorch checkpoint
    • Adjust the keys of the PyTorch checkpoint's state_dict to match MXNet's naming format

    FlowNet2S PytorchToMxnet

    The PyTorch FlowNet2-S checkpoint can be found on GitHub.

    import mxnet as mx
    from symbol_util import *
    import pickle
    
    def get_loss(data, label, loss_scale, name, get_input=False, is_sparse = False, type='stereo'):
    
        if type == 'stereo':
            data = mx.sym.Activation(data=data, act_type='relu',name=name+'relu')
        # loss
        if  is_sparse:
            loss =mx.symbol.Custom(data=data, label=label, name=name, loss_scale= loss_scale, is_l1=True,
                op_type='SparseRegressionLoss')
        else:
            loss = mx.sym.MAERegressionOutput(data=data, label=label, name=name, grad_scale=loss_scale)
        return (loss,data) if get_input else loss
    
    
    def flownet_s(loss_scale, is_sparse=False, name=''):
        img1 = mx.symbol.Variable('img1')
        img2 = mx.symbol.Variable('img2')
        data = mx.symbol.concat(img1,img2,dim=1)
        labels = {'loss{}'.format(i): mx.sym.Variable('loss{}_label'.format(i)) for i in range(0, 7)}
        # print('labels: ',labels)
        prediction = {}# a dict for loss collection
        loss = []#a list
    
        #normalize
        data = (data-125)/255
    
    # extract feature
        conv1 = mx.sym.Convolution(data, pad=(3, 3), kernel=(7, 7), stride=(2, 2), num_filter=64, name=name + 'conv1_0')
        conv1 = mx.sym.LeakyReLU(data=conv1, act_type='leaky', slope=0.1)
    
        conv2 = mx.sym.Convolution(conv1, pad=(2, 2), kernel=(5, 5), stride=(2, 2), num_filter=128, name=name + 'conv2_0')
        conv2 = mx.sym.LeakyReLU(data=conv2, act_type='leaky', slope=0.1)
    
        conv3a = mx.sym.Convolution(conv2, pad=(2, 2), kernel=(5, 5), stride=(2, 2), num_filter=256, name=name + 'conv3_0')
        conv3a = mx.sym.LeakyReLU(data=conv3a, act_type='leaky', slope=0.1)
    
        conv3b = mx.sym.Convolution(conv3a, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=256, name=name + 'conv3_1_0')
        conv3b = mx.sym.LeakyReLU(data=conv3b, act_type='leaky', slope=0.1)
    
        conv4a = mx.sym.Convolution(conv3b, pad=(1, 1), kernel=(3, 3), stride=(2, 2), num_filter=512, name=name + 'conv4_0')
        conv4a = mx.sym.LeakyReLU(data=conv4a, act_type='leaky', slope=0.1)
    
        conv4b = mx.sym.Convolution(conv4a, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=512, name=name + 'conv4_1_0')
        conv4b = mx.sym.LeakyReLU(data=conv4b, act_type='leaky', slope=0.1)
    
        conv5a = mx.sym.Convolution(conv4b, pad=(1, 1), kernel=(3, 3), stride=(2, 2), num_filter=512, name=name + 'conv5_0')
        conv5a = mx.sym.LeakyReLU(data=conv5a, act_type='leaky', slope=0.1)
    
        conv5b = mx.sym.Convolution(conv5a, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=512, name=name + 'conv5_1_0')
        conv5b = mx.sym.LeakyReLU(data=conv5b, act_type='leaky', slope=0.1)
    
        conv6a = mx.sym.Convolution(conv5b, pad=(1, 1), kernel=(3, 3), stride=(2, 2), num_filter=1024, name=name + 'conv6_0')
        conv6a = mx.sym.LeakyReLU(data=conv6a, act_type='leaky', slope=0.1)
    
        conv6b = mx.sym.Convolution(conv6a, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=1024,
                                    name=name + 'conv6_1_0')
        conv6b = mx.sym.LeakyReLU(data=conv6b, act_type='leaky', slope=0.1, )
    
        #predict flow
        pr6 = mx.sym.Convolution(conv6b, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=2,
                                 name=name + 'predict_flow6')
        prediction['loss6'] = pr6
    
        upsample_pr6to5 = mx.sym.Deconvolution(pr6, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=2,
                                               name=name + 'upsampled_flow6_to_5', no_bias=True)
        upconv5 = mx.sym.Deconvolution(conv6b, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=512,
                                       name=name + 'deconv5_0', no_bias=False)
        upconv5 = mx.sym.LeakyReLU(data=upconv5, act_type='leaky', slope=0.1)
        iconv5 = mx.sym.Concat(conv5b, upconv5, upsample_pr6to5, dim=1)
    
    
        pr5 = mx.sym.Convolution(iconv5, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=2,
                                 name=name + 'predict_flow5')
        prediction['loss5'] = pr5
    
        upconv4 = mx.sym.Deconvolution(iconv5, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=256,
                                       name=name + 'deconv4_0', no_bias=False)
        upconv4 = mx.sym.LeakyReLU(data=upconv4, act_type='leaky', slope=0.1)
    
        upsample_pr5to4 = mx.sym.Deconvolution(pr5, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=2,
                                               name=name + 'upsampled_flow5_to_4', no_bias=True)
    
        iconv4 = mx.sym.Concat(conv4b, upconv4, upsample_pr5to4)
    
        pr4 = mx.sym.Convolution(iconv4, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=2,
                                 name=name + 'predict_flow4')
        prediction['loss4'] = pr4
    
        upconv3 = mx.sym.Deconvolution(iconv4, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=128,
                                       name=name + 'deconv3_0', no_bias=False)
        upconv3 = mx.sym.LeakyReLU(data=upconv3, act_type='leaky', slope=0.1)
    
        upsample_pr4to3 = mx.sym.Deconvolution(pr4, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=2,
                                               name= name + 'upsampled_flow4_to_3', no_bias=True)
        iconv3 = mx.sym.Concat(conv3b, upconv3, upsample_pr4to3)
    
        pr3 = mx.sym.Convolution(iconv3, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=2,
                                 name=name + 'predict_flow3')
        prediction['loss3'] = pr3
    
        upconv2 = mx.sym.Deconvolution(iconv3, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=64,
                                       name=name + 'deconv2_0', no_bias=False)
        upconv2 = mx.sym.LeakyReLU(data=upconv2, act_type='leaky', slope=0.1)
    
        upsample_pr3to2 = mx.sym.Deconvolution(pr3, pad=(1, 1), kernel=(4, 4), stride=(2, 2), num_filter=2,
                                               name=name + 'upsampled_flow3_to_2', no_bias=True)
        iconv2 = mx.sym.Concat(conv2, upconv2, upsample_pr3to2)
    
        pr2 = mx.sym.Convolution(iconv2, pad=(1, 1), kernel=(3, 3), stride=(1, 1), num_filter=2,
                                 name=name + 'predict_flow2')
        prediction['loss2'] = pr2
        flow = mx.sym.UpSampling(arg0=pr2,scale=4,num_filter=2,num_args = 1,sample_type='nearest', name='upsample_flow2_to_1')
        # ignore the loss functions with loss scale of zero
        keys = loss_scale.keys()
        # keys.sort()
        #obtain the symbol of the losses
        for key in keys:
            # loss.append(get_loss(prediction[key] * 20, labels[key], loss_scale[key], name=key + name,get_input=False, is_sparse=is_sparse, type='flow'))
            loss.append(mx.sym.MAERegressionOutput(data=prediction[key] * 20, label=labels[key], name=key + name, grad_scale=loss_scale[key]))
        # print('loss:  ',loss)
    # group the losses; not sure yet why grouping is needed
        loss_group =mx.sym.Group(loss)
        # print('net:  ',loss_group)
        return loss_group,flow
    
    import gluonbook as gb
    import torch
    from utils.frame_utils import *
    import numpy as np
    if __name__ == '__main__':
        checkpoint = torch.load("C:/Users/junjie.huang/PycharmProjects/flownet2_mxnet/flownet2_pytorch/FlowNet2-S_checkpoint.pth.tar")
    # # checkpoint is a dict
        print(isinstance(checkpoint['state_dict'], dict))
    # # print the key names in the checkpoint dict
        print('keys of checkpoint:')
        for i in checkpoint:
            print(i)
        print('')
    # # the pytorch model parameters are stored under the 'state_dict' key
        state_dict = checkpoint['state_dict']
    # # state_dict is also a dict
        print('keys of state_dict:')
        for i in state_dict:
            print(i)
            # print(state_dict[i].size())
        print('')
        # print(state_dict)
    # the dict values are torch tensors
        print(torch.is_tensor(state_dict['conv1.0.weight']))
    # inspect the size of one value
        print(state_dict['conv1.0.weight'].size())
    
        #flownet-mxnet init
        loss_scale={'loss2': 1.00,
                   'loss3': 1.00,
                   'loss4': 1.00,
                   'loss5': 1.00,
                   'loss6': 1.00}
        loss,flow = flownet_s(loss_scale=loss_scale,is_sparse=False)
        print('loss information: ')
        print('loss:',loss)
        print('type:',type(loss))
        print('list_arguments:',loss.list_arguments())
        print('list_outputs:',loss.list_outputs())
        print('list_inputs:',loss.list_inputs())
        print('')
    
        print('flow information: ')
        print('flow:',flow)
        print('type:',type(flow))
        print('list_arguments:',flow.list_arguments())
        print('list_outputs:',flow.list_outputs())
        print('list_inputs:',flow.list_inputs())
        print('')
    name_mxnet = loss.list_arguments()
        print(type(name_mxnet))
        for key in name_mxnet:
            print(key)
    
        name_mxnet.sort()
        for key in name_mxnet:
            print(key)
        print(name_mxnet)
    
        shapes = (1, 3, 384, 512)
        ctx = gb.try_gpu()
        # exe = symbol.simple_bind(ctx=ctx, img1=shapes,img2=shapes)
        exe = flow.simple_bind(ctx=ctx, img1=shapes, img2=shapes)
        print('exe type: ',type(exe))
        print('exe:  ',exe)
        #module
        # mod = mx.mod.Module(flow)
        # print('mod type: ', type(exe))
        # print('mod:  ', exe)
    
        pim1 = read_gen("C:/Users/junjie.huang/PycharmProjects/flownet2_mxnet/data/0000007-img0.ppm")
        pim2 = read_gen("C:/Users/junjie.huang/PycharmProjects/flownet2_mxnet/data/0000007-img1.ppm")
        print(pim1.shape)
    
    '''initialize the mxnet model parameters from the pytorch state_dict'''
        for key in state_dict:
            # print(type(key))
            k_split = key.split('.')
            key_mx = '_'.join(k_split)
            # print(key,key_mx)
            try:
                exe.arg_dict[key_mx][:]=state_dict[key].data
            except:
                print(key,exe.arg_dict[key_mx].shape,state_dict[key].data.shape)
    
        
        exe.arg_dict['img1'][:] = pim1[np.newaxis, :, :, :].transpose(0, 3, 1, 2).data
        exe.arg_dict['img2'][:] = pim2[np.newaxis, :, :, :].transpose(0, 3, 1, 2).data
    
        result = exe.forward()
        print('result:  ',type(result))
        # for tmp in result:
        #     print(type(tmp))
        #     print(tmp.shape)
        # color = flow2color(exe.outputs[0].asnumpy()[0].transpose(1, 2, 0))
        outputs = exe.outputs
        print('output type:  ',type(outputs))
        # for tmp in outputs:
        #     print(type(tmp))
        #     print(tmp.shape)
    
    # from pytorch flownet2
        from visualize import flow2color
        # color = flow2color(exe.outputs[0].asnumpy()[0].transpose(1,2,0))
        flow_color = flow2color(exe.outputs[0].asnumpy()[0].transpose(1, 2, 0))
        print('color type:',type(flow_color))
        import matplotlib.pyplot as plt
    # from pytorch
        from torchvision.transforms import ToPILImage
        TF = ToPILImage()
        images = TF(flow_color)
        images.show()
        # plt.imshow(color)
    

     
