
    Differences between model.state_dict, model.parameters, and model.named_parameters


    In PyTorch a model exposes all of the methods above, and all of them involve the model's parameters, but they differ in what they return. A brief summary (a short sketch follows the reference link below):
    model.state_dict: commonly used for saving a model; it returns a dictionary whose keys are the parameter names and whose values are the corresponding tensors.
    model.parameters: returns only the parameter tensors, as an iterator; you can walk all parameters with for param in model.parameters().
    model.named_parameters: returns each parameter together with its name (which encodes the layer it belongs to), also as an iterator; you can walk all names and parameters with for layer_name, param in model.named_parameters().

    Reference: https://blog.csdn.net/qq_33590958/article/details/103544175
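
    A minimal sketch of the three return types, assuming a toy nn.Linear model (this example is mine, not from the quoted post):

    import torch.nn as nn

    model = nn.Linear(4, 2)

    # state_dict: an OrderedDict of name -> tensor, the thing you usually pass to torch.save
    for name, tensor in model.state_dict().items():
        print(name, tensor.shape)      # 'weight' torch.Size([2, 4]); 'bias' torch.Size([2])

    # parameters: an iterator over the parameter tensors only
    for param in model.parameters():
        print(param.shape)

    # named_parameters: an iterator over (name, parameter) pairs
    for name, param in model.named_parameters():
        print(name, param.shape)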


    Differences and relationships among the PyTorch instance methods model.modules, model.named_modules, model.children, model.named_children, model.parameters, model.named_parameters, and model.state_dict


    Example model:

    import torch 
    import torch.nn as nn 
    
    class Net(nn.Module):
    
        def __init__(self, num_class=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3),
                nn.BatchNorm2d(6),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2, stride=2),
                nn.Conv2d(in_channels=6, out_channels=9, kernel_size=3),
                nn.BatchNorm2d(9),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2, stride=2)
            )
    
            self.classifier = nn.Sequential(
                nn.Linear(9*8*8, 128),
                nn.ReLU(inplace=True),
                nn.Dropout(),
                nn.Linear(128, num_class)
            )
    
        def forward(self, x):
            output = self.features(x)
            output = output.view(output.size()[0], -1)
            output = self.classifier(output)
        
            return output
    
    model = Net()
    

    The network Net is itself an nn.Module subclass. It contains two children, features and classifier, each a Sequential container (also an nn.Module subclass), and each of those in turn contains many layers, all of which are nn.Module subclasses as well, so from the outside in there are three levels.
    Let's look at what each of these instance methods returns:

    In [7]: model.named_modules()                                                                                                       
    Out[7]: <generator object Module.named_modules at 0x7f5db88f3840>
    
    In [8]: model.modules()                                                         
    Out[8]: <generator object Module.modules at 0x7f5db3f53c00>
    
    In [9]: model.children()                                                        
    Out[9]: <generator object Module.children at 0x7f5db3f53408>
    
    In [10]: model.named_children()                                                 
    Out[10]: <generator object Module.named_children at 0x7f5db80305e8>
    
    In [11]: model.parameters()                                                     
    Out[11]: <generator object Module.parameters at 0x7f5db3f534f8>
    
    In [12]: model.named_parameters()                                               
    Out[12]: <generator object Module.named_parameters at 0x7f5d42da7570>
    
    In [13]: model.state_dict()                                                     
    Out[13]: 
    OrderedDict([('features.0.weight', tensor([[[[ 0.1200, -0.1627, -0.0841],
                            [-0.1369, -0.1525,  0.0541],
                            [ 0.1203,  0.0564,  0.0908]],
                          ……
    

    As you can see, except for model.state_dict(), which returns an (ordered) dictionary, the other methods all return generators, i.e., iterables. We use list comprehensions to pull their values out with a for loop for closer inspection:

    In [14]: model_modules = [x for x in model.modules()]          
    In [15]: model_named_modules = [x for x in model.named_modules()]        
    In [16]: model_children = [x for x in model.children()]         
    In [17]: model_named_children = [x for x in model.named_children()]                                                                 
    In [18]: model_parameters = [x for x in model.parameters()]                                                                         
    In [19]: model_named_parameters = [x for x in model.named_parameters()]
    

    1. model.modules()

    model.modules() iterates over all submodules of the model, where "submodule" means any nn.Module subclass. In this example Net itself, features, classifier, and every nn.Xxx layer (the Conv2d, BatchNorm2d, ReLU, MaxPool2d, Linear and Dropout layers) are nn.Module subclasses, so model.modules() visits every one of them. Let's look at the list model_modules:

    In [20]: model_modules                                                                                                               
    Out[20]: 
    [Net(
       (features): Sequential(
         (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
         (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
         (2): ReLU(inplace)
         (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
         (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
         (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
         (6): ReLU(inplace)
         (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
       )
       (classifier): Sequential(
         (0): Linear(in_features=576, out_features=128, bias=True)
         (1): ReLU(inplace)
         (2): Dropout(p=0.5)
         (3): Linear(in_features=128, out_features=10, bias=True)
       )
     ), 
    Sequential(
       (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
       (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
       (2): ReLU(inplace)
       (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
       (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
       (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
       (6): ReLU(inplace)
       (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
     ), 
    Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1)), 
    BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True), 
    ReLU(inplace), 
    MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), 
    Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1)), 
    BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True), 
    ReLU(inplace), 
    MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), 
    Sequential(
       (0): Linear(in_features=576, out_features=128, bias=True)
       (1): ReLU(inplace)
       (2): Dropout(p=0.5)
       (3): Linear(in_features=128, out_features=10, bias=True)
     ), 
    Linear(in_features=576, out_features=128, bias=True), 
    ReLU(inplace), 
    Dropout(p=0.5), 
    Linear(in_features=128, out_features=10, bias=True)]
    
    In [21]: len(model_modules)                                                                                                          
    Out[21]: 15
    

    As you can see, model_modules contains 15 elements: first the entire Net, then the features child and every layer inside it, then the classifier child and every layer inside it. In other words, model.modules() traverses the model's submodules recursively, at every depth.
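
    A typical use of this recursive traversal (a sketch of my own, not from the original text) is selecting layers by type, for example counting the convolutions in the Net defined above:

    # Walk every submodule recursively and count the Conv2d layers
    num_conv = sum(1 for m in model.modules() if isinstance(m, nn.Conv2d))
    print(num_conv)   # 2 for this Net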

    2. model.named_modules()

    As the name suggests, this is model.modules() with names attached: model.named_modules() returns every submodule of the model together with that submodule's name:

    In [28]: len(model_named_modules)                                                                                                    
    Out[28]: 15
    
    In [29]: model_named_modules                                                                                                         
    Out[29]: 
    [('', Net(
        (features): Sequential(
          (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
          (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace)
          (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
          (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
          (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (6): ReLU(inplace)
          (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        )
        (classifier): Sequential(
          (0): Linear(in_features=576, out_features=128, bias=True)
          (1): ReLU(inplace)
          (2): Dropout(p=0.5)
          (3): Linear(in_features=128, out_features=10, bias=True)
        )
      )), 
    ('features', Sequential(
        (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace)
        (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
        (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6): ReLU(inplace)
        (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      )), 
    ('features.0', Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))), 
    ('features.1', BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)), ('features.2', ReLU(inplace)), 
    ('features.3', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)), 
    ('features.4', Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))), 
    ('features.5', BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)), ('features.6', ReLU(inplace)), 
    ('features.7', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)), 
    ('classifier',
      Sequential(
        (0): Linear(in_features=576, out_features=128, bias=True)
        (1): ReLU(inplace)
        (2): Dropout(p=0.5)
        (3): Linear(in_features=128, out_features=10, bias=True)
      )), 
    ('classifier.0', Linear(in_features=576, out_features=128, bias=True)), 
    ('classifier.1', ReLU(inplace)), 
    ('classifier.2', Dropout(p=0.5)), 
    ('classifier.3', Linear(in_features=128, out_features=10, bias=True))]
    

    As you can see, model.named_modules() also yields 15 elements, but each one now carries its own name. Apart from features and classifier, which were named explicitly in the model definition, the other names are generated automatically by PyTorch following a fixed rule (the index within each Sequential, joined with dots). Getting each layer together with its name is useful because you can modify specific layers by name while iterating. For example, if you named every layer when defining the model, say the convolution layers are conv1, conv2, ..., you could do the following:

    for name, layer in model.named_modules():
        if 'conv' in name:
            ...  # process this layer here
    

    Of course, even without the names, the same thing can be done with isinstance():

    for layer in model.modules():
        if isinstance(layer, nn.Conv2d):
            ...  # process this layer here
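
    Because the names follow that dotted rule, another handy pattern (again a sketch of mine, not from the original text) is turning named_modules() into a dict and looking a layer up directly by its name:

    # Map every dotted name to its submodule, then fetch the first convolution
    name_to_module = dict(model.named_modules())
    first_conv = name_to_module['features.0']
    print(first_conv)    # Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))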
    

    3. model.children()

    If we divide the network Net into levels from the outside in, features and classifier are the children of Net, the Conv2d, ReLU, BatchNorm2d and MaxPool2d layers are the children of features, and the Linear, Dropout and ReLU layers are the children of classifier. model.modules() above traverses not only the model's children but also the children's children, i.e., all submodules at every depth.
    model.children(), by contrast, only traverses the model's immediate children, which here are features and classifier (a common use is sketched after the listing below).

    In [22]: len(model_children)                                                                                                         
    Out[22]: 2
    
    In [22]: model_children                                                                                                              
    Out[22]: 
    [Sequential(
       (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
       (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
       (2): ReLU(inplace)
       (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
       (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
       (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
       (6): ReLU(inplace)
       (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
     ), 
    Sequential(
       (0): Linear(in_features=576, out_features=128, bias=True)
       (1): ReLU(inplace)
       (2): Dropout(p=0.5)
       (3): Linear(in_features=128, out_features=10, bias=True)
     )]
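
    A common use of model.children() (a sketch, not taken from the original text) is rebuilding part of a model, for example keeping only the convolutional trunk and dropping the classifier:

    # Keep every child except the last one (the classifier) as a feature extractor
    backbone = nn.Sequential(*list(model.children())[:-1])
    print(backbone)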
    

    4. model.named_children()

    model.named_children() is model.children() with names: compared with model.children(), it iterates over the model's immediate children and also returns each child's name:

    In [23]: len(model_named_children)                                                                                                   
    Out[23]: 2
    
    In [24]: model_named_children                                                                                                        
    Out[24]: 
    [('features', Sequential(
        (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
        (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU(inplace)
        (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
        (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6): ReLU(inplace)
        (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      )), 
    ('classifier', Sequential(
        (0): Linear(in_features=576, out_features=128, bias=True)
        (1): ReLU(inplace)
        (2): Dropout(p=0.5)
        (3): Linear(in_features=128, out_features=10, bias=True)
      ))]
    

    Compared with model.children() above, model.named_children() additionally returns the names of the two children: features and classifier.
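
    One way to use those names (a sketch, not from the original text) is to treat an entire child differently, for example freezing the features trunk while leaving the classifier trainable:

    # Freeze every parameter that lives under the child named 'features'
    for name, child in model.named_children():
        if name == 'features':
            for p in child.parameters():
                p.requires_grad = False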

    5. model.parameters()

    Iterates over all of the model's parameters.
    In Python 3 this returns an iterator, so to materialize it as a list: model_parameters = list(model.parameters()). Two typical uses are sketched after the listing below.

    In [30]: len(model_parameters)                                                                                                       
    Out[30]: 12
    
    In [31]: model_parameters                                                                                                            
    Out[31]: 
    [Parameter containing:
     tensor([[[[ 0.1200, -0.1627, -0.0841],
               [-0.1369, -0.1525,  0.0541],
               [ 0.1203,  0.0564,  0.0908]],
               ……
              [[-0.1587,  0.0735, -0.0066],
               [ 0.0210,  0.0257, -0.0838],
               [-0.1797,  0.0675,  0.1282]]]], requires_grad=True),
     Parameter containing:
     tensor([-0.1251,  0.1673,  0.1241, -0.1876,  0.0683,  0.0346],
            requires_grad=True),
     Parameter containing:
     tensor([0.0072, 0.0272, 0.8620, 0.0633, 0.9411, 0.2971], requires_grad=True),
     Parameter containing:
     tensor([0., 0., 0., 0., 0., 0.], requires_grad=True),
     Parameter containing:
     tensor([[[[ 0.0632, -0.1078, -0.0800],
               [-0.0488,  0.0167,  0.0473],
               [-0.0743,  0.0469, -0.1214]],
               …… 
              [[-0.1067, -0.0851,  0.0498],
               [-0.0695,  0.0380, -0.0289],
               [-0.0700,  0.0969, -0.0557]]]], requires_grad=True),
     Parameter containing:
     tensor([-0.0608,  0.0154,  0.0231,  0.0886, -0.0577,  0.0658, -0.1135, -0.0221,
              0.0991], requires_grad=True),
     Parameter containing:
     tensor([0.2514, 0.1924, 0.9139, 0.8075, 0.6851, 0.4522, 0.5963, 0.8135, 0.4010],
            requires_grad=True),
     Parameter containing:
     tensor([0., 0., 0., 0., 0., 0., 0., 0., 0.], requires_grad=True),
     Parameter containing:
     tensor([[ 0.0223,  0.0079, -0.0332,  ..., -0.0394,  0.0291,  0.0068],
             [ 0.0037, -0.0079,  0.0011,  ..., -0.0277, -0.0273,  0.0009],
             [ 0.0150, -0.0110,  0.0319,  ..., -0.0110, -0.0072, -0.0333],
             ...,
             [-0.0274, -0.0296, -0.0156,  ...,  0.0359, -0.0303, -0.0114],
             [ 0.0222,  0.0243, -0.0115,  ...,  0.0369, -0.0347,  0.0291],
             [ 0.0045,  0.0156,  0.0281,  ..., -0.0348, -0.0370, -0.0152]],
            requires_grad=True),
     Parameter containing:
     tensor([ 0.0072, -0.0399, -0.0138,  0.0062, -0.0099, -0.0006, -0.0142, -0.0337,
              ……
             -0.0370, -0.0121, -0.0348, -0.0200, -0.0285,  0.0367,  0.0050, -0.0166],
            requires_grad=True),
     Parameter containing:
     tensor([[-0.0130,  0.0301,  0.0721,  ..., -0.0634,  0.0325, -0.0830],
             [-0.0086, -0.0374, -0.0281,  ..., -0.0543,  0.0105,  0.0822],
             [-0.0305,  0.0047, -0.0090,  ...,  0.0370, -0.0187,  0.0824],
             ...,
             [ 0.0529, -0.0236,  0.0219,  ...,  0.0250,  0.0620, -0.0446],
             [ 0.0077, -0.0576,  0.0600,  ..., -0.0412, -0.0290,  0.0103],
             [ 0.0375, -0.0147,  0.0622,  ...,  0.0350,  0.0179,  0.0667]],
            requires_grad=True),
     Parameter containing:
     tensor([-0.0709, -0.0675, -0.0492,  0.0694,  0.0390, -0.0861, -0.0427, -0.0638,
             -0.0123,  0.0845], requires_grad=True)]
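
    Two typical uses of model.parameters() (a sketch, not from the original text): handing the iterator to an optimizer, and counting the trainable parameters:

    import torch.optim as optim

    # The most common use: the optimizer takes the parameter iterator directly
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # Count the total number of trainable parameters
    num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(num_params)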
    

    6. model.named_parameters()

    If you have read this far you can guess what this does: it iterates over the parameters together with their names; every parameter name ends in .weight or .bias, so weights and biases are easy to tell apart (a small usage sketch follows the listing below):

    In [32]: len(model_named_parameters)                                                                                                 
    Out[32]: 12
    
    In [33]: model_named_parameters                                                                                                      
    Out[33]: 
    [('features.0.weight', Parameter containing:
      tensor([[[[ 0.1200, -0.1627, -0.0841],
                [-0.1369, -0.1525,  0.0541],
                [ 0.1203,  0.0564,  0.0908]],
               ……
               [[-0.1587,  0.0735, -0.0066],
                [ 0.0210,  0.0257, -0.0838],
                [-0.1797,  0.0675,  0.1282]]]], requires_grad=True)),
     ('features.0.bias', Parameter containing:
      tensor([-0.1251,  0.1673,  0.1241, -0.1876,  0.0683,  0.0346],
             requires_grad=True)),
     ('features.1.weight', Parameter containing:
      tensor([0.0072, 0.0272, 0.8620, 0.0633, 0.9411, 0.2971], requires_grad=True)),
     ('features.1.bias', Parameter containing:
      tensor([0., 0., 0., 0., 0., 0.], requires_grad=True)),
     ('features.4.weight', Parameter containing:
      tensor([[[[ 0.0632, -0.1078, -0.0800],
                [-0.0488,  0.0167,  0.0473],
                [-0.0743,  0.0469, -0.1214]],
               ……
               [[-0.1067, -0.0851,  0.0498],
                [-0.0695,  0.0380, -0.0289],
                [-0.0700,  0.0969, -0.0557]]]], requires_grad=True)),
     ('features.4.bias', Parameter containing:
      tensor([-0.0608,  0.0154,  0.0231,  0.0886, -0.0577,  0.0658, -0.1135, -0.0221,
               0.0991], requires_grad=True)),
     ('features.5.weight', Parameter containing:
      tensor([0.2514, 0.1924, 0.9139, 0.8075, 0.6851, 0.4522, 0.5963, 0.8135, 0.4010],
             requires_grad=True)),
     ('features.5.bias', Parameter containing:
      tensor([0., 0., 0., 0., 0., 0., 0., 0., 0.], requires_grad=True)),
     ('classifier.0.weight', Parameter containing:
      tensor([[ 0.0223,  0.0079, -0.0332,  ..., -0.0394,  0.0291,  0.0068],
              ……
              [ 0.0045,  0.0156,  0.0281,  ..., -0.0348, -0.0370, -0.0152]],
             requires_grad=True)),
     ('classifier.0.bias', Parameter containing:
      tensor([ 0.0072, -0.0399, -0.0138,  0.0062, -0.0099, -0.0006, -0.0142, -0.0337,
               ……
              -0.0370, -0.0121, -0.0348, -0.0200, -0.0285,  0.0367,  0.0050, -0.0166],
             requires_grad=True)),
     ('classifier.3.weight', Parameter containing:
      tensor([[-0.0130,  0.0301,  0.0721,  ..., -0.0634,  0.0325, -0.0830],
              [-0.0086, -0.0374, -0.0281,  ..., -0.0543,  0.0105,  0.0822],
              [-0.0305,  0.0047, -0.0090,  ...,  0.0370, -0.0187,  0.0824],
              ...,
              [ 0.0529, -0.0236,  0.0219,  ...,  0.0250,  0.0620, -0.0446],
              [ 0.0077, -0.0576,  0.0600,  ..., -0.0412, -0.0290,  0.0103],
              [ 0.0375, -0.0147,  0.0622,  ...,  0.0350,  0.0179,  0.0667]],
             requires_grad=True)),
     ('classifier.3.bias', Parameter containing:
      tensor([-0.0709, -0.0675, -0.0492,  0.0694,  0.0390, -0.0861, -0.0427, -0.0638,
              -0.0123,  0.0845], requires_grad=True))]
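
    Having the names makes it easy to treat groups of parameters differently. For example (a sketch of mine, not from the original text), applying weight decay to weights but not to biases via optimizer parameter groups:

    import torch.optim as optim

    # Split parameters into decayed and non-decayed groups by name suffix
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        (no_decay if name.endswith('.bias') else decay).append(param)

    optimizer = optim.SGD(
        [{'params': decay, 'weight_decay': 1e-4},
         {'params': no_decay, 'weight_decay': 0.0}],
        lr=0.01,
    )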
    

    7. model.state_dict()

    model.state_dict() directly returns the model's dictionary. Unlike the previous methods there is nothing to iterate lazily; it already is an (ordered) dictionary, so you can modify the parameters of any layer directly through the state_dict, which is especially convenient for parameter pruning. The state_dict method is covered in more detail in my article on saving and loading PyTorch models.
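
    Note that the state_dict also contains non-trainable buffers such as BatchNorm's running_mean and running_var, which never appear in model.parameters(). A short sketch (my own; the file name is just an example) of the usual save/load round trip and of editing an entry directly:

    # Save and reload all parameters and buffers via the state_dict
    torch.save(model.state_dict(), 'net.pth')
    model.load_state_dict(torch.load('net.pth'))

    # Entries can also be edited in place, e.g. zeroing out one convolution kernel
    sd = model.state_dict()
    sd['features.0.weight'][0].zero_()
    model.load_state_dict(sd)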


    1. model.named_parameters(): iterating over model.named_parameters() yields, on each step, the element's name and the param itself

    for name, param in model.named_parameters():
    	print(name,param.requires_grad)
    	param.requires_grad=False
    

    2. model.parameters(): iterating over model.parameters() yields each param but not its name; that is the difference from named_parameters. Both can be used to change the requires_grad attribute

    for param in model.parameters():
    	print(param.requires_grad)
    	param.requires_grad=False
    

    3. model.state_dict().items(): iterating over this also yields every name and param, but all of the params obtained this way have requires_grad=False and there is no way to change that attribute through them, so changing requires_grad can only be done via the two methods above.

    for name, param in model.state_dict().items():
        print(name, param.requires_grad)  # always False here
    

    4. After changing requires_grad, the optimizer should be constructed (or rebuilt) accordingly

    optimizer = optim.SGD(
        filter(lambda p: p.requires_grad, model.parameters()),   # only update parameters with requires_grad=True
        lr=cfg.TRAIN.LR,
        momentum=cfg.TRAIN.MOMENTUM,
        weight_decay=cfg.TRAIN.WD,
        nesterov=cfg.TRAIN.NESTEROV
    )
    

    5. Random parameter initialization: model.apply(fn) applies fn recursively to every submodule, which is why the isinstance check is used to pick out only the Conv2d layers

    def init_weights(m):
        if isinstance(m, nn.Conv2d):
            torch.nn.init.xavier_uniform_(m.weight)  # in-place Xavier init (xavier_uniform without the underscore is deprecated)
    model.apply(init_weights)
    
    The model.parameters() function in PyTorch

    model = ConvNet()  # create a model instance (ConvNet defined elsewhere)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)  # model.parameters() supplies the optimizable parameters; lr sets the learning rate
    for name, param in model.named_parameters():  # list which parameters will be optimized
      if param.requires_grad:
        print(name)