  • autograd

    2021-01-26 21:43:24
    I. torch.autograd

    1. torch.autograd.backward

    torch.autograd.backward(tensors,
                            grad_tensors=None,
                            retain_graph=None,
                            create_graph=False)
    
    
    Purpose: compute gradients automatically.
    
    tensors: the tensor(s) to differentiate, e.g. loss; torch.autograd.backward(a) is equivalent to a.backward()
    
    retain_graph: keep the computation graph after the backward pass
    
    create_graph: build a graph of the derivative computation itself, used for higher-order derivatives
    
    grad_tensors: weights for multiple gradients (the "vector" supplied when the tensors being differentiated are not scalars)
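
    A minimal sketch of how these arguments fit together (my own example, not from the original post), using grad_tensors to weight two outputs that are backpropagated in one call:

    import torch

    w = torch.tensor([1.0], requires_grad=True)
    x = torch.tensor([2.0])

    y0 = w * x       # dy0/dw = 2
    y1 = w ** 2      # dy1/dw = 2w = 2

    # grad_tensors supplies one weight per output:
    # w.grad = 1.0 * dy0/dw + 0.5 * dy1/dw = 2 + 1 = 3
    torch.autograd.backward(
        tensors=[y0, y1],
        grad_tensors=[torch.tensor([1.0]), torch.tensor([0.5])],
    )
    print(w.grad)    # tensor([3.])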

    2. torch.autograd.grad

    torch.autograd.grad(outputs,
                        inputs,
                        grad_outputs=None,
                        retain_graph=None,
                        create_graph=False)
    
    
    Purpose: compute gradients.
    
    outputs: the tensor(s) to differentiate, e.g. loss
    
    inputs: the tensor(s) whose gradients are required
    
    create_graph: build a graph of the derivative computation itself, used for higher-order derivatives
    
    retain_graph: keep the computation graph after the backward pass
    
    grad_outputs: weights for multiple gradients (same role as grad_tensors above)
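
    A minimal sketch (my own example, not from the original post) showing create_graph in action: the first-order gradient is returned as a graph node, so it can be differentiated again to obtain a second derivative.

    import torch

    x = torch.tensor([3.0], requires_grad=True)
    y = x ** 2                                                # y = x^2

    (dy_dx,) = torch.autograd.grad(y, x, create_graph=True)  # dy/dx = 2x = 6
    (d2y_dx2,) = torch.autograd.grad(dy_dx, x)                # d2y/dx2 = 2
    print(dy_dx, d2y_dx2)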
    
    

     

  • Autograd

    2018-08-27 17:28:02

    Autograd

    Autograd is now a core torch package for automatic differentiation. It uses a tape based system for automatic differentiation.

    In the forward phase, the autograd tape will remember all the operations it executed, and in the backward phase, it will replay the operations.

    Tensors that track history

    In autograd, if any input Tensor of an operation has requires_grad=True, the computation will be tracked. After computing the backward pass, a gradient w.r.t. this tensor is accumulated into its .grad attribute.

    There's one more class which is very important for the autograd implementation - a Function. Tensor and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each tensor has a .grad_fn attribute that references the Function that created it (except for Tensors created by the user - these have None as their .grad_fn).

    If you want to compute the derivatives, you can call .backward() on a Tensor. If the Tensor is a scalar (i.e. it holds a single element), you don't need to specify any arguments to backward(); if it has more elements, you need to specify a gradient argument that is a tensor of matching shape.

    import torch
    

    Create a tensor and set requires_grad=True to track computation with it

    x = torch.ones(2, 2, requires_grad=True)
    print(x)
    

    Out:

    tensor([[1., 1.],
            [1., 1.]], requires_grad=True)
    
    print(x.data)
    

    Out:

    tensor([[1., 1.],
            [1., 1.]])
    
    print(x.grad)
    

    Out:

    None
    
    print(x.grad_fn)  # we've created x ourselves
    

    Out:

    None
    

    Do an operation on x:

    y = x + 2
    print(y)
    

    Out:

    tensor([[3., 3.],
            [3., 3.]], grad_fn=<AddBackward>)
    

    y was created as a result of an operation, so it has a grad_fn

    print(y.grad_fn)
    

    Out:

    <AddBackward object at 0x7f23f4054eb8>
    

    More operations on y:

    z = y * y * 3
    out = z.mean()
    
    print(z, out)
    

    Out:

    tensor([[27., 27.],
            [27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)
    

    .requires_grad_( ... ) changes an existing Tensor’s requires_grad flag in-place. The input flag defaults to True if not given.

    a = torch.randn(2, 2)
    a = ((a * 3) / (a - 1))
    print(a.requires_grad)
    a.requires_grad_(True)
    print(a.requires_grad)
    b = (a * a).sum()
    print(b.grad_fn)
    

    Out:

    False
    True
    <SumBackward0 object at 0x7f23f76eb9e8>
    

    Gradients

    let’s backprop now and print gradients d(out)/dx

    out.backward()
    print(x.grad)
    

    Out:

    tensor([[4.5000, 4.5000],
            [4.5000, 4.5000]])
    

    By default, gradient computation flushes all the internal buffers contained in the graph, so if you want to do the backward on some part of the graph twice, you need to pass retain_graph=True during the first pass.

    x = torch.ones(2, 2, requires_grad=True)
    y = x + 2
    y.backward(torch.ones(2, 2), retain_graph=True)
    # the retain_graph flag prevents the internal buffers from being freed
    print(x.grad)
    

    Out:

    tensor([[1., 1.],
            [1., 1.]])
    
    z = y * y
    print(z)
    

    Out:

    tensor([[9., 9.],
            [9., 9.]], grad_fn=<ThMulBackward>)
    

    just backprop random gradients

    gradient = torch.randn(2, 2)
    
    # this would fail if we hadn't passed
    # retain_graph=True in the first backward call
    y.backward(gradient)
    
    print(x.grad)
    

    Out:

    tensor([[ 0.8181,  1.6773],
            [ 1.6309, -0.5167]])
    

    You can also stop autograd from tracking history on Tensors with requires_grad=True by wrapping the code block in with torch.no_grad():

    print(x.requires_grad)
    print((x ** 2).requires_grad)
    
    with torch.no_grad():
        print((x ** 2).requires_grad)
    

    Out:

    True
    True
    False

     


     

  • AutoGrad.jl is an automatic differentiation package for Julia. It is a port of the popular Python package and forms the foundation of a Julia deep learning framework. By tracking primitive operations and using this execution trace to compute gradients, AutoGrad can differentiate regular Julia code, including loops, conditionals, helper ...
  • Autograd notes

    Notes link: Autograd
    The notes are as follows.
    3.2 Autograd
    torch.autograd is an automatic differentiation engine developed to spare users from computing gradients by hand: it builds the computation graph automatically from the inputs and the forward pass, and then runs backpropagation.
    3.2.1 Variable
    from __future__ import print_function
    import torch as t
    from torch.autograd import Variable as V
    """--------------------"""

    Create a Variable from a tensor; requires_grad must be specified for differentiation

    a = V(t.ones(3, 4), requires_grad=True)
    a
    """--------------------"""
    b = V(t.zeros(3, 4))
    b
    """--------------------"""

    Variables support the same functions as Tensors

    c = a + b

    c = a.add(b)  # equivalent to the line above
    c
    """--------------------"""
    d = c.sum()
    d.backward()  # backpropagation
    """--------------------"""

    Note the difference between the two calls below:

    the former takes .data first, turning the Variable into a tensor, and summing the tensor yields a plain float;

    the latter sums the Variable directly, so the result is still a Variable

    c.data.sum(), c.sum()
    """--------------------"""
    a.grad
    """--------------------"""

    Although c was never explicitly marked as requiring gradients, c depends on a, and a requires gradients,

    so c's requires_grad attribute is automatically set to True

    a.requires_grad, b.requires_grad, c.requires_grad, d.requires_grad
    """--------------------"""

    Variables created by the user are leaf nodes, and their grad_fn is None

    a.is_leaf, b.is_leaf, c.is_leaf, d.is_leaf
    """--------------------"""
    # c.grad is None: c is not a leaf node, and its gradient is only used to compute a's gradient
    # even though c.requires_grad = True, its gradient is freed as soon as it has been computed
    c.grad is None
    The difference between manual differentiation and Autograd
    from __future__ import print_function
    import torch as t
    from torch.autograd import Variable as V

    def f(x):
        """compute y"""
        y = x ** 2 * t.exp(x)
        return y

    def gradf(x):
        """manual derivative"""
        dx = x * 2 * t.exp(x) + x ** 2 * t.exp(x)
        return dx
    """--------------------"""
    x = V(t.randn(3, 4), requires_grad=True)
    y = f(x)
    y
    """--------------------"""
    y.backward(t.ones(y.size()))  # the gradient argument must have the same shape as y
    x.grad
    “”"--------------------"""

    autograd 计算的公式与手动计算的结果一样

    gradf(x)
    autograd的细节:
    from __future__ import print_function
    import torch as t
    from torch.autograd import Variable as V

    x = V(t.ones(1))
    b = V(t.rand(1), requires_grad=True)
    w = V(t.rand(1), requires_grad=True)
    y = w * x  # equivalent to y = w.mul(x)
    z = y + b  # equivalent to z = y.add(b)
    """--------------------"""
    x.requires_grad, b.requires_grad, w.requires_grad
    """--------------------"""

    Although y.requires_grad was never set explicitly, y depends on w, which requires gradients,

    so y.requires_grad is True

    y.requires_grad
    """--------------------"""
    x.is_leaf, w.is_leaf, b.is_leaf, y.is_leaf, z.is_leaf
    “”"--------------------"""

    next_functions保存grad_fn的输入,是一个tuple,tuple的元素也是Function

    第一个是y,它是乘法(mul)的输出,所以对应的反向传播函数y.grad_fn是MulBackward

    第二个是b,它是叶子节点,由用户创建,grad_fn为None,但是有

    z.grad_fn.next_functions
    “”"--------------------"""

    a variable's grad_fn corresponds to the function node in the computation graph

    z.grad_fn.next_functions[0][0] == y.grad_fn
    """--------------------"""

    the grad_fn of a leaf node is None

    w.grad_fn, x.grad_fn
    Computing the gradient of w requires the value of x ($\frac{\partial y}{\partial w} = x$). These values are saved as buffers during the forward pass and are freed automatically once the gradients have been computed. To backpropagate more than once, retain_graph must be specified so that these buffers are kept.
    from __future__ import print_function
    import torch as t
    from torch.autograd import Variable as V

    use retain_graph to keep the buffers

    z.backward(retain_graph=True)
    w.grad
    """--------------------"""

    backpropagating several times accumulates the gradients; this is exactly what the AccumulateGrad node attached to w means

    z.backward()
    w.grad
    PyTorch uses dynamic graphs: the computation graph is rebuilt from scratch on every forward pass, so Python control flow (for, if, and so on) can be used to build the graph as needed. This is very useful in natural language processing; it means you do not have to construct every possible graph path in advance, because the graph is only built at run time.
    from __future__ import print_function
    import torch as t
    from torch.autograd import Variable as V

    def abs(x):
        if x.data[0] > 0:
            return x
        else:
            return -x

    x = t.ones(1, requires_grad=True)
    y = abs(x)
    y.backward()
    x.grad
    """--------------------"""
    x = -1 * t.ones(1)
    x = x.requires_grad_()
    y = abs(x)
    y.backward()
    x.grad
    """--------------------"""
    def f(x):
        result = 1
        for ii in x:
            if ii.item() > 0:
                result = ii * result
        return result

    x = t.arange(-2.0, 4.0, requires_grad=True)

    y = f(x)  # y = x[3]*x[4]*x[5]
    y.backward()
    x.grad
    """--------------------"""
    Sometimes we do not want autograd to differentiate through a tensor. Because differentiation has to cache many intermediate results, which adds extra memory/GPU-memory overhead, we can turn automatic differentiation off. In scenarios that need no backward pass (such as inference), disabling autograd gives a modest speedup and saves roughly half the GPU memory, since no space needs to be allocated for gradients.
    x = t.ones(1, requires_grad=True)
    w = t.rand(1, requires_grad=True)
    y = x * w

    y depends on w, and w.requires_grad = True

    x.requires_grad, w.requires_grad, y.requires_grad
    """--------------------"""
    with t.no_grad():
        x = t.ones(1)
        w = t.rand(1, requires_grad=True)
        y = x * w

    y depends on w and x; although w.requires_grad = True, y.requires_grad is still False

    x.requires_grad, w.requires_grad, y.requires_grad
    “”"--------------------"""

    等价于t.no_grad()

    t.set_grad_enabled(False)
    x = t.ones(1)
    w = t.rand(1, requires_grad = True)
    y = x * w

    y依赖于w和x,虽然w.requires_grad = True,但是y的requires_grad依旧为False

    x.requires_grad, w.requires_grad, y.requires_grad

    During backpropagation, the gradients of non-leaf nodes are freed as soon as they have been computed. To inspect the gradients of these variables there are two approaches:
    using the autograd.grad function
    using a hook
    Both autograd.grad and hooks are powerful tools; see the official API documentation for detailed usage, the examples here only show the basics. The hook approach is recommended, but in practice you should avoid modifying the value of grad.
    x = t.ones(3, requires_grad=True)
    w = t.rand(3, requires_grad=True)
    y = x * w

    y depends on w, and w.requires_grad = True

    z = y.sum()
    x.requires_grad, w.requires_grad, y.requires_grad
    """--------------------"""

    the grad of a non-leaf node is cleared automatically once computed, so y.grad is None

    z.backward()
    (x.grad, w.grad, y.grad)
    “”"--------------------"""

    第一种方法:使用grad获取中间变量的梯度

    x = t.ones(3, requires_grad=True)
    w = t.rand(3, requires_grad=True)
    y = x * w
    z = y.sum()

    z对y的梯度,隐式调用backward()

    t.autograd.grad(z, y)
    “”"--------------------"""

    second approach: use a hook

    a hook is a function that takes the gradient as input and should not return anything

    def variable_hook(grad):
        print('gradient of y:', grad)

    x = t.ones(3, requires_grad=True)
    w = t.rand(3, requires_grad=True)
    y = x * w

    register the hook

    hook_handle = y.register_hook(variable_hook)
    z = y.sum()
    z.backward()

    unless you need the hook every time, remember to remove it after use

    hook_handle.remove()
    """--------------------"""

    The meaning of a variable's grad attribute and of the grad_variables argument of backward
    x = t.arange(0.0, 3.0, requires_grad=True)
    y = x**2 + x*2
    z = y.sum()
    z.backward()  # backpropagate starting from z
    x.grad
    """--------------------"""
    x = t.arange(0.0, 3.0, requires_grad=True)
    y = x**2 + x*2
    z = y.sum()
    y_gradient = t.Tensor([1, 1, 1])  # dz/dy
    y.backward(y_gradient)  # backpropagate starting from y
    x.grad
    Linear regression with Variable
    import torch as t
    %matplotlib inline
    from matplotlib import pyplot as plt
    from IPython import display
    import numpy as np
    """--------------------"""

    set the random seed so the output below is the same on different machines

    t.manual_seed(1000)
    def get_fake_data(batch_size=8):
        '''generate random data: y = x*2 + 3, plus some noise'''
        x = t.rand(batch_size, 1) * 5
        y = x * 2 + 3 + t.randn(batch_size, 1)
        return x, y
    """--------------------"""

    take a look at the generated x-y distribution

    x, y = get_fake_data()
    plt.scatter(x.squeeze().numpy(), y.squeeze().numpy())
    """--------------------"""

    randomly initialize the parameters

    w = t.rand(1, 1, requires_grad=True)
    b = t.zeros(1, 1, requires_grad=True)
    losses = np.zeros(500)

    lr = 0.005  # learning rate

    for ii in range(500):
        x, y = get_fake_data(batch_size=32)

        # forward: compute the loss
        y_pred = x.mm(w) + b.expand_as(y)
        loss = 0.5 * (y_pred - y) ** 2
        loss = loss.sum()
        losses[ii] = loss.item()

        # backward: compute the gradients
        loss.backward()

        # update the parameters
        w.data.sub_(lr * w.grad.data)
        b.data.sub_(lr * b.grad.data)

        # zero the gradients
        w.grad.data.zero_()
        b.grad.data.zero_()

        if ii % 50 == 0:
            # plot
            display.clear_output(wait=True)
            x = t.arange(0, 6).view(-1, 1)
            x = t.tensor(x, dtype=t.float32)  # mind the dtype
            y = x.mm(w.data) + b.data.expand_as(x)
            plt.plot(x.numpy(), y.numpy())  # predicted

            x2, y2 = get_fake_data(batch_size=20)
            plt.scatter(x2.numpy(), y2.numpy())  # true data

            plt.xlim(0, 5)
            plt.ylim(0, 13)
            plt.show()
            plt.pause(0.5)


    print(w.item(), b.item())
    “”"--------------------"""
    plt.plot(losses)
    plt.ylim(5,50)

  • pytorch autograd

    2020-03-23 21:28:56

    pytorch autograd: the automatic differentiation mechanism

    Internally, autograd represents this graph as a graph of Function objects (really expressions), which can be apply() ed to compute the result of evaluating the graph. When computing the forwards pass, autograd simultaneously performs the requested computations and builds up a graph representing the function that computes the gradient (the .grad_fn attribute of each torch.Tensor is an entry point into this graph). When the forwards pass is completed, we evaluate this graph in the backwards pass to compute the gradients.

    An important thing to note is that the graph is recreated from scratch at every iteration, and this is exactly what allows for using arbitrary Python control flow statements, that can change the overall shape and size of the graph at every iteration. You don’t have to encode all possible paths before you launch the training - what you run is what you differentiate.

    Every operation performed on Tensor s creates a new function object, that performs the computation, and records that it happened. The history is retained in the form of a DAG of functions, with edges denoting data dependencies (input <- output). Then, when backward is called, the graph is processed in the topological ordering, by calling backward() methods of each Function object, and passing returned gradients on to next Function s.

    Adding operations to autograd requires implementing a new Function subclass for each operation. Recall that Function s are what autograd uses to compute the results and gradients, and encode the operation history. Every new function requires you to implement 2 methods:
    forward()
    backward()
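
    A minimal sketch of this Function-subclass pattern (my own example, not from the original post): forward() saves what it needs and returns the result, and backward() applies the chain rule to grad_output.

    import torch

    class Square(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)        # stash inputs needed by backward()
            return x * x

        @staticmethod
        def backward(ctx, grad_output):
            (x,) = ctx.saved_tensors
            return grad_output * 2 * x      # chain rule: dL/dx = dL/dy * 2x

    x = torch.tensor([3.0], requires_grad=True)
    y = Square.apply(x)
    y.backward()
    print(x.grad)                           # tensor([6.])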

  • autograd_learning.ipynb

    2021-01-23 20:20:25
    PyTorch autograd learning code - a linear regression example
  • PyTorch Autograd

    2020-05-20 11:06:21
    This is where PyTorch's autograd comes in. It abstracts away the complicated mathematics and helps us "magically" compute gradients of high-dimensional curves with only a few lines of code. This post tries to describe the magic of autograd. PyTorch Basics: before moving on, we need to understand some basic ...
  • My attempts at implementing the AutoGrad algorithm in several programming languages
  • PyTorch's Autograd

    2019-06-15 22:16:21
    As a deep learning platform, what makes PyTorch stronger than a scientific computing library like NumPy for deep learning tasks? I think one reason is that PyTorch provides automatic differentiation... This shows that automatic differentiation (autograd) is an essential component of PyTorch, and indeed of most deep learning frameworks.
  • Deep Learning autograd.pdf

    2020-07-09 13:10:34
    Deep Learning autograd.pdf
  • Autograd automatic differentiation

    2019-10-12 13:06:14
    Deep learning algorithms essentially compute derivatives via backpropagation, and PyTorch's Autograd module implements this. Autograd can automatically differentiate every operation on tensors, avoiding manual derivative calculations. Within it, autograd.Variable is the core of Autograd ...
  • autograd-for-dummy A minimal autograd engine and neural network library written from scratch. Heavily inspired by Andrej Karpathy's, but with plenty of comments along the way explaining the math, the concepts, and so on. Autograd engine: the autograd.engine module implements a scalar-valued ...
  • The PyTorch autograd mechanism

    2021-02-27 09:43:25
    Overview of the autograd mechanism / code implementation / manually defined differentiation / forward computation / backpropagation / linear regression (imports, constructing x and y, building the model parameters & loss function, training the model, full code). Overview: the most impressive thing PyTorch does is compute the entire backward pass for us. Code implementation ...
  • Pytorch Autograd

    2018-04-13 20:42:40
    Getting started with PyTorch (2) - Autograd. PyTorch can backpropagate automatically: during the forward pass it records the topological order of every operation, and the backward pass can then be carried out automatically. Variable: Variable is a class, a wrapper around a tensor. It has three ...
  • Automatic gradients with autograd

    2021-07-28 16:06:37
    from torch.autograd import Variable batch_n = 100 input_data = 1000 hidden_layer = 100 output_data = 10 To use automatic gradients, the Variable class from the torch.autograd package is used to wrap the Tensor variables we have defined, x = ...
  • PyTorch's autograd

    2021-03-18 14:13:31
    Introduction to AutoGrad: conceptually, autograd records a graph of all the operations that create data as you execute them, giving you a directed acyclic graph whose leaves are the input tensors and whose roots are the output tensors. By tracing this graph from roots to leaves, you can automatically compute gradients using the chain rule. ...
  • Pytorch_Autograd

    2020-04-10 14:58:00
    Autograd: the automatic differentiation mechanism. The core of PyTorch is the autograd package, which provides automatic differentiation for all operations on tensors. It is a define-by-run framework, meaning that how backpropagation runs is determined by your code and can differ at every iteration. Tensor: torch. ...
  • Torch Autograd in detail

    2019-03-30 20:45:17
    1. autograd automatic differentiation. Suppose we take a vector x = (1, 1) as input and obtain an output variable y through a series of operations, as shown in the figure: the vector x is multiplied by 4 and by itself to give a vector z, and the length of z is then taken to give y. When we want the derivative of y with respect to x, PyTorch will ...
  • autograd automatic differentiation

    2021-01-02 22:11:18
    The autograd package is at the heart of all neural networks in PyTorch. It provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means your backward pass is defined by the way your code runs, and every iteration can be different. torch. ...
  • AutoGrad web application, still in progress
  • Pytorch - Autograd

    2021-04-28 12:52:55
    Table of contents: Auto-grad, Concurrency Training on CPU, Custom Layers and Backward, Transfer ... Internally, autograd represents this graph as a graph of Function objects, which can be apply()-ed to compute the result of
  • Autograd automatic differentiation

    2019-09-02 22:56:33
    Autograd: the automatic differentiation mechanism. The autograd package is at the core of all neural networks in PyTorch. We first give a brief introduction to this package and then train our first simple neural network. The autograd package provides automatic differentiation for all operations on tensors. It is a define-by-run framework, which ...
  • Link: Autograd - AUTOMATIC DIFFERENTIATION WITH TORCH.AUTOGRAD. When training neural networks, the most frequently used algorithm is back propagation. In thi ...
  • Getting started with PyTorch: Autograd

    2021-01-07 20:49:04
    1. What is Autograd? Autograd is at the core of all neural networks in PyTorch; it provides a way to compute tensor gradients automatically. With Autograd, when building a neural network we only need to define the forward pass, and PyTorch automatically generates the backward ...
  • Custom autograd Function

    2020-11-24 15:31:35
    In the TSN code, segmentconsensus is a custom function, so its corresponding gradient has to be written by hand (a simplified sketch follows below) ... class SegmentConsensus(torch.autograd.Function): @staticmethod def forward(ctx, input_tensor, consensus_type, dim): ctx. ...
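
    A hedged sketch of the pattern such a custom Function follows (my own simplified stand-in, not the actual TSN code): forward() accepts extra non-tensor arguments, and backward() must return one value per forward input, with None for the non-differentiable ones.

    import torch

    class MeanConsensus(torch.autograd.Function):
        # simplified stand-in for a segment-consensus op: average over `dim`
        @staticmethod
        def forward(ctx, input_tensor, dim):
            ctx.dim = dim
            ctx.shape = input_tensor.shape
            return input_tensor.mean(dim=dim, keepdim=True)

        @staticmethod
        def backward(ctx, grad_output):
            # spread the incoming gradient evenly over the averaged dimension;
            # return None for the non-tensor argument `dim`
            grad_input = grad_output.expand(ctx.shape) / ctx.shape[ctx.dim]
            return grad_input, None

    x = torch.randn(2, 3, requires_grad=True)
    y = MeanConsensus.apply(x, 1)
    y.sum().backward()
    print(x.grad)        # every entry is 1/3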
  • autograd in PyTorch

    2019-10-16 12:45:24
    autograd in PyTorch: link: autograd. The core is the chain rule, used for reverse-mode automatic differentiation: with y = f(x) and l = g(y), if y is a non-scalar tensor, the full Jacobian of y with respect to x cannot be computed directly, but if only the vector-Jacobian ... (see the sketch below)
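
    A minimal sketch of the vector-Jacobian product that snippet refers to (my own example, not from the linked post): for a non-scalar y, backward() takes a vector v and computes v^T J rather than forming the full Jacobian.

    import torch

    x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    y = x * 2                      # Jacobian of y w.r.t. x is diag(2, 2, 2)

    v = torch.tensor([0.1, 1.0, 10.0])
    y.backward(v)                  # computes v^T J without building J
    print(x.grad)                  # tensor([ 0.2000,  2.0000, 20.0000])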
