Computational performance (see https://en.wikipedia.org/wiki/FLOPS):

FLOPS (floating-point operations per second) and MIPS (million instructions per second) are units of measure for the numerical computing performance of a computer. Floating-point operations are typically used in fields such as scientific computational research. The unit MIPS measures the integer performance of a computer. Examples of integer operations include data movement (A to B) or value testing (if A = B, then C). MIPS is an adequate performance benchmark when a computer is used for database queries, word processing, spreadsheets, or running multiple virtual operating systems.[3][4] Frank H. McMahon, of the Lawrence Livermore National Laboratory, invented the terms FLOPS and MFLOPS (megaFLOPS) so that he could compare the supercomputers of the day by the number of floating-point calculations they performed per second. This was much better than using the prevalent MIPS to compare computers, as that statistic usually had little bearing on the arithmetic capability of the machine. FLOPS on an HPC system can be calculated with:

FLOPS = racks × (nodes / rack) × (sockets / node) × (cores / socket) × (cycles / second) × (FLOPs / cycle)

This can be simplified to the most common case, a computer with exactly one CPU:

FLOPS = cores × (cycles / second) × (FLOPs / cycle)

FLOPS can be recorded in different measures of precision. For example, the TOP500 supercomputer list ranks computers by 64-bit (double-precision floating-point format) operations per second, abbreviated FP64.[6] Similar measures are available for 32-bit (FP32, single precision) and 16-bit (FP16, half precision) operations.
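As a numerical sketch of the single-CPU simplification above (the core count and per-cycle throughput below are hypothetical, not measurements of any particular chip):

```python
def peak_flops(cores, cycles_per_second, flops_per_cycle):
    """Theoretical peak FLOPS = cores x clock rate x FLOPs retired per cycle."""
    return cores * cycles_per_second * flops_per_cycle

# Hypothetical 8-core 3 GHz CPU retiring 16 FP64 FLOPs per cycle (wide SIMD + FMA):
print(peak_flops(8, 3.0e9, 16))  # 384000000000.0, i.e. 384 GFLOPS
```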

Note the distinction from FLOPs (the number of floating-point operations, used to measure algorithm complexity).
vily_lei 2020-08-21 12:45:40
## Introduction to FLOPs in Deep Learning and How to Calculate Them (Not to Be Confused with FLOPS)

FLOPS vs. FLOPs
FLOPS: all uppercase; short for floating point operations per second, the number of floating-point operations performed per second, understood as computing speed. It is a metric of hardware performance.
FLOPs: lowercase s; short for floating point operations (the s marks the plural), the total number of floating-point operations, understood as computational workload. It can be used to measure the complexity of an algorithm/model.
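The two quantities combine naturally: dividing a model's FLOPs (workload) by the hardware's FLOPS (speed) gives an idealized lower bound on inference time. The device throughput below is an illustrative assumption:

```python
# Dividing a model's FLOPs (workload) by hardware FLOPS (speed) gives an
# idealized lower bound on inference time; real kernels never reach peak.
model_flops = 15.5e9       # e.g. a VGG-16 forward pass, ~15.5 GFLOPs
hardware_flops = 10.0e12   # hypothetical 10 TFLOPS (FP32) accelerator

ideal_seconds = model_flops / hardware_flops
print(f"{ideal_seconds * 1e3:.2f} ms")  # 1.55 ms
```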
Calculating FLOPs in a fully connected network
Derivation
Take 4 input neurons and 3 output neurons as an example. One output neuron is computed as

y_1 = w_{11}*x_1 + w_{21}*x_2 + w_{31}*x_3 + w_{41}*x_4

This requires 4 multiplications and 3 additions, i.e. 4 + 3 = 7 operations in total. Generalizing to I input neurons and O output neurons, computing one output neuron requires

I + (I-1) = 2I - 1

operations, so the total number of operations is

FLOPs = (2I-1)*O

Taking the bias into account:

y_1 = w_{11}*x_1 + w_{21}*x_2 + w_{31}*x_3 + w_{41}*x_4 + b_1

and the total number of operations is

FLOPs = 2I*O

Result
The FLOPs of an FC (fully connected) layer are given by (keep the -1 without bias; drop it with bias):

FLOPs = (2I - 1) * O

where:
I = number of input neurons, O = number of output neurons
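The FC formulas above can be checked with a small helper (fc_flops is a name introduced here for illustration):

```python
def fc_flops(i, o, bias=False):
    """FLOPs of a fully connected layer: (2I - 1)*O without bias, 2I*O with bias."""
    return 2 * i * o if bias else (2 * i - 1) * o

# The 4-input, 3-output example above: 4 multiplications + 3 additions per output.
print(fc_flops(4, 3))             # 21
print(fc_flops(4, 3, bias=True))  # 24
```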
Calculating FLOPs in a CNN

The following does not count the operations of the activation function.

Derivation

For input channel count C_{in}, kernel size K, output channel count C_{out}, and output feature map size H * W, a single convolution (one output value) requires
C_{in}K^2 multiplications and C_{in}K^2 - 1 additions, i.e. C_{in}K^2 + C_{in}K^2 - 1 = 2C_{in}K^2 - 1 operations in total; if bias is considered, add one more. Producing the feature map of one channel requires H * W such convolutions, and C_{out} feature maps must be produced in total.
Therefore, for one convolutional layer in a CNN, the total operation count is (keep the -1 without bias; drop it with bias):

FLOPs = (2C_{in}K^2 - 1) * H * W * C_{out}

Result
The FLOPs of a convolutional layer are given by (keep the -1 without bias; drop it with bias):

FLOPs = (2C_{in}K^2 - 1) * H * W * C_{out}

where:
C_{in} = input channels, K = kernel size, H, W = output feature map size, C_{out} = output channels
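A minimal sketch of the convolutional-layer formula (conv_flops is a hypothetical helper), using the first VGG-16 conv layer (3 -> 64 channels, 3x3 kernel, 224x224 output map) as a worked example:

```python
def conv_flops(c_in, k, h, w, c_out, bias=False):
    """FLOPs of one conv layer: (2*C_in*K^2 - 1)*H*W*C_out; with bias, drop the -1."""
    ops_per_output = 2 * c_in * k * k - (0 if bias else 1)
    return ops_per_output * h * w * c_out

# First conv layer of VGG-16: 3 -> 64 channels, 3x3 kernel, 224x224 output map.
print(conv_flops(3, 3, 224, 224, 64))             # 170196992
print(conv_flops(3, 3, 224, 224, 64, bias=True))  # 173408256
```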
Code and packages for computing FLOPs
torchstat
from torchstat import stat
import torchvision.models as models

model = models.vgg16()
stat(model, (3, 224, 224))

        module name  input shape output shape       params memory(MB)              MAdd             Flops   MemRead(B)  MemWrite(B) duration[%]    MemR+W(B)
0        features.0    3 224 224   64 224 224       1792.0      12.25     173,408,256.0      89,915,392.0     609280.0   12845056.0       3.67%   13454336.0
1        features.1   64 224 224   64 224 224          0.0      12.25       3,211,264.0       3,211,264.0   12845056.0   12845056.0       1.83%   25690112.0
2        features.2   64 224 224   64 224 224      36928.0      12.25   3,699,376,128.0   1,852,899,328.0   12992768.0   12845056.0       8.43%   25837824.0
3        features.3   64 224 224   64 224 224          0.0      12.25       3,211,264.0       3,211,264.0   12845056.0   12845056.0       1.45%   25690112.0
4        features.4   64 224 224   64 112 112          0.0       3.06       2,408,448.0       3,211,264.0   12845056.0    3211264.0      11.37%   16056320.0
5        features.5   64 112 112  128 112 112      73856.0       6.12   1,849,688,064.0     926,449,664.0    3506688.0    6422528.0       4.03%    9929216.0
6        features.6  128 112 112  128 112 112          0.0       6.12       1,605,632.0       1,605,632.0    6422528.0    6422528.0       0.73%   12845056.0
7        features.7  128 112 112  128 112 112     147584.0       6.12   3,699,376,128.0   1,851,293,696.0    7012864.0    6422528.0       5.86%   13435392.0
8        features.8  128 112 112  128 112 112          0.0       6.12       1,605,632.0       1,605,632.0    6422528.0    6422528.0       0.37%   12845056.0
9        features.9  128 112 112  128  56  56          0.0       1.53       1,204,224.0       1,605,632.0    6422528.0    1605632.0       7.32%    8028160.0
10      features.10  128  56  56  256  56  56     295168.0       3.06   1,849,688,064.0     925,646,848.0    2786304.0    3211264.0       3.30%    5997568.0
11      features.11  256  56  56  256  56  56          0.0       3.06         802,816.0         802,816.0    3211264.0    3211264.0       0.00%    6422528.0
12      features.12  256  56  56  256  56  56     590080.0       3.06   3,699,376,128.0   1,850,490,880.0    5571584.0    3211264.0       5.13%    8782848.0
13      features.13  256  56  56  256  56  56          0.0       3.06         802,816.0         802,816.0    3211264.0    3211264.0       0.37%    6422528.0
14      features.14  256  56  56  256  56  56     590080.0       3.06   3,699,376,128.0   1,850,490,880.0    5571584.0    3211264.0       4.76%    8782848.0
15      features.15  256  56  56  256  56  56          0.0       3.06         802,816.0         802,816.0    3211264.0    3211264.0       0.37%    6422528.0
16      features.16  256  56  56  256  28  28          0.0       0.77         602,112.0         802,816.0    3211264.0     802816.0       2.56%    4014080.0
17      features.17  256  28  28  512  28  28    1180160.0       1.53   1,849,688,064.0     925,245,440.0    5523456.0    1605632.0       3.66%    7129088.0
18      features.18  512  28  28  512  28  28          0.0       1.53         401,408.0         401,408.0    1605632.0    1605632.0       0.00%    3211264.0
19      features.19  512  28  28  512  28  28    2359808.0       1.53   3,699,376,128.0   1,850,089,472.0   11044864.0    1605632.0       5.50%   12650496.0
20      features.20  512  28  28  512  28  28          0.0       1.53         401,408.0         401,408.0    1605632.0    1605632.0       0.00%    3211264.0
21      features.21  512  28  28  512  28  28    2359808.0       1.53   3,699,376,128.0   1,850,089,472.0   11044864.0    1605632.0       5.49%   12650496.0
22      features.22  512  28  28  512  28  28          0.0       1.53         401,408.0         401,408.0    1605632.0    1605632.0       0.00%    3211264.0
23      features.23  512  28  28  512  14  14          0.0       0.38         301,056.0         401,408.0    1605632.0     401408.0       1.10%    2007040.0
24      features.24  512  14  14  512  14  14    2359808.0       0.38     924,844,032.0     462,522,368.0    9840640.0     401408.0       2.94%   10242048.0
25      features.25  512  14  14  512  14  14          0.0       0.38         100,352.0         100,352.0     401408.0     401408.0       0.00%     802816.0
26      features.26  512  14  14  512  14  14    2359808.0       0.38     924,844,032.0     462,522,368.0    9840640.0     401408.0       2.57%   10242048.0
27      features.27  512  14  14  512  14  14          0.0       0.38         100,352.0         100,352.0     401408.0     401408.0       0.00%     802816.0
28      features.28  512  14  14  512  14  14    2359808.0       0.38     924,844,032.0     462,522,368.0    9840640.0     401408.0       2.19%   10242048.0
29      features.29  512  14  14  512  14  14          0.0       0.38         100,352.0         100,352.0     401408.0     401408.0       0.37%     802816.0
30      features.30  512  14  14  512   7   7          0.0       0.10          75,264.0         100,352.0     401408.0     100352.0       0.37%     501760.0
31          avgpool  512   7   7  512   7   7          0.0       0.10               0.0               0.0          0.0          0.0       0.00%          0.0
32     classifier.0        25088         4096  102764544.0       0.02     205,516,800.0     102,760,448.0  411158528.0      16384.0      10.62%  411174912.0
33     classifier.1         4096         4096          0.0       0.02           4,096.0           4,096.0      16384.0      16384.0       0.00%      32768.0
34     classifier.2         4096         4096          0.0       0.02               0.0               0.0          0.0          0.0       0.37%          0.0
35     classifier.3         4096         4096   16781312.0       0.02      33,550,336.0      16,777,216.0   67141632.0      16384.0       2.20%   67158016.0
36     classifier.4         4096         4096          0.0       0.02           4,096.0           4,096.0      16384.0      16384.0       0.00%      32768.0
37     classifier.5         4096         4096          0.0       0.02               0.0               0.0          0.0          0.0       0.37%          0.0
38     classifier.6         4096         1000    4097000.0       0.00       8,191,000.0       4,096,000.0   16404384.0       4000.0       0.73%   16408384.0
total                                          138357544.0     109.39  30,958,666,264.0  15,503,489,024.0   16404384.0       4000.0     100.00%  783170624.0
============================================================================================================================================================
Total params: 138,357,544
------------------------------------------------------------------------------------------------------------------------------------------------------------
Total memory: 109.39MB
Total Flops: 15.5GFlops
Total MemR+W: 746.89MB
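As a sanity check, the MAdd column above can be reproduced from the formulas derived earlier (conv_madd and fc_madd are names introduced here). Note that the first conv row matches the with-bias count 2*C_in*K^2*H*W*C_out while the first classifier row matches (2I - 1)*O, so torchstat apparently applies slightly different conventions per layer type:

```python
# features.0: Conv2d 3 -> 64, 3x3 kernel, 224x224 output map. With bias, the
# per-output cost is 2*C_in*K^2, which reproduces torchstat's MAdd row exactly.
conv_madd = 2 * 3 * 3**2 * 224 * 224 * 64
assert conv_madd == 173_408_256

# classifier.0: Linear 25088 -> 4096. Here (2I - 1)*O matches the MAdd row.
fc_madd = (2 * 25088 - 1) * 4096
assert fc_madd == 205_516_800
```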


qq_41834400 2021-09-14 10:58:19
## Understanding and Calculating FLOPs in CNNs

Related concepts
FLOPS: all uppercase; short for floating point operations per second, the number of floating-point operations performed per second, understood as computing speed. A metric of hardware performance.
FLOPs: lowercase s; short for floating point operations (the s marks the plural), the total number of floating-point operations, understood as computational workload. Can be used to measure the complexity of an algorithm/model.
MACs: multiply-accumulate operations (Multiplication and Accumulation); one MAC is equivalent to 2 floating-point operations, and hardware support for fused multiply-add instructions can speed up computation.
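Since one MAC is one multiply plus one add, FLOPs is roughly 2 x MACs for layers dominated by multiply-adds; a trivial sketch:

```python
def macs_to_flops(macs):
    """One MAC = one multiply + one add = 2 FLOPs."""
    return 2 * macs

# One output value of a 3x3 convolution over a 3-channel input costs 27 MACs:
print(macs_to_flops(3 * 3 * 3))  # 54
```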
Calculating the OPs
1. conv
def compute_conv2d_flops(mod, input_shape=None, output_shape=None, macs=False):
    _, cin, _, _ = input_shape
    _, _, h, w = output_shape

    w_cout, w_cin, w_h, w_w = mod.weight.data.shape

    if mod.groups != 1:
        # note: this assumes a depthwise conv (groups == cin); for a general
        # grouped conv the per-output input count is w_cin = cin // groups
        input_channels = 1
    else:
        assert cin == w_cin
        input_channels = w_cin

    output_channels = w_cout
    stride = mod.stride[0]
    # flops = h * w * output_channels * input_channels * w_h * w_w / (stride**2)
    flops = h * w * output_channels * input_channels * w_h * w_w  # MACs

    if not macs:
        # one extra add per output element when bias is present
        flops_bias = output_shape[1:].numel() if mod.bias is not None else 0
        flops = 2 * flops + flops_bias

    return int(flops)

2. fc
def compute_fc_flops(mod, input_shape=None, output_shape=None, macs=False):
    ft_in, ft_out = mod.weight.data.shape
    flops = ft_in * ft_out  # MACs

    if not macs:
        flops_bias = ft_out if mod.bias is not None else 0
        flops = 2 * flops + flops_bias

    return int(flops)

def compute_bn2d_flops(mod, input_shape=None, output_shape=None, macs=False):
    # subtract, divide, gamma, beta
    flops = 2 * input_shape[1:].numel()

    if not macs:
        flops *= 2

    return int(flops)

3. relu
def compute_relu_flops(mod, input_shape=None, output_shape=None, macs=False):
    flops = 0
    if not macs:
        flops = input_shape[1:].numel()

    return int(flops)

4. maxpool
def compute_maxpool2d_flops(mod, input_shape=None, output_shape=None, macs=False):
    flops = 0
    if not macs:
        flops = mod.kernel_size**2 * output_shape[1:].numel()

    return flops

5. averagepool
def compute_avgpool2d_flops(mod, input_shape=None, output_shape=None, macs=False):
    flops = 0
    if not macs:
        flops = mod.kernel_size**2 * output_shape[1:].numel()

    return flops

6. softmax
def compute_softmax_flops(mod, input_shape=None, output_shape=None, macs=False):
    nfeatures = input_shape[1:].numel()

    total_exp = nfeatures  # https://stackoverflow.com/questions/3979942/what-is-the-complexity-real-cost-of-exp-in-cmath-compared-to-a-flop
    total_div = nfeatures

    # exp and div are not multiply-accumulates, so count them only in FLOPs mode
    # (the body of this branch was lost in extraction; this follows the pattern
    # of the relu/pool functions above)
    flops = 0
    if not macs:
        flops = total_exp + total_div

    return flops

Another calculation approach (which feels more reasonable)
'''
A simplified 3-D Tensor (channels, height, weight) for convolutional neural networks.
'''
class Tensor(object):
    def __init__(self, c, h, w):
        self.c = c
        self.h = h
        self.w = w

    def equals(self, other):
        return self.c == other.c and self.h == other.h and self.w == other.w

    # The name of the method this check belonged to was lost in extraction;
    # it tests whether the two tensors' dimensions are pairwise divisible.
    def divisible(self, other):
        return (self.c % other.c == 0 or other.c % self.c == 0) and \
            (self.h % other.h == 0 or other.h % self.h == 0) and \
            (self.w % other.w == 0 or other.w % self.w == 0)

'''
Calculate the single-sample inference-time params and FLOPs of a convolutional
neural network with PyTorch-like APIs.
To calculate the params and FLOPs of a certain network architecture, CNNCalculator
needs to be inherited and the network needs to be defined as in PyTorch.
For convenience, some basic operators are pre-defined and other modules can be
defined in a similar way. Parameters and FLOPs in Batch Normalization and other
types of layers are also computed. If only Convolutional and Linear layers are
to be counted, set only_mac=True.
Refer to MobileNet.py for details.
'''
class CNNCalculator(object):
    def __init__(self, only_mac=False):
        self.params = 0
        self.flops = 0
        self.only_mac = only_mac

    def calculate(self, *inputs):
        raise NotImplementedError

    def Conv2d(self, tensor, out_c, size, stride=1, padding=0, groups=1, bias=True, name='conv'):
        if type(size) == int:
            size = (size, size)
        if type(stride) == int:
            stride = (stride, stride)
        if type(padding) == int:
            padding = (padding, padding)
        assert type(size) == tuple and len(size) == 2, 'illegal size parameters'
        assert type(stride) == tuple and len(stride) == 2, 'illegal stride parameters'
        size_h, size_w = size
        stride_h, stride_w = stride
        padding_h, padding_w = padding

        in_c = tensor.c
        out_h = (tensor.h - size_h + 2 * padding_h) // stride_h + 1
        out_w = (tensor.w - size_w + 2 * padding_w) // stride_w + 1
        assert in_c % groups == 0 and out_c % groups == 0, 'in_c and out_c must be divisible by groups'

        self.params += out_c * in_c // groups * size_h * size_w
        self.flops += out_c * out_h * out_w * in_c // groups * size_h * size_w
        if bias:
            self.params += out_c
            self.flops += out_c * out_w * out_h

        return Tensor(out_c, out_h, out_w)

    def BatchNorm2d(self, tensor, name='batch_norm'):
        # Batch normalization can be folded into the preceding convolution, so no FLOPs are counted.
        # out_c = tensor.c
        # out_h = tensor.h
        # out_w = tensor.w
        # if self.only_mac:
        #     self.params += 4 * out_c
        #     self.flops += out_c * out_h * out_w
        # return Tensor(out_c, out_h, out_w)
        return tensor

    def ReLU(self, tensor, name='relu'):
        out_c = tensor.c
        out_h = tensor.h
        out_w = tensor.w

        if not self.only_mac:
            self.flops += out_c * out_h * out_w
        return Tensor(out_c, out_h, out_w)

    def Sigmoid(self, tensor, name='sigmoid'):
        out_c = tensor.c
        out_h = tensor.h
        out_w = tensor.w

        if not self.only_mac:
            self.flops += out_c * out_h * out_w
        return Tensor(out_c, out_h, out_w)

    def Pool2d(self, tensor, size, stride=1, padding=0, name='pool'):
        if type(size) == int:
            size = (size, size)
        if type(stride) == int:
            stride = (stride, stride)
        if type(padding) == int:
            padding = (padding, padding)
        assert type(size) == tuple and len(size) == 2, 'illegal size parameters'
        assert type(stride) == tuple and len(stride) == 2, 'illegal stride parameters'
        size_h, size_w = size
        stride_h, stride_w = stride
        padding_h, padding_w = padding

        out_c = tensor.c
        out_h = (tensor.h - size_h + 2 * padding_h) // stride_h + 1
        out_w = (tensor.w - size_w + 2 * padding_w) // stride_w + 1
        if not self.only_mac:
            self.flops += out_c * out_h * out_w * size_h * size_w
        return Tensor(out_c, out_h, out_w)

    # The bodies of AvgPool2d and MaxPool2d were lost in extraction; both
    # presumably delegate to Pool2d, since the Global*Pool2d wrappers call them.
    def AvgPool2d(self, tensor, size, stride=1, padding=0, name='avg_pool'):
        return self.Pool2d(tensor, size, stride=stride, padding=padding, name=name)

    def MaxPool2d(self, tensor, size, stride=1, padding=0, name='max_pool'):
        return self.Pool2d(tensor, size, stride=stride, padding=padding, name=name)

    def GlobalAvgPool2d(self, tensor, name='global_avg_pool'):
        size = (tensor.h, tensor.w)
        return self.AvgPool2d(tensor, size)

    def GlobalMaxPool2d(self, tensor, name='global_max_pool'):
        size = (tensor.h, tensor.w)
        return self.MaxPool2d(tensor, size)

    def Linear(self, tensor, out_c, name='fully_connected'):
        in_c = tensor.c
        out_h = tensor.h
        out_w = tensor.w
        assert out_h == 1 and out_w == 1, 'out_h or out_w is greater than 1 in Linear layer.'
        self.params += in_c * out_c
        self.flops += in_c * out_c
        return Tensor(out_c, out_h, out_w)

    def Concat(self, tensors, name='concat'):
        out_c = 0
        out_h = tensors[0].h
        out_w = tensors[0].w
        for tensor in tensors:
            assert tensor.h == out_h and tensor.w == out_w, 'tensor dimensions mismatch in Concat layer.'
            out_c += tensor.c
        return Tensor(out_c, out_h, out_w)

    # The def line of the element-wise op this body belonged to was lost in
    # extraction; judging by the Multi signature below, it was likely an Add.
    def Add(self, tensor, other, name='add'):
        out_c = tensor.c
        out_h = tensor.h
        out_w = tensor.w
        if not self.only_mac:
            self.flops += out_c * out_h * out_w
        return Tensor(out_c, out_h, out_w)

    # Multi's body was also lost; an element-wise multiply plausibly costs the
    # same one FLOP per output element as the Add above.
    def Multi(self, tensor, other, name='multi'):
        out_c = tensor.c
        out_h = tensor.h
        out_w = tensor.w
        if not self.only_mac:
            self.flops += out_c * out_h * out_w
        return Tensor(out_c, out_h, out_w)

    def SplitBySize(self, tensor, sizes, name='split_by_size'):
        assert sum(sizes) == tensor.c, 'sizes and tensor.c do not match.'
        return [Tensor(c, tensor.h, tensor.w) for c in sizes]

When reposting, please credit the source: https://blog.csdn.net/tbl1234567. Author: 陶表犁
tbl1234567 2020-10-23 13:48:22