  • In Keras, when the dataset is too large to load into memory all at once, you need to train with train_on_batch or fit_generator. Both use a generator to load one batch-size worth of data per step. So how do fit_generator and train_on_...
  • Using train_on_batch for fine-grained control of the training loop. Most keras users train models with fit() or fit_generator(); these two APIs are friendly and convenient for newcomers to deep learning, but since they are very deeply encapsulated, anyone hoping to customize...

    Using train_on_batch for fine-grained control of the training loop

    Most keras users train models with fit() or fit_generator(). These two APIs are friendly and convenient for newcomers to deep learning, but because they are very deeply encapsulated they are inconvenient for anyone who wants to customize the training loop (users coming to keras from torch may particularly prefer writing their own loop). Moreover, models such as GANs that must be trained in alternating steps cannot be trained directly with fit or fit_generator at all. For these cases keras provides the train_on_batch API, which runs a single gradient update on one mini-batch of data.
    Its advantages can be summarized as follows:

    • finer-grained control over the training loop, and more precise collection of loss and metrics
    • step-wise training of models, e.g. implementing GANs
    • more convenient model saving when training on multiple GPUs
    • more varied data loading, e.g. in combination with the torch DataLoader

    The following sections walk through how to use train_on_batch.

    1. Inputs and outputs of train_on_batch

    1.1 Inputs

    y_pred = Model.train_on_batch(
        x,
        y=None,
        sample_weight=None,
        class_weight=None,
        reset_metrics=True,
        return_dict=False,
    )
    
    • x: the model input; a single numpy array for a single-input model, or a list of numpy arrays for a multi-input model
    • y: the labels; a single numpy array for a single-output model, or a list of numpy arrays for a multi-output model
    • sample_weight: a weight for each sample in the mini-batch, with shape (batch_size,)
    • class_weight: class weights applied to the loss function, adding a weight to each class's loss; mainly used for class imbalance. A dict with one entry per class (see the sketch after this list)
    • reset_metrics: defaults to True, in which case the returned metrics cover only this mini-batch; if False, metrics accumulate across batches
    • return_dict: defaults to False, in which case y_pred is a list; if True, y_pred is a dict keyed by metric name (see the sketch at the end of section 1.2)
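
    For class_weight in particular, keras expects a dict with one entry per class. A minimal sketch, assuming a binary single-output model compiled as in the examples of section 1.2, with class 1 under-represented:

    # class_weight up-weights the loss contribution of the rare class;
    # keys are class indices, values are loss multipliers
    class_weight = {0: 1.0, 1: 5.0}
    y_pred = model.train_on_batch(x=image, y=label, class_weight=class_weight)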

    1.2 Outputs

    • Single-output model with 1 loss and no metrics: train_on_batch returns a scalar, the loss of this mini-batch. For example:
    model = keras.models.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer=keras.optimizers.Adam(), loss=['binary_crossentropy'])
    y_pred = model.train_on_batch(x=image, y=label)
    # y_pred is a scalar
    
    • Single-output model with 1 loss and n metrics: train_on_batch returns a list of length 1+n. For example:
    model = keras.models.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer=keras.optimizers.Adam(), loss=['binary_crossentropy'], metrics=['accuracy'])
    y_pred = model.train_on_batch(x=image, y=label)
    # len(y_pred) == 2: y_pred[0] is the loss, y_pred[1] is the accuracy
    
    • Multi-output model with n losses and m metrics: train_on_batch returns a list of length 1+n+m. For example:
    model = keras.models.Model(inputs=inputs, outputs=[output1, output2])
    model.compile(optimizer=keras.optimizers.Adam(),
    			  loss=['binary_crossentropy', 'binary_crossentropy'],
    			  metrics=['accuracy', 'accuracy'])
    y_pred = model.train_on_batch(x=image, y=label)
    # len(y_pred) == 5: y_pred[0] is the total loss (weighted by loss_weights),
    # y_pred[1] is loss_1, y_pred[2] is loss_2,
    # y_pred[3] is acc_1, y_pred[4] is acc_2
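
    As mentioned in section 1.1, return_dict=True removes the need to remember these positions. A minimal sketch, assuming the single-output model with an accuracy metric from the second example (return_dict is available in recent tf.keras versions):

    # with return_dict=True the result is keyed by metric name instead of position
    y_pred = model.train_on_batch(x=image, y=label, return_dict=True)
    # e.g. {'loss': 0.693, 'accuracy': 0.5}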
    

    2. Multi-GPU training with train_on_batch

    2.1 Multi-GPU model initialization, weight loading, compilation, and saving

    Note! During training, operate on para_model; when saving, operate on model.

    import tensorflow as tf
    import keras
    import os
    
    # set how many GPUs to use
    gpu = "0,1"
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu
    gpu_num = len(gpu.split(','))
    
    # model initialization
    with tf.device('/cpu:0'):  # with multiple GPUs, build the template model on the CPU first
    	model = YourModel(input_size, num_classes)
    	model.load_weights('*.h5')  # load pretrained weights here if you have any
    para_model = keras.utils.multi_gpu_model(model, gpus=gpu_num)  # replicate the model across the GPUs
    para_model.compile(optimizer, loss=[...], metrics=[...])  # compile the multi-GPU model
    
    # training and validation: call train_on_batch / test_on_batch on para_model
    def train():
    	para_model.train_on_batch(...)
    
    def evaluate():
    	para_model.test_on_batch(...)
    
    # saving the model; note! train on para_model, but save from model
    # do not use para_model.save() or para_model.save_weights(): the parallel model wraps
    # `model` as an inner layer, so such checkpoints will not load back correctly
    model.save('*.h5')
    model.save_weights('*.h5')
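
    When restoring later, rebuild the template model, load the weights saved from model, and wrap it again. A minimal sketch, assuming the same YourModel constructor as above:

    # rebuild on the CPU, load the single-GPU checkpoint, then re-wrap for multi-GPU use
    with tf.device('/cpu:0'):
    	model = YourModel(input_size, num_classes)
    	model.load_weights('*.h5')  # the file written by model.save_weights() above
    para_model = keras.utils.multi_gpu_model(model, gpus=gpu_num)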
    

    3. Custom learning-rate schedules

    Since callbacks cannot be used with train_on_batch, we use keras.backend.get_value() and keras.backend.set_value() to read and set the current learning rate. As an example, here is the simplest step-decay schedule: every 10 epochs, multiply the learning rate by 0.1.

    import keras.backend as K
    
    for epoch in range(100):
    	train_one_epoch()  # placeholder: your per-epoch training loop built on train_on_batch
    	evaluate()         # placeholder: your validation loop
    	# every 10 epochs, shrink the lr by a factor of 10
    	if epoch % 10 == 0 and epoch != 0:
    		lr = K.get_value(model.optimizer.lr)  # read the current learning rate
    		lr = lr * 0.1                         # decay it
    		K.set_value(model.optimizer.lr, lr)   # write it back
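
    The same pattern works in tf.keras; a minimal sketch, assuming a compiled model (TF2 optimizers expose the rate as learning_rate, with lr kept as an alias in most versions):

    import tensorflow.keras.backend as K
    
    lr = K.get_value(model.optimizer.learning_rate)
    K.set_value(model.optimizer.learning_rate, lr * 0.1)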
    

    4. Combining keras and torch

    The torch dataloader is the most convenient data-loading mechanism I have used so far. Part of the reason I use train_on_batch is that I can load data with a torch dataloader and then train the model with train_on_batch; by tuning the number of cpu workers and the batch_size appropriately, training efficiency can be maximized.

    4.1 A dataloader + train_on_batch pipeline for training keras models

    # define a torch dataset
    class Dataset(torch.utils.data.Dataset):
    	def __init__(self, root_list, transforms=None):
    		self.root_list = root_list
    		self.transforms = transforms
    
    	def __getitem__(self, idx):
    		# assume an image-classification task
    		image = ...  # load a single image
    		label = ...  # load its label
    		if self.transforms is not None:
    			image = self.transforms(image)
    		return image, label  # shapes: (H,W,3), scalar
    
    	def __len__(self):
    		return len(self.root_list)
    
    # a custom collate_fn so the dataloader returns numpy arrays
    def collate_fn(batch):
    	# here batch is a list of tuples: [(image, label), (image, label), ...]
    	image, label = zip(*batch)
    	image = np.asarray(image)  # (batch_size, H, W, 3)
    	label = np.asarray(label)  # (batch_size,)
    	return image, label  # if the dataset returns ndarray images, the loader returns ndarrays too
    
    # build the datasets
    train_dataset = Dataset(train_list)
    valid_dataset = Dataset(valid_list)
    test_dataset = Dataset(test_list)
    
    # build the dataloaders; without the custom collate_fn the loader
    # yields torch Tensors by default, which need a .numpy() conversion
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size, shuffle=True, num_workers=4, collate_fn=collate_fn)
    valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size, shuffle=False, num_workers=4, collate_fn=collate_fn)
    test_loader = torch.utils.data.DataLoader(test_dataset, batch_size, shuffle=False, num_workers=4, collate_fn=collate_fn)
    
    # define train, evaluate, test
    def train():
    	for i, (inputs, label) in enumerate(train_loader):
    		# if inputs and label are torch Tensors, convert them with
    		# inputs = inputs.numpy() and label = label.numpy()
    		y_pred = model.train_on_batch(inputs, label)
    
    def evaluate():
    	for i, (inputs, label) in enumerate(valid_loader):
    		# if inputs and label are Tensors, convert as above
    		y_pred = model.test_on_batch(inputs, label)
    
    def test():
    	for i, (inputs, label) in enumerate(test_loader):
    		# if inputs and label are Tensors, convert as above
    		y_pred = model.test_on_batch(inputs, label)
    
    def run():
    	for epoch in range(num_epoch):
    		train()
    		evaluate()
    	test()
    
    if __name__ == "__main__":
    	run()
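
    If you also want the per-epoch averages that fit() would normally print, one option is to accumulate the per-batch values yourself. A minimal sketch, assuming the train_loader above and a compiled single-output model with no metrics (so train_on_batch returns a scalar):

    def train_with_logging():
    	# average the scalar loss returned for each mini-batch over the epoch
    	batch_losses = []
    	for inputs, label in train_loader:
    		batch_losses.append(model.train_on_batch(inputs, label))
    	print('epoch mean loss: %.4f' % (sum(batch_losses) / len(batch_losses)))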
    

    Summary

    There are other uses of train_on_batch, such as training GANs, that are not covered here; you can search github for examples such as keras-dcgan.

    References

    Official keras API docs: train_on_batch

  • While using TensorFlow's TensorBoard to plot the outputs of train_on_batch, I found some problems. Below, the outputs of train_on_batch are explained. Before that, let's look at Keras's model.compile function, using the Keras version of Faster R-CNN...

    While using TensorFlow's TensorBoard to plot the outputs of train_on_batch, I ran into some problems. Below, the outputs of train_on_batch are explained. Before getting to train_on_batch, let's first look at Keras's model.compile function. The walkthrough uses the Keras version of the Faster R-CNN code. Sample code:

    # define the RPN, built on the base layers
    
    rpn = nn.rpn(shared_layers, num_anchors)
    
    classifier = nn.classifier(shared_layers, roi_input, cfg.num_rois, nb_classes=len(classes_count), trainable=True)
    
    model_rpn = Model(img_input, rpn[:2])
    model_classifier = Model([img_input, roi_input], classifier)
    .
    .
    .
    optimizer = Adam(lr=1e-5)
    optimizer_classifier = Adam(lr=1e-5)
    model_rpn.compile(optimizer=optimizer,
                         loss=[losses_fn.rpn_loss_cls(num_anchors), losses_fn.rpn_loss_regr(num_anchors)])
    model_classifier.compile(optimizer=optimizer_classifier,
                               loss=[losses_fn.class_loss_cls, losses_fn.class_loss_regr(len(classes_count) - 1)],
                               metrics={'dense_class_{}'.format(len(classes_count)): 'accuracy'})
    .
    .
    .
    loss_rpn = model_rpn.train_on_batch(X, Y)
    loss_class = model_classifier.train_on_batch([X, X2[:, sel_samples, :]],
                                                                 [Y1[:, sel_samples, :], Y2[:, sel_samples, :]])
    
    

    model_rpn's loss is [losses_fn.rpn_loss_cls(num_anchors), losses_fn.rpn_loss_regr(num_anchors)]. From the definition model_rpn = Model(img_input, rpn[:2]) we know that model_rpn's input is img_input and its outputs are rpn[:2], i.e. two outputs, which correspond to two loss outputs. But when training with train_on_batch, a list of three values is returned: the first value, named 'loss', is the total loss, the sum of the two outputs' losses. The names of the two output losses are the network's output names with '_loss' appended. Below is the definition of the rpn network; note the name values given to its outputs:

    def rpn(base_layers, num_anchors):
        x = Convolution2D(512, (3, 3), padding='same', activation='relu', kernel_initializer='normal', name='rpn_conv1')(
            base_layers)
    
        x_class = Convolution2D(num_anchors, (1, 1), activation='sigmoid', kernel_initializer='uniform',
                                name='rpn_out_class')(x)
        x_regr = Convolution2D(num_anchors * 6, (1, 1), activation='linear', kernel_initializer='zero',
                               name='rpn_out_regress')(x)  
    
        return [x_class, x_regr, base_layers]
    

    You can use model.metrics_names to see what each value returned by train_on_batch means. The original post showed model_rpn's output here as a screenshot.
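
    Based on the naming rule just described (output name plus '_loss'), the list would look like the following sketch:

    print(model_rpn.metrics_names)
    # expected, given the output names 'rpn_out_class' and 'rpn_out_regress' defined above:
    # ['loss', 'rpn_out_class_loss', 'rpn_out_regress_loss']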

    Similarly, model_classifier's metrics_names can be obtained the same way (the original screenshot is omitted here).

    What differs between model_classifier's metrics_names and model_rpn's is that model_classifier specified an 'accuracy' metric at compile time.

    Now let's look at part of the definition of model.compile:

    class Model(Network):
        """The `Model` class adds training & evaluation routines to a `Network`.
        """
    
        def compile(self, optimizer,
                    loss=None,
                    metrics=None,
                    loss_weights=None,
                    sample_weight_mode=None,
                    weighted_metrics=None,
                    target_tensors=None,
                    **kwargs):
            """Configures the model for training.
    
            # Arguments
                optimizer: String (name of optimizer) or optimizer instance.
                    See [optimizers](/optimizers).
                loss: String (name of objective function) or objective function.
                    See [losses](/losses).
                    If the model has multiple outputs, you can use a different loss
                    on each output by passing a dictionary or a list of losses.
                    The loss value that will be minimized by the model
                    will then be the sum of all individual losses.
                metrics: List of metrics to be evaluated by the model
                    during training and testing.
                    Typically you will use `metrics=['accuracy']`.
                    To specify different metrics for different outputs of a
                    multi-output model, you could also pass a dictionary,
                    such as `metrics={'output_a': 'accuracy'}`.
                loss_weights: Optional list or dictionary specifying scalar
                    coefficients (Python floats) to weight the loss contributions
                    of different model outputs.
                    The loss value that will be minimized by the model
                    will then be the *weighted sum* of all individual losses,
                    weighted by the `loss_weights` coefficients.
                    If a list, it is expected to have a 1:1 mapping
                    to the model's outputs. If a tensor, it is expected to map
                    output names (strings) to scalar coefficients.
                sample_weight_mode: If you need to do timestep-wise
                    sample weighting (2D weights), set this to `"temporal"`.
                    `None` defaults to sample-wise weights (1D).
                    If the model has multiple outputs, you can use a different
                    `sample_weight_mode` on each output by passing a
                    dictionary or a list of modes.
                weighted_metrics: List of metrics to be evaluated and weighted
                    by sample_weight or class_weight during training and testing.
                target_tensors: By default, Keras will create placeholders for the
                    model's target, which will be fed with the target data during
                    training. If instead you would like to use your own
                    target tensors (in turn, Keras will not expect external
                    Numpy data for these targets at training time), you
                    can specify them via the `target_tensors` argument. It can be
                    a single tensor (for a single-output model), a list of tensors,
                    or a dict mapping output names to target tensors.
                **kwargs: When using the Theano/CNTK backends, these arguments
                    are passed into `K.function`.
                    When using the TensorFlow backend,
                    these arguments are passed into `tf.Session.run`.
    
            # Raises
                ValueError: In case of invalid arguments for
                    `optimizer`, `loss`, `metrics` or `sample_weight_mode`.
            """
            self.optimizer = optimizers.get(optimizer)
            self.loss = loss or []
            self.metrics = metrics or []
            self.loss_weights = loss_weights
            self.sample_weight_mode = sample_weight_mode
            self.weighted_metrics = weighted_metrics
    
            if not self.built:
                # Model is not compilable because
                # it does not know its number of inputs
                # and outputs, nor their shapes and names.
                # We will compile after the first
                # time the model gets called on training data.
                return
            self._is_compiled = True
    
            # Prepare loss functions.
            if isinstance(loss, dict):
                for name in loss:
                    if name not in self.output_names:
                        raise ValueError('Unknown entry in loss '
                                         'dictionary: "' + name + '". '
                                         'Only expected the following keys: ' +
                                         str(self.output_names))
                loss_functions = []
                for name in self.output_names:
                    if name not in loss:
                        warnings.warn('Output "' + name +
                                      '" missing from loss dictionary. '
                                      'We assume this was done on purpose, '
                                      'and we will not be expecting '
                                      'any data to be passed to "' + name +
                                      '" during training.', stacklevel=2)
                    loss_functions.append(losses.get(loss.get(name)))
            elif isinstance(loss, list):
                if len(loss) != len(self.outputs):
                    raise ValueError('When passing a list as loss, '
                                     'it should have one entry per model outputs. '
                                     'The model has ' + str(len(self.outputs)) +
                                     ' outputs, but you passed loss=' +
                                     str(loss))
                loss_functions = [losses.get(l) for l in loss]
            else:
                loss_function = losses.get(loss)
                loss_functions = [loss_function for _ in range(len(self.outputs))]
            self.loss_functions = loss_functions
            weighted_losses = [
                weighted_masked_objective(fn) for fn in loss_functions]
            skip_target_indices = []
            skip_target_weighing_indices = []
            self._feed_outputs = []
            self._feed_output_names = []
            self._feed_output_shapes = []
            self._feed_loss_fns = []
            for i in range(len(weighted_losses)):
                if weighted_losses[i] is None:
                    skip_target_indices.append(i)
                    skip_target_weighing_indices.append(i)

    Next, let's look at part of the definition of train_on_batch:

        def train_on_batch(self, x, y,
                           sample_weight=None,
                           class_weight=None):
            """Runs a single gradient update on a single batch of data.
    
            # Arguments
                x: Numpy array of training data,
                    or list of Numpy arrays if the model has multiple inputs.
                    If all inputs in the model are named,
                    you can also pass a dictionary
                    mapping input names to Numpy arrays.
                y: Numpy array of target data,
                    or list of Numpy arrays if the model has multiple outputs.
                    If all outputs in the model are named,
                    you can also pass a dictionary
                    mapping output names to Numpy arrays.
                sample_weight: Optional array of the same length as x, containing
                    weights to apply to the model's loss for each sample.
                    In the case of temporal data, you can pass a 2D array
                    with shape (samples, sequence_length),
                    to apply a different weight to every timestep of every sample.
                    In this case you should make sure to specify
                    sample_weight_mode="temporal" in compile().
                class_weight: Optional dictionary mapping
                    class indices (integers) to
                    a weight (float) to apply to the model's loss for the samples
                    from this class during training.
                    This can be useful to tell the model to "pay more attention" to
                    samples from an under-represented class.
    
            # Returns
                Scalar training loss
                (if the model has a single output and no metrics)
                or list of scalars (if the model has multiple outputs
                and/or metrics). The attribute `model.metrics_names` will give you
                the display labels for the scalar outputs.
            """
            x, y, sample_weights = self._standardize_user_data(
                x, y,
                sample_weight=sample_weight,
                class_weight=class_weight)
            if self._uses_dynamic_learning_phase():
                ins = x + y + sample_weights + [1.]
            else:
                ins = x + y + sample_weights
            self._make_train_function()
            outputs = self.train_function(ins)
            return unpack_singleton(outputs)

    As for a full reading of the Keras model.compile and train_on_batch code, I will go through it in detail when I have time. Stay tuned!

    Below is the TensorBoard-for-train_on_batch code:

    import numpy as np
    import tensorflow as tf
    from keras.callbacks import TensorBoard
    from keras.layers import Input, Dense
    from keras.models import Model
    
    
    def write_log(callback, names, logs, batch_no):
        for name, value in zip(names, logs):
            summary = tf.Summary()
            summary_value = summary.value.add()
            summary_value.simple_value = value
            summary_value.tag = name
            callback.writer.add_summary(summary, batch_no)
            callback.writer.flush()
        
    net_in = Input(shape=(3,))
    net_out = Dense(1)(net_in)
    model = Model(net_in, net_out)
    model.compile(loss='mse', optimizer='sgd', metrics=['mae'])
    
    log_path = './graph'
    callback = TensorBoard(log_path)
    callback.set_model(model)
    train_names = ['train_loss', 'train_mae']
    val_names = ['val_loss', 'val_mae']
    for batch_no in range(100):
        X_train, Y_train = np.random.rand(32, 3), np.random.rand(32, 1)
        logs = model.train_on_batch(X_train, Y_train)
        write_log(callback, train_names, logs, batch_no)
        
        if batch_no % 10 == 0:
            X_val, Y_val = np.random.rand(32, 3), np.random.rand(32, 1)
            # use test_on_batch for the validation pass so the weights are not updated
            logs = model.test_on_batch(X_val, Y_val)
            write_log(callback, val_names, logs, batch_no//10)

    Note the exact correspondence between name and value in the line for name, value in zip(names, logs); refer back to the discussion of model.compile and train_on_batch above.
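
    To keep that correspondence automatic rather than hand-maintained, one option (a sketch, assuming the compiled model above) is to derive the tag names from model.metrics_names:

    # build the TensorBoard tags from the model itself so that the order always
    # matches what train_on_batch returns, e.g. ['loss', 'mean_absolute_error'] here
    train_names = ['train_' + n for n in model.metrics_names]
    val_names = ['val_' + n for n in model.metrics_names]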


  • Most tensorflow users train models with fit() or fit_generator(); these two apis are very friendly to those just getting started... tensorflow therefore provides the train_on_batch api, which runs a gradient update on one mini-batch of data. Its advantages are summarized below:

    Most tensorflow users train models with fit() or fit_generator(). These two APIs are friendly and convenient for newcomers to deep learning, but because they are very deeply encapsulated they are inconvenient for anyone wanting to customize the training loop, and models such as GANs that must be trained in alternating steps cannot be trained directly with fit or fit_generator at all. tensorflow therefore provides the train_on_batch API, which runs a single gradient update on one mini-batch of data.

    Its advantages can be summarized as follows:

    • finer-grained control over the training loop, and more precise collection of loss and metrics
    • step-wise training of models, e.g. implementing GANs
    • more convenient model saving when training on multiple GPUs
    • more varied data loading

    Function signature:

    y_pred = Model.train_on_batch(
        x,
        y=None,
        sample_weight=None,
        class_weight=None,
        reset_metrics=True,
        return_dict=False,
    )
    

    Official docs: train_on_batch

    Parameters:

    • x: the model input; a single numpy array for a single-input model, or a list of numpy arrays for a multi-input model
    • y: the labels; a single numpy array for a single-output model, or a list of numpy arrays for a multi-output model
    • sample_weight: a weight for each sample in the mini-batch, with shape (batch_size,)
    • class_weight: class weights applied to the loss function, adding a weight to each class's loss; mainly used for class imbalance. A dict with one entry per class
    • reset_metrics: defaults to True, in which case the returned metrics cover only this mini-batch; if False, metrics accumulate across batches (see the sketch after the examples below)
    • return_dict: defaults to False, in which case y_pred is a list; if True, y_pred is a dict

    Examples:

    • Single-output model with only a loss and no metrics: y_pred is a scalar, the loss of this mini-batch, as in the example below
    model = keras.models.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer=keras.optimizers.Adam(), loss=['binary_crossentropy'])
    history = model.train_on_batch(x=image, y=label)
    # history is a scalar
    
    • Single-output model with both a loss and metrics: y_pred is a list of this mini-batch's loss and metrics, of length 1+len(metrics), as in the example below
    model = keras.models.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer=keras.optimizers.Adam(), loss=['binary_crossentropy'], metrics=['accuracy'])
    history = model.train_on_batch(x=image, y=label)  # len(history) == 2
    # history is a list of length 2:
    # history[0] is the loss,
    # history[1] is the accuracy
    
    • Multi-output model with both losses and metrics: y_pred is a list of length 1+len(loss)+len(metrics), as in the example below
    model = keras.models.Model(inputs=inputs, outputs=[output1, output2])
    model.compile(optimizer=keras.optimizers.Adam(),
    			  loss=['binary_crossentropy', 'binary_crossentropy'],
    			  metrics=['accuracy', 'accuracy'])
    history = model.train_on_batch(x=image, y=label)  # len(history) == 5
    # history[0] is the total loss (weighted by loss_weights),
    # history[1] is the first output's loss,
    # history[2] is the second output's loss,
    # history[3] is the first accuracy,
    # history[4] is the second accuracy
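
    As noted in the parameter list above, reset_metrics controls whether these returned values are per-batch or accumulated. A minimal sketch, assuming the single-output model with an accuracy metric above and some iterable of batches (the name batches is hypothetical):

    # with reset_metrics=False the returned metrics accumulate across calls,
    # giving a running average over the epoch instead of per-batch values
    for x_batch, y_batch in batches:
        history = model.train_on_batch(x_batch, y_batch, reset_metrics=False)
    model.reset_metrics()  # clear the accumulated state before the next epoch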
    
  • loss = self.model.train_on_batch(X,y) Here the target data (y) is a 2-D array: the first column is the action taken and the second column is the discounted reward. During training, the network's output and the target data differ in dimension, so how is it...
  • train_on_batch returns Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names ...

    train_on_batch

    returns

    Scalar training loss
    (if the model has a single output and no metrics)
    or list of scalars (if the model has multiple outputs
    and/or metrics). The attribute model.metrics_names will give you
    the display labels for the scalar outputs.

    If the model has a single output and no metrics, it returns the training loss for this batch as a scalar;
    if the model has multiple outputs and/or metrics, it returns a list of scalars.
    The attribute model.metrics_names gives you the display labels for the scalar outputs (useful for visualization).
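
    A minimal sketch of how these labels are typically paired with the returned values (assuming a compiled model with at least one metric, and one batch x_batch, y_batch):

    logs = model.train_on_batch(x_batch, y_batch)
    for name, value in zip(model.metrics_names, logs):
        print(name, value)  # one line per scalar: the loss first, then each metric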

    keras metrics

    https://keras.io/metrics/
    (screenshot of the metrics docs omitted)
    metrics[0] is the loss; if the metric specified in metrics is the same as the loss function, it prints: (screenshot omitted)

    from keras.callbacks import TensorBoard

    Some of the parameters:

    • log_dir: the path of the directory where to save the log files to be parsed by TensorBoard.
    • histogram_freq: frequency (in epochs) at which to compute activation and weight histograms for the layers of the model. If set to 0, histograms won't be computed. Validation data (or split) must be specified for histogram visualizations.
    • write_graph: whether to visualize the graph in TensorBoard. The log file can become quite large when write_graph is set to True.
    • write_grads: whether to visualize gradient histograms in TensorBoard. histogram_freq must be greater than 0.
    • batch_size: size of batch of inputs to feed to the network for histograms computation.
    • write_images: whether to write model weights to visualize as image in TensorBoard.

    Usage:

        board = TensorBoard(log_dir=tb_log, histogram_freq=2, batch_size=batch_size, write_images=True)
        board.set_model(model)
    

    In practice

    def named_logs(metrics_names, logs):
        result = {}
        for log in zip(metrics_names, logs):
            result[log[0]] = log[1]
        return result
    
    
        loss_file = open(os.path.join(workspace, "loss_file.txt"), 'w+')
    
        board = TensorBoard(log_dir=tb_log, batch_size=batch_size, write_images=True)
        board.set_model(model)
        # Train.
        t1 = time.time()
        for i in range(1, epoch):
            for (batch_x, batch_y) in train_gen.generate(xs=[tr_x], ys=[tr_y]):
                metrics = model.train_on_batch(batch_x, batch_y)
                loss_avg.add(metrics[0])
                # TODO: add checkpoint
    
            tr_loss = eval(model, eval_tr_gen, tr_x, tr_y)
            te_loss = eval(model, eval_te_gen, te_x, te_y)
            # loss_avg: average training loss across this epoch; tr_loss / te_loss: loss on
            # the training / test set evaluated after this epoch finishes
            loss_str = "Epoch: %d / %d, inter_loss_avg: %f, tr_loss: %f, te_loss: %f" % (i, epoch, loss_avg.val(), tr_loss, te_loss)
            loss_file.write(loss_str + "\n")
            loss_file.flush()
            loss_avg.reset()
    
            board.on_epoch_end(i, named_logs(model.metrics_names, metrics))  # uses the helper defined above
    
            # Save out training stats.
            stat_dict = {'epoch': i,
                         'tr_loss': tr_loss,
                         'te_loss': te_loss, }
            stat_path = os.path.join(stats_dir, "%diters.p" % i)
            pickle.dump(stat_dict, open(stat_path, 'wb'), protocol=pickle.HIGHEST_PROTOCOL)
    
            # Save model.
            if i % save_interval == 0:
                model_path = os.path.join(model_dir, "md_%d_epoch.h5" % i)
                model.save(model_path)
                print("Saved model to %s" % model_path)
        board.on_train_end(None)
        print("Training time: %s s" % (time.time() - t1,))
    
    

    Note:
    when using train_on_batch, generating histograms requires validation data, and it cannot come from a generator; it must be actual in-memory arrays (screenshot omitted).
    With fit() this is easy: just pass validation_data=(x_val, y_val).

    Closing remarks

    After all this going around in circles, it may be easier to just add another stick of RAM; why reinvent the wheel?

        # Directory for model checkpoints
        ckpt_dir = os.path.join(workspace, "models", "%ddb" % int(tr_snr))
        prp_data.create_folder(ckpt_dir)
    
        model = Sequential()
        model.add(Flatten(input_shape=(n_concat, n_freq)))
        model.add(Dense(n_hid))
        model.add(LeakyReLU())
        model.add(Dropout(0.2))
        model.add(Dense(n_hid))
        model.add(LeakyReLU(0.2))
        model.add(Dropout(0.2))
        model.add(Dense(n_hid))
        model.add(LeakyReLU(0.2))
        model.add(Dropout(0.2))
        model.add(Dense(n_freq, activation='linear'))
        model.summary()
    
        model.load_weights(os.path.join(workspace, pre_model))
    
        # TODO: diy callback func, get the loss from real val data.
        callbacks_list = [
            EarlyStopping(monitor='val_mean_sq', patience=6, ),  # note: monitor must name a logged metric, e.g. 'val_mean_squared_error'
            ModelCheckpoint(filepath=ckpt_dir+'/ckpt_model.h5', monitor='val_loss', save_best_only=True, mode='min', ),
            ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=7, ),
            TensorBoard(log_dir=tb_log, histogram_freq=1, batch_size=batch_size, write_images=True)
        ]
    
        model.compile(loss='mean_absolute_error',
                      optimizer=Nadam(lr=lr),
                      metrics=['mse'])
    
        plot_model(model, show_layer_names=True, to_file=os.path.join(workspace,'model.png'))
        # Train.
        t1 = time.time()
        # model.fit(tr_x, tr_y, epochs=epoch, batch_size=batch_size, callbacks=callbacks_list, validation_data=(te_x, te_y))
        model.fit(tr_x, tr_y, epochs=epoch, batch_size=batch_size, callbacks=callbacks_list, validation_split=0.2)
    
        print("Training time: %s s" % (time.time() - t1,))
    
    
  • Keras: fit_generator and train_on_batch

    In Keras, when the dataset is too large to load into memory all at once, you need to train with train_on_batch or fit_generator. Both use a generator to load one batch-size worth of data per step. So how do fit_generator and train_on_...
  • First there is fit, the most common and most basic one. ...fit(x=None, y=None, batch_size=None, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=...
  • After compiling, a keras model needs to be trained; besides the common model.fit() and model.... Advantages of train_on_batch: 1. finer-grained control of the training loop and more precise collection of loss and metrics; 2. step-wise training, e.g. implementing GANs; 3. more convenient model saving with multi-GPU training ...
  • So when train_on_batch() is called, does it compute the loss for each sample and update the model after each one (100 updates), returning the loss after the last update? Or does it sum the losses of the 100 samples and then update the model once? And what loss value does that return?
  • Generators are a very convenient input method in keras: they save memory and come with built-in data augmentation. They are generally used with fit_generator, a fairly rigid way of training, and are less suited to the more extensible train_on_batch. In fact, though, a generator can be used for...
  • The model.train_on_batch function: purpose and definition. model.train_on_batch() trains on a single batch of training data. Definition: train_on_batch(x, y, sample_weight=None, class_weight=None). Parameter meanings: x...
  • Customizing a fine-grained training loop with train_on_batch in keras

    train_on_batch can be used in keras to customize a fine-grained training loop. Usage example: import os os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' import numpy as np import matplotlib.pyplot as plt from keras.models import ...
  • This article covers the analysis and application of Keras's fit(), fit_generator(), and train_on_batch() functions.
  • def write_log(callback, names, logs, batch_no): for name, value in zip(names, logs): summary = tf.Summary() summary_value = summary.value.add() summary_value.simple_v...
  • [Topic] Using ImageDataGenerator in Keras to augment every batch, then training with train_on_batch. 1. ImageDataGenerator's flow method returns a generator, from which you can fetch each batch of data with __next__() or next() (x, y...
  • Got the error at runtime: AttributeError: 'ModelCheckpoint' object has no attribute 'on_train_batch_begin'. Change from keras.callbacks import ModelCheckpoint to import from tensorflow instead, i.e. from tensorflow....
  • keras train_on_batch

    import numpy as np import tensorflow as tf from keras.callbacks import TensorBoard from keras.layers import Input, Dense from keras.models import Model...def write_log(callback, names, logs, batch_no)...
  • Modified the code: keras.callbacks.ModelCheckpoint(filepath, monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', period=1...Callbacks method on_test_batch_end is slow ...
  • 'EarlyStopping' object has no attribute '_implements_train_batch_hooks'
  • model.fit(train_x, train_y, batch_size=32, epochs=300, callbacks=[lrate]) During the run it kept raising the error: AttributeError: 'LearningRateScheduler' object has no attribute '_implements_train_batch_...
