  • PyTorch + Ray Tune hyperparameter tuning

    1,000+ views 2021-01-01 15:43:27

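    This post draws on the official PyTorch documentation and the official Ray Tune documentation:

    1. HYPERPARAMETER TUNING WITH RAY TUNE

    2. How to use Tune with PyTorch

    Using the CIFAR 10 image classification example from PyTorch, it shows how to integrate Ray Tune into the PyTorch training workflow.

    Doing so requires a few small changes to the original PyTorch program:

    • wrap data loading and training in functions;
    • make some network parameters configurable;
    • add checkpointing (optional);
    • define the search space used for model tuning.

    The annotated example code below shows how Ray Tune is used in practice: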

    from functools import partial
    import numpy as np
    import os
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torch.utils.data import random_split
    import torchvision
    import torchvision.transforms as transforms
    
    from ray import tune
    from ray.tune import CLIReporter
    from ray.tune.schedulers import ASHAScheduler
    
    
    # Define the neural network model
    class Net(nn.Module):
        def __init__(self, l1=120, l2=84):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, l1)        # output size l1 is configurable
            self.fc2 = nn.Linear(l1, l2)        # sizes l1 and l2 are configurable
            self.fc3 = nn.Linear(l2, 10)        # input size l2 is configurable
    
        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
    
    
    # Wrap data loading in a function and pass a global data directory so that all trials share it
    def load_data(data_dir="/home/taoshouzheng/Local_Connection/Algorithms/ray/"):
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])
    
        trainset = torchvision.datasets.CIFAR10(
            root=data_dir, train=True, download=True, transform=transform)
    
        testset = torchvision.datasets.CIFAR10(
            root=data_dir, train=False, download=True, transform=transform)
    
        return trainset, testset
    
    
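    # Wrap the training script in a function
    # config: the hyperparameters to train with
    # checkpoint_dir: directory to restore a checkpoint from
    # data_dir: directory where the data is loaded from and stored
    def train_cifar(config, checkpoint_dir=None, data_dir=None):
    
        # Instantiate the model
        net = Net(config["l1"], config["l2"])       # 2 hyperparameters
    
        # Default to the CPU so that training also works when no GPU is available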
        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if torch.cuda.device_count() > 1:
                # Wrap the model in nn.DataParallel for data-parallel training on multiple GPUs
                net = nn.DataParallel(net)
        net.to(device)
    
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)      # 1 hyperparameter
    
        # Restore from a checkpoint if one was passed in
        if checkpoint_dir:
            # The checkpoint holds the model state and the optimizer state
            model_state, optimizer_state = torch.load(
                os.path.join(checkpoint_dir, "checkpoint"))
            net.load_state_dict(model_state)
            optimizer.load_state_dict(optimizer_state)
    
        trainset, testset = load_data(data_dir)
    
        test_abs = int(len(trainset) * 0.8)
        # Split the training data into a training subset (80%) and a validation subset (20%)
        train_subset, val_subset = random_split(
            trainset, [test_abs, len(trainset) - test_abs])
    
        trainloader = torch.utils.data.DataLoader(
            train_subset,
            batch_size=int(config["batch_size"]),       # 1 hyperparameter
            shuffle=True,
            num_workers=8)
        valloader = torch.utils.data.DataLoader(
            val_subset,
            batch_size=int(config["batch_size"]),
            shuffle=True,
            num_workers=8)
    
        for epoch in range(10):  # loop over the dataset multiple times
            running_loss = 0.0
            epoch_steps = 0
    
            # Training loop
            for i, data in enumerate(trainloader, 0):
                # get the inputs; data is a list of [inputs, labels]
                inputs, labels = data
                inputs, labels = inputs.to(device), labels.to(device)
                # zero the parameter gradients
                optimizer.zero_grad()
                # forward + backward + optimize
                outputs = net(inputs)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()
                # print statistics
                running_loss += loss.item()
                epoch_steps += 1
                if i % 2000 == 1999:  # print every 2000 mini-batches
                    print("[%d, %5d] loss: %.3f" % (epoch + 1, i + 1,
                                                    running_loss / epoch_steps))
                    running_loss = 0.0
    
            # Validation loop: accumulate validation loss and accuracy
            val_loss = 0.0
            val_steps = 0
            total = 0
            correct = 0
            for i, data in enumerate(valloader, 0):
                with torch.no_grad():
                    inputs, labels = data
                    inputs, labels = inputs.to(device), labels.to(device)
    
                    outputs = net(inputs)
                    _, predicted = torch.max(outputs.data, 1)
                    total += labels.size(0)
                    correct += (predicted == labels).sum().item()
    
                    loss = criterion(outputs, labels)
                    val_loss += loss.cpu().numpy()
                    val_steps += 1
    
            # Save a checkpoint
            # tune.checkpoint_dir(step) yields the directory for this checkpoint
            with tune.checkpoint_dir(epoch) as checkpoint_dir:
                path = os.path.join(checkpoint_dir, "checkpoint")
                torch.save((net.state_dict(), optimizer.state_dict()), path)
            # Report the mean validation loss and the validation accuracy to Ray Tune
            tune.report(loss=(val_loss / val_steps), accuracy=correct / total)
        print("Finished Training")
    
    
    # Test set accuracy
    def test_accuracy(net, device="cpu"):
        trainset, testset = load_data()
    
        testloader = torch.utils.data.DataLoader(
            testset, batch_size=4, shuffle=False, num_workers=2)
    
        correct = 0
        total = 0
        with torch.no_grad():
            for data in testloader:
                images, labels = data
                images, labels = images.to(device), labels.to(device)
                outputs = net(images)
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
    
        return correct / total
    
    
    def main(num_samples=10, max_num_epochs=10, gpus_per_trial=2):
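        # Global data directory
        data_dir = os.path.abspath("/home/taoshouzheng/Local_Connection/Algorithms/ray/")
        # Load (and download if needed) the training data
        load_data(data_dir)
        # Configure the hyperparameter search space
        # For each trial, Ray Tune randomly samples a hyperparameter combination, trains models in parallel, and finds the best combination
        config = {
            # Custom sampling function
            "l1": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            "l2": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            # Sample from a log-uniform distribution
            "lr": tune.loguniform(1e-4, 1e-1),
            # Pick at random from a list of categorical values
            "batch_size": tune.choice([2, 4, 8, 16])
        }
        # ASHAScheduler stops badly performing trials early, based on the given metric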
        scheduler = ASHAScheduler(
            metric="loss",
            mode="min",
            max_t=max_num_epochs,
            grace_period=1,
            reduction_factor=2)
        # Print progress reports on the command line
        reporter = CLIReporter(
            # parameter_columns=["l1", "l2", "lr", "batch_size"],
            metric_columns=["loss", "accuracy", "training_iteration"])
        # Run the tuning job
        result = tune.run(
            partial(train_cifar, data_dir=data_dir),
            # Resources available to each trial
            resources_per_trial={"cpu": 8, "gpu": gpus_per_trial},
            config=config,
            num_samples=num_samples,
            scheduler=scheduler,
            progress_reporter=reporter)
    
        # Find the best trial
        best_trial = result.get_best_trial("loss", "min", "last")
        # Print the configuration and final metrics of the best trial
        print("Best trial config: {}".format(best_trial.config))
        print("Best trial final validation loss: {}".format(
            best_trial.last_result["loss"]))
        print("Best trial final validation accuracy: {}".format(
            best_trial.last_result["accuracy"]))
    
        # Evaluate the model trained with the best hyperparameters on the test set
        best_trained_model = Net(best_trial.config["l1"], best_trial.config["l2"])
        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if gpus_per_trial > 1:
                best_trained_model = nn.DataParallel(best_trained_model)
        best_trained_model.to(device)
    
        best_checkpoint_dir = best_trial.checkpoint.value
        model_state, optimizer_state = torch.load(os.path.join(
            best_checkpoint_dir, "checkpoint"))
        best_trained_model.load_state_dict(model_state)
    
        test_acc = test_accuracy(best_trained_model, device)
        print("Best trial test set accuracy: {}".format(test_acc))
    
    
    if __name__ == "__main__":
        # You can change the number of GPUs per trial here:
        main(num_samples=10, max_num_epochs=10, gpus_per_trial=0)

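    [Screenshots: results of the first through fifth runs]

    These five sets of results show that, although tuning with Ray Tune is efficient, it is best to run the search several times and compare the outcomes, especially when the hyperparameter space is complex.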

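  • Ray Tune includes the latest hyperparameter search algorithms, integrates with analysis libraries such as TensorBoard, and natively supports distributed training through Ray's distributed machine learning engine. In this tutorial, we show how to integrate Ray Tune into a PyTorch training..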

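    Hyperparameter tuning can make the difference between an average model and a highly accurate one. Often simple things, such as choosing a different learning rate or changing a network layer size, can have a dramatic impact on model performance.

    Fortunately, there are tools that help with finding the best combination of parameters. Ray Tune is an industry-standard tool for distributed hyperparameter tuning. Ray Tune includes the latest hyperparameter search algorithms, integrates with TensorBoard and other analysis libraries, and natively supports distributed training through Ray's distributed machine learning engine.

    In this tutorial, we show how to integrate Ray Tune into a PyTorch training workflow. We extend the tutorial from the PyTorch documentation for training a CIFAR10 image classifier.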

    As you will see, we only need to add some slight modifications. In particular, we need to

    • wrap data loading and training in functions,
    • make some network parameters configurable,
    • add checkpointing (optional),
    • and define the search space for the model tuning.

    To run this tutorial, make sure the following packages are installed:

    • ray[tune]: Distributed hyperparameter tuning library
    • torchvision: For the data transformers

     

    Setup / Imports

    Let's start with the imports:

    from functools import partial
    import numpy as np
    import os
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torch.optim as optim
    from torch.utils.data import random_split
    import torchvision
    import torchvision.transforms as transforms
    from ray import tune
    from ray.tune import CLIReporter
    from ray.tune.schedulers import ASHAScheduler

     

     

    Data loaders

    We wrap the data loaders in their own function and pass a global data directory. This way we can share a data directory between different trials.

    def load_data(data_dir="./data"):
        transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])

        trainset = torchvision.datasets.CIFAR10(
            root=data_dir, train=True, download=True, transform=transform)

        testset = torchvision.datasets.CIFAR10(
            root=data_dir, train=False, download=True, transform=transform)

        return trainset, testset
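
    To make the Normalize call above concrete, here is a minimal sketch in plain Python of the per-channel formula (value - mean) / std that it applies. The helper name normalize is hypothetical and is not part of torchvision; the sketch only reproduces the arithmetic for a single channel value.

```python
def normalize(value, mean=0.5, std=0.5):
    """Apply the per-channel formula (value - mean) / std to one value."""
    return (value - mean) / std


# ToTensor yields channel values in [0.0, 1.0]; check both ends and the middle.
lowest = normalize(0.0)
middle = normalize(0.5)
highest = normalize(1.0)
print(lowest, middle, highest)
```

    With a mean and standard deviation of 0.5, channel values in [0, 1] end up in [-1, 1].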

     

    Configurable neural network

    We can only tune those parameters that are configurable. In this example, we can specify the layer sizes of the fully connected layers:

    class Net(nn.Module):
        def __init__(self, l1=120, l2=84):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, l1)
            self.fc2 = nn.Linear(l1, l2)
            self.fc3 = nn.Linear(l2, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
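
    The input size 16 * 5 * 5 of the first fully connected layer follows from the 32x32 CIFAR10 images and the two convolution and pooling stages. The sketch below retraces that arithmetic in plain Python and counts the fully connected parameters for given l1 and l2; the helper names conv_out, pool_out and fc_param_count are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width of max pooling along one spatial dimension."""
    return (size - kernel) // stride + 1


def fc_param_count(flat_features, l1, l2, num_classes=10):
    """Weights plus biases of the three fully connected layers."""
    return ((flat_features * l1 + l1)
            + (l1 * l2 + l2)
            + (l2 * num_classes + num_classes))


size = 32                            # CIFAR10 images are 32x32
size = pool_out(conv_out(size, 5))   # conv1 then pool: 32 -> 28 -> 14
size = pool_out(conv_out(size, 5))   # conv2 then pool: 14 -> 10 -> 5
flat_features = 16 * size * size     # conv2 has 16 output channels

default_params = fc_param_count(flat_features, l1=120, l2=84)
largest_params = fc_param_count(flat_features, l1=256, l2=256)
print(flat_features, default_params, largest_params)
```

    Because l1 and l2 only affect the fully connected layers, the convolutional part of the network stays the same across all trials.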

     

    The train function

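    Now it gets interesting, because we introduce some changes to the example from the PyTorch documentation.

    We wrap the training script in a function train_cifar(config, checkpoint_dir=None, data_dir=None). The config parameter receives the hyperparameters we would like to train with. The checkpoint_dir parameter is used to restore checkpoints. The data_dir specifies the directory where we load and store the data, so that multiple runs can share the same data source.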

     

    net = Net(config["l1"], config["l2"])

    if checkpoint_dir:
        model_state, optimizer_state = torch.load(
            os.path.join(checkpoint_dir, "checkpoint"))
        net.load_state_dict(model_state)
        optimizer.load_state_dict(optimizer_state)

    The learning rate of the optimizer is made configurable, too:

    optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)

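    We also split the training data into a training and a validation subset. We thus train on 80% of the data and calculate the validation loss on the remaining 20%. The batch sizes with which we iterate through the training and validation sets are configurable as well.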
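
    As a rough illustration of the 80/20 split, the sketch below does the same size arithmetic on plain index lists instead of a dataset. It is not how random_split is implemented; the fixed seed and the variable names other than test_abs are assumptions made for this example.

```python
import random

num_train = 50000                       # size of the CIFAR10 training set
test_abs = int(num_train * 0.8)         # number of samples kept for training

indices = list(range(num_train))
random.Random(0).shuffle(indices)       # fixed seed only to make the sketch repeatable

train_indices = indices[:test_abs]      # 80% used for training
val_indices = indices[test_abs:]        # remaining 20% used for validation
print(len(train_indices), len(val_indices))
```

    Every sample lands in exactly one of the two subsets, so the validation loss is computed on data the optimizer never sees.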

     

    Adding (multi) GPU support with DataParallel

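    Image classification benefits largely from GPUs. Luckily, we can continue to use PyTorch's abstractions in Ray Tune. Thus, we can wrap our model in nn.DataParallel to support data-parallel training on multiple GPUs: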

    device = "cpu"
    if torch.cuda.is_available():
        device = "cuda:0"
        if torch.cuda.device_count() > 1:
            net = nn.DataParallel(net)
    net.to(device)

    By using a device variable we make sure that training also works when no GPU is available. PyTorch requires us to send our data to GPU memory explicitly, like this:

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

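    The code now supports training on CPUs, on a single GPU, and on multiple GPUs. Notably, Ray also supports fractional GPUs, so we can share GPUs among trials, as long as the model still fits in GPU memory. We'll come back to that later.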

     

    Communicating with Ray Tune

    The most interesting part is the communication with Ray Tune:

    with tune.checkpoint_dir(epoch) as checkpoint_dir:
        path = os.path.join(checkpoint_dir, "checkpoint")
        torch.save((net.state_dict(), optimizer.state_dict()), path)

    tune.report(loss=(val_loss / val_steps), accuracy=correct / total)

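    Here we first save a checkpoint and then report some metrics back to Ray Tune. Specifically, we send the validation loss and accuracy back to Ray Tune. Ray Tune can then use these metrics to decide which hyperparameter configuration leads to the best results. These metrics can also be used to stop badly performing trials early in order to avoid wasting resources on them.

    Saving checkpoints is optional; however, it is necessary if we want to use advanced schedulers such as Population Based Training. Also, by saving the checkpoint we can later load the trained models and validate them on a test set.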
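
    The two reported values are aggregated differently: the loss is averaged over batches (val_loss / val_steps), while the accuracy is averaged over samples (correct / total). The sketch below shows this with made-up per-batch numbers; the batches list is hypothetical data, and the final dictionary merely mirrors the keyword arguments passed to tune.report.

```python
# Hypothetical validation batches: (batch loss, correct predictions, batch size)
batches = [(1.9, 3, 8), (1.7, 4, 8), (2.0, 1, 4)]

val_loss = 0.0
val_steps = 0
total = 0
correct = 0
for batch_loss, batch_correct, batch_size in batches:
    val_loss += batch_loss      # one mean loss value per batch
    val_steps += 1
    correct += batch_correct    # counted per sample
    total += batch_size

metrics = {"loss": val_loss / val_steps, "accuracy": correct / total}
print(metrics)
```

    Note that a smaller final batch counts as much as a full batch in the loss average, but contributes only its own samples to the accuracy.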

     

    Full training function

    The full code example looks like this:

    def train_cifar(config, checkpoint_dir=None, data_dir=None):
        net = Net(config["l1"], config["l2"])

        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if torch.cuda.device_count() > 1:
                net = nn.DataParallel(net)
        net.to(device)

        criterion = nn.CrossEntropyLoss()
        optimizer = optim.SGD(net.parameters(), lr=config["lr"], momentum=0.9)

        if checkpoint_dir:
            model_state, optimizer_state = torch.load(
                os.path.join(checkpoint_dir, "checkpoint"))
            net.load_state_dict(model_state)
            optimizer.load_state_dict(optimizer_state)

        trainset, testset = load_data(data_dir)

        test_abs = int(len(trainset) * 0.8)
        train_subset, val_subset = random_split(
            trainset, [test_abs, len(trainset) - test_abs])

        trainloader = torch.utils.data.DataLoader(
            train_subset,
            batch_size=int(config["batch_size"]),
            shuffle=True,
            num_workers=8)
        valloader = torch.utils.data.DataLoader(
            val_subset,
            batch_size=int(config["batch_size"]),
            shuffle=True,
            num_workers=8)

        for epoch in range(10):  # loop over the dataset multiple times
            running_loss = 0.0
            epoch_steps = 0
            for i, data in enumerate(trainloader, 0):
                # get the inputs; data is a list of [inputs, labels]
                inputs, labels = data
                inputs, labels = inputs.to(device), labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward + backward + optimize
                outputs = net(inputs)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()

                # print statistics; note that running_loss is reset every 2000
                # batches but epoch_steps is not, so later printouts within an
                # epoch are scaled down rather than true running averages
                running_loss += loss.item()
                epoch_steps += 1
                if i % 2000 == 1999:  # print every 2000 mini-batches
                    print("[%d, %5d] loss: %.3f" % (epoch + 1, i + 1,
                                                    running_loss / epoch_steps))
                    running_loss = 0.0

            # Validation loss
            val_loss = 0.0
            val_steps = 0
            total = 0
            correct = 0
            for i, data in enumerate(valloader, 0):
                with torch.no_grad():
                    inputs, labels = data
                    inputs, labels = inputs.to(device), labels.to(device)

                    outputs = net(inputs)
                    _, predicted = torch.max(outputs.data, 1)
                    total += labels.size(0)
                    correct += (predicted == labels).sum().item()

                    loss = criterion(outputs, labels)
                    val_loss += loss.cpu().numpy()
                    val_steps += 1

            with tune.checkpoint_dir(epoch) as checkpoint_dir:
                path = os.path.join(checkpoint_dir, "checkpoint")
                torch.save((net.state_dict(), optimizer.state_dict()), path)

            tune.report(loss=(val_loss / val_steps), accuracy=correct / total)
        print("Finished Training")

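    Most of the code is adapted directly from the original example.

    Test set accuracy

    Commonly, the performance of a machine learning model is tested on a hold-out test set with data that has not been used for training the model. We also wrap this in a function: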

    def test_accuracy(net, device="cpu"):
        trainset, testset = load_data()

        testloader = torch.utils.data.DataLoader(
            testset, batch_size=4, shuffle=False, num_workers=2)

        correct = 0
        total = 0
        with torch.no_grad():
            for data in testloader:
                images, labels = data
                images, labels = images.to(device), labels.to(device)
                outputs = net(images)
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        return correct / total

    The function also expects a device parameter, so we can do the test set validation on a GPU.

     

    Configuring the search space

    Lastly, we need to define Ray Tune's search space. Here is an example:

     

    config = {
        "l1": tune.sample_from(lambda _: 2**np.random.randint(2, 9)),
        "l2": tune.sample_from(lambda _: 2**np.random.randint(2, 9)),
        "lr": tune.loguniform(1e-4, 1e-1),
        "batch_size": tune.choice([2, 4, 8, 16])
    }

     

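    The tune.sample_from() function lets you define your own sampling method to obtain hyperparameters. In this example, the l1 and l2 parameters should be powers of 2 between 4 and 256, so either 4, 8, 16, 32, 64, 128, or 256. The lr (learning rate) is sampled between 0.0001 and 0.1 on a logarithmic scale, as specified by tune.loguniform. Lastly, the batch size is a choice between 2, 4, 8, and 16.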
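
    To get a feel for the values this search space produces, the sketch below draws configurations with the standard library only. It imitates the distributions described above and is not Ray Tune's sampling code; the function name sample_config and the fixed seed are assumptions made for this example.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from distributions like those described above."""
    return {
        # random.randint includes both ends, unlike np.random.randint(2, 9)
        "l1": 2 ** rng.randint(2, 8),
        "l2": 2 ** rng.randint(2, 8),
        # log-uniform: uniform in log space, then mapped back with exp
        "lr": math.exp(rng.uniform(math.log(1e-4), math.log(1e-1))),
        "batch_size": rng.choice([2, 4, 8, 16]),
    }


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
configs = [sample_config(rng) for _ in range(5)]
for config in configs:
    print(config)
```

    Sampling the learning rate in log space gives each order of magnitude between 0.0001 and 0.1 roughly the same chance of being drawn.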

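    In each trial, Ray Tune will now randomly sample a combination of parameters from these search spaces. It will then train a number of models in parallel and find the best performing one among these. We also use the ASHAScheduler, which terminates badly performing trials early.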
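
    The scheduler compares trials at a few fixed iteration counts. The sketch below is a simplified guess at how those milestones could be derived from grace_period, max_t and reduction_factor; it is not Ray Tune's implementation, and the function name asha_milestones is hypothetical. For the settings used in this tutorial it yields the same iterations that appear in the "Bracket: Iter 8.000 | Iter 4.000 | Iter 2.000 | Iter 1.000" line of the output.

```python
import math


def asha_milestones(grace_period, max_t, reduction_factor):
    """Iterations at which trials are compared, smallest first (simplified)."""
    num_rungs = int(
        math.log(max_t / grace_period) / math.log(reduction_factor) + 1)
    return [grace_period * reduction_factor ** k for k in range(num_rungs)]


milestones = asha_milestones(grace_period=1, max_t=10, reduction_factor=2)
print(milestones)
```

    With reduction_factor=2, roughly half of the trials that reach a milestone are allowed to continue to the next one.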

    We wrap the train_cifar function with functools.partial to set the constant data_dir parameter. We can also tell Ray Tune what resources should be available for each trial:

    gpus_per_trial = 2
    # ...
    result = tune.run(
        partial(train_cifar, data_dir=data_dir),
        resources_per_trial={"cpu": 8, "gpu": gpus_per_trial},
        config=config,
        num_samples=num_samples,
        scheduler=scheduler,
        progress_reporter=reporter,
        checkpoint_at_end=True)
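
    The role of functools.partial in the call above can be seen in isolation with a stand-in function. In the sketch below, train_stub is a hypothetical placeholder with the same signature as train_cifar; it does no training and only returns what it received.

```python
from functools import partial


def train_stub(config, checkpoint_dir=None, data_dir=None):
    """Placeholder with the same signature as train_cifar."""
    return {"lr": config["lr"], "data_dir": data_dir}


# Bind data_dir once; the caller then only has to supply the config.
trainable = partial(train_stub, data_dir="./data")
result = trainable({"lr": 0.01})
print(result)
```

    The tuning library can thus call the function with just a config (and optionally a checkpoint directory), while every trial still receives the same data directory.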

     

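    You can specify the number of CPUs, which are then available, for example, to increase the num_workers of the PyTorch DataLoader instances. The selected number of GPUs is made visible to PyTorch in each trial. Trials do not have access to GPUs that have not been requested for them, so you do not have to worry about two trials using the same set of resources.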

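    Here we can also specify fractional GPUs, so something like gpus_per_trial=0.5 is completely valid. The trials will then share GPUs among each other. You just have to make sure that the models still fit in GPU memory.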
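
    A back-of-the-envelope way to see the effect of resources_per_trial is to count how many trials fit on a machine at once. The sketch below is a simplified model that considers only CPU and GPU counts and ignores memory and any other scheduling constraint; max_concurrent_trials is a hypothetical helper, and the totals of 32 CPUs and 2 GPUs are taken from the "Resources requested" line in the output.

```python
import math


def max_concurrent_trials(total_cpus, total_gpus, cpus_per_trial, gpus_per_trial):
    """Upper bound on simultaneously running trials (simplified model)."""
    limits = []
    if cpus_per_trial > 0:
        limits.append(math.floor(total_cpus / cpus_per_trial))
    if gpus_per_trial > 0:
        limits.append(math.floor(total_gpus / gpus_per_trial))
    return min(limits) if limits else None


cpu_only = max_concurrent_trials(32, 2, cpus_per_trial=2, gpus_per_trial=0)
half_gpu = max_concurrent_trials(32, 2, cpus_per_trial=2, gpus_per_trial=0.5)
two_gpus = max_concurrent_trials(32, 2, cpus_per_trial=8, gpus_per_trial=2)
print(cpu_only, half_gpu, two_gpus)
```

    Under this model, requesting half a GPU per trial lets four trials share the two GPUs, whereas requesting two GPUs per trial serializes the search.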

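    After training the models, we will find the best performing one and load the trained network from the checkpoint file. We then obtain the test set accuracy and report everything by printing.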
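
    Selecting the best trial by its last reported loss can be pictured with plain dictionaries. The sketch below only illustrates the selection rule; the trial records and the helper best_trial_by_last are hypothetical and do not reflect Ray Tune's internal data structures.

```python
# Hypothetical trial records: each holds the metrics reported per iteration.
trials = [
    {"name": "trial_0", "config": {"lr": 0.02},
     "results": [{"loss": 1.88}, {"loss": 1.52}]},
    {"name": "trial_1", "config": {"lr": 0.005},
     "results": [{"loss": 1.96}]},                  # stopped after one iteration
    {"name": "trial_2", "config": {"lr": 0.0005},
     "results": [{"loss": 2.10}, {"loss": 1.47}]},
]


def best_trial_by_last(trials, metric, mode):
    """Pick the trial whose last reported metric is smallest or largest."""
    pick = min if mode == "min" else max
    return pick(trials, key=lambda trial: trial["results"][-1][metric])


best = best_trial_by_last(trials, "loss", "min")
print(best["name"], best["config"])
```

    Only the final reported value of each trial matters here, so a trial that started badly can still win if it ends with the lowest loss.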

    The full main function looks like this:

    def main(num_samples=10, max_num_epochs=10, gpus_per_trial=2):
        data_dir = os.path.abspath("./data")
        load_data(data_dir)
        config = {
            "l1": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            "l2": tune.sample_from(lambda _: 2 ** np.random.randint(2, 9)),
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([2, 4, 8, 16])
        }
        scheduler = ASHAScheduler(
            metric="loss",
            mode="min",
            max_t=max_num_epochs,
            grace_period=1,
            reduction_factor=2)
        reporter = CLIReporter(
            # parameter_columns=["l1", "l2", "lr", "batch_size"],
            metric_columns=["loss", "accuracy", "training_iteration"])
        result = tune.run(
            partial(train_cifar, data_dir=data_dir),
            resources_per_trial={"cpu": 2, "gpu": gpus_per_trial},
            config=config,
            num_samples=num_samples,
            scheduler=scheduler,
            progress_reporter=reporter)

        best_trial = result.get_best_trial("loss", "min", "last")
        print("Best trial config: {}".format(best_trial.config))
        print("Best trial final validation loss: {}".format(
            best_trial.last_result["loss"]))
        print("Best trial final validation accuracy: {}".format(
            best_trial.last_result["accuracy"]))

        best_trained_model = Net(best_trial.config["l1"], best_trial.config["l2"])
        device = "cpu"
        if torch.cuda.is_available():
            device = "cuda:0"
            if gpus_per_trial > 1:
                best_trained_model = nn.DataParallel(best_trained_model)
        best_trained_model.to(device)

        best_checkpoint_dir = best_trial.checkpoint.value
        model_state, optimizer_state = torch.load(os.path.join(
            best_checkpoint_dir, "checkpoint"))
        best_trained_model.load_state_dict(model_state)

        test_acc = test_accuracy(best_trained_model, device)
        print("Best trial test set accuracy: {}".format(test_acc))


    if __name__ == "__main__":
        # You can change the number of GPUs per trial here:
        main(num_samples=10, max_num_epochs=10, gpus_per_trial=0)

    Output:

     

    Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to /var/lib/jenkins/workspace/beginner_source/data/cifar-10-python.tar.gz
    Extracting /var/lib/jenkins/workspace/beginner_source/data/cifar-10-python.tar.gz to /var/lib/jenkins/workspace/beginner_source/data
    Files already downloaded and verified
    == Status ==
    Memory usage on this node: 4.0/240.1 GiB
    Using AsyncHyperBand: num_stopped=0
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: None
    Resources requested: 2/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (9 PENDING, 1 RUNNING)
    +---------------------+----------+-------+--------------+------+------+-------------+
    | Trial name          | status   | loc   |   batch_size |   l1 |   l2 |          lr |
    |---------------------+----------+-------+--------------+------+------+-------------|
    | DEFAULT_77a44_00000 | RUNNING  |       |            4 |    8 |  128 | 0.0210161   |
    | DEFAULT_77a44_00001 | PENDING  |       |            2 |  256 |  128 | 0.000461678 |
    | DEFAULT_77a44_00002 | PENDING  |       |            8 |   32 |   16 | 0.0131231   |
    | DEFAULT_77a44_00003 | PENDING  |       |            4 |    4 |  128 | 0.00551547  |
    | DEFAULT_77a44_00004 | PENDING  |       |            2 |  256 |  256 | 0.0647615   |
    | DEFAULT_77a44_00005 | PENDING  |       |            4 |    4 |  128 | 0.0421917   |
    | DEFAULT_77a44_00006 | PENDING  |       |            2 |    8 |    8 | 0.000359613 |
    | DEFAULT_77a44_00007 | PENDING  |       |            4 |  128 |   16 | 0.00202898  |
    | DEFAULT_77a44_00008 | PENDING  |       |            2 |    4 |    8 | 0.000162963 |
    | DEFAULT_77a44_00009 | PENDING  |       |            2 |   32 |  256 | 0.000134494 |
    +---------------------+----------+-------+--------------+------+------+-------------+


    (pid=1164) Files already downloaded and verified
    (pid=1145) Files already downloaded and verified
    (pid=1104) Files already downloaded and verified
    (pid=1119) Files already downloaded and verified
    (pid=1140) Files already downloaded and verified
    (pid=1118) Files already downloaded and verified
    (pid=1098) Files already downloaded and verified
    (pid=1101) Files already downloaded and verified
    (pid=1165) Files already downloaded and verified
    (pid=1126) Files already downloaded and verified
    (pid=1164) Files already downloaded and verified
    (pid=1098) Files already downloaded and verified
    (pid=1145) Files already downloaded and verified
    (pid=1104) Files already downloaded and verified
    (pid=1119) Files already downloaded and verified
    (pid=1140) Files already downloaded and verified
    (pid=1118) Files already downloaded and verified
    (pid=1101) Files already downloaded and verified
    (pid=1165) Files already downloaded and verified
    (pid=1126) Files already downloaded and verified
    (pid=1126) [1,  2000] loss: 2.295
    (pid=1101) [1,  2000] loss: 2.310
    (pid=1165) [1,  2000] loss: 2.193
    (pid=1119) [1,  2000] loss: 2.302
    (pid=1145) [1,  2000] loss: 2.296
    (pid=1118) [1,  2000] loss: 2.326
    (pid=1104) [1,  2000] loss: 2.303
    (pid=1098) [1,  2000] loss: 2.083
    (pid=1164) [1,  2000] loss: 1.995
    (pid=1140) [1,  2000] loss: 2.377
    (pid=1126) [1,  4000] loss: 1.078
    (pid=1101) [1,  4000] loss: 1.149
    (pid=1119) [1,  4000] loss: 1.149
    (pid=1165) [1,  4000] loss: 1.020
    (pid=1118) [1,  4000] loss: 1.161
    (pid=1104) [1,  4000] loss: 1.157
    (pid=1145) [1,  4000] loss: 1.052
    (pid=1098) [1,  4000] loss: 0.883
    (pid=1164) [1,  4000] loss: 0.927
    (pid=1140) [1,  4000] loss: 1.186
    (pid=1126) [1,  6000] loss: 0.684
    (pid=1101) [1,  6000] loss: 0.760
    (pid=1119) [1,  6000] loss: 0.758
    (pid=1165) [1,  6000] loss: 0.660
    (pid=1118) [1,  6000] loss: 0.775
    (pid=1104) [1,  6000] loss: 0.770
    (pid=1145) [1,  6000] loss: 0.624
    (pid=1098) [1,  6000] loss: 0.542
    Result for DEFAULT_77a44_00002:
      accuracy: 0.2841
      date: 2020-10-09_19-56-48
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.881975656604767
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 41.3854501247406
      time_this_iter_s: 41.3854501247406
      time_total_s: 41.3854501247406
      timestamp: 1602273408
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 8.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=0
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.881975656604767
    Resources requested: 20/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (10 RUNNING)
    +---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | RUNNING  |                 |            4 |    8 |  128 | 0.0210161   |         |            |                      |
    | DEFAULT_77a44_00001 | RUNNING  |                 |            2 |  256 |  128 | 0.000461678 |         |            |                      |
    | DEFAULT_77a44_00002 | RUNNING  | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.88198 |     0.2841 |                    1 |
    | DEFAULT_77a44_00003 | RUNNING  |                 |            4 |    4 |  128 | 0.00551547  |         |            |                      |
    | DEFAULT_77a44_00004 | RUNNING  |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | RUNNING  |                 |            4 |    4 |  128 | 0.0421917   |         |            |                      |
    | DEFAULT_77a44_00006 | RUNNING  |                 |            2 |    8 |    8 | 0.000359613 |         |            |                      |
    | DEFAULT_77a44_00007 | RUNNING  |                 |            4 |  128 |   16 | 0.00202898  |         |            |                      |
    | DEFAULT_77a44_00008 | RUNNING  |                 |            2 |    4 |    8 | 0.000162963 |         |            |                      |
    | DEFAULT_77a44_00009 | RUNNING  |                 |            2 |   32 |  256 | 0.000134494 |         |            |                      |
    +---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+


    (pid=1126) [1,  8000] loss: 0.499
    (pid=1101) [1,  8000] loss: 0.559
    (pid=1119) [1,  8000] loss: 0.552
    (pid=1165) [1,  8000] loss: 0.488
    (pid=1118) [1,  8000] loss: 0.581
    (pid=1104) [1,  8000] loss: 0.579
    (pid=1145) [1,  8000] loss: 0.448
    (pid=1098) [1,  8000] loss: 0.389
    (pid=1140) [1,  6000] loss: 0.793
    (pid=1164) [2,  2000] loss: 1.870
    (pid=1101) [1, 10000] loss: 0.435
    (pid=1126) [1, 10000] loss: 0.386
    (pid=1119) [1, 10000] loss: 0.427
    (pid=1165) [1, 10000] loss: 0.390
    (pid=1118) [1, 10000] loss: 0.465
    (pid=1104) [1, 10000] loss: 0.462
    (pid=1145) [1, 10000] loss: 0.341
    (pid=1098) [1, 10000] loss: 0.302
    (pid=1101) [1, 12000] loss: 0.353
    (pid=1126) [1, 12000] loss: 0.311
    (pid=1119) [1, 12000] loss: 0.345
    (pid=1164) [2,  4000] loss: 0.938
    (pid=1140) [1,  8000] loss: 0.594
    Result for DEFAULT_77a44_00003:
      accuracy: 0.2563
      date: 2020-10-09_19-57-13
      done: true
      experiment_id: 5c01db6fb7974f6087f128418068ab25
      experiment_tag: 3_batch_size=4,l1=4,l2=128,lr=0.0055155
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.9565512576580049
      node_ip: 172.17.0.2
      pid: 1165
      should_checkpoint: true
      time_since_restore: 65.84106469154358
      time_this_iter_s: 65.84106469154358
      time_total_s: 65.84106469154358
      timestamp: 1602273433
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00003
    
    
    == Status ==
    Memory usage on this node: 8.9/240.1 GiB
    Using AsyncHyperBand: num_stopped=1
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: None | Iter 1.000: -1.919263457131386
    Resources requested: 20/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (10 RUNNING)
    +---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status   | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | RUNNING  |                 |            4 |    8 |  128 | 0.0210161   |         |            |                      |
    | DEFAULT_77a44_00001 | RUNNING  |                 |            2 |  256 |  128 | 0.000461678 |         |            |                      |
    | DEFAULT_77a44_00002 | RUNNING  | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.88198 |     0.2841 |                    1 |
    | DEFAULT_77a44_00003 | RUNNING  | 172.17.0.2:1165 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING  |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | RUNNING  |                 |            4 |    4 |  128 | 0.0421917   |         |            |                      |
    | DEFAULT_77a44_00006 | RUNNING  |                 |            2 |    8 |    8 | 0.000359613 |         |            |                      |
    | DEFAULT_77a44_00007 | RUNNING  |                 |            4 |  128 |   16 | 0.00202898  |         |            |                      |
    | DEFAULT_77a44_00008 | RUNNING  |                 |            2 |    4 |    8 | 0.000162963 |         |            |                      |
    | DEFAULT_77a44_00009 | RUNNING  |                 |            2 |   32 |  256 | 0.000134494 |         |            |                      |
    +---------------------+----------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Result for DEFAULT_77a44_00005:
      accuracy: 0.0986
      date: 2020-10-09_19-57-13
      done: true
      experiment_id: 8d41531f8ac84a2fa81eb0d04bb4809a
      experiment_tag: 5_batch_size=4,l1=4,l2=128,lr=0.042192
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 2.3523551787376404
      node_ip: 172.17.0.2
      pid: 1118
      should_checkpoint: true
      time_since_restore: 66.13440608978271
      time_this_iter_s: 66.13440608978271
      time_total_s: 66.13440608978271
      timestamp: 1602273433
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00005
    
    
    Result for DEFAULT_77a44_00000:
      accuracy: 0.1073
      date: 2020-10-09_19-57-13
      done: true
      experiment_id: 71350ebb3b9b4c2ca892c43094b6e672
      experiment_tag: 0_batch_size=4,l1=8,l2=128,lr=0.021016
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 2.306087596511841
      node_ip: 172.17.0.2
      pid: 1104
      should_checkpoint: true
      time_since_restore: 66.43020415306091
      time_this_iter_s: 66.43020415306091
      time_total_s: 66.43020415306091
      timestamp: 1602273433
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00000
    
    
    Result for DEFAULT_77a44_00007:
      accuracy: 0.4484
      date: 2020-10-09_19-57-14
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.505290996646881
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 67.45768523216248
      time_this_iter_s: 67.45768523216248
      time_total_s: 67.45768523216248
      timestamp: 1602273434
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00007
    
    
    (pid=1145) [1, 12000] loss: 0.270
    (pid=1126) [1, 14000] loss: 0.260
    (pid=1101) [1, 14000] loss: 0.301
    (pid=1119) [1, 14000] loss: 0.288
    Result for DEFAULT_77a44_00002:
      accuracy: 0.2704
      date: 2020-10-09_19-57-21
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 2
      loss: 1.9036258604049683
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 74.83478355407715
      time_this_iter_s: 33.44933342933655
      time_total_s: 74.83478355407715
      timestamp: 1602273441
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 7.3/240.1 GiB
    Using AsyncHyperBand: num_stopped=3
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.9036258604049683 | Iter 1.000: -1.9565512576580049
    Resources requested: 14/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (7 RUNNING, 3 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    |                 |            2 |  256 |  128 | 0.000461678 |         |            |                      |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.90363 |     0.2704 |                    2 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    |                 |            2 |    8 |    8 | 0.000359613 |         |            |                      |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.50529 |     0.4484 |                    1 |
    | DEFAULT_77a44_00008 | RUNNING    |                 |            2 |    4 |    8 | 0.000162963 |         |            |                      |
    | DEFAULT_77a44_00009 | RUNNING    |                 |            2 |   32 |  256 | 0.000134494 |         |            |                      |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1098) [2,  2000] loss: 1.427
    (pid=1145) [1, 14000] loss: 0.227
    (pid=1140) [1, 10000] loss: 0.476
    (pid=1101) [1, 16000] loss: 0.260
    (pid=1126) [1, 16000] loss: 0.223
    (pid=1119) [1, 16000] loss: 0.245
    (pid=1164) [3,  2000] loss: 1.876
    (pid=1098) [2,  4000] loss: 0.711
    (pid=1145) [1, 16000] loss: 0.196
    (pid=1101) [1, 18000] loss: 0.226
    (pid=1126) [1, 18000] loss: 0.194
    (pid=1119) [1, 18000] loss: 0.216
    (pid=1140) [1, 12000] loss: 0.396
    (pid=1098) [2,  6000] loss: 0.462
    (pid=1164) [3,  4000] loss: 0.927
    (pid=1126) [1, 20000] loss: 0.171
    (pid=1101) [1, 20000] loss: 0.200
    (pid=1145) [1, 18000] loss: 0.170
    (pid=1119) [1, 20000] loss: 0.188
    (pid=1098) [2,  8000] loss: 0.345
    Result for DEFAULT_77a44_00002:
      accuracy: 0.3206
      date: 2020-10-09_19-57-52
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 3
      loss: 1.9260577551841735
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 105.59961199760437
      time_this_iter_s: 30.76482844352722
      time_total_s: 105.59961199760437
      timestamp: 1602273472
      timesteps_since_restore: 0
      training_iteration: 3
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 7.3/240.1 GiB
    Using AsyncHyperBand: num_stopped=3
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.9036258604049683 | Iter 1.000: -1.9565512576580049
    Resources requested: 14/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (7 RUNNING, 3 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    |                 |            2 |  256 |  128 | 0.000461678 |         |            |                      |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.92606 |     0.3206 |                    3 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    |                 |            2 |    8 |    8 | 0.000359613 |         |            |                      |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.50529 |     0.4484 |                    1 |
    | DEFAULT_77a44_00008 | RUNNING    |                 |            2 |    4 |    8 | 0.000162963 |         |            |                      |
    | DEFAULT_77a44_00009 | RUNNING    |                 |            2 |   32 |  256 | 0.000134494 |         |            |                      |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [1, 20000] loss: 0.148
    (pid=1140) [1, 14000] loss: 0.339
    Result for DEFAULT_77a44_00008:
      accuracy: 0.1883
      date: 2020-10-09_19-57-56
      done: true
      experiment_id: 528c728f0abd4dde8df53627aa7b3cc9
      experiment_tag: 8_batch_size=2,l1=4,l2=8,lr=0.00016296
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.984449322938919
      node_ip: 172.17.0.2
      pid: 1101
      should_checkpoint: true
      time_since_restore: 109.06154918670654
      time_this_iter_s: 109.06154918670654
      time_total_s: 109.06154918670654
      timestamp: 1602273476
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00008
    
    
    Result for DEFAULT_77a44_00006:
      accuracy: 0.3722
      date: 2020-10-09_19-57-56
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.6620629720330238
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 109.24619793891907
      time_this_iter_s: 109.24619793891907
      time_total_s: 109.24619793891907
      timestamp: 1602273476
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00006
    
    
    Result for DEFAULT_77a44_00009:
      accuracy: 0.3066
      date: 2020-10-09_19-57-58
      done: false
      experiment_id: 448a03d8183b48e4a732b9974760de96
      experiment_tag: 9_batch_size=2,l1=32,l2=256,lr=0.00013449
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.8606878761410712
      node_ip: 172.17.0.2
      pid: 1119
      should_checkpoint: true
      time_since_restore: 111.55251812934875
      time_this_iter_s: 111.55251812934875
      time_total_s: 111.55251812934875
      timestamp: 1602273478
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00009
    
    
    == Status ==
    Memory usage on this node: 6.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=4
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.9036258604049683 | Iter 1.000: -1.919263457131386
    Resources requested: 12/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (6 RUNNING, 4 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    |                 |            2 |  256 |  128 | 0.000461678 |         |            |                      |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.92606 |     0.3206 |                    3 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.50529 |     0.4484 |                    1 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1098) [2, 10000] loss: 0.275
    (pid=1164) [4,  2000] loss: 1.842
    (pid=1126) [2,  2000] loss: 1.660
    Result for DEFAULT_77a44_00001:
      accuracy: 0.4374
      date: 2020-10-09_19-58-05
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 1.5289554242562502
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 118.45757269859314
      time_this_iter_s: 118.45757269859314
      time_total_s: 118.45757269859314
      timestamp: 1602273485
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 6.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=4
    Bracket: Iter 8.000: None | Iter 4.000: None | Iter 2.000: -1.9036258604049683 | Iter 1.000: -1.881975656604767
    Resources requested: 12/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (6 RUNNING, 4 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.92606 |     0.3206 |                    3 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.50529 |     0.4484 |                    1 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1119) [2,  2000] loss: 1.796
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5087
      date: 2020-10-09_19-58-08
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 2
      loss: 1.3934748243197799
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 121.18621754646301
      time_this_iter_s: 53.72853231430054
      time_total_s: 121.18621754646301
      timestamp: 1602273488
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: 77a44_00007
    
    
    (pid=1140) [1, 16000] loss: 0.298
    (pid=1126) [2,  4000] loss: 0.801
    (pid=1164) [4,  4000] loss: 0.914
    (pid=1145) [2,  2000] loss: 1.454
    (pid=1119) [2,  4000] loss: 0.886
    (pid=1098) [3,  2000] loss: 1.292
    (pid=1126) [2,  6000] loss: 0.528
    (pid=1140) [1, 18000] loss: 0.264
    Result for DEFAULT_77a44_00002:
      accuracy: 0.3437
      date: 2020-10-09_19-58-23
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 4
      loss: 1.8035019870758056
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 136.0801386833191
      time_this_iter_s: 30.48052668571472
      time_total_s: 136.0801386833191
      timestamp: 1602273503
      timesteps_since_restore: 0
      training_iteration: 4
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 6.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=4
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.6485503423623742 | Iter 1.000: -1.881975656604767
    Resources requested: 12/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (6 RUNNING, 4 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.8035  |     0.3437 |                    4 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    |                 |            2 |  256 |  256 | 0.0647615   |         |            |                      |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.39347 |     0.5087 |                    2 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [2,  4000] loss: 0.730
    (pid=1119) [2,  6000] loss: 0.570
    (pid=1098) [3,  4000] loss: 0.647
    (pid=1126) [2,  8000] loss: 0.389
    (pid=1119) [2,  8000] loss: 0.417
    (pid=1145) [2,  6000] loss: 0.476
    (pid=1164) [5,  2000] loss: 1.852
    (pid=1098) [3,  6000] loss: 0.428
    (pid=1126) [2, 10000] loss: 0.306
    (pid=1140) [1, 20000] loss: 0.237
    (pid=1119) [2, 10000] loss: 0.326
    (pid=1145) [2,  8000] loss: 0.349
    (pid=1126) [2, 12000] loss: 0.255
    (pid=1164) [5,  4000] loss: 0.934
    (pid=1098) [3,  8000] loss: 0.325
    Result for DEFAULT_77a44_00004:
      accuracy: 0.1024
      date: 2020-10-09_19-58-49
      done: true
      experiment_id: 2ca91983c1654f39a11db9cdd1e47f10
      experiment_tag: 4_batch_size=2,l1=256,l2=256,lr=0.064762
      hostname: 234fef3cc6b0
      iterations_since_restore: 1
      loss: 2.346003741002083
      node_ip: 172.17.0.2
      pid: 1140
      should_checkpoint: true
      time_since_restore: 161.9359531402588
      time_this_iter_s: 161.9359531402588
      time_total_s: 161.9359531402588
      timestamp: 1602273529
      timesteps_since_restore: 0
      training_iteration: 1
      trial_id: 77a44_00004
    
    
    == Status ==
    Memory usage on this node: 6.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=5
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.6485503423623742 | Iter 1.000: -1.919263457131386
    Resources requested: 12/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (6 RUNNING, 4 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.8035  |     0.3437 |                    4 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | RUNNING    | 172.17.0.2:1140 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.39347 |     0.5087 |                    2 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1119) [2, 12000] loss: 0.271
    (pid=1145) [2, 10000] loss: 0.276
    (pid=1126) [2, 14000] loss: 0.213
    Result for DEFAULT_77a44_00002:
      accuracy: 0.3035
      date: 2020-10-09_19-58-53
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 5
      loss: 1.8839821341514587
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 166.10145020484924
      time_this_iter_s: 30.02131152153015
      time_total_s: 166.10145020484924
      timestamp: 1602273533
      timesteps_since_restore: 0
      training_iteration: 5
      trial_id: 77a44_00002
    
    
    (pid=1098) [3, 10000] loss: 0.254
    (pid=1119) [2, 14000] loss: 0.228
    (pid=1145) [2, 12000] loss: 0.230
    (pid=1126) [2, 16000] loss: 0.187
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5319
      date: 2020-10-09_19-59-00
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 3
      loss: 1.3139552696928383
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 173.1586651802063
      time_this_iter_s: 51.972447633743286
      time_total_s: 173.1586651802063
      timestamp: 1602273540
      timesteps_since_restore: 0
      training_iteration: 3
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 6.3/240.1 GiB
    Using AsyncHyperBand: num_stopped=5
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.6485503423623742 | Iter 1.000: -1.919263457131386
    Resources requested: 10/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (5 RUNNING, 5 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.88398 |     0.3035 |                    5 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31396 |     0.5319 |                    3 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1164) [6,  2000] loss: 1.907
    (pid=1119) [2, 16000] loss: 0.198
    (pid=1145) [2, 14000] loss: 0.192
    (pid=1126) [2, 18000] loss: 0.166
    (pid=1098) [4,  2000] loss: 1.200
    (pid=1119) [2, 18000] loss: 0.177
    (pid=1164) [6,  4000] loss: 0.960
    (pid=1126) [2, 20000] loss: 0.148
    (pid=1145) [2, 16000] loss: 0.164
    (pid=1098) [4,  4000] loss: 0.599
    (pid=1119) [2, 20000] loss: 0.152
    Result for DEFAULT_77a44_00002:
      accuracy: 0.2862
      date: 2020-10-09_19-59-22
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 6
      loss: 1.9193087907791138
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 195.79263925552368
      time_this_iter_s: 29.69118905067444
      time_total_s: 195.79263925552368
      timestamp: 1602273562
      timesteps_since_restore: 0
      training_iteration: 6
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 6.3/240.1 GiB
    Using AsyncHyperBand: num_stopped=5
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.6485503423623742 | Iter 1.000: -1.919263457131386
    Resources requested: 10/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (5 RUNNING, 5 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.91931 |     0.2862 |                    6 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.66206 |     0.3722 |                    1 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31396 |     0.5319 |                    3 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.86069 |     0.3066 |                    1 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [2, 18000] loss: 0.147
    Result for DEFAULT_77a44_00006:
      accuracy: 0.4589
      date: 2020-10-09_19-59-27
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 2
      loss: 1.448237135411054
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 199.99908256530762
      time_this_iter_s: 90.75288462638855
      time_total_s: 199.99908256530762
      timestamp: 1602273567
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: 77a44_00006
    
    
    (pid=1098) [4,  6000] loss: 0.403
    Result for DEFAULT_77a44_00009:
      accuracy: 0.4358
      date: 2020-10-09_19-59-33
      done: true
      experiment_id: 448a03d8183b48e4a732b9974760de96
      experiment_tag: 9_batch_size=2,l1=32,l2=256,lr=0.00013449
      hostname: 234fef3cc6b0
      iterations_since_restore: 2
      loss: 1.5461469007849693
      node_ip: 172.17.0.2
      pid: 1119
      should_checkpoint: true
      time_since_restore: 206.13924598693848
      time_this_iter_s: 94.58672785758972
      time_total_s: 206.13924598693848
      timestamp: 1602273573
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: 77a44_00009
    
    
    == Status ==
    Memory usage on this node: 6.3/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.4971920180980116 | Iter 1.000: -1.919263457131386
    Resources requested: 10/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (5 RUNNING, 5 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.52896 |     0.4374 |                    1 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.91931 |     0.2862 |                    6 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31396 |     0.5319 |                    3 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | RUNNING    | 172.17.0.2:1119 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [2, 20000] loss: 0.130
    (pid=1164) [7,  2000] loss: 1.967
    (pid=1126) [3,  2000] loss: 1.454
    (pid=1098) [4,  8000] loss: 0.310
    (pid=1126) [3,  4000] loss: 0.715
    (pid=1164) [7,  4000] loss: 0.997
    (pid=1098) [4, 10000] loss: 0.248
    Result for DEFAULT_77a44_00001:
      accuracy: 0.5459
      date: 2020-10-09_19-59-44
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 2
      loss: 1.2801105223743245
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 217.948983669281
      time_this_iter_s: 99.49141097068787
      time_total_s: 217.948983669281
      timestamp: 1602273584
      timesteps_since_restore: 0
      training_iteration: 2
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 5.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: None | Iter 4.000: -1.8035019870758056 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.28011 |     0.5459 |                    2 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.91931 |     0.2862 |                    6 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31396 |     0.5319 |                    3 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [3,  6000] loss: 0.488
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5309
      date: 2020-10-09_19-59-50
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 4
      loss: 1.3358730784237385
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 223.8010766506195
      time_this_iter_s: 50.64241147041321
      time_total_s: 223.8010766506195
      timestamp: 1602273590
      timesteps_since_restore: 0
      training_iteration: 4
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.8/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: None | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.28011 |     0.5459 |                    2 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 1.91931 |     0.2862 |                    6 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.33587 |     0.5309 |                    4 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Result for DEFAULT_77a44_00002:
      accuracy: 0.2505
      date: 2020-10-09_19-59-52
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 7
      loss: 2.00418664560318
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 225.23884892463684
      time_this_iter_s: 29.44620966911316
      time_total_s: 225.23884892463684
      timestamp: 1602273592
      timesteps_since_restore: 0
      training_iteration: 7
      trial_id: 77a44_00002
    
    
    (pid=1145) [3,  2000] loss: 1.219
    (pid=1126) [3,  8000] loss: 0.356
    (pid=1098) [5,  2000] loss: 1.144
    (pid=1145) [3,  4000] loss: 0.632
    (pid=1164) [8,  2000] loss: 1.980
    (pid=1126) [3, 10000] loss: 0.283
    (pid=1098) [5,  4000] loss: 0.566
    (pid=1145) [3,  6000] loss: 0.410
    (pid=1164) [8,  4000] loss: 1.014
    (pid=1126) [3, 12000] loss: 0.236
    (pid=1098) [5,  6000] loss: 0.390
    (pid=1145) [3,  8000] loss: 0.304
    (pid=1126) [3, 14000] loss: 0.198
    Result for DEFAULT_77a44_00002:
      accuracy: 0.2253
      date: 2020-10-09_20-00-21
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 8
      loss: 2.1314156931877135
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 254.41000890731812
      time_this_iter_s: 29.171159982681274
      time_total_s: 254.41000890731812
      timestamp: 1602273621
      timesteps_since_restore: 0
      training_iteration: 8
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 5.7/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.28011 |     0.5459 |                    2 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 2.13142 |     0.2253 |                    8 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.33587 |     0.5309 |                    4 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1098) [5,  8000] loss: 0.297
    (pid=1145) [3, 10000] loss: 0.245
    (pid=1126) [3, 16000] loss: 0.173
    (pid=1164) [9,  2000] loss: 2.112
    (pid=1098) [5, 10000] loss: 0.235
    (pid=1145) [3, 12000] loss: 0.203
    (pid=1126) [3, 18000] loss: 0.154
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5628
      date: 2020-10-09_20-00-40
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 5
      loss: 1.2729537689715624
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 273.7186484336853
      time_this_iter_s: 49.917571783065796
      time_total_s: 273.7186484336853
      timestamp: 1602273640
      timesteps_since_restore: 0
      training_iteration: 5
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.7/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.28011 |     0.5459 |                    2 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 2.13142 |     0.2253 |                    8 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.27295 |     0.5628 |                    5 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1164) [9,  4000] loss: 1.053
    (pid=1126) [3, 20000] loss: 0.141
    (pid=1145) [3, 14000] loss: 0.170
    (pid=1098) [6,  2000] loss: 1.095
    Result for DEFAULT_77a44_00002:
      accuracy: 0.17
      date: 2020-10-09_20-00-51
      done: false
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 9
      loss: 2.1584741218566896
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 284.08941316604614
      time_this_iter_s: 29.679404258728027
      time_total_s: 284.08941316604614
      timestamp: 1602273651
      timesteps_since_restore: 0
      training_iteration: 9
      trial_id: 77a44_00002
    
    
    == Status ==
    Memory usage on this node: 5.7/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.28011 |     0.5459 |                    2 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 2.15847 |     0.17   |                    9 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.44824 |     0.4589 |                    2 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.27295 |     0.5628 |                    5 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [3, 16000] loss: 0.149
    Result for DEFAULT_77a44_00006:
      accuracy: 0.4727
      date: 2020-10-09_20-00-55
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 3
      loss: 1.4226891365654766
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 287.9995017051697
      time_this_iter_s: 88.00041913986206
      time_total_s: 287.9995017051697
      timestamp: 1602273655
      timesteps_since_restore: 0
      training_iteration: 3
      trial_id: 77a44_00006
    
    
    (pid=1098) [6,  4000] loss: 0.556
    (pid=1145) [3, 18000] loss: 0.136
    (pid=1164) [10,  2000] loss: 2.212
    (pid=1126) [4,  2000] loss: 1.392
    (pid=1098) [6,  6000] loss: 0.376
    (pid=1145) [3, 20000] loss: 0.114
    (pid=1126) [4,  4000] loss: 0.679
    (pid=1164) [10,  4000] loss: 1.133
    (pid=1098) [6,  8000] loss: 0.279
    (pid=1126) [4,  6000] loss: 0.458
    Result for DEFAULT_77a44_00001:
      accuracy: 0.5798
      date: 2020-10-09_20-01-21
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 3
      loss: 1.1820625860116911
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 314.0342721939087
      time_this_iter_s: 96.08528852462769
      time_total_s: 314.0342721939087
      timestamp: 1602273681
      timesteps_since_restore: 0
      training_iteration: 3
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 5.7/240.1 GiB
    Using AsyncHyperBand: num_stopped=6
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 8/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (4 RUNNING, 6 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.18206 |     0.5798 |                    3 |
    | DEFAULT_77a44_00002 | RUNNING    | 172.17.0.2:1164 |            8 |   32 |   16 | 0.0131231   | 2.15847 |     0.17   |                    9 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.42269 |     0.4727 |                    3 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.27295 |     0.5628 |                    5 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Result for DEFAULT_77a44_00002:
      accuracy: 0.1292
      date: 2020-10-09_20-01-21
      done: true
      experiment_id: 2cf1c1fc6eaf4ed5961e07d3ec779432
      experiment_tag: 2_batch_size=8,l1=32,l2=16,lr=0.013123
      hostname: 234fef3cc6b0
      iterations_since_restore: 10
      loss: 2.2377114813804626
      node_ip: 172.17.0.2
      pid: 1164
      should_checkpoint: true
      time_since_restore: 314.6153542995453
      time_this_iter_s: 30.525941133499146
      time_total_s: 314.6153542995453
      timestamp: 1602273681
      timesteps_since_restore: 0
      training_iteration: 10
      trial_id: 77a44_00002
    
    
    (pid=1098) [6, 10000] loss: 0.232
    (pid=1126) [4,  8000] loss: 0.342
    (pid=1145) [4,  2000] loss: 1.100
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5459
      date: 2020-10-09_20-01-30
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 6
      loss: 1.3732997598737477
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 323.68818259239197
      time_this_iter_s: 49.969534158706665
      time_total_s: 323.68818259239197
      timestamp: 1602273690
      timesteps_since_restore: 0
      training_iteration: 6
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.2/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.18206 |     0.5798 |                    3 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.42269 |     0.4727 |                    3 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.3733  |     0.5459 |                    6 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [4, 10000] loss: 0.271
    (pid=1145) [4,  4000] loss: 0.556
    (pid=1098) [7,  2000] loss: 1.034
    (pid=1126) [4, 12000] loss: 0.229
    (pid=1145) [4,  6000] loss: 0.364
    (pid=1126) [4, 14000] loss: 0.196
    (pid=1098) [7,  4000] loss: 0.541
    (pid=1145) [4,  8000] loss: 0.274
    (pid=1126) [4, 16000] loss: 0.169
    (pid=1098) [7,  6000] loss: 0.368
    (pid=1145) [4, 10000] loss: 0.215
    (pid=1126) [4, 18000] loss: 0.150
    (pid=1098) [7,  8000] loss: 0.273
    (pid=1126) [4, 20000] loss: 0.135
    (pid=1145) [4, 12000] loss: 0.182
    (pid=1098) [7, 10000] loss: 0.217
    (pid=1145) [4, 14000] loss: 0.158
    Result for DEFAULT_77a44_00007:
      accuracy: 0.576
      date: 2020-10-09_20-02-19
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 7
      loss: 1.24756854121387
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 372.3224792480469
      time_this_iter_s: 48.63429665565491
      time_total_s: 372.3224792480469
      timestamp: 1602273739
      timesteps_since_restore: 0
      training_iteration: 7
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.569687532749772 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.18206 |     0.5798 |                    3 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.42269 |     0.4727 |                    3 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.24757 |     0.576  |                    7 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Result for DEFAULT_77a44_00006:
      accuracy: 0.4961
      date: 2020-10-09_20-02-20
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 4
      loss: 1.3667119354642927
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 373.32873916625977
      time_this_iter_s: 85.32923746109009
      time_total_s: 373.32873916625977
      timestamp: 1602273740
      timesteps_since_restore: 0
      training_iteration: 4
      trial_id: 77a44_00006
    
    
    (pid=1145) [4, 16000] loss: 0.134
    (pid=1126) [5,  2000] loss: 1.317
    (pid=1098) [8,  2000] loss: 1.013
    (pid=1145) [4, 18000] loss: 0.120
    (pid=1126) [5,  4000] loss: 0.660
    (pid=1098) [8,  4000] loss: 0.521
    (pid=1126) [5,  6000] loss: 0.438
    (pid=1145) [4, 20000] loss: 0.108
    (pid=1098) [8,  6000] loss: 0.350
    (pid=1126) [5,  8000] loss: 0.331
    (pid=1098) [8,  8000] loss: 0.267
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6009
      date: 2020-10-09_20-02-54
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 4
      loss: 1.1593985119301593
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 407.62501096725464
      time_this_iter_s: 93.59073877334595
      time_total_s: 407.62501096725464
      timestamp: 1602273774
      timesteps_since_restore: 0
      training_iteration: 4
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -2.1314156931877135 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.1594  |     0.6009 |                    4 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.36671 |     0.4961 |                    4 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.24757 |     0.576  |                    7 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [5, 10000] loss: 0.271
    (pid=1098) [8, 10000] loss: 0.218
    (pid=1145) [5,  2000] loss: 0.967
    (pid=1126) [5, 12000] loss: 0.221
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5664
      date: 2020-10-09_20-03-08
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 8
      loss: 1.3161735702279955
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 421.1367325782776
      time_this_iter_s: 48.81425333023071
      time_total_s: 421.1367325782776
      timestamp: 1602273788
      timesteps_since_restore: 0
      training_iteration: 8
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.1594  |     0.6009 |                    4 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.36671 |     0.4961 |                    4 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31617 |     0.5664 |                    8 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [5,  4000] loss: 0.496
    (pid=1126) [5, 14000] loss: 0.186
    (pid=1098) [9,  2000] loss: 0.986
    (pid=1145) [5,  6000] loss: 0.332
    (pid=1126) [5, 16000] loss: 0.164
    (pid=1098) [9,  4000] loss: 0.503
    (pid=1126) [5, 18000] loss: 0.144
    (pid=1145) [5,  8000] loss: 0.243
    (pid=1098) [9,  6000] loss: 0.342
    (pid=1126) [5, 20000] loss: 0.129
    (pid=1145) [5, 10000] loss: 0.204
    (pid=1098) [9,  8000] loss: 0.266
    (pid=1145) [5, 12000] loss: 0.167
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5285
      date: 2020-10-09_20-03-45
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 5
      loss: 1.2945664445526899
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 458.353075504303
      time_this_iter_s: 85.02433633804321
      time_total_s: 458.353075504303
      timestamp: 1602273825
      timesteps_since_restore: 0
      training_iteration: 5
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.1594  |     0.6009 |                    4 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.29457 |     0.5285 |                    5 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.31617 |     0.5664 |                    8 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1098) [9, 10000] loss: 0.213
    (pid=1145) [5, 14000] loss: 0.144
    (pid=1126) [6,  2000] loss: 1.270
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5803
      date: 2020-10-09_20-03-56
      done: false
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 9
      loss: 1.3147958470012993
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 469.72292470932007
      time_this_iter_s: 48.58619213104248
      time_total_s: 469.72292470932007
      timestamp: 1602273836
      timesteps_since_restore: 0
      training_iteration: 9
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.1594  |     0.6009 |                    4 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.29457 |     0.5285 |                    5 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.3148  |     0.5803 |                    9 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [6,  4000] loss: 0.624
    (pid=1145) [5, 16000] loss: 0.127
    (pid=1098) [10,  2000] loss: 0.949
    (pid=1126) [6,  6000] loss: 0.430
    (pid=1145) [5, 18000] loss: 0.112
    (pid=1098) [10,  4000] loss: 0.502
    (pid=1126) [6,  8000] loss: 0.323
    (pid=1145) [5, 20000] loss: 0.099
    (pid=1098) [10,  6000] loss: 0.346
    (pid=1126) [6, 10000] loss: 0.258
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6221
      date: 2020-10-09_20-04-28
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 5
      loss: 1.0875221006242093
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 501.5412850379944
      time_this_iter_s: 93.91627407073975
      time_total_s: 501.5412850379944
      timestamp: 1602273868
      timesteps_since_restore: 0
      training_iteration: 5
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=7
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.08752 |     0.6221 |                    5 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.29457 |     0.5285 |                    5 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.3148  |     0.5803 |                    9 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [6, 12000] loss: 0.211
    (pid=1098) [10,  8000] loss: 0.253
    (pid=1145) [6,  2000] loss: 0.827
    (pid=1126) [6, 14000] loss: 0.177
    (pid=1098) [10, 10000] loss: 0.210
    (pid=1145) [6,  4000] loss: 0.448
    (pid=1126) [6, 16000] loss: 0.160
    Result for DEFAULT_77a44_00007:
      accuracy: 0.5713
      date: 2020-10-09_20-04-45
      done: true
      experiment_id: 1e0a3b1304eb470898956b381db607e6
      experiment_tag: 7_batch_size=4,l1=128,l2=16,lr=0.002029
      hostname: 234fef3cc6b0
      iterations_since_restore: 10
      loss: 1.2877456236266531
      node_ip: 172.17.0.2
      pid: 1098
      should_checkpoint: true
      time_since_restore: 518.6297419071198
      time_this_iter_s: 48.90681719779968
      time_total_s: 518.6297419071198
      timestamp: 1602273885
      timesteps_since_restore: 0
      training_iteration: 10
      trial_id: 77a44_00007
    
    
    == Status ==
    Memory usage on this node: 5.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 6/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (3 RUNNING, 7 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.08752 |     0.6221 |                    5 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.29457 |     0.5285 |                    5 |
    | DEFAULT_77a44_00007 | RUNNING    | 172.17.0.2:1098 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [6, 18000] loss: 0.143
    (pid=1145) [6,  6000] loss: 0.297
    (pid=1126) [6, 20000] loss: 0.127
    (pid=1145) [6,  8000] loss: 0.235
    (pid=1145) [6, 10000] loss: 0.184
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5484
      date: 2020-10-09_20-05-10
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 6
      loss: 1.2631257870631292
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 543.5542225837708
      time_this_iter_s: 85.20114707946777
      time_total_s: 543.5542225837708
      timestamp: 1602273910
      timesteps_since_restore: 0
      training_iteration: 6
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.08752 |     0.6221 |                    5 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.26313 |     0.5484 |                    6 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [6, 12000] loss: 0.157
    (pid=1126) [7,  2000] loss: 1.256
    (pid=1126) [7,  4000] loss: 0.631
    (pid=1145) [6, 14000] loss: 0.131
    (pid=1126) [7,  6000] loss: 0.407
    (pid=1145) [6, 16000] loss: 0.121
    (pid=1126) [7,  8000] loss: 0.311
    (pid=1145) [6, 18000] loss: 0.101
    (pid=1126) [7, 10000] loss: 0.243
    (pid=1145) [6, 20000] loss: 0.094
    (pid=1126) [7, 12000] loss: 0.203
    Result for DEFAULT_77a44_00001:
      accuracy: 0.61
      date: 2020-10-09_20-06-01
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 6
      loss: 1.1592615005358762
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 594.7056727409363
      time_this_iter_s: 93.1643877029419
      time_total_s: 594.7056727409363
      timestamp: 1602273961
      timesteps_since_restore: 0
      training_iteration: 6
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.15926 |     0.61   |                    6 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.26313 |     0.5484 |                    6 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [7, 14000] loss: 0.176
    (pid=1126) [7, 16000] loss: 0.156
    (pid=1145) [7,  2000] loss: 0.802
    (pid=1126) [7, 18000] loss: 0.141
    (pid=1145) [7,  4000] loss: 0.393
    (pid=1126) [7, 20000] loss: 0.123
    (pid=1145) [7,  6000] loss: 0.282
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5369
      date: 2020-10-09_20-06-34
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 7
      loss: 1.2813393794611097
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 627.4627993106842
      time_this_iter_s: 83.90857672691345
      time_total_s: 627.4627993106842
      timestamp: 1602273994
      timesteps_since_restore: 0
      training_iteration: 7
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.15926 |     0.61   |                    6 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.28134 |     0.5369 |                    7 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [7,  8000] loss: 0.206
    (pid=1126) [8,  2000] loss: 1.200
    (pid=1145) [7, 10000] loss: 0.171
    (pid=1126) [8,  4000] loss: 0.602
    (pid=1145) [7, 12000] loss: 0.138
    (pid=1126) [8,  6000] loss: 0.407
    (pid=1145) [7, 14000] loss: 0.121
    (pid=1126) [8,  8000] loss: 0.296
    (pid=1145) [7, 16000] loss: 0.109
    (pid=1126) [8, 10000] loss: 0.247
    (pid=1145) [7, 18000] loss: 0.098
    (pid=1126) [8, 12000] loss: 0.205
    (pid=1145) [7, 20000] loss: 0.086
    (pid=1126) [8, 14000] loss: 0.175
    (pid=1126) [8, 16000] loss: 0.152
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6115
      date: 2020-10-09_20-07-35
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 7
      loss: 1.1567747425308288
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 687.9970579147339
      time_this_iter_s: 93.29138517379761
      time_total_s: 687.9970579147339
      timestamp: 1602274055
      timesteps_since_restore: 0
      training_iteration: 7
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.7237946317078545 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.15677 |     0.6115 |                    7 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.28134 |     0.5369 |                    7 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [8, 18000] loss: 0.136
    (pid=1145) [8,  2000] loss: 0.721
    (pid=1126) [8, 20000] loss: 0.122
    (pid=1145) [8,  4000] loss: 0.373
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5222
      date: 2020-10-09_20-07-58
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 8
      loss: 1.3225798389766366
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 711.2751452922821
      time_this_iter_s: 83.8123459815979
      time_total_s: 711.2751452922821
      timestamp: 1602274078
      timesteps_since_restore: 0
      training_iteration: 8
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.3225798389766366 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.15677 |     0.6115 |                    7 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.32258 |     0.5222 |                    8 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [8,  6000] loss: 0.246
    (pid=1126) [9,  2000] loss: 1.150
    (pid=1145) [8,  8000] loss: 0.191
    (pid=1126) [9,  4000] loss: 0.587
    (pid=1145) [8, 10000] loss: 0.153
    (pid=1126) [9,  6000] loss: 0.383
    (pid=1145) [8, 12000] loss: 0.128
    (pid=1126) [9,  8000] loss: 0.297
    (pid=1145) [8, 14000] loss: 0.116
    (pid=1126) [9, 10000] loss: 0.239
    (pid=1145) [8, 16000] loss: 0.098
    (pid=1126) [9, 12000] loss: 0.200
    (pid=1145) [8, 18000] loss: 0.093
    (pid=1126) [9, 14000] loss: 0.173
    (pid=1126) [9, 16000] loss: 0.155
    (pid=1145) [8, 20000] loss: 0.083
    (pid=1126) [9, 18000] loss: 0.135
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6234
      date: 2020-10-09_20-09-07
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 8
      loss: 1.1474703996328957
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 780.5215935707092
      time_this_iter_s: 92.52453565597534
      time_total_s: 780.5215935707092
      timestamp: 1602274147
      timesteps_since_restore: 0
      training_iteration: 8
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.14747 |     0.6234 |                    8 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.32258 |     0.5222 |                    8 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1126) [9, 20000] loss: 0.122
    (pid=1145) [9,  2000] loss: 0.652
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5382
      date: 2020-10-09_20-09-21
      done: false
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 9
      loss: 1.2859820882213302
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 794.5377962589264
      time_this_iter_s: 83.26265096664429
      time_total_s: 794.5377962589264
      timestamp: 1602274161
      timesteps_since_restore: 0
      training_iteration: 9
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.14747 |     0.6234 |                    8 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.28598 |     0.5382 |                    9 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [9,  4000] loss: 0.336
    (pid=1126) [10,  2000] loss: 1.142
    (pid=1145) [9,  6000] loss: 0.233
    (pid=1126) [10,  4000] loss: 0.570
    (pid=1145) [9,  8000] loss: 0.178
    (pid=1126) [10,  6000] loss: 0.395
    (pid=1145) [9, 10000] loss: 0.143
    (pid=1126) [10,  8000] loss: 0.299
    (pid=1145) [9, 12000] loss: 0.118
    (pid=1126) [10, 10000] loss: 0.228
    (pid=1145) [9, 14000] loss: 0.104
    (pid=1126) [10, 12000] loss: 0.196
    (pid=1145) [9, 16000] loss: 0.093
    (pid=1126) [10, 14000] loss: 0.169
    (pid=1126) [10, 16000] loss: 0.151
    (pid=1145) [9, 18000] loss: 0.083
    (pid=1126) [10, 18000] loss: 0.132
    (pid=1145) [9, 20000] loss: 0.078
    (pid=1126) [10, 20000] loss: 0.118
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6124
      date: 2020-10-09_20-10-40
      done: false
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 9
      loss: 1.2186276267750566
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 873.050055027008
      time_this_iter_s: 92.52846145629883
      time_total_s: 873.050055027008
      timestamp: 1602274240
      timesteps_since_restore: 0
      training_iteration: 9
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=8
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.21863 |     0.6124 |                    9 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.28598 |     0.5382 |                    9 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Result for DEFAULT_77a44_00006:
      accuracy: 0.5454
      date: 2020-10-09_20-10-45
      done: true
      experiment_id: 696157fc029f42e781f0779431a5902f
      experiment_tag: 6_batch_size=2,l1=8,l2=8,lr=0.00035961
      hostname: 234fef3cc6b0
      iterations_since_restore: 10
      loss: 1.290222985061258
      node_ip: 172.17.0.2
      pid: 1126
      should_checkpoint: true
      time_since_restore: 878.2885060310364
      time_this_iter_s: 83.75070977210999
      time_total_s: 878.2885060310364
      timestamp: 1602274245
      timesteps_since_restore: 0
      training_iteration: 10
      trial_id: 77a44_00006
    
    
    == Status ==
    Memory usage on this node: 4.6/240.1 GiB
    Using AsyncHyperBand: num_stopped=9
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 4/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (2 RUNNING, 8 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.21863 |     0.6124 |                    9 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | RUNNING    | 172.17.0.2:1126 |            2 |    8 |    8 | 0.000359613 | 1.29022 |     0.5454 |                   10 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    (pid=1145) [10,  2000] loss: 0.564
    (pid=1145) [10,  4000] loss: 0.304
    (pid=1145) [10,  6000] loss: 0.210
    (pid=1145) [10,  8000] loss: 0.165
    (pid=1145) [10, 10000] loss: 0.132
    (pid=1145) [10, 12000] loss: 0.107
    (pid=1145) [10, 14000] loss: 0.096
    (pid=1145) [10, 16000] loss: 0.089
    (pid=1145) [10, 18000] loss: 0.082
    (pid=1145) [10, 20000] loss: 0.071
    Result for DEFAULT_77a44_00001:
      accuracy: 0.6152
      date: 2020-10-09_20-12-10
      done: true
      experiment_id: f3958015aa1f4ab2a11c7e4fc8b68da6
      experiment_tag: 1_batch_size=2,l1=256,l2=128,lr=0.00046168
      hostname: 234fef3cc6b0
      iterations_since_restore: 10
      loss: 1.3026221742785826
      node_ip: 172.17.0.2
      pid: 1145
      should_checkpoint: true
      time_since_restore: 963.3746852874756
      time_this_iter_s: 90.32463026046753
      time_total_s: 963.3746852874756
      timestamp: 1602274330
      timesteps_since_restore: 0
      training_iteration: 10
      trial_id: 77a44_00001
    
    
    == Status ==
    Memory usage on this node: 4.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=10
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 2/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (1 RUNNING, 9 TERMINATED)
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc             |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |                 |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | RUNNING    | 172.17.0.2:1145 |            2 |  256 |  128 | 0.000461678 | 1.30262 |     0.6152 |                   10 |
    | DEFAULT_77a44_00002 | TERMINATED |                 |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |                 |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |                 |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |                 |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | TERMINATED |                 |            2 |    8 |    8 | 0.000359613 | 1.29022 |     0.5454 |                   10 |
    | DEFAULT_77a44_00007 | TERMINATED |                 |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |                 |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |                 |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-----------------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    == Status ==
    Memory usage on this node: 4.1/240.1 GiB
    Using AsyncHyperBand: num_stopped=10
    Bracket: Iter 8.000: -1.3193767046023162 | Iter 4.000: -1.3512925069440156 | Iter 2.000: -1.448237135411054 | Iter 1.000: -1.919263457131386
    Resources requested: 0/32 CPUs, 0/2 GPUs, 0.0/157.76 GiB heap, 0.0/49.41 GiB objects
    Result logdir: /var/lib/jenkins/ray_results/DEFAULT
    Number of trials: 10 (10 TERMINATED)
    +---------------------+------------+-------+--------------+------+------+-------------+---------+------------+----------------------+
    | Trial name          | status     | loc   |   batch_size |   l1 |   l2 |          lr |    loss |   accuracy |   training_iteration |
    |---------------------+------------+-------+--------------+------+------+-------------+---------+------------+----------------------|
    | DEFAULT_77a44_00000 | TERMINATED |       |            4 |    8 |  128 | 0.0210161   | 2.30609 |     0.1073 |                    1 |
    | DEFAULT_77a44_00001 | TERMINATED |       |            2 |  256 |  128 | 0.000461678 | 1.30262 |     0.6152 |                   10 |
    | DEFAULT_77a44_00002 | TERMINATED |       |            8 |   32 |   16 | 0.0131231   | 2.23771 |     0.1292 |                   10 |
    | DEFAULT_77a44_00003 | TERMINATED |       |            4 |    4 |  128 | 0.00551547  | 1.95655 |     0.2563 |                    1 |
    | DEFAULT_77a44_00004 | TERMINATED |       |            2 |  256 |  256 | 0.0647615   | 2.346   |     0.1024 |                    1 |
    | DEFAULT_77a44_00005 | TERMINATED |       |            4 |    4 |  128 | 0.0421917   | 2.35236 |     0.0986 |                    1 |
    | DEFAULT_77a44_00006 | TERMINATED |       |            2 |    8 |    8 | 0.000359613 | 1.29022 |     0.5454 |                   10 |
    | DEFAULT_77a44_00007 | TERMINATED |       |            4 |  128 |   16 | 0.00202898  | 1.28775 |     0.5713 |                   10 |
    | DEFAULT_77a44_00008 | TERMINATED |       |            2 |    4 |    8 | 0.000162963 | 1.98445 |     0.1883 |                    1 |
    | DEFAULT_77a44_00009 | TERMINATED |       |            2 |   32 |  256 | 0.000134494 | 1.54615 |     0.4358 |                    2 |
    +---------------------+------------+-------+--------------+------+------+-------------+---------+------------+----------------------+
    
    
    
    
    Best trial config: {'l1': 128, 'l2': 16, 'lr': 0.0020289809406172947, 'batch_size': 4}
    Best trial final validation loss: 1.2877456236266531
    Best trial final validation accuracy:

     

    Running the code may produce output like the sample shown above.

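    Most trials were stopped early to avoid wasting resources. The best trial, selected by validation loss, reached a validation accuracy of about 57% (0.5713 in the table above), which can be confirmed on the test set.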
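
    The early stopping seen in the log comes from the AsyncHyperBand (ASHA) scheduler. The following is a minimal plain-Python sketch of the idea only, not Ray's implementation: at each rung, a trial continues only if its metric is in the top 1/reduction_factor of the results recorded at that rung. The function name and the sample values are illustrative assumptions.

```python
def asha_should_continue(rung_results, new_value, reduction_factor=2):
    """Return True if new_value is in the top 1/reduction_factor of the rung.

    rung_results: metric values (higher is better) already recorded at this rung.
    This is a simplified illustration, not Ray Tune's actual cutoff rule.
    """
    results = sorted(rung_results + [new_value], reverse=True)
    keep = max(1, len(results) // reduction_factor)  # size of the surviving fraction
    cutoff = results[keep - 1]
    return new_value >= cutoff


# Negated validation losses (higher is better), similar in spirit to the
# "Bracket" line of the log. The numbers are made up for illustration.
rung_1 = [-2.306, -1.957, -2.346, -2.352]
print(asha_should_continue(rung_1, -1.30))  # a clearly better trial
print(asha_should_continue(rung_1, -2.40))  # a clearly worse trial
```

    Because the comparison happens at a few fixed iterations only, weak configurations consume one or two iterations instead of the full ten.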

    That's it! You can now tune the hyperparameters of your PyTorch models.

  • Ray Tune Hyperparameter Optimization Framework

    1k+ reads 2018-07-24 09:50:21

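    Ray Tune is a scalable hyperparameter optimization framework for reinforcement learning and deep learning. You can go from running one experiment on a single machine to running on a large cluster with efficient search algorithms, without changing your code.

    See also: the functions mentioned in this post.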

     

    1. Quick start

    First install Ray with the command pip install ray.

    A simple example:

    import ray
    import ray.tune as tune
    
    ray.init()
    # train_func is the training function defined in the next snippet
    tune.register_trainable("train_func", train_func)
    
    all_trials = tune.run_experiments({
        "my_experiment": {
            "run": "train_func",
            "stop": {"mean_accuracy": 99},
            "config": {
                "lr": tune.grid_search([0.2, 0.4, 0.6]),
                "momentum": tune.grid_search([0.1, 0.2]),
            }
        }
    })

    Add two lines to the function you want to tune (this example uses PyTorch, but Ray Tune works with any deep learning framework):

    import torch
    import torch.nn.functional as F

    def train_func(config, reporter):  # add a reporter arg
        model = NeuralNet()
        optimizer = torch.optim.SGD(
            model.parameters(), lr=config["lr"], momentum=config["momentum"])
        dataset = ( ... )

        for idx, (data, target) in enumerate(dataset):
            # ...
            optimizer.zero_grad()  # clear gradients left over from the previous step
            output = model(data)
            loss = F.mse_loss(output, target)  # the functional API is mse_loss, not MSELoss
            loss.backward()
            optimizer.step()
            accuracy = eval_accuracy(...)
            reporter(timesteps_total=idx, mean_accuracy=accuracy)  # report metrics

    This PyTorch script uses Ray Tune to run a small grid search over train_func, reporting status on the command line until the stopping condition mean_accuracy >= 99 is reached:

    == Status ==
    Using FIFO scheduling algorithm.
    Resources used: 4/8 CPUs, 0/0 GPUs
    Result logdir: ~/ray_results/my_experiment
     - train_func_0_lr=0.2,momentum=1:  RUNNING [pid=6778], 209 s, 20604 ts, 7.29 acc
     - train_func_1_lr=0.4,momentum=1:  RUNNING [pid=6780], 208 s, 20522 ts, 53.1 acc
     - train_func_2_lr=0.6,momentum=1:  TERMINATED [pid=6789], 21 s, 2190 ts, 100 acc
     - train_func_3_lr=0.2,momentum=2:  RUNNING [pid=6791], 208 s, 41004 ts, 8.37 acc
     - train_func_4_lr=0.4,momentum=2:  RUNNING [pid=6800], 209 s, 41204 ts, 70.1 acc
     - train_func_5_lr=0.6,momentum=2:  TERMINATED [pid=6809], 10 s, 2164 ts, 100 acc
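
    The six trials in this status output correspond to the Cartesian product of the two grid_search lists in the experiment config. A minimal plain-Python sketch of that expansion follows; it is an illustration rather than Ray Tune's own variant generator, and expand_grid is a hypothetical helper name.

```python
from itertools import product

# The same value lists that the experiment passes to tune.grid_search.
search_space = {
    "lr": [0.2, 0.4, 0.6],
    "momentum": [0.1, 0.2],
}


def expand_grid(space):
    """Return one config dict per combination of the listed values."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]


configs = expand_grid(search_space)
print(len(configs))  # 3 learning rates x 2 momentum values
for config in configs:
    print(config)
```

    Grid search grows multiplicatively with every added parameter, which is why the later sections turn to sampling and early stopping.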

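    To report incremental progress, train_func periodically calls the reporter function passed in by Ray Tune, returning the current timestep and any other metrics defined in ray.tune.result.TrainingResult. Incremental results are synced to local disk on the head node of the cluster.

    tune.run_experiments returns a list of Trial objects, whose results you can inspect through trial.last_result.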

     

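    2. Features

    Ray Tune has the following features:

    1. Scalable implementations of search algorithms such as Population Based Training (PBT), the median stopping rule, model-based optimization (HyperOpt), and HyperBand.
    2. Integration with visualization tools such as TensorBoard, rllab's VisKit, and parallel coordinates visualization.
    3. Flexible trial variant generation, including grid search, random search, and conditional parameter distributions.
    4. Resource-aware scheduling, including support for concurrently running algorithms that may themselves be parallel and distributed.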

     

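    3. Concepts

    Ray Tune schedules many trials across a cluster. Each trial runs a user-defined Python function or class and is parameterized by a config passed to the user code.

    To run a given function, first register it under a name with register_trainable. This makes the function known to all Ray workers.

    ray.tune.register_trainable(name, trainable)

    Ray Tune provides the run_experiments function, which generates and runs the trials described by the experiment specification. Trials are scheduled and managed by a trial scheduler that implements the search algorithm (FIFO by default).

    ray.tune.run_experiments(experiments, scheduler=None, with_server=False, server_port=4321, verbose=True, queue_trials=False)

    Ray Tune can be used anywhere Ray can, for example on your own computer with ray.init() embedded in a Python script, or on an autoscaling cluster for large-scale parallelism.

    See the examples for details.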

     

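    4. Trial schedulers

    By default, Ray Tune schedules trials in order using the FIFOScheduler class. You can also specify a custom scheduling algorithm that stops trials early, perturbs parameters, or incorporates suggestions from an external service. The currently implemented trial schedulers include Population Based Training (PBT), the median stopping rule, model-based optimization (HyperOpt), and HyperBand.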

    run_experiments({...}, scheduler=AsyncHyperBandScheduler())
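
    To make the idea of an early-stopping scheduler concrete, here is a minimal plain-Python sketch of the median stopping rule mentioned above. It is a simplified illustration under stated assumptions (higher metric is better, one result per step), not Ray Tune's scheduler class, and median_stop is a hypothetical name.

```python
from statistics import median


def median_stop(trial_history, other_histories, step):
    """Return True if the trial should be stopped at `step` (1-based).

    The trial is stopped when its best result so far is below the median of
    the other trials' running averages up to the same step.
    """
    running_avgs = [sum(h[:step]) / step
                    for h in other_histories if len(h) >= step]
    if not running_avgs:
        return False  # nothing to compare against yet
    return max(trial_history[:step]) < median(running_avgs)


# Hypothetical accuracy histories of three other trials.
others = [[0.50, 0.60, 0.70], [0.40, 0.50, 0.60], [0.30, 0.35, 0.40]]
print(median_stop([0.10, 0.20, 0.25], others, step=3))  # weak trial
print(median_stop([0.55, 0.65, 0.75], others, step=3))  # strong trial
```

    A FIFO scheduler never makes this kind of decision; it simply runs every trial until its stopping condition is met.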

     

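    5. Handling large datasets

    You often need to compute a large object (e.g. training data or model weights) on the driver and use it in every trial. Ray Tune provides the pin_in_object_store utility function for broadcasting such large objects. Objects pinned this way are never evicted from the Ray object store while the driver process is running, and can be retrieved efficiently from any task with get_pinned_object.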

    import ray
    from ray.tune import register_trainable, run_experiments
    from ray.tune.util import pin_in_object_store, get_pinned_object
    
    import numpy as np
    
    ray.init()
    
    # X_id can be referenced in closures
    X_id = pin_in_object_store(np.random.random(size=100000000))
    
    def f(config, reporter):
        X = get_pinned_object(X_id)
        # use X
    
    register_trainable("f", f)
    run_experiments(...)

     

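    6. HyperOpt integration

    HyperOptScheduler is a trial scheduler backed by HyperOpt that performs sequential model-based hyperparameter optimization. To use this scheduler, install HyperOpt with the following command (GitHub no longer serves the unencrypted git:// protocol, so the HTTPS URL is used):

    $ pip install --upgrade git+https://github.com/hyperopt/hyperopt.git

    An example

    Note:

    HyperOptScheduler treats the reward attribute as a metric to be maximized. If you want to minimize a loss, make sure your function/class reports mean_loss and pass reward_attr="neg_mean_loss" to the HyperOptScheduler constructor.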
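
    The reason for the negated metric can be shown in a few lines of plain Python: a component that only maximizes can still minimize a loss if it is given the negated loss. The trial names and loss values below are hypothetical.

```python
# Hypothetical mean_loss values reported by three trials.
results = {"trial_a": 0.42, "trial_b": 0.31, "trial_c": 0.57}

# A maximizing scheduler minimizes the loss by maximizing its negation.
neg_mean_loss = {name: -loss for name, loss in results.items()}

best_by_reward = max(neg_mean_loss, key=neg_mean_loss.get)
best_by_loss = min(results, key=results.get)
print(best_by_reward, best_by_loss)  # both selections name the same trial
```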

     

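    7. Trial checkpointing

    To enable checkpointing, you must implement a Trainable class (trainable functions cannot be checkpointed because they never return control to their caller). The simplest way is to subclass the predefined Trainable class and implement its _train, _save, and _restore abstract methods (example). Implementing this interface is required to support resource multiplexing in schedulers such as HyperBand and PBT.

    For TensorFlow model training, this looks as follows (full TensorFlow example):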

    class MyClass(Trainable):
        def _setup(self):
            self.saver = tf.train.Saver()
            self.sess = ...
            self.iteration = 0
    
        def _train(self):
            self.sess.run(...)
            self.iteration += 1
    
        def _save(self, checkpoint_dir):
            return self.saver.save(
                self.sess, checkpoint_dir + "/save",
                global_step=self.iteration)
    
        def _restore(self, path):
            return self.saver.restore(self.sess, path)

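    Checkpoints can also provide fault tolerance for experiments. Set checkpoint_freq: N and max_failures: M to checkpoint each trial every N iterations and to recover each trial from up to M crashes, for example: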

    run_experiments({
        "my_experiment": {
            ...
            "checkpoint_freq": 10,
            "max_failures": 5,
        },
    })
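
    How the two settings interact can be sketched with a small plain-Python simulation. This is an illustration of the behaviour described above, not Ray Tune's recovery code; run_with_recovery, fail_prob, and seed are hypothetical names introduced for the sketch.

```python
import random


def run_with_recovery(total_iters, checkpoint_freq, max_failures,
                      fail_prob=0.2, seed=0):
    """Simulate a trial that checkpoints every `checkpoint_freq` iterations
    and is restored from its last checkpoint after a crash."""
    rng = random.Random(seed)
    iteration = 0    # current training iteration
    checkpoint = 0   # iteration stored in the last checkpoint
    failures = 0
    while iteration < total_iters:
        if rng.random() < fail_prob:       # simulated crash
            failures += 1
            if failures > max_failures:
                return "ERROR", iteration, failures
            iteration = checkpoint         # work since the checkpoint is lost
            continue
        iteration += 1
        if iteration % checkpoint_freq == 0:
            checkpoint = iteration
    return "TERMINATED", iteration, failures


print(run_with_recovery(30, checkpoint_freq=10, max_failures=5))
```

    A smaller checkpoint_freq loses less work per crash but spends more time saving; max_failures bounds how long a persistently failing trial can hold resources.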

    The following class interface must be implemented to enable checkpointing:

    class ray.tune.trainable.Trainable(config=None, logger_creator=None)

     

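    8. Client API

    You can use the Tune client API to modify a running experiment by adding or removing trials. To do so, make sure the requests library is installed:

    $ pip install requests

    To use the client API, start your experiment with with_server=True: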

    run_experiments({...}, with_server=True, server_port=4321)

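    Then, on the client side, you can use the following class. The server address defaults to localhost:4321. On a cluster, you may want to forward this port (e.g. ssh -L <local_port>:localhost:<remote_port> <address>) so that you can use the client from your local machine.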

    class ray.tune.web_server.TuneClient(tune_address)

    A Client API example

     

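    9. Examples

    You can find a list of examples that use Ray Tune and its various features here, including examples with Keras, TensorFlow, and Population Based Training.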

     

     

     

     

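  • Integration of HyperSpace with Ray Tune's hyperparameter search. Requirements: a definition of the objective function, which takes a config argument (a dictionary supplied by Ray Tune); an argparse Namespace object containing args.trials (number of trials), args.out (intermediate results...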
  • Ray Tune

    Ray Tune

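    Ray Tune is a standard hyperparameter tuning tool. It includes a range of parameter search algorithms, supports distributed computation, and is simple to use. It works with training frameworks such as PyTorch and TensorFlow, and supports TensorBoard visualization.

    Hyperparameters

    • Neural architecture search (number of layers, number of nodes, layer types, connectivity)
    • Learning rate
    • Optimizer
    • Loss weight

    Usage

    Installation: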

    pip install ray torchvision
    

    Integrating Tune into a PyTorch pipeline

    • class-based ray.tune.Trainable API
    • function-based tune.run API

    pytorch class-based ray.tune.Trainable example:

    # https://github.com/ray-project/ray/blob/master/python/ray/tune/examples/mnist_pytorch_trainable.py
    from __future__ import print_function
    
    import argparse
    import os
    import torch
    import torch.optim as optim
    
    import ray
    from ray import tune
    from ray.tune.schedulers import ASHAScheduler
    from ray.tune.examples.mnist_pytorch import (train, test, get_data_loaders,
                                                 ConvNet)
    
    # Change these values if you want the training to run quicker or slower.
    EPOCH_SIZE = 512
    TEST_SIZE = 256
    
    # Training settings
    parser = argparse.ArgumentParser(description="PyTorch MNIST Example")
    parser.add_argument(
        "--use-gpu",
        action="store_true",
        default=False,
        help="enables CUDA training")
    parser.add_argument(
        "--ray-address", type=str, help="The Redis address of the cluster.")
    parser.add_argument(
        "--smoke-test", action="store_true", help="Finish quickly for testing")
    
    
    # Below comments are for documentation purposes only.
    # yapf: disable
    # __trainable_example_begin__
    class TrainMNIST(tune.Trainable):
        def setup(self, config):
            use_cuda = config.get("use_gpu") and torch.cuda.is_available()
            self.device = torch.device("cuda" if use_cuda else "cpu")
            self.train_loader, self.test_loader = get_data_loaders()
            self.model = ConvNet().to(self.device)
            self.optimizer = optim.SGD(
                self.model.parameters(),
                lr=config.get("lr", 0.01), 
                momentum=config.get("momentum", 0.9))
    
        def step(self):
            train(
                self.model, self.optimizer, self.train_loader, device=self.device)
            acc = test(self.model, self.test_loader, self.device)
            return {"mean_accuracy": acc}
    
        def save_checkpoint(self, checkpoint_dir):
            checkpoint_path = os.path.join(checkpoint_dir, "model.pth")
            torch.save(self.model.state_dict(), checkpoint_path)
            return checkpoint_path
    
        def load_checkpoint(self, checkpoint_path):
            self.model.load_state_dict(torch.load(checkpoint_path))
    
    
    # __trainable_example_end__
    # yapf: enable
    
    if __name__ == "__main__":
        args = parser.parse_args()
        ray.init(address=args.ray_address, num_cpus=6 if args.smoke_test else None)
        sched = ASHAScheduler()
        analysis = tune.run(
            TrainMNIST,
            metric="mean_accuracy",  # metric used to compare trials
            mode="max",  # larger is better
            scheduler=sched,  # trial scheduler (handles early stopping)
            stop={
                "mean_accuracy": 0.95,
                "training_iteration": 3 if args.smoke_test else 20,
            },  # stopping conditions
            resources_per_trial={
                "cpu": 3,
                "gpu": int(args.use_gpu)
            },  # resources required by each trial
            num_samples=1 if args.smoke_test else 20,  # number of trials to run
            checkpoint_at_end=True,
            checkpoint_freq=3,
            config={
                "args": args,
                "lr": tune.uniform(0.001, 0.1),
                "momentum": tune.uniform(0.1, 0.9),
            })  # search space for the hyperparameters
    
        print("Best config is:", analysis.best_config)
    

    PyTorch function-based tune.run API example:

    
    from __future__ import print_function
    
    import argparse
    import os
    import torch
    import torch.optim as optim
    
    import ray
    from ray import tune
    from ray.tune.schedulers import ASHAScheduler
    from ray.tune.examples.mnist_pytorch import (train, test, get_data_loaders,
                                                 ConvNet)
    
    def train_mnist(config):
        use_cuda = torch.cuda.is_available()
        device = torch.device("cuda" if use_cuda else "cpu")
        train_loader, test_loader = get_data_loaders()
        model = ConvNet().to(device)
    
        optimizer = optim.SGD(
            model.parameters(), lr=config["lr"], momentum=config["momentum"])
    
        while True:
            train(model, optimizer, train_loader, device)
            acc = test(model, test_loader, device)
            # Set this to run Tune.
            tune.report(mean_accuracy=acc)
    
    
    if __name__ == "__main__":
        parser = argparse.ArgumentParser(description="PyTorch MNIST Example")
        parser.add_argument(
            "--cuda",
            action="store_true",
            default=False,
            help="Enables GPU training")
        parser.add_argument(
            "--smoke-test", action="store_true", help="Finish quickly for testing")
        parser.add_argument(
            "--ray-address",
            help="Address of Ray cluster for seamless distributed execution.")
        args = parser.parse_args()
        if args.ray_address:
            ray.init(address=args.ray_address)
        else:
            ray.init(num_cpus=2 if args.smoke_test else None)
    
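        # for early stopping (ASHAScheduler is the class imported above;
        # AsyncHyperBandScheduler was never imported and raised a NameError)
        sched = ASHAScheduler()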
    
        analysis = tune.run(
            train_mnist,
            metric="mean_accuracy",
            mode="max",
            name="exp",
            scheduler=sched,
            stop={
                "mean_accuracy": 0.98,
                "training_iteration": 5 if args.smoke_test else 100
            },
            resources_per_trial={
                "cpu": 2,
                "gpu": int(args.cuda)  # set this for GPUs
            },
            num_samples=1 if args.smoke_test else 50,
            config={
                "lr": tune.loguniform(1e-4, 1e-2),
                "momentum": tune.uniform(0.1, 0.9),
            })
    
        print("Best config is:", analysis.best_config)
    

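    Ray Tune runs multiple trials concurrently, based on the resources available on the machine and the resources requested per trial (each trial runs one set of parameters).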
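
    The arithmetic behind that statement can be sketched in plain Python: the number of trials that fit at once is bounded by the scarcest resource. This is a simplified estimate, not Ray's scheduler, and max_concurrent_trials and the machine sizes are hypothetical.

```python
def max_concurrent_trials(available, per_trial):
    """Upper bound on trials that fit at once, given resource dicts."""
    limits = [
        available.get(name, 0) // amount
        for name, amount in per_trial.items()
        if amount > 0  # resources a trial does not request impose no limit
    ]
    return min(limits) if limits else 0


# Hypothetical machine with 8 CPUs and 2 GPUs.
machine = {"cpu": 8, "gpu": 2}
print(max_concurrent_trials(machine, {"cpu": 2, "gpu": 0}))  # CPU-only trials
print(max_concurrent_trials(machine, {"cpu": 2, "gpu": 1}))  # GPU trials
```

    With these numbers, CPU-only trials are limited by the CPUs, while requesting one GPU per trial makes the two GPUs the bottleneck.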

    Ways to generate parameter values

    • tune.grid_search([0.1, 0.2, 0.3])
    • tune.sample_from(lambda spec: np.random.uniform(100)) (custom lambda function)
    • tune.loguniform(1e-4, 1e-2)
    • tune.uniform(0.1, 0.9)
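
    The difference between the uniform and log-uniform options is easiest to see with a plain-Python sketch. It mimics the sampling idea only and is not Ray Tune's implementation; the helper functions below are hypothetical stand-ins.

```python
import math
import random


def uniform(low, high, rng=random):
    """Sample uniformly between low and high."""
    return rng.uniform(low, high)


def loguniform(low, high, rng=random):
    """Sample so that log(value) is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
lrs = [loguniform(1e-4, 1e-2, rng) for _ in range(1000)]
momenta = [uniform(0.1, 0.9, rng) for _ in range(1000)]

# Roughly half of the log-uniform samples fall below the geometric midpoint 1e-3,
# whereas a plain uniform draw over the same range would put only about 9% there.
share_below_1e3 = sum(lr < 1e-3 for lr in lrs) / len(lrs)
print(min(lrs), max(lrs), share_below_1e3)
```

    This is why learning rates, which span orders of magnitude, are usually sampled log-uniformly, while bounded quantities such as momentum are sampled uniformly.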

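    Search algorithms

    Several search algorithms are supported:

    • Grid Search (brute force: enumerates every parameter combination)
    • Random Search (used when the distribution of good hyperparameters is unknown; see https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf)
    • BayesOpt (fits a posterior over the objective from the results of the parameters tried so far, uses it to predict the parameters most likely to give the minimum, evaluates them, and updates the posterior)
    • HyperOpt (uses the Tree-structured Parzen Estimator, TPE, as its surrogate model)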
    • SigOpt
    • Nevergrad
    • Scikit-Optimize
    • Ax
    • BOHB
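
    Example using HyperOpt as the search algorithm:

    import numpy as np
    from hyperopt import hp
    from ray.tune.suggest import ConcurrencyLimiter
    from ray.tune.suggest.hyperopt import HyperOptSearch

    space = {
        # hp.loguniform takes its bounds in log space: it returns exp(uniform(low, high))
        "lr": hp.loguniform("lr", np.log(1e-10), np.log(0.1)),
        "momentum": hp.uniform("momentum", 0.1, 0.9),
    }

    # Assumes the Ray 1.x API, matching tune.report above; older releases
    # used reward_attr and max_concurrent arguments on HyperOptSearch instead.
    hyperopt_search = ConcurrencyLimiter(
        HyperOptSearch(space, metric="mean_accuracy", mode="max"),
        max_concurrent=2)

    analysis = tune.run(train_mnist, num_samples=10, search_alg=hyperopt_search)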
    

    Distributed training

    Similar tools

    Microsoft NNI (Neural Network Intelligence)

    • hyper-parameter tuning and neural architecture search
    • find good models, which includes good neural architecture, good hyper-parameters, good model compression approach

    ray tune:

    • hyper-parameter tuning and reinforcement learning algorithm
    • distributed framework (Tune uses a master-worker architecture to centralise decision-making and communicates)

    HyperOpt
    Hyperband

    Other tools include Google Vizier, Amazon SageMaker, and Facebook HiPlot. References:
    https://analyticsindiamag.com/top-hyperparameter-optimisation-tools-neural-networks/
    https://zhuanlan.zhihu.com/p/56730229

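    | Tool          | Hyperparameter search | Architecture search | Parallel execution | Multiple DL frameworks | Reinforcement learning |
    |---------------+-----------------------+---------------------+--------------------+------------------------+------------------------|
    | NNI           | Yes                   | Yes                 | Yes                | Yes                    |                        |
    | Google Vizier | Yes                   | Yes                 | Yes                | Yes                    |                        |
    | Ray Tune      | Yes                   | No                  | Yes                | Yes                    | Yes                    |
    | Hyperopt      | Yes                   | No                  | Yes                | Yes                    |                        |

    NNI and Google Vizier lean toward automated search over neural network parameters and model architectures. Ray Tune supports reinforcement learning. Hyperopt focuses on hyperparameter search.

    For machine learning and data mining, some tools support feature search and selection: AutoML, auto-sklearn, and Featuretools.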

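  • Uses the Titanic passenger survival prediction task as an example to get more familiar with the Ray Tune tuning tool. The goal of the Titanic dataset is to predict, from passenger information, whether a passenger survived after the Titanic struck an iceberg and sank. The base code for this example draws on the following two articles: 1-1, structured data modeling...
  • Introduction to the Ray Tune API

    1k+ reads 2018-07-23 22:18:04
    ray.tune.register_trainable(name, trainable) Parameters: name (str) - the name under which the method or function is registered. trainable (obj) - a function or a tune.Trainable class. A function must take (config, status_reporter) as arguments, and during registration...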
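  • (Instance segmentation with a U-Net / Mask R-CNN workflow using Keras and Ray Tune) The lung instance segmentation workflow uses a model to predict lung masks from images. Running the workflow: clone the repository with git clone, then cd into the lung-instance-segmentation-workflow directory [optional...
  • PyTorch Hyperlight is not a fork: it does not modify (and has no plans to modify) any PyTorch-Lightning or Ray Tune code, and it builds on top of those frameworks. Examples. PyTorch Hyperlight key principles: use what already exists without reinventing the wheel...
  • https://docs.ray.io/en/master/tune/api_docs/schedulers.html During tuning, some hyperparameter optimization algorithms are called "scheduling algorithms"; they can terminate bad trials early, pause trials, clone trials, and alter a trial's parameters...
  • Ray Tune is a Python library that accelerates hyperparameter tuning by allowing you to leverage cutting edge optimization algorithms at scale. It is built on Ray, designed to remove the friction from...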
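  • Ray Tune module. Tune is a hyperparameter tuning module that organizes each attempt as a "trial" and uses a Scheduler to schedule the trials. It supports several hyperparameter tuning methods, including PBT and AsyncHyperBand. How to use it? As described above,...
  • Ray.tune official documentation

    1k+ reads 2020-03-11 23:47:20
    Ray.tune official documentation. Hyperparameter tuning is often the most expensive part of the machine learning workflow. Tune is designed to address this problem, offering an efficient and scalable solution to this pain point. Note that this example depends on TensorFlow 2.0. Code: ...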
  • Ray_Tune source code

    2021-03-14 13:42:04
    Ray Tune
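  • ray.tune documentation summary

    2021-09-01 14:48:12
    ray.tune documentation summary: tune.run; config (specifies how hyperparameters are searched); ConcurrencyLimiter (search algorithm); scheduler (trial scheduler); analysis; resources (parallelism, GPU, distributed). See the original documentation here ... Runs hyperparameter tuning and manages experiments, e.g. log inspection, early...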
