
    Overall, the Backbone, RPN, and Fast R-CNN are three relatively independent modules. The backbone produces 5 levels of features for each image and feeds them to the RPN.

    The RPN first runs the incoming features through a 3x3 convolution, then two sibling 1x1 convolutions produce the classification and bbox outputs. The classification output says whether an anchor contains an object; the bbox output is four-dimensional: (dx, dy, dw, dh). The anchors are matched against the ground truth to label them positive, negative, or ignored, and to decide which gt instance each anchor belongs to. From these, 256 anchors are sampled (half positive, half negative) to compute the loss. Finally, 1000 proposals are selected by sorting on cls_score and running NMS, and are sent to Fast R-CNN.
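
    As a side note, here is a minimal sketch of this head structure (assuming PyTorch; the real implementation is detectron2's StandardRPNHead, this is only an illustration):

    import torch
    import torch.nn as nn

    class MiniRPNHead(nn.Module):
        """Sketch of an RPN head: a shared 3x3 conv plus two sibling 1x1 convs."""

        def __init__(self, in_channels: int = 256, num_anchors: int = 3, box_dim: int = 4):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
            # objectness: one score per anchor per location
            self.objectness_logits = nn.Conv2d(in_channels, num_anchors, kernel_size=1)
            # box regression: (dx, dy, dw, dh) per anchor per location
            self.anchor_deltas = nn.Conv2d(in_channels, num_anchors * box_dim, kernel_size=1)

        def forward(self, features):
            logits, deltas = [], []
            for x in features:  # one tensor per FPN level, each (N, C, Hi, Wi)
                t = torch.relu(self.conv(x))
                logits.append(self.objectness_logits(t))  # (N, A, Hi, Wi)
                deltas.append(self.anchor_deltas(t))      # (N, A*B, Hi, Wi)
            return logits, deltas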

    Fast R-CNN re-labels the incoming proposals as positive or negative and assigns each a gt instance. It then samples 512 proposals (positive-to-negative ratio 1:3) and sends them to RoIAlign. The FPN level is chosen from each proposal's w and h (note that P6, obtained by subsampling P5, is not used here); the proposal is scaled by that level's ratio and the RoI is cropped out. RoIAlign involves choosing sampling points (the paper reports 4 sampling points works best, with 1 nearly as good), bilinear interpolation (interpolating within the grid cell a point falls in; "bilinear" means linear interpolation twice, first along x and then along y), and max pooling.
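
    The level assignment follows the FPN paper's heuristic (detectron2 implements it in assign_boxes_to_levels); a standalone sketch, assuming the canonical level 4 corresponds to a 224-pixel box:

    import math

    def assign_fpn_level(w: float, h: float,
                         canonical_level: int = 4, canonical_size: float = 224.0,
                         min_level: int = 2, max_level: int = 5) -> int:
        """Map a proposal of size (w, h) to an FPN level index (P2..P5)."""
        level = canonical_level + math.log2(math.sqrt(w * h) / canonical_size)
        return int(min(max(math.floor(level), min_level), max_level))

    # a 224x224 proposal lands on P4; a 112x112 proposal lands on P3
    assert assign_fpn_level(224, 224) == 4
    assert assign_fpn_level(112, 112) == 3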

    The RoI features are used to compute classification and bbox regression, trained against the gt with a softmax cross-entropy loss and a smooth L1 loss, respectively.
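
    A minimal sketch of the two losses (assuming PyTorch; the shapes and the exact normalization in detectron2's FastRCNNOutputLayers differ slightly):

    import torch
    import torch.nn.functional as F

    def fast_rcnn_losses(class_logits, gt_classes, pred_deltas, gt_deltas, fg_mask):
        # softmax cross entropy over num_classes + 1 (background included)
        loss_cls = F.cross_entropy(class_logits, gt_classes)
        # smooth L1 only on the box deltas of foreground proposals
        loss_box = F.smooth_l1_loss(pred_deltas[fg_mask], gt_deltas[fg_mask],
                                    reduction="sum") / max(gt_classes.numel(), 1)
        return {"loss_cls": loss_cls, "loss_box_reg": loss_box}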

    Note that only at test time is NMS applied inside Fast R-CNN to select the final results.


    If you choose not to build the model from a cfg, the Mask R-CNN initialization code is as follows. It splits cleanly into three parts: (1) backbone = FPN(); (2) proposal_generator = RPN(); (3) roi_heads = StandardROIHeads().

    model = GeneralizedRCNN(
        backbone=FPN(
            ResNet(
                BasicStem(3, 64, norm="FrozenBN"),
                ResNet.make_default_stages(50, stride_in_1x1=True, norm="FrozenBN"),
                out_features=["res2", "res3", "res4", "res5"],
            ).freeze(2),
            ["res2", "res3", "res4", "res5"],
            256,
            top_block=LastLevelMaxPool(),
        ),
        proposal_generator=RPN(
            in_features=["p2", "p3", "p4", "p5", "p6"],
            head=StandardRPNHead(in_channels=256, num_anchors=3),
            anchor_generator=DefaultAnchorGenerator(
                sizes=[[32], [64], [128], [256], [512]],
                aspect_ratios=[0.5, 1.0, 2.0],
                strides=[4, 8, 16, 32, 64],
                offset=0.0,
            ),
            anchor_matcher=Matcher([0.3, 0.7], [0, -1, 1], allow_low_quality_matches=True),
            box2box_transform=Box2BoxTransform([1.0, 1.0, 1.0, 1.0]),
            batch_size_per_image=256,
            positive_fraction=0.5,
            pre_nms_topk=(2000, 1000),
            post_nms_topk=(1000, 1000),
            nms_thresh=0.7,
        ),
        roi_heads=StandardROIHeads(
            num_classes=80,
            batch_size_per_image=512,
            positive_fraction=0.25,
            proposal_matcher=Matcher([0.5], [0, 1], allow_low_quality_matches=False),
            box_in_features=["p2", "p3", "p4", "p5"],
            box_pooler=ROIPooler(7, (1.0 / 4, 1.0 / 8, 1.0 / 16, 1.0 / 32), 0, "ROIAlignV2"),
            box_head=FastRCNNConvFCHead(
                ShapeSpec(channels=256, height=7, width=7), conv_dims=[], fc_dims=[1024, 1024]
            ),
            box_predictor=FastRCNNOutputLayers(
                ShapeSpec(channels=1024),
                test_score_thresh=0.05,
                box2box_transform=Box2BoxTransform((10, 10, 5, 5)),
                num_classes=80,
            ),
            mask_in_features=["p2", "p3", "p4", "p5"],
            mask_pooler=ROIPooler(14, (1.0 / 4, 1.0 / 8, 1.0 / 16, 1.0 / 32), 0, "ROIAlignV2"),
            mask_head=MaskRCNNConvUpsampleHead(
                ShapeSpec(channels=256, width=14, height=14),
                num_classes=80,
                conv_dims=[256, 256, 256, 256, 256],
            ),
        ),
        pixel_mean=[103.530, 116.280, 123.675],
        pixel_std=[1.0, 1.0, 1.0],
        input_format="BGR",
    )

    Notation used below:

    • N: number of images in the minibatch
    • L: number of feature maps per image on which RPN is run
    • A: number of cell anchors (must be the same for all feature maps); each point on a feature map produces anchors with 3 aspect_ratios
    • Hi, Wi: height and width of the i-th feature map
    • B: size of the box parameterization, i.e., the 4 bbox parameters

    1. Backbone

    The backbone is a ResNet-FPN producing ["p2", "p3", "p4", "p5", "p6"], 5 levels of features in total, each of shape [N, C, Hi, Wi]. The output is a dict whose keys are the five elements of in_features.
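
    A quick, hypothetical way to see this (assuming model is the GeneralizedRCNN built above and images is the ImageList returned by model.preprocess_image):

    features = model.backbone(images.tensor)
    for name, feat in features.items():
        print(name, tuple(feat.shape))
    # expected keys p2..p6 with strides 4, 8, 16, 32, 64,
    # e.g. p2 -> (N, 256, H/4, W/4) and p6 -> (N, 256, H/64, W/64)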


    2. RPN

    The RPN pre-filters the anchors and produces the proposals used by Fast R-CNN.

     proposal_generator=RPN(
            in_features=["p2", "p3", "p4", "p5", "p6"],
            head=StandardRPNHead(in_channels=256, num_anchors=3),
            anchor_generator=DefaultAnchorGenerator(
                sizes=[[32], [64], [128], [256], [512]],
                aspect_ratios=[0.5, 1.0, 2.0],
                strides=[4, 8, 16, 32, 64],
                offset=0.0,
            ),
            anchor_matcher=Matcher([0.3, 0.7], [0, -1, 1], allow_low_quality_matches=True),
            box2box_transform=Box2BoxTransform([1.0, 1.0, 1.0, 1.0]),
            batch_size_per_image=256,
            positive_fraction=0.5,
            pre_nms_topk=(2000, 1000),
            post_nms_topk=(1000, 1000),
            nms_thresh=0.7,
        )

    The corresponding RPN forward() code is:

            features = [features[f] for f in self.in_features]
            anchors = self.anchor_generator(features)
    
            pred_objectness_logits, pred_anchor_deltas = self.rpn_head(features)
            # Transpose the Hi*Wi*A dimension to the middle:
            pred_objectness_logits = [
                # (N, A, Hi, Wi) -> (N, Hi, Wi, A) -> (N, Hi*Wi*A)
                score.permute(0, 2, 3, 1).flatten(1)
                for score in pred_objectness_logits
            ]
            pred_anchor_deltas = [
                # (N, A*B, Hi, Wi) -> (N, A, B, Hi, Wi) -> (N, Hi, Wi, A, B) -> (N, Hi*Wi*A, B)
                x.view(x.shape[0], -1, self.anchor_generator.box_dim, x.shape[-2], x.shape[-1])
                .permute(0, 3, 4, 1, 2)
                .flatten(1, -2)
                .float()  # ensure fp32 for decoding precision
                for x in pred_anchor_deltas
            ]
    
            if self.training:
                assert gt_instances is not None, "RPN requires gt_instances in training!"
                gt_labels, gt_boxes = self.label_and_sample_anchors(anchors, gt_instances)
                losses = self.losses(
                    anchors, pred_objectness_logits, gt_labels, pred_anchor_deltas, gt_boxes
                )
            else:
                losses = {}
            proposals = self.predict_proposals(
                anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes
            )

    This can be distilled into the following steps:

    1. features = [features[f] for f in self.in_features]: the image features from the backbone, 5 levels in total, each of shape [batch_size, channel, h, w].
    2. anchors = self.anchor_generator(features) -> DefaultAnchorGenerator(): each feature-map level corresponds to one anchor size, and each point on a feature map to three aspect_ratios; this generates all the anchors.
    3. pred_objectness_logits, pred_anchor_deltas = self.rpn_head(features) -> StandardRPNHead(): from the features, predict objectness and (dx, dy, dw, dh). Per level, pred_objectness_logits has shape [N, A, Hi, Wi] and pred_anchor_deltas has shape [N, A*B, Hi, Wi].
    4. reshape (the permute/flatten code above).
    5. gt_labels, gt_boxes = self.label_and_sample_anchors(anchors, gt_instances): this uses self.anchor_matcher(), which, with the preset IOU_THRESHOLDS=[0.3, 0.7], labels each anchor positive, negative, or ignored and returns the gt instance each anchor is matched to (see the sketch after this list).
    6. losses = self.losses(): randomly sample batch_size_per_image=256 anchors, of which positive_fraction=0.5 are positive and 0.5 negative, and train on them.
    7. proposals = self.predict_proposals(anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes) -> find_top_rpn_proposals(): this produces the proposals. The flow: select pre_nms_topk anchors by cls_score, refine anchor locations with rpn_bbox_pred to get the adjusted boxes, run NMS with nms_thresh=0.7, remove out-of-boundary and undersized boxes, and finally keep post_nms_topk proposals by cls_score.
    8. return proposals, losses
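
    As an illustration of step 5, here is a minimal sketch of the threshold bucketing performed by Matcher([0.3, 0.7], [0, -1, 1]) (simplified: the real Matcher also implements allow_low_quality_matches, which force-matches the best anchor of each gt):

    import torch

    def match_anchors(iou_matrix: torch.Tensor):
        """iou_matrix: (num_gt, num_anchors) pairwise IoU.
        Returns (matched_gt_idx, labels): 1 = positive, 0 = negative, -1 = ignored."""
        matched_vals, matched_idx = iou_matrix.max(dim=0)  # best gt for each anchor
        labels = torch.full_like(matched_idx, -1)          # default: ignored
        labels[matched_vals < 0.3] = 0                     # IoU < 0.3  -> negative
        labels[matched_vals >= 0.7] = 1                    # IoU >= 0.7 -> positive
        return matched_idx, labels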


    3. roi_heads = StandardROIHeads()

    roi_heads=StandardROIHeads(
            num_classes=80,
            batch_size_per_image=512,
            positive_fraction=0.25,
            proposal_matcher=Matcher([0.5], [0, 1], allow_low_quality_matches=False),
            box_in_features=["p2", "p3", "p4", "p5"],
            box_pooler=ROIPooler(7, (1.0 / 4, 1.0 / 8, 1.0 / 16, 1.0 / 32), 0, "ROIAlignV2"),
            box_head=FastRCNNConvFCHead(
                ShapeSpec(channels=256, height=7, width=7), conv_dims=[], fc_dims=[1024, 1024]
            ),
            box_predictor=FastRCNNOutputLayers(
                ShapeSpec(channels=1024),
                test_score_thresh=0.05,
                box2box_transform=Box2BoxTransform((10, 10, 5, 5)),
                num_classes=80,
            ),
            mask_in_features=["p2", "p3", "p4", "p5"],
            mask_pooler=ROIPooler(14, (1.0 / 4, 1.0 / 8, 1.0 / 16, 1.0 / 32), 0, "ROIAlignV2"),
            mask_head=MaskRCNNConvUpsampleHead(
                ShapeSpec(channels=256, width=14, height=14),
                num_classes=80,
                conv_dims=[256, 256, 256, 256, 256],
            ),
        ),

    1. proposal_matcher = Matcher() labels positives and negatives according to ROI_HEADS.IOU_THRESHOLDS and re-assigns each proposal's GT bbox. Fast R-CNN picks 512 proposals per image for training, of which a fraction of 0.25 are positive and the rest negative.

    2. During training, Fast R-CNN does not use NMS; only at test time are results first filtered by score (test_score_thresh=0.05) and then passed through NMS to produce the final output.
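
    A sketch of that test-time filtering (assuming torchvision's batched_nms; detectron2's fast_rcnn_inference does the same per image, plus a topk cap):

    import torch
    from torchvision.ops import batched_nms

    def filter_predictions(boxes, scores, classes,
                           score_thresh: float = 0.05, nms_thresh: float = 0.5):
        """boxes: (R, 4); scores, classes: (R,). Returns indices of kept detections."""
        idx = (scores > score_thresh).nonzero().squeeze(1)  # 1) drop low-confidence boxes
        # 2) class-aware NMS: boxes of different classes never suppress each other
        kept = batched_nms(boxes[idx], scores[idx], classes[idx], nms_thresh)
        return idx[kept]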

  • Object Detection | Detectron2's Boxes explained


    Overview

    • Boxes is the basic class Detectron2 uses to handle detection boxes. It is powerful and covers almost every box operation, including clip, scale, cat, area, nonempty, and so on.

    Code walkthrough

    • Initialization: the member variable tensor is a 2-D tensor storing N boxes as (x0, y0, x1, y1), so the second dimension is 4. It can hold anchors or ground truth; the whole Boxes class is a set of operations around this tensor.
    class Boxes:
        """
        This structure stores a list of boxes as a Nx4 torch.Tensor.
        It supports some common methods about boxes
        (`area`, `clip`, `nonempty`, etc),
        and also behaves like a Tensor
        (support indexing, `to(device)`, `.device`, and iteration over all boxes)
    
        Attributes:
            tensor (torch.Tensor): float matrix of Nx4.
        """
    
        BoxSizeType = Union[List[int], Tuple[int, int]]
    
        def __init__(self, tensor: torch.Tensor):
            """
            Args:
                tensor (Tensor[float]): a Nx4 matrix.  Each row is (x1, y1, x2, y2).
            """
            device = tensor.device if isinstance(tensor, torch.Tensor) else torch.device("cpu")
            tensor = torch.as_tensor(tensor, dtype=torch.float32, device=device)
            if tensor.numel() == 0:
                tensor = torch.zeros(0, 4, dtype=torch.float32, device=device)
            assert tensor.dim() == 2 and tensor.size(-1) == 4, tensor.size()
    
            self.tensor = tensor
    
    • clone: copy the Boxes (into new memory).
      def clone(self) -> "Boxes":
            """
            Clone the Boxes.
    
            Returns:
                Boxes
            """
            return Boxes(self.tensor.clone())
    
    • to: move the tensor to a device ("cuda" or "cpu").
     def to(self, device: str) -> "Boxes":
            return Boxes(self.tensor.to(device))
    
    • area: compute the area of each box.
        def area(self) -> torch.Tensor:
            """
            Computes the area of all the boxes.
    
            Returns:
                torch.Tensor: a vector with areas of each box.
            """
            box = self.tensor
            area = (box[:, 2] - box[:, 0]) * (box[:, 3] - box[:, 1])
            return area
    
    • clip: clamp box coordinates to the image region.
     def clip(self, box_size: BoxSizeType) -> None:
            """
            Clip (in place) the boxes by limiting x coordinates to the range [0, width]
            and y coordinates to the range [0, height].
    
            Args:
                box_size (height, width): The clipping box's size.
            """
            assert torch.isfinite(self.tensor).all(), "Box tensor contains infinite or NaN!"
            h, w = box_size
            self.tensor[:, 0].clamp_(min=0, max=w)
            self.tensor[:, 1].clamp_(min=0, max=h)
            self.tensor[:, 2].clamp_(min=0, max=w)
            self.tensor[:, 3].clamp_(min=0, max=h)
    
    • nonempty: return a mask marking boxes whose sides exceed the given threshold.
        def nonempty(self, threshold: int = 0) -> torch.Tensor:
            """
            Find boxes that are non-empty.
            A box is considered empty, if either of its side is no larger than threshold.
    
            Returns:
                Tensor:
                    a binary vector which represents whether each box is empty
                    (False) or non-empty (True).
            """
            box = self.tensor
            widths = box[:, 2] - box[:, 0]
            heights = box[:, 3] - box[:, 1]
            keep = (widths > threshold) & (heights > threshold)
            return keep
    
    • __getitem__: index into the boxes.
      def __getitem__(self, item) -> "Boxes":
        """
            Returns:
                Boxes: Create a new :class:`Boxes` by indexing.
    
            The following usage are allowed:
    
            1. `new_boxes = boxes[3]`: return a `Boxes` which contains only one box.
            2. `new_boxes = boxes[2:10]`: return a slice of boxes.
            3. `new_boxes = boxes[vector]`, where vector is a torch.BoolTensor
               with `length = len(boxes)`. Nonzero elements in the vector will be selected.
    
            Note that the returned Boxes might share storage with this Boxes,
            subject to Pytorch's indexing semantics.
            """
            if isinstance(item, int):
                return Boxes(self.tensor[item].view(1, -1))
            b = self.tensor[item]
            assert b.dim() == 2, "Indexing on Boxes with {} failed to return a matrix!".format(item)
            return Boxes(b)
    
    • __len__: the number of boxes.
     def __len__(self) -> int:
            return self.tensor.shape[0]
    
    • inside_box: used mainly by Faster R-CNN when computing positives and negatives, to drop boxes that cross the image boundary.
      def inside_box(self, box_size: BoxSizeType, boundary_threshold: int = 0) -> torch.Tensor:
            """
            Args:
                box_size (height, width): Size of the reference box.
                boundary_threshold (int): Boxes that extend beyond the reference box
                    boundary by more than boundary_threshold are considered "outside".
    
            Returns:
                a binary vector, indicating whether each box is inside the reference box.
            """
            height, width = box_size
            inds_inside = (
                (self.tensor[..., 0] >= -boundary_threshold)
                & (self.tensor[..., 1] >= -boundary_threshold)
                & (self.tensor[..., 2] < width + boundary_threshold)
                & (self.tensor[..., 3] < height + boundary_threshold)
            )
            return inds_inside
    
    • get_centers: compute box centers.
    
        def get_centers(self) -> torch.Tensor:
            """
            Returns:
                The box centers in a Nx2 array of (x, y).
            """
            return (self.tensor[:, :2] + self.tensor[:, 2:]) / 2
    
    • scale: scale the coordinates by horizontal and vertical factors.
        def scale(self, scale_x: float, scale_y: float) -> None:
            """
            Scale the box with horizontal and vertical scaling factors
            """
            self.tensor[:, 0::2] *= scale_x
            self.tensor[:, 1::2] *= scale_y
    
    • cat: concatenate a list of Boxes into one.
    @staticmethod
        def cat(boxes_list: List["Boxes"]) -> "Boxes":
            """
            Concatenates a list of Boxes into a single Boxes
    
            Arguments:
                boxes_list (list[Boxes])
    
            Returns:
                Boxes: the concatenated Boxes
            """
            assert isinstance(boxes_list, (list, tuple))
            assert len(boxes_list) > 0
            assert all(isinstance(box, Boxes) for box in boxes_list)
    
            cat_boxes = type(boxes_list[0])(cat([b.tensor for b in boxes_list], dim=0))
            return cat_boxes
    
    • device: the device property of the underlying tensor.
        @property
        def device(self) -> torch.device:
            return self.tensor.device
    
    • __iter__: iterate over the boxes, one at a time.
        def __iter__(self) -> Iterator[torch.Tensor]:
            """
            Yield a box as a Tensor of shape (4,) at a time.
            """
            yield from self.tensor
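
    Putting a few of these methods together, a small usage sketch (assuming detectron2 is installed):

    import torch
    from detectron2.structures import Boxes

    boxes = Boxes(torch.tensor([[10.0, 10.0, 50.0, 40.0],
                                [-5.0, 0.0, 30.0, 300.0]]))
    boxes.clip((100, 100))              # clamp in place to a 100x100 image (height, width)
    print(boxes.area())                 # per-box areas after clipping
    print(boxes.nonempty(threshold=1))  # keep mask: sides must exceed 1 pixel
    merged = Boxes.cat([boxes, boxes.clone()])
    print(len(merged))                  # 4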
    
  • These are my personal notes on detectron2 (an object detection framework); if you find mistakes, please point them out and I will correct them promptly.

    detectron2 (object detection framework) from every angle - 00: Table of Contents

    Preface

    From the previous posts we already know detectron2's overall architecture. Going back into the source at detectron2/engine/train_loop.py we find TrainerBase; let's see what descendants it has:

    # the ancestor: detectron2/engine/train_loop.py
    class TrainerBase: 
    
    # first-generation descendant: detectron2/engine/train_loop.py
    class SimpleTrainer(TrainerBase): 
    
    # second-generation descendant: detectron2/engine/defaults.py
    class DefaultTrainer(SimpleTrainer): 
    
    # third-generation descendant: tools/train_my.py, my own adaptation of the reference code
    class Trainer(DefaultTrainer):
    

    Let's start with the ancestor:

    class TrainerBase:
    
        def __init__(self):
            self._hooks = []
    
        def register_hooks(self, hooks):
            hooks = [h for h in hooks if h is not None]
            for h in hooks:
                assert isinstance(h, HookBase)
                h.trainer = weakref.proxy(self)
            self._hooks.extend(hooks)
    
        def train(self, start_iter: int, max_iter: int):
    
            self.iter = self.start_iter = start_iter
            self.max_iter = max_iter
    
            with EventStorage(start_iter) as self.storage:
                try:
                    self.before_train()
                    for self.iter in range(start_iter, max_iter):
                        self.before_step()
                        self.run_step()
                        self.after_step()
                finally:
                    self.after_train()
    
        def before_train(self):
            for h in self._hooks:
                h.before_train()
    
        def after_train(self):
            for h in self._hooks:
                h.after_train()
    
        def before_step(self):
            for h in self._hooks:
                h.before_step()
    
        def after_step(self):
            for h in self._hooks:
                h.after_step()
            # this guarantees, that in each hook's after_step, storage.iter == trainer.iter
            self.storage.step()
    
        def run_step(self):
            raise NotImplementedError
    

    I have trimmed many comments here; the English comments in the source are worth reading. Overall it is quite simple. The class provides the following methods:

    # already implemented
    def before_train(self):  def after_train(self):  def before_step(self):  def after_step(self):
    
    # declared, to be implemented by a subclass
    def run_step(self):
        raise NotImplementedError 
    

    From the source we can see that the implementations of before_train, after_train, before_step, and after_step are trivial: each just loops over self._hooks and calls the hook method of the same name. So what is self._hooks? Hooks! Let's set that aside for a moment; the related

    def register_hooks(self, hooks)
    

    will also be explained later. First let's look at the first-generation descendant, class SimpleTrainer(TrainerBase):, which overrides def run_step(self) and implements

        # anomaly detection
        def _detect_anomaly(self, losses, loss_dict):
        
        # essentially log writing
        def _write_metrics(self, metrics_dict: dict):
    

    Clearly the core is def run_step(self), overridden as follows:

        def run_step(self):
            """
            Implement the standard training logic described above.
            """
            
            # make sure the model is in training mode
            assert self.model.training, "[SimpleTrainer] model was changed to eval mode!"
            start = time.perf_counter()
            
            """
            # fetch one batch of data; the dataloader can be wrapped if needed
            If you want to do something with the data, you can wrap the dataloader.
            """
            data = next(self._data_loader_iter)
            data_time = time.perf_counter() - start
    
            """
            # the loss computation can be customized by wrapping the model
            If you want to do something with the losses, you can wrap the model.
            """
            loss_dict = self.model(data)
            losses = sum(loss for loss in loss_dict.values())
            
            # check whether the loss computation is anomalous
            self._detect_anomaly(losses, loss_dict)
    
            # write to the log
            metrics_dict = loss_dict
            metrics_dict["data_time"] = data_time
            self._write_metrics(metrics_dict)
    
            """
            # backpropagation
            If you need to accumulate gradients or something similar, you can
            wrap the optimizer with your custom `zero_grad()` method.
            """
            self.optimizer.zero_grad()
            losses.backward()
    
            """
            # one iteration complete: apply the update
            If you need gradient clipping/scaling or other processing, you can
            wrap the optimizer with your custom `step()` method.
            """
            self.optimizer.step()
    
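
    As the comments suggest, gradient accumulation can be added by wrapping the optimizer; a hypothetical sketch (the wrapper below is my own illustration, not part of detectron2):

    class AccumulatingOptimizer:
        """Hypothetical wrapper: only performs a real step every `accum_steps` calls."""

        def __init__(self, optimizer, accum_steps: int = 2):
            self.optimizer, self.accum_steps, self._calls = optimizer, accum_steps, 0

        def zero_grad(self):
            # only clear gradients right after a real step, so they accumulate in between
            if self._calls % self.accum_steps == 0:
                self.optimizer.zero_grad()

        def step(self):
            self._calls += 1
            if self._calls % self.accum_steps == 0:
                self.optimizer.step()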

    This is all quite easy to follow; at this point we have already covered backpropagation. Continuing on to the second-generation descendant, class DefaultTrainer(SimpleTrainer) in detectron2/engine/defaults.py: it is a bit more involved, but no matter, we'll take it slowly. In its initialization we again see:

        model = self.build_model(cfg) # build the model
        optimizer = self.build_optimizer(cfg, model) # build the optimizer
        data_loader = self.build_train_loader(cfg) # build the training data loader
    

    Familiar territory. DefaultTrainer mainly implements the following methods:

        # resume training, or reload the model
        def resume_or_load(self, resume=True):
    
        # build the training-related hooks
        def build_hooks(self):
    
        # mainly calls the parent class's train
        def train(self):
    
        # build the network model from cfg
        def build_model(cls, cfg):
    
        # build the SGD optimizer
        def build_optimizer(cls, cfg, model):
    
        # define the learning-rate decay schedule
        def build_lr_scheduler(cls, cfg, optimizer):
    
        # build the training data loader
        def build_train_loader(cls, cfg):
    
        # build the test data loader
        def build_test_loader(cls, cfg, dataset_name):
    
        # for validation during training; note: this one is empty, not implemented
        def build_evaluator(cls, cfg, dataset_name):
    
        # run testing on the data
        def test(cls, cfg, model, evaluators=None):
    

    As you can see, this second-generation descendant is already fairly complete; all that remains is

    def build_evaluator(cls, cfg, dataset_name):
    

    which a subclass must override. Besides that, there is one more key piece:

        # build the training-related hooks
        def build_hooks(self):
    

    Let's hold off on that for now and look at the third-generation descendant, class Trainer(DefaultTrainer) in my tools/train_my.py, which implements:

    # build the evaluator from the cfg configuration
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
    
    # copied straight from the reference code; I don't yet know exactly what it is for (TTA stands for test-time augmentation)
    def test_with_TTA(cls, cfg, model):
    

    So far we have skimmed from the ancestor down through the third generation. One key piece remains: the Hook.

    Hook

    We first encountered hooks in the ancestor TrainerBase's initialization:

    class TrainerBase:
        def __init__(self):
            self._hooks = []
    
        def register_hooks(self, hooks):
            hooks = [h for h in hooks if h is not None]
            for h in hooks:
                assert isinstance(h, HookBase)
                h.trainer = weakref.proxy(self)
            self._hooks.extend(hooks)
        def before_train(self):
            for h in self._hooks:
                h.before_train()
    
        def after_train(self):
            for h in self._hooks:
                h.after_train()
    
        def before_step(self):
            for h in self._hooks:
                h.before_step()
    
        def after_step(self):
            for h in self._hooks:
                h.after_step()
            # this guarantees, that in each hook's after_step, storage.iter == trainer.iter
            self.storage.step()
    

    From this it is clear that the self._hooks list holds many hooks: when before_train, after_train, before_step, or after_step is called, it loops over every hook in self._hooks and calls the method of the same name; def register_hooks(self, hooks) simply registers hooks into that list. Let's first look at:

    class HookBase:
        def before_train(self):
            """
            Called before the first iteration.
            """
            pass
    
        def after_train(self):
            """
            Called after the last iteration.
            """
            pass
    
        def before_step(self):
            """
            Called before each iteration.
            """
            pass
    
        def after_step(self):
            """
            Called after each iteration.
            """
            pass
    

    There is not much to see here: it defines a few methods, none of which actually do anything. So let's search the source for the places where it is used. It turns out HookBase has quite a few subclasses, all implemented in detectron2/engine/hooks.py:

    # custom callback functions
    class CallbackHook(HookBase):
    
    # record and track timing during training
    class IterationTimer(HookBase):
    
    # periodic writing before and after iterations
    class PeriodicWriter(HookBase):
    
    # periodic checkpointing
    class PeriodicCheckpointer(_PeriodicCheckpointer, HookBase):
    
    # learning-rate schedule: after every iteration, check whether the LR should change
    class LRScheduler(HookBase):
    
    # run evaluation once the iteration count reaches a given value
    class EvalHook(HookBase):
    
    # roughly, an upgraded version of BN
    class PreciseBN(HookBase):
    

    This is just a brief introduction; any hooks used later will be explained in detail. These hooks are a genuinely useful idea; you can put them to work, for example, in ablation experiments.

    In short, we can create all kinds of hooks: as long as a hook inherits from HookBase, it can be registered through TrainerBase.register_hooks. Each hook may implement the following methods (a small example follows):

    def before_train(self):
    def after_train(self):
    def before_step(self):
    def after_step(self):
    
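
    For instance, a minimal custom hook might look like this (a sketch; the hook name and print logic are just illustrative):

    from detectron2.engine import HookBase

    class IterationLogger(HookBase):
        """Toy hook: print progress every `period` iterations."""

        def __init__(self, period: int = 20):
            self._period = period

        def after_step(self):
            # self.trainer is the weak proxy set by register_hooks
            if self.trainer.iter % self._period == 0:
                print(f"iter {self.trainer.iter}/{self.trainer.max_iter}")

    # registration: trainer.register_hooks([IterationLogger(period=50)])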

    Let's take class EvalHook(HookBase) as an example. It exists to evaluate the model during training; evaluation generally runs only after a certain number of iterations, so it overrides:

        def after_step(self):
    

    Once the iteration count reaches the specified period, evaluation runs. Its initialization is:

        def __init__(self, eval_period, eval_function):
            self._period = eval_period
            self._func = eval_function
    

    It takes two arguments: the evaluation period and the evaluation (test) function.

    Closing remarks

    With this, our grasp of the big picture is one step firmer. In the next part we will look at data preprocessing, i.e., the training data loader.

  • Another entry in my personal detectron2 (object detection framework) notes; corrections are welcome and I will fix mistakes promptly.

    detectron2 (object detection framework) from every angle - 00: Table of Contents

    Preface

    From the earlier posts we already know how to train on our own data. My analysis builds on the program written previously, i.e., using the configs/My/retinanet_R_50_FPN_3x.yaml config file. In tools/train_my.py, the main function contains:

        trainer = Trainer(cfg)
        trainer.resume_or_load(resume=args.resume)
        return trainer.train()
    

    At bottom, the core is Trainer(cfg), so let's dissect it.

    Registry

    Entering Trainer, we can see that it overrides two methods:

    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
    def test_with_TTA(cls, cfg, model):
    

    Let's not worry for now about what these are; look first at the parent class DefaultTrainer that Trainer inherits from. In its initialization we see:

    def __init__(self, cfg):
            model = self.build_model(cfg) # build the model
            optimizer = self.build_optimizer(cfg, model) # build the optimizer
            data_loader = self.build_train_loader(cfg) # build the training data loader
    

    Yes, it is that simple: that's the whole frame, even though the internals are complex. For instance, following self.build_model into build_model(cfg) leads to detectron2/modeling/meta_arch/build.py, which really is just a few lines:

    from detectron2.utils.registry import Registry
    
    META_ARCH_REGISTRY = Registry("META_ARCH")  # noqa F401 isort:skip
    META_ARCH_REGISTRY.__doc__ = """
    Registry for meta-architectures, i.e. the whole model.
    """
    
    
    def build_model(cfg):
        meta_arch = cfg.MODEL.META_ARCHITECTURE
        return META_ARCH_REGISTRY.get(meta_arch)(cfg)
    

    The key point should now be visible: META_ARCH_REGISTRY = Registry("META_ARCH"). So what is this thing? Simply put, treat it like a dictionary: META_ARCH_REGISTRY.get(meta_arch)(cfg) looks up the class registered under the name meta_arch and calls it with cfg to build the model. If that still sounds fuzzy, remember that this series revolves around the RetinaNet detector, so let's look at class RetinaNet(nn.Module) in detectron2/modeling/meta_arch/retinanet.py:

    @META_ARCH_REGISTRY.register()
    class RetinaNet(nn.Module):
        """
        Implement RetinaNet (https://arxiv.org/abs/1708.02002).
        """
    
        def __init__(self, cfg):
            super().__init__()
    
            self.device = torch.device(cfg.MODEL.DEVICE)
    

    The crucial part is the @META_ARCH_REGISTRY.register() decorator at the top. As its name says, this is a registration: it puts the RetinaNet(nn.Module) model into META_ARCH_REGISTRY. In other words, META_ARCH_REGISTRY stores model architectures such as RetinaNet and R-CNN; the R-CNN registration can be seen in detectron2/modeling/meta_arch/rcnn.py. If we want to create a new network architecture, we must register it too; a later chapter walks through adding a new architecture.
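
    To make the mechanism concrete, a tiny sketch of registering and retrieving a class (DummyNet is a made-up name for illustration):

    import torch.nn as nn
    from detectron2.utils.registry import Registry

    MY_REGISTRY = Registry("MY_ARCH")

    @MY_REGISTRY.register()
    class DummyNet(nn.Module):
        def __init__(self, cfg):
            super().__init__()
            self.linear = nn.Linear(8, 2)

    # look up by class name, then call with cfg: exactly what build_model does
    model = MY_REGISTRY.get("DummyNet")(cfg=None)  # cfg unused in this toy example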
    You can also think of META_ARCH_REGISTRY as a container: we put the models in, and take one out whenever we need it. Besides this container for network models, there are many others, for example containers holding ROI_HEADS or SEM_SEG_xxx components. I am not yet entirely clear on what they all do, but that does not block the analysis: there are many containers, each holding differently configured models, ROI heads, and so on. So how do we fetch what we want from all these containers? Back in detectron2/modeling/meta_arch/build.py:

    META_ARCH_REGISTRY = Registry("META_ARCH")  # noqa F401 isort:skip
    

    loads the META_ARCH container, and cfg.MODEL.META_ARCHITECTURE = 'RetinaNet' then fetches the RetinaNet model we want from that container. How the model itself is constructed is covered in later chapters. Next, let's look at:

    optimizer = self.build_optimizer(cfg, model)
    

    optimizer

    Following the function above into detectron2/solver/build.py, we see:

    def build_optimizer(cfg: CfgNode, model: torch.nn.Module) -> torch.optim.Optimizer:
        """
        Build an optimizer from config.
        """
        params: List[Dict[str, Any]] = []
        for key, value in model.named_parameters():
            if not value.requires_grad:
                continue
            lr = cfg.SOLVER.BASE_LR
            weight_decay = cfg.SOLVER.WEIGHT_DECAY
            if key.endswith("norm.weight") or key.endswith("norm.bias"):
                weight_decay = cfg.SOLVER.WEIGHT_DECAY_NORM
            elif key.endswith(".bias"):
                # NOTE: unlike Detectron v1, we now default BIAS_LR_FACTOR to 1.0
                # and WEIGHT_DECAY_BIAS to WEIGHT_DECAY so that bias optimizer
                # hyperparameters are by default exactly the same as for regular
                # weights.
                lr = cfg.SOLVER.BASE_LR * cfg.SOLVER.BIAS_LR_FACTOR
                weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS
            params += [{"params": [value], "lr": lr, "weight_decay": weight_decay}]
    
        optimizer = torch.optim.SGD(params, lr, momentum=cfg.SOLVER.MOMENTUM)
        return optimizer
    

    The implementation is quite simple: it loads the model's parameters into an SGD optimizer, setting the learning rate, weight decay, and so on per parameter group (biases and norm-layer weights can receive their own values).

    build_train_loader

    Finally, only this remains:

    data_loader = self.build_train_loader(cfg) 
    

    It is implemented in detectron2/data/build.py:

    def build_detection_train_loader(cfg, mapper=None):
        """
        A data loader is created by the following steps:
    
        1. Use the dataset names in config to query :class:`DatasetCatalog`, and obtain a list of dicts.
        2. Start workers to work on the dicts. Each worker will:
          * Map each metadata dict into another format to be consumed by the model.
          * Batch them by simply putting dicts into a list.
        The batched ``list[mapped_dict]`` is what this dataloader will return.
    
        Args:
            cfg (CfgNode): the config
            mapper (callable): a callable which takes a sample (dict) from dataset and
                returns the format to be consumed by the model.
                By default it will be `DatasetMapper(cfg, True)`.
    
        Returns:
            an infinite iterator of training data
        """
        # number of distributed worker processes (one per GPU), not data-loading threads
        num_workers = get_world_size()
        images_per_batch = cfg.SOLVER.IMS_PER_BATCH
        assert (
            images_per_batch % num_workers == 0
        ), "SOLVER.IMS_PER_BATCH ({}) must be divisible by the number of workers ({}).".format(
            images_per_batch, num_workers
        )
        assert (
            images_per_batch >= num_workers
        ), "SOLVER.IMS_PER_BATCH ({}) must be larger than the number of workers ({}).".format(
            images_per_batch, num_workers
        )
        
        # number of images each worker process consumes
        images_per_worker = images_per_batch // num_workers
    
        # load the dataset dicts specified in the config
        dataset_dicts = get_detection_dataset_dicts(
            cfg.DATASETS.TRAIN, # the training datasets
            filter_empty=cfg.DATALOADER.FILTER_EMPTY_ANNOTATIONS, # whether to drop images with empty annotations
            min_keypoints=cfg.MODEL.ROI_KEYPOINT_HEAD.MIN_KEYPOINTS_PER_IMAGE
            if cfg.MODEL.KEYPOINT_ON # whether keypoints are enabled
            else 0,
            proposal_files=cfg.DATASETS.PROPOSAL_FILES_TRAIN if cfg.MODEL.LOAD_PROPOSALS else None, # whether precomputed proposal files are provided
        )
        
        # wrap the dicts into a dataset and map each sample into the model's input format
        dataset = DatasetFromList(dataset_dicts, copy=False)
        if mapper is None:
            mapper = DatasetMapper(cfg, True)
        dataset = MapDataset(dataset, mapper)
    
        # pick the training sampler named in the config
        sampler_name = cfg.DATALOADER.SAMPLER_TRAIN
        logger = logging.getLogger(__name__)
        logger.info("Using training sampler {}".format(sampler_name))
        if sampler_name == "TrainingSampler":
            sampler = samplers.TrainingSampler(len(dataset))
        elif sampler_name == "RepeatFactorTrainingSampler":
            sampler = samplers.RepeatFactorTrainingSampler(
                dataset_dicts, cfg.DATALOADER.REPEAT_THRESHOLD
            )
        else:
            raise ValueError("Unknown training sampler: {}".format(sampler_name))
    
    
        # I take it this exists so that, when image sizes are not uniform, training groups
        # images with similar aspect ratios into batches; otherwise plain batch_size batching is used
        if cfg.DATALOADER.ASPECT_RATIO_GROUPING:
            data_loader = torch.utils.data.DataLoader(
                dataset,
                sampler=sampler,
                num_workers=cfg.DATALOADER.NUM_WORKERS,
                batch_sampler=None,
                collate_fn=operator.itemgetter(0),  # don't batch, but yield individual elements
                worker_init_fn=worker_init_reset_seed,
            )  # yield individual mapped dict
            data_loader = AspectRatioGroupedDataset(data_loader, images_per_worker)
        else:
            batch_sampler = torch.utils.data.sampler.BatchSampler(
                sampler, images_per_worker, drop_last=True
            )
            # drop_last so the batch always have the same size
            data_loader = torch.utils.data.DataLoader(
                dataset,
                num_workers=cfg.DATALOADER.NUM_WORKERS,
                batch_sampler=batch_sampler,
                collate_fn=trivial_batch_collator,
                worker_init_fn=worker_init_reset_seed,
            )
    
        return data_loader
    

    Simply put, this creates the data loader, doing some data preprocessing along the way; we will analyze the details later.
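
    As the mapper argument suggests, preprocessing can be customized by passing in your own callable; a hedged sketch (MyMapper is my own illustration wrapping the default DatasetMapper):

    from detectron2.data import DatasetMapper, build_detection_train_loader

    class MyMapper(DatasetMapper):
        """Hypothetical mapper: behaves like the default, but each sample passes through here."""

        def __call__(self, dataset_dict):
            # dataset_dict is one raw sample (a dict); the parent converts it into
            # the model-ready format (image tensor plus Instances)
            return super().__call__(dataset_dict)

    # e.g. inside a DefaultTrainer subclass:
    # @classmethod
    # def build_train_loader(cls, cfg):
    #     return build_detection_train_loader(cfg, mapper=MyMapper(cfg, True))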

    train

    Having seen how to build the network model, the optimizer, and the data loader, all that remains is to train the model, implemented in detectron2/engine/train_loop.py:

        def train(self, start_iter: int, max_iter: int):
            """
            Args:
                start_iter, max_iter (int): See docs above
            """
            logger = logging.getLogger(__name__)
            logger.info("Starting training from iteration {}".format(start_iter))
    
            self.iter = self.start_iter = start_iter
            self.max_iter = max_iter
    
            with EventStorage(start_iter) as self.storage:
                try:
                    self.before_train()
                    for self.iter in range(start_iter, max_iter):
                        self.before_step()
                        self.run_step()
                        self.after_step()
                finally:
                    self.after_train()
    

    I have not examined all the details inside yet, but no matter; the core is clearly:

                        self.before_step()
                        self.run_step()
                        self.after_step()
    

    Roughly three phases: before the step, the step itself, and after the step.

    Closing remarks

    Next, the task is to analyze each of those details. See you in the next posts!

