
Anchor Boxes

Author: 五速梦信息网
Date: 2026-05-16 14:55


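Video: 锚框【动手学深度学习v2】 (Anchor Boxes, Dive into Deep Learning v2)
Official notes: 锚框 (Anchor Boxes)

1. Anchor Boxes

Object detection algorithms typically sample a large number of regions in the input image, determine whether those regions contain objects of interest, and adjust the region boundaries to predict the ground-truth bounding box of each object more accurately. Different models may use different region-sampling schemes. Here we introduce one of them: generating multiple bounding boxes with different scales and aspect ratios centered on each pixel. These bounding boxes are called anchor boxes.

First, let us reduce the printing precision to get more concise output.

    %matplotlib inline
    import torch
    from d2l import torch as d2l

    torch.set_printoptions(2)  # Reduce printing precision

1.1 Generating Multiple Anchor Boxes

    #@save
    def multibox_prior(data, sizes, ratios):
        """Generate anchor boxes with different shapes centered on each pixel."""
        in_height, in_width = data.shape[-2:]
        device, num_sizes, num_ratios = data.device, len(sizes), len(ratios)
        boxes_per_pixel = (num_sizes + num_ratios - 1)
        size_tensor = torch.tensor(sizes, device=device)
        ratio_tensor = torch.tensor(ratios, device=device)

        # Offsets are needed to move the anchor to the center of a pixel.
        # Since a pixel has height 1 and width 1, we offset the center by 0.5.
        offset_h, offset_w = 0.5, 0.5
        steps_h = 1.0 / in_height  # Scaled step along the y axis
        steps_w = 1.0 / in_width  # Scaled step along the x axis

        # Generate all center points for the anchor boxes
        center_h = (torch.arange(in_height, device=device) + offset_h) * steps_h
        center_w = (torch.arange(in_width, device=device) + offset_w) * steps_w
        shift_y, shift_x = torch.meshgrid(center_h, center_w, indexing='ij')
        shift_y, shift_x = shift_y.reshape(-1), shift_x.reshape(-1)

        # Generate `boxes_per_pixel` widths and heights, later used to build
        # the corner coordinates (xmin, ymin, xmax, ymax) of the anchor boxes
        w = torch.cat((size_tensor * torch.sqrt(ratio_tensor[0]),
                       sizes[0] * torch.sqrt(ratio_tensor[1:])))\
                       * in_height / in_width  # Handle rectangular inputs
        h = torch.cat((size_tensor / torch.sqrt(ratio_tensor[0]),
                       sizes[0] / torch.sqrt(ratio_tensor[1:])))
        # Divide by 2 to get the half width and half height
        anchor_manipulations = torch.stack((-w, -h, w, h)).T.repeat(
            in_height * in_width, 1) / 2

        # Each center point has `boxes_per_pixel` anchor boxes, so the grid of
        # anchor box centers is repeated `boxes_per_pixel` times
        out_grid = torch.stack([shift_x, shift_y, shift_x, shift_y],
                               dim=1).repeat_interleave(boxes_per_pixel, dim=0)
        output = out_grid + anchor_manipulations
        return output.unsqueeze(0)

The returned anchor box variable Y has shape (batch size, number of anchor boxes, 4).

    img = d2l.plt.imread('../img/catdog.jpg')
    h, w = img.shape[:2]

    print(h, w)
    X = torch.rand(size=(1, 3, h, w))
    Y = multibox_prior(X, sizes=[0.75, 0.5, 0.25], ratios=[1, 2, 0.5])
    Y.shape

Output:

    561 728
    torch.Size([1, 2042040, 4])

With three sizes and three ratios, the function keeps only the combinations that contain sizes[0] or ratios[0], which gives 3 + 3 - 1 = 5 anchor boxes per pixel and 561 × 728 × 5 = 2042040 anchor boxes in total.

After reshaping Y to (image height, image width, number of anchor boxes centered on the same pixel, 4), we can obtain all the anchor boxes centered on a specified pixel position. Below we access the first anchor box centered on (250, 250). It has four elements: the (x, y) coordinates of the upper-left corner and the (x, y) coordinates of the lower-right corner. In the output, the coordinates on the two axes are divided by the width and height of the image, respectively.

    boxes = Y.reshape(h, w, 5, 4)
    boxes[250, 250, 0, :]

Output:

    tensor([0.06, 0.07, 0.63, 0.82])

To display all anchor boxes centered on one pixel of the image, we define the following show_bboxes function, which draws multiple bounding boxes on the image.

    #@save
    def show_bboxes(axes, bboxes, labels=None, colors=None):
        """Show all bounding boxes."""
        def _make_list(obj, default_values=None):
            if obj is None:
                obj = default_values
            elif not isinstance(obj, (list, tuple)):
                obj = [obj]
            return obj

        labels = _make_list(labels)
        colors = _make_list(colors, ['b', 'g', 'r', 'm', 'c'])
        for i, bbox in enumerate(bboxes):
            color = colors[i % len(colors)]
            rect = d2l.bbox_to_rect(bbox.detach().numpy(), color)
            axes.add_patch(rect)
            if labels and len(labels) > i:
                text_color = 'k' if color == 'w' else 'w'
                axes.text(rect.xy[0], rect.xy[1], labels[i],
                          va='center', ha='center', fontsize=9,
                          color=text_color,
                          bbox=dict(facecolor=color, lw=0))

As shown above, the x and y coordinates in the variable boxes have been divided by the width and height of the image, respectively. When drawing anchor boxes we need to restore their original coordinate values, so we define the variable bbox_scale below. We can now draw all the anchor boxes centered on (250, 250) in the image. The blue anchor box with a scale of 0.75 and an aspect ratio of 1 surrounds the dog in the image well.

    d2l.set_figsize()
    bbox_scale = torch.tensor((w, h, w, h))
    fig = d2l.plt.imshow(img)
    show_bboxes(fig.axes, boxes[250, 250, :, :] * bbox_scale,
                ['s=0.75, r=1', 's=0.5, r=1', 's=0.25, r=1', 's=0.75, r=2',
                 's=0.75, r=0.5'])

2. Intersection over Union (IoU)

The following sections use the intersection over union (IoU) to measure the similarity between anchor boxes and ground-truth bounding boxes, and between different anchor boxes. Given two lists of anchor boxes or bounding boxes, the following box_iou function computes their pairwise IoU across the two lists.

    #@save
    def box_iou(boxes1, boxes2):
        """Compute pairwise IoU across two lists of anchor or bounding boxes."""
        box_area = lambda boxes: ((boxes[:, 2] - boxes[:, 0]) *
                                  (boxes[:, 3] - boxes[:, 1]))
        # Shapes of boxes1, boxes2, areas1, areas2:
        # boxes1: (number of boxes1, 4),
        # boxes2: (number of boxes2, 4),
        # areas1: (number of boxes1,),
        # areas2: (number of boxes2,)
        areas1 = box_area(boxes1)
        areas2 = box_area(boxes2)
        # Shape of inter_upperlefts, inter_lowerrights, inters:
        # (number of boxes1, number of boxes2, 2)
        inter_upperlefts = torch.max(boxes1[:, None, :2], boxes2[:, :2])
        inter_lowerrights = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])
        inters = (inter_lowerrights - inter_upperlefts).clamp(min=0)
        # Shape of inter_areas and union_areas:
        # (number of boxes1, number of boxes2)
        inter_areas = inters[:, :, 0] * inters[:, :, 1]
        union_areas = areas1[:, None] + areas2 - inter_areas
        return inter_areas / union_areas

3. Labeling Anchor Boxes in Training Data

In a training dataset, we treat each anchor box as a training example. To train an object detection model, we need class and offset labels for each anchor box: the former is the class of the object associated with the anchor box, and the latter is the offset of the ground-truth bounding box relative to the anchor box. During prediction, we generate multiple anchor boxes for each image, predict the classes and offsets of all anchor boxes, adjust their positions according to the predicted offsets to obtain the predicted bounding boxes, and finally output only those predicted bounding boxes that satisfy certain criteria.

An object detection training set comes with labels for the locations of the ground-truth bounding boxes and the classes of the objects they surround. To label any generated anchor box, we refer to the location and class label of the ground-truth bounding box assigned to it, which is the one closest to the anchor box.

3.1 Assigning Ground-Truth Bounding Boxes to Anchor Boxes

The algorithm is implemented below. It first assigns to each anchor box the ground-truth bounding box with the largest IoU, provided that IoU reaches the threshold. It then loops once per ground-truth bounding box, each time picking the largest remaining entry of the IoU matrix, assigning that pair, and discarding its row and column, so that every ground-truth bounding box ends up with an anchor box.

    #@save
    def assign_anchor_to_bbox(ground_truth, anchors, device, iou_threshold=0.5):
        """Assign the closest ground-truth bounding boxes to anchor boxes."""
        num_anchors, num_gt_boxes = anchors.shape[0], ground_truth.shape[0]
        # Element x_ij in row i and column j is the IoU of anchor box i
        # and ground-truth bounding box j
        jaccard = box_iou(anchors, ground_truth)
        # Tensor holding the ground-truth bounding box assigned to each anchor
        anchors_bbox_map = torch.full((num_anchors,), -1, dtype=torch.long,
                                      device=device)
        # Assign ground-truth bounding boxes according to the threshold
        max_ious, indices = torch.max(jaccard, dim=1)
        anc_i = torch.nonzero(max_ious >= iou_threshold).reshape(-1)
        box_j = indices[max_ious >= iou_threshold]
        anchors_bbox_map[anc_i] = box_j
        col_discard = torch.full((num_anchors,), -1)
        row_discard = torch.full((num_gt_boxes,), -1)
        for _ in range(num_gt_boxes):
            max_idx = torch.argmax(jaccard)
            box_idx = (max_idx % num_gt_boxes).long()
            anc_idx = (max_idx / num_gt_boxes).long()
            anchors_bbox_map[anc_idx] = box_idx
            jaccard[:, box_idx] = col_discard
            jaccard[anc_idx, :] = row_discard
        return anchors_bbox_map

3.2 Labeling Classes and Offsets

    #@save
    def offset_boxes(anchors, assigned_bb, eps=1e-6):
        """Transform anchor box offsets."""
        c_anc = d2l.box_corner_to_center(anchors)
        c_assigned_bb = d2l.box_corner_to_center(assigned_bb)
        offset_xy = 10 * (c_assigned_bb[:, :2] - c_anc[:, :2]) / c_anc[:, 2:]
        offset_wh = 5 * torch.log(eps + c_assigned_bb[:, 2:] / c_anc[:, 2:])
        offset = torch.cat([offset_xy, offset_wh], axis=1)
        return offset

    #@save
    def multibox_target(anchors, labels):
        """Label anchor boxes using ground-truth bounding boxes."""
        batch_size, anchors = labels.shape[0], anchors.squeeze(0)
        batch_offset, batch_mask, batch_class_labels = [], [], []
        device, num_anchors = anchors.device, anchors.shape[0]
        for i in range(batch_size):
            label = labels[i, :, :]
            anchors_bbox_map = assign_anchor_to_bbox(
                label[:, 1:], anchors, device)
            bbox_mask = ((anchors_bbox_map >= 0).float().unsqueeze(-1)).repeat(
                1, 4)
            # Initialize class labels and assigned bounding box
            # coordinates with zeros
            class_labels = torch.zeros(num_anchors, dtype=torch.long,
                                       device=device)
            assigned_bb = torch.zeros((num_anchors, 4), dtype=torch.float32,
                                      device=device)
            # Label the classes of anchor boxes using their assigned
            # ground-truth bounding boxes. An unassigned anchor box is
            # labeled as background (value zero).
            indices_true = torch.nonzero(anchors_bbox_map >= 0)
            bb_idx = anchors_bbox_map[indices_true]
            class_labels[indices_true] = label[bb_idx, 0].long() + 1
            assigned_bb[indices_true] = label[bb_idx, 1:]
            # Offset transformation
            offset = offset_boxes(anchors, assigned_bb) * bbox_mask
            batch_offset.append(offset.reshape(-1))
            batch_mask.append(bbox_mask.reshape(-1))
            batch_class_labels.append(class_labels)
        bbox_offset = torch.stack(batch_offset)
        bbox_mask = torch.stack(batch_mask)
        class_labels = torch.stack(batch_class_labels)
        return (bbox_offset, bbox_mask, class_labels)

3.3 An Example

Let us illustrate anchor box labeling with a concrete example. We define ground-truth bounding boxes for the dog and the cat in the loaded image, where the first element is the class (0 for dog, 1 for cat) and the remaining four elements are the (x, y) coordinates of the upper-left and lower-right corners, in the range 0 to 1. We also construct five anchor boxes, labeled by the coordinates of their upper-left and lower-right corners: A0, ..., A4 (the index starts from 0). We then plot these ground-truth bounding boxes and anchor boxes in the image.

    ground_truth = torch.tensor([[0, 0.1, 0.08, 0.52, 0.92],
                                 [1, 0.55, 0.2, 0.9, 0.88]])
    anchors = torch.tensor([[0, 0.1, 0.2, 0.3], [0.15, 0.2, 0.4, 0.4],
                            [0.63, 0.05, 0.88, 0.98], [0.66, 0.45, 0.8, 0.8],
                            [0.57, 0.3, 0.92, 0.9]])

    fig = d2l.plt.imshow(img)
    show_bboxes(fig.axes, ground_truth[:, 1:] * bbox_scale, ['dog', 'cat'], 'k')
    show_bboxes(fig.axes, anchors * bbox_scale, ['0', '1', '2', '3', '4']);

Using the multibox_target function defined above, we can label the classes and offsets of these anchor boxes based on the ground-truth bounding boxes of the dog and the cat. In this example, the class indices of background, dog, and cat are 0, 1, and 2, respectively. Below we add a batch dimension to the anchor boxes and the ground-truth bounding boxes.

    labels = multibox_target(anchors.unsqueeze(dim=0),
                             ground_truth.unsqueeze(dim=0))

The returned result has three elements, all of them tensors. The third element contains the labeled classes of the input anchor boxes.

    labels[2]

Output:

    tensor([[0, 1, 2, 0, 2]])

The second returned element is a mask variable of shape (batch size, four times the number of anchor boxes). The elements of the mask correspond one-to-one to the four offset values of each anchor box. Since we do not care about detecting background, the offsets of the negative class should not affect the objective function. Through elementwise multiplication, the zeros in the mask filter out negative-class offsets before the objective function is computed.

    labels[1]

Output:

    tensor([[0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 1., 1.,
             1., 1.]])

The first returned element contains the four offset values labeled for each anchor box. Note that the offsets of negative-class anchor boxes are labeled as zeros.

    labels[0]

Output:

    tensor([[-0.00e+00, -0.00e+00, -0.00e+00, -0.00e+00,  1.40e+00,  1.00e+01,
              2.59e+00,  7.18e+00, -1.20e+00,  2.69e-01,  1.68e+00, -1.57e+00,
             -0.00e+00, -0.00e+00, -0.00e+00, -0.00e+00, -5.71e-01, -1.00e+00,
              4.17e-06,  6.26e-01]])

4. Predicting Bounding Boxes with Non-Maximum Suppression

During prediction, we first generate multiple anchor boxes for the image and then predict a class and offsets for each of them. A predicted bounding box is obtained from an anchor box together with its predicted offsets. Below we implement the offset_inverse function, which takes anchor boxes and offset predictions as input and applies the inverse offset transformation to return the predicted bounding box coordinates.

    #@save
    def offset_inverse(anchors, offset_preds):
        """Predict bounding boxes from anchor boxes with predicted offsets."""
        anc = d2l.box_corner_to_center(anchors)
        pred_bbox_xy = (offset_preds[:, :2] * anc[:, 2:] / 10) + anc[:, :2]
        pred_bbox_wh = torch.exp(offset_preds[:, 2:] / 5) * anc[:, 2:]
        pred_bbox = torch.cat((pred_bbox_xy, pred_bbox_wh), axis=1)
        predicted_bbox = d2l.box_center_to_corner(pred_bbox)
        return predicted_bbox

The following nms function sorts the confidence scores in descending order and returns the indices of the boxes that are kept.

    #@save
    def nms(boxes, scores, iou_threshold):
        """Sort the confidence scores of predicted bounding boxes."""
        B = torch.argsort(scores, dim=-1, descending=True)
        keep = []  # Indices of the predicted bounding boxes to keep
        while B.numel() > 0:
            i = B[0]
            keep.append(i)
            if B.numel() == 1: break
            iou = box_iou(boxes[i, :].reshape(-1, 4),
                          boxes[B[1:], :].reshape(-1, 4)).reshape(-1)
            inds = torch.nonzero(iou <= iou_threshold).reshape(-1)
            B = B[inds + 1]
        return torch.tensor(keep, device=boxes.device)

We define the following multibox_detection function to apply non-maximum suppression to the predicted bounding boxes. Do not worry if the implementation looks a bit complicated: right after it, we show how it works on a concrete example.

    #@save
    def multibox_detection(cls_probs, offset_preds, anchors, nms_threshold=0.5,
                           pos_threshold=0.009999999):
        """Predict bounding boxes using non-maximum suppression."""
        device, batch_size = cls_probs.device, cls_probs.shape[0]
        anchors = anchors.squeeze(0)
        num_classes, num_anchors = cls_probs.shape[1], cls_probs.shape[2]
        out = []
        for i in range(batch_size):
            cls_prob, offset_pred = cls_probs[i], offset_preds[i].reshape(-1, 4)
            conf, class_id = torch.max(cls_prob[1:], 0)
            predicted_bb = offset_inverse(anchors, offset_pred)
            keep = nms(predicted_bb, conf, nms_threshold)

            # Find all non_keep indices and set their class to background
            all_idx = torch.arange(num_anchors, dtype=torch.long, device=device)
            combined = torch.cat((keep, all_idx))
            uniques, counts = combined.unique(return_counts=True)
            non_keep = uniques[counts == 1]
            all_id_sorted = torch.cat((keep, non_keep))
            class_id[non_keep] = -1
            class_id = class_id[all_id_sorted]
            conf, predicted_bb = conf[all_id_sorted], predicted_bb[all_id_sorted]
            # pos_threshold is a threshold for positive (non-background)
            # predictions
            below_min_idx = (conf < pos_threshold)
            class_id[below_min_idx] = -1
            conf[below_min_idx] = 1 - conf[below_min_idx]
            pred_info = torch.cat((class_id.unsqueeze(1),
                                   conf.unsqueeze(1),
                                   predicted_bb), dim=1)
            out.append(pred_info)
        return torch.stack(out)

Now let us apply the above implementation to a concrete example with four anchor boxes. For simplicity, we assume that the predicted offsets are all zeros, which means that the predicted bounding boxes are the anchor boxes themselves. For each of the classes background, dog, and cat, we also define its predicted probability.

    anchors = torch.tensor([[0.1, 0.08, 0.52, 0.92], [0.08, 0.2, 0.56, 0.95],
                            [0.15, 0.3, 0.62, 0.91], [0.55, 0.2, 0.9, 0.88]])
    offset_preds = torch.tensor([0] * anchors.numel())
    cls_probs = torch.tensor([[0] * 4,  # Predicted background probability
                              [0.9, 0.8, 0.7, 0.1],  # Predicted dog probability
                              [0.1, 0.2, 0.3, 0.9]])  # Predicted cat probability

We can plot these predicted bounding boxes with their confidence on the image.

    fig = d2l.plt.imshow(img)
    show_bboxes(fig.axes, anchors * bbox_scale,
                ['dog=0.9', 'dog=0.8', 'dog=0.7', 'cat=0.9'])

Now we can call the multibox_detection function to perform non-maximum suppression, with the threshold set to 0.5. Note that we add a batch dimension to the tensor inputs. The returned result has shape (batch size, number of anchor boxes, 6). The six elements in the innermost dimension describe a single predicted bounding box. The first element is the predicted class index, which starts from 0 (0 is dog and 1 is cat); the value -1 indicates background or removal by non-maximum suppression. The second element is the confidence of the predicted bounding box. The remaining four elements are the (x, y) coordinates of the upper-left and lower-right corners of the predicted bounding box, in the range 0 to 1.

    output = multibox_detection(cls_probs.unsqueeze(dim=0),
                                offset_preds.unsqueeze(dim=0),
                                anchors.unsqueeze(dim=0),
                                nms_threshold=0.5)
    output

Output:

    tensor([[[ 0.00,  0.90,  0.10,  0.08,  0.52,  0.92],
             [ 1.00,  0.90,  0.55,  0.20,  0.90,  0.88],
             [-1.00,  0.80,  0.08,  0.20,  0.56,  0.95],
             [-1.00,  0.70,  0.15,  0.30,  0.62,  0.91]]])

After removing the predicted bounding boxes of class -1, we can output the final predicted bounding boxes kept by non-maximum suppression.

    fig = d2l.plt.imshow(img)
    for i in output[0].detach().numpy():
        if i[0] == -1:
            continue
        label = ('dog=', 'cat=')[int(i[0])] + str(i[1])
        show_bboxes(fig.axes, [torch.tensor(i[2:]) * bbox_scale], label)

In practice, we can remove predicted bounding boxes with low confidence even before performing non-maximum suppression, which reduces the computation in this algorithm. We can also post-process the output of non-maximum suppression, for example by keeping only the results with higher confidence as the final output.

Summary

One family of object detection algorithms makes its predictions based on anchor boxes. A large number of anchor boxes are first generated and labeled, and each anchor box is used as one training example. During prediction, NMS is used to remove redundant predictions.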
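As a cross-check on the non-maximum suppression example in section 4, the same greedy procedure can be traced by hand on the four example boxes. The sketch below is a minimal pure-Python version that needs neither torch nor d2l. The helper names `iou` and `greedy_nms` are hypothetical and are not part of the original notes or of the d2l library; the boxes, scores, and the 0.5 threshold are copied from the example.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first (hypothetical helper)."""
    # Sort indices by descending score; ties keep their original order.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Boxes and per-box confidence (max of the dog and cat probabilities)
# taken from the section 4 example.
boxes = [(0.1, 0.08, 0.52, 0.92), (0.08, 0.2, 0.56, 0.95),
         (0.15, 0.3, 0.62, 0.91), (0.55, 0.2, 0.9, 0.88)]
scores = [0.9, 0.8, 0.7, 0.9]

kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print(kept)  # prints [0, 3]
```

Boxes 1 and 2 overlap box 0 with an IoU of roughly 0.74 and 0.55, both above the threshold, so they are suppressed, while box 3 does not overlap box 0 at all. This agrees with the output tensor in section 4, where only the first two rows keep a non-negative class index.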