

Author: 五速梦信息网
Date: 2026-03-21 10:39


Follow my CSDN: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/143388189
Disclaimer: this article draws on personal knowledge and publicly available material and is intended for academic exchange only. Discussion is welcome; reposting is not permitted.

Grounded SAM2 is a vision AI framework that combines several state-of-the-art models, including GroundingDINO, Florence-2, and SAM2, to handle open-domain object detection, segmentation, and tracking. It locates objects in an image from a natural-language description, generates fine-grained segmentation masks for them, and tracks them through a video sequence while keeping their IDs consistent.

Paper: Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks. The SAM version has been upgraded from 1.0 to 2.0.

1. Environment setup

GitHub: Grounded-SAM-2

    git clone https://github.com/IDEA-Research/Grounded-SAM-2
    cd Grounded-SAM-2

Prepare the models. The SAM 2.1 checkpoint is in .pt format and the GroundingDINO checkpoint is in .pth format:

    wget "https://huggingface.co/facebook/sam2.1-hiera-large/resolve/main/sam2.1_hiera_large.pt?download=true" -O sam2.1_hiera_large.pt
    wget https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swint_ogc.pth

Latest model locations (each block is run from the repository root):

    cd checkpoints
    ln -s [your path]/llm/workspace_comfyui/ComfyUI/models/sam2/sam2_hiera_large.pt sam2_hiera_large.pt

    cd gdino_checkpoints
    ln -s [your path]/llm/workspace_comfyui/ComfyUI/models/grounding-dino/groundingdino_swinb_cogcoor.pth groundingdino_swinb_cogcoor.pth
    ln -s [your path]/llm/workspace_comfyui/ComfyUI/models/grounding-dino/groundingdino_swint_ogc.pth groundingdino_swint_ogc.pth

Activate the environment:

    conda activate sam2

Test PyTorch in a Python session, then check CUDA_HOME in the shell:

    import torch
    print(torch.__version__)  # 2.5.0+cu124
    print(torch.cuda.is_available())  # True
    exit()

    echo $CUDA_HOME

Install Grounding DINO:

    pip install --no-build-isolation -e grounding_dino
    pip show groundingdino

Install SAM2:

    pip install --no-build-isolation -e .
    pip install --no-build-isolation -e ".[notebooks]"  # for Jupyter
    pip show SAM-2

For configuration parameters, see the article "Open-Source Visual Segmentation Algorithm SAM2 (Segment Anything 2): Configuration and Inference".

Install the dependencies:

    cd grounding_dino/
    pip install -r requirements.txt --verbose

2. Image test

Test script: grounded_sam2_local_demo.py

Import the required packages:

    import os
    import cv2
    import json
    import torch
    import numpy as np
    import supervision as sv
    import pycocotools.mask as mask_util
    from pathlib import Path
    from torchvision.ops import box_convert
    from sam2.build_sam import build_sam2
    from sam2.sam2_image_predictor import SAM2ImagePredictor
    from grounding_dino.groundingdino.util.inference import load_model, load_image, predict

    from PIL import Image
    import matplotlib.pyplot as plt

Configure the data and dependencies: the input text prompt (for example, socks and guitar), the input image, the SAM2 v2.1 model and its config, the GroundingDINO (DETR with Improved deNoising anchOr boxes) model and its config, the box and text thresholds, and the output folder and JSON flag:

    TEXT_PROMPT = "socks. guitar."
    # IMG_PATH = "notebooks/images/truck.jpg"
    IMG_PATH = "[your path]/llm/vision_test_data/image2.png"

    # preview the input image
    image = Image.open(IMG_PATH)
    plt.figure(figsize=(9, 6))
    plt.title("annotated_frame")
    plt.imshow(image)

    SAM2_CHECKPOINT = "./checkpoints/sam2.1_hiera_large.pt"
    SAM2_MODEL_CONFIG = "configs/sam2.1/sam2.1_hiera_l.yaml"
    GROUNDING_DINO_CONFIG = "grounding_dino/groundingdino/config/GroundingDINO_SwinT_OGC.py"
    GROUNDING_DINO_CHECKPOINT = "gdino_checkpoints/groundingdino_swint_ogc.pth"
    BOX_THRESHOLD = 0.35
    TEXT_THRESHOLD = 0.25
    DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
    OUTPUT_DIR = Path("outputs/grounded_sam2_local_demo")
    DUMP_JSON_RESULTS = True

    # create output directory
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

Load the SAM2 model to obtain sam2_predictor:

    # build SAM2 image predictor
    sam2_checkpoint = SAM2_CHECKPOINT
    model_cfg = SAM2_MODEL_CONFIG
    sam2_model = build_sam2(model_cfg, sam2_checkpoint, device=DEVICE)
    sam2_predictor = SAM2ImagePredictor(sam2_model)

Load the GroundingDINO model to obtain grounding_model:

    # build grounding dino model
    grounding_model = load_model(
        model_config_path=GROUNDING_DINO_CONFIG,
        model_checkpoint_path=GROUNDING_DINO_CHECKPOINT,
        device=DEVICE,
    )

Load the image into SAM2:

    text = TEXT_PROMPT
    img_path = IMG_PATH

    # image_source: original image; image: normalized (transformed) image
    image_source, image = load_image(img_path)
    sam2_predictor.set_image(image_source)

GroundingDINO predicts the bounding boxes from the model, the image, the text, and the box and text thresholds. Both load_image() and predict() come from GroundingDINO, so the data preprocessing matches the model:

    boxes, confidences, labels = predict(
        model=grounding_model,
        image=image,
        caption=text,
        box_threshold=BOX_THRESHOLD,
        text_threshold=TEXT_THRESHOLD,
    )

Adapt the box format:

    h, w, _ = image_source.shape
    boxes = boxes * torch.Tensor([w, h, w, h])
    input_boxes = box_convert(boxes=boxes, in_fmt="cxcywh", out_fmt="xyxy").numpy()

PyTorch settings that SAM2 depends on:

    # FIXME: figure how does this influence the G-DINO model
    torch.autocast(device_type="cuda", dtype=torch.bfloat16).__enter__()

    if torch.cuda.get_device_properties(0).major >= 8:
        # turn on tfloat32 for Ampere GPUs (https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices)
        torch.backends.cuda.matmul.allow_tf32 = True
        torch.backends.cudnn.allow_tf32 = True

Run the SAM2 prediction on the image:

    masks, scores, logits = sam2_predictor.predict(
        point_coords=None,
        point_labels=None,
        box=input_boxes,
        multimask_output=False,
    )

Post-process the prediction:

    # Post-process the output of the model to get the masks, scores, and logits for visualization

    # convert the shape to (n, H, W)
    if masks.ndim == 4:
        masks = masks.squeeze(1)

    confidences = confidences.numpy().tolist()
    class_names = labels

    class_ids = np.array(list(range(len(class_names))))

    labels = [
        f"{class_name} {confidence:.2f}"
        for class_name, confidence
        in zip(class_names, confidences)
    ]

Visualize the results:

    # Visualize image with supervision useful API
    img = cv2.imread(img_path)
    detections = sv.Detections(
        xyxy=input_boxes,  # (n, 4)
        mask=masks.astype(bool),  # (n, h, w)
        class_id=class_ids,
    )

    box_annotator = sv.BoxAnnotator()
    annotated_frame = box_annotator.annotate(scene=img.copy(), detections=detections)

    label_annotator = sv.LabelAnnotator()
    annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels)
    cv2.imwrite(os.path.join(OUTPUT_DIR, "groundingdino_annotated_image.jpg"), annotated_frame)
    plt.figure(figsize=(9, 6))
    plt.title("annotated_frame")
    plt.imshow(annotated_frame[:, :, ::-1])  # BGR -> RGB for matplotlib

    mask_annotator = sv.MaskAnnotator()
    annotated_frame = mask_annotator.annotate(scene=annotated_frame, detections=detections)
    cv2.imwrite(os.path.join(OUTPUT_DIR, "grounded_sam2_annotated_image_with_mask.jpg"), annotated_frame)
    plt.figure(figsize=(9, 6))
    plt.title("annotated_frame")
    plt.imshow(annotated_frame[:, :, ::-1])  # BGR -> RGB for matplotlib

GroundingDINO box result: both entity classes, socks and guitar, are detected accurately.

[Figure: GroundingDINO box detections]

[Figure: SAM2 segmentation result]
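The box handling earlier in the script (scaling GroundingDINO's normalized boxes by the image size, then converting cxcywh to xyxy) can be illustrated without PyTorch. The following is a minimal sketch of the same arithmetic in plain Python. The function name and the example box are hypothetical and are not part of the demo script.

```python
def cxcywh_norm_to_xyxy_pixels(box, width, height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, bw, bh = box
    # Scale normalized values to pixels, mirroring boxes * [w, h, w, h].
    cx, bw = cx * width, bw * width
    cy, bh = cy * height, bh * height
    # Center/size to corner coordinates, mirroring cxcywh -> xyxy.
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)


# Hypothetical example: a box centered in a 640x480 image,
# half as wide and half as tall as the image.
example = cxcywh_norm_to_xyxy_pixels((0.5, 0.5, 0.5, 0.5), width=640, height=480)
print(example)  # (160.0, 120.0, 480.0, 360.0)
```

SAM2's box prompt expects corner coordinates in pixels of the original image, which is why both the scaling and the format conversion are needed before calling sam2_predictor.predict().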
Convert the results to the COCO data format:

    def single_mask_to_rle(mask):
        # pycocotools expects a Fortran-ordered uint8 array of shape (H, W, 1)
        rle = mask_util.encode(np.array(mask[:, :, None], order="F", dtype="uint8"))[0]
        # counts is returned as bytes; decode it so it can be serialized to JSON
        rle["counts"] = rle["counts"].decode("utf-8")
        return rle

    if DUMP_JSON_RESULTS:
        # convert mask into rle format
        mask_rles = [single_mask_to_rle(mask) for mask in masks]

        input_boxes = input_boxes.tolist()
        scores = scores.tolist()
        # save the results in standard format
        results = {
            "image_path": img_path,
            "annotations": [
                {
                    "class_name": class_name,
                    "bbox": box,
                    "segmentation": mask_rle,
                    "score": score,
                }
                for class_name, box, mask_rle, score in zip(class_names, input_boxes, mask_rles, scores)
            ],
            "box_format": "xyxy",
            "img_width": w,
            "img_height": h,
        }

        with open(os.path.join(OUTPUT_DIR, "grounded_sam2_local_image_demo_results.json"), "w") as f:
            json.dump(results, f, indent=4)
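To make the RLE step less opaque, here is a minimal pure-Python sketch of the run-length idea behind COCO masks. Pixels are read in column-major order, and the counts alternate between runs of 0 and runs of 1, starting with 0. It produces the uncompressed counts list only, not the compressed string that pycocotools returns. The function names and the toy mask are hypothetical.

```python
def mask_to_uncompressed_rle(mask):
    """Encode a binary mask (list of rows) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    counts = []
    previous = 0  # runs alternate, and the first run always counts zeros
    run = 0
    # Column-major traversal: walk down each column, left to right.
    for col in range(width):
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_area(rle):
    """Foreground pixel count: the sum of the runs at odd positions."""
    return sum(rle["counts"][1::2])


toy_mask = [
    [0, 1, 1],
    [0, 1, 0],
]
toy_rle = mask_to_uncompressed_rle(toy_mask)
print(toy_rle)            # {'size': [2, 3], 'counts': [2, 3, 1]}
print(rle_area(toy_rle))  # 3
```

The column-major order is the reason single_mask_to_rle() passes order="F" when it builds the array for mask_util.encode().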
