CV Tasks

Taxonomy [1]

image-level

  • image recognition

  • Retrieval (image-text retrieval)

  • Caption (image captioning)

  • VQA (visual question answering)
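As a rough illustration of the image-level retrieval task, the sketch below ranks whole images against a text query by cosine similarity of their embeddings. The embeddings are toy, hand-written vectors and the names `cosine_similarity` and `rank_images` are hypothetical; a real system would obtain the vectors from trained image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image ids sorted from most to least similar to the text."""
    scores = {
        image_id: cosine_similarity(text_embedding, embedding)
        for image_id, embedding in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Assumption: toy embeddings standing in for encoder outputs.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
text_embedding = [0.8, 0.2, 0.1]  # stands in for the query "a photo of a cat"

ranking = rank_images(text_embedding, image_embeddings)
print(ranking)
```

The output is one score per image, which is what makes the task image-level: nothing in the result says where in the image the match occurs.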

region-level

  • Object detection

    • DETR -> DINO -> Grounding DINO
  • dense caption

  • phrase grounding
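Region-level tasks output boxes, and predicted boxes are commonly compared with reference boxes by intersection over union (IoU). A minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function name `box_iou` and the example boxes are illustrative only.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)     # hypothetical detector output
ground_truth = (5, 5, 15, 15)  # hypothetical annotation
print(box_iou(predicted, ground_truth))
```

Here the overlap is a 5 x 5 square (area 25) and the union is 100 + 100 - 25 = 175, so the IoU is 25 / 175.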

pixel-level

  • Segmentation
    • generic segmentation
    • referring segmentation
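Pixel-level output is a per-pixel mask, which is strictly finer than a region-level box: a box can always be derived from a mask, but not the reverse. A minimal sketch, assuming a binary mask stored as a list of rows; `mask_to_box` is a hypothetical helper name.

```python
def mask_to_box(mask):
    """Return the tight (x1, y1, x2, y2) box around foreground pixels, or None."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None  # empty mask: no foreground pixels
    # x2 and y2 are exclusive, so add 1 to the largest foreground index.
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Toy 4 x 5 binary mask (1 = foreground).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = mask_to_box(mask)
print(box)  # (1, 1, 4, 3)
```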

Other

  • Comparison

    • [CNN: deeper networks]
    • [Transformer: no limitations]
  • CV tasks

    • Classification
    • Detection
    • Segmentation
    • Tracking
    • Action Recognition

References

  1. [CVPR Tutorial Talk] Towards General Vision Understanding Interface (PDF)