ICCV 2021 Oral: New Task! New Dataset! Cornell University Proposes PVG, a Task Similar to VG but Not VG
Paper link: https://arxiv.org/abs/2108.07253
Project link: https://whoswaldo.github.io/ (not yet open-sourced)
01
02
03
Data Collection
Detecting People in Images and Captions
Estimating Ground Truth Links
Dataset Size and Splits
Validating Test Images with AMT
04
4.1. Model
4.2. Learning
Box–Name Matching Losses
Unlinked Box Classification Loss
05
5.1. Comparison to Prior Work
5.2. Ablation study
5.3. Analysis of results
06
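About the Author
Research area: operator of the FightingCV public account; research focuses on multimodal content understanding, in particular tasks that combine the visual and language modalities, with the aim of promoting real-world applications of Vision-Language models.
Zhihu / WeChat public account: FightingCV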
END