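Wenbin Wang (Kenneth Wong) is currently a researcher at Huawei Noah’s Ark Lab. He received his Ph.D. from the vision research group (VIPL) of the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, advised by Prof. Xilin Chen and Prof. Ruiping Wang. His research interests include, but are not limited to, 2D/3D scene understanding, object detection, scene graph generation, and image captioning. Before that (2013-2017), he received his Bachelor of Engineering degree in Computer Science and Technology from the College of Computer and Control Engineering, Nankai University.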
Download my CV.
“Those times when you get up early and you work hard; those times when you stay up late and you work hard; those times when you don’t feel like working, you’re too tired, you don’t want to push yourself, but you do it anyway. That is actually the dream. That’s the dream.”
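Ph.D. in Computer Vision, 2017 - 2022
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
B.Eng. in Computer Science and Technology, 2013 - 2017
Nankai University, Tianjin, China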
[May. 9, 2023]: One paper was accepted to International Journal of Computer Vision (IJCV).
[Sept. 26, 2022]: I joined Huawei Noah’s Ark Lab.
[Sept. 18, 2022]: I obtained my Ph.D. degree from ICT, CAS.
[Jul. 19, 2022]: One paper was accepted to SCIENTIA SINICA Informationis.
[Jul. 23, 2021]: One paper was accepted to ICCV 2021!
[Aug. 10, 2020]: The code of our HetH (ECCV 2020) was released.
[Jul. 3, 2020]: One paper was accepted to ECCV 2020!
[Jun. 15 ~ 19, 2019]: I attended CVPR 2019 in Long Beach, CA, USA.
[Mar. 1, 2019]: One paper was accepted to CVPR 2019!
[Sept. 1, 2017]: I joined the Key Lab. of IIP, ICT, CAS as a Ph.D. student. Go Bears!
A scene graph aims to faithfully reveal humans’ perception of image content. When humans look at a scene, they usually attend to the parts that interest them in a particular order of priority. This innate habit indicates a hierarchical preference in human perception. Therefore, we propose generating a Scene Graph of Interest, which is constructed hierarchically so that the important primary content is presented first while the secondary content is presented on demand. To achieve this goal, we propose the Tree-Guided Importance Ranking (TGIR) model. We represent the scene with a hierarchical structure by first detecting objects in the scene and organizing them into a Hierarchical Entity Tree (HET) according to their spatial scale, considering that larger objects are more likely to be noticed immediately.
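As a rough illustration of organizing detected objects into a tree by spatial scale, the following minimal sketch sorts boxes by area and attaches each one to the smallest already-placed box that covers most of it. This is not the implementation from the paper: the `Node` class, the `coverage` rule, the `threshold` value of 0.5, and the example detections are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    box: tuple  # (x1, y1, x2, y2) in pixels
    children: list = field(default_factory=list)


def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def coverage(inner, outer):
    """Fraction of `inner` that lies inside `outer` (hypothetical attachment rule)."""
    ix1, iy1 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix2, iy2 = min(inner[2], outer[2]), min(inner[3], outer[3])
    a = area(inner)
    return area((ix1, iy1, ix2, iy2)) / a if a > 0 else 0.0


def build_tree(detections, image_box, threshold=0.5):
    """Build a scale-ordered tree; `threshold` is an assumed value, not from the paper."""
    root = Node("image", image_box)
    placed = [root]
    # Larger objects are placed first, so every candidate parent is at least as large.
    for name, box in sorted(detections, key=lambda d: area(d[1]), reverse=True):
        node = Node(name, box)
        candidates = [p for p in placed if coverage(box, p.box) >= threshold]
        # Attach to the smallest sufficiently covering node, falling back to the root.
        parent = min(candidates, key=lambda p: area(p.box)) if candidates else root
        parent.children.append(node)
        placed.append(node)
    return root


# Made-up detections for illustration only.
dets = [
    ("cup", (120, 100, 160, 150)),
    ("table", (50, 80, 400, 300)),
    ("person", (420, 20, 600, 380)),
]
root = build_tree(dets, image_box=(0, 0, 640, 400))
```

With these made-up boxes, the table and the person sit directly under the image root, and the cup is attached under the table because the table's box fully contains it.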
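In recent years, automatic scene graph generation has attracted growing attention, but the relationships in the generated results are affected by the bias induced by the long-tailed distribution and lean toward head relationships with many samples. Head relationships, however, are often too generic and imprecise, and can easily cause misunderstanding. Because such relationships carry little value, the generated scene graph nearly degenerates into a stack of object information from the scene, which hinders other applications from performing structured reasoning on the graph. To enable a scene graph generator to learn in a more balanced way under such imbalanced data and to produce more diverse and more accurate relationships, especially tail ones, this paper proposes a balanced learning method assisted by an Additional Biased Predictor (ABP).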
If an image tells a story, the scene graph and the image caption are its most popular narrators. Generally, a scene graph tends to be an omniscient generalist, while the image caption prefers to be a specialist that outlines the gist. Many previous studies have found that, as a generalist, a scene graph is not sufficient to serve downstream advanced intelligent tasks unless it can reduce trivial content and noise. In this respect, the image caption is a good teacher. To this end, we let the scene graph borrow this ability from the image caption.
A scene graph aims to faithfully reveal humans’ perception of image content. When humans analyze a scene, they usually prefer to describe the image gist first, namely the major objects and key relations in a scene graph, which contain the essential image content. This inherent perceptual habit implies that there is a hierarchical structure in human preference during scene parsing. Therefore, we argue that a desirable scene graph should also be hierarchically constructed, and we introduce a new scheme for modeling scene graphs.
Relationships are the core of a scene graph, but their prediction is far from satisfactory because of their complex visual diversity. To alleviate this problem, we treat a relationship as an abstract object and explore both its meaningful visual patterns and its contextual information, which are two key aspects of object recognition. Our observation of current datasets reveals that there are close associations among relationships. Therefore, inspired by the successful application of context to object-oriented tasks, we construct a context specifically for relationships, in which all of them are gathered so that recognition can benefit from their associations.
Python, C/C++
Scikit-Learn, Pandas, NumPy, Matplotlib
Common Machine Learning Models
PyTorch, TensorFlow, Git, VS Code, Jupyter