Can you tell whether a comment on an image-based social network was written by a machine or by a real user?
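From chatbots, and from our earlier involvement in XiaoIce's poetry writing, we know how important images and automatic text description are. Going beyond plain, literal image captions, we have spent the past two years studying a "netizen-style" automatic image commenting system aimed at fashion social media sites. The work focuses on generating more vivid and more diverse conversational content about images by combining deep learning techniques for images and text. Along the way, we found that how to measure "diversity" and "freshness" is still an unexplored question.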
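As before, we are happy to share the large-scale data collected in this pilot study with the global research community; it is currently the only dataset for this line of research. The "NetiLook" dataset, which currently contains more than 350,000 photos, 11,000 users, and 5 million comments, is now publicly available for download.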
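The related techniques will also be presented at the top-tier conference WWW 2018 (Cognitive Computing Track). We also found that ordinary users have difficulty telling machine-generated comments from those written by real users.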
Recently, image captioning has shown promise and is expected to be widely applied in chatbots. Yet the "vanilla" sentences generated by current methods, which describe only shallow appearances (e.g., types, colors), do not match netizen style and therefore fail to engage users. Hence, we propose Netizen Style Commenting (NSC) to generate characteristic comments on a user-contributed fashion photo. We aim to modulate the comments into a vivid netizen style that reflects the culture of a designated social community, in the hope of facilitating more engagement with users. In this work, we design a novel framework that consists of three major components: (1) We construct a large-scale clothing dataset named NetiLook to discover netizen-style comments. (2) We propose three unique measures to estimate the diversity of comments. (3) We bring diversity by marrying topic models with neural networks to make up for the insufficiency of conventional image captioning works. The work has been accepted to WWW 2018 (Cognitive Computing Track) and is also available on arXiv.
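To make the notion of comment "diversity" concrete, here is a minimal sketch of a generic distinct-n ratio (unique n-grams divided by total n-grams). It is only an illustration and is not one of the three measures proposed in the paper; the function name `distinct_n` and the whitespace tokenization are assumptions made for this example.

```python
from typing import Iterable


def distinct_n(comments: Iterable[str], n: int = 1) -> float:
    """Return the fraction of n-grams that are unique across all comments.

    Illustrative only: a generic diversity ratio, not a measure from the paper.
    Tokenization is a simple lowercase whitespace split (an assumption).
    """
    ngrams = []
    for comment in comments:
        tokens = comment.lower().split()
        # Collect every contiguous window of n tokens in this comment.
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Repetitive "vanilla" comments score low; varied comments score high.
vanilla = ["nice dress", "nice dress", "nice dress"]
varied = ["love the jacket", "great boots"]
vanilla_score = distinct_n(vanilla)
varied_score = distinct_n(varied)
```

A ratio like this only captures surface-level repetition, which is one reason dedicated diversity measures are worth studying.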
To the best of our knowledge, NetiLook is the first and the largest netizen-style commenting dataset. It contains 355,205 images from 11,034 users and 5 million associated comments collected from Lookbook, a fashion social media site. Most of the images are fashion photos taken from various viewing angles, with distinct filters and different collage styles. Each image is paired with (diverse) user comments, and the average number of comments is 14 per image in our dataset. In addition, each post has a title given by its author, a publishing date, and the number of hearts given by other users. Some users also add the names, brands, and Pantone colors of the clothes, and the stores where they bought them. Furthermore, we collect the authors' public information, which for some of them includes age, gender, country, and the number of fans. In this paper, we only use the comments and the photos from our dataset. Other attributes can be used to refine the system in future work.
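The attributes described above could be modeled roughly as follows. This is a sketch only: the class and field names (`Author`, `Post`, `hearts`, and so on) are hypothetical and do not reflect the actual file format of the released dataset. The last lines check the stated average against the headline counts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:
    """Public author information; every field except the id may be missing."""
    user_id: str
    age: Optional[int] = None
    gender: Optional[str] = None
    country: Optional[str] = None
    fans: Optional[int] = None


@dataclass
class Post:
    """One fashion photo with its metadata and user comments (hypothetical schema)."""
    image_path: str
    title: str
    published: str  # publishing date, kept as a plain string here
    hearts: int
    comments: List[str] = field(default_factory=list)
    author: Optional[Author] = None


def average_comments(posts: List[Post]) -> float:
    """Mean number of comments per post; 0.0 for an empty collection."""
    if not posts:
        return 0.0
    return sum(len(p.comments) for p in posts) / len(posts)


# Sanity check with the headline figures: about 5 million comments
# over 355,205 images gives roughly 14 comments per image.
approx_avg = 5_000_000 / 355_205
```

Keeping the optional attributes as `Optional` fields mirrors the fact that only some users provide them.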
We thank Microsoft Research Asia and the MOST (Ministry of Science and Technology) AI Initiatives Projects for their continuing support, and Dr. Ruihua Song for fruitful discussions. We also benefit from grants from NVIDIA and the NVIDIA DGX-1 AI Supercomputer.
- Wen Hua Lin, Kuan-Ting Chen, Hung Yueh Chiang, Winston H. Hsu. Netizen-Style Commenting on Fashion Photos – Dataset and Diversity Measures. WWW 2018, Cognitive Computing Track. (arXiv available).
- NetiLook Dataset: https://mashyu.github.io/NSC/