
Can you tell whether a comment on an image-sharing social site came from a machine or a real user?

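From chatbots, and from our earlier involvement in XiaoIce's poetry writing, we know how important images and automatic text descriptions are. Going beyond plain, literal image captions, we have spent the past two years studying a "netizen-style" automatic image commenting system aimed at fashion social networks. The research focuses on generating more vivid and more diverse conversational content about images by combining deep learning techniques for images and text. We found that how to measure "diversity" and "freshness" are both unexplored problems.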

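As before, we are happy to share the large-scale data collected in this pilot study with the global research community; it is currently the only dataset for this line of research. The "NetiLook" dataset, which contains more than 350,000 photos, 11,000 users, and 5 million comments, is now publicly available for download.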

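The techniques will also be presented at the top-tier conference WWW 2018 (Cognitive Computing Track). We also found that ordinary users have difficulty distinguishing machine-generated comments from comments written by real users.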

Image captioning has recently shown promise and is expected to be widely applied in chatbots. Yet the “vanilla” sentences generated by current approaches describe only shallow appearances (e.g., types, colors) and do not match netizen style, which results in a lack of engagement with users. Hence, we propose Netizen Style Commenting (NSC) to generate characteristic comments on a user-contributed fashion photo. We aim to modulate the comments into a vivid netizen style that reflects the culture of a designated social community, in the hope of facilitating more engagement with users. In this work, we design a novel framework that consists of three major components: (1) We construct a large-scale clothing dataset named NetiLook to discover netizen-style comments. (2) We propose three unique measures to estimate the diversity of comments. (3) We bring diversity by marrying topic models with neural networks to make up for the insufficiency of conventional image captioning works. The work has been accepted to WWW 2018 (Cognitive Computing Track) and is also available on arXiv.
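To give a feel for what measuring comment diversity can look like, here is a minimal sketch of distinct-n, a common diversity statistic from the dialogue-generation literature. It is not one of the three measures proposed in our paper; the function name `distinct_n` and the example sentences are made up for illustration, and tokenization is a naive whitespace split.

```python
from collections import Counter


def distinct_n(comments, n=1):
    """Ratio of unique n-grams to total n-grams over a list of comments.

    Illustrative stand-in only: this is the generic distinct-n statistic,
    not one of the diversity measures defined in the NSC paper.
    """
    ngrams = Counter()
    for comment in comments:
        tokens = comment.lower().split()  # naive whitespace tokenization
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


# Hypothetical examples: caption-like "vanilla" text vs. netizen-like comments.
vanilla = ["a woman in a red dress", "a woman in a blue dress"]
netizen = ["love the dress", "so cute where did you get it"]

print(distinct_n(vanilla))  # 6 unique unigrams out of 12
print(distinct_n(netizen))  # 10 unique unigrams out of 10
```

A higher ratio means less repetition across the generated comments; template-like captions tend to score low because they reuse the same words.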

To the best of our knowledge, NetiLook is the first and the largest netizen-style commenting dataset. It contains 355,205 images from 11,034 users and 5 million associated comments collected from Lookbook, a fashion social media site. Most of the images are fashion photos taken from various viewing angles, with distinct filters and different collage styles. Each image is paired with (diverse) user comments, and the average number of comments is 14 per image in our dataset. In addition, each post has a title given by its author, a publishing date, and the number of hearts given by other users. Some users also add the names, brands, and Pantone colors of the clothes, and the stores where they bought them. Furthermore, we collect the authors’ public information, which for some authors includes age, gender, country, and the number of fans. In this paper, we use only the comments and the photos from our dataset. Other attributes can be used to refine the system in future work.
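As a rough illustration of the per-post attributes described above, one post could be modeled as follows. This is only a sketch: the class name `Post`, every field name, and the helper `mean_comments_per_image` are hypothetical and do not reflect the actual file layout of the released dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Post:
    """One fashion post; all field names are assumptions for illustration."""
    image_path: str
    title: str                       # title given by the author
    published: str                   # publishing date, kept as a plain string
    hearts: int                      # hearts given by other users
    comments: List[str] = field(default_factory=list)
    brands: List[str] = field(default_factory=list)  # optional, user-supplied
    author_country: Optional[str] = None             # only some authors share it


def mean_comments_per_image(posts):
    """Average number of comments per post (0.0 for an empty list)."""
    if not posts:
        return 0.0
    return sum(len(p.comments) for p in posts) / len(posts)


# Sanity check on the reported totals: about 5 million comments over
# 355,205 images is roughly 14 comments per image.
reported_average = 5_000_000 / 355_205
print(round(reported_average, 2))  # about 14.08
```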

We thank Microsoft Research Asia and the MOST AI Initiatives Projects (科技部, Ministry of Science and Technology) for their continuing support, and Dr. Ruihua Song for the fruitful discussions. We also benefit from grants from NVIDIA and from the NVIDIA DGX-1 AI Supercomputer.

  • Wen Hua Lin, Kuan-Ting Chen, Hung Yueh Chiang, Winston H. Hsu. Netizen-Style Commenting on Fashion Photos – Dataset and Diversity Measures. WWW 2018, Cognitive Computing Track. (arXiv available).
  • NetiLook Dataset: https://mashyu.github.io/NSC/

  Posts

June 7th, 2018

First Place (#1) in Disguised Face Recognition in CVPR 2018

January 6th, 2019

An Unexpected First Place in the Ministry of Science and Technology's "AI Investment Potential Award"

December 16th, 2018

Keep Recruiting for Machine Learning Research Partners for Numerous Visual Sensors

December 16th, 2018

FutureTech Demo and Breakthrough Award (未來科技突破獎)

October 12th, 2018

A Shoe Try-On Generative Network That Blends the Virtual and the Real (Virtual Try-On Shoe with Generative Neural Networks)

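We have all had this problem: while shopping online we see a good-looking pair of shoes and want to buy them, but cannot decide whether they would look good on us, or whether they would go with a particular pair of pants. How can shoes from an online store be tried on our own feet in a convincing way? The challenge in this work is to take a single product photo of a shoe and composite it naturally onto the user's foot, where the foot may appear in all kinds of poses and angles. How do we overcome this problem? We are delighted that our undergraduate project student (EE) 周晁德 completed PIVTONS, a generative network for virtual shoe try-on, in an attempt to solve this problem that has troubled everyone for so long. This work will also be shared as an oral presentation at the Asian Conference on Computer Vision (ACCV) 2018, held in Perth, Australia, in early December. A brand-new test dataset will then be released for public use and, if possible, we will also try to put the whole try-on generation system online so that everyone can experience the fun of virtual shoe try-on: try on more, save more. We encourage high-risk research. Happily, the idea for this work and the data collection were both carried out independently by the project student. Of course, there were many problems along the way with GAN generation, network design, and training, and the work was nearly abandoned; fortunately, the team members worked out solutions together and got past one hurdle after another (even serving as models for free), so that this system, which has both technical depth and commercial value, could be completed. We also keep working to give intelligent generation (or recognition) systems more meaningful applications […]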

September 13th, 2018

Finalist (Top 3) in 2018 IEEE Signal Processing Society Video and Image Processing (VIP) Cup

July 29th, 2018

3D Model Search at Your Fingertips (Cross-View and Cross-Domain 3D Model Search)

July 27th, 2018

Low-Resolution Face Recognition and Resolution Upscaling (Very Low-Resolution Face Hallucination and Recognition)

June 18th, 2018

Winning Third Place in CVPR 2018 Video Recognition Challenge — Moments in Time

June 13th, 2018

[Video Report] National Investment for the GPU Supercomputer?

June 9th, 2018

Amazing Crowd Size and Positive Feedback in the Deep Learning Lecture for GTC 2018 Taipei