Me-Link: Link Me To The Media – Fusing Audio And Visual Cues for Robust and Efficient Mobile Media Interaction

Top Conference paper

Chun-Yen Yeh, Yu-Ming Hsu, Hsinfu Huang, Hong-Wun Jheng, Yu-Chuan Su, Tzu-Hsuan Chiu, Winston H. Hsu

WWW, Pages 147-150, April 2014

Publication year: 2014

In this demo, we present “Me-link,” a scalable mobile video recognition system based on progressive fusion of lightweight audio-visual features. Users only have to point the mobile camera at the video they are interested in. The system captures frames and sound, then immediately retrieves relevant information. The longer the user keeps the device pointed at the video, the more cues the system aggregates over time and the more accurate the returned results become. We also account for noisy real-world environments, where users may not get clear visual or audio signals: when aggregating audio and visual cues, the system automatically detects which channels are available and uses them for the final ranking. On the server side, users can upload videos with their associated information through a website. We also link streaming signals, so that users can access real-time broadcasts through “Me-link”.
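
The abstract does not give implementation details, so the following is only a minimal sketch of what progressive, channel-aware fusion could look like, not the authors' method. The class name `ProgressiveFusion`, the `min_confidence` threshold, the score format (`{video_id: score}`), and the rule of averaging each channel and summing the averages are all assumptions made for illustration.

```python
from collections import defaultdict


class ProgressiveFusion:
    """Hypothetical late-fusion accumulator for audio and visual match scores."""

    def __init__(self, min_confidence=0.2):
        # Assumed threshold: a time step whose best score falls below it
        # is treated as too noisy to use.
        self.min_confidence = min_confidence
        self.totals = {"audio": defaultdict(float), "visual": defaultdict(float)}
        self.reliable_steps = {"audio": 0, "visual": 0}

    def update(self, channel, scores):
        """Add one time step of scores ({video_id: score}) for a channel."""
        if not scores or max(scores.values()) < self.min_confidence:
            return  # channel considered unavailable at this step
        for video_id, score in scores.items():
            self.totals[channel][video_id] += score
        self.reliable_steps[channel] += 1

    def rank(self):
        """Return video ids ordered by fused score, best first."""
        fused = defaultdict(float)
        for channel, totals in self.totals.items():
            steps = self.reliable_steps[channel]
            if steps == 0:
                continue  # skip a channel that never produced a usable signal
            for video_id, total in totals.items():
                fused[video_id] += total / steps
        # Ties are broken by video id so the ordering is deterministic.
        return sorted(fused, key=lambda v: (-fused[v], v))


fusion = ProgressiveFusion()
fusion.update("visual", {"a": 0.6, "b": 0.5})
fusion.update("audio", {"a": 0.05, "b": 0.1})  # below threshold, ignored
first_rank = fusion.rank()
fusion.update("visual", {"b": 0.9, "a": 0.3})
fusion.update("audio", {"b": 0.8})
later_rank = fusion.rank()
```

In this sketch the ranking can change as more time steps arrive, and a channel contributes only once it has produced at least one step that passes the threshold, which mirrors the behavior described in the abstract at a very coarse level.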

Winston H. Hsu

National Taiwan University
