The prevalence of capture devices (e.g., mobile phones, egocentric devices, drones, and vehicles) and the advent of media-sharing services have drastically increased the volume of image and video collections. This growth creates a strong need for effective recognition, mining, forecasting, and search. Our work is devoted to machine intelligence over large-scale multimodal data streams (e.g., images, videos, and text).
Recently, we have focused on deep convolutional neural network (CNN) methods for large-scale image/video recognition, retrieval, and mining, in projects sponsored by leading industry partners (Microsoft Research, IBM Watson, NVIDIA, MediaTek, HTC, etc.). In particular, we aim to develop effective CNN methods for image/video applications such as image search, video event detection, face recognition, facial/clothing attributes, and super-resolution.
Beyond these exciting applications, which call for intensive research in machine intelligence, we have identified several core challenges and address each of them in turn:
- Semantic gap – bridging low-level visual features and high-level semantic needs by proposing semantic ontologies and learning semantic representations automatically;
- User gap – helping users issue proper queries that satisfy their intentions across different application scenarios and mobile devices (e.g., through sketch-, attribute-, snapshot-, speech-, and touch-based queries);
- Volume gap – learning from ultra-large-scale photo and video collections through distributed learning and efficient high-dimensional indexing (e.g., hash-based methods) for real-time query response over big photo/video data, and balancing the respective technical strengths of mobile devices and cloud servers;
- Privacy – conducting privacy-preserving mining over large-scale photos and videos and addressing the privacy concerns that arise when sensitive photos and videos (e.g., family albums) are shared in public clouds;
- Industry needs – beyond rigorous algorithms for academic research, we also investigate practical methods that meet the needs of industrial development (e.g., technology transfer).