We present a scalable mobile video question-answering system based on locally aggregated descriptors and random projection, built on user-generated videos from around the world, for the ACM Multimedia 2011 Technicolor challenge: “precise event recognition and description from video excerpts.” The proposed system takes a video excerpt as a query and explores its canonical semantics. To evaluate the system, we collect a public-events video dataset of 1,963 YouTube videos covering 7 topics. Experimental results show that the proposed video feature representation outperforms a state-of-the-art near-duplicate retrieval method based on color histograms. The signature generated by random projection not only ensures real-time efficiency but also achieves a mean average precision (MAP) competitive with that of the original feature.
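
As an illustration of the general idea behind a random-projection signature, the following minimal Python sketch maps a high-dimensional feature vector to a much shorter one and compares videos in the reduced space. It is not the system's implementation: the Gaussian projection matrix, the dimensions (512 and 64), the synthetic feature vectors, and the helper names `make_projection`, `project`, and `euclidean` are all assumptions made for this example.

```python
import math
import random


def make_projection(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim Gaussian random matrix (hypothetical helper)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)  # keeps expected squared norms unchanged
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, feature):
    """Map a feature vector to its lower-dimensional signature."""
    return [sum(w * x for w, x in zip(row, feature)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic stand-ins for video features (assumed, not real data).
rng = random.Random(1)
in_dim, out_dim = 512, 64
video_a = [rng.random() for _ in range(in_dim)]
video_b = [x + rng.gauss(0.0, 0.01) for x in video_a]  # near-duplicate of a
video_c = [rng.random() for _ in range(in_dim)]         # unrelated video

P = make_projection(in_dim, out_dim)
sig_a, sig_b, sig_c = (project(P, v) for v in (video_a, video_b, video_c))

# The near-duplicate should stay closer to the query than the unrelated video.
print(euclidean(sig_a, sig_b) < euclidean(sig_a, sig_c))
```

Because the same matrix is applied to every video, signatures can be computed once and stored; a query then costs one projection and a set of distance computations over 64 values instead of 512, which is the kind of saving that makes the reduced representation attractive for real-time retrieval.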