“What is this?” and “Where am I?” are two questions that people commonly ask when traveling abroad. Landmark image identification has recently shown great promise for answering them, and most previous approaches are either visual-based or location-based. In city-view image location identification, however, many buildings may stand in close proximity, and photos are often taken indoors. Both conditions can degrade the performance of previous approaches. To remedy these deficiencies, this paper unifies visual features, geo-tags, and check-in data drawn from cross-domain social media for city-view image location identification. In addition, it presents an effective and memory-efficient implementation based on sparse coding, including a new dictionary selection approach. It further proposes a location-aware, graph-based regrouping approach that applies spanning graph construction to clusters of photos to refine the clustering results. Experimental results show improvement over the baselines (location-based, visual-based, etc.).
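The abstract does not spell out how the spanning-graph regrouping works, so the following is only an illustrative sketch of one plausible reading, not the paper's algorithm. It assumes each photo carries a (latitude, longitude) geo-tag, builds a minimum spanning forest over the photos with Kruskal's algorithm, and refuses to join photos whose distance exceeds a threshold. The function names `haversine_km` and `regroup_by_spanning_tree`, the parameter `max_edge_km`, and the sample coordinates are all hypothetical.

```python
import math
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def regroup_by_spanning_tree(points, max_edge_km):
    """Group geo-tagged photos (hypothetical helper, not from the paper).

    Runs Kruskal's algorithm over all pairwise distances but stops at the
    first edge longer than max_edge_km, so each resulting tree is one group.
    Returns one integer group label per input point.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All candidate edges, shortest first.
    edges = sorted(
        (haversine_km(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    for dist, i, j in edges:
        if dist > max_edge_km:
            break  # every remaining edge is too long to join groups
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # edge joins two trees of the forest

    # Relabel tree roots as consecutive group ids in order of appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Assumed sample data: two photos a few metres apart, one several km away.
photos = [(25.0340, 121.5645), (25.0341, 121.5646), (25.0478, 121.5170)]
groups = regroup_by_spanning_tree(photos, max_edge_km=0.5)
```

In this sketch the threshold plays the role of the location-aware constraint: photos that a purely visual clustering would place together are kept apart when their geo-tags are far from each other. How the paper actually combines visual similarity, check-in data, and the spanning graph is not stated in the abstract.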