Existing counting methods often adopt regression-based approaches and thus cannot precisely localize the target objects, which hinders further applications and analysis (e.g., high-level understanding and fine-grained classification). In addition, most prior work focuses on counting objects in static environments with fixed cameras. Motivated by the advent of unmanned flying vehicles (i.e., drones), we are interested in detecting and counting objects in such dynamic environments. We propose Layout Proposal Networks (LPNs) and spatial kernels to simultaneously count and localize target objects (e.g., cars) in drone videos. Unlike conventional region proposal methods, we leverage spatial layout information (e.g., cars are often parked in regular patterns) and introduce spatially regularized constraints into our network to improve localization accuracy. To evaluate our counting method, we present a new large-scale car parking lot dataset (CARPK) that contains nearly 90,000 cars captured from different parking lots. To the best of our knowledge, it is the first and the largest drone-view dataset that supports object counting and provides bounding box annotations.