In the era of deep learning, training data play an essential role. Yet their generation is expensive in terms of both cost and time. In addition, the data often suffer from (1) insufficient quantity, (2) a fixed definition of categories, possibly leading to data imbalance, and (3) difficult quality control of manual annotations. In this paper, we propose an approach for the automatic generation of urban building data for detection and classification from remote sensing imagery that attempts to address these issues. Datasets in popular formats, which can be used directly for training, are created from the raw data in a fully automatic pipeline. Furthermore, the generation process can be customized to adapt the datasets to specific tasks and optimized according to the data characteristics of the source. We show that an automatically generated dataset can reach a level comparable to current open datasets in both the number of instances and the diversity of categories. All this is achieved in a few hours without manual intervention or annotation costs. Experiments with multiple datasets, including a comparison with manually annotated data, demonstrate the potential of the proposed approach.