Effectiveness of Semi-Supervised Learning and Multi-Source Data in Detailed Urban Landuse Mapping with a Few Labeled Samples

Bo Sun, Yang Zhang, Qiming Zhou*, Xinchang Zhang

*Corresponding author for this work

    Research output: Contribution to journal › Journal article › peer-review

    11 Citations (Scopus)

    Abstract

    Detailed urban landuse information plays a fundamental role in smart city management. A sufficient sample size has been identified as a crucial prerequisite for machine learning algorithms in urban landuse classification. However, it is often difficult to recognize and label landuse categories from remote sensing images alone, while field investigation is time-consuming and demands considerable human resources and funding. Previous studies on urban landuse classification have therefore often relied on a small number of labeled samples with a very uneven spatial distribution. This study explores the effectiveness of a semi-supervised classification framework with multi-source data for detailed urban landuse classification with a few labeled samples. A disagreement-based semi-supervised learning approach, the co-forest, was employed and compared with traditional supervised methods (e.g., random forest and XGBoost). The multi-source geospatial data included optical and nighttime light remote sensing data and geospatial big data, which represent the physical and socio-economic features of landuse categories. Taking urban landuse classification in Shenzhen City as a case, the results show that the classification accuracy of the semi-supervised method is generally on par with that of traditional supervised methods, and that fewer labeled samples are needed to achieve a comparable result under different training set ratios. Given a small sample size, the accuracy tends to stabilize once the training samples make up no less than 5% of the total. The results also indicate that the classification accuracy obtained with multi-source data is significantly higher than that obtained with any single data source. Among these data, map POI and high-resolution optical remote sensing data contribute most to the classification, followed by mobile data and nighttime light remote sensing data.
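
    The disagreement-based idea behind the co-forest can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the 0.75 agreement threshold, and the example votes are all hypothetical, and the error estimates and sample weighting that the full algorithm relies on are omitted. The sketch shows a single step only: for one tree, the remaining trees (its concomitant ensemble) vote on each unlabeled sample, and only samples with sufficiently high agreement are pseudo-labeled for that tree's next training round.

```python
from collections import Counter


def concomitant_pseudo_labels(votes, tree_index, threshold=0.75):
    """Pick pseudo-labels for one tree from the votes of all other trees.

    votes[t][s] is the class predicted by tree t for unlabeled sample s.
    Returns {sample_index: label} for every sample whose agreement among
    the other trees reaches `threshold` (an assumed value, not from the paper).
    """
    # The concomitant ensemble: every tree except the one being refined.
    others = [v for t, v in enumerate(votes) if t != tree_index]
    if not others:
        return {}
    selected = {}
    for s in range(len(others[0])):
        counts = Counter(tree_votes[s] for tree_votes in others)
        label, n_agree = counts.most_common(1)[0]
        # Confidence is the fraction of other trees that agree on the label.
        if n_agree / len(others) >= threshold:
            selected[s] = label
    return selected


# Hypothetical votes from 5 trees on 3 unlabeled parcels.
votes = [
    ["residential", "commercial", "industrial"],
    ["residential", "commercial", "residential"],
    ["residential", "industrial", "industrial"],
    ["residential", "commercial", "commercial"],
    ["commercial", "commercial", "industrial"],
]
picked = concomitant_pseudo_labels(votes, tree_index=4, threshold=0.75)
```

    In this example the first two parcels are pseudo-labeled for the fifth tree, while the third is left unlabeled because only half of the other trees agree on its class.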

    Original language: English
    Article number: 648
    Journal: Remote Sensing
    Volume: 14
    Issue number: 3
    Publication status: Published - Feb 2022

    Scopus Subject Areas

    • Earth and Planetary Sciences (all)

    User-Defined Keywords

    • Multi-source geospatial data
    • Sampling strategy
    • Semi-supervised classification
    • Small sample learning
    • Urban landuse
