Comparison of Machine Learning Techniques in Inferring Phytoplankton Size Classes

Shuibo Hu, Huizeng Liu*, Wenjing Zhao, Tiezhu Shi, Zhongwen Hu, Qingquan Li, Guofeng Wu

*Corresponding author for this work

    Research output: Contribution to journal › Journal article › peer-review

    54 Citations (Scopus)

    Abstract

    Phytoplankton cell size not only influences physiology, metabolic rates and the marine food web, but also serves as an indicator of phytoplankton functional roles in ecological and biogeochemical processes. Several algorithms have therefore been developed to infer the synoptic distribution of phytoplankton cell size, denoted as phytoplankton size classes (PSCs), in surface ocean waters from remotely sensed variables. Using the NASA bio-Optical Marine Algorithm Data set (NOMAD) high performance liquid chromatography (HPLC) database and satellite match-ups, this study compared the effectiveness of modeling techniques, including partial least squares (PLS), artificial neural networks (ANN), support vector machine (SVM) and random forests (RF), and of feature selection techniques, including genetic algorithm (GA), successive projection algorithm (SPA) and recursive feature elimination based on support vector machine (SVM-RFE), for inferring PSCs from remote sensing data. Results showed that: (1) SVM-RFE selected sensitive features more effectively than GA and SPA; (2) RF performed better than PLS, ANN and SVM in calibrating PSC retrieval models; (3) machine learning techniques performed better than the chlorophyll-a based three-component method; (4) sea surface temperature, wind stress, and spectral curvature derived from the remote sensing reflectance at 490, 510, and 555 nm were among the features most sensitive to PSCs; and (5) the combination of SVM-RFE feature selection and random forest regression is recommended for inferring PSCs. This study demonstrated the effectiveness of machine learning techniques in selecting sensitive features and calibrating models for PSC estimation with remote sensing.
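
The recommended combination (SVM-RFE feature selection followed by random forest regression) can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data as a stand-in for the match-up table; it is not NOMAD data and not the authors' code. The feature count, the number of retained features, and all hyperparameters are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for candidate predictors (e.g. reflectance-derived
# features, sea surface temperature, wind stress). NOT real match-up data.
n_samples, n_features = 300, 8
X = rng.normal(size=(n_samples, n_features))

# Hypothetical target: depends on columns 0, 2 and 5 only, plus noise.
y = (0.5 * X[:, 0] - 0.3 * X[:, 2] + 0.2 * X[:, 5]
     + rng.normal(scale=0.05, size=n_samples))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: SVM-RFE. A linear-kernel SVR exposes coef_, which RFE uses to
# drop the weakest feature at each iteration until 3 remain (assumed count).
scaler = StandardScaler().fit(X_train)
selector = RFE(SVR(kernel="linear"), n_features_to_select=3, step=1)
selector.fit(scaler.transform(X_train), y_train)
selected = np.flatnonzero(selector.support_)

# Step 2: random forest regression on the retained features only.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train[:, selected], y_train)
predictions = forest.predict(X_test[:, selected])

print("selected feature indices:", selected.tolist())
print("held-out R^2:", forest.score(X_test[:, selected], y_test))
```

Scaling is applied only for the SVM step, since support vector models are sensitive to feature magnitudes while random forests are not. With real data, the number of retained features would normally be chosen by cross-validation rather than fixed in advance.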

    Original language: English
    Article number: 191
    Number of pages: 18
    Journal: Remote Sensing
    Volume: 10
    Issue number: 3
    Publication status: Published - Mar 2018

    User-Defined Keywords

    • Feature selection
    • Machine learning
    • Phytoplankton size classes
    • Random forest
    • Remote sensing
