Unsupervised learning on scientific ocean drilling datasets from the South China Sea

Kevin C. Tse*, Hon Chim CHIU, Man Yin Tsang, Yiliang Li, Edmund Y. Lam

*Corresponding author for this work

    Research output: Contribution to journal › Journal article › peer-review

    6 Citations (Scopus)

    Abstract

    Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean-floor sediment cores recovered by scientific ocean drilling in the South China Sea. Other studies on similar datasets have used supervised learning methods, which are designed to make predictions based on sample training data; by contrast, unsupervised learning methods require no a priori information and focus only on the input data. In this study, popular unsupervised learning methods, including K-means, self-organizing maps, hierarchical clustering and random forest, were coupled with different distance metrics to form exploratory data clusters. The resulting data clusters were externally validated against the lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected data clusters displayed varying degrees of correspondence with the existing classifications by lithologic unit and geologic time scale. K-means and self-organizing maps corresponded better with lithologic units, while random forest corresponded best with geologic time scales. This study sets a pioneering example of how unsupervised machine learning methods can be used as an automatic processing tool for the increasingly high volume of scientific ocean drilling data.
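As a rough illustration of the workflow the abstract describes (cluster the measurements, then compare the clusters with labels assigned independently), here is a minimal sketch in plain Python. It is not the authors' pipeline. The two feature columns, the unit names "Unit I" and "Unit II", and the use of cluster purity as the external validation score are all assumptions made for this example; the abstract does not say which validation measure was used.

```python
import math
import random
from collections import Counter


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means with Euclidean distance; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def purity(cluster_labels, reference_labels):
    """Fraction of samples that fall in the majority reference class of their cluster."""
    matched = 0
    for c in set(cluster_labels):
        refs = [r for cl, r in zip(cluster_labels, reference_labels) if cl == c]
        matched += Counter(refs).most_common(1)[0][1]
    return matched / len(cluster_labels)


# Hypothetical, already-scaled measurements (two made-up features per sample).
# The two groups are deliberately well separated so the outcome is unambiguous.
unit_one = [(i * 0.1, (i % 3) * 0.1) for i in range(10)]
unit_two = [(10 + i * 0.1, 10 + (i % 3) * 0.1) for i in range(10)]
samples = unit_one + unit_two

# Reference labels assigned "by conventional methods" (hypothetical unit names).
lithologic_units = ["Unit I"] * 10 + ["Unit II"] * 10

labels = kmeans(samples, k=2)
score = purity(labels, lithologic_units)
print(f"clusters found: {len(set(labels))}, purity vs. lithologic units: {score:.2f}")
```

Purity is only one of several external validation scores. Swapping in a different distance function inside `kmeans`, or a different clustering routine altogether, leaves the validation step unchanged, which is what makes this kind of comparison across methods and metrics possible.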

    Original language: English
    Pages (from-to): 180-190
    Number of pages: 11
    Journal: Frontiers of Earth Science
    Volume: 13
    Issue number: 1
    Publication status: Published - 1 Mar 2019

    Scopus Subject Areas

    • Earth and Planetary Sciences(all)

    User-Defined Keywords

    • clustering
    • IODP
    • machine learning
    • ODP
    • unsupervised learning
