Environmental awareness in machines: a case study of automated debris removal using Generative Artificial Intelligence and Vision Language Models

Jolly P.C. Chan*, Heiton M.H. Ho, T. K. Wong, Lawrence Y.L. Ho, Jackie Cheung, Samson Tai

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Water channels play a crucial role in stormwater management, but debris that builds up in their grilles can cause flooding, endangering people, animals, property, and nearby critical infrastructure. Automated mechanical grab systems are needed for efficient debris removal, yet safety concerns have so far prevented their deployment in outdoor environments. Here we report the successful use of Generative Artificial Intelligence (GenAI) and a Vision Language Model (VLM) to endow an automated mechanical grab with “awareness”: the ability to differentiate between living and non-living objects and to decide whether to initiate or abort a grabbing action. Existing approaches such as YOLOv7 achieve a sensitivity of only 86.94% (95% CI: 83.44% to 89.93%) in detecting humans and specified animals, and they systematically miss crouching workers and animals facing away from the cameras. Grounding DINO (VLM) achieves a sensitivity of 100% (95% CI: 99.17% to 100.00%) and a specificity of 85.37% (95% CI: 77.86% to 91.09%). Combined with BLIP-2 (GenAI), the system acquires “awareness”, allowing it to detect animals beyond those specified. This opens up possibilities for applying GenAI/VLM in automation sectors where humans and machines work in close proximity, such as manufacturing, logistics, and construction, and could improve safety and efficiency in these domains.
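The sensitivity and specificity figures above are binomial proportions reported with 95% confidence intervals. The following is a minimal sketch of how such figures can be computed, assuming an exact (Clopper-Pearson) interval; the abstract does not state which interval method the authors used. The counts `tp`, `fn`, `tn`, and `fp` are hypothetical values chosen for illustration and are not taken from the paper, and the helper names are likewise illustrative.

```python
from math import comb


def binom_tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p); exact sum, suitable for small n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(f, target: float, iters: int = 100) -> float:
    """Bisection for f(p) = target on [0, 1], where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: binom_tail_ge(k, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2, which is the
    # same as P(X >= k + 1) = 1 - alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else solve_increasing(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2
    )
    return lower, upper


# Hypothetical counts for illustration only (NOT the paper's data).
tp, fn = 90, 10  # living objects: detected / missed
tn, fp = 40, 10  # non-living objects: correctly cleared / falsely flagged

sensitivity = tp / (tp + fn)  # share of living objects that were detected
specificity = tn / (tn + fp)  # share of non-living objects correctly cleared

print("sensitivity:", sensitivity, clopper_pearson(tp, tp + fn))
print("specificity:", specificity, clopper_pearson(tn, tn + fp))
```

Under this method, a detector that misses no positives out of n still has a lower bound below 100%, equal to (alpha/2)^(1/n). This is why a sensitivity of 100% can be reported with an interval that starts slightly below it.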

Original language: English
Article number: 2024005
Number of pages: 11
Journal: HKIE Transactions Hong Kong Institution of Engineers
Volume: 31
Issue number: 4
Publication status: Published - 10 Dec 2024

Scopus Subject Areas

  • General Engineering

User-Defined Keywords

  • Artificial intelligence
  • computer vision
  • debris clearance
  • flood management
  • machine learning model
  • object detection
