Towards Lightweight Learning Framework for Simultaneous Multiple Visual Recognition in Dynamic Environments

Project: Research project

Project Details


Background and Motivation: Deep neural networks have demonstrated superior performance in various visual recognition tasks, such as image classification, object detection, and semantic segmentation. Until recently, networks for visual recognition were primarily designed to solve single-task problems. In practice, however, many real-world applications involve multiple visual recognition tasks simultaneously. For instance, in an autonomous driving system, a vehicle may need to detect objects such as cars and pedestrians, measure distances, identify lane lines, and recognize traffic signs. This has inspired the development of multi-task learning (MTL), which jointly trains a single network on multiple tasks and shares some layers across these tasks, as depicted in Figure 1. Sharing allows the network to learn features that are useful for all tasks while also learning task-specific features. However, the application of MTL networks in real-life scenarios is still limited by three significant challenges. First, given a set of practical tasks, it remains unclear how to group the tasks for joint training so that overall performance improves, because certain tasks may conflict with each other and are not suitable for simultaneous learning. Furthermore, tasks in dynamic environments may change over time, so a dynamic grouping technique capable of analyzing and automatically grouping tasks is essential. Second, commonly used MTL networks are typically very large, with a huge number of parameters, and their training is time-consuming. This restricts the practical applicability of the networks, especially on devices with limited computational resources. Third, real-world environments are often dynamic, so the distribution of input data changes over time. To adapt to new data, the network must be able to update its parameters quickly to maintain performance.
In summary, achieving a lightweight MTL learning framework in a dynamic environment is a desirable but challenging task that, to the best of our knowledge, has yet to be thoroughly explored in the literature.
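The layer-sharing idea described above can be illustrated with a minimal sketch in plain Python. It is not the project's proposed framework: the class name `SharedTrunkNet`, the task names, and the layer sizes are all hypothetical, and the weights are random and untrained. It only shows one shared layer whose features are computed once and then passed to a small task-specific head per task.

```python
import random


def linear(weights, bias, x):
    """Apply a dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def init_layer(n_out, n_in, rng):
    """Create random weights and zero biases (illustrative only, untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


class SharedTrunkNet:
    """Hypothetical MTL network: one shared layer, one head per task."""

    def __init__(self, n_in, n_hidden, task_output_sizes, seed=0):
        rng = random.Random(seed)
        # Layer shared by every task.
        self.shared = init_layer(n_hidden, n_in, rng)
        # One small task-specific layer per task.
        self.heads = {name: init_layer(n_out, n_hidden, rng)
                      for name, n_out in task_output_sizes.items()}

    def forward(self, x):
        # Shared features are computed once and reused by all heads.
        features = relu(linear(*self.shared, x))
        return {name: linear(*head, features)
                for name, head in self.heads.items()}

    def parameter_count(self):
        def count(layer):
            weights, bias = layer
            return sum(len(row) for row in weights) + len(bias)
        return count(self.shared) + sum(count(h) for h in self.heads.values())


# Hypothetical tasks and sizes, chosen only for illustration.
net = SharedTrunkNet(n_in=8, n_hidden=16,
                     task_output_sizes={"traffic_sign": 4, "lane": 2})
outputs = net.forward([0.5] * 8)
```

Because the shared layer is stored only once, this toy network has fewer parameters than two separate single-task networks of the same shape would. Real MTL networks share much deeper backbones, and which layers to share is itself a design question.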

Problem Definition and Challenges: This project aims to propose a lightweight learning framework for simultaneously handling multiple visual recognition tasks in a dynamic environment. Accordingly, the following four key questions will be addressed. 1) How can we dynamically group different tasks, which may be similar, independent, or even conflicting, for effective training to enhance overall performance? 2) How can we design and construct lightweight networks with fewer parameters for each task group? 3) How can we improve computational efficiency when training multiple tasks simultaneously? 4) How can the network be updated quickly to mitigate performance degradation when new input data arrive in a data streaming environment?

Novelty of Project: The challenges of developing a lightweight learning framework stated above have yet to be addressed in the literature. Consequently, this project will focus on the following four key issues: 1) developing a dynamic task grouping technique that can adapt to changing tasks; 2) constructing lightweight networks for each task group with task-specific modules, to reduce memory and computational costs; 3) designing a computationally efficient training approach for learning multiple tasks simultaneously through objective reduction; and 4) proposing a localization detection mechanism that can pinpoint specific parameters for quick updates in a dynamic environment.

Long-Term Impact: Developing a lightweight learning framework for multiple visual recognition tasks in a dynamic environment is a crucial area of research with numerous potential applications, including autonomous driving, the Internet of Things, and mobile computing. The proposed project will provide a theoretical analysis of how to split and assign multiple tasks for training, and will develop corresponding lightweight networks and efficient training and updating algorithms. These efforts will enable the creation of more effective and efficient learning systems for multiple visual recognition tasks. The outcomes of this project will have a significant impact on the real-world applications of MTL, benefiting both academia and industry.
Status: Not started
Effective start/end date: 1/01/25 to 31/12/27

