Regularized neural networks for semantic image segmentation

  • Fan Jia

Student thesis: Doctoral Thesis


Image processing consists of a series of tasks which widely appear in many areas. It can be used for processing photos taken by people's cameras, astronomy radio, radar imaging, medical devices and tomography. Among these tasks, image segmentation is a fundamental task in a series of applications. Image segmentation is so important that it attracts hundreds of thousands of researchers from lots of fields all over the world. Given an image, the goal of image segmentation is to classify all pixels into several classes. Given an image defined over a domain, the segmentation task is to divide the domain into several different sub-domains such that pixels in each sub-domain share some common information. Variational methods showcase their performance in all kinds of image processing problems, such as image denoising, image debluring, image segmentation and so on. They can preserve structures of images well. In recent decades, it is more and more popular to reformulate an image processing problem into an energy minimization problem. The problem is then minimized by some optimization based methods. Meanwhile, convolutional neural networks (CNNs) gain outstanding achievements in a wide range of fields such as image processing, nature language processing and video recognition. CNNs are data-driven techniques which often need large datasets for training comparing to other methods like variational based methods. When handling image processing tasks with large scale datasets, CNNs are the first selections due to their superior performances. However, the class of each pixel is predicted independently in semantic segmentation tasks which are dense classification problems. Spatial regularity of the segmented objects is still a problem for these methods. Especially when given few training data, CNNs could not perform well in the details. Isolated and scattered small regions often appear in all kinds of CNN segmentation results. In this thesis, we successfully add spatial regularization to the segmented objects. In our methods, spatial regularization such as total variation (TV) can be easily integrated into CNNs and they produce smooth edges and eliminates isolated points. Spatial dependency is a very important prior for many image segmentation tasks. Generally, convolutional operations are building blocks that process one local neighborhood at a time, which means CNNs usually don't explicitly make use of the spatial prior on image segmentation tasks. Empirical evaluations of the regularized neural networks on a series of image segmentation datasets show its good performance and ability in improving the performance of many image segmentation CNNs. We also design a recurrent structure which is composed of multiple TV blocks. By applying this structure to a popular segmentation CNN, the segmentation results are further improved. This is an end-to-end framework to regularize the segmentation results. The proposed framework could give smooth edges and eliminate isolated points. Comparing to other post-processing methods, our method needs little extra computation thus is effective and efficient. Since long range dependency is also very important for semantic segmentation, we further present non-local regularized softmax activation function for semantic image segmentation tasks. We introduce graph operators into CNNs by integrating nonlocal total variation regularizer into softmax activation function. We find the non-local regularized softmax activation function by the primal-dual hybrid gradient method. Experiments show that non-local regularized softmax activation function can bring regularization effect and preserve object details at the same time

Date of Award10 Sept 2020
Original languageEnglish
SupervisorXue-Cheng TAI (Supervisor)

User-Defined Keywords

  • Image segmentation
  • Image processing
  • Convolutions (Mathematics)

Cite this