In practical scenarios such as video surveillance and personal identification, face recognition systems must often handle occluded faces, where non-face objects replace part of the facial content and leave a partial appearance and an ambiguous representation. Under these circumstances, the performance of face recognition algorithms often deteriorates. In this paper, we address the problem by removing occlusions from face images and present a new two-stage Facial Structure Guided Generative Adversarial Network (FSG-GAN). In Stage I of the FSG-GAN, a variational auto-encoder predicts the facial structure. In Stage II, the predicted facial structure and the occluded image are concatenated and fed into a generative adversarial network (GAN) based model to synthesize the de-occluded face image. In this way, facial structure knowledge is transferred to the synthesis network. In particular, so that the occluded face image is perceived well, the generator of the GAN-based synthesis network uses hybrid dilated convolution modules to enlarge the receptive field. Furthermore, to reduce appearance ambiguity and unnatural texture, we propose a multi-receptive-field discriminator that exploits features from different levels. Experiments on benchmark datasets demonstrate the efficacy of the proposed FSG-GAN.
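
How dilation extends the receptive field can be illustrated with a short calculation. The sketch below is not the FSG-GAN generator. It assumes stride-1 layers with 3x3 kernels and illustrative dilation rates of 1, 2, and 5, none of which are specified in this abstract, and `receptive_field` is a hypothetical helper introduced only for this example.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions."""
    rf = 1
    for d in dilations:
        # A kernel of size k with dilation d spans (k - 1) * d + 1 pixels,
        # so each stride-1 layer widens the field by (k - 1) * d.
        rf += (kernel_size - 1) * d
    return rf


# Three ordinary 3x3 layers (dilation 1 everywhere).
plain = receptive_field(3, [1, 1, 1])

# Three 3x3 layers with mixed dilation rates (assumed values, for illustration).
hybrid = receptive_field(3, [1, 2, 5])

print(plain, hybrid)  # 7 17
```

Under these assumptions, the same three layers cover 17 pixels per axis instead of 7 without adding parameters, which is the kind of wider context a generator needs when a large occluded region must be filled in.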