This paper addresses nonlinear feature extraction and the Small Sample Size (S3) problem in face recognition. In the sample feature space, the distribution of face images is nonlinear because of complex variations in pose, illumination and facial expression, so the performance of classical linear methods, such as Fisher discriminant analysis (FDA), degrades. To overcome the pose and illumination problems, a Shannon wavelet kernel is constructed and used for nonlinear feature extraction. Based on a modified Fisher criterion, a simultaneous diagonalization technique is exploited to deal with the S3 problem, which often occurs in FDA-based methods. A Shannon wavelet kernel based subspace Fisher discriminant (SWK-SFD) method is then developed. The proposed approach not only overcomes some drawbacks of existing FDA-based algorithms, but also has low computational complexity. Two databases, namely the FERET and CMU PIE face databases, are selected for evaluation. Compared with existing FDA-based methods, the proposed method gives superior results.
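As a rough illustration of what a Shannon wavelet kernel can look like, the sketch below builds a translation-invariant kernel as a product of one-dimensional real Shannon wavelets. The abstract does not give the kernel's exact form, so the product construction, the mother wavelet psi(t) = sinc(t/2) cos(3*pi*t/2), the dilation parameter `a`, and the function names are all assumptions made for illustration, not necessarily the construction used in SWK-SFD.

```python
import numpy as np


def shannon_wavelet(t):
    """Real Shannon mother wavelet: sinc(t/2) * cos(3*pi*t/2).

    np.sinc is the normalized sinc, sin(pi*x) / (pi*x), so psi(0) = 1.
    """
    return np.sinc(t / 2.0) * np.cos(1.5 * np.pi * t)


def shannon_wavelet_kernel(X, Y, a=1.0):
    """Assumed translation-invariant wavelet kernel (illustrative only).

    k(x, y) = prod_i psi((x_i - y_i) / a)

    X: array of shape (n, d), Y: array of shape (m, d).
    a: hypothetical dilation parameter, a > 0.
    Returns the (n, m) Gram matrix.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise scaled differences, shape (n, m, d).
    diff = (X[:, None, :] - Y[None, :, :]) / a
    # Product of the 1-D wavelet over the feature dimensions.
    return np.prod(shannon_wavelet(diff), axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=(5, 3))  # toy data, not face images
    K = shannon_wavelet_kernel(samples, samples, a=2.0)
    print(K.shape)
```

In a kernel discriminant method, a Gram matrix of this kind would stand in for inner products in the nonlinear feature space; the dilation `a` would normally be tuned on validation data.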