Recently, various graph neural network (GNN) based approaches have been proposed to learn graph feature representations. However, a long-standing issue remains: the over-smoothing problem. That is, as convolution aggregates increasingly distant neighboring nodes, the differences between node features gradually vanish. To address this issue, this paper proposes a novel determinantal point process (DPP) based sampling strategy, called Graph-DPP, to sample diverse neighboring nodes for convolution. The goal of diversified sampling is to maximize the relevance between the sampled nodes and the target node while simultaneously minimizing the similarity between any two sampled nodes. To this end, we first adapt the Hawkes process to calculate the feature similarity between any two neighboring nodes. Then, their structural similarity is calculated. Both the feature and structural similarities are used to generate the positive semi-definite similarity matrix used for the subsequent sampling. To the best of our knowledge, this is among the first attempts to integrate the determinantal point process technique with graph neural network models. To evaluate its performance, the proposed Graph-DPP strategy is combined with GCN, GAT, and GraphSAGE, respectively, and each combination is evaluated on three datasets. Experimental results show that the proposed Graph-DPP sampling strategy can achieve state-of-the-art performance.
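The diversified sampling objective described above can be illustrated with a minimal sketch. It is not the Graph-DPP implementation: cosine similarity stands in for the Hawkes-process feature similarity and the structural similarity, the relevance score is an assumed exponential of the cosine to the target node, and greedy MAP selection replaces exact DPP sampling. The function names `build_kernel` and `greedy_dpp` are hypothetical.

```python
import numpy as np


def build_kernel(neighbor_feats, target_feat):
    """Build a quality-diversity kernel L = diag(q) S diag(q).

    Assumption: cosine similarity is a stand-in for the paper's
    feature and structural similarities, which are not specified here.
    """
    # Unit-normalise rows so that dot products are cosine similarities.
    X = neighbor_feats / np.linalg.norm(neighbor_feats, axis=1, keepdims=True)
    t = target_feat / np.linalg.norm(target_feat)
    relevance = np.exp(X @ t)   # positive relevance of each neighbor to the target
    similarity = X @ X.T        # Gram matrix, hence positive semi-definite
    return relevance[:, None] * similarity * relevance[None, :]


def greedy_dpp(L, k):
    """Greedily pick up to k indices that maximise log det of the submatrix of L."""
    selected = []
    remaining = list(range(L.shape[0]))
    for _ in range(min(k, len(remaining))):
        best, best_score = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best, best_score = i, score
        if best is None:
            # Every remaining candidate makes the submatrix singular.
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because the determinant of a submatrix shrinks when two selected rows are nearly parallel, the greedy step favors neighbors that are relevant to the target node yet dissimilar to those already chosen. For example, given two nearly identical neighbors and one distinct neighbor, selecting two nodes should not return both of the near-duplicates.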