Deep learning has been applied successfully to heterogeneous network embedding. Although it preserves the structural and semantic characteristics of a network well when large-scale training data are available, modeling complex structured representations that perform effectively across diverse network tasks remains challenging. In this work, we present HNECV, a new heterogeneous network embedding method based on the cloud model and variational inference. The model uses meta-path random walks to obtain structural information from the original network, which captures rich network semantics from different views. In addition, we put forward a novel framework for building the embedding. We employ the forward cloud transformation algorithm to improve the sampling method of the variational autoencoder in its hidden space, and then construct a self-supervised learning module to guide the clustering of node vectors in that hidden space. Experimental results indicate that the proposed model achieves better performance than state-of-the-art algorithms. Furthermore, HNECV shows better robustness and stability across different network tasks when different ratios of edges are removed during training.
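
As an illustration of the forward cloud transformation named above, the following is a minimal sketch of the standard forward normal cloud generator from cloud model theory. It is not the HNECV implementation: how the method couples this generator to the variational autoencoder's hidden space is not specified here. The function name `forward_normal_cloud`, its parameters, and the small constant `eps` are assumptions made for illustration. Given an expectation `ex`, entropy `en`, and hyper-entropy `he`, each cloud drop is drawn in two stages, and a certainty degree is computed for it.

```python
import numpy as np


def forward_normal_cloud(ex, en, he, n_drops, rng=None, eps=1e-12):
    """Generate n_drops cloud drops (hypothetical helper, for illustration only).

    ex: expectation (Ex), en: entropy (En), he: hyper-entropy (He).
    Returns the drops x and their certainty degrees mu.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Stage 1: perturb the entropy, En' ~ N(En, He^2).
    en_prime = rng.normal(loc=en, scale=he, size=n_drops)
    # Stage 2: draw each drop, x ~ N(Ex, En'^2). A standard deviation
    # must be non-negative, so the absolute value of En' is used.
    x = rng.normal(loc=ex, scale=np.abs(en_prime))
    # Certainty degree of each drop: exp(-(x - Ex)^2 / (2 En'^2)).
    # eps (an assumption) guards against division by zero when En' is 0.
    mu = np.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2 + eps))
    return x, mu


# Usage: five drops around Ex = 0 with En = 1 and He = 0.1.
drops, certainty = forward_normal_cloud(ex=0.0, en=1.0, he=0.1, n_drops=5)
```

Compared with the single Gaussian draw of a plain variational autoencoder, the extra stage makes the spread itself random, which is the property the cloud model contributes; with `he = 0` the generator reduces to ordinary Gaussian sampling.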