In multimedia analysis, unsupervised visual domain adaptation aims to train a classifier that performs well on a target domain, given labeled source samples and unlabeled target samples. Aligning the features of the two domains is the key to achieving this objective. Inspired by recent work applying Generative Adversarial Networks (GANs) to domain adaptation, this paper proposes a new GAN-based model, named Hierarchical Adversarial Deep Network (HADN), which jointly optimizes feature-level and pixel-level adversarial adaptation within a hierarchical network structure. Specifically, the hierarchical structure ensures that knowledge from pixel-level adversarial adaptation is backpropagated to facilitate feature-level adaptation, yielding better feature alignment under the constraint of pixel-level adversarial adaptation. Extensive experiments on various visual recognition tasks show that the proposed method performs comparably to or better than competitive state-of-the-art methods.