As social media has grown into an integral part of many people's daily lives, brands have been quick to launch targeted social media marketing campaigns to acquire potential customers online. Today, discovering potential customers requires a costly, labor-intensive manual selection process to build a brand portfolio of multimedia data relevant to the brand. To automate this process cost-effectively, in this paper we propose a novel Multi-Modal Distance Metric Learning (M2DML) method that learns a data-dependent similarity metric from multi-modal media data, with the aim of helping brands retrieve appropriate media data from social networks for potential customer discovery. To comprehensively model the supervised information in multi-modal data, M2DML learns the intra-modality and inter-modality distance metrics simultaneously; to further exploit the unsupervised information in the dataset, it preserves the manifold structure of the multi-modal data. The resulting problem is formulated as a standard eigen-decomposition, whose closed-form solution can be computed efficiently. Experiments on a standard multi-modal media dataset and a self-collected dataset validate the effectiveness of the proposed method.
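To illustrate the kind of closed-form solution the abstract refers to, the following is a minimal sketch (not the paper's actual M2DML formulation): many distance-metric-learning objectives that pull similar pairs together subject to a dissimilar-pair constraint reduce to a generalized eigen-decomposition A w = λ B w, which is solvable in closed form. All variable names and the toy data here are illustrative assumptions.

```python
# Hedged sketch: closed-form metric learning via generalized eigen-decomposition.
# This is NOT the paper's M2DML objective, only an illustration of the general
# "formulate as eigen-decomposition, solve in closed form" recipe it mentions.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))      # 100 samples, 10-dim features (toy data)
y = rng.integers(0, 2, size=100)    # hypothetical binary relevance labels

# Scatter matrices over similar pairs (same label) and dissimilar pairs.
A = np.zeros((10, 10))              # similar-pair scatter (to be minimized)
B = np.zeros((10, 10))              # dissimilar-pair scatter (constraint)
for i in range(100):
    for j in range(i + 1, 100):
        d = (X[i] - X[j])[:, None]
        if y[i] == y[j]:
            A += d @ d.T
        else:
            B += d @ d.T

# Closed-form solution: bottom eigenvectors of the generalized symmetric
# eigenproblem A w = lambda * B w give a projection W; the learned
# Mahalanobis metric is M = W W^T (positive semi-definite by construction).
vals, vecs = eigh(A, B)
W = vecs[:, :3]                     # keep the 3 most discriminative directions
M = W @ W.T                         # learned metric matrix
```

Distances are then computed as (x - x')ᵀ M (x - x'); the eigen-decomposition makes the whole optimization a single linear-algebra call rather than an iterative solver.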