We propose a novel visual tracking algorithm based on the representations from adiscriminatively trained Convolutional Neural Network (CNN). Our algorithmpretrains a CNN using a large set of videos with tracking ground truths toobtain a generic target representation. Our network is composed of sharedlayers and multiple branches of domain-specific layers, where domainscorrespond to individual training sequences and each branch is responsible forbinary classification to identify target in each domain. We train each domainin the network iteratively to obtain generic target representations in theshared layers. When tracking a target in a new sequence, we construct a newnetwork by combining the shared layers in the pretrained CNN with a new binaryclassification layer, which is updated online. Online tracking is performed byevaluating the candidate windows randomly sampled around the previous target state.The proposed algorithm illustrates outstanding performance in existing trackingbenchmarks。