DANCE: Learning A Domain Adaptive Framework for Deep Hashing

This paper studies unsupervised domain adaptive hashing, which aims to transfer a hashing model from a label-rich source domain to a label-scarce target domain. Current state-of-the-art approaches generally resolve the problem by integrating pseudo-labeling and domain adaptation techniques into deep hashing paradigms. Nevertheless, they usually suffer from serious class imbalance in pseudo-labels and suboptimal domain alignment caused by neglect of the intrinsic structures of the two domains. To address these issues, we propose a novel method named unbiaseD duAl hashiNg Contrastive lEarning (DANCE) for domain adaptive image retrieval. The core of our DANCE is to perform contrastive learning on hash codes at both the instance level and the prototype level. To begin, DANCE utilizes label information to guide instance-level hashing contrastive learning in the source domain. To generate unbiased and reliable pseudo-labels for semantic learning in the target domain, we uniformly select samples around each label embedding in the Hamming space. A momentum-update scheme is also utilized to smooth the optimization process. Additionally, we measure the semantic prototype representations in both the source and target domains and incorporate them into a domain-aware prototype-level contrastive learning paradigm, which enhances domain alignment in the Hamming space while maximizing the model capacity. Experimental results on a number of well-known domain adaptive retrieval benchmarks validate the effectiveness of our proposed DANCE compared to a variety of competing baselines in different settings.


INTRODUCTION
Efficient approximate nearest neighbor (ANN) search techniques are in high demand in real-world web search [56,66,72]. Owing to its low computational and memory costs, hashing-based ANN search has become more and more popular among various search engines [33,49,50]. Hashing seeks to maintain the semantic similarity of high-dimensional data points while translating them into a collection of compact binary codes. Generally speaking, there are two common categories of hashing techniques, namely, supervised methods [6,8,26,40,45,62,65,68] and unsupervised methods [9,28,29,35,42,57,63,70]. Due to the integration of supervised information, supervised hashing often demonstrates higher retrieval accuracy. Thanks to the development of deep neural networks with effective representation learning, deep hashing methods combine deep feature extractors and binary code learning into a unified deep neural network to create superior hash codes, delivering state-of-the-art performance in image retrieval.
Despite their promising performance, the bulk of these methods require that the distributions of the query data and the training data be almost identical. In reality, however, query samples could have different distributions from the training samples, which these methods often struggle to manage. For example, people could take photographs with their cameras in a real-world scene and then search for associated items on a massive online search engine [18,41,60]. These photographs may exhibit a significant domain difference from the images on the platform. Towards this end, we analyze the problem of unsupervised domain adaptive hashing, which aims to optimize a hashing framework using both labeled source examples and unlabeled target examples. The model is applicable to both single-domain retrieval and cross-domain retrieval. The former supposes that both the queries and the database originate from the target domain [67], whereas the latter supposes that the queries and the database originate from the target domain and the source domain, respectively.
However, developing an effective domain adaptive hashing framework is non-trivial due to two major obstacles. Firstly, we should overcome the label scarcity in the target domain for adequate semantic learning of hash codes. Secondly, we need to generate domain-invariant binary codes, thus benefiting effective cross-domain retrieval. Recent attempts at unsupervised domain adaptive hashing typically incorporate supervised and unsupervised hashing techniques into domain adaptive frameworks [17,44,60,61,64,67]. To overcome the first obstacle, existing solutions usually first add an additional classification head, then set a threshold on confidence scores to generate pseudo-labels for target data, and finally use them to guide the optimization of hash codes in the target domain [16,60]. However, the pseudo-labeling procedure could be biased towards easy semantic information, degrading the generation of high-quality hash codes. As for the second obstacle, adversarial learning is widely used to generate domain-invariant features [31,51]. We contend that these methods neglect the intrinsic semantic structures of the two domains, hence failing to capture semantics during domain adaptation. Moreover, serious domain disparity could be exacerbated in the hash code space, resulting in ineffective image retrieval.
To fully address the above two obstacles, we propose a novel method called unbiaseD duAl hashiNg Contrastive lEarning (DANCE) for domain adaptive image retrieval, which performs hashing contrastive learning at both the instance level and the prototype level with unbiased and reliable pseudo-labels. By virtue of label information, we warm up the hashing network using instance-level hashing contrastive learning in the source domain. To generate unbiased and reliable pseudo-labels for effective target semantic learning, we adopt a teacher-student framework with balanced pseudo-labeling. On the one hand, in the momentum-updated teacher network, we set class-specific thresholds to guarantee balanced pseudo-labels inferred from the similarities between hash codes and label embeddings. On the other hand, we utilize these pseudo-labels to guide the optimization of the student network. In addition, we utilize the binary codes in a mini-batch to generate different views of the source and target semantic prototypes. Then a domain-aware prototype-level contrastive learning paradigm is developed to enhance domain alignment with consideration of the intrinsic semantic structures. These dual views of hashing contrastive learning are complementary: better semantic learning by instance-level contrastive learning helps obtain accurate prototype representations, whereas, in turn, effective domain alignment by prototype-level contrastive learning also aids pseudo-labeling on the target data. In this way, domain-invariant and discriminative binary codes can be produced in the Hamming space for effective domain adaptive retrieval. Extensive experiments on a variety of well-known datasets demonstrate significant and consistent improvements over a range of competing baselines. We also plot the precision-recall curves and the Top-N precision curves of our proposed method to demonstrate our advantages in a qualitative manner. To summarize, this work makes the following main contributions:
• We develop a novel unsupervised domain adaptive hashing method, DANCE, which adopts a simple yet effective pseudo-labeling mechanism that ensures class balance and reliability for effective semantic learning on target data.
• Our DANCE performs hashing contrastive learning at both the instance level and the prototype level, which facilitates sufficient semantic learning and domain alignment with full consideration of intrinsic semantic information.
• Extensive experiments are conducted on a variety of domain adaptive retrieval datasets to evaluate our DANCE, and the experimental results validate its efficacy.

RELATED WORK

Learning to Hash
Owing to its enormous potential to drastically reduce storage and computation load, hash code is commonly utilized in the multimedia search community, e.g., web search [2,20,69] and cross-modal retrieval [5,52,56,59]. Recently, with the development of deep learning, learning to hash has attracted growing interest for efficient multimedia search [14,54]. Nowadays, numerous deep supervised hashing methods have achieved superior retrieval performance by means of the representation capability of deep neural networks [40,45,62,65,68]. A typical portion of these methods builds a similarity structure from semantic labels to direct hash code learning using a pairwise loss or ranking-based loss [1,25,40,62]. Another line of learning to hash makes an effort to supervise the optimization of the hashing network in a point-wise manner. In early efforts, a classification layer is generally used to convert the outputs of the hashing network into assignment distributions, followed by a classification loss for discriminative hash codes. Recently, researchers usually generate anchors for different classes to directly guide semantic learning in the Hamming space [10,65]. Despite their enormous success, these methods suffer from poor performance when it comes to significant domain shifts between training data and query data in the real world. To tackle this issue, we study the emerging problem of unsupervised domain adaptive hashing, which attempts to eliminate the domain discrepancy between a label-rich source domain and an unlabeled target domain, and a novel method DANCE is proposed in this paper.

Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) seeks to transfer a model from a label-rich source domain to a label-scarce target domain [36,47]. Early on, researchers often calculated distribution metrics across the source and target domains (e.g., maximum mean discrepancy [39] and Wasserstein distance [24]) for domain discrepancy removal. Unfortunately, the poor performance of these approaches when dealing with complicated multi-modal distributions has led to a recent decline in their popularity. Another line of research along this direction focuses on adversarial learning [11,31]. These methods attempt to produce domain-invariant representations using a domain discriminator and a gradient reversal layer. Furthermore, a few researchers have extended this problem to effective domain adaptive retrieval. Early domain adaptive hashing methods usually address single-domain retrieval, which employs labeled source data to enhance performance on the target domain [16,31,51].
Typically, these methods leverage distribution metrics or adversarial learning to produce domain-invariant features, facilitating downstream binary outputs. Recent efforts also take cross-domain retrieval into account [17,44,53,60]. For example, DHLing [60] makes an effort to create a domain-invariant memory bank, which not only benefits semantic learning with pseudo-labeling but also helps achieve efficacious cross-domain alignment. In comparison to current solutions, our DANCE not only introduces an unbiased pseudo-labeling mechanism to generate reliable pseudo-labels for target semantic learning, but also incorporates semantic information into domain-aware prototype-level contrastive learning to effectively minimize the domain discrepancy.

PRELIMINARIES

Problem Formulation
Our learning algorithm has access to a labeled source domain $\mathcal{D}^s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ containing $n_s$ samples and an unlabeled target domain $\mathcal{D}^t = \{x_j^t\}_{j=1}^{n_t}$ containing $n_t$ samples. The purpose is to learn a hash function $\mathcal{H}: x \rightarrow b \in \{-1, 1\}^K$, in which $x$ denotes the input sample and $b$ denotes a compact $K$-bit binary code with semantic information preserved. We evaluate our model in two kinds of retrieval systems, i.e., single-domain retrieval and cross-domain retrieval. The former assumes that the queries and the database are both from the target domain $\mathcal{D}^t$, while the latter aims to search for examples from $\mathcal{D}^s$ with semantic information similar to the queries from $\mathcal{D}^t$.

Hashing Contrastive Learning
We briefly review the typical framework of hashing contrastive learning, which is widely used in unsupervised hashing [35,37]. In particular, these methods encourage each sample to have similar hash codes under different transformations and hash codes dissimilar from those of other samples. Given a batch of images $\{x_i\}_{i=1}^{N}$, $2N$ augmented views can be generated, i.e., $\{\tilde{x}_i^{(1)}, \tilde{x}_i^{(2)}\}_{i=1}^{N}$. Let $\cos(\tilde{b}_i, \tilde{b}_j)$ represent the cosine similarity between two binary codes; then the hashing contrastive learning objective can be written as:

$$\mathcal{L}_{con} = -\sum_{i=1}^{N} \sum_{v \in \{1, 2\}} \log \frac{\exp\left(\cos(\tilde{b}_i^{(1)}, \tilde{b}_i^{(2)})/\tau\right)}{\sum_{(j, u) \neq (i, v)} \exp\left(\cos(\tilde{b}_i^{(v)}, \tilde{b}_j^{(u)})/\tau\right)},$$

where $\tilde{b}_i^{(v)}$ denotes the binary code of the $v$-th view of $x_i$ ($v = 1$ or $2$) and $\tau$ serves as a temperature parameter.
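A minimal PyTorch sketch of this objective is given below; the function name, tensor shapes, and batching here are illustrative assumptions rather than a released implementation.

```python
import torch
import torch.nn.functional as F

def hashing_contrastive_loss(codes_v1, codes_v2, tau=0.5):
    # codes_v1, codes_v2: (N, K) relaxed hash codes of two augmented views.
    # For each anchor, the other view of the same image is the positive;
    # the remaining 2N - 2 codes in the batch act as negatives.
    codes = torch.cat([codes_v1, codes_v2], dim=0)                 # (2N, K)
    sim = F.cosine_similarity(codes.unsqueeze(1),
                              codes.unsqueeze(0), dim=2) / tau     # (2N, 2N)
    n = codes_v1.size(0)
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=codes.device)
    sim = sim.masked_fill(self_mask, float('-inf'))                # drop self-pairs
    pos_idx = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    log_prob = F.log_softmax(sim, dim=1)
    return -log_prob[torch.arange(2 * n), pos_idx].mean()
```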

METHODOLOGY
This paper proposes a novel unsupervised domain adaptive hashing method named DANCE. As shown in Figure 1, the hashing network $F(\cdot)$ is built on a convolutional neural network backbone (e.g., VGG-F [46]) whose last layer is replaced by a fully-connected layer containing $K$ units to provide binary codes, i.e., $b = \mathrm{sign}(F(x))$. The two challenges of unsupervised domain adaptive hashing are target semantic learning under label scarcity and serious domain discrepancy. We start by warming up the hashing network using instance-level hashing contrastive learning in the source domain. Then, we adopt a class-specific pseudo-labeling strategy incorporated into a teacher-student framework for reliable semantic learning on the target domain. Furthermore, we develop a domain-aware prototype-level contrastive learning paradigm to generate domain-invariant and discriminative hash codes.
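A sketch of such a hashing network is shown below. Since VGG-F is not bundled with torchvision, AlexNet is substituted as an illustrative stand-in, so the backbone choice and class name are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class HashNet(nn.Module):
    """CNN backbone whose last layer is replaced by a K-unit hash layer."""
    def __init__(self, n_bits=64):
        super().__init__()
        backbone = models.alexnet(weights=None)            # stand-in for VGG-F
        backbone.classifier[-1] = nn.Linear(4096, n_bits)  # replace last FC with K units
        self.backbone = backbone

    def forward(self, x, training=True):
        logits = self.backbone(x)
        # tanh relaxation during training; sign(.) yields binary codes at test time
        return torch.tanh(logits) if training else torch.sign(logits)
```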

Instance-level Hashing Contrastive Learning for Source Semantic Learning
We first warm up the hashing network by learning semantics from source data. To utilize label knowledge, we incorporate supervised contrastive learning into hash code learning. In particular, we sample a mini-batch of source images $\{x_i^s\}_{i=1}^{N}$, each of which produces two augmented views via random perturbation. We re-annotate these $2N$ samples as $\{\tilde{x}_i^s\}_{i=1}^{2N}$ and generate their hash codes $\{\tilde{b}_i^s\}_{i=1}^{2N}$ with the hashing network. The source instance-level hashing contrastive learning objective is then written as:

$$\mathcal{L}_{ins}^{s} = -\sum_{i=1}^{2N} \frac{1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp\left(\cos(\tilde{b}_i^s, \tilde{b}_p^s)/\tau\right)}{\sum_{j \neq i} \exp\left(\cos(\tilde{b}_i^s, \tilde{b}_j^s)/\tau\right)},$$

where $P(i)$ denotes the set of samples sharing the label of $\tilde{x}_i^s$, $\cos(\tilde{b}_i^s, \tilde{b}_j^s)$ is the cosine similarity between $\tilde{b}_i^s$ and $\tilde{b}_j^s$, and $\tau$ serves as a temperature parameter set to 0.5 as in [4].
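The label-guided loss above admits a compact PyTorch sketch; the function name and shapes are assumptions, and the rendering simplifies batching.

```python
import torch
import torch.nn.functional as F

def supervised_hashing_contrastive_loss(codes, labels, tau=0.5):
    # codes:  (2N, K) relaxed hash codes of all augmented source views.
    # labels: (2N,)  class labels; each view inherits its image's label.
    sim = F.cosine_similarity(codes.unsqueeze(1),
                              codes.unsqueeze(0), dim=2) / tau
    self_mask = torch.eye(sim.size(0), dtype=torch.bool, device=codes.device)
    sim = sim.masked_fill(self_mask, float('-inf'))
    log_prob = F.log_softmax(sim, dim=1)
    # P(i): other views sharing the anchor's label
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    per_anchor = (log_prob.masked_fill(~pos_mask, 0.0).sum(1)
                  / pos_mask.sum(1).clamp(min=1))
    return -per_anchor.mean()
```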

Unbiased Pseudo-labeling for Target Semantic Learning
Due to potential domain shifts, we need to learn semantics from target data. Existing methods usually add an extra classifier and set a threshold on the confidence scores to generate pseudo-labels [16,60], which are then used to guide the optimization of the hashing network. However, this strategy could run into two troubles. On the one hand, as shown in Figure 2, pseudo-labels could be biased towards easy classes (e.g., keyboards and laptops in Office-31 [38]), and this class imbalance could prevent the network from learning sufficient semantics for the hard classes. On the other hand, neural network classifiers are prone to produce overconfident pseudo-labels due to the class competition nature of normalization [3,22].
To tackle these issues, we adopt a teacher-student architecture, which empirically eases potential overfitting [71]. In the teacher network, we set class-specific thresholds to generate reliable and unbiased pseudo-labels for target data, which then guide the semantic structure learning in the student network.
In detail, the teacher network is warmed up via source semantic learning, while the student network with the same architecture is randomly initialized. To begin, for each category, we utilize the associated source hash codes from the teacher network to infer its centroid in the Hamming space. Formally, the semantic centroid of the $k$-th category is measured as below:

$$c_k = \frac{1}{n_k} \sum_{i:\, y_i^s = k} b_i^s,$$

where $n_k$ is the number of source samples in the $k$-th class. Then each weakly-augmented¹ target sample $x_j^t$ is fed into the teacher network to produce a binary code $b_j^t$. The similarity distribution of $x_j^t$ can be derived as:

$$q_j[k] = \frac{\exp\left(\cos(b_j^t, c_k)/\tau'\right)}{\sum_{k'=1}^{C} \exp\left(\cos(b_j^t, c_{k'})/\tau'\right)},$$
where $c_k$ serves as the label embedding for the $k$-th class and $\tau'$ is a temperature parameter. Instead of setting a single threshold for the whole target domain, we determine class-specific thresholds to ensure uniform selection across different semantics. The indexes of confident samples can be formulated as below:

$$\mathcal{S} = \bigcup_{k=1}^{C} \left\{ j \mid q_j[k] > \tau_k \right\},$$

where $\tau_k$ is the threshold for the $k$-th class. Then we get the hard pseudo-labels for samples in $\mathcal{S}$ using $\hat{y}_j = \arg\max_{k} \left( q_j[k] \right)$. Assuming that $\rho \cdot n_t$ samples are adopted for target semantic learning, $\tau_k$ is set to guarantee that $\rho \cdot n_t / C$ samples satisfy $q_j[k] > \tau_k$. After obtaining the pseudo-labels of target data, we minimize the distance between target instances with the same pseudo-labels relative to other instances, following the paradigm of instance-level contrastive learning in the source domain. Similarly, we generate two views of each target example, resulting in a mini-batch of $2N$ target samples $\{\tilde{x}_j^t\}_{j=1}^{2N}$ with hash codes $\{\tilde{b}_j^t\}_{j=1}^{2N}$. Given $P'(j)$, the set of samples sharing the pseudo-label of $\tilde{x}_j^t$ excluding $j$ itself, the instance-level hashing contrastive learning on target data is written as follows:

$$\mathcal{L}_{ins}^{t} = -\sum_{j} \frac{1}{|P'(j)|} \sum_{p \in P'(j)} \log \frac{\exp\left(\cos(\tilde{b}_j^t, \tilde{b}_p^t)/\tau\right)}{\sum_{l \neq j} \exp\left(\cos(\tilde{b}_j^t, \tilde{b}_l^t)/\tau\right)}.$$

¹ Weak augmentation refers to standard flip-and-shift augmentation strategies [48].
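The class-balanced selection described above can be sketched as follows. This is illustrative only: the helper name is an assumption, ties are resolved by a per-class top-k rather than an explicit threshold, and a sample picked by several classes would additionally be resolved by the argmax rule above.

```python
import torch
import torch.nn.functional as F

def balanced_pseudo_labels(target_codes, centroids, rho=0.2, tau=0.5):
    # target_codes: (n_t, K) teacher codes of weakly augmented target samples.
    # centroids:    (C, K)   label embeddings computed from source hash codes.
    sim = F.cosine_similarity(target_codes.unsqueeze(1),
                              centroids.unsqueeze(0), dim=2)      # (n_t, C)
    q = F.softmax(sim / tau, dim=1)                               # similarity distribution
    n_t, n_classes = q.shape
    quota = max(1, int(rho * n_t / n_classes))                    # uniform per-class quota
    selected, pseudo = [], []
    for c in range(n_classes):
        # Class-specific threshold: keep the `quota` most confident samples for class c
        top = torch.topk(q[:, c], min(quota, n_t)).indices
        selected.append(top)
        pseudo.append(torch.full((top.numel(),), c, dtype=torch.long))
    return torch.cat(selected), torch.cat(pseudo)
```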

Prototype-level Hashing Contrastive Learning for Domain Alignment
To achieve domain alignment, previous methods usually utilize adversarial learning to produce domain-invariant deep features, followed by a binarization layer. However, this strategy could run into two thorny troubles. First, since semantic information is not taken into account, it is challenging to train a domain discriminator on complicated multi-modal distributions [31]. Second, the domain discrepancy could be amplified in the Hamming space, hindering effective cross-domain retrieval. In this part, we turn to prototypical learning to tackle these issues. Generally, we utilize hash codes along with their labels (pseudo-labels) to measure the hashing prototypes in both the source and target domains. Moreover, we utilize instance-level augmentation and the mini-batch operation to provide challenging views of prototypes. Finally, we propose a novel domain-aware prototype-level contrastive learning objective, which can produce domain-invariant hash codes while maximizing the model capacity for efficient cross-domain image retrieval.
In particular, we first generate source prototypes using semantic labels. For each class, the source prototype is defined as the average of hash codes with the same semantics. Following the paradigm of contrastive learning, we aim to provide challenging views of prototypes [4,15,43]. First of all, we involve instance-level augmentation in calculating hashing prototypes before feeding samples into the hashing network. Moreover, we involve random subsets of the whole dataset in the calculation, since this strategy empirically does not change the semantics. In practice, mini-batch optimization naturally samples such subsets to generate views for prototypes. As a consequence, given a batch $\mathcal{B}^s$, an augmented view of the $k$-th source prototype can be written as below:

$$\tilde{p}_k^s = \frac{1}{|\mathcal{B}_k^s|} \sum_{\tilde{x}_i^s \in \mathcal{B}_k^s} \tilde{b}_i^s,$$

where $\mathcal{B}_k^s$ denotes the samples of the $k$-th class in $\mathcal{B}^s$ and $\tilde{b}_i^s$ comes from an augmented view of $x_i^s$ with the student network. Similarly, recall that $q_j[k]$ implies the probability that $x_j^t$ belongs to the $k$-th class. As a consequence, given a batch $\mathcal{B}^t$, a view of the $k$-th target prototype can be calculated through the pseudo-labels. In formulation, we have:

$$\tilde{p}_k^t = \frac{\sum_{\tilde{x}_j^t \in \mathcal{B}^t} q_j[k] \, \tilde{b}_j^t}{\sum_{\tilde{x}_j^t \in \mathcal{B}^t} q_j[k]},$$

where all the hash codes of target samples are averaged with different weights from the similarity distributions. Note that these prototype representations may not be binary, but hash codes with the same semantics should still be close to their corresponding prototypes in terms of the cosine distance. Then, we introduce our domain-aware prototype-level contrastive learning objective. Its principle is to bring together different views of the same prototype in contrast to the views of the other prototypes. To reduce the domain discrepancy, we first compare representations across the source and target domains for the $k$-th prototype:

$$\mathcal{L}_{s \leftrightarrow t} = -\sum_{k=1}^{C} \log \frac{\exp\left(\cos(\tilde{p}_k^s, \tilde{p}_k^t)/\tau\right)}{\sum_{k'=1}^{C} \exp\left(\cos(\tilde{p}_k^s, \tilde{p}_{k'}^t)/\tau\right)},$$

where $\tau$ is a temperature parameter set to 0.5 as in [4,34]. In addition, to fully increase the discriminability of hashing prototypes in both source and target domains, we propose to compare prototypes separately within each domain. In detail, we generate augmented samples under different operators for each source prototype, denoted as $\bar{p}_k^s$, and then compare the two views of source prototypes in the source domain. In formulation, the source contrastive learning objective is written as:

$$\mathcal{L}_{s \leftrightarrow s} = -\sum_{k=1}^{C} \log \frac{\exp\left(\cos(\tilde{p}_k^s, \bar{p}_k^s)/\tau\right)}{\sum_{k'=1}^{C} \exp\left(\cos(\tilde{p}_k^s, \bar{p}_{k'}^s)/\tau\right)}.$$

Similarly, given another view for each target prototype, denoted as $\bar{p}_k^t$, the target contrastive learning objective is formulated as follows:

$$\mathcal{L}_{t \leftrightarrow t} = -\sum_{k=1}^{C} \log \frac{\exp\left(\cos(\tilde{p}_k^t, \bar{p}_k^t)/\tau\right)}{\sum_{k'=1}^{C} \exp\left(\cos(\tilde{p}_k^t, \bar{p}_{k'}^t)/\tau\right)}.$$

In a nutshell, the whole domain-aware prototype-level contrastive learning objective contains three parts:

$$\mathcal{L}_{proto} = \mathcal{L}_{s \leftrightarrow t} + \mathcal{L}_{s \leftrightarrow s} + \mathcal{L}_{t \leftrightarrow t}.$$

Our prototype-level contrastive learning contributes to hash code learning in the following aspects. First, it compares the hashing prototypes across the two domains, which overcomes the domain disparity with consideration of semantics. Second, since the denominator encourages enlarging the distance between hashing prototypes of different semantics, it favors a uniform distribution of prototypes in the Hamming space following [55], maximizing the model capacity of each bit. Third, since contrastive learning can produce preeminent representations for downstream tasks such as clustering [27], our method helps provide discriminative hash codes for effective image retrieval.
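The prototype construction and the three contrastive terms above can be sketched as follows; the function names and the one-hot/soft weighting interface are assumptions.

```python
import torch
import torch.nn.functional as F

def prototype_views(codes, weights):
    # codes:   (N, K) relaxed hash codes of one augmented view of a batch.
    # weights: (N, C) one-hot labels (source) or similarity distributions q (target).
    w = weights.t()                                            # (C, N)
    return (w @ codes) / w.sum(1, keepdim=True).clamp(min=1e-8)

def proto_contrastive(view_a, view_b, tau=0.5):
    # Pull the k-th prototype in view A towards the k-th prototype in view B,
    # against all other prototypes in view B (InfoNCE over C classes).
    sim = F.cosine_similarity(view_a.unsqueeze(1),
                              view_b.unsqueeze(0), dim=2) / tau  # (C, C)
    targets = torch.arange(view_a.size(0), device=view_a.device)
    return F.cross_entropy(sim, targets)

# Domain-aware objective: cross-domain plus the two in-domain terms, e.g.
# loss_proto = (proto_contrastive(p_src_v1, p_tgt_v1)     # L_{s<->t}
#               + proto_contrastive(p_src_v1, p_src_v2)   # L_{s<->s}
#               + proto_contrastive(p_tgt_v1, p_tgt_v2))  # L_{t<->t}
```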
Finally, our DANCE is optimized jointly by instance-level and domain-aware prototype-level contrastive learning. The overall loss objective is formulated as follows:

$$\mathcal{L} = \mathcal{L}_{ins} + \lambda \mathcal{L}_{proto},$$

where $\mathcal{L}_{ins}$ sums the instance-level objectives on source and target data and $\lambda$ is a trade-off parameter. However, $\mathrm{sign}(\cdot)$ is not differentiable at the zero point and its derivative is zero at any other point, hindering efficient optimization using the back-propagation algorithm. To tackle this issue, we adopt $\tanh(\cdot)$ to approximate $\mathrm{sign}(\cdot)$, producing the approximate hash code $b = \tanh(F(x))$ instead of $b = \mathrm{sign}(F(x))$ during training. Our student network $F(\cdot\,; \theta)$ is updated using the standard mini-batch stochastic gradient descent (SGD) algorithm, while the momentum update is adopted in the teacher network $F(\cdot\,; \theta')$. In formulation,

$$\theta \leftarrow \theta - \eta \nabla_{\theta} \mathcal{L}, \qquad \theta' \leftarrow m \theta' + (1 - m) \theta,$$

where $\eta$ represents the learning rate and $m$ represents a momentum coefficient set to 0.99 as in [15]. During optimization, the subset $\mathcal{S}$ containing reliable pseudo-labels is updated every $T$ iterations. The whole algorithm is summarized in Algorithm 1.

Algorithm 1: Optimization of DANCE
repeat
  Construct the subset $\mathcal{S}$ with pseudo-labels using Eq. 5;
  for $T$ iterations do
    Sample mini-batches from $\mathcal{D}^s$ and $\mathcal{D}^t$;
    Generate two views for each source prototype using Eq. 7;
    Generate two views for each target prototype using Eq. 8;
    Calculate $\mathcal{L}$ using Eq. 13;
    Update $\theta$ and $\theta'$ using Eq. 14;
  end for
until convergence
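The momentum update admits a compact sketch, given below; this is a minimal rendering of the scheme above, and the function name is an assumption.

```python
import torch

@torch.no_grad()
def momentum_update(teacher, student, m=0.99):
    # theta' <- m * theta' + (1 - m) * theta, applied parameter-wise.
    # The student itself is updated separately by SGD on the tanh-relaxed
    # total loss L = L_ins + lambda * L_proto.
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(m).add_(p_s, alpha=1.0 - m)
```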

EXPERIMENTS

Experimental Settings
Datasets. We evaluate our proposed DANCE on three public datasets, with details as below. Office-Home [51] collects images from four different domains. Following recent works [60], two distinct domains are chosen as the source domain and the target domain respectively, generating six transferable retrieval scenes. Office-31 [38] contains over 4 thousand images covering 31 categories. Randomly choosing two domains as the source and target results in six different transferable image retrieval tasks. Digits datasets are also investigated for domain adaptive retrieval, i.e., MNIST [23] and USPS [19], each containing handwritten digits from 0 to 9. In this case, we select one domain as the source and another as the target, generating two transferable image retrieval tasks.

Baselines. Our DANCE is compared with a series of state-of-the-art approaches, including six unsupervised hashing approaches (i.e., SH [58], ITQ [13], DSH [30], LSH [12], SGH [21], GraphBit [57]) and six transfer hashing approaches (i.e., ITQ+ [73], LapITQ+ [73], GTH-g [67], DAPH [17], PWCF [18], and DHLing [60]). Due to a lack of privileged knowledge, partial results of ITQ+ and LapITQ+ are not available according to their original paper.

Metrics. The accuracy of our method is evaluated using three common criteria: mean average precision (MAP), the precision-recall curve, and the Top-N precision curve. A higher MAP score indicates better retrieval performance. Precision-recall curves are excellent indicators of comprehensive performance, since they reveal precision across a wide range of recall levels. Last but not least, the Top-N precision curve illustrates the precision over a range of numbers of retrieved instances.

Implementation Details. All of the experiments are implemented on a single NVIDIA A100 GPU under the PyTorch environment. We optimize the proposed DANCE by mini-batch SGD with momentum. The sample size within a batch is 36, and the learning rate is set to 0.001. The two hyperparameters $\lambda$ and $\rho$ are set to 0.5 and 0.2 according to the sensitivity analysis, respectively. For each dataset, 10% of the target samples are randomly selected for testing, and the rest are used for training following [17]. It should be noted that the test set is never used for training. Data from the source domain is used as the database for cross-domain retrieval, while data from the target domain is used for single-domain retrieval. During warming up, we adopt random perturbations including random cropping, random color distortions, and random Gaussian blur for efficient contrastive learning.
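For reference, MAP under Hamming ranking can be computed as in the sketch below; this is a common evaluation recipe, and the exact cutoff and tie-breaking of the protocol used here are assumptions.

```python
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    # Codes are +/-1 matrices; labels are integer class ids.
    aps = []
    for q, ql in zip(query_codes, query_labels):
        dist = 0.5 * (db_codes.shape[1] - db_codes @ q)  # Hamming distance
        order = np.argsort(dist)                         # rank database by distance
        rel = (db_labels[order] == ql).astype(np.float64)
        if rel.sum() == 0:
            aps.append(0.0)
            continue
        hits = np.flatnonzero(rel)
        precision_at_hits = np.cumsum(rel)[hits] / (hits + 1)
        aps.append(precision_at_hits.mean())
    return float(np.mean(aps))
```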

Experimental Results
To quantitatively show the performance of our proposed DANCE, the cross-domain MAPs of different methods on the Office-Home, Office-31, and Digits datasets with hash code lengths increasing from 16 to 128 are recorded in Table 1 and Table 2. The following observations can be made: 1) Learning-based domain adaptive hashing methods (i.e., DAPH, PWCF, and DHLing) show better performance than traditional ones, which demonstrates that techniques from the domain adaptation community make a difference to both cross-domain and single-domain retrieval. 2) A significant enhancement is obtained by our proposed DANCE over the baselines, including the recent state-of-the-art methods, on cross-domain retrieval tasks. Specifically, our DANCE surpasses DHLing by about 14.93%, 7.99%, and 6.34% on average for 64-bit MAP performance in cross-domain tasks on Office-Home, Office-31, and Digits, respectively, which shows its superiority in transferable domain learning. 3) Improvement on cross-domain retrieval tasks over the previous state-of-the-art DHLing can be found in all settings by a large margin. For the single-domain setting, we measure DANCE on Office-31 in Table 5, and more results can be found in the Appendix. We observe that DANCE improves MAP scores by a large margin on average.
To visually demonstrate the advantage of our proposed DANCE, we also plot the precision-recall curves of DAPH, PWCF, DHLing, and DANCE on three cross-domain tasks in the left column of Figure 3. It can be observed that the precision of DANCE is always above the other models' precision at a given recall value, which indicates that the hash codes learned by DANCE perform better under the hash table lookup retrieval scheme. The right column of Figure 3 presents the Top-N precision curves of the compared models in the same settings. Our proposed DANCE obtains higher precision with the same number of returned samples compared with the baselines, surpassing the comparison methods by large margins.

Ablation Study
In this section, we investigate several important modules of our proposed DANCE to evaluate their effectiveness in our framework. The performance of these variants is recorded in Table 4. (1) V1 replaces the class-specific thresholds with a single fixed threshold for pseudo-labeling. (2) V2 removes the momentum update of the teacher network. (3) V3 removes the cross-domain comparison between source and target prototypes, i.e., $\mathcal{L}_{s \leftrightarrow t}$. (4) V4 removes the comparison between two views of each source prototype in the source domain, i.e., $\mathcal{L}_{s \leftrightarrow s}$. (5) V5 removes the comparison between two views of each target prototype in the target domain, i.e., $\mathcal{L}_{t \leftrightarrow t}$. (6) V6 adopts weak augmentations in both networks, and V7 adopts strong augmentations in both networks. (7) V8 is an extension of our DANCE, which adopts spectral clustering on deep features and retains the pseudo-labels with consistent results as in [34].
From the quantitative results, we can draw the following conclusions. V1 implies that a fixed threshold cannot adapt the pseudo-labeling to ensure bias removal and thus yields lower performance. The result of V2 can be attributed to the fact that smoothed parameter updating relieves hash code learning from biased pseudo-labels and helps stabilize the optimization process, which enhances the search performance. V3, V4, and V5 investigate the effects of the prototypical contrastive learning variants. The results show that the effect of intra-domain alignment is larger than that of inter-domain alignment. Besides, each component boosts the alignment of the two domains, which implies that domain alignment plays an important role in domain adaptive hash learning. From the performance of V6 and V7, we can see that weak augmentation provides accurate pseudo-labels with less variance, which is suitable for the teacher network, while strong augmentation adds more variance to training samples, relieving the overfitting of the student network. The comparison of V8 and our DANCE shows that our results can be further improved by more accurate pseudo-labels.

Parameter Sensitivity
We study the influence of $\lambda$ and $\rho$ in Figure 4, where $\lambda$ is the trade-off coefficient between prototype-level and instance-level contrastive learning and $\rho$ controls the ratio of target samples selected for pseudo-labeling. We utilize three cross-domain tasks to evaluate the parameter sensitivity comprehensively. Firstly, we vary $\lambda$ from 0.1 to 0.9 with the other parameters fixed. The performance first increases as $\lambda$ rises, then becomes stable, and finally decreases slightly. Our performance is not sensitive to $\lambda$ in the range of [0.5, 0.7], and we can set it to any value in that range as the trade-off. This can be explained by the numerical imbalance between the contrastive losses of the two different levels. Then we vary $\rho$ over the same range while fixing the other parameters. $\rho$ represents the ratio of uniformly selected clean samples. Our method performs well with $\rho$ ranging from 0.1 to 0.3. The performance decreases as more target samples are selected, which is related to the semantic complexity of the target domain. Finally, $\lambda$ and $\rho$ are set to 0.5 and 0.2 by default, respectively. The top-k returned images from our model are shown in Figure 5. In comparison with DHLing for cross-domain retrieval, our DANCE retrieves many more related and user-desired images. For example, our DANCE hits more relevant targets than the baseline for the query 'bottle': the returned instances are all relevant to the input from different domains, while the baseline only achieves 0.8 precision, which further validates our superiority. Besides, as the number of retrieved images increases, the gap between our proposed DANCE and the baseline grows larger.

CONCLUSION
We investigate the practical problem of unsupervised domain adaptive hashing and propose an effective method named DANCE. DANCE incorporates unbiased pseudo-labeling into a teacher-student architecture, which generates unbiased and reliable pseudo-labels for target semantic learning. Moreover, DANCE develops a domain-aware prototypical contrastive learning framework to generate domain-invariant and discriminative hash codes. Extensive experiments on a variety of benchmark datasets demonstrate the effectiveness of our DANCE. In future work, we will explore more pseudo-labeling and in-batch hard-negative techniques to improve our model.

Figure 1: Overview of our proposed DANCE. DANCE adopts a teacher-student framework where label information is used to guide instance-level contrastive learning on source data. Moreover, class-specific thresholds are adopted to generate unbiased pseudo-labels for instance-level semantic learning in the target domain. Furthermore, domain-aware prototype-level contrastive learning is utilized to obtain domain-invariant and discriminative hash codes.

Figure 5: Examples of the top-k returned images and Precision@k on cross-domain scenes in the Office-31 dataset.

Figure 6: The t-SNE visualizations of 128-bit hash codes on the Digits dataset.

Figure 6 presents the t-SNE analysis [1] on the Digits dataset, where the legend on the right shows the class names. It is obvious that the hash codes generated by our DANCE show more discriminative structures than those of the reproducible baselines DAPH and DHLing. The results show that the proposed DANCE generates more discriminative binary codes, making image retrieval more successful.

Table 1: Cross-domain retrieval performance comparison with the baselines on Office-Home and Office-31.

Table 2: Cross-domain retrieval performance comparison with the baselines on MNIST and USPS.

Table 3: Partial performance of single-domain retrieval on AMAZON→DSLR with varying code lengths.

Table 4: Ablation experiments on important inner modules of our method with 64-bit hash codes.

Table 5: Performance of single-domain retrieval on AMAZON→DSLR, PRODUCT→REAL, and MNIST→USPS with code lengths varying from 16 to 128.