Data-Dependent Hashing Based on p-Stable Distribution
The p-stable distribution is traditionally used for data-independent hashing. In this paper, we describe how to perform data-dependent hashing based on p-stable distribution. We commence by formulating the Euclidean distance preserving property in terms of variance estimation. Based on this property...
Saved in:
| Published in | IEEE transactions on image processing Vol. 23; no. 12; pp. 5033 - 5046 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published |
United States
IEEE
01.12.2014
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects | |
| Online Access | Get full text |
| ISSN | 1057-7149 1941-0042 1941-0042 |
| DOI | 10.1109/TIP.2014.2352458 |
Cover
| Summary: | The p-stable distribution is traditionally used for data-independent hashing. In this paper, we describe how to perform data-dependent hashing based on p-stable distribution. We commence by formulating the Euclidean distance preserving property in terms of variance estimation. Based on this property, we develop a projection method, which maps the original data to arbitrary dimensional vectors. Each projection vector is a linear combination of multiple random vectors subject to p-stable distribution, in which the weights for the linear combination are learned based on the training data. An orthogonal matrix is then learned data-dependently for minimizing the thresholding error in quantization. Combining the projection method and orthogonal matrix, we develop an unsupervised hashing scheme, which preserves the Euclidean distance. Compared with data-independent hashing methods, our method takes the data distribution into consideration and gives more accurate hashing results with compact hash codes. Different from many data-dependent hashing methods, our method accommodates multiple hash tables and is not restricted by the number of hash functions. To extend our method to a supervised scenario, we incorporate a supervised label propagation scheme into the proposed projection method. This results in a supervised hashing scheme, which preserves semantic similarity of data. Experimental results show that our methods have outperformed several state-of-the-art hashing approaches in both effectiveness and efficiency. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
| ISSN: | 1057-7149 1941-0042 1941-0042 |
| DOI: | 10.1109/TIP.2014.2352458 |