Multiview Alignment Hashing for Efficient Image Search

Abstract—Hashing is a popular and efficient method for
nearest neighbor search in large-scale data spaces, by embedding
high-dimensional feature descriptors into a similarity-preserving
Hamming space with a low dimension. For most hashing methods,
the performance of retrieval heavily depends on the choice of
the high-dimensional feature descriptor. Furthermore, a single
type of feature is rarely descriptive enough for different images
when it is used for hashing. Thus, combining multiple
representations to learn effective hashing functions is a
pressing task. In this paper, we present a novel unsupervised
Multiview Alignment Hashing (MAH) approach based on Regularized
Kernel Nonnegative Matrix Factorization (RKNMF),
which can find a compact representation uncovering the hidden
semantics and simultaneously respecting the joint probability
distribution of data. Specifically, we seek a matrix
factorization that effectively fuses the multiple information sources
while discarding feature redundancy. Since the resulting
problem is nonconvex and discrete, our objective
function is optimized in an alternating manner with relaxation
and converges to a locally optimal solution. After finding the
low-dimensional representation, the hashing functions are finally
obtained through multivariable logistic regression. The proposed
method is systematically evaluated on three datasets: Caltech-
256, CIFAR-10 and CIFAR-20, and the results show that our
method significantly outperforms the state-of-the-art multiview
hashing techniques.

INTRODUCTION
Learning discriminative embedding has been a critical
problem in many fields of information processing and
analysis, such as object recognition [1], [2], image/video retrieval
[3], [4] and visual detection [5]. Among them, scalable
retrieval of similar visual information is attractive, since with
the advances of computer technologies and the development of
the World Wide Web, a huge amount of digital data has been
generated and applied. The most basic but essential scheme for
similarity search is the nearest neighbor (NN) search: given a
query image, find the image that is most similar to it within
a large database and assign the label of that nearest
neighbor to the query image. NN search is a
linear search scheme (O(N)), which does not scale to the
large sample sizes in datasets of practical applications. Later,
to overcome this computational complexity problem,
tree-based search schemes were proposed to partition the
data space via various tree structures. Among them, KD-tree
and R-tree [6] are successfully applied to index the data for fast
query responses. However, these methods do not cope well with
high-dimensional data and do not guarantee faster search compared
to the linear scan. In fact, most vision-based tasks
suffer from the curse of dimensionality, because
visual descriptors usually have hundreds or even thousands
of dimensions. Thus, some hashing schemes are proposed to
effectively embed data from a high-dimensional feature space
into a similarity-preserving low-dimensional Hamming space
where an approximate nearest neighbor of a given query can
be found with sub-linear time complexity.
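As a point of reference for the NN scheme described above, the following is a minimal sketch of the exhaustive O(N) scan, carried out on binary codes in Hamming space. The function names, the toy codes, and the labels are illustrative assumptions and do not come from the paper. The sketch shows only the linear baseline, not a sub-linear lookup.

```python
import numpy as np

def hamming_distances(query_code, database_codes):
    """Count differing bits between one binary code and every database code."""
    # Codes are arrays of 0/1 values; the comparison marks positions that differ.
    return np.count_nonzero(database_codes != query_code, axis=1)

def nearest_neighbor_label(query_code, database_codes, labels):
    """Exhaustive O(N) scan: return the label of the closest database code."""
    distances = hamming_distances(query_code, database_codes)
    return labels[int(np.argmin(distances))]

# Hypothetical 4-bit codes for three database images and one query.
database_codes = np.array([[0, 0, 1, 1],
                           [1, 1, 0, 0],
                           [1, 0, 1, 0]], dtype=np.uint8)
labels = np.array(["cat", "dog", "car"])
query_code = np.array([1, 1, 0, 1], dtype=np.uint8)

distances = hamming_distances(query_code, database_codes)
predicted = nearest_neighbor_label(query_code, database_codes, labels)
print(distances, predicted)
```

Comparing short binary codes is cheap, but the scan still visits every database item. Hashing schemes typically avoid this by using the codes as hash-table keys.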
One of the most well-known hashing techniques that preserve
similarity information is Locality-Sensitive Hashing
(LSH) [7]. LSH simply employs random linear projections
(followed by random thresholding) to map data points close
in a Euclidean space to similar codes. Spectral Hashing
(SpH) [8] is a representative unsupervised hashing method, in
which the Laplace-Beltrami eigenfunctions of manifolds are
used to determine binary codes. Moreover, principled linear
projections like PCA Hashing (PCAH) [9] have been suggested
for better quantization than random projection hashing.
Besides, another popular hashing approach, Anchor Graphs
Hashing (AGH) [10], is proposed to learn compact binary
codes via tractable low-rank adjacency matrices. AGH allows
constant time hashing of a new data point by extrapolating
graph Laplacian eigenvectors to eigenfunctions. More relevant
hashing methods can be seen in [11], [12], [13].
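The random-projection idea behind LSH can be sketched in a few lines. The version below is the sign-random-projection variant, which thresholds every projection at zero rather than at a random offset. That simplification, the function name `make_lsh_hasher`, and the parameter choices are assumptions made for illustration, not the formulation in [7].

```python
import numpy as np

def make_lsh_hasher(dim, n_bits, seed=0):
    """Build a hash function from n_bits random linear projections."""
    rng = np.random.default_rng(seed)
    # One random Gaussian direction per output bit.
    projections = rng.standard_normal((dim, n_bits))

    def hash_fn(x):
        # A bit is 1 when the point lies on the positive side of the hyperplane.
        return (np.atleast_2d(x) @ projections > 0).astype(np.uint8)

    return hash_fn

hasher = make_lsh_hasher(dim=8, n_bits=16)
x = np.random.default_rng(1).standard_normal(8)
code = hasher(x)
print(code)
```

Points separated by a small angle fall on the same side of most random hyperplanes, so their codes differ in few bits.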
However, previous work on hashing methods has focused
mainly on single-view hashing. In these
architectures, only one type of feature descriptor is used for
learning hashing functions. In practice, to make a more comprehensive
description, objects/images are always represented
via several different kinds of features and each of them has
its own characteristics. Thus, it is desirable to incorporate
these heterogeneous feature descriptors into learning hashing
functions, leading to multi-view hashing approaches. Multiview
learning techniques [14], [15], [16], [17] have been well
explored in the past few years and widely applied to visual
information fusion [18], [19], [20]. Recently, a number of
multiview hashing methods have been proposed for efficient
similarity search, such as Multi-View Anchor Graph Hashing
(MVAGH) [21], Sequential Update for Multi-View Spectral
Hashing (SU-MVSH) [22], Multi-View Hashing (MVH-CS)
[23], Composite Hashing with Multiple Information Sources
(CHMIS) [24] and Deep Multi-view Hashing (DMVH) [25].
These methods mainly depend on spectral, graph or deep
learning techniques to achieve data structure preserving encoding.
Nevertheless, hashing purely with the above schemes
is usually sensitive to data noise and suffers from high
computational complexity.
The above drawbacks of prior work motivate us to propose a
novel unsupervised multiview hashing approach, termed Multiview
Alignment Hashing (MAH), which can effectively fuse
multiple information sources and exploit the discriminative
low-dimensional embedding via Nonnegative Matrix Factorization
(NMF) [26]. NMF is a popular method in data mining
tasks including clustering, collaborative filtering, outlier detection,
etc. Unlike other embedding methods with positive
and negative values, NMF seeks to learn a nonnegative parts-based
representation that gives better visual interpretation
of factoring matrices for high-dimensional data. Therefore,
in many cases, NMF may be more suitable for subspace
learning tasks, because it provides a non-global basis set
which intuitively contains the localized parts of objects [26].
In addition, since the flexibility of matrix factorization can
handle widely varying data distributions, NMF enables more
robust subspace learning. More importantly, NMF decomposes
an original matrix into a part-based representation that gives
better interpretation of factoring matrices for non-negative
data. When applying NMF to multiview fusion tasks, a part-based
representation can reduce the corruption between any
two views and gain more discriminative codes.
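To make the factorization concrete, here is a minimal sketch of plain NMF using the classic multiplicative update rules for the Frobenius-norm objective. It is not the regularized kernel variant (RKNMF) proposed in this paper. The symbols `U` and `V`, the iteration count, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor nonnegative X (features x samples) as U @ V with U, V >= 0."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Random nonnegative initialization; eps keeps entries strictly positive.
    U = rng.random((n_features, rank)) + eps
    V = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates only rescale entries, so signs never flip.
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V

# Toy nonnegative data with an exact rank-2 structure.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 8))
U, V = nmf(X, rank=2)
error = np.linalg.norm(X - U @ V)
print(error)
```

Each column of `V` is a low-dimensional nonnegative representation of one sample, and each column of `U` is a basis vector. Because the factors cannot cancel one another, the basis tends to capture additive parts.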
To the best of our knowledge, this is the first work using
NMF to combine multiple views for image hashing. It is
worthwhile to highlight several contributions of the proposed
method:
- MAH can find a compact representation uncovering the
  hidden semantics from different view aspects while simultaneously
  respecting the joint probability distribution of
  data.
- To solve our nonconvex objective function, a new alternating
  optimization scheme is proposed to obtain the final
  solution.
- We utilize multivariable logistic regression to generate the
  hashing function and achieve the out-of-sample extension.
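The last contribution, a hashing function obtained through logistic regression, can be illustrated with a rough sketch. It assumes that target bits for the training samples are already available (here they are hypothetical toy values) and fits one logistic regressor per bit by plain gradient descent. The function names, learning rate, and the 0.5 threshold are assumptions, not the exact procedure of the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_hash(X, target_bits, lr=0.5, n_iter=500):
    """Fit one logistic regressor per bit. X: (n, dim); target_bits: (n, n_bits)."""
    n_samples, dim = X.shape
    Xb = np.hstack([X, np.ones((n_samples, 1))])  # append a bias column
    theta = np.zeros((dim + 1, target_bits.shape[1]))
    for _ in range(n_iter):
        probs = sigmoid(Xb @ theta)
        # Gradient of the mean cross-entropy loss for all bits at once.
        theta -= lr * (Xb.T @ (probs - target_bits) / n_samples)
    return theta

def hash_codes(X, theta):
    """Out-of-sample extension: threshold the predicted probabilities at 0.5."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (sigmoid(Xb @ theta) > 0.5).astype(np.uint8)

# Hypothetical training features and 2-bit target codes.
X_train = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
bits_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
theta = fit_logistic_hash(X_train, bits_train)

# Unseen points are hashed without refitting anything.
X_new = np.array([[3.0, 1.0], [-3.0, -1.0]])
new_codes = hash_codes(X_new, theta)
print(new_codes)
```

Once `theta` is learned, hashing a new sample costs one matrix product and a threshold, which is what makes the out-of-sample extension practical at query time.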
The rest of this paper is organized as follows. In Section
2, we give a brief review of NMF. The details of our method
are described in Section 3. Section 4 reports the experimental
results. Finally, we conclude this paper in Section 5.
