Sample-based Creation of Peer Summaries for Efficient Similarity Search in Scalable Peer-to-Peer Networks
Blank, Daniel; El Allali, Soufyane; Müller, Wolfgang; et al. (2007): Sample-based Creation of Peer Summaries for Efficient Similarity Search in Scalable Peer-to-Peer Networks, in: James Z. Wang, Nozha Boujemaa, Alberto Del Bimbo, et al. (eds.), MIR ’07 : Proceedings of the international workshop on Workshop on multimedia information retrieval, New York: ACM, pp. 143–151, doi: 10.1145/1290082.1290104.
Faculty/Chair:
Author:
Title of the compilation:
MIR '07 : Proceedings of the international workshop on Workshop on multimedia information retrieval
Editors:
Wang, James Z.
Boujemaa, Nozha
Bimbo, Alberto Del
Li, Jia
Conference:
MM07: The 15th ACM International Conference on Multimedia 2007, September 24–29, 2007; Augsburg, Bavaria, Germany
Publisher Information:
New York: ACM
Year of publication:
2007
Pages:
143–151
ISBN:
978-1-59593-778-0
Language:
English
Abstract:
In this paper we introduce a simple yet experimentally convincing approach to source selection for content-based similarity search in P2P networks or, more concretely, in summary-based P2P systems. In these systems, summaries are used for data source selection when performing k-NN queries on distributed collections of documents represented by feature vectors.
We introduce a new type of cluster-based summary for source selection that can be computed and distributed efficiently and cheaply in P2P networks. To generate the summaries, a very large number of sample points is used. Each peer in the network assigns each item of its indexing data to the corresponding closest sample point and publishes the resulting summary. In experiments on real-world image feature data obtained from a large crawl of the Flickr web photo community, we evaluate the quality of these summaries as the number of sample points varies and show that higher numbers of sample points yield better retrieval performance. Our experiments show that the proposed summaries perform four times better than previous methods. Intuitively, the large size of the generated summaries is a disadvantage of this approach. We show experimentally that this disadvantage can easily be overcome with simple compression techniques, because the generated summaries are sparse.
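The summary construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the representation of a summary as a list of per-sample counts, the dictionary-based sparse encoding, and all function names and example values are assumptions made here for illustration only. The abstract states only that data is assigned to the closest sample points and that simple compression techniques exploit sparsity.

```python
import math
from collections import Counter


def nearest_sample(vector, samples):
    """Index of the sample point closest to `vector` (Euclidean distance, an assumption)."""
    return min(range(len(samples)), key=lambda i: math.dist(vector, samples[i]))


def build_summary(peer_vectors, samples):
    """Count, per sample point, how many of the peer's feature vectors are closest to it."""
    counts = Counter(nearest_sample(v, samples) for v in peer_vectors)
    return [counts.get(i, 0) for i in range(len(samples))]


def to_sparse(summary):
    """Hypothetical compression step: keep only the non-zero entries as {index: count}."""
    return {i: c for i, c in enumerate(summary) if c}


# Made-up sample points shared by all peers, and one peer's feature vectors.
samples = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
peer_vectors = [(0.1, 0.2), (0.9, 1.2), (1.1, 0.8), (0.2, -0.1)]

summary = build_summary(peer_vectors, samples)
sparse = to_sparse(summary)
print(summary)  # [2, 2, 0, 0]
print(sparse)   # {0: 2, 1: 2}
```

With many sample points and comparatively few items per peer, most entries of such a summary are zero, which is why a sparse encoding of this kind stays small even when the dense summary is large.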
Keywords:
P2P
CBIR
Source selection
Type:
Conference object
Activation date:
September 24, 2014
Permalink
https://fis.uni-bamberg.de/handle/uniba/16361