Interactive Content Based Image Retrieval using Multiuser Feedback

Retrieving images from large databases is a difficult task. Content-based image retrieval (CBIR) retrieves images based on the similarity in content (features) between the query image and the target image. However, the similarities do not vary equally in all directions of feature space. Further, CBIR efforts have relatively ignored two distinct characteristics of CBIR systems: 1) the gap between high-level concepts and low-level features; 2) the subjectivity of human perception of visual content. Hence an interactive technique called relevance feedback was introduced. Such techniques use the user's feedback about the retrieved images to reformulate the query, which retrieves more relevant images during subsequent iterations. These are called hard relevance feedback techniques because they use only two-level user annotation: the user can state only whether each retrieved image is relevant to the query image or not, which is often difficult to decide. To better capture the user's intention, a soft relevance feedback technique that uses multilevel user annotation is proposed. On its own, however, it makes use of only a single user's feedback. Hence a soft association rule mining technique is also proposed to infer image relevance from the collective feedback. Feedback from multiple users is used to retrieve more relevant images, improving the performance of the system. Here soft relevance feedback and association rule mining techniques are combined. During the first iteration, prior association rules about the given query image are retrieved to find the relevant images; during subsequent iterations, the feedback is inserted into the database and relevance feedback techniques are activated to retrieve more relevant images. The number of association rules is kept to a minimum based on redundancy detection.
Keywords: Association rules, Content-based image retrieval, Relevance inference, Soft relevance feedback.


I. INTRODUCTION
With the rapid development of internet technology, the transmission and access of image items have become easier and the volume of image repositories is exploding. Image retrieval has been a very active research area since the 1970s, with the thrust from two major research communities, database management and computer vision. These two research communities study image retrieval from different angles, one being text-based and the other visual-based.
A very popular framework of image retrieval then was to first annotate the images by text and then use text-based database management systems (DBMS) to perform image retrieval. However, there exist two major difficulties, especially when the size of image collections is large (tens or hundreds of thousands). One is the vast amount of labor required in manual image annotation. The other arises from the rich content in the images and the subjectivity of human perception. To overcome these difficulties, content-based image retrieval was proposed. That is, instead of being manually annotated by text-based key words, images would be indexed by their own visual content, such as color and texture. These systems provide various means for the users to describe their queries, such as a SQL-like query language or sample query images, to name just a few. The system responds to the query by returning a set of database images that are 'similar in content' to the query. This paper focuses on CBIR systems with a query-by-example interface; both target search and category search are addressed.
There are some fundamental problems associated with a simple content-based image retrieval scheme. First, features are unequal in their differential relevance for computing similarities between images [2]. Second, the user understands more about the query, whereas the database system can only "guess" what the user is looking for during the retrieval process. Finally, different similarity measures capture different aspects of perceptual similarity between images [3].

INTERNATIONAL JOURNAL ON INFORMATICS VISUALIZATION VOL 1 (2017) NO 4 e-ISSN : 2549-9904 ISSN : 2549-9610
Hence the interactive technique called relevance feedback is used. Relevance feedback (RF) is an interactive process which can fulfill the requirements of query reformulation, and it proceeds as follows. The user initializes a query session by submitting a sample image as the query. The system then compares the query image to each image in the database and returns, in one display, the t images that are the nearest neighbors to the query. If the user is not satisfied with the retrieval result, he/she can activate an RF process by identifying which retrieved images are relevant and which are nonrelevant. The system then updates the relevance information, such as the reformulated query vector, feature weights, and prior probabilities of relevance, to include as many user-desired images as possible in the next retrieval result. The process is repeated until the user is satisfied or the results cannot be further improved.
Most of the existing RF approaches deal with hard feedback and focus on only individual experience. In this paper we propose soft relevance feedback to capture the user's intention more effectively by providing more choices. In addition, the meta-knowledge exploited from multiple users' interactions with the system across different query sessions can improve the performance of future retrieval results. The hard relevance technique is modified slightly to obtain a soft relevance feedback technique. With the collective feedback, association rule mining can find the most relevant images with the highest confidence. In this paper, we present an image relevance association rule mining (IRARM) model with soft relevance feedback. The system uses the a priori association rules for image relevance inference and returns the most relevant images to the user. If the user is not satisfied with the current retrieved images, he/she can identify the relevance level of each retrieved image through our soft feedback interface and activate the embedded soft RF technique to improve the retrieval results.
The remainder of this paper is organized as follows. Section 2 describes the related works on relevance feedback. Section 3 presents the proposed IRARM model. Section 4 gives the experimental results. Finally, Section 5 presents the conclusions of the paper.

II. RELATED WORKS

A. Relevance Feedback Techniques
Let the submitted query image and a database image be represented by feature vectors X = (x1, x2, ..., xd) and Y = (y1, y2, ..., yd), respectively, where d is the number of selected features and xi and yi are the values of the ith feature. The similarity between X and Y is derived using the normalized Euclidean metric shown in equation (1). The top t database images that are the nearest neighbors of the query are then returned to the user. If the user is not satisfied with the retrieval result, he/she can activate an iterative RF process until satisfied. The main existing RF techniques are presented in the following subsections.

1). Query vector modification:
The query vector modification (QVM) approach [4][5] iteratively reformulates the query vector based on the user's feedback in order to move the query toward a topological region of more relevant images. Let the ith database image be the query and j be the RF iteration number, and let Xi(j) denote the current query formulation. Also let the set of relevant images identified at the jth iteration be R, and the set of identified nonrelevant images be N. For the (j + 1)th RF iteration, the method reformulates the query vector as shown in equation (2), where Yk are the images that belong to R or N, and a, b, and c are parameters controlling the relative weight of each component.
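The retrieval and reformulation steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's exact formulation: equations (1) and (2) are not reproduced in this text, so the sketch assumes that the normalized Euclidean metric divides each feature difference by a per-feature scale `sigma`, and that the QVM update takes the familiar Rocchio-style form (weighted query plus the mean of the relevant vectors minus the mean of the nonrelevant vectors). The function names and the default values of a, b, and c are hypothetical.

```python
import math

def normalized_euclidean(x, y, sigma):
    """Distance between feature vectors x and y.

    Assumption: 'normalized' means each feature difference is divided by a
    per-feature scale sigma[i] (e.g. its standard deviation in the database).
    """
    return math.sqrt(sum(((xi - yi) / s) ** 2 for xi, yi, s in zip(x, y, sigma)))

def top_t(query, database, sigma, t):
    """Indices of the t database images nearest to the query."""
    order = sorted(range(len(database)),
                   key=lambda k: normalized_euclidean(query, database[k], sigma))
    return order[:t]

def mean_vector(vectors, d):
    """Component-wise mean of d-dimensional vectors (zeros if the list is empty)."""
    if not vectors:
        return [0.0] * d
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]

def qvm_update(query, relevant, nonrelevant, a=1.0, b=0.75, c=0.25):
    """Assumed Rocchio-style update: a*X + b*mean(R) - c*mean(N)."""
    d = len(query)
    mean_r = mean_vector(relevant, d)
    mean_n = mean_vector(nonrelevant, d)
    return [a * query[i] + b * mean_r[i] - c * mean_n[i] for i in range(d)]
```

In an RF session, `top_t` would be called again with the vector returned by `qvm_update` until the user is satisfied.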

2). Feature Relevance Estimation:
The relevance of a feature is evaluated by counting how many of the newly retrieved t images are identified as relevant. That is, the relevance weight wi of feature i is proportional to |Ri|, where |Ri| denotes the number of relevant retrieved images obtained using feature i alone. The larger the relevance weight, the better the retrieval ability of the tested feature. Finally, equation (3) gives the feature relevance, which is used as a weight in the dissimilarity metric.

3). Bayesian inference:
The Bayesian inference (BI) approaches use a Bayesian framework to estimate the a posteriori probability that a database image is relevant to the query given the prior feedback [6][7].

III. THE PROPOSED APPROACH
A model named image relevance association rule mining (IRARM) is shown in Fig. 1. When a user starts a new query session, the a priori relevance association rules about this query are first retrieved. Then the model performs retrieval based on image relevance inference. If the user is not satisfied, he/she can identify the relevance level of each retrieved image through our soft annotation interface. The user's relevance feedback is processed in two ways.
(1) The adopted soft RF technique uses this feedback for query reformulation to improve the next retrieval results of the same query session. (2) The user's feedback is inserted into the set of relevance itemsets for association rule mining. The association rules derived from many users' experiences can improve the retrieved images of future sessions.

A. RF techniques with Soft Feedback
In order to better capture the users' perception, our system provides the user with four levels of relevance, namely "highly relevant", "good", "don't care", and "bad". To cope with soft feedback, each relevance level is assigned a soft value. In particular, each retrieved image I is assigned an image relevance weight r_I, with

$$r_I = \begin{cases} 0.2 & \text{if image } I \text{ is highly relevant} \\ 0.1 & \text{if image } I \text{ is good} \\ 0 & \text{if the user does not care} \\ -0.1 & \text{if image } I \text{ is bad} \end{cases}$$

Thus, the image relevance weight represents the degree to which a given image is relevant to the query image.

1). Soft QVM:
The user identifies the relevance degree of each retrieved image using our four-level annotation interface. Let R be the set of highly relevant and good retrieved images and N be the set of bad retrieved images. The soft QVM, a modified form of QVM, derives the reformulated query vector as shown in equation (4); that is, each relevant image (highly relevant or good) and each nonrelevant image (bad) is discounted by the corresponding image relevance weight. The retrieved images with the "don't care" annotation are not involved in the reformulation because their image relevance weight is 0.
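As an illustration of the soft QVM idea, the following Python sketch scales each annotated image by its image relevance weight before averaging. Several points are assumptions rather than the paper's definition: equation (4) is not reproduced in this text, so the Rocchio-style form, the normalization by the set sizes, the use of the weight's magnitude for images in N, and the negative soft value for "bad" are all illustrative choices, as are the names `RELEVANCE` and `soft_qvm_update`.

```python
# Soft value per annotation level. The negative sign for "bad" is an assumption.
RELEVANCE = {"highly relevant": 0.2, "good": 0.1, "don't care": 0.0, "bad": -0.1}

def soft_qvm_update(query, feedback, a=1.0, b=1.0, c=1.0):
    """Reformulate the query from soft feedback.

    feedback: list of (feature_vector, level) pairs for the retrieved images.
    Assumed form: a*X + b*mean(r*Y for Y in R) - c*mean(|r|*Y for Y in N).
    Images annotated "don't care" have weight 0 and are skipped.
    """
    d = len(query)
    rel = [(y, RELEVANCE[lvl]) for y, lvl in feedback if RELEVANCE[lvl] > 0]
    non = [(y, -RELEVANCE[lvl]) for y, lvl in feedback if RELEVANCE[lvl] < 0]

    def weighted_mean(pairs):
        # Mean of the weight-discounted vectors; zeros if the set is empty.
        if not pairs:
            return [0.0] * d
        return [sum(w * y[i] for y, w in pairs) / len(pairs) for i in range(d)]

    mean_r, mean_n = weighted_mean(rel), weighted_mean(non)
    return [a * query[i] + b * mean_r[i] - c * mean_n[i] for i in range(d)]
```

With this form, a "highly relevant" image pulls the query twice as strongly as a "good" one, which is the intended effect of multilevel annotation.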

2). Soft FRE:
With the user's feedback, the soft FRE examines the retrieval ability of each feature, and the new t closest images to the query are retrieved. The relevance weight of feature i is evaluated by

$$w_i = \max\left\{ \sum_{I \in \Omega_i} r_I / t,\; 0 \right\}$$

where Ωi is the set of images retrieved using feature i alone and r_I is the image relevance weight of image I. Hence, the feature is more relevant if many of the retrieved images in Ωi have been annotated as highly relevant or good by the user. Finally, the new t retrieved images after this RF iteration are determined by the dissimilarity metric of equation (5).
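The soft feature weighting can be sketched in a few lines of Python. This is a hedged illustration: the divisor is taken to be t (the number of images retrieved per feature), and because equation (5) is not reproduced in this text, the weighted Euclidean form used for the dissimilarity metric is an assumption. Both function names are hypothetical.

```python
import math

def soft_feature_weight(relevance_weights, t):
    """Weight of one feature: max(sum of r_I over its retrieved images / t, 0).

    relevance_weights: the r_I values of the t images retrieved using this
    feature alone. Dividing by t is an assumption about the normalization.
    """
    return max(sum(relevance_weights) / t, 0.0)

def weighted_dissimilarity(x, y, w):
    """Assumed weighted metric: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))
```

A feature whose retrieved images are mostly annotated "bad" receives weight 0 and therefore stops contributing to the dissimilarity.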

3). Soft BI:
Let the a priori probabilities P(Y|R) and P(Y|N) be estimated using the observed samples in R and N, respectively, identified at the current RF iteration. If we assume the relevant images form a Gaussian density, then

$$P(Y \mid R) \equiv \mathcal{N}(\mu_R, \Sigma_R) \qquad (6)$$

The most relevant images are determined using the Bayesian classifier.
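A minimal Python sketch of a Gaussian Bayesian classifier of this kind is given below. It rests on assumptions that the text does not state: the covariance is taken to be diagonal, the nonrelevant set N is also modeled as a Gaussian, the samples are not weighted by their relevance level, and the class priors default to 0.5. The function names and the `eps` regularizer are hypothetical.

```python
import math

def fit_diag_gaussian(samples, eps=1e-6):
    """Per-feature mean and variance of the samples (diagonal covariance assumed)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    # eps keeps the variance positive when all samples agree on a feature.
    var = [sum((s[i] - mean[i]) ** 2 for s in samples) / n + eps for i in range(d)]
    return mean, var

def log_density(y, mean, var):
    """Log of the diagonal Gaussian density evaluated at y."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (yi - m) ** 2 / (2 * v)
               for yi, m, v in zip(y, mean, var))

def is_relevant(y, model_r, model_n, prior_r=0.5):
    """Bayes decision: relevant if P(R) p(y|R) exceeds P(N) p(y|N)."""
    score_r = math.log(prior_r) + log_density(y, *model_r)
    score_n = math.log(1.0 - prior_r) + log_density(y, *model_n)
    return score_r > score_n
```

The two models would be refit from the sets R and N at each RF iteration, and database images ranked by `score_r - score_n` if a ranking rather than a binary decision is needed.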

B. Image Relevance Association Rule Mining (IRARM)
In this paper, we present a systematic framework which employs an association rule mining technique for finding strong association rules among the historic relevance information from users' feedback. To achieve this, several theoretical and practical issues must be addressed. (1) The extension of binary vectors to fuzzy vectors that accommodate soft relevance feedback should be devised. (2) To reduce the response time, the size of the rule set should be reduced while still retaining high retrieval precision. (3) Because various scenarios of association rules exist with respect to a particular query, an image relevance inference that considers both the rule confidence and the scenario is needed.

1). Soft Apriori algorithm:
Association rule mining techniques have been extensively used for finding associations or relationships between different items from large amounts of transactional data. The traditional Apriori algorithm uses two steps, a join step and a prune step. The support count of an itemset is the number of transactions in which the itemset occurs. For a CBIR problem with relevance feedback, the images retrieved for a particular query at one feedback iteration can be treated as a transactional record. Table 1 shows an example of the transaction database where three transaction records have been stored.

ID    List of retrievals and image relevance weights
T1    I1 (0.2), I2 (0.1), I3 (0.2), I4 (0.2)
T2    I1 (0.2), I2 (0.1)
T3    I3 (0.2), I4 (0.1), I5 (0.2), I6 (0.1)

The Apriori algorithm copes only with hard transactional data, i.e., the support count of an itemset is increased by one if an additional occurrence of this itemset is observed. Because our system interacts with the users through soft feedback, a new scheme referred to as the soft Apriori algorithm is proposed for counting the number of fuzzy occurrences of itemsets. Using the example shown in Table 1, we first find the set of 1-itemsets, which is denoted L1 (Table 2a). The support count of each 1-itemset is the sum of the image relevance weights given in the corresponding transaction records, since the image relevance weight indicates the relevance degree of the image occurring in the record. For instance, the support count of itemset {I1} is computed by sup({I1}) = 0.2 + 0.2 = 0.4.
In an association rule A => B, A is called the rule antecedent and B is called the rule consequent. Here, we restrict the association rules to those where A is a 1-itemset.
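The soft support count for 1-itemsets can be sketched in Python using the three records of Table 1. The sketch covers only the 1-itemset case described in the text; how soft support extends to larger itemsets is not specified here and is left out. The `min_support` threshold and the function names are hypothetical.

```python
from collections import defaultdict

# Transaction database from Table 1: image -> relevance weight, per record.
TRANSACTIONS = {
    "T1": {"I1": 0.2, "I2": 0.1, "I3": 0.2, "I4": 0.2},
    "T2": {"I1": 0.2, "I2": 0.1},
    "T3": {"I3": 0.2, "I4": 0.1, "I5": 0.2, "I6": 0.1},
}

def soft_support_counts(transactions):
    """Soft support of each 1-itemset: sum of its relevance weights over all records."""
    support = defaultdict(float)
    for record in transactions.values():
        for image, weight in record.items():
            support[image] += weight
    return dict(support)

def frequent_1_itemsets(transactions, min_support):
    """Images whose soft support reaches min_support (a hypothetical threshold)."""
    support = soft_support_counts(transactions)
    return sorted(image for image, s in support.items() if s >= min_support)
```

For the data above, `soft_support_counts(TRANSACTIONS)["I1"]` sums 0.2 from T1 and 0.2 from T2, matching the value sup({I1}) = 0.4 given in the text.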

2). Rule set reduction:
The two types of rule set reduction techniques based on confidence quantization and redundancy detection are described as follows.
Type-1 rule set reduction.
Here, we merge all the rules that have the same antecedent and whose confidence values fall in the same interval. Formally, assume there are k rules with the same antecedent A and similar confidence values, enumerated as A => Bi, i = 1, 2, ..., k. Then, these rules are merged into one rule, A => Z.
Type-2 rule set reduction.
Another useful reduction technique is to detect redundancy in the rules. Let two rules have the same antecedent A and be enumerated as A => B and A => C. Rule redundancy exists if the two rules satisfy at least one of the following conditions.
• If B ⊂ C and Conf(A => B) < Conf(A => C), then rule A => B is redundant and can be removed from the rule set.
• If D = B ∩ C and Conf(A => B) < Conf(A => C), then rule A => B can be shortened to A => B\D, where B\D denotes the difference set between B and D.
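The redundancy detection (Type-2) step can be sketched in Python as follows. This is one possible reading, offered as an assumption: both conditions are applied pairwise over all rules that share an antecedent, so that any item of B also covered by a higher-confidence rule is removed from B, and a rule whose consequent becomes empty is dropped. The data layout and the name `reduce_redundant` are hypothetical.

```python
def reduce_redundant(rules):
    """Apply redundancy detection to rules sharing one antecedent A.

    rules: list of (consequent_set, confidence) pairs, one per rule A => B.
    For each rule A => B, items that also occur in some rule A => C with
    strictly higher confidence are removed (B minus D, with D = B & C).
    If nothing remains (B is contained in such consequents), the rule is dropped.
    """
    reduced = []
    for i, (b, conf_b) in enumerate(rules):
        remaining = set(b)
        for j, (c, conf_c) in enumerate(rules):
            if i != j and conf_b < conf_c:
                remaining -= set(c)
        if remaining:
            reduced.append((frozenset(remaining), conf_b))
    return reduced
```

For example, given A => {I2} with confidence 0.5, A => {I2, I3} with 0.8, and A => {I3, I4} with 0.6, the first rule is removed and the third is shortened to A => {I4}.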

3). Relevance inference:
With the prior relevance knowledge represented as association rules, the potentially relevant images to a particular query can be found using relevance inference. Given a query image q, we consider the following relevance inference rules.
• The first-order inference rule seeks all the rules A => B with A = {q}. It infers the relevance of image b, b ∈ B, with respect to query q as Φq(b) = Conf(A => B).
• The second-order inference rule seeks the rules A => B and C => D with A = {q} and C ⊂ B. It infers the relevance of image d, d ∈ D, with respect to query q as Φq(d) = Conf(A => B) × Conf(C => D).

IV. COMPARATIVE PERFORMANCES
The soft query vector modification (SQVM) technique performs better than the query vector modification (QVM) technique. The performance of image relevance association rule mining is measured by the retrieval score (RS), which is the sum of the relevance weights of all images retrieved during a particular iteration. The performance of the Euclidean soft query vector modification (ESQVM) technique and the IRARM soft query vector modification (ASQVM) technique is shown in Fig. 2. The ASQVM performs better. In ASQVM the RS values increase gradually as the number of iterations increases, whereas in ESQVM the RS values increase for the first half of the iterations and then slow down for the rest of the iterations.

V. CONCLUSIONS
An image relevance association rule mining (IRARM) model with soft relevance feedback from multiple users is proposed here. Most of the existing relevance feedback (RF) approaches deal with hard feedback and focus on individual experience only. To support soft feedback, all of the traditional RF techniques should be modified accordingly. Further, we propose a soft association rule mining algorithm to generate image relevance association rules from the collective soft feedback. The number of association rules is kept to a minimum by confidence quantization and redundancy detection. The proposed model provides better performance. A reinforcement learning approach could be used to combine various RF techniques. Other versions of the Apriori algorithm, such as the AprioriTID and Hybrid Apriori algorithms, could be applied to the proposed model to increase performance and reduce cost.
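The first-order and second-order relevance inference rules of Section III-B can be illustrated with a short Python sketch. Because rule antecedents are restricted to 1-itemsets, the condition C ⊂ B reduces to C = {c} for some image c in B. Two choices here are assumptions not stated in the text: when several rules infer the same image, the largest inferred value is kept, and the query image itself is never returned. The rule confidences in the usage note are made-up values for illustration only.

```python
def infer_relevance(q, rules):
    """Infer relevance scores for images with respect to query image q.

    rules: dict mapping an antecedent image to a list of
           (consequent_set, confidence) pairs, one per rule.
    Returns a dict image -> inferred relevance.
    """
    scores = {}

    def update(image, value):
        # Keep the largest inferred value per image (an assumption).
        if image != q and value > scores.get(image, 0.0):
            scores[image] = value

    for b_set, conf_ab in rules.get(q, []):
        # First order: relevance of b is Conf(A => B).
        for b in b_set:
            update(b, conf_ab)
        # Second order: for C = {c} inside B, relevance of d is
        # Conf(A => B) * Conf(C => D).
        for c in b_set:
            for d_set, conf_cd in rules.get(c, []):
                for d in d_set:
                    update(d, conf_ab * conf_cd)
    return scores
```

With the hypothetical rules I1 => {I2, I3} (0.8), I2 => {I4} (0.5), and I3 => {I2} (0.9), querying I1 gives I2 and I3 a first-order relevance of 0.8, while I4 is reached only through the second-order rule with relevance 0.8 × 0.5.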

Fig. 2. The average RS values obtained using the ASQVM and ESQVM methods at different numbers of relevance feedback iterations