The importance of evaluation in information retrieval
Finding and evaluating information are two of
the main steps in doing research. Finding the information is one
thing, but how can you determine whether it is good information? Everyone has
a different opinion on what counts as good information. The best way to
evaluate information is to make a list of questions and check each source
against them. In this paper the author analyses the importance of
evaluation in information retrieval.
DEFINITION OF TERMS
According to
the San Diego State University website, “evaluating information encourages you to think critically about the
reliability, validity, accuracy, authority, timeliness, point of view or bias
of information sources.” A web definition of information retrieval is the tracing and recovery of specific
information from stored data. Wikipedia defines information retrieval as “the activity of obtaining information resources relevant to an information need from
a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing.” The Cambridge
University Press textbook Introduction to Information Retrieval defines information retrieval as “finding material (usually
documents) of an unstructured nature (usually text) that satisfies an
information need from within large collections (usually stored on computers).”
The Stanford University website says evaluation in
information retrieval is done to establish the relevance of the search results: “The standard approach to
information retrieval system evaluation revolves around the notion of relevant and nonrelevant documents. With respect to a user
information need, a document in the test collection is given a binary
classification as either relevant or nonrelevant. This decision is referred to
as the gold standard or ground truth judgment
of relevance.” The website goes further to say that a
document is relevant if it addresses the stated information need, not
because it just happens to contain all the words in the query. This distinction
is often misunderstood in practice, because the information need is not overt.
But, nevertheless, an information need is present. If a user types python into
a web search engine, they might be wanting to know where they can purchase a
pet python. Or they might be wanting information on the programming language
Python. From a one word query, it is very difficult for a system to know what
the information need is. But, nevertheless, the user has one, and can judge the
returned results on the basis of their relevance to it. To evaluate a system,
we require an overt expression of an information need, which can be used for
judging returned documents as relevant or nonrelevant. At this point, we make a
simplification: relevance can reasonably be thought of as a scale, with some
documents highly relevant and others marginally so. But for the moment, we will
use just a binary decision of relevance.
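As a minimal sketch of this binary judgment (the document identifiers, the ranked list, and the gold-standard labels below are all invented for illustration), the decision reduces to checking each returned document against the ground truth:

```python
# Minimal sketch of binary relevance judgment.
# Document IDs and gold-standard judgments are hypothetical.

relevant = {"d1", "d4", "d7"}           # gold standard / ground truth
retrieved = ["d1", "d2", "d4", "d9"]    # ranked list returned by the system

# Each retrieved document gets a binary classification.
for doc in retrieved:
    label = "relevant" if doc in relevant else "nonrelevant"
    print(doc, label)
```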
Evaluation of information is also done to check the bias or objectivity of information.
Information is also evaluated to separate propaganda from
truthful information. Propaganda,
as defined in Webster's II Dictionary, refers to the systematic widespread
promotion of a particular idea; that is, material which is published with the
purpose of winning people over to a particular idea. Propaganda techniques are
used in print material, as well as mass media. The techniques have the same
effect, but may be even more persuasive on the Net because of the potential for
interaction with more media. Some of the most common propaganda techniques are
bandwagon, testimonial, transfer, repetition, and emotional words. The
bandwagon technique persuades someone to do something because many other people
are doing it. The testimonial method involves the use of testimonials, or
quotes from authorities or famous people to persuade the public. The transfer
approach uses a situation that is appealing, or features appealing or famous
people, in hopes that the audience will identify with the people or situation
and transfer their feelings to the product. The repetition strategy involves
repeating the same words or phrases for emphasis. Finally, another persuasive
technique is using emotional words to make the audience feel strongly about
something or someone. Many times these techniques are used in advertising, but
we still face propaganda in the content of the sites we visit, as well.
Evaluation
in information retrieval is used to determine how well the system works.
This can be investigated at several levels: firstly, the processing speed of
the system, that is, its time and speed; and secondly, the effectiveness of the results
and whether the system satisfies the user.
On the
effectiveness of the retrieval system: in
response to a query, an IR system searches its document collection and returns
an ordered list of responses called the retrieved set or ranked list. The
system employs a search strategy or algorithm, and evaluation measures the
quality of the ranked list it produces. A better search strategy yields a better ranked list,
and a better ranked list helps the user fill their information need. This is why
evaluation is important in information retrieval: it lets us compare systems and
identify the one that produces the better ranked list.
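As a rough sketch of such a comparison (the two ranked lists and the relevance judgments are invented), one simple quality measure for a ranked list is precision at a cutoff k, and the strategy with the higher score produces the better list under that measure:

```python
# Toy comparison of two search strategies by precision at cutoff k.
# Ranked lists and relevance judgments are hypothetical.

relevant = {"d1", "d3", "d5", "d8"}

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    top_k = ranked[:k]
    return sum(1 for d in top_k if d in relevant) / k

strategy_a = ["d1", "d3", "d2", "d5", "d9"]
strategy_b = ["d2", "d9", "d1", "d6", "d7"]

print("A:", precision_at_k(strategy_a, relevant, 5))  # 0.6
print("B:", precision_at_k(strategy_b, relevant, 5))  # 0.2
```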
Evaluation
also helps in finding an IR system with better precision and recall. Precision
is the fraction of retrieved documents that are relevant, while
recall is the fraction of relevant documents that are retrieved. Precision and recall are
measures of sets. In a ranked list, we can measure the precision at each recall
point: recall increases each time a relevant document is retrieved, so we compute
precision at each relevant retrieved document, over the
retrieved set up to that rank. There is a trade-off between recall and precision: as recall
increases, precision tends to decline. So evaluation is
there to help us find a system with a good balance of recall and precision.
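A small sketch of this computation, on invented data: walking down the ranked list, recall rises each time a relevant document appears, and recording precision at those ranks makes the trade-off visible:

```python
# Precision at each recall point in a ranked list (hypothetical data).

relevant = {"d1", "d3", "d6"}
ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]

hits = 0
for rank, doc in enumerate(ranked, start=1):
    if doc in relevant:                  # recall increases at this rank
        hits += 1
        precision = hits / rank          # precision over the retrieved set so far
        recall = hits / len(relevant)
        print(f"rank {rank}: recall={recall:.2f}, precision={precision:.2f}")
```

On this data the printed points are (0.33, 1.00), (0.67, 0.67), (1.00, 0.50): as recall grows, precision falls.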
According
to Nayak P. and Raghavan P. (2003:1), information evaluation is important in
determining whether the information found is good and whether it satisfies the user. How
do we know if our results are any good? There is a need to evaluate search
engine benchmarks, precision and recall, and result summaries. The search
engine needs to be evaluated on how fast it indexes (number of documents per hour,
average document size), how fast it searches, the expressiveness of its query
language, whether its UI is uncluttered, and whether it is free. We also measure user
happiness, to which speed of response contributes, although blindingly fast, useless
answers won’t make a user happy.
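As a toy illustration of measuring one of these benchmarks, indexing speed (the corpus and the bare-bones inverted indexer below are invented; a real evaluation would time a production engine on a standard collection):

```python
# Toy benchmark of indexing speed in documents per second.
# The corpus and the minimal inverted indexer are hypothetical.
import time
from collections import defaultdict

docs = [f"sample document number {i} about information retrieval"
        for i in range(10000)]

start = time.perf_counter()
index = defaultdict(set)                 # term -> set of document IDs
for doc_id, text in enumerate(docs):
    for term in text.lower().split():
        index[term].add(doc_id)
elapsed = time.perf_counter() - start

print(f"indexed {len(docs)} docs in {elapsed:.3f}s "
      f"({len(docs) / elapsed:.0f} docs/second)")
```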
The system
needs to be evaluated to determine whether the IR system satisfies the system user. Some
of the factors influencing user satisfaction are as follows: system effectiveness,
user effectiveness, user effort, and user characteristics. System effectiveness
measures how well a given IR system achieves its objective. Traditionally,
system retrieval effectiveness is measured in terms of precision (the fraction
of documents retrieved by the IR system that are also relevant to the
query) and recall (the fraction of the relevant documents present in the
database that are retrieved by the IR system). These two parameters characterize
the ability of the system to retrieve relevant documents and avoid irrelevant
ones (Van Rijsbergen, 1979: p.114). User effectiveness is defined as the
accuracy and completeness with which users achieve certain goals. User
effectiveness can be measured by the following criteria: the number of tasks successfully completed,
the number of relevant documents obtained, and the time taken by users to complete
set tasks (Hersh et al., 2000). Indicators of effectiveness also include
quality of solution and error rates.
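A short sketch of these two parameters on invented counts, together with the balanced F-measure that is commonly used to combine them (a measure related to Van Rijsbergen's E-measure):

```python
# Precision, recall, and the balanced F-measure (hypothetical counts).

retrieved = 20            # documents returned by the IR system
relevant_in_db = 30       # relevant documents present in the database
relevant_retrieved = 15   # relevant documents among those returned

precision = relevant_retrieved / retrieved                  # 0.75
recall = relevant_retrieved / relevant_in_db                # 0.50
f_measure = 2 * precision * recall / (precision + recall)   # harmonic mean

print(precision, recall, round(f_measure, 2))               # 0.75 0.5 0.6
```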
User
effectiveness is different from system effectiveness. For example, system
effectiveness is measured objectively by the number of relevant documents
retrieved by the IR system (e.g. TREC relevance assessments), whereas user
effectiveness is measured by the number of relevant documents saved by the
users out of the relevant documents retrieved by the IR system (e.g.
the number of relevant documents identified by the users that at the same time
match the TREC relevance assessments). User effort can be defined in a similar
way to the definition of 'information searching behaviour'
(Wilson, 2000); information searching behaviour is the user's search behaviour
when interacting with an IR system to search for information. So evaluation
helps us find a system with the above qualities, one which can satisfy the
user.
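As a hedged sketch of that distinction (all counts invented): system effectiveness is computed over what the system retrieves, while user effectiveness is computed over what the user actually identifies and saves from those results:

```python
# System effectiveness vs user effectiveness (hypothetical counts).

retrieved = 50              # documents the IR system returned
relevant_retrieved = 20     # of those, judged relevant (e.g. TREC assessments)
saved_by_user = 12          # relevant documents the user identified and saved

system_effectiveness = relevant_retrieved / retrieved      # 0.40
user_effectiveness = saved_by_user / relevant_retrieved    # 0.60

print(system_effectiveness, user_effectiveness)
```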
Conclusion
The author has pointed out that evaluation in information retrieval
helps in establishing the relevance of search results, determining the bias or
objectivity of information, screening out propaganda, and determining how well
the system works. Evaluation helps determine the effectiveness of the retrieval
system, which includes precision and recall. Evaluation is also important
because it helps establish whether the search results satisfy the user.
References
Nayak, P. and Raghavan, P. (2003) Introduction to Information Retrieval and Web Search. Toronto: Pacific Press.
en.wikipedia.org/wiki/precision_and_recall, accessed 26/03/15.
www.lib.vt.edu/instruct/evaluate/evaluating_internet_information, accessed 26/03/15.
Al-Maskari, A. and Sanderson, M. A Review of Factors Influencing User Satisfaction in Information Retrieval. Found online at citeseerx.lst.psu.edu/viewdoc/download?, accessed 26/03/15.
http://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html, accessed 26/03/15.
Etiwel Mutero holds a Bachelor of Science Honours Degree in Records and Archives Management through the Zimbabwe Open University and a National Certificate in Records and Archives Management from Kwekwe Polytechnic. You can contact him on 00264817871070 or etiwelm02@gmail.com