The importance of evaluation in information retrieval
Finding and evaluating information are two of the main steps in doing research. Of course, finding the information is one thing, but how can you determine if it is good information? Everyone will have a different opinion on what is considered good information. The best way to evaluate information is to make a list of questions and check each source against them. In this paper the author is going to analyse the importance of evaluation in information retrieval.
DEFINITION OF TERMS
According to the San Diego State University website, “evaluating information encourages you to think critically about the reliability, validity, accuracy, authority, timeliness, point of view or bias of information sources.” A web definition of information retrieval is the tracing and recovery of specific information from stored data. Wikipedia defines information retrieval as “the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing.” Cambridge University defines information retrieval as “finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).”
The Stanford University website says evaluation in information retrieval is done to find the relevance of the search results: “The standard approach to information retrieval system evaluation revolves around the notion of relevant and nonrelevant documents. With respect to a user information need, a document in the test collection is given a binary classification as either relevant or nonrelevant. This decision is referred to as the gold standard or ground truth judgment of relevance.” The website goes further to say that a document is relevant if it addresses the stated information need, not because it just happens to contain all the words in the query. This distinction is often misunderstood in practice, because the information need is not overt. But, nevertheless, an information need is present. If a user types python into a web search engine, they might be wanting to know where they can purchase a pet python. Or they might be wanting information on the programming language Python. From a one-word query, it is very difficult for a system to know what the information need is. But, nevertheless, the user has one, and can judge the returned results on the basis of their relevance to it. To evaluate a system, we require an overt expression of an information need, which can be used for judging returned documents as relevant or nonrelevant. At this point, we make a simplification: relevance can reasonably be thought of as a scale, with some documents highly relevant and others marginally so. But for the moment, we will use just a binary decision of relevance.
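The binary gold-standard judgments described above can be pictured with a small sketch. The document IDs, the wording of the two information needs, and the `judge` helper are all hypothetical and chosen only for illustration; the point is that the same one-word query yields different relevance labels depending on the information need behind it.

```python
# Hypothetical gold standard: for each stated information need,
# the set of document IDs that assessors judged relevant.
gold_standard = {
    "python programming language": {"d2", "d5"},
    "buying a pet python": {"d1", "d4"},
}


def judge(information_need, retrieved_ids):
    """Label each retrieved document as relevant (True) or nonrelevant (False)."""
    relevant = gold_standard[information_need]
    return [(doc_id, doc_id in relevant) for doc_id in retrieved_ids]


# Documents a system might return for the one-word query "python".
results = ["d1", "d2", "d3"]

programming_labels = judge("python programming language", results)
pet_labels = judge("buying a pet python", results)
```

Here `d1` counts as relevant only for the pet-python need and `d2` only for the programming need, even though both were returned for the same query.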
Evaluation of information is also done to check its bias or objectivity.
Information is evaluated to screen propaganda from real truthful information. Propaganda, as defined in Webster's II Dictionary, refers to the systematic widespread promotion of a particular idea; that is, material which is published with the purpose of winning people over to a particular idea. Propaganda techniques are used in print material, as well as mass media. The techniques have the same effect, but may be even more persuasive on the Net because of the potential for interaction with more media. Some of the most common propaganda techniques are bandwagon, testimonial, transfer, repetition, and emotional words. The bandwagon technique persuades someone to do something because many other people are doing it. The testimonial method involves the use of testimonials, or quotes from authorities or famous people to persuade the public. The transfer approach uses a situation that is appealing, or features appealing or famous people, in hopes that the audience will identify with the people or situation and transfer their feelings to the product. The repetition strategy involves repeating the same words or phrases for emphasis. Finally, another persuasive technique is using emotional words to make the audience feel strongly about something or someone. Many times these techniques are used in advertising, but we still face propaganda in the content of the sites we visit, as well.
Evaluation in information retrieval is used to determine how well the system works. This can be investigated at several levels: first, the processing performance of the system, that is, its time and speed; second, the effectiveness of the results and whether the system satisfies the user.
On the effectiveness of the retrieval system: in response to a query, an IR system searches its document collection and returns an ordered list of responses called the retrieved set or ranked list. To do so, the system employs a search strategy or algorithm. The quality of the ranked list is then measured. A better search strategy yields a better ranked list, and a better ranked list helps users fill their information need. This is why evaluation is important in information retrieval: it lets us compare systems and see which produces the better ranked lists.
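As a rough illustration of comparing two systems by their ranked lists, the sketch below counts how many judged-relevant documents each system places in its first k results. This counting rule is just one simple, assumed way of scoring a ranked list, and the systems, document IDs, and relevance judgments are made up.

```python
def relevant_in_top_k(ranked_list, relevant_ids, k):
    """Count how many of the first k returned documents are judged relevant."""
    return sum(1 for doc_id in ranked_list[:k] if doc_id in relevant_ids)


# Hypothetical relevance judgments for one information need.
relevant_ids = {"d1", "d3", "d7"}

# Ranked lists returned by two hypothetical systems for the same query.
system_a = ["d1", "d3", "d4", "d7", "d9"]
system_b = ["d4", "d9", "d1", "d2", "d3"]

score_a = relevant_in_top_k(system_a, relevant_ids, k=3)
score_b = relevant_in_top_k(system_b, relevant_ids, k=3)

if score_a > score_b:
    better = "A"
elif score_b > score_a:
    better = "B"
else:
    better = "tie"
```

With these lists, system A places two relevant documents in its top three and system B only one, so A would be preferred under this measure.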
Evaluation also helps in finding an IR system with better precision and recall. Precision is the fraction of retrieved documents that are relevant, while recall is the fraction of relevant documents that are retrieved. Precision and recall are measures of sets. In a ranked list, we can measure the precision at each recall point: recall increases each time a relevant document is retrieved, so precision is computed at each relevant retrieved document, over the portion of the ranked list retrieved up to that point. There is a trade-off between recall and precision: as recall increases, precision tends to decline. Evaluation therefore helps us identify a system that balances recall and precision.
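The definitions above can be turned into a minimal sketch. The function names and the sample data are assumptions made for illustration; the calculations follow the set-based definitions of precision and recall, plus precision measured at each point where a relevant document appears in the ranked list.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that are retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def precision_at_recall_points(ranked_list, relevant):
    """Return (recall, precision) pairs, one per relevant document retrieved."""
    relevant = set(relevant)
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked_list, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision over the part of the list seen so far.
            points.append((hits / len(relevant), hits / rank))
    return points


# Hypothetical ranked list and relevance judgments.
ranked = ["d1", "d4", "d3", "d9", "d7"]
relevant = {"d1", "d3", "d7", "d8"}

p = precision(ranked, relevant)
r = recall(ranked, relevant)
points = precision_at_recall_points(ranked, relevant)
```

In this example recall rises from 0.25 to 0.75 as the list is read from top to bottom, while precision falls from 1.0 to 0.6, which is the trade-off described above.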
According to Nayak and Raghavan (2003: 1), information evaluation is important in determining whether the information found is good and whether it satisfies the user. How do we know if our results are any good? There is a need to evaluate search engine benchmarks, precision and recall, and results summaries. A search engine needs to be evaluated on how fast it indexes (the number of documents per hour for a given average document size), how fast it searches, the expressiveness of its query language, whether it has an uncluttered UI, and whether it is free. We also measure user happiness, to which speed of response contributes, although blindingly fast, useless answers won't make a user happy.
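Two of the benchmarks mentioned above, indexing speed and search speed, could be measured along the following lines. This is only a sketch: the toy inverted index, the `timed` helper, and the synthetic documents are assumptions, and a real search engine would be benchmarked on its own collection.

```python
import time


def build_index(documents):
    """Build a toy inverted index mapping each term to a set of document IDs."""
    index = {}
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def timed(func, *args):
    """Run func(*args) and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Synthetic collection of 1000 short documents (hypothetical data).
documents = {f"d{i}": "python is a programming language" for i in range(1000)}

index, index_seconds = timed(build_index, documents)
if index_seconds > 0:
    docs_per_hour = len(documents) / index_seconds * 3600
else:
    docs_per_hour = float("inf")

# Time a single one-term lookup against the index.
hits, search_seconds = timed(index.get, "python", set())
```

Timings like these say nothing about whether the answers are useful, which is why they are considered alongside precision and recall rather than instead of them.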
The system needs to be evaluated to determine whether it satisfies its users. Some of the factors influencing user satisfaction are system effectiveness, user effectiveness, user effort, and user characteristics. System effectiveness measures how well a given IR system achieves its objective. Traditionally, system retrieval effectiveness is measured in terms of precision (the fraction of documents retrieved by the IR system that are also relevant to the query) and recall (the fraction of the relevant documents present in the database that are retrieved by the IR system). These two parameters characterize the ability of the system to retrieve relevant documents and avoid irrelevant ones (Van Rijsbergen, 1979: p.114). User effectiveness is defined as the accuracy and completeness with which users achieve certain goals. It can be measured by the following criteria: the number of tasks successfully completed, the number of relevant documents obtained, and the time taken by users to complete set tasks (Hersh et al., 2000). Indicators of effectiveness also include quality of solution and error rates.
User effectiveness is different from system effectiveness. System effectiveness is measured objectively by the number of relevant documents retrieved by the IR system (e.g. according to TREC relevance assessments), whereas user effectiveness is measured by the number of relevant documents saved by the users out of the relevant documents retrieved by the IR system (e.g. the number of documents identified as relevant by the users that also match the TREC relevance assessments). User effort can be defined in a similar way to ‘information searching behaviour’ (Wilson, 2000): information searching behaviour is the user's search behaviour when interacting with an IR system to search for information. Evaluation thus helps us find a system with the above qualities, one that can satisfy the user.
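The difference between the two measures can be sketched as follows. The function names and the sample sets are hypothetical; the counts simply follow the description above, with system effectiveness counted over what the system retrieved and user effectiveness counted over what the user saved from it.

```python
def system_effectiveness(retrieved, assessed_relevant):
    """Number of retrieved documents that the relevance assessments mark relevant."""
    return len(set(retrieved) & set(assessed_relevant))


def user_effectiveness(saved_by_user, retrieved, assessed_relevant):
    """Number of saved documents that were retrieved and are assessed relevant."""
    return len(set(saved_by_user) & set(retrieved) & set(assessed_relevant))


# Hypothetical data for one search task.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
assessed_relevant = {"d1", "d3", "d5", "d8"}   # e.g. assessor judgments
saved_by_user = {"d1", "d2", "d5"}             # documents the user chose to keep

system_score = system_effectiveness(retrieved, assessed_relevant)
user_score = user_effectiveness(saved_by_user, retrieved, assessed_relevant)
```

Here the system retrieved three relevant documents, but the user saved only two of them, plus one document (`d2`) that the assessors did not consider relevant.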
The author has pointed out that evaluation in information retrieval helps to establish the relevance of search results, determine the bias or objectivity of information, screen out propaganda, and determine how well the system works. Evaluation helps determine the effectiveness of the retrieval system, which includes its precision and recall. Evaluation is also important because it helps establish whether the search results satisfy the user.
Nayak, P. (2003). Introduction to information retrieval and web search. Toronto: Pacific Press.
En.wikipedia.org/wiki/precision_and_recall (accessed 26/03/15)
www.lib.vt.edu/instruct/evaluate/evaluating_internet_information (accessed 26/03/15)
Al-Maskari and Sanderson, M. A Review of Factors Influencing User Satisfaction in Information – Found online on citeseerx.lst.psu.edu/viewdoc/download? (accessed 26/03/15)
San Diego State University website, https://www.sdsu.edu/
http://nlp.stanford.edu/IR-book/html/htmledition/information-retrieval-system-evaluation-1.html (accessed 26/03/15)
Etiwel Mutero holds a Bachelor of Science Honours Degree in Records and Archives Management from the Zimbabwe Open University and a National Certificate in Records and Archives Management from Kwekwe Polytechnic. You can contact him on 00264817871070 or firstname.lastname@example.org