The Seven Stages of Information Retrieval

Lesk (1996) is known for describing the seven stages of information retrieval. Lesk's account reads as a critique of Bush's predictions about information retrieval systems, since he refers a number of times to Bush's writings and predictions. Shakespeare had described the seven ages of man (Shakespeare 1599), starting from infancy and leading to senility. The history of information retrieval parallels such a life, asserts Lesk (1996:1). Lesk says the popularization of the idea of information retrieval started in 1945, with Vannevar Bush's article. In this paper the author discusses the seven stages of information retrieval.

Definition of Terms
The author will first define what information retrieval is and who Michael Lesk is. According to Wikipedia, information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing. According to Cambridge University Press (2009:1), the meaning of the term information retrieval can be very broad. Just getting a credit card out of your wallet so that you can type in the card number is a form of information retrieval. However, as an academic field of study, information retrieval might be defined thus:
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).
Turning to Michael Lesk: according to Wikipedia, he worked for the SMART Information Retrieval System project, wrote much of its retrieval code and ran many of the retrieval experiments, and obtained a PhD in Chemical Physics from Harvard University. He worked for Bell Labs and in 1984 started working for Bellcore; Cornell and the Online Computer Library Center are also listed among his affiliations. Currently he is on the faculty of the Library and Information Science Department, School of Communication & Information, Rutgers University.

According to Lesk (1996:2), the seven ages of information retrieval start with a stage he named Childhood, ranging from 1945 to 1955. Lesk says this period saw digital recording and photographic inventions, though these fell short of Bush's expectations. There was an information explosion after World War II and a strong prospect of information-processing machines. Bush's predictions about information storage were not met during this period.

Bush had predicted the design of the user interface, but Lesk says that when scientists or engineers want information, they look first in the closest sources. They consult personal notes, ask their nearest colleagues, ask their supervisors or other local authorities, and only much less frequently do they look into journals. Research into information retrieval began with the Soviet Union launching its artificial Earth satellite, Sputnik.
KWIC indexes and concordances, as used in information retrieval, were invented during this period by researchers such as H.P. Luhn. Concordances are still used where a small text is to be printed for any kind of large collection. There were also innovative attempts at alternative kinds of machinery, such as the use of overlapping codes on edge-notched cards by Calvin Mooers. The most famous piece of equipment of this period was the WRU Searching Selector, built by Allen Kent at Western Reserve University. All of this special-purpose technology was, of course, swept away by digital systems. Lesk notes that he could no more find edge-notched cards today than he could find IBM punched cards.
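To make the idea of a KWIC (keyword-in-context) index concrete, here is a minimal sketch in Python. It is not Luhn's original procedure: the function names, the stopword list and the column width are assumptions chosen for illustration. Each significant word of a title becomes an index entry, shown with the words around it, and the entries are sorted by keyword.

```python
# Illustrative sketch only: names, stopword list and width are assumptions.
STOPWORDS = frozenset({"a", "an", "and", "of", "the", "in", "for", "to"})


def kwic_index(titles, stopwords=STOPWORDS):
    """Return (keyword, left context, word, right context) tuples sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:!?")
            if not key or key in stopwords:
                continue  # skip punctuation-only tokens and common words
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right))
    # sorted-by-keyword order is what lets a reader look a word up
    entries.sort(key=lambda entry: entry[0])
    return entries


def format_kwic(entries, width=30):
    """Align every keyword in the same column, with context on both sides."""
    return [
        f"{left[-width:]:>{width}}  {word}  {right[:width]}"
        for _, left, word, right in entries
    ]


titles = [
    "The Seven Ages of Information Retrieval",
    "Information Storage and Retrieval",
]
for line in format_kwic(kwic_index(titles)):
    print(line)
```

Because every keyword is printed in the same column, a reader can scan down the index for a word and see at a glance the titles in which it appears.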

The second stage of information retrieval is the schoolboy, the period of the 1950s to 1960s. This was a time of great experimentation in information retrieval systems. Lesk (1996) says this period also saw the first large-scale information systems built. Commercial library systems such as Dialog and BRS can be traced back to experiments done at this time. The period saw the definition of ‘recall’ and ‘precision’ and the development of the technology for evaluating retrieval systems. It also saw the separation of the field from the mainstream of computer science. During this period many people attended IR conferences.

During this period the idea of free-text search arose: any document could be retrieved by searching for a particular word it contained. Cyril Cleverdon also developed the mathematics of ‘recall’ (the fraction of relevant documents that are retrieved) and ‘precision’ (the fraction of retrieved documents that are relevant) as measures of IR systems. Throughout the 1960s there was relatively little actual computerized retrieval going on.
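The two measures defined above can be worked through with a small Python sketch. The function name and the document identifiers are made up for illustration; only the two definitions come from the text.

```python
def recall_precision(retrieved, relevant):
    """Compute (recall, precision) for one query.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all retrieved documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: the system returns four documents, two of which
# are among the five documents judged relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5", "d6", "d7"]
recall, precision = recall_precision(retrieved, relevant)
print(recall, precision)  # 0.4 0.5
```

In this example the system finds 2 of the 5 relevant documents (recall 0.4), and 2 of the 4 documents it returns are relevant (precision 0.5). Returning more documents tends to raise recall while lowering precision, which is why the two are reported together.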

The third stage of information retrieval is called adulthood, the period of the 1970s. This period saw the development of computer typesetting, word processing and time-sharing systems. Commercial systems such as Dialog, Orbit and BRS were developed. Computerization made it faster and easier to combine the weekly indexes into yearly collective indexes. The tapes used to do this could then be provided to services such as Dialog for access by professional librarians.

Lesk (1996) also says that another early system was OCLC, the Online Computer Library Center, founded by Fred Kilgour; it used the output of the Library of Congress MARC program for machine-readable cataloguing. The decade also saw the start of full-text retrieval systems. The 1970s and 1980s were a period when both database and office automation research boomed. Keith van Rijsbergen introduced probabilistic information retrieval. This involved, Lesk wrote, measuring the frequency of words in relevant documents and using term frequency measures to adjust the weight given to different words.
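The idea of weighting words by how often they occur in relevant documents can be sketched in Python. This is an illustration under stated assumptions, not the model Lesk attributes to van Rijsbergen: it uses a smoothed log-odds weight of the kind associated with Robertson and Sparck Jones, and the function names and the tiny document collection are hypothetical.

```python
import math


def relevance_weights(docs, relevant_ids):
    """Weight each term by how strongly it is associated with relevant documents.

    docs: dict mapping document id -> text
    relevant_ids: ids of the documents judged relevant
    Uses a smoothed log-odds weight (0.5 added to each count), an assumed
    stand-in for the probabilistic weighting described in the text.
    """
    relevant_ids = set(relevant_ids) & docs.keys()  # ignore unknown ids
    total = len(docs)                # N: documents in the collection
    total_rel = len(relevant_ids)    # R: relevant documents
    doc_terms = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    vocabulary = set().union(*doc_terms.values()) if doc_terms else set()
    weights = {}
    for term in vocabulary:
        n = sum(1 for terms in doc_terms.values() if term in terms)   # docs containing term
        r = sum(1 for doc_id in relevant_ids if term in doc_terms[doc_id])  # relevant docs containing term
        numerator = (r + 0.5) * (total - n - total_rel + r + 0.5)
        denominator = (n - r + 0.5) * (total_rel - r + 0.5)
        weights[term] = math.log(numerator / denominator)
    return weights


def score(text, weights):
    """Score a document as the sum of the weights of the distinct terms it contains."""
    return sum(weights.get(term, 0.0) for term in set(text.lower().split()))


# Hypothetical four-document collection; documents 1 and 2 are judged relevant.
docs = {
    1: "probabilistic retrieval model",
    2: "retrieval of relevant documents",
    3: "office automation research",
    4: "database research",
}
weights = relevance_weights(docs, {1, 2})
print(weights["retrieval"] > 0, weights["research"] < 0)
```

Words that appear mostly in relevant documents ("retrieval" here) receive positive weights and words that appear mostly in non-relevant ones ("research") receive negative weights, so documents can be ranked by the summed weight of their terms.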

The fourth stage of information retrieval is called the maturity stage, the period of the 1980s. Sugimoto and McCain (2010:2) say Lesk called the 1980s ‘a decade in which online information became common’. Markey also noted the transition which occurred in the 1980s, stating:
“Prior to 1980 the few available online information retrieval (IR) systems were expensive and complicated to search, so end users either delegated their searches to trained intermediaries or searched card catalogs and print indexes on their own. In the mid 1980s, online IR systems suppliers introduced simple front-end interfaces to IR systems and marketed search services to end users but [they] never used these services because of their high cost.”
This shift from intermediaries to novices was also described by Savage-Knepshield and Belkin, who noted a change in design to accommodate these users.

With respect to other developments, Lesk describes the rise and fall of artificial intelligence within this decade; the enthusiasm for expert systems and knowledge representation languages in the early 1980s; the exponential increase in the number of databases available online; progress in OPAC development; and widespread use of the CD-ROM, especially by the end of the decade.

The fifth stage of information retrieval, according to Lesk (1996), is called the midlife crisis, the period of the 1990s. He notes the ‘technological revolution’ of the internet and its likely impact on information retrieval, predicting that the internet would become ubiquitous within the next decade (2000-2009). Lesk describes the development of scanning, online publishing, computer networking and the rise of digital libraries in this decade. The internet put information retrieval to the test. He concludes by discussing information retrieval evaluation, crediting TREC with the creation of “realistic collections for evaluation”, in comparison with the small test collections of the 1980s. Other authors have noted that TREC
“transformed both the state of the art of information retrieval in general and that of IR experimentation in particular”.
The impact of the internet on IR is reinforced in many historical writings; Markey (2007:1071) says “end user searching truly came into its own with the development of web search engines in the mid 1990s”.

The sixth stage of information retrieval is called fulfillment, the period of the 2000s. Lesk, writing speculatively about the future, called the 2000s a time of ‘fulfillment’ for information retrieval, prophesying that 2000-2009 would be a time when most ordinary questions would be answered by reference to online materials rather than paper materials, when books would be routinely offered online, and when libraries would actively work toward a retrospective conversion of their collections. He predicted work toward better retrieval technologies for images, sounds and video; better procedures for handling online publishing; and a need for new techniques in cooperative browsing.

The seventh stage of information retrieval is called retirement, covering the years starting in 2010. The basic job of this period is the conversion to machine-readable form of all remaining manuscripts. Lesk says another serious problem is copyright law: copyright issues will pose great difficulty when it comes to the digitization of paper records.

The author has discussed the history of information retrieval as described by Michael Lesk, starting with childhood (1945-1955), followed by the schoolboy (1960s), adulthood (1970s), maturity (1980s), the midlife crisis (1990s), and finally fulfillment (2000s) and retirement (from 2010).

References
1. K. Markey, Twenty-five years of end-user searching, Part 1: Research findings, Journal of the American Society for Information Science and Technology, 58 (8) (2007) 1071-1081.
2. M. Lesk, The seven ages of information retrieval, UDT Occasional Paper #5, IFLA (1995). Accessed 26/02/15.
3. I. Ruthven, Interactive information retrieval. In B. Cronin (ed.), Annual Review of Information Science and Technology, 42 (1) (2008) 43-91.
4. C.R. Sugimoto, K.W. McCain, Visualizing changes over time: A history of information retrieval through the lens of descriptor tri-occurrence mapping, Journal of Information Science XX(X) (2010) 1-15.

Etiwel Mutero holds a Bachelor of Science Honours Degree in Records and Archives Management through the Zimbabwe Open University and a National Certificate in Records and Archives Management from Kwekwe Polytechnic. You can contact him on 00264817871070.
