The Seven Stages of Information Retrieval
Lesk (1996) is famed for writing the seven ages of information retrieval. Lesk's account seemed to be a criticism of
Bush’s predictions on information retrieval systems, since he referred a number
of times to Bush’s writings and predictions.
Shakespeare had described the seven ages of man (Shakespeare 1599),
starting from infancy and leading to senility.
The history of information retrieval parallels such a life, asserts Lesk
(1996:1). Lesk says the popularization
of the idea of information retrieval started in 1945, with Vannevar Bush’s
article. In this paper the author is
going to discuss the seven stages of information retrieval.
Definition of Terms
The author will define what information retrieval is and who Michael Lesk is. According to Wikipedia,
information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a
collection of information resources.
Searches can be based on metadata or on full-text (or other
content-based) indexing. According to
Cambridge UP (2009:1), the meaning of the term information retrieval can be very
broad. Just getting a credit card out of
your wallet so that you can type in the card number is a form of information
retrieval. However, as an academic field
of study, information retrieval might be defined thus:
retrieval(IR) if finding material (usually document) of an unstructured nature
(usually text) that satisfies an information need from within large collections
(usually stored on computers).
Turning to who Michael Lesk is, according to Wikipedia, he worked for the SMART Information Retrieval
System project, wrote much of its retrieval code and ran many of the retrieval
experiments, as well as obtaining a PhD in Chemical Physics from Harvard
University. He worked for Bell Labs and in 1984
moved to Bellcore; he has also been associated with Cornell and the Online Computer Library Center.
Currently, he is on the faculty of the Library and Information Science
Department, School of Communication & Information, Rutgers University.
According to Lesk (1996:2), the seven ages of information retrieval start with a stage he
named Childhood, ranging from 1945 to 1955. Lesk says that during this period digital
recording and photographic inventions appeared, though they fell short of Bush’s
expectations. There was an information
explosion after World War II and a high possibility of information-processing
machines. The information storage predictions
made by Bush were not met during this period.
Bush had predicted the design of the user interface during this period, but Lesk says
that when scientists or engineers want information, they look first in the closest
sources. They consult personal notes,
ask nearest colleagues, ask their supervisors or other local authorities, and
only much less frequently do they look into journals. Research into information retrieval began
with the Soviet Union launching its artificial Earth satellite.
The invention of KWIC (keyword in context) indexes, concordances as used in information retrieval, by
researchers such as H.P. Luhn took place during this period. Concordances are still used where a small
text is to be printed for any kind of large collection. There were also innovative attempts at
alternate kinds of machinery such as the use of overlapping codes on
edge-notched cards by Calvin Mooers. The
most famous piece of equipment of this period was the WRU Searching Selector
built by Allen Kent at Western Reserve University. All of this special-purpose technology was
swept away by digital systems, of course.
Lesk remarks that he could no more find edge-notched cards today than he could find
IBM punched cards.
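A KWIC index of the kind mentioned above can be illustrated with a short sketch. This is only a minimal illustration, not Luhn's original procedure: the function names, the stopword list and the sample titles are all assumptions made for the example. Each significant word of a title becomes an index entry, shown with the words to its left and right, and the entries are sorted by keyword.

```python
# Minimal KWIC (keyword in context) sketch. The stopword list, names and
# sample titles are hypothetical; they are not taken from Lesk or Luhn.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "the", "to"})


def kwic_entries(titles, stopwords=STOPWORDS):
    """Return (keyword, left context, word, right context, title number) tuples."""
    entries = []
    for number, title in enumerate(titles, start=1):
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:!?")
            if not key or key in stopwords:
                continue  # skip punctuation-only tokens and common words
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right, number))
    # Sort alphabetically by keyword, then by title number.
    entries.sort(key=lambda e: (e[0], e[4]))
    return entries


def format_kwic(entries, width=30):
    """Line up the keywords in a central column, as in a printed KWIC index."""
    lines = []
    for _key, left, word, right, number in entries:
        lines.append(f"{left[-width:]:>{width}}  {word.upper()}  {right[:width]}  [{number}]")
    return lines


sample_titles = [
    "The Seven Ages of Information Retrieval",
    "Information Storage and Retrieval",
]
for line in format_kwic(kwic_entries(sample_titles)):
    print(line)
```

A reader scanning the central column for a word such as RETRIEVAL finds every title that contains it, which is why a small amount of printed text can serve a large collection.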
The second stage of information retrieval is the schoolboy, which is the period of
the 1950s to 1960s. This was a time of
great experimentation in information retrieval systems. Lesk (1996) says this period also saw the
first large-scale information systems built.
Commercial library systems such as Dialog and BRS can be traced back to
experiments done at this time. The
period saw the definition of ‘recall’ and ‘precision’ and the development of
the technology for evaluating retrieval systems. It saw the separation of the field from the
mainstream of computer science. During
this period many people attended IR conferences.
During this period the idea of free-text search arose.
There could be complete retrieval of any document using a particular
word. Cyril Cleverdon also developed the
mathematics of ‘recall’ (the fraction of relevant documents retrieved) and
‘precision’ (the fraction of retrieved documents that are relevant) as measures of IR
systems. Throughout this period of the
1960s, there was relatively little actual computerized retrieval going on.
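The two measures can be made concrete with a small worked sketch. The document identifiers and the function name below are hypothetical, chosen only to illustrate the definitions: recall is the share of the relevant documents that the system retrieved, and precision is the share of the retrieved documents that are relevant.

```python
# Minimal sketch of recall and precision. The document identifiers are
# made up for illustration; they do not come from any real test collection.
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


retrieved_docs = ["d1", "d2", "d3", "d4"]        # what the system returned
relevant_docs = ["d2", "d4", "d5", "d7", "d9"]   # what a judge marked relevant
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4). A system can usually raise one measure at the expense of the other, which is why both are reported.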
The third stage of information retrieval is called adulthood, the period of the
1970s. There was the development of
computer typesetting, word processing and time-sharing systems. Commercial systems such as Dialog, Orbit and
BRS were developed. Computerization made
it faster and easier to combine the weekly indexes into yearly collective
indexes. The tapes used to do this could then
be provided to services such as Dialog for access by professional librarians.
Lesk (1996) also says another early system was OCLC, the Online Computer Library
Center founded by Fred Kilgour; it used the output of the Library of Congress
MARC program for machine-readable cataloguing.
The decade also saw the start of full-text retrieval systems. The 1970s and 1980s were a period when both
database and office automation research boomed. Keith van Rijsbergen introduced
probabilistic information retrieval.
This involved, Lesk wrote, measuring the frequency of words in relevant
documents, and using term frequency measures to adjust the weight given to
different words.
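The idea of adjusting word weights from their frequency in relevant documents can be sketched as follows. The text above does not give a formula, so this example assumes one well-known formulation, the Robertson-Sparck Jones relevance weight with 0.5 smoothing; the function name and the tiny collection are hypothetical.

```python
import math

# Sketch of a probabilistic term weight. The formula is the
# Robertson-Sparck Jones relevance weight, used here as an assumption;
# the source text does not say which formula Lesk had in mind.
def relevance_weight(term, relevant_docs, all_docs):
    """Weight a term by how much more often it occurs in relevant documents.

    Each document is a set of words; relevant_docs must be a subset of all_docs.
    """
    N = len(all_docs)                                # documents in the collection
    R = len(relevant_docs)                           # documents judged relevant
    n = sum(1 for doc in all_docs if term in doc)    # documents containing the term
    r = sum(1 for doc in relevant_docs if term in doc)
    # The 0.5 terms keep the ratio finite when a count is zero.
    numerator = (r + 0.5) * (N - n - R + r + 0.5)
    denominator = (R - r + 0.5) * (n - r + 0.5)
    return math.log(numerator / denominator)


# Hypothetical four-document collection; the first two are judged relevant.
docs = [
    {"probabilistic", "retrieval", "library"},
    {"retrieval", "weights", "library"},
    {"typesetting", "library"},
    {"office", "automation", "library"},
]
relevant = docs[:2]
for term in ("retrieval", "library", "office"):
    print(term, round(relevance_weight(term, relevant, docs), 3))
```

A word found mostly in relevant documents gets a positive weight, a word spread evenly over the collection gets a weight near zero, and a word found mostly in non-relevant documents gets a negative weight.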
The fourth stage of information retrieval is called the maturity stage, the period
of the 1980s. Sugimoto and McCain
(2010:2) say Lesk called the 1980s ‘a decade in which online information
became common’. Markey also noted the
transition which occurred in the 1980s, stating:
‘Prior to 1980 the few available online information retrieval (IR) systems were
expensive and complicated to search, so end users either delegated their
searches to trained intermediaries or searched card catalogs and print indexes
on their own. In the mid 1980s, online
IR systems suppliers introduced simple front-end interfaces to IR systems and marketed
search services to end users but [end users] never used these services because of their
high cost’
This sentiment of moving from intermediaries to novices was also described by
Savage-Knepshield and Belkin, who noted a change in design to facilitate these users.
With respect to other developments, Lesk describes the rise and fall of artificial
intelligence within this decade; the enthusiasm for expert systems and knowledge
representation languages in the early 1980s; the exponential increase in the
number of databases available online; progress in OPAC development; and
widespread use of the CD-ROM, especially by the end of the decade.
The fifth stage of information retrieval according to Lesk (1996) is called the
midlife crisis, which is the period of the 1990s. He notes the ‘technological revolution’ that
is the Internet and its likely impact on information retrieval, predicting that
the Internet would become ubiquitous within the next decade (2000-2009). Lesk
describes the development of scanning, online publishing, computer networking and
the rise of digital libraries in this decade.
The Internet put information retrieval to the test. He concludes by noting information retrieval
evaluation, crediting TREC with the creation of “realistic collections for
evaluation”, in comparison with the small test collections of the 1980s. Other authors have noted TREC’s effect on
both the state of the art of information retrieval in general and that of IR
experimentation in particular.
Markey (2007:1071) says the impact of the Internet on IR was reinforced in many historical writings,
adding that “end user searching truly came into its own with the development of web
search engines in the mid 1990s”.
The sixth stage of information retrieval is called fulfillment, the period of the
2000s. Lesk, writing speculatively about
the future, called the 2000s a time of ‘fulfillment’ for information retrieval,
prophesying that 2000-2009 would be a time when most ordinary questions would
be answered by reference to online materials rather than paper materials, when
books were routinely offered online, and when libraries would actively work
toward a retrospective conversion of their collections. He predicted problems to do with work toward
better retrieval technologies for images, sounds and video; better procedures
for handling online publishing; and a need for new cooperative techniques.
The seventh stage of information retrieval is called retirement, which covers the years
starting in 2010. The basic job of this
period is the conversion to machine-readable form of all the remaining
manuscripts. Lesk says another
serious problem is that of copyright law.
There will be a great problem in dealing with copyright issues when it
comes to digitization of paper records.
The author has discussed the development or history of information retrieval as
described by Michael Lesk, starting with childhood (1945-1955), followed by
the schoolboy (1960s), adulthood (1970s), maturity
(1980s) and the midlife crisis (1990s), and finally fulfillment (2000s)
and retirement (2010 onwards).
1. K. Markey, Twenty-five years of end-user searching, part 1: research findings, Journal of the American
Society for Information Science and Technology, 58 (8) (2007) 1071-1081.
2. M. Lesk, The seven ages of information retrieval, UDT Occasional Paper #5, IFLA
(1995), accessed 26/02/15.
3. I. Ruthven, Interactive information retrieval.
In B. Cronin (ed.), Annual Review of Information Science and Technology,
42 (1) (2008) 43-91.
4. C.R. Sugimoto, K.W. McCain, Visualizing changes over time:
A history of information retrieval through the lens of descriptor
tri-occurrence mapping, Journal of Information Science XX(X) (2010) 1-15.
Etiwel Mutero is an archivist by profession.