3: Major Issues




User Needs

Until the middle of the 20th century the information requirements of professionals, non-professionals, and the general public were largely unknown. Librarians assumed (without reliable evidence) that the information needs of most of their patrons, particularly scholars and academics, were met. Until the development of the user studies field in the late 1940s, building large library collections was thought to be the ultimate goal.

The wealth of knowledge gained from user studies over the past 40 years relates, in the main, to information requirements for hard copy. The explosive growth in electronic information systems during the past few years has taken place without any concomitant systematic feedback and evaluation. Technological solutions to the electronic communication of information are now running ahead of what is known about information requirements.

Many services have been developed by small groups of enthusiasts, who may know little about bibliographical control or the evaluation of services. Poor quality, junk, and irrelevant information have been more difficult to identify and control in electronic form than in traditional hard copy. Developers and providers of electronic information services are concerned about, but have done little to address, problems of information overload, quality control, image, coverage, and many other aspects that were of concern to the producers of hard copy indexing and abstracting services.

User studies have quantified the information requirements of one or more of the following seven categories of users: information specialists (including librarians); academics; business and industry; government departments; the public sector (e.g. health, transport, utilities); the general public; and international organisations.

The methods used most frequently to assess information requirements are questionnaires, interviews (structured and unstructured), check-lists, and the recording and analysis of queries to information services. Occasionally bibliometric analyses have been used to record and analyse citation patterns.

These methods are relevant and appropriate to the investigation of information requirements of users of electronic services. In addition, the usage of electronic services can relatively easily be logged, and the results analysed. A more detailed study can be carried out of users of electronic services than users of traditional card catalogues, bibliographical tools, and hard copy literature. For example, an analysis of the borrowing patterns of users from libraries provides no insight into the detailed use (or sometimes non use) of items borrowed. In the electronic media, it is relatively easy to assess the usage of individual sections of documents although the cooperation of users is still required for detailed and specific feedback about usage patterns.
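The kind of section-level usage analysis described above can be illustrated with a minimal sketch. The log format, user identifiers, document identifiers and section names below are all hypothetical; a real service would record whatever its own software makes available.

```python
from collections import Counter

# Hypothetical access-log records: (user_id, document_id, section).
ACCESS_LOG = [
    ("u1", "doc-17", "abstract"),
    ("u1", "doc-17", "methods"),
    ("u2", "doc-17", "abstract"),
    ("u3", "doc-42", "abstract"),
    ("u2", "doc-17", "abstract"),
]


def section_usage(log):
    """Count how many times each (document, section) pair was accessed."""
    return Counter((doc, section) for _user, doc, section in log)


def distinct_users(log):
    """Count the distinct users who accessed each document."""
    users = {}
    for user, doc, _section in log:
        users.setdefault(doc, set()).add(user)
    return {doc: len(seen) for doc, seen in users.items()}


usage = section_usage(ACCESS_LOG)
readers = distinct_users(ACCESS_LOG)
```

Counts of this kind show which sections were opened, but not why or how usefully, which is why the cooperation of users remains necessary.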

There are considerable differences between the information requirements and usage patterns of the seven categories of users enumerated above. A generalised picture of user groups is based upon the thousands of user studies carried out over the past 40 years, dealing mainly with the use of, and need for, traditional hard copy primary, secondary and tertiary literatures.

Information Specialists

Information specialists (including librarians) in theory need access to the world’s corpus of published literature. Bibliographic references are an obvious requirement, but in order to satisfy client requirements, easy access is required to abstracts and often full text. In the age of hard copy alone, ideal access was never achieved: that is, access to the bibliographical records of the entire stock of the world’s literature, and access to hard copy full text. There is no information system in the world that covers all published material, from all countries, in all languages, from all times. However, many services, for example the Library of Congress and the British Library Lending Division (formerly the UK National Lending Library), provide coverage of the world’s literature that is more than adequate. In the electronic era, access to electronic bibliographical records, abstracts, and full text is relatively easy (in principle) if the material already exists in electronic form.

However, it is unlikely that all the past literature (even in science and technology) will ever be available electronically. It would not be cost effective, considering that requirements for materials older than about 20 years in most areas of science and technology are minimal. There are a few exceptions (e.g. oceanography, botany, zoology) where scientists do make fairly extensive use of literature from 50 or 100 years ago. However, when more details are known about the information requirements of specific groups, a more directed program of digitisation of material can be pursued.

Information specialists also have a need for access to unpublished materials (often referred to as the grey literature). Electronic information services are facing just as many problems in accessing unpublished literature as traditional information services. Indeed, one of the great unresolved problems of electronic communication is the control, categorisation, indexing and retrieval of unpublished literature. Rather than attempt to provide unlimited access to every type of material (which is likely anyway to be impossible in the immediate future), assessments of the information requirements for unpublished material should be carried out in all fields, and across all categories of users.

Academics

The information requirements of academic users differ considerably from one discipline to another. Scientists and technologists typically require up-to-date information and make little use of information that is more than 10 years old. For example, in the physical sciences the so-called half life of the value of literature may be only four or five years. However, there are some exceptions even in science and technology, noted above. The overwhelming finding from user studies is that science and technology will continue to progress, for the most part, without access to information more than (say) 10 years old.
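The notion of a literature half life can be made concrete with a small worked example. The sketch below assumes one common bibliometric reading, in which the half life is the median age of the references cited in a given year; the function name and the sample publication years are hypothetical and are not drawn from any study cited here.

```python
from statistics import median


def citation_half_life(citing_year, cited_years):
    """Median age, in years, of the references cited in a given year."""
    ages = [citing_year - year for year in cited_years]
    return median(ages)


# Hypothetical publication years of references cited by papers from 1996.
cited = [1995, 1994, 1994, 1993, 1992, 1991, 1990, 1985, 1970]
half_life = citation_half_life(1996, cited)
```

In this invented sample, half of the cited items are four years old or younger, even though one reference dates back 26 years.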

Assuming that most new information and publications in science and technology are now in digital form, it may only be a few years before the vast majority of scientists and technologists can make do with electronic communications only; albeit there will always be a need for them to print hard copy. Many information users in science and technology, as well as in applied areas and medicine, require exhaustive information retrieval services. For example, review articles have always been (and perhaps even more so now than ever) an extremely important aspect in the accumulation of knowledge.

In summary, scientists and technologists require speedy access, mainly to current and recently published material, often with exhaustive coverage of specialised and narrow topics. In addition, as the use of ‘invisible colleges’ and networks of communications has demonstrated, access to unpublished writings and data is also regarded as important.

The use of electronic communications provides ample opportunities for scientists and technologists to interact with each other and to overcome some of the problems (distance, the transport of hard copy material, and the need to attend conferences and meetings) that limited networking and informal communications in the past.

Social science academic users have quite different information requirements. For example, some of the most frequently cited social scientists published 100 or more years ago. Social scientists are much more likely to access the literature of their fields at random and less systematically than scientists and technologists. They do require reasonably good access to past literature, but coverage does not need to be as exhaustive as is the case in science.

From about 1910 onwards nearly all disciplines and subjects in science and technology, the social sciences, and the arts and humanities developed indexing and abstracting services (and other secondary services), followed a decade later by tertiary services (i.e. guides to secondary services). In the electronic era, providers of information services have been quick to capitalise upon the new possibilities for secondary and tertiary services, although they are rarely referred to by these names. For example, the so-called ‘gateway information services’ have been developed for many disciplines and specialist groups. The gestation period (and, indeed, at the moment, the cost) of electronic information services for specialist groups is small. However, electronic specialised services have not been subjected to evaluation, as has been the case for the past 40 years with hard copy secondary and tertiary services.

Business and Industry

The information needs of users in business and industry are quite different, although no easier to satisfy than those of users in the academic community. The vast majority of users in business and industry require (and prefer, but do not always obtain) information, data and knowledge rather than bibliographical references, abstracts or full texts of documents. They often require almost immediate access. Business and industry have not been particularly well served with access to hard copy full text, although in theory access has been possible. For example, a business or an industry that may be adjacent to a large public or university library often has not developed the means of access to local information services and material, either directly through an agreement between the business and the information supplier, or indirectly through inter-library loan services. In the future, electronic access will provide, at least in theory, equal access for both business and industry on the one hand, and academia on the other.

Government

The information requirements of users in government departments are often similar to the information requirements of academics, although generally more specialised. Government departments have often developed their own library collections, as well as specialised information services.

Public Sector

The information requirements of users in the public sector are diverse and in some ways similar to the information requirements of users in business and industry. Public sector employees in health, the utilities and transport have specific information requirements. Often they have not been well provided for within their own organisations. For example, although many large hospitals maintain libraries, the vast majority of healthcare professionals do not have the time, motivation or indeed skills to access the information services that already exist. The use of electronic services, together with point of use education and instruction, is likely to improve the information seeking and retrieval skills of users in the public sector. However, since their requirements can be very specific, the evaluation of existing (and future) information services is important, to ensure that the services continue to be developed in the right direction and fine tuned.

General Public

The information requirements of the general public are diverse and often unpredictable. Users of public libraries may demonstrate any of the characteristics of academic users, or users in business or industry or government departments. The majority of the general public does not use public libraries, although they still have information needs, or at least problems and enquiries that require access to information services. Many of the so-called information needs of the general public are not seen as information needs by the general public. The general public requires a seamless service, although the means to produce the information, advice, and advocacy they require may be complex.

Information services for the general public have proliferated during the last 10 years and this trend is seen particularly in areas of consumerism, travel, and the provision of healthcare. Information services at this level often leave a great deal to be desired, and the need to continue to evaluate the services is of paramount importance. There are still many unsolved problems, and the electronic era is likely to generate more.

There is a need for information specialists (including national providers of information services) to have access to the world’s published literature. It is unreasonable to expect national information providers to be able to access, in any degree of completeness, the world’s unpublished literature. For the foreseeable future, any national information service must provide for access to both electronic and traditional hard copy material, because of the impossibility and impracticality of converting the heritage of hard copy literature into electronic form.

Neither is it reasonable for a single organisation at a national level to provide complete access even to a single nation’s resources. The thrust of contemporary thinking is to provide an infrastructure and a methodology and encourage content holders to contribute. This distributed model naturally arises by combining the strengths of library and Internet information management strategies.

Governments in North America and Europe, in particular, have funded experimental and prototype information services, at all levels. Most of this work is of recent origin, and firm conclusions and evaluated results and best practices are not yet available. However, it is imperative for smaller countries, without similar investments in experimental and prototype electronic information services, to monitor closely developments elsewhere in the world, and perhaps to telescope some of the rather lengthy experimental programs. Not all experimental services have been (or indeed will be) evaluated. However, smaller countries can make a contribution to the overall world development of new electronic information services by carrying out evaluations (with their own populations and information requirements in mind) of electronic information services developed elsewhere, which (mainly for reasons of cost) cannot be duplicated (at least in their entirety) in smaller countries.

The electronic era has spawned an explosive growth in the development of specialised information services and products for every conceivable group and type of user. However, most of these new developments have not been evaluated, or indeed costed. It would be premature to alight upon any one or relatively small group of new electronic information services and to champion them, at the expense of other services, which in the longer term may turn out to be more effective.

Electronic Primary Communications

Primary communications include journal articles, monographs, letters and correspondence to editors. Primary communications can be in electronic form or traditional hard copy, and can exist in both forms. Gateway services may or may not refer directly to primary sources. For example, if they do refer to primary sources, a gateway service will include details about monographs, research reports, and articles within journals.

Primary communications which have no concomitant hard copy equivalent, usually have the following characteristics:

Primary communications in electronic form only will continue to increase, no doubt at the expense of traditional hard copy primary communications. Since gateway services are likely to increasingly include a greater percentage of electronic primary communications, the characteristics of these communications are of interest and concern to researchers and practitioners, in all fields and disciplines. The increasing emphasis upon electronic primary communications has implications for the development of knowledge; the implications differ from one discipline to another.

A critical feature of scientific knowledge is the peer review process, which ensures, by and large, that topics and problems in each major scientific discipline gradually develop a high degree of consensus, and form a body of knowledge which subsequent generations can build upon. The peer review system ensures that fraudulent findings are not reported (or if they are, are soon noticed in the scientific community) and that poor methods and unverified theories and practices are identified and eventually discarded from the body of knowledge.

Electronic communications that have not undergone the peer review process, or traditional publication processes, can of course be subject to severe critical comment by an electronic network of scientific peers. However, because electronic networks (and the critical comments they contain) may not endure and accumulate, there is always the threat of a return to 17th and 18th century scientific practices (e.g. charlatans, alchemy, phrenology).

The emphasis upon electronic primary communications has different implications for the development of knowledge in science, the social sciences, the arts and humanities, and other disciplines. The implications are not explored in any detail here, but could be the subject of the evaluation of a number of prototype electronic information networks, in which effectiveness and quality are compared across, for example, science, technology, social science, arts and humanities.

Studies of the information-seeking behaviour of social science researchers and practitioners, and the ways in which information and data accumulate (and in many cases do not accumulate), demonstrate that information (and by implication information networks and services) operates in very different ways from one discipline to another, and often from one sub-discipline to another. Designers and operators of traditional bibliographical services (including librarians) and electronic services usually ignore these differences between disciplines. Gateway services can relatively easily capitalise upon the very different information requirements of sub-disciplines even within a major field. In the past, for example, it would have been very difficult for the major psychology secondary service (i.e. Psychological Abstracts) to provide custom made services for (say) 100 or more different sub-groups and specialties within psychology. However, electronic services are able to do this, very quickly, and with relatively little expense, compared with traditional hard copy services.

The proliferation of services can be welcomed, albeit with caution. Many new possibilities are now open to design and operate information services more clearly geared to user requirements; but at the same time many opportunities are now open for inadequate and poorly controlled services, as well as easy entry of information that is fraudulent, inaccurate, poorly researched or misleading.

Cataloguing, Metadata and Indexing

The effectiveness of information retrieval systems is often a direct reflection of the effectiveness of cataloguing and indexing information. Solving the resource location problem for integrated media in a library context is in essence the same as solving it for the Internet. Internet resource location projects (often run in conjunction with libraries) are aware of this and are building it into their solutions. An agreed essential step in the location problem is better cataloguing techniques using metadata, but it is clear that this in itself is only a very short-term solution. Current cataloguing projects are working within metadata standards that either replace the Web’s indexing robots or work alongside them, supplying descriptions of indices, including pointers to other similar indices.

The OCLC InterCat project attempts to catalogue electronic resources manually, using MARC field 856 to hold the resource URL. Given the rapid rate at which material is being added to the web, it is already impossible for human cataloguers to keep up. It will only ever be practical for a small core of very important resources to be catalogued in this way. Existing projects such as InterCat and the WWW Experimental catalogue (North Carolina State University), which contain hypertext linked URLs, provide very small example databases which appear to allow limited search capabilities. The UK Catriona project assumes a distributed model for such cataloguing projects, unlike InterCat, which uses a centralised model.
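As an illustration of what recording a URL in field 856 involves, the sketch below builds a simplified 856 field as a plain Python dictionary and renders it in the conventional one-line display form. It is not InterCat’s software. Subfield $u (the resource address) and subfield $z (a public note) follow MARC usage; the indicator values shown follow current MARC 21 practice for a resource reached over HTTP, and the URL, note and function names are invented for the example.

```python
def make_856(url, note=None):
    """Build a simplified MARC 856 field as a plain dictionary."""
    subfields = [("u", url)]  # $u holds the address of the resource
    if note:
        subfields.append(("z", note))  # $z holds a public note
    # Indicators: first "4" = HTTP access, second "0" = the resource itself.
    return {"tag": "856", "indicators": ("4", "0"), "subfields": subfields}


def render(field):
    """Render the field in the conventional one-line display form."""
    indicators = "".join(field["indicators"])
    subfields = " ".join(f"${code} {value}" for code, value in field["subfields"])
    return f"{field['tag']} {indicators} {subfields}"


# Hypothetical resource address and note.
field = make_856("http://www.example.org/report.html", "Full text online")
line = render(field)
```

Each such field has to be created and kept up to date by a cataloguer, which is why the approach scales only to a small core of resources.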

In qualitative terms, the net effect of these projects is to increase the indexing power by an order of magnitude, in the process achieving a neater and more manageable set of indices. But the continued exponential growth of the Web (both users and providers) means that the pressure on indexing and search tools will become greater at an increasing rate. Library resources are also increasing. Should we anticipate a move in a few years’ time to meta-metadata, and then meta-meta-metadata after that? The information retrieval community is acutely conscious of this prospect and the unmanageable complexity it implies. This is why the future of searching mechanisms is extensible cataloguing, describing information in a way that is sufficiently abstract that it will accommodate future technologies such as adaptive databases and distributed intelligence.

Development of automated tools and PURLs (Persistent Uniform Resource Locators) may make these cataloguing type projects somewhat more viable in the future by ensuring that electronic location information is stable. Current projects make no attempt to integrate print and electronic sources, although having consistent and compatible catalogue records available would obviously be a first step towards building an integrated catalogue.

An alternative approach to the issue is the development of the Dublin Core data element set for describing network-accessible ‘document like objects’ based on metadata. The intention of the Dublin Core is to determine standard descriptive data which can be included at the time of document creation and automatically harvested to form searchable databases. Under the Dublin Core, metadata can be presented according to any of a range of classification systems, each of which is itself identified. If work on the Dublin Core proceeds and full support is gathered, it will create good potential for the design of tools for the subject cross searching of multiple classification systems.
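The idea of descriptive data embedded at creation time and harvested automatically can be sketched briefly. The example assumes, purely for illustration, that the elements are embedded in an HTML page as meta tags whose names begin with "DC."; the class name, function name and sample page are hypothetical and do not represent any particular harvesting tool.

```python
from html.parser import HTMLParser


class DublinCoreHarvester(HTMLParser):
    """Collect <meta name="DC.xxx" content="..."> elements from a page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            element = name[3:].lower()
            content = attributes.get("content") or ""
            # An element may be repeated, so values are kept in a list.
            self.record.setdefault(element, []).append(content)


def harvest(html_text):
    """Return a mapping of element name to the list of values found."""
    parser = DublinCoreHarvester()
    parser.feed(html_text)
    return parser.record


# Hypothetical page with embedded descriptive metadata.
PAGE = """<html><head>
<meta name="DC.title" content="Major Issues">
<meta name="DC.subject" content="Information retrieval">
<meta name="DC.subject" content="Metadata">
<meta name="keywords" content="not harvested">
</head><body></body></html>"""

record = harvest(PAGE)
```

Records gathered in this way from many pages could then be loaded into a searchable database without any manual cataloguing step.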

In the short term, initiatives such as the Dublin Core are also likely to act as a quality mechanism, as only the producers of the better quality information are likely to be capable of applying the system. GILS, the Government Information Locator Service of the United States, provides another standard for metadata, upon which the Global Information Locator Service is based, and the Text Encoding Initiative provides another standard. The existence of two systems with the same acronym, GILS, is confusing. To avoid confusion in this report, the Government Information Locator Service is always referred to in full, with the GILS acronym being used for the Global Information Locator. Owen & Wiercx suggest that it will take years for standards on metadata to be resolved (1996, p. 53).

A massive stumbling block to effective retrieval across sources is subject control. The use of very different thesauri and of natural language makes for many retrieval difficulties, which projects such as the Unified Medical Language System are attempting to counter through the development of a meta-thesaurus translating tool. This is a priority for Australian attention, which should be carefully planned to complement rather than duplicate international efforts. There is abundant scope for projects emphasising materials in line with the development of the proposed international standards.
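The principle behind a meta-thesaurus translating tool can be shown with a toy example: terms from different vocabularies are linked to a shared concept, so that a search term from one thesaurus can be mapped to the preferred term of another. The concept identifiers, vocabulary names and terms below are invented for illustration and are not taken from the Unified Medical Language System.

```python
# Hypothetical meta-thesaurus: concept id -> preferred term in each vocabulary.
CONCEPTS = {
    "C001": {"thesaurus_a": "Myocardial Infarction", "thesaurus_b": "Heart attack"},
    "C002": {"thesaurus_a": "Neoplasms", "thesaurus_b": "Cancer"},
}


def build_index(concepts):
    """Map every known term (lower-cased) to its concept identifier."""
    index = {}
    for concept_id, terms in concepts.items():
        for term in terms.values():
            index[term.lower()] = concept_id
    return index


def translate(term, target, concepts=CONCEPTS):
    """Return the target vocabulary's term for the same concept, or None."""
    concept_id = build_index(concepts).get(term.lower())
    if concept_id is None:
        return None
    return concepts[concept_id].get(target)


translated = translate("heart attack", "thesaurus_a")
```

A user searching with everyday language could thus be routed to records indexed under a controlled term, provided someone has first established the concept links.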

Issues of Quality

In the rush to provide access to electronic resources, the issue of quality is often overlooked, yet it is the major stumbling block in the development of systems to integrate the retrieval of print and electronic resources. Closely linked to the issue of quality is the issue of quantity. Vast amounts of networked information are essentially irrelevant for serious research purposes, consisting of vanity publishing and infotainment. This extraneous material both obscures quality resources and, through its sheer volume, complicates the creation of integrated systems, since any proposed system must include some sort of filtering mechanism.

Such has been the Australian experience with the project to trial the Harvest indexing system, which is intended to create an index of Australian government information. Dack (1996) reports that the listing is still incomplete because insufficient disk space has been available to complete the project. Although it could be assumed that most government information is of good quality, the automated indexing process gathers any and all information, however minor or peripheral, and so contributes to information overload. The vast indexes created by automated processes deliver high recall but low precision, creating an unacceptable level of user frustration and information overload.
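The trade-off behind this complaint can be expressed with the two standard retrieval measures: recall (the share of relevant items that are retrieved) and precision (the share of retrieved items that are relevant). The sketch below uses invented document identifiers and counts; it is not based on figures from the Harvest trial.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 relevant documents, 100 documents returned.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {f"d{i}" for i in range(1, 101)}
precision, recall = precision_recall(retrieved, relevant)
```

In this invented case every relevant document is found, but the user must sift through 96 irrelevant items to reach them.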

In terms of electronic information, specifically web-based information services, it is vital that services provide the most direct route possible to resources, avoiding web pages that have no original content and circular links that lead nowhere. Information literacy issues must also be addressed in quality systems. Systems are needed that can signpost to users the processes that are underway, with the option for users to remain unaware of them if they prefer.

The criteria used to evaluate traditional hard copy services include:

Tillman (1996) suggests that in evaluating electronic resources the following should be clearly identifiable: the organisation responsible, who hosts the information and why, the authority of the moderators, the completeness of the archive, who selects resources and how, and when pages were last updated.

Electronic services often maintain that they provide a quality service, or access to quality information. However, there are few examples where services define quality, or cite the results of studies of quality. Informal discussion with a number of gateway services highlights the problem.

Many services include or omit material according to their own preconceived ideas about usefulness, relevance, or ‘quality’. Gateway operators have little idea about the nature and extent of the users of their services, and receive no systematic user feedback about relevance, timeliness, coverage, usefulness, and so forth.