4: Analysis of Projects




In considering the integration of print and electronic information it is useful to identify the broad categories of contemporary information sources: library information, commercial services (electronic and CD-ROM databases) and Internet/networked sources. Under these categories it must be recognised that many existing commercial services are developing services on the Internet.

Integration of Print and Electronic Information

Very few resource discovery projects exist which effectively span the electronic and print information worlds. Lund Electronic Library, World1, and e-MATH are some of the few available examples. Attempts to develop integrated systems for print and electronic information take a number of different forms and approaches.

Electronic/Digital/Virtual Libraries

One option for solving resource discovery problems is the creation of electronic/digital/virtual libraries. All three terms refer to library systems delivered electronically. In practice there are minimal differences between the terms, and the use of any particular term is a matter of individual or organisational preference. The term ‘electronic library’ appears to be favoured within Europe, where such projects are based around an existing library service. Digital libraries are most prevalent in the United States, integrating a number of resources around a core of digitised information. These projects tend to be large scale, well funded and holistic in scope. Virtual libraries usually refer to systems which are primarily composed of a number of web-based links, organised around no specific physical library collection.

The NSF/ARPA/NASA digital libraries project involves a number of projects which are looking at digitised collections of different information media; for example, the Alexandria project focuses on spatially indexed materials such as maps, while the Informedia project is establishing a video digital library. These large scale digital library projects produce a range of smaller outcomes in the form of new retrieval and indexing tools. Digitisation has a role to play in the preservation of print heritage; however, no similar digital library projects have been identified in Australia, probably because of the prohibitive costs involved.

Digitisation projects will have limited impact on the integration of print and electronic information due to the difficulties and costs of conversion and storage of digitised information. For example, Lim (1996, p. 36) cites Crawford and Gorman (1995, p. 92) who suggest that the space required to store the compressed files of a complete run of PC Magazine is 3 terabytes, which is apparently more than the total storage space of the Online Computer Library Center (OCLC).

Multimedia information, specifically sound, video and graphics files, all require vast amounts of storage space in a digital system. Associated problems are issues of scanning standards and reader software which are continuously evolving, making particular digitised formats unreadable. Digital translations of print materials suffer also from the loss of the contextual aspects of the physical item (Nurnberg 1995).

The longest established and most comprehensive electronic library is that at Lund University. The Lund University Electronic Library links a traditional library catalogue with electronic databases and Internet resources. Lund links four servers: Gopher, WWW, WAIS and ftp, although the Gopher service is currently being converted to a web-based server. A variety of search methods are provided and the resource discovery capabilities of the system are immense, including WAIS searching of multiple databases via a web interface. Continual research is an integral feature of the Lund system and it is linked to a number of related projects including DESIRE, and Development of Information Services for the Academic and Research Community in Europe.

No formal evaluations of the Lund Electronic Library have been identified; however, it is relatively simple to identify the strengths and weaknesses of the system. The great strength of the Lund Electronic Library is the drawing together of the three major information sources (library, commercial and networked information resources) via a single interface. This ensures that researchers can clearly identify the range of available information, even if they cannot search this information in an integrated fashion. In addition, the Lund Electronic Library was identified during the course of the IDA project as providing the most comprehensive and informative web pages on ‘browsing and searching Internet resources’.

The major weakness of the Lund Electronic Library is that it makes no provision for integrated searching to be performed. Any individual search needs to be carried out, at minimum, once for each information source and no mechanism is available which allows a search strategy to be saved and re-executed across the different information sources.

The largest and most established Australian electronic library is ELISA, the Electronic Library and Information Service at ANU. ELISA provides links to ANU library services, including an electronic reserve, OPAC and document delivery options of online interlibrary loan requests and online document supply information. Registered users of the ANU library service can also access databases through ELISA. Among the options ELISA offers is a gateway to Internet services, including electronic texts and journals, Internet indexes, OPACs and links to all ANU servers.

Another useful Australian model is the Queensland Department of Education Virtual Library, established to provide services to geographically isolated department staff. This library provides a useful model, combining electronic resources with a solid customer service ethos. According to its creators it is an example ‘... that people with a clear idea of what they want to achieve can overcome budgetary and technical constraints and find a way’ (http://cooroomba.client.uq.edu.au/VLSERVER.HTML).

The Australian working group, Electronic Libraries Forum, is attempting to monitor developments in standards for linking library systems and, accordingly, to provide advice to the Australian library community on inter-workings of library systems. However, this group meets infrequently and has no practical project on which to base its work. Its brief appears somewhat narrow, confined to a discussion and advisory role, and lacking the capacity to provide the practical assistance in transforming library systems that is urgently required for academic libraries.

Virtual libraries are very similar projects to subject indexes and subject gateways. The unfunded World Wide Web Virtual Library (WWWVL) is an extremely useful resource, which provides people with mediated subject links to electronic resources. Recently funding has been secured by Charles Sturt University, under the Improved Information Infrastructure Network Information Support funds, to continue its work on the education virtual library section of the WWWVL.

The strengths of virtual libraries are threefold. Firstly, they are perfect for linking distributed resources; secondly they encourage the creation of indexes and directories of electronic resources; and thirdly, given that there is no need for a physical library building or collection as such, provide a useful cost effective approach. It must be noted here, however, that in general the increase in electronic sources and services does not necessarily result in cost reductions in comparison with equivalent print services. Electronic services require a heavy investment in user services, such as training and staff mediated searches (Lim 1996, p. 35). This need for increased staff-provided services was a significant finding of the ANU review of the university library (ANU 1996).

Obviously such projects in Australia are, on the whole, at an early stage and there is a clear need for a better coordinated approach. The majority of work that is being carried out is performed by individual institutions as merely a part of their other functions. Progress in this area would be greatly enhanced by the creation of a specific dedicated project with adequate staff funding.

To be successful such libraries must be based around a solid core of original content. Tertiary institution libraries are well placed to index, organise and make accessible the significant information output of academia and should be in the forefront of such endeavours, leading by example. As appropriate, many libraries may seize the role of electronic publisher for their institutions.

Gateway Services

An information gateway brings together links to useful and relevant sources in order to aid navigation and to make some sense of the apparent chaos of large networks covering many disciplines. Gateway services provide academic researchers and practitioners with easy subject based access to networked resources worldwide (e.g. discussion lists, data archives, journal titles, journal articles, monographs, databases, grant awards and resulting publications, software programs, and book reviews).

An example of an information gateway, supported by government and national initiatives, is the Social Science Information Gateway (SOSIG), supported by the United Kingdom Government initiatives: Electronic Libraries Program (eLib) and the Economic and Social Research Council (ESRC). There are many other gateway services in Canada, the United States, the United Kingdom, and other European countries. Gateway services provide access to resources under main subject headings (e.g. SOSIG includes headings from anthropology to statistics), as well as many other services of general interest to social scientists.

Although subject gateway services tend to be concentrated in Europe, there are many other gateway services in Canada and the United States. United States approaches have developed from the perspective of professional organisations and have resulted in gateway services such as e-MATH and that of the American Chemical Society. It is these types of gateway services that make significant attempts to integrate print and electronic sources.

The e-MATH web pages coordinated by the American Mathematical Society (AMS) have been developed through an interdisciplinary approach, involving mathematicians, publishers, librarians and computer specialists (Tillman 1996). e-MATH provides access to mathematics on the web, AMS publications, science policy (USA), education, professional information and services, meetings and conferences and the home page of the AMS secretariat.

The major strengths of e-MATH are the comprehensive maths coverage of the server, the original content carried on the server, the offering of a number of access versions to cater for differing client software, and the obvious thought that has gone into the integration of print information. Included in the service is the full text of 50 years of print publications. However, like the Lund Electronic Library, e-MATH has not been able to overcome the search obstacle. All identified information sources must be individually interrogated, whether print or electronic, with the end-user having to manually integrate the search results.

There are a number of problems relating to the new gateway services that need urgent attention. Problems include:

Other problems relate to the quality of indexing, searching mechanisms and navigational tools. Electronic gateway services are only as good, in the last resort, as the index terms supplied to communications by authors, or added (at a later stage) by the operators of gateway services.

An evaluation of gateway services is likely to demonstrate that many of the present claims about quality are unfounded. Indeed, many of the electronic gateway services may fall far short of existing traditional (and originally hard copy) information services, although for a number of reasons users may at the present time prefer (and no doubt continue to prefer) access to information electronically, rather than through traditional hard copy.

Many gateway services purport to offer quality services. For example, SOSIG maintain that ‘SOSIG project staff are committed to improving the accessibility of relevant, quality information and encouraging new networked information providers as well as providing training and training materials specifically tailored for subject specialists in both paper and on-line forms’. The question of quality and quality control in information services and provision needs to be viewed in the context of quality issues in traditional hard copy bibliographical and other information services, as nearly all major indexing and abstracting services evolved in a former age when only hard copy was available.

The DESIRE project highlights the need for the development of more information gateways and includes the intention of creating a European gateway, whether it be in the form of a subject gateway, subject guide or virtual library. All are very useful test-bed projects, as all involve work which must be performed independently for each country. That is, Australian materials must, at some inevitable point, be organised for easy access regardless of all else. The addition of keyword searching capabilities to subject arranged services illustrates the strength of the subject-gateway approach.

Few comprehensive Australian subject gateway/virtual library services exist. These services are urgently required to bring together academic electronic resources. The few existing Australian subject based services display potential, including the Aussie Index subject listings of government and education information and the Australian Internet Directory from Wide West Media, which includes subject arrangements under the following categories: government, computers, entertainment, arts, education, science, sport and technology. Sofcoms Australian Internet Directories offers a useful subject based index.

There is now an opportunity to evaluate gateway services, and influence their development. Gateway services can be designed, and redesigned, operated and made available at a fraction of the cost and time it takes to modify and change traditional hard copy information services.

There is therefore considerable motivation in evaluating the embryonic gateway services, with a view to making them particularly relevant to areas that are judged to be high priority for Australian research, particularly in science, medicine and technology.

The Web and Z39.50

The most important thing about the World Wide Web is its unique and continuing success. Any work in global information organisation and retrieval must be done in the context of what is now a ubiquitous standard. This presents problems in providing search facilities across all media, encompassing traditional information provision as well as vast and disparate Internet resources. Every substantial existing online information system is designed around a session oriented architecture, which is fundamentally different to the stateless nature of the web. Each paradigm has particular strengths, and modern search tools must accommodate both.

There are many metadata-related efforts underway within the information retrieval industry, and all of them are committed to either a Z39.50 architecture or at least Z39.50 interoperability. However, the nature of the two protocols means that they have clear areas of specialisation. Z39.50 is well suited to representing a set of information systems of arbitrary nature and number to client software of arbitrary type in a completely consistent way. The web is designed around hypertext links, and does not support the notion of data structured in a standard form.

According to Favaro and Hammer [http://130.228.5.168/webz.html] the web is an ideal vehicle for organisations that are ‘vertically integrated’; that is, organisations which are owners of content that they can present to the user in a structure of their own choosing. That is why many media and entertainment companies are showing a great interest in the web today. But when users must actively search the web for information across organisations, they encounter a sea of largely unstructured data.

From the users’ perspective, the differences between the two protocols are not great. Z39.50 servers can appear as web documents, and web servers can be addressed via Z39.50. Each can search the other’s information space. From an architectural viewpoint, however, the differences are crucial. Australia has the opportunity to participate with the leaders in global resource location efforts, laying a foundation that new retrieval technologies can be built upon.

The web is simple, stateless, cheap to implement and meant for providing graphical representation of data that has a structure known to the server implementing the protocol. Its great strength is its hyperlink navigation abilities for shifting focus from one resource to another, with no more overhead than to move from one point to another within a single resource, a capability related to its stateless nature. It is extensible only by major standard revisions conducted by the international W3 organisation, and extensions under current consideration involve adding some inherent state facilities.

Z39.50 is complex, has state, and is more difficult to implement than web protocols. The standard describes an abstract information system with a large set of facilities for searching and manipulating data of an arbitrary nature. Behind this standard system can be a database of any kind, a set of pointers to other databases, or even no database at all. It has a mechanism allowing extensions by means of its application profile facility. Four application profiles already exist, including WAIS. It is this extensibility that has led to Z39.50 being selected as the starting point for the G-7 group’s national level metadata projects, and which makes Z39.50 very much more than just a vehicle for standardising interfaces to library catalogues and other databases.

The web is stateless. This means that every transaction is treated individually, unrelated to any previous or subsequent requests. Every single new page access from a web browser might as well be from a different user as far as the protocol is concerned. This has advantages when dealing with hundreds of thousands or millions of individual short-duration requests for information per day, because the scaling is cheaper and more robust. The Z39.50 protocol is session-oriented; that is, clients (called ‘Origins’) and servers (called ‘Targets’) have a conversation that covers the whole time that the Origin is interested in the Target. This means that Targets must keep state information about what has happened in the conversation so far. This is more efficient, since it saves resending a lot of information and allows the Target to predict what will come next, but it makes for a more complex and less scalable server. There is no concept of a user interface within Z39.50, whereas a web transaction is largely composed of graphical descriptive elements.
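The contrast between the two models can be illustrated with a minimal sketch in Python. It is not an implementation of either protocol: the record list, the function `stateless_search` and the class `Session` are hypothetical names introduced only for illustration, and the `search` and `present` methods merely imitate the general idea of a Target holding a named result set for the Origin.

```python
from dataclasses import dataclass, field

# Hypothetical sample data standing in for any searchable collection.
RECORDS = ["maps of Ohio", "video library", "maps of Lund", "statistics"]


def stateless_search(query: str, start: int, count: int) -> list[str]:
    """Web style: each request carries everything it needs.

    Nothing is remembered between calls, so the search is repeated
    every time another slice of the results is wanted.
    """
    hits = [r for r in RECORDS if query in r]
    return hits[start:start + count]


@dataclass
class Session:
    """Session style: the server keeps result sets for the conversation."""

    result_sets: dict[str, list[str]] = field(default_factory=dict)

    def search(self, name: str, query: str) -> int:
        # Run the query once and remember the hits under a name.
        self.result_sets[name] = [r for r in RECORDS if query in r]
        return len(self.result_sets[name])

    def present(self, name: str, start: int, count: int) -> list[str]:
        # Later requests refer to the stored set instead of re-searching.
        return self.result_sets[name][start:start + count]
```

In the stateless version the client resends the query with every request; in the session version the client sends it once and then asks for records by position, at the cost of the server holding one `Session` object per active client.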

There are enough architectural similarities to make interoperability a fairly simple operation technically. Both protocols are based on the client-server model of computing, and neither mandates that a human must be at the client end of the transaction (although the web is much less general in this respect). In both, a server may act as a client to gather results and return them to the original client transparently (again this was not a fundamental design parameter of the web). The web is invariably layered on Internet protocols, and Z39.50 usually is.

In general, Australian academic libraries are currently represented on the Internet through library home pages which provide some information about library services and links to Internet resources. Australian university library web sites, which should play a crucial role in directing information seeking behaviour, are comparatively new, underdeveloped and primitive. Little work has been done on developing their own subject guides (annotated or otherwise) to useful Internet resources from an Australian perspective. Such projects are usually conducted by subject/liaison librarians of individual institutions, such as at the University of Adelaide and University of South Australia libraries. The latest round of funding under the National Priority (Reserve) Funds for Improved Information Infrastructure included funding for an Australian Information server in History and the Humanities, to be established by the University of Melbourne and James Cook University.

For most libraries, library catalogues are still only available via telnet, meaning users must use the search conventions and language of the local catalogue. Currently few Australian libraries provide a Z39.50 interface to their library catalogues; Monash University and LISWA are among the few examples. Z39.50 systems have the following advantages: they allow disparate information systems to appear the same and to be searched in the same way by the end user; they can be used to locate other Z39.50 servers, as they provide a session based interaction; and potentially they can be used with re-executed search statements (Hammer & Favaro 1996).
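The idea of one search statement being re-executed across disparate sources that all look the same to the client can be sketched as follows. This is a hedged illustration in Python only, not Z39.50 itself: the three sources, their sample records and the names `make_source`, `SOURCES` and `run_saved_search` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Every source, whatever lies behind it, is presented to the client
# through the same signature: a search term in, a list of hits out.
Source = Callable[[str], List[str]]


def make_source(records: List[str]) -> Source:
    """Wrap a list of records in the common search interface."""
    def search(term: str) -> List[str]:
        return [r for r in records if term.lower() in r.lower()]
    return search


# Hypothetical stand-ins for the three kinds of information source.
SOURCES: Dict[str, Source] = {
    "catalogue": make_source(["Mathematics of maps (print)",
                              "Video indexing (print)"]),
    "database": make_source(["Map retrieval methods (abstract)"]),
    "internet": make_source(["Maps gateway (web page)",
                             "Chemistry links (web page)"]),
}


def run_saved_search(term: str,
                     sources: Dict[str, Source] = SOURCES
                     ) -> List[Tuple[str, str]]:
    """Execute one saved search term against every source in turn."""
    merged: List[Tuple[str, str]] = []
    for name, search in sources.items():
        merged.extend((name, hit) for hit in search(term))
    return merged
```

Calling `run_saved_search("map")` returns one merged list labelled by source, which is the step the end user otherwise has to perform by hand when each system must be interrogated separately.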

The immediate future is likely to see a great upsurge in the adoption of Z39.50 servers owing to the greater search efficiencies offered (Weibel 1995, p. 636; Hammer & Favaro 1996).

Australian research groups already working on Z39.50 and web integration include:

Interfaces

It is necessary that the integration capabilities of interfaces are also considered; that is, the capability to integrate with the major delivery mechanisms of the three types of information: library, commercial databases and Internet resources. This is an area in which work is making steady progress. The Electronic Reference Library (ERL) software from Silver Platter has the capability to perform basic keyword, author or title searches across multiple ERL databases. Included in the ERL software is the capability to automatically read the catalogue journal holdings list directly into database records. A newly released Z39.50 client from the OVID company allows the online integration of library holdings with commercial database records.

Although there is plenty of scope for further work on interfaces, it appears that much work is being carried out by commercial interests. With commercial services such as online and CD-ROM databases, there is a tendency now for these services to be linked to library OPACs. Licensing requirements mean such services are available only to members of the host institution. Such services are not linked to either local catalogue records or full text electronic sources.

In Australia, World1 has a focus on interfaces, providing a PC based GUI which allows sophisticated searching, and a simpler web interface. World1 will carry a range of databases allowing the retrieval of print and electronic sources. It will host the Australian Bibliographic Network (ABN), the cooperative cataloguing database and other indexing/abstracting databases such as Ozline and Medline. Gateways will be provided to as yet unidentified overseas databases. World1 will also include Internet connections and links, in addition to maintaining a collection of Australian electronic publications. However, once again, searching World1 will involve the need to re-execute a single search statement across the range of information types.

Alternative Approaches

Union catalogues such as OhioLINK are a useful approach to information retrieval that either creates a central, extremely large database or links a number of related databases, such as academic library OPACs. OhioLINK provides an interesting example, with the great strength of the system being the document delivery function it performs. Registered users of member institutions are able to request items held by other member institutions or copies of journal articles and, through the courier system, receive in-stock items within 24 hours.

Although OhioLINK cannot be said to bring together print and electronic media in a genuinely integrated way, it does make very useful steps towards integration by linking the options of the library catalogue, online services and Internet access through a single telnet interface.

Through OhioLINK, members of the member institutions can search either the single union catalogue of 42 library OPACs or select any single library catalogue to search. Cross searching of any other number of catalogues does not appear to be supported. The OhioLINK interface also provides access to a selection of Internet resources.

If in the future the serial holdings of member institutions can be automatically fed into the relevant online database records, as well as providing dynamic links from database records to full text electronic information, OhioLINK will be offering a system that is well on the way to providing a best practice example of the level of integration that is almost possible with existing tools.

Although systems like OhioLINK perform very well, they have little applicability to the Australian research environment, as the Australian Bibliographic Network (ABN), which will be delivered to a much broader audience of end users through World1, serves a very similar purpose, containing catalogue and holdings records of the majority of Australian libraries. On a regional basis similar cooperative and reciprocal systems also already exist, such as New South Wales UNILINC.

GILS—Global Information Locator Service

The Global Information Locator has a simple goal: to make it easier for people to find information. Information of all kinds, in all media, in all languages, for all time. Many pieces of the infrastructure for doing this are already in place. GILS is a way of describing information resources in a universal manner so that these existing pieces can work together while allowing for the radically different information society of the future.

The idea initially stemmed from a series of projects initiated by the G7 group of industrialised nations through the Global Information Society Pilot Projects. There are 11 of these themed projects. Theme six is Environment and Natural Resource Management, the only one coordinated by a United States team. The primary goal of the ENRM pilot project is to seek consensus on a Global Information Locator, using the United States Government Information Locator Service as a model. Progress has been very good, and it is expected that there will be formal endorsement in May for using Z39.50 as the base standard of the Global Information Locator.

The Global Information Locator initiative focuses on sharing locator information across organisations. Whether in libraries or shopping centres, the goal is to make it easier to find information. It is currently believed that the WWW consortium are shortly to announce moves towards a common GILS.

Locators—Linking Print and Electronic Resources

A locator is a kind of information resource. There are usually four categories of things it can contain:

These can be plain language references and instructions for humans, designed for completely automated reference, or anything in between. The Global Information Locator is not just a locator to networked information, it is a locator for all kinds of information on any media. It can also be expressed in all kinds of media.

The TV Guide is a locator for television programs. An atlas is a locator for places. The White Pages is a locator for people who have telephones. The Yellow Pages is a locator for businesses. Shops publish a locator of the products they sell and shoppers use this catalogue from home. Those that do not publish a catalogue have a locator for internal use—their product inventory. Libraries are the most common places to find a large range of locators, but many other organisations also keep a range of locators.

The important point is that locators are to help people find information through whatever means are appropriate for their particular community. Print media for publishing locator information include books, periodicals, newsletters and postal mail; common electronic media include bulletin board systems, the World Wide Web, CD-ROM or X.500. A locator can even be expressed personally by providing a telephone service (such as the Australian white pages directories service) or an information desk in a shopping centre.

Network mechanisms are needed to decentralise the information locators, but it must be possible to access locator information without being on a network.

Avoiding Vertical Integration

The Global Information Locator allows content owners to avoid vertical integration with distributors, whether by commercial design (such as competing encyclopaedias) or by default due to the nature of the medium (for example, the lack of standard interfaces has led to substantial vertical integration on the World Wide Web).

When vertically integrated, information is likely to be less widely accessible and it is harder for locators to reference other media. Why doesn’t your TV listing include your local theatre and high school sports events? Why don’t newspapers also provide references to related materials in library catalogues? Why doesn’t each element of a WWW virtual library list all its holdings around the world? In each case it is difficult to achieve because of vertical integration.

The Global Information Locator allows content owners to avoid vertical integration with distributors. It places no barriers between media and no constraints on how information is organised. A very natural arrangement is a web-like mesh, rather than the rigid tree arrangement seen in most information repositories (particularly the World Wide Web). With GILS, all kinds of information sources link to each other in many ways, all using a standard description method.
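The web-like mesh of locator entries described here can be pictured with a small Python sketch. It assumes nothing about the actual GILS record format: the class `LocatorRecord`, its fields, the sample entries and the helper `reachable` are hypothetical and serve only to show entries in different media pointing at one another through a shared description.

```python
from dataclasses import dataclass, field


@dataclass
class LocatorRecord:
    """One locator entry, described the same way whatever its medium."""
    title: str
    medium: str
    related: list = field(default_factory=list)  # keys of other records


# Hypothetical entries; links may cross media and need not form a tree.
RECORDS = {
    "tv-guide": LocatorRecord("TV listing", "print",
                              ["theatre", "school-sport"]),
    "theatre": LocatorRecord("Local theatre programme", "print",
                             ["tv-guide"]),
    "school-sport": LocatorRecord("High school sports events", "web",
                                  ["theatre"]),
    "catalogue": LocatorRecord("Library catalogue", "telnet", []),
}


def reachable(start: str) -> set:
    """Follow related links from one record and collect every key found."""
    seen, stack = set(), [start]
    while stack:
        key = stack.pop()
        if key in seen:
            continue  # the mesh contains cycles, so skip visited records
        seen.add(key)
        stack.extend(RECORDS[key].related)
    return seen
```

Starting from the television listing, `reachable("tv-guide")` reaches the theatre programme and the school sports entry, while the unlinked catalogue entry shows what a vertically integrated, isolated source looks like in this picture.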

Sometimes it does make sense to arrange information in a tree-like structure. The Global Information Locator supports that model too. For example, Canada and the United States have an organisational directory for Federal information sources. These tree structures coexist with other independent structures, perhaps organised by topic or by region, or in a way which makes more sense to children. This gives maximum flexibility.

Data is any collection of recordings such as numbers, characters and images. This can then be put into a context to become information. A collection of information yields patterns and over time becomes what we call knowledge. Information about information is thus also knowledge. It is knowledge that GILS is primarily concerned with locating. GILS can present the resources of the world in a consistent, known way, accessible by any current or future knowledge-seeking mechanism. What is implemented now as a simple search across information locators may, in the future, become an autonomous agent building up its discovery tools as it goes, but the underlying structure will be the same. In many fields of endeavour, management practices (whether involving computers or not) can enhance or completely replace humans in the acquisition of data, processing information and distilling knowledge. Progressing from this to wisdom, however, is a step which is completely human.

Navigation

A Global Information Locator works with information ranging from home to office, community group to corporation to libraries, museums, and archives. Soon it will be possible for it to include all communications media—even television and telephone. Following the example of libraries, architects of the several pieces of the global information society are appreciating the need for common search mechanisms across this vast range of information sources. This is particularly true for the Internet and World Wide Web development communities, where the rate of growth is most evident. Whilst unstructured Web navigation is easy, directed navigation is normally possible only within narrowly defined subsets of information resources.

A Global Information Locator allows anyone to gather and organise information in new ways. As new paradigms for organising are developed they can be used for accessing existing information without any changes required. This is exactly the opposite of what has happened in libraries over the years, firstly with a succession of mutually incompatible manual systems, then with electronic systems that did not work with the manual systems and even successive electronic systems that would not search across the previous system.

Navigation across these resources must be transparent, and can only be so if a common descriptive mechanism is employed. GILS is the only system proposed today that can embrace all major markup standards for all major media.

Java

The Java language has become one of the strongest growth areas in application development in a very short time. Key features of Java that make it an attractive language for software developers include:

Within its first year Java has already become available on every major platform, and the trend continues. Java is the first language with universal potential. Although it was not arranged deliberately, these features also happen to make Java ideal for applications that run over the Internet within the context of the World Wide Web. Routinely downloading and running program code from Web servers would normally be very risky because of the potential for accidental and deliberate damage to the local computer. However, Java’s security features make it possible to isolate everything downloaded while still providing useful functionality. One of the main reasons for the success of the Web has been that it works on nearly every kind of networked computer, and this portability is matched by Java, which already extends to nearly every platform that has a graphical Web browser. The simplicity of Java lends itself to rapid development of small, self-contained applets, a model which matches the rate of change of requirements on the Internet.

Australia is well served by companies and individuals specialising in Java. Local projects range from strategic compiler technology to short-term Internet utilities. Australia is also well represented at international Java forums. Given that Java is not suited to massive projects involving hundreds of people simultaneously, local expertise is able to match that of any other region in its Java development capacity.

Java and Z39.50

Java applications and Z39.50 sessions both hold state, whereas the web does not. Java is the universal application language for web browsers, and is now commonly used to add facilities that require state to web sites. Because Java is an extensible, object oriented language with access to network and graphical primitives, it can be used to implement a portable graphical interface to arbitrary network protocols. Work along these lines is already underway in other areas, including mail clients and strong encryption at the TCP/IP socket level. Z39.50 is a relatively complex bytestream protocol but the principles remain the same. The very nature of Java makes any such project easier; it represents the latest in the evolution of computer programming languages. It is clear, then, that it is quite possible to develop a Z39.50 origin (client) in Java. The size of the task is comparable to that of implementing the client side of any other complex protocol that uses TCP/IP as a network transport.
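The significance of state can be shown with a minimal sketch of a session in which a search creates a result set that later requests refer back to, as in the Z39.50 Init, Search and Present services. This is an in-memory toy, not the Z39.50 wire protocol; the class names, the substring matching and the sample records are assumptions made for illustration.

```python
class ToySession:
    """One stateful session with a stand-in target (hypothetical, in-memory)."""

    def __init__(self, records):
        self._records = records
        self._result_sets = {}      # state kept between requests
        self._initialised = False

    def init(self):
        """Open the session, standing in for the Init exchange."""
        self._initialised = True

    def search(self, term, result_set_name="default"):
        """Store matching records as a named result set and return the hit count."""
        if not self._initialised:
            raise RuntimeError("init must precede search")
        hits = [r for r in self._records if term.lower() in r.lower()]
        self._result_sets[result_set_name] = hits
        return len(hits)

    def present(self, start, count, result_set_name="default"):
        """Return `count` records from a stored result set; `start` is 1-based."""
        hits = self._result_sets[result_set_name]
        return hits[start - 1:start - 1 + count]


session = ToySession([
    "Library networking in Australia",
    "Map collections",
    "Digital library projects",
    "Public library services",
])
session.init()
hit_count = session.search("library")
first_two = session.present(1, 2)
third = session.present(3, 1)
```

The second and third requests succeed only because the session remembers the result set from the earlier search. A stateless web request would have to repeat the whole query each time.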

Local examples of such implementations include generic support for PC networking and a compressed, encrypted file system, both of which involve very complex client-server bytestream protocols. This shows that the scale of this kind of work is within the capabilities of a small and focussed team. In the case of Z39.50 much of the development work has already been done in that there is working code written in languages closely related to Java. This should make any implementation straightforward, and discussions with Australian companies specialising in Java confirm this; estimates range from one to two person-months of effort required to produce a professional working prototype.

This would include full documentation and a detailed implementation plan for developing the final product. The features of a Java implementation of a Z39.50 query engine would be as rich as those of any other and, in addition to the standard features, Java has unique capabilities that a Z39.50 client would use. These include native support for Internet protocols and formats (including HTTP, FTP, NNTP and MIME), dynamic extensibility (the Z39.50 client can download new classes automatically, adding to its functionality) and network cooperation (Java has built-in support for concurrency). The Mike Loukides article ‘Java: Not Just Another Pretty Face’ at http://www.ora.com/www/info/java/louk.html puts these strategic and long-term benefits into perspective.

Search Tools

A detailed evaluation of traditional Internet search tools such as Archie and Veronica can be found in Foster (1994), which is useful if somewhat outdated. A briefer, up-to-date overview is available in the Owen & Wiercx (1996) report on Networked Library Services. This report focuses on the three major categories of more contemporary Internet finding aids: robot types, subject catalogues and indexes to electronic publications. Much of the following analysis applies equally to all categories.

Robot type search tools crawl the Internet collecting various data about information which is then used to build massive indexes. Dozens of such search tools exist for the location of Internet electronic resources and a significant amount of work has been performed to evaluate the effectiveness of individual search tools.
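The indexing step of a robot type search tool can be sketched as an inverted index, which maps each word to the pages that contain it. The sketch below assumes the pages have already been collected; the URLs and page text are hypothetical, and a real robot would fetch pages over the network, follow links and store far more data about each page.

```python
import re
from collections import defaultdict

# Hypothetical pages, standing in for what a robot has already collected.
PAGES = {
    "http://example.org/a": "Australian government web sites",
    "http://example.org/b": "Index of Australian library catalogues",
    "http://example.org/c": "Government library services",
}


def tokenise(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenise(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = tokenise(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(PAGES)
both_words = search(index, "australian library")
one_word = search(index, "government")
```

Requiring every query word to match narrows the results, while a tool that matched on any single word would return many more pages at lower precision.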

Current concerns with robot-based search tools focus on their ability to keep up with the massive growth of the Internet; increasingly search tools are requesting submission of data to counter the delays in automated collection that rapid growth in web sites is causing. The effectiveness of integrated search tools is hampered by the diversity of subject/indexing terms and methods used in different databases.

New search tools appear rapidly. For example, Alta Vista, which searches across web and news resources, was merely one of the new search tools released during the IDA project. It is encouraging that new Australian search tools have also been released during the brief period of the IDA project. Unfortunately, none as yet offers an effective web page search service. Some, such as the CSU index of WWW servers, lack the facilities for an effective depth of indexing, while others, such as the Web Wombat, include only very basic search facilities which permit only very high recall searching.

The latest funding round of Network Information Support funds under the Improved Information Infrastructure program included funding for several Australian search tools and indexes. This includes the project based around Harvest, being conducted by ADFA and ANU, which aims to create an index of Australian government WWW sites (Dack 1996).

Subject catalogues perform very similar functions to subject gateways and virtual libraries, in providing browsable, subject arranged links to Internet resources. Despite debate over the methods, subject catalogues/gateways tend to be an effective mechanism for filtering and presenting quality information. The development of Australian subject catalogues is in its infancy, being carried out at an institutional level via the work of subject librarians. Although this approach is valuable in providing much locally relevant knowledge, the task of compiling subject catalogues for Australian use would greatly benefit from coordination to avoid duplication. Australian developments in this area are more fully discussed earlier.

Overall, specific indexes to electronic publications are few, and only one Australian example has been identified: the Australian Electronic Journals list maintained by the National Library of Australia. This probably reflects the experimental stage of electronic publishing.

Document Delivery

Many retrieval systems now perform a document delivery function through the ability to deliver full text. It is becoming clear that all electronic systems are moving towards having some capability in terms of electronic document delivery, whether it be an ordering process, a delivery of hard copy process, or transmission to the desktop with the ability to print to a local printer. When formulating a test-bed study, it is therefore important to acknowledge and include some element of document delivery.

A project that integrates document delivery into an existing system is World1, which will integrate Uncover with ABN and the Ozline databases. It intends to enable easy access to worldwide information from the user’s desktop. The National Library of Australia is exploring the possibility of incorporating existing document delivery projects into this system, such as ARIEL, a document transmission system developed in the USA, and the Australian JEDDS (Joint Electronic Document Delivery Software Project) initiative to integrate MIME capabilities with ARIEL. REDD (Regional Electronic Document Delivery) can already be integrated into web-based OPACs, allowing end users to submit document delivery requests directly.

Document delivery must be integrated into the search mechanism, so that search results for print information include a facility for ordering online. The importance of document delivery to those requiring information makes it a vital component of any Australian test-bed study. However, given the large range of document delivery options being pursued in Australia, it should be only a component of the test-bed study rather than a specific, independent project.

Possible Solutions and Architecture Issues

In simple terms there are two primary possible solutions to the integration of print and electronic information resources. Either electronic records can be carried in library OPACs or library OPAC records can be run into the web robots/indexes. Owen and Wiercx (1996, p. 9), in their recently released report Knowledge Models for Networked Library Services, alternatively suggest that there are two ways for library information to be linked into Internet information: either libraries provide access to networked information resources, adding value to these resources, or libraries provide global access to their own resources, including OPAC records and electronic documents of their host institution.

In general terms, solutions to the problem of integrating print and electronic sources are heading towards the development of sophisticated interfaces with a rich variety of user customisation options, for example, facilities that enable users to select, say, six to eight specific databases to cross search. These databases/information sources may be library catalogues, indexes or abstracts, or Internet servers. Customisation options must include options to select certain types of resources, for example OPACs or web pages, and to search only specific geographic domains.

A truly integrated system would result in massive databases, even if confined to material which meets strict quality guidelines, and the very large storage systems required to host them would be prohibitively expensive. The only practical solution is therefore the creation of distributed systems. These would be linked by the sophisticated interfaces mentioned above.

Search results would be presented in an integrated manner, with each record clearly indicating its information type and availability. Library catalogue records would indicate the closest available copy to the user, and commercial service records (i.e. from CD-ROM or online databases) would include holdings information or the electronic resource location. In such a system hypertext links would be available where possible. Search capabilities for a distributed system would also have to include some sort of tool for selecting relevant information sources, such as the dial-index function of Dialog.
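The combination of source selection, cross searching and integrated result presentation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source names, record contents and the `select_sources` and `cross_search` helpers are hypothetical, and the sources are held in memory, whereas a distributed system would query each one over the network.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    source: str
    kind: str          # information type, e.g. "opac", "web", "commercial"
    availability: str


# Hypothetical sources; each holds (title, availability) pairs.
SOURCES = {
    "Library catalogue A": {"kind": "opac", "records": [
        ("Networked library services", "Nearest copy: main campus"),
        ("Maps of Australia", "On loan"),
    ]},
    "Web index B": {"kind": "web", "records": [
        ("Networked information resources", "Online"),
    ]},
    "CD-ROM database C": {"kind": "commercial", "records": [
        ("Networked library services review", "Full text by order"),
    ]},
}


def select_sources(sources, kinds):
    """Keep only the sources whose information type the user selected."""
    return {name: s for name, s in sources.items() if s["kind"] in kinds}


def cross_search(sources, term):
    """Search every selected source and merge the hits into one sorted list."""
    results = []
    for name, src in sources.items():
        for title, availability in src["records"]:
            if term.lower() in title.lower():
                results.append(Result(title, name, src["kind"], availability))
    return sorted(results, key=lambda r: (r.title.lower(), r.source))


selected = select_sources(SOURCES, {"opac", "web"})
merged = cross_search(selected, "networked")
```

Each merged record carries its source, information type and availability, so print and electronic items can appear in one list without losing the details a user needs to obtain them.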

Owen & Wiercx (1996) have focused detailed attention on models that provide a development path towards networked library services. They propose three different models. The first is the networked library model, which utilises all current network technologies and resources to create the level of integrated system currently possible (pp. 82–99). This system comprises internal and external electronic resources, and is based around the provision of storage facilities, an integrated resource discovery system and a support system. The second is the cooperative network model (1996, pp. 100–105), which focuses on a networked, distributed system. The third is the knowledge environment model (1996, pp. 106–110), aimed at public library-type services.