The IDA project has been a challenging research project. The technology under investigation is rapidly changing and it was at times difficult to keep pace. As the following section illustrates, the research methodology was strongly based around the new electronic information technologies.
In the initial stages of the IDA project, the research team prepared a discussion paper on the parameters and scope of the project. Guidelines were developed to support more intensive searching for tools and resources and to inform critical analysis. The following tasks were also undertaken:
The results of these activities were varied, but the information and discussion indicated that there is no single clear answer, and that explorations of integrated document access must focus on manageability, comprehension and interpretation by users. In many ways, the development of programs and projects designed to aid information retrieval in a climate of information overload is very much in its infancy. Closely related issues, such as access to quality information and electronic publishing, help to provide partial solutions and therefore need to be considered. Specifically, it was clear that the project's recommendations and findings must be beneficial, relevant and easily transferable to the wider community.
During consultation with the IDA Advisory Committee, it was identified that there is limited consensus on definitions and terminology. Although well directed research usually commences from a position of agreed definitions, in the case of this project it is the understanding of the problem that is most important.
Traditionally, academics and researchers conducted a literature search through primary and secondary sources at their university and other specialist research libraries. This traditional searching included the use of commercial electronic sources (CD-ROM and online databases), which traditionally required, and via dial-up access still require, a trained intermediary to perform the search.
The growth of the Internet and the vast amount of useful research and teaching information carried on the net have created new dilemmas for searchers. Internet users have access to a huge new range of information sources as well as end-user access to previously restricted online databases. However, finding information on the Internet can be like finding a needle in a haystack, as existing search tools are inadequate and unable to keep up with the addition of information. Large quantities of information are, for academic and other purposes, completely extraneous and distracting.
In addition, countries outside of the United States can essentially be considered information poor and unable to locate the full text of much cited English language information located through Internet sources. Preliminary investigations by the project team have revealed that traditional definitions of the terminology related to this project are evolving and at the same time new jargon is developing.
Projects were identified through a literature search, Internet searching and recommendations of the project team and advisory group. The documentation received from Cliff Law (IDA Advisory Committee) provided useful coverage of the activities of major overseas national bodies. The majority of the relevant sites were added to the web site in January 1996. Electronic discussion lists were an ongoing and important source of information regarding current projects.
There are a number of diverse and creative projects in operation examining the issues of resource discovery and access and many organisations and individuals claim that their project provides the answer. In order to determine meaningful evaluative criteria, it was useful to categorise the projects according to type.
Unfortunately no existing typology or classification for this could be located, so the project team has developed its own, determined according to the approach taken to the problem.
These categories are not exclusive and the final divisions are somewhat arbitrary. Categories have been modified and adjusted as appropriate. Many of the projects examined in the IDA project are appropriate to be placed under two categories and it appears that, generally, these projects take one of the following forms:
Interfaces
Projects designed to simplify searching by providing an easy-to-use search interface, such as BabyOIL, which offers links to recommended search tools.
Navigation Aids
Similar to interfaces, these projects provide navigational assistance, such as the search strategy recommendations of Virtual Pathfinder which are based around the traditional notion of a pathfinder incorporating library resources with pointers to Internet resources.
Integrated Databases
Projects designed either to create massive databases or to provide integrated searching of a number of linked databases, such as OhioLink.
Cataloguing
Projects for which the primary emphasis is on cataloguing or classifying resources for improved retrieval, including database projects that serve as a cataloguing tool, such as World1/NDIS.
Electronic/Digital/Virtual Libraries
Projects that provide libraries delivered via computers, which may or may not be linked to a physical library collection. Virtual libraries tend to consist of web-based links, while digital libraries are more likely to integrate a range of digitised resources.
Gateway Services
These projects provide links across protocols; for example, SOSIG (Social Science Information Gateway), which provides telnet and gopher links to social science resources.
Electronic Publishing Projects
Projects that provide access to full text articles that may or may not be linked to abstracts/indexes (e.g. OCLC Electronic Journals online).
Search Tools and Indexes
The rapid expansion of the Internet has brought rapid and continual growth in finding tools and indexes. Dozens of such search aids exist, primarily in two forms: machine-based spider/crawler/gatherer type tools and people-mediated subject guide/directory type tools. Machine-based search engines can be layered over single servers, multiple servers or the entire WWW. Many of the people-mediated tools can also be identified as virtual libraries or subject gateways.
For the IDA project, search tools were selected for their usefulness in yielding research and teaching information. This was determined by firstly considering the type of information that the tools can reveal and secondly browsing the search engine links provided by academic institutions. Accordingly, tools were selected which lead to documents, journal articles, news and web pages. Search tools which are designed to locate people or software and guides to Internet use, although useful for academic purposes, have been excluded.
Major categories and definitions identified in the preparation stage of the IDA project were used in the design of a flat-file database to access and search information and assist in the analysis of document delivery systems. This database enabled searches of specific projects to assist in grouping, priority and relevance. The project evaluation form which formed the basis of the database assisted in the development of a glossary of definitions to assist in cross checking and validation by the research team. The following are examples of the criteria used for the preliminary analysis:
Appendix 3 provides the database structure and a listing of entries. The project team has undertaken to maintain the IDA project web site for a period of 12 months and the evaluation database will be linked into the web site.
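As a rough illustration only, a flat-file evaluation database of the kind described above could be sketched as follows. The field names (`name`, `categories`, `notes`) and the helper functions are hypothetical and are not taken from the structure given in Appendix 3; the sample records simply reuse projects and categories named in this chapter. Categories are stored as a semicolon-separated field because, as noted earlier, a project may fall under more than one category.

```python
import csv
import io

# Hypothetical flat file: one record per line, comma-separated fields.
# Field names are assumptions for illustration, not the Appendix 3 structure.
SAMPLE = """name,categories,notes
BabyOIL,Interfaces,Links to recommended search tools
Virtual Pathfinder,Navigation Aids,Search strategy recommendations
OhioLink,Integrated Databases,Integrated searching of linked databases
SOSIG,Gateway Services,Telnet and gopher links to social science resources
"""


def load_projects(text):
    """Parse the flat file into a list of dicts, splitting categories on ';'."""
    projects = []
    for row in csv.DictReader(io.StringIO(text)):
        row["categories"] = [
            c.strip() for c in row["categories"].split(";") if c.strip()
        ]
        projects.append(row)
    return projects


def find_by_category(projects, category):
    """Return the names of projects filed under the given category (case-insensitive)."""
    wanted = category.lower()
    return [
        p["name"]
        for p in projects
        if wanted in (c.lower() for c in p["categories"])
    ]


projects = load_projects(SAMPLE)
gateway_projects = find_by_category(projects, "Gateway Services")
```

A search by category of this kind is one simple way such a file could support the grouping of projects mentioned above; the actual database may have been organised quite differently.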
Web Site
The web site for the IDA project (http://www.ida.unisa.edu.au) was developed in December 1995 (Refer to Appendix 1: Overview of the IDA Web Site). In the initial stages of the project, it consisted of a list of relevant links identified through the tertiary sector networks, knowledge within the project team, the CAUL site and some preliminary searching with various search engines. In January 1996, descriptions and evaluations of various projects were gradually added while feedback was gathered from the wider academic community.
The web site was advertised through the following electronic lists: webcat-L, doclibs, intuser (University of South Australia), and syslibs. It was registered with a number of indexes and subject guides including: Excite, Yahoo, Web Crawler, Lycos, Infoseek, Open Text, Harvest and Galaxy.
During January 1996, the research team made a concentrated effort to advertise the server more widely through overseas/international electronic discussion lists. For example, D-Lib Magazine was extremely interested in the project, providing a link to the IDA web site in the November issue and inviting a brief descriptive article on the project for the January issue. The Australian Library and Information Association's publication InCite included a brief description of the project in its Internet Pages in February 1996.
Using the publicly accessible World Wide Web site, interested parties were invited to participate in the debate and provide feedback about the research team's descriptions and evaluations of projects. A request was sent to a contact person for relevant projects to obtain views and information on the following:
Links with the IDA project were also requested from several resources which are aimed at a specific sector. For example, the EdNA Directory of Educational Resources (http://www.edna.edu.au) has pages which are dynamically generated at the time of access from information stored in an Oracle database. The system can be customised by the user by choosing from a set of themes (icons and backgrounds), default search screens, level of help and entry into the browsing tree.
Other feedback through the IDA site related to key questions associated with the analysis of projects, for example identifying those search tools which provide indexes to print or non-traditional Internet sources and Internet material combined. Other responses suggested that the contents of catalogues or bibliographical databases could be placed into the Harvest system. One response (T. Barry 7/2/96) also provided important guidance and suggestions about components of delivery and the recent development of the Z39.50 protocol.
The project has attracted solid interest from the library and information community. An examination of the server logs and subscriptions to the IDA electronic list clearly shows a bias towards Australian interest, and it appears that the majority of major Australian tertiary institutions are represented. A small number of individuals from the United States, United Kingdom, Canada and Sweden have also joined the IDA electronic list.
The work of the FIGIT Electronic Libraries Program (eLib), sponsored by the United Kingdom Government, is quite well known in Australia. Following publication of the Follett Report (1993), the United Kingdom Government decided to support a wide range of research and development projects relating to electronic communications, for a limited period of time. At the same time, in the United Kingdom as in many other countries, the private sector (particularly the publishing industry) is pursuing its own course, albeit sometimes piggybacking on public sector initiatives.
Some of the projects in the public sector are beginning to develop commercial plans in addition to their Government-financed research and development. In the electronic era, as demonstrated by the series of FIGIT projects, researchers and practitioners from nearly every field of scholarship and practice are involved.
On the back of other work, Professor Michael Brittain (IDA research team) was able to visit the following FIGIT sites in early December 1995:
SOSIG (The Social Science Information Gateway), which forms part of the Program Networked Information Support for Research in the Social Sciences, based at the University of Bristol;
an Open Journal Framework: Integrating Electronic Journals with Networked Information Resources, part of the eLib program, based at the Department of Electronics and Computer Science, Southampton University;
InfoBike, part of BIDS (Bath Information and Data Services), based at the University of Bath;
UKOLN (UK Office for Library and Information Networking), based in part at the University of Bath;
ADAM (Art, Design, Architecture, Media Information Gateway), based at Surrey Institute of Art and Design; and
EEVL (Edinburgh Engineering Virtual Library), based at Heriot Watt University, being a project to build a gateway for the higher education and research community to facilitate access to high quality information resources in Engineering.
The most valuable outcomes of the visit included:
Given the diverse nature of the Internet, all members of the research team were responsible for performing some degree of searching using the preferred techniques of each individual. Comprehensive and coordinated Internet searching was conducted through the following methods:
In addition to this coordinated searching, the following resources were reviewed:
In addition to the Internet searching, more traditional searching was also performed, including searches of relevant Australian bibliographic databases. Highly specific searching was also carried out using Dialog, Uncover, Current Contents, Information Science Abstracts, LISA and ERIC. Other than the classic index, the ARL Directory of E-Journals, Newsletters and Academic Lists (which includes the Kovacs Directory of Scholarly Electronic Conferences), few specific search tools for electronic publications were identified.
The main information sources used to identify relevant materials in the initial CD-ROM searching in October were Information Science Abstracts and Austrom (Issue 3), which incorporated APAIS, AEI and ALISA. Online services used were Uncover, Current Contents, and the Library of Congress and National Library of Australia catalogues. All searching was limited to the years 1990 to 1995 inclusive in order to examine recent literature.
Separate searches were carried out covering document delivery, retrieval (search tools, indexes, finding aids) and integration of electronic and print projects. Searches revealed a body of both peripheral and specific literature in the form of journal articles, with much of the literature located on the web or accessible over the Internet via telnet or gopher.
Follow-up searching was conducted in February using DIALOG and Ozline. The DIALOG search was limited to the last two years and focussed on high-precision searching. The Ozline search was limited to the last nine months; it was hoped that it might provide some overlap to pick up sources with a long time delay in indexing. Computing databases were purposely ignored given that solutions from this area are already quite dominant; an effort was made to focus on broader approaches to the issue as found in library science, information retrieval and education/research sources.
Membership of relevant electronic mailing lists was a valuable means of locating relevant literature. Sources were gained through the Dig-Libs, Govpubs, webcat-L, Syslibs and Doclibs electronic mailing lists for the duration of the project. References to literature were also forwarded to the project team by the IDA advisory team throughout the project. Electronic publications were found through D-Lib Magazine, which has been a valuable source of literature throughout the project.
There is extensive literature available which debates the many options and issues of resource discovery, the majority of it located on the web. This literature takes the form of hard copy and electronic journal articles, research results, discussion papers and conference papers. Given the increased interaction between industries brought about by the growth of the Internet and resulting networks, several industries have been active in generating literature relevant to the issues, including the library and information, social science, multimedia and education industries.
Globally, seminal work is underway that must be monitored; for example, the development of the CNI (Coalition for Networked Information) White Paper on Networked Information Discovery and Retrieval. The purpose of the white paper is to provide a detailed examination of relevant information retrieval issues. The project is being conducted over a lengthy time span, necessary in this area to gain a full appreciation and coverage of the issues.
The existence of broad-based, large scale projects which play a role in information dissemination reflects the importance of integrated document access issues for the international academic community. In the United Kingdom, the Electronic Libraries (eLib) program is successfully coordinating relevant research for academic libraries.
An equivalent project in the United States is the NSF/ARPA/NASA Digital Library Initiative, under which a small number of long-term, expensive digital library projects have been funded. Informally, through subsidiary projects such as D-Lib Magazine, the initiative as a whole performs a monitoring role for a wider range of resource discovery/information retrieval projects. The existence of these projects indicates that modern information retrieval issues are complex enough to warrant ongoing monitoring and a coordinated national approach.
In Australia there is no equivalent project-based organisation playing a formal role in providing ongoing examination of resource discovery trends. Informally, the National Library and the Australian National University play leading roles in collecting and disseminating information. However, this role appears essentially voluntary and scattered across internal departments.