The “repository ecology” approach to describing cross-search aggregation service management

My colleague Malcolm and I have had our position paper for the ECDL 2007 workshop “Towards an European repository ecology” accepted, so I’ll be giving a presentation with the above title in Budapest on Sept 21st. We will use the ecology metaphor to describe and explore issues raised through PerX, a project that developed a pilot service providing resource discovery across a series of repositories of interest to the engineering learning and teaching communities. We’ll also describe the ecological habitat within which the PerX pilot service sat and sketch its ecological niche, that is, the role of the service and its interactions with other entities. In doing so we hope to show that, while a technical architecture is at the heart of this description, the ecology approach highlights crucial interactions that are out of scope for a technical architecture.

Here’s the text of that position paper. (For more information and other formats, see the ICBL web page for this work.) Any comments, especially those received before I write the paper for Budapest, are welcome.

We propose to use the ecology metaphor to describe and explore some issues raised through the PerX project. This project developed a pilot service providing resource discovery across a series of repositories of interest to the engineering learning and research communities. The fundamental use case behind PerX is that an engineer who requires information should be able to perform a search across a selected range of data providers, and the results should direct him or her to a relevant information resource. The distributed architecture involved in building such a service can easily be described using the JISC Information Environment Architecture: the PerX service is an aggregator in the fusion layer, it has a user interface provided in the presentation layer, and it cross-searches information about resources held by several data providers in the provision layer. In the Information Environment Architecture view of PerX, the nature of the content provider services and how to search them is known because of data provided by a service registry, which is part of the shared infrastructure. This is shown schematically in the central part of figure 1.
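The layering described above can be illustrated with a minimal sketch. It is not the PerX implementation: the class and function names (`ServiceRegistry`, `Aggregator`, `make_provider`, `present`), the in-memory toy providers, and the `example.org` URLs are all assumptions made for illustration. A real provider would be a remote service reached through whatever search protocol it supports.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """Metadata describing one resource held by a data provider."""
    title: str
    provider: str
    url: str


# Provision layer: each data provider is modelled as a search function.
SearchFn = Callable[[str], List[Record]]


class ServiceRegistry:
    """Shared infrastructure: which providers exist and how to search them."""

    def __init__(self) -> None:
        self._providers: Dict[str, SearchFn] = {}

    def register(self, name: str, search: SearchFn) -> None:
        self._providers[name] = search

    def lookup(self, names: List[str]) -> Dict[str, SearchFn]:
        # Providers absent from the registry are silently skipped.
        return {n: self._providers[n] for n in names if n in self._providers}


class Aggregator:
    """Fusion layer: cross-searches the selected providers and merges results."""

    def __init__(self, registry: ServiceRegistry) -> None:
        self.registry = registry

    def cross_search(self, query: str, selected: List[str]) -> List[Record]:
        results: List[Record] = []
        for search in self.registry.lookup(selected).values():
            results.extend(search(query))
        return results


def make_provider(name: str, titles: List[str]) -> SearchFn:
    """Build a toy in-memory provider (hypothetical stand-in for a remote one)."""

    def search(query: str) -> List[Record]:
        q = query.lower()
        return [
            Record(t, name, f"https://example.org/{name}/{i}")
            for i, t in enumerate(titles)
            if q in t.lower()
        ]

    return search


def present(records: List[Record]) -> str:
    """Presentation layer: render results as plain text for the user."""
    return "\n".join(f"{r.title} [{r.provider}] {r.url}" for r in records)
```

In this sketch the aggregator can only search providers that the registry knows about, so anything the registry cannot supply has to reach the aggregator some other way.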
The actual experience of setting up and maintaining links between PerX and data providers has been described (here, and here), and was critically dependent on many more resources and factors than are shown in the architectural view. Typically, adding a new data provider involved the PerX service manager gathering information from the wider community of information specialists, from the data provider’s website and, crucially, from a data provider service manager: someone with the authority and/or expertise to commit to providing a data feed under suitable conditions.
We believe that the interactions involved in setting up and maintaining a cross-search aggregation service may be modelled as a “habitat” in the repository ecology, and have attempted to sketch some of these interactions in figure 1. We make the following observations relating to this approach:

  • The architectural view is at the centre of our ecological view. The ecological view supplements the architecture: it highlights crucial interactions that are out of scope for a technical architecture, and has the potential to show where the architecture fails to support the information flow required by a service.
  • It would be of great benefit to obtain more detail on the precise nature of the information obtained by the PerX service manager from each of the available sources, and whether necessary interactions are in place for this information to be supplied via the Service Registry in the shared infrastructure.
  • Organizations acting as data providers have an internal structure that may be modelled as a community. Within this there will be factors that may hinder the interactions required for the PerX habitat to function smoothly. These may include competition (a biotic factor) from other services that offer resource discovery, for example other equivalents to PerX (engagement in other habitats) or a policy of preferring use of their own resource discovery tools over facilitating indirect resource discovery.
  • “Abiotic” factors such as policy and the consequent flow of funding may affect the whole habitat by, for example, not providing enough resource to nurture the “soft” elements such as a functioning information community. The ecological view has value in surfacing these elements of the ecology, helping to explain their value, and thus providing a means of securing support for them.
  • Service managers may be a keystone species. There are certainly cases where no service manager emerged from the mists of the data provider’s community, and achieving a cross-search in these cases was problematic.
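
One way to make these observations concrete is to record the habitat's information flows as data and ask which data providers lack a particular kind of source. The following is only a sketch under stated assumptions: the `Interaction` record, its field names, the role labels, and the `providers_lacking` helper are hypothetical and are not part of PerX or of the Information Environment Architecture.

```python
from typing import List, NamedTuple, Set


class Interaction(NamedTuple):
    """One information flow in the habitat (all names are hypothetical)."""
    source: str       # who or what supplies the information
    source_role: str  # e.g. "service manager", "community", "website"
    provider: str     # the data provider the information is about


def providers_lacking(role: str, providers: List[str],
                      interactions: List[Interaction]) -> List[str]:
    """Return the providers for which no source with the given role exists."""
    covered: Set[str] = {
        i.provider for i in interactions if i.source_role == role
    }
    return [p for p in providers if p not in covered]
```

Calling `providers_lacking("service manager", ...)` on such a record would list the data providers where no service manager emerged, which are the cases the last observation describes as problematic.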

We hope to be able to expand on these and other observations during the workshop. If the ecology approach is to meet its potential as a communication tool, then it is important that we all speak the same language when we employ it: we hope that attendance at this workshop will be important in establishing this language.

Figure 1.
Entities and interactions in the PerX cross-search habitat. An engineer uses the PerX user interface to perform a cross-search of selected data providers in order to find information. The schematics in the centre of the diagram show the distributed architecture of data provision, fusion (the aggregator), and presentation (the user interface), with a shared service registry which (in theory) can be used to obtain information about the available data providers and how to interface with them. Around this we show the other elements at play in the habitat which were required in order to set up the cross-search: a community supporting information professionals that provided information about which data providers were relevant, and a service manager at each data provider who provided information specific to that service.