Packaging University Museum collections information as part of open metadata provision
The project has provided invaluable insight into open data and resource discovery issues for University Museums, and into the use of collection descriptions to address them. It has resulted in important improvements to aggregation services and in the development of new interfaces to present and promote these exceptional collections to higher education and other audiences. A summary follows:
The project has produced:
Opportunities and Possibilities
Two immediate opportunities spring to mind as the project ends:
Firstly, an apt next stage would be a programme of content development to help University Museums not already in Culture Grid to be aggregated, so that they can make use of the fully functioning prototype search for UK University Museums, which was developed to meet the needs of users in this domain. This would provide a major resource for research, learning, teaching and collections management, and an excellent promotional and advocacy tool for University Museums.
Secondly, the close association with the Open Book project has highlighted the potential benefits of associating research output data with collection descriptions, and further developments in this area would be welcome.
User Interface (UI) developments have focused on the UK University Museums search, with the aim of trying to incorporate as many of the feature requests as possible that were raised during user consultations.
This has meant that the three user interfaces the project aimed to produce will be finalised after the official project end. Developers Gooii expand on some of the developments so far for the UK University Museum search and new Culture Grid search:
UK University Museums search
The major UI development has gone into the concept of tab filtering of search results for Institutions, Collections and Items. Initially we tried to combine tabs with breadcrumbs to clearly identify the user’s location within the hierarchy.
When implemented, however, this proved unwieldy: the amount of data overloaded the tabs, and because objects could belong to multiple collections the breadcrumb became too long and confusing. The tab concept itself did seem to work as a useful filtering option, so we simplified the tabs, adding just a count so that the user can clearly see how many of each record type have been returned. We then developed a sync controller so that the tabs would be kept in sync with the search term.
Users can now find institutions and collections easily, and then choose whether to view a whole collection or only objects within a collection that match their search term.
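The counting behind the simplified tabs can be sketched as follows. This is a minimal illustration in Python, not the code used on the site: the Record class, the tab_counts function, the title-substring matching and the sample records are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# The three record types shown as tabs in the search results.
TAB_TYPES = ("institution", "collection", "item")


@dataclass(frozen=True)
class Record:
    record_type: str  # one of TAB_TYPES
    title: str


def tab_counts(records, term):
    """Count the records matching the search term, per tab."""
    needle = term.lower()
    counts = Counter(
        r.record_type for r in records if needle in r.title.lower()
    )
    # Report every tab, including empty ones, so the tab bar stays stable.
    return {tab: counts.get(tab, 0) for tab in TAB_TYPES}


# Hypothetical sample data for illustration only.
records = [
    Record("institution", "The Fitzwilliam Museum"),
    Record("collection", "Fitzwilliam coins and medals"),
    Record("item", "Roman coin"),
    Record("item", "Greek coin"),
]

print(tab_counts(records, "coin"))
```

Recomputing the counts from the current search term each time it changes is one simple way to keep the tabs in sync with the search.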
The new GRID CLDs are now shown when the collection is selected.
Subject browse structure using JACS
Sign in and User Collections
Users can sign in via Google (OpenID) and then create and publish their own collections on the site. The Google OpenID login was well documented and quite easy to implement. We used a consumer-only OpenID wrapper class (LiteOpenID), which simplified things.
Once signed in, users can edit their details, including their username, which affects the public URL of their collections. They can also choose whether to make their collections public or private.
Collecting objects is done from the search results. Each item record has an “Add to your collection” button which allows users to add the record to an existing or new collection (via a modal dialogue).
User collections can then be reached via a public URL with the structure: /user/username/usercollectionname
Both username and user collection name can be defined by the user.
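As an illustration of how such a URL might be derived from the two user-defined names, here is a minimal Python sketch. The slugify step (lower-casing and replacing other characters with hyphens) is an assumption for the example; the post does not say how the site normalises names, and both function names are hypothetical.

```python
import re


def slugify(name):
    """Lower-case a name and replace runs of other characters with hyphens.

    This normalisation is an assumption made for the sketch.
    """
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def user_collection_url(username, collection_name):
    """Build a path of the form /user/username/usercollectionname."""
    return f"/user/{slugify(username)}/{slugify(collection_name)}"


print(user_collection_url("Jane Doe", "My Roman Coins!"))
```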
Features still remaining to be implemented:
Culture Grid search upgrade
The CG search UI was modernised, URL patterns were made human readable, load times were reduced and item pages were given their own static URLs.
Importantly the collections in the advanced search were made hierarchical for easier searching.
Features still remaining to be implemented:
Some of the overall lessons learnt include the value of stakeholder / user consultation and the need to not underestimate the resources required for activities such as application-profile development, data enhancement and interface refinement to meet the detailed user needs gathered during consultations.
At a more detailed level, lessons learnt include keeping initial interface design briefs focused on outcomes and brand values rather than technical solutions, and adopting an informal approach to user consultation. The latter helped conversations flow and ensured sufficient engagement at focus groups, leading to many valuable insights and recommendations coming forward from the community.
Further lessons relating to technical development are available at: https://contextualwrappers2.wordpress.com/2012/08/09/technical-challenges/
Benefits of our approach and how others could follow us
The project team has particularly benefited from new insights into the broader use of collection records across University Museums and into the applicability of the ‘wrappers’ approach in practice at both the institution and aggregator level. The former is an outcome of the wide consultation the project undertook with the sector (https://contextualwrappers2.wordpress.com/category/consultation/); the latter results from the related Open Book project (http://jiscopenbook.wordpress.com/).
These insights have been fed into technical developments and aggregation workflows to improve integration and dissemination of collections information through the Culture Grid (http://www.culturegrid.org.uk) and other interfaces. This gives future benefits not only to The Fitzwilliam and Culture Grid, but also to the wider University Museums sector, which can make use of specific search interfaces and an aggregation infrastructure more attuned to the needs of University Museums.
Our advice to other projects looking to use collection descriptions for contextualised resource discovery through aggregation services would be to use existing infrastructure at the aggregation-level, such as the project now provides; focus resources on the development of local systems and policies to fit a broader ‘open resource discovery’ environment; and prioritise those local developments to meet the particular resource discovery needs of their own communities by continually engaging with their collection users.
In line with the Discovery Open Metadata principles (http://discovery.ac.uk/businesscase/principles/), the project advocates open licensing of metadata and applies a layered approach in which appropriate licences can be applied to different types, formats or instances of metadata.
Within this approach it is also important that rights statements about the ‘resource’ being described are recorded within the metadata, so for item records that would be rights information about the item being described and for collection records rights information about a collection as a whole.
Additionally, the project’s development of records about collections within the Culture Grid (http://www.culturegrid.org.uk/ ) meant that a concise summary of the overall legal status of a collection within the Culture Grid could be made. This further supports appropriate use of resources and metadata by a variety of end users as the full range of rights conditions within an aggregated collection can be easily made transparent.
One of the main reasons for taking this approach is that it fits well within a service model that includes different aggregation and discovery points, e.g. where Culture Grid is a metadata provider for Europeana, the European cultural portal (http://europeana.eu/). The rights management process then becomes: an institution assigns rights and licence statements about its items and their representations in its item metadata; it then makes a more limited format of item metadata available to Culture Grid for resource discovery under an open licence, such as CC0 or the Open Government Licence; and Culture Grid in turn reformats that metadata into Europeana’s resource discovery format under the terms of the Europeana Data Exchange Agreement and Usage Guidelines. In this way users at each ‘discovery point’ are made aware of both the legal status of the items and their representations and the conditions under which sets of metadata may be used.
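The first two steps of that process can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the project's implementation: the field names, the DISCOVERY_FIELDS selection and the to_discovery_record function are all hypothetical.

```python
# Assumed reduced field set for resource discovery (hypothetical names).
DISCOVERY_FIELDS = ("title", "creator", "date", "item_rights")


def to_discovery_record(full_record, metadata_licence="CC0"):
    """Reduce a full item record to a discovery record.

    The rights statement about the item itself is carried across
    unchanged; the open licence applies only to the reduced metadata.
    """
    reduced = {k: full_record[k] for k in DISCOVERY_FIELDS if k in full_record}
    reduced["metadata_licence"] = metadata_licence
    return reduced


# Hypothetical institutional record for illustration only.
full_record = {
    "title": "Untitled (example work)",
    "creator": "Example Artist",
    "item_rights": "In copyright: permission required for reuse of images",
    "curatorial_notes": "Internal notes not intended for aggregation",
}

print(to_discovery_record(full_record))
```

The point of the sketch is the separation of the two layers: the item's own rights statement travels with the record, while the licence recorded alongside it covers only the metadata.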
To support and develop this approach the project gained insight from the related Open Book project (http://jiscopenbook.wordpress.com/2012/05/08/disambiguating-metadata/) and included rights management issues within its consultations (https://contextualwrappers2.wordpress.com/category/consultation/). The latter revealed University Museum practitioners to be very positive about the open licensing of resource discovery metadata, once it was made clear that open licences were applied only to a reduced format of ‘resource discovery’ metadata and that restricted rights remained for use of items, such as contemporary works of art and their representations, with those rights being recorded in the metadata for users to be aware of.
A future development in this area would be for UMG and UMIS to become signatories of the Open Metadata Principles.
This post from the project’s technical partners describes some of the technical challenges faced within Culture Grid (CG) and the new search interfaces developed for the Contextual Wrappers 2 project (CW2).
This project follows on from the Contextual Wrappers project (CW) that was previously funded by JISC. The CW project tackled several problems surrounding representation of Collection Level Descriptions (CLDs) and the linking of these CLDs to relevant item records. However, it also led to several new issues being highlighted. CW2 aimed to tackle these issues and develop the ideas that were previously investigated within both CG and CW.
In situations such as CG, where records are combined from over 100 different museums, libraries and archives, it is useful to provide an overall collection hierarchy which allows different sets of records from different museums to be grouped together or separated out as appropriate. This allows the aggregating platform (CG) to apply extra management information to the individual collections that would not normally be made available publicly. This information can include whether to expose the data through different API endpoints, etc.
CG has traditionally allowed for very simple ‘grid’ collection descriptions for these groupings along with the relevant management data. It then allowed for the uploading of highly detailed CLDs based on RSLP metadata elements for individual museums’ collections. CW2 takes the lessons learnt from CW and previous projects and applies them to improving the management of collections within CG. This in turn improves the user experience, allowing users to navigate between associated collections within the user interface and helping them to find records more easily.
Another issue that CW2 aims to tackle is caused by the increasing use of application profiles as implemented by the CW project. The CW work allows different records within CG to have different application profiles applied to them. This means that different records will hold data for different sets of metadata fields. The existing CG interfaces simply output to the user all fields for which there was data. While this deals with the fact that different records use different fields, it doesn’t lead to the best user experience, as fields aren’t necessarily shown in useful orders, or with appropriate grouping, etc. CW2 aims to solve this problem by ensuring that the application profiles used within the backend CG code are exposed to the frontend interface systems, allowing the information to be displayed in a more consistent manner.
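A collection hierarchy carrying non-public management information could be modelled along the following lines. This Python sketch is purely illustrative: the Collection class, the "expose_via" key, the endpoint names and the sample collections are assumptions, not CG's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    name: str
    # Management data held by the aggregator; not part of the public view.
    management: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def exposed_via(node, endpoint):
    """List, depth-first, the collections flagged for the given endpoint."""
    names = []
    if endpoint in node.management.get("expose_via", []):
        names.append(node.name)
    for child in node.children:
        names.extend(exposed_via(child, endpoint))
    return names


def public_view(node):
    """Return the hierarchy with the management information left out."""
    return {
        "name": node.name,
        "children": [public_view(child) for child in node.children],
    }


# Hypothetical hierarchy and endpoint names for illustration only.
root = Collection("Example aggregation", children=[
    Collection("Museum A paintings", management={"expose_via": ["search", "oai"]}),
    Collection("Museum B archives", management={"expose_via": ["search"]}),
])

print(exposed_via(root, "oai"))
print(public_view(root))
```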
In retrospect there’s nothing hard about these developments. However, as always when working with a system that needs to deal with almost 3 million records without breaking existing functionality, the devil is in the details! Much of the functionality that has been added for CW2 could easily be implemented into a new aggregation system which doesn’t already contain that much data as the data can then be made to fit the system on import. This isn’t possible with a system such as CG which has had to develop organically throughout its lifetime.
The frontend interfaces for CG are completely separate from the backend processing components. While this has advantages in terms of allowing backend components to be upgraded without adversely affecting the frontend and vice versa it means that application programming interfaces (APIs) must be developed to share information between the components. For CW2 it has been necessary to develop mechanisms to share the application profile specification between the backend and frontend. The standard mechanism for sharing application profiles is to use an XML Schema. While this is perfectly sensible for data validation and associated purposes it is too cumbersome for use when creating user interfaces from the structure. Therefore a custom JSON structure had to be developed which allowed the frontend interfaces to structure data as specified by the profiles.
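To illustrate the idea of a JSON profile driving the display, here is a small Python sketch. The JSON layout, the group labels and the field names are invented for the example and should not be read as the structure CG actually uses.

```python
import json

# Hypothetical profile: groups of fields, in the order they should be shown.
PROFILE_JSON = """
{
  "profile": "example-item",
  "groups": [
    {"label": "Description", "fields": ["title", "description"]},
    {"label": "Rights", "fields": ["rights", "licence"]}
  ]
}
"""


def layout_record(record, profile):
    """Group and order a record's populated fields as the profile specifies."""
    layout = []
    for group in profile["groups"]:
        rows = [(name, record[name]) for name in group["fields"] if name in record]
        if rows:  # skip groups with no data for this record
            layout.append((group["label"], rows))
    return layout


profile = json.loads(PROFILE_JSON)
# Field order in the record does not matter; the profile decides.
record = {"rights": "In copyright", "title": "Roman coin"}

for label, rows in layout_record(record, profile):
    print(label, rows)
```

A structure like this is easier for interface code to walk than an XML Schema, because it states directly which fields belong together and in what order.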
CG also holds information about many of the cultural institutions within the UK, since it preserves an institution dataset inherited from a prior project. CG has always modelled the link between institutions and the CLDs describing the collections held by these institutions. However, this information was not exposed in enough detail for users to navigate between different collections of records. CW2 has improved this by firstly extending the application profile implementation to include the description of institution records and then by ensuring that enough information is provided to allow navigation between different collections held by the institution.
The initial data model for institution records within CG was based around the data structures that were inherited. This didn’t include links between institutions. This lack of linking can be problematic in the case of institutions such as the University of Cambridge and the Fitzwilliam Museum (FM) which are linked but distinct institutions in real life. Collections held by FM need to somehow be associated with both FM and potentially University of Cambridge. A similar situation exists with records from services like VADS and COPAC in CG, where there is a need to associate records with both the service and the institution(s) their data represents. Therefore CW2 has extended the CG institution data model to provide links between ‘institutions’ allowing the user to perform this sort of navigation between records.
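The effect of adding links between institutions can be sketched in Python as follows. The directed-link representation and the associated_institutions function are assumptions made for illustration; the post does not describe how CG stores these links.

```python
from collections import deque

# Hypothetical directed links: institution -> institutions it is linked to.
INSTITUTION_LINKS = {
    "Fitzwilliam Museum": ["University of Cambridge"],
    "University of Cambridge": [],
}


def associated_institutions(holder, links):
    """Return the holding institution followed by every institution
    reachable from it through links (breadth-first, no repeats)."""
    seen = [holder]
    queue = deque([holder])
    while queue:
        current = queue.popleft()
        for linked in links.get(current, []):
            if linked not in seen:
                seen.append(linked)
                queue.append(linked)
    return seen


print(associated_institutions("Fitzwilliam Museum", INSTITUTION_LINKS))
```

With links of this kind, a collection held by FM can be listed under FM and also surfaced when a user browses from the University of Cambridge.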
CG holds both metadata representations of collections (CLDs) and of the items themselves within the collections. This means that it is possible for CG to perform some automatic analysis of the item metadata and to save this information as part of the CLD itself. This in turn allows the user to navigate collections based on information stored in the items within the collection which the original producer of the CLD may not have included.
CG performs this function by periodically searching for important information within the items stored in a collection and then caching that information within the collection itself in a way that it can be presented to the end user. In theory this information could also be used as a controlled vocabulary of sorts to help the user when creating new records within that collection within CG but this functionality is not within the scope of the CW2 project.
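A minimal Python sketch of this kind of periodic summarising and caching is shown below. The "subject" field, the "cached_summary" key and both function names are hypothetical; which item information CG actually treats as important is not stated in this post.

```python
from collections import Counter


def summarise_items(items, field_name, top_n=3):
    """Return the most frequent values of one field across a collection's items."""
    counts = Counter(
        value for item in items for value in item.get(field_name, [])
    )
    return counts.most_common(top_n)


def refresh_collection_cache(collection, items, field_name="subject"):
    """Store the summary on the collection record so it can be shown to users.

    In a real system this would run periodically, not on every request.
    """
    cache = collection.setdefault("cached_summary", {})
    cache[field_name] = summarise_items(items, field_name)
    return collection


# Hypothetical collection and items for illustration only.
items = [
    {"subject": ["coins", "roman"]},
    {"subject": ["coins", "medals"]},
    {"subject": ["coins", "medals"]},
]
collection = {"title": "Example coin collection"}

print(refresh_collection_cache(collection, items)["cached_summary"])
```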
Positive lessons from CW2 include that it is possible to link together management information and CLDs in order to provide a more user-friendly search and navigation experience. Neither the CLD specification nor the specification of the management information within CG needed to be adapted for use in this way; it was simply a case of linking the two together and then making the data available as appropriate. The main search APIs provided by CG didn’t need any expanding to meet the needs of the CW2 project, although additional API calls were required to expose application profile information, etc.
The main negative lesson learnt is that once again we underestimated the amount of time required to implement the desired changes to a system as large and complex as CG. The CG platform has evolved incrementally throughout its existence. This means that there are often complex interactions between different parts of the platform which makes changing any one small part of the system more complex than would be the case with a system developed from scratch to fulfil the same requirements. It also means that further incremental changes often interact and conflict with these previous interactions in unforeseen ways, leading to increased development and testing times compared with forecasts. Happily, though, we’ve managed to get through the pain and implement something that we hope will prove useful for all CG users.