We have just purchased Calm for our Cartoon centre project and are going to evaluate the XML Wrapper (XW) for use with our Cartoon archive website, with a view to purchasing it in the near future if it can meet our needs. As we understand it, the XW operates in a similar way to DServe when searching and retrieving records.
We are concerned that the structure we are planning will make these calls very onerous, requiring multiple calls for each search. We are wondering if anyone has run into a similar situation and can give us advice about structuring our data to reduce the number of calls needed for each search.
Background
Web searching for cartoons will be concerned with the item level and its related metadata, but we want to simplify the results that are returned.
If more than one format of a given “cartoon” matches the search criteria (i.e. more than one of the linked item-level records), we only want to return one of the items in the search results, which will represent the “cartoon”. Clicking on the “cartoon” will then allow the user to drill down to see all related items (i.e. all formats of the cartoon we hold).
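To make the desired behaviour concrete, here is a minimal Python sketch of collapsing a list of search hits to one entry per “cartoon”. It only works on data already in memory and makes no web service calls. The field names `ref_no` and `concept_id` are made up for illustration and are not Calm or XW field names.

```python
def collapse_by_concept(hits):
    """Group hits by cartoon concept, keeping the original search order."""
    groups = {}
    for hit in hits:
        # Hits without a concept link stand for themselves.
        if hit.get("concept_id"):
            key = ("concept", hit["concept_id"])
        else:
            key = ("item", hit["ref_no"])
        groups.setdefault(key, []).append(hit)
    # One representative per cartoon, plus every format for the drill-down.
    return [{"representative": items[0], "formats": items}
            for items in groups.values()]


hits = [
    {"ref_no": "A1", "concept_id": "C1"},   # original drawing
    {"ref_no": "B7", "concept_id": "C1"},   # published version, same cartoon
    {"ref_no": "D3", "concept_id": None},   # item with no concept record
]
results = collapse_by_concept(hits)
```

With this sample data the two formats of cartoon `C1` appear as a single result and `D3` appears on its own. The open question in our case is how to obtain something like `concept_id` for each hit without extra calls.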
At the moment, in our non-Calm database, metadata related to multiple items is held in one record with links to images of the various cartoon formats. This is essentially the equivalent of our “cartoon concept” record. There is no structure to the current catalogue (i.e. most collections are composed of single items and there is no need for a hierarchy).
However, since the Giles material has more than just cartoons, it requires a hierarchy to show the structure of the collection. Each format of the cartoon will be in its relevant series and will need to have its own metadata. Thus, we need some way of grouping or associating versions of a “cartoon” that sit in different series and even in different collections. The usual method of doing this, simply linking related records, is not sufficient because we need to distinguish the linking of “cartoon versions” from other standard linkages (e.g. a drawing linked to correspondence mentioning it).
Proposed solution
We are considering using Calm’s “analytical record” to represent this “cartoon concept” record, though this is a slightly unusual use of this type of record. By linking the analytical record to each of the cartoon formats, we are able to “name” the relationship and thus distinguish it from other linkages. We envision these “cartoon concept” records having only minimal metadata, as the more complete metadata will be held by the child records. While we would be happy to have the item metadata duplicated in the analytical record, we do not believe this is a sustainable model, since any change to the data would need to be made manually in more than one place.
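As a rough illustration of the idea, the sketch below models records with named links in Python. It is not Calm’s actual data model: the `Record` class, the relationship name `"cartoon version"` and all reference numbers are hypothetical, chosen only to show how a named link can be told apart from an ordinary one.

```python
from dataclasses import dataclass, field

CARTOON_VERSION = "cartoon version"  # hypothetical relationship name


@dataclass
class Record:
    ref_no: str
    record_type: str  # "item" or "analytical"
    links: list = field(default_factory=list)  # (relationship name, target ref_no)


def concept_of(record):
    """Return the ref_no of the linked cartoon concept, or None."""
    for name, target in record.links:
        if name == CARTOON_VERSION:
            return target
    return None


drawing = Record("X/1/5", "item", [
    (CARTOON_VERSION, "AN/1"),   # named link to the analytical record
    ("related", "X/4/2"),        # ordinary link, e.g. to correspondence
])
letter = Record("X/4/2", "item", [("related", "X/1/5")])
```

Here `concept_of(drawing)` picks out `"AN/1"` and ignores the correspondence link, while `concept_of(letter)` finds no concept at all.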
The problem with our proposed solution is that the XML Wrapper (and DServe) would usually retrieve all the item-level records, since they are being matched by the search. However, if more than one item linked to the same “cartoon concept” is returned, we only want to show one.
The only way we see to achieve this is to make a large number of web service calls, as demonstrated in this scenario:
If a search returns an item record, we will have to iterate over its list of ‘related records’ to work out which are relevant (i.e. which are analytical records) at the level we need, then go back down from the ‘parent’ to find out which is the most relevant ‘child’ record. For a search that returns a large number of items, we could end up having to do this repeatedly for each result, potentially a huge number of XW calls. This is complicated by there being no simple method to return just a single record. It could be made easier by having some direction in the related record, such as ‘isParentOf’ or ‘isChildOf’, but maintaining these manually isn’t going to be very practical. For searches that return a large number of records, we are worried that these multiple calls will make the system very slow and onerous to use.

We are looking for ways of creating this structure so that one item representing the “cartoon concept” record could be returned more easily by the search. We have not yet started to enter data, so we can be flexible in creating our structure and wonder if there is a better or more efficient way of doing so.
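The lookup pattern we are worried about can be sketched in Python as follows. `FakeWrapper` is an in-memory stand-in that simply counts lookups; it is not the real XW interface, and `get_record`, the `type` and `related` fields and the `rank` value used to pick the “most relevant” child are all assumptions made for illustration.

```python
class FakeWrapper:
    """In-memory stand-in for the web service; NOT the real XML Wrapper API."""

    def __init__(self, records):
        self.records = records
        self.calls = 0

    def get_record(self, ref):
        # Each lookup stands for one round trip to the service.
        self.calls += 1
        return self.records[ref]


def representative_for(hit, client):
    """Walk item -> analytical record -> preferred child."""
    for ref in hit["related"]:
        related = client.get_record(ref)
        if related["type"] != "analytical":
            continue  # an ordinary link, e.g. correspondence
        # Go back down from the concept to choose the child to display.
        children = [client.get_record(r) for r in related["related"]]
        return min(children, key=lambda c: c["rank"])
    return hit  # no concept record: the item represents itself


records = {
    "C1": {"ref": "C1", "type": "analytical", "related": ["I1", "I2"]},
    "I1": {"ref": "I1", "type": "item", "related": ["L1", "C1"], "rank": 2},
    "I2": {"ref": "I2", "type": "item", "related": ["C1"], "rank": 1},
    "L1": {"ref": "L1", "type": "item", "related": ["I1"], "rank": 9},
}
client = FakeWrapper(records)
hits = [records["I1"], records["I2"]]
shown = {representative_for(hit, client)["ref"] for hit in hits}
print(shown, client.calls)
```

Even in this tiny example, two search hits that belong to the same cartoon cost seven extra lookups to reduce to a single displayed item, and the count grows with both the number of hits and the number of links on each record.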
Any questions, responses or information on this would be greatly appreciated. Please email bca-digitisation@kent.ac.uk.
