Linking research & learning technologies through standards

Link Affiliates Blog

Comparison, People Australia and Register My Data encoding of parties

We have already presented the People Australia and the Register My Data initiatives, and their different approaches to encoding information about parties and their identity. Elsewhere we compare their schemata in detail: a walkthrough of the schemata, followed by a discussion of points of disparity. We first compare People Australia with ISO 2146 proper, before comparing ISO 2146 with RIF-CS.

Our comparison is motivated by the fact that ANDS will be using People Australia as a primary resource for researcher identity. The comparison is specific to the process of importing People Australia metadata into the format required for Register My Data.

Research Data Australia and People Australia have been set up to solve different problems, so we expect to find mismatches between the two—not only in what data they capture, but in how that data is viewed. In particular, People Australia is set up as an authoritative reference on identities: it uses a rigorous data standard, is intended to be machine-readable, and sets a premium on deduplication. Register My Data is driven by research data discovery by humans, and the identities of researchers are coded only as a means to that end: researcher identity is not critical to disambiguate, since the context of the data discovery itself provides the disambiguation. Research Data Australia is a data aggregator, and does not have autonomous authority; its RIF-CS schema needs to serve as an intermediary between disparate sources of information, rather than aspiring to detail.

That means that information lost in importing identities from People Australia into Register My Data need not be a problem. Data on identity that cannot be expressed under Research Data Australia will likely not be relevant to Register My Data’s purposes anyway. We still thought it useful to confirm this against the current RIF-CS schema, which is undergoing review, and to contribute our comparison to the review process. RIF-CS will be expanded in the next few months, particularly as it will gain the added functionality of exporting citations to citation software.

Given all that, we note the following in our comparison between the current versions of each:

  • ISO 2146 support for metadata disambiguating parties is poorer than in EAC. This is reflected in RIF-CS, but is consistent with the different approach on identity in Register My Data.

  • Consistent use of identifiers for parties throughout allows users to do discovery by author; but exposing identifiers to users raises user expectation that the identifiers are deduplicated.
  • The ANDS use of ISO 2146 could usefully include the EAC attributes of level of detail and authority file (source). It may also be useful to code country separately for international collaborations.
  • RIF-CS does not support the ISO 2146 entity Events, so any professional history of the party (e.g. their professional affiliation) can only be encoded through Activity. At issue is whether a party’s professional appointments should be encoded as attributes of that party, or as separate entities which are independently discoverable. For past appointments in particular, detailed information may be hard to come by.
  • There is no provision for record history in RIF-CS, outside a simple indication of provenance. Again, authority data should properly be sought at the source repository; but ANDS will assert its own authority if it embarks on quality control of metadata. If it does, that authority should be discoverable as record metadata.
  • RIF-CS severely constrains where Date Range can be used: parties, activities, and relationships cannot have a lifespan coded against them. This restricts the potential for disambiguation or context.
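
To make the points above concrete, here is a minimal sketch of what a lossy import might look like. Everything in it is an assumption for illustration: the class and field names (`SourceIdentity`, `TargetParty`, `existence_dates`, and so on) are hypothetical stand-ins, not the actual EAC, ISO 2146, or RIF-CS element names, and representing activities as a plain list of strings is a simplification. It only shows the general shape of the problem: lifespan dates and authority details have nowhere to go in the target, appointments are carried over as activities, and the importer reports what it dropped.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SourceIdentity:
    """Hypothetical authority-style record (not real EAC element names)."""
    identifier: str
    name: str
    existence_dates: Optional[Tuple[str, str]] = None  # lifespan of the party
    authority_source: Optional[str] = None             # authority file (source)
    level_of_detail: Optional[str] = None
    appointments: List[str] = field(default_factory=list)


@dataclass
class TargetParty:
    """Hypothetical discovery-oriented party record (not real RIF-CS names)."""
    identifier: str
    name: str
    provenance: str                                     # simple provenance only
    activities: List[str] = field(default_factory=list)


def import_party(src: SourceIdentity, provenance: str) -> Tuple[TargetParty, List[str]]:
    """Map a source identity to a target party and list the fields dropped."""
    lost = []
    # The target has no slot for a lifespan or for authority details,
    # so any such values are recorded as lost rather than silently discarded.
    for attr in ("existence_dates", "authority_source", "level_of_detail"):
        if getattr(src, attr) is not None:
            lost.append(attr)
    party = TargetParty(
        identifier=src.identifier,
        name=src.name,
        provenance=provenance,
        # Professional history can only be carried over as activities.
        activities=list(src.appointments),
    )
    return party, lost
```

Returning the list of dropped fields alongside the party is one way to confirm, record by record, that what is lost really is irrelevant to discovery.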

Written by Nick Nicholas

December 10, 2009 at 5:04 pm
