Linking research & learning technologies through standards

Link Affiliates Blog

Archive for the ‘Collection management and delivery’ Category

ISO 2146 released

leave a comment »

Last month, ISO released the long-awaited third edition of the ISO 2146 standard for Registry services for libraries and related organisations. ISO 2146 is a standard of great interest to repository communities, and we have already posted on it at some length, including its use as a basis for the Australian National Data Service’s RIF-CS schema, and the IMS LODE registry model. (The latter post includes a UML diagram of the ISO 2146 classes as of its 2008 draft.) Because of this interest, it is worth describing the standard further.
Read the rest of this entry »

Written by Nick Nicholas

May 4, 2010 at 9:41 pm

ADL Registries and Repositories Summit: report

with one comment

The U.S. Advanced Distributed Learning Initiative (ADL) recently convened a Learning Content Registries and Repositories summit (#ADLRR2010) in Alexandria, Va., which Link Affiliates attended. (We have already posted here our position paper for the meeting.)

ADL have been pioneers in developing and disseminating e-learning content; the ADL-Registry and its underlying model CORDRA have been highly influential since their inception in 2003. However, the way information is disseminated and consumed online has changed greatly in the six years since, and the expectations of users have changed along with it. The summit was convened to ask:

  • What has happened in the last 6+ years?
  • What are the current business drivers and requirements?
  • What is the state of practice in registries and repositories for learning content?
  • What are the outstanding business and policy issues?
  • What are the outstanding technical issues?
  • What should we (the broader learning, educational, training, repositories and registries communities) be doing?

The summit was arranged as a sequence of panels, with audience questions. The panels reflected perspectives from US Government agencies, repository initiatives, technical interoperability, Web 2.0 and Semantic Web, and content vendors. The summit also included two breakout sessions, on what the current status and problems are in the learning repository space, and on what future priorities for development should be.

I’ve taken blow-by-blow notes of the workshop at the Interoppo Research blog; ADL has also provided links to other blog posts and tweets discussing the summit, as well as position papers requested for the summit. The summit ended with a polyphony of opinions on what to do next. Looking back, however, there are some clear realisations running through the summit; these have been picked up by Dan Rehak and Damon Regan in their summaries (Rehak: PPT, Regan: PDF), and are consistent with the findings of the subsequent CETISROW event (see Phil Barker’s summary).

This is my own skewed summary of what the summit found:

  • We don’t need more standards.
  • We do need to seek out much more feedback from our users: what problems are we trying to solve?
  • The users don’t come to us, they go to Google (Facebook, Twitter, Flickr).
  • We won’t beat Google (Facebook, Twitter, Flickr) at their own game, and should not try to.
    • They build on Open Web content, we should provide Open Web content.
    • They harness content through Open Web standards (as does the Semantic Web): we should expose content through Open Web standards.
    • They set user expectations on discovery; we should break those expectations only if what we do is visibly better.
  • We have unique value as repositories, as authoritative & targeted providers of content. We should promote this—via Open Web channels.
  • We have defined contexts for interacting with content, and means of gathering user contextual data. That contributes to our unique value: better targeted search, or content push anticipating search.
  • Get metadata from wherever you can (automated, user-provided): users already deal with bad metadata every day, and bad metadata is still better than no metadata.
  • Repository federations are growing, but depend on harmonisation and registry metadata (and still coexist with Google).

The following is a more detailed summary.
Read the rest of this entry »

Comparison, People Australia and Register My Data encoding of parties

with one comment

We have already presented the People Australia and the Register My Data initiatives, and their different approaches to encoding information about parties and their identity. Elsewhere we compare their schemata in detail: a walkthrough of the schemata, followed by a discussion of points of disparity. We first compare People Australia with ISO 2146 proper, before comparing ISO 2146 with RIF-CS.

Our comparison is motivated by the fact that ANDS will be using People Australia as a primary resource for researcher identity. The comparison is specific to the process of importing People Australia metadata into the format required for Register My Data.
Read the rest of this entry »

Written by Nick Nicholas

December 10, 2009 at 5:04 pm

People Australia and Register My Data encoding of parties

leave a comment »

We have seen in a previous post that different representations of identity are possible, because there are different business motivations for knowing a party’s identity. Depending on the use we put the identity to, different kinds of detail need to be gathered about a party.

There are two major initiatives for identifying parties being considered at the moment in Australian e-research. Register My Data aims to improve the discovery of research data through the Australian Research Data Commons, and People Australia aims to improve the discovery of resources by and about people and organisations generally. The initiatives do not address exactly the same business concerns, so the metadata they gather are different.
Read the rest of this entry »

Written by Nick Nicholas

December 4, 2009 at 11:13 am

IMS Global Meeting: Learner Information Services

with one comment

The IMS Global quarterly meeting for late 2009 was hosted by Oracle at their Redwood City campus in California. During the meeting, Oracle and their partners gave a nice demonstration of systems integration using the emerging Learning Information Services specification.

About the LIS specification

The IMS Learning Information Services (LIS) specification supports

sharing of learner and course information between Student Information Systems and Learning Environments

It supersedes the previous IMS specification in this space (IMS Enterprise), which specified data formats for exchanging learning information between systems. LIS takes things a step further: as well as specifying data formats, it defines services for exchanging and synchronising student and course information between systems. This represents a new direction for IMS specifications: a shift toward a service-oriented approach (SOA) rather than a data-oriented approach to system integration.

The LIS specification is large. It defines hundreds of operations in six services for managing updates to data about people, groups, memberships, courses, and outcomes. It also has a bulk data exchange service that supports bulk provisioning of information between systems. Most of the services are defined using an IMS profile of the WS-I suite of specifications (WSDL, SOAP). There is also an LDAP binding for some of the services, and talk of RESTful bindings in future versions.

An implementation of the specification is not required to support each and every service. Neither is an implementation required to support each and every operation. Rather, it is expected that communities will define profiles of the specification and implement those.

The demonstration

The demonstration itself involved an implementation of a higher education profile of the LIS specification. In the demonstration, Oracle used its Campus Solutions to manage information about students, course offerings, classes, grades, etc. in a mythical college. The product was essentially used as a “single source of truth” for student and course information. Read the rest of this entry »

Modelling identity for different purposes

leave a comment »

Registries of data—whether in research, learning, government, or other domains, and whether repositories, data warehouses, Learning Management Systems, or libraries—typically contain metadata not just on the content itself, but on who the data came from. The people responsible for the data are of interest to the people consuming the data; so registries need to record information about them as well. The primary kind of people (or groups of people) that are of interest are the authors of the data—or, where that concept is not as applicable, the contributors or compilers of the data. (Because institutions and organisations can also claim authorship, we prefer to refer to parties rather than people, following the ISO 2146 information model for registries.) But many parties can be responsible for data ending up in a registry, in the form it does; a registry can track a range of parties involved with data, in a range of roles: publisher, editor, validator, annotator, designer.
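The idea of a registry tracking several parties in several roles can be sketched as a small data model. This is only an illustration in Python: the class names `Party`, `Role` and `RegistryEntry`, the sample title, and the sample parties are hypothetical, and are not taken from ISO 2146 or any particular registry; the role names are the ones listed above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    # Roles named in the paragraph above; a real registry may need others.
    AUTHOR = "author"
    CONTRIBUTOR = "contributor"
    COMPILER = "compiler"
    PUBLISHER = "publisher"
    EDITOR = "editor"
    VALIDATOR = "validator"
    ANNOTATOR = "annotator"
    DESIGNER = "designer"


@dataclass(frozen=True)
class Party:
    # A party may be a person or an organisation (hypothetical fields).
    name: str
    is_organisation: bool = False


@dataclass
class RegistryEntry:
    title: str
    involvements: list = field(default_factory=list)  # (Party, Role) pairs

    def add(self, party, role):
        self.involvements.append((party, role))

    def parties_in_role(self, role):
        return [p for p, r in self.involvements if r is role]


entry = RegistryEntry("Rainfall observations")
entry.add(Party("A. Researcher"), Role.AUTHOR)
entry.add(Party("Example University", is_organisation=True), Role.PUBLISHER)
entry.add(Party("B. Checker"), Role.VALIDATOR)
publishers = [p.name for p in entry.parties_in_role(Role.PUBLISHER)]
```

Keeping the role on the link between party and entry, rather than on the party itself, lets the same party appear in different roles on different entries.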

Because it is important to record information about parties, lots of registries record that information, in lots of ways, and to widely varying extents of detail. That means that there are a variety of information models at play for parties in registries. That doesn’t mean that all information models are rigorous and well thought out. Whacking in just the login name of an uploader, as YouTube does, is itself an information model for a party involved with the content, even if the amount of thought that went into it was not overwhelming.

But that does not mean YouTube’s information model is wrong. How much information you capture on parties for a registry depends on what use that information will be put to in the registry. The information model for parties is driven by the business requirements of the registry.

That of course is no great surprise, and working out what information is required is not particularly onerous: people may not put a lot of thought into it when they put registries together, but often enough they don’t need to. Still, especially if you are shopping for standards on representing parties, it is worth spending a couple of minutes working out what you need—and as importantly, what you don’t need.
Read the rest of this entry »

Written by Nick Nicholas

November 16, 2009 at 10:05 am

Approaches to fluid identity: Identifier Assertion Hubs

with 2 comments

We have posted about the fluidity of researcher identity, and approaches to identity which acknowledge that fluidity—the NicNames project’s in particular. That post discussed the profusion of identities authors now have online, and presumed that those identities need to be deduplicated, and gathered together so that all the author’s work can be aligned to the one identity—even if we do not presume a notion of primary identity.

But the researcher does not always want their disparate identities tethered together. The pseudonym has long been a literary convention, dissected by literary historians (and authority files). Now it is a mainstay of the blogosphere, where a fair amount of scholarly writing takes place; and people are well-attuned to the distinction between pseudonymous and anonymous writing. Internet sleuthing can work out the connections between online identities, just as literary scholars have been doing. That doesn’t mean the authors appreciate it if you do. There may be an objective reality about an author’s identity, beyond the fluid consensus of authorities. But fluidity may suit the author just fine, because authors want control over their own identity.

We have mentioned NicNames as an approach to dealing with multiple author identities. The other initiative to mention is an outcome of the UKOLN/DRIVER workshop on international repository infrastructure, held in March. One of the infrastructure tasks the workshop faced was how to establish interoperability between repository identifiers internationally, whether they be identifiers for repository objects, or for authors. At a basic level, repository identifiers from the various available schemes—URL, Handle, PURL, XRI—are already interoperable, since all of them are usable under HTTP. But interoperability is a real problem when it comes to what representations the identifiers resolve to, or how to get a service to operate on identifiers from a huge number of different schemes.

Outside their associated services, though, identifiers are just names associated with things, and the workshop came up with a simple solution to identifier interoperability, which ANDS will take the lead in implementing, as presented at the OAI6 workshop in June. The solution is to have authorities assert that two identifiers are pointing to the same thing. This will allow you to translate queries involving one to queries involving the other, without having to build an extra service layer on top of the existing identifier services.

For author identifiers in particular, the identifiers will be the different tokens associated with researchers by sundry identifier authorities—Elsevier and Thomson, national libraries, grants agencies, institutions. And the authorities asserting equivalence between the identifiers will be national hubs (the UN doesn’t yet have the requisite infrastructure). The assertions themselves can be simple RDF statements of equivalence: katherine.mansfield@hogwarths.edu.au owl:sameAs kbeauchamp@unseen.ac.uk .

So the existing identifiers for authors are left alone; there is no unrealistic proposal to substitute them all with a Single Author Identifier. A layer is imposed over these identifiers, to deduplicate them. And that layer is decentralised, to the national level, because that is rather more feasible than a global solution.
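A minimal sketch, in Python, of how such a deduplication layer might translate a query on one identifier into a query on all its equivalents. The class `AssertionHub` and its methods are hypothetical and do not describe the ANDS implementation; the sketch simply treats each sameAs assertion as merging two identifiers into one equivalence class (a union-find structure), leaving the identifiers themselves untouched.

```python
class AssertionHub:
    """Hypothetical hub holding sameAs assertions between identifiers."""

    def __init__(self):
        self._parent = {}

    def _find(self, ident):
        # Register unseen identifiers as their own representative.
        self._parent.setdefault(ident, ident)
        # Walk up to the representative, shortening the path as we go.
        while self._parent[ident] != ident:
            self._parent[ident] = self._parent[self._parent[ident]]
            ident = self._parent[ident]
        return ident

    def assert_same_as(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def equivalents(self, ident):
        """All identifiers asserted (directly or transitively) to be the same."""
        root = self._find(ident)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


hub = AssertionHub()
hub.assert_same_as("katherine.mansfield@hogwarths.edu.au", "kbeauchamp@unseen.ac.uk")
same = hub.equivalents("kbeauchamp@unseen.ac.uk")
```

A search service could then run the query once per identifier in `same`, without any change to the services behind the original identifiers.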

A crucial insight is that these national hubs are still accountable to the researchers, unlike the authority file approach. And they will allow researchers to dissociate online identities, if that’s what they want. So if Kath Mansfield does not want the publications of Kate Jackson associated with her, she can get her national hub to assert instead katherine.mansfield@hogwarths.edu.au owl:differentFrom kbeauchamp@unseen.ac.uk. She can do that if the internet sleuthing associating the two identities is wrong. She can also do it if it turns out to be right: the researcher is still empowered to control the representation of their own online identity.

To some extent. The national hubs are authorities, in the plural, and there may be another national hub insisting they are the same person after all. And that brings us back to consensus-driven wikiality, as we alluded to in the preceding post. There are authorities to assert two identities are the same, and those authorities are necessary to the scholarly process. But the identities of authors are subject to review and revision—just like the research they publish.
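The possibility of hubs disagreeing can be illustrated with a short Python sketch. It assumes assertions are collected as (hub, subject, predicate, object) tuples; the hub names `hub-au` and `hub-uk` and the function `find_conflicts` are hypothetical. The sketch only reports pairs of identifiers that one hub declares the same and another declares different; it does not decide who is right.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"
DIFFERENT_FROM = "owl:differentFrom"


def find_conflicts(assertions):
    """Return identifier pairs on which hubs make contradictory assertions.

    `assertions` is an iterable of (hub, subject, predicate, object) tuples.
    """
    verdicts = defaultdict(lambda: defaultdict(set))
    for hub, subj, pred, obj in assertions:
        # Both predicates are symmetric, so the order of the pair is ignored.
        verdicts[frozenset((subj, obj))][pred].add(hub)

    conflicts = {}
    for pair, by_pred in verdicts.items():
        if by_pred[SAME_AS] and by_pred[DIFFERENT_FROM]:
            conflicts[tuple(sorted(pair))] = {
                SAME_AS: sorted(by_pred[SAME_AS]),
                DIFFERENT_FROM: sorted(by_pred[DIFFERENT_FROM]),
            }
    return conflicts


assertions = [
    ("hub-au", "katherine.mansfield@hogwarths.edu.au", DIFFERENT_FROM, "kbeauchamp@unseen.ac.uk"),
    ("hub-uk", "kbeauchamp@unseen.ac.uk", SAME_AS, "katherine.mansfield@hogwarths.edu.au"),
]
conflicts = find_conflicts(assertions)
```

How a consumer resolves such a conflict (trusting the researcher's own hub, or the majority of hubs) is a policy question the sketch deliberately leaves open.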

Even if *they* know who they are.

Written by Nick Nicholas

October 23, 2009 at 7:12 am

Fluid identity in repositories

with one comment

The business of a library is to establish authoritative identities for the works they make available. That is why libraries put together authority files, as unambiguous names for authors: those are the names books are indexed under, and searched under in library catalogues. The advantages of having an unambiguous identity for an author are obvious. A researcher who wants credit for their work (or the department whose funding depends on it) doesn’t want credit to go to another researcher with the same name. Anyone collecting royalties on their published work will want their identity to be unambiguous as well, though not all fields of research make it as worthwhile to chase after residuals.

Library users also appreciate disambiguation: if I am looking for works by or about the contemporary German novelist Richard Wagner (1952- ), I’d like to avoid the deluge of works by or about the slightly more famous German composer Richard Wagner (1813-1883). And a library catalogue is being helpful when it includes the dates of birth to differentiate between the two Richard Wagners—just as Wikipedia is, when it refers to Richard_Wagner_(novelist).
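As a rough illustration of why the dates help, here is a Python sketch in which a name plus life dates forms the display heading for an authority record. The `AuthorityRecord` class and the heading format are hypothetical simplifications, not an actual cataloguing rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthorityRecord:
    name: str
    born: Optional[int] = None
    died: Optional[int] = None

    def heading(self) -> str:
        # Without a birth year, the name alone is all we can show.
        if self.born is None:
            return self.name
        died = "" if self.died is None else str(self.died)
        return f"{self.name} ({self.born}-{died})"


def search(records, name):
    """Return the distinct headings for every record sharing `name`."""
    return sorted({r.heading() for r in records if r.name == name})


records = [
    AuthorityRecord("Richard Wagner", 1813, 1883),
    AuthorityRecord("Richard Wagner", 1952),
]
headings = search(records, "Richard Wagner")
```

With the dates, a search on the bare name returns two distinguishable headings; without them, the two records would collapse into one indistinguishable heading.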

Making those kinds of distinctions depends on having good enough metadata on the authors. If you’ve published a dead-tree book in the past few decades, your national library has been in cahoots with your publisher to make sure they have that metadata. *I* don’t remember giving the Library of Congress my year of birth, but it avoids a car dealer in Florida getting credit for any books I’ve written. (See Libraries Australia.)
Read the rest of this entry »

Written by Nick Nicholas

October 21, 2009 at 6:54 am

IMS LODE: Discovery through Collection Descriptions

with 2 comments

We have already discussed our development activities around the IMS LODE activity for discovery of learning objects. However, what we have described so far presupposes that learning object descriptions are already available to a user, because the user can access those descriptions in their local repository, or through a repository federation they have access to.

But there will not in the foreseeable future be a Super-Federation of all education repositories in the world, nor indeed does there need to be. Rather than unleashing users on all e-learning repositories in the world, it makes more sense for users to discover learning object collections that they don’t already have access to—but which are of direct interest to them. So users should be able to target their searches for content to the collections which will pay off, instead of doing an inefficient, iterative blanket search across Everything.
Read the rest of this entry »

Building e-Humanities infrastructure

leave a comment »

Reflections on e-Humanities workshop, University of Melbourne eScholarship Research Centre, 2009-08-12

Building generic ICT infrastructure to support humanities research seems to be a difficult task. The standard approach is to

  1. collect a bunch of usage stories from different communities
  2. infer common business processes based on those stories
  3. build infrastructure that supports those business processes

The theory is that a community would then take the generic infrastructure and customise it to meet their particular needs. The problem is that there is something about the humanities that makes generic business processes hard to find.

We’ve blogged previously about the Project Bamboo approach to finding generic e-Humanities business processes. Project Bamboo certainly had difficulty converting its scholarly narratives into common recipes. Maybe there aren’t any processes common to the different strands of humanities research? Unlikely. Rather, the fierce independence of humanities researchers makes it difficult to infer commonalities. Suggesting to a humanities researcher that she might have a research process in common with her peers carries with it an implication that her research is not unique. Even uttering the phrase “business process” can put humanities researchers offside (some of them conflate business and commerce).

In this context, there was a little nervousness leading up to the Interconnections and Services in the eHumanities: Reflecting on Current Initiatives workshop hosted by the University of Melbourne eScholarship Research Centre on 12 August.
Read the rest of this entry »
