Linking research & learning technologies through standards

Link Affiliates Blog

Archive for the ‘Resource discovery and access’ Category

The Visible Archive

The e-Research Australasia conference, which recently concluded in Sydney, had data visualisation as one of its major focuses. George Djorgovski’s plenary talk [abstract, presentation] on virtualisation in science put the urgency of visualisation in stark terms: as the models that science is called on to build get more and more complex, not only will most of the data being gathered (in the “data avalanche”) never be seen by a human eye; even the models needed to make sense of the data are starting to surpass human understanding, and can only be turned over to machines to deal with.

Someone will greet the notion that we will turn over our comprehension of the world to machines as a challenge. Your Humble Correspondent is more inclined to think of SkyNet.

The highlight of the conference for me, at least, was Mitchell Whitelaw’s Exploring Archival Collections with Interactive Visualisation, presenting two prototype visualisations of the contents of the National Archives of Australia. The first visualisation gives an overview of all 57,000 series in the National Archives, according to their sizes (in number of items and shelf-space), and starting date. That visualisation allows critical dates to emerge from the data—the disproportionate importance of 1901, 1914 and 1939 in Australian archive-gathering, for example; and the visualisation also highlights relations between different series graphically. But although getting 57,000 complex data points into a single 2-D diagram has its appeal, there were no real surprises there.

I found the second visualisation much more interesting. It uses tag clouds for the titles of the 65,000 records contained in a single archive series. Tag clouds have justly been called the mullets of Web 2.0 (as far back as 2005; strange to think there was a Web 2.0 as long ago as 2005!). But Whitelaw’s use of tag clouds, helped along with plenty of Java, is the most intelligent I’ve seen in a while.

One very handy piece of interactivity is that you can choose to ignore particular tags which are crowding out their peers in the cloud; if in a particular archive half the titles contain “Naturalisation” or “Citizenship”, then in most contexts those words will have no more interest for you as a researcher than instances of “the” or “and”: they become stop-words in the context of that archive. Choosing to eliminate those recurring words reveals the real diversity of topics in the archive, startlingly. The effect is like putting on glasses: fuzzy tags on the periphery of the tag cloud, blotted out by one or two stop words, suddenly come into focus.

But the critical distinction in what Whitelaw does is that it can explore collocations of words in the titles. Clicking a tag draws lines to all the other tags it co-occurs with in the same title: the more frequent the co-occurrence, the thicker the line. So you can straightforwardly get a sense of what contexts a particular word comes up in, with an accompanying bar chart giving the chronological distribution of those contexts. As Whitelaw shows, clicking on “Darwin” in his example archive draws prominent lines to “1937” and “cyclone”, which burbles up out of the data the fact of the 1937 Darwin Cyclone [PDF]. The visualisation allows the user to drill down to digitisations of the individual archive records that the tag cloud collocations expose. (The metadata to the 1937 cyclone archives are online.)

Which all means that intelligent navigation of tags and tag collocations can expose stories directly in the documents they are drawn from, without any prepping or mediation. All done with a highly engaging interface.
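
The stop-word filtering and collocation counting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Whitelaw's Java code: the sample titles, the function names and the whitespace tokenisation are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample titles; not drawn from the actual archive series.
TITLES = [
    "Darwin cyclone 1937 damage report",
    "Darwin cyclone 1937 relief fund",
    "Naturalisation certificate Darwin",
    "Naturalisation certificate Sydney",
]


def tokenize(title, stop_words):
    """Lower-case a title and drop any words the user chose to ignore."""
    return {w for w in title.lower().split() if w not in stop_words}


def tag_counts(titles, stop_words=frozenset()):
    """Count how many titles each word appears in (the tag cloud weights)."""
    counts = Counter()
    for title in titles:
        counts.update(tokenize(title, stop_words))
    return counts


def cooccurrences(titles, stop_words=frozenset()):
    """Count how often each pair of words shares a title (the line thickness)."""
    pairs = Counter()
    for title in titles:
        words = sorted(tokenize(title, stop_words))
        pairs.update(combinations(words, 2))
    return pairs


def neighbours(tag, pairs):
    """Return the words that co-occur with `tag`, most frequent first."""
    linked = Counter()
    for (a, b), n in pairs.items():
        if a == tag:
            linked[b] += n
        elif b == tag:
            linked[a] += n
    return linked.most_common()
```

Calling `neighbours("darwin", cooccurrences(TITLES, {"naturalisation"}))` on this toy data ranks “1937” and “cyclone” ahead of the other words, which is the same kind of signal the thick lines give in the visualisation.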

Whitelaw has blogged his visualisation work at The Visible Archive, which includes downloadable Java for both visualisations (with canned data). The Darwin 1937 Cyclone is not the only fact that emerges out of the tag clouds, and we encourage you to go exploring yourselves.

Written by Nick Nicholas

November 23, 2009 at 10:10 am

Sharing learning resources in the VET sector: The LORN way

The Learning Object Repository Network (LORN) has been a long time in gestation, sensibly so. The Australian Flexible Learning Framework spawned LORN using a measured and standards-based approach.

“In 2003, the Framework began developing structures and standards for managing access to quality electronic learning resources across Australia’s VET sector. In 2004 the Framework established LORN to facilitate exchange of learning objects between states and territories, based on a model of trust, cooperation and interoperability. LORN currently enables the sharing and sale of learning resources that support flexible delivery across the VET sector.” (from LORN website)

Much of what LORN has developed has leveraged another VET infrastructure service, AEShareNet, not least the standards-based licensing approach.

So what is LORN?

The Learning Object Repository Network (LORN) is an easy to use portal that allows teachers and trainers to access quality resources for the VET sector.

LORN consists of:

  • repository owner organisations that hold learning resources they are willing to share across the VET sector, and
  • consumer access providers (CAPs) that use the LORN search to display results within their organisation’s website.

So basically anyone can access and download learning objects; but in order to “advertise” that you have objects available to share, you must conform to some standards and specifications, both technical and non-technical.

Repository owners who participate in LORN have agreed to the following principles:

  • Commitment to working with other members—in a spirit of cooperation—to advance the interests of the whole sector especially in relation to gaining efficiencies from sharing teaching and learning resources.
  • Commitment to exposing a reasonable amount of content so that using OAI harvesting in the federation of repositories is a rewarding experience for the consumer.
  • Agreement to adhere to a minimum set of business and technical specifications.
  • Agreement to licence learning objects to users to be reusable within the terms of the associated digital rights. Learning objects in the repositories should correspond with the AEShareNet-U (Unrestricted), AEShareNet-S (Share and Return), AEShareNet-P (Preserve Content) and AEShareNet-FfE (Free for Education purposes) licences.

Technical specifications include:

  • Maintaining a repository of learning objects relevant to the VET sector
  • Providing a harvest file that includes descriptions of all learning objects and other resources using Vetadata (agreed VET specific metadata)
  • Using the AEShareNet instant licences (FfE, U, S & P) and the immediate C licences
  • Providing a pricing file in the approved format as required for the purpose of transacting immediate C licences.
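
To give a feel for the harvesting side, here is a minimal sketch of an OAI-PMH `ListRecords` request and of reading record identifiers out of a response. It assumes a repository exposes a live OAI-PMH endpoint and uses the `oai_dc` metadata prefix that every OAI-PMH repository supports; the base URL, the sample response and the function names are hypothetical, and LORN's actual harvest file format and Vetadata prefix are not shown here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for a (hypothetical) repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hand-written sample response, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repo.example.org:lo-001</identifier>
        <datestamp>2009-11-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:repo.example.org:lo-002</identifier>
        <datestamp>2009-11-15</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def record_identifiers(xml_text):
    """Return the identifier of every record header in a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    return [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]
```

A harvester would fetch the URL from `list_records_url(...)`, pass the response body to `record_identifiers(...)`, and then map each record's metadata into the federated search index.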

I have been a member of various LORN reference groups since its inception in 2003, as part of different roles and contracts I have held. It has been fascinating to be involved in its slow (sometimes frustratingly slow) and steady progress.

So what are the upsides and where are the issues and where is it going?

The approach is basically driven by a bottom up agreement to cooperate and share. There is a small amount of national infrastructure funding that has enabled the development so far. But really the commitment put in by the repository owners has been the key to its growth and sustainability. And it is amazing that there has been so much agreement albeit hard won. The end result is a whole heap of learning objects accessible by teachers and trainers across the VET sector, which might otherwise have remained hidden within one institution or one jurisdiction. At the same time there has been a strenuous effort to keep it simple for the teachers or learning object seekers. The repositories do the hard yards behind the scenes to keep it simple and consistent for those looking to access learning resources. 

But with any such service there are issues.

One fundamental challenge has been the need to allow repositories to charge for learning objects. Basically only a relatively small number of objects would be released across the VET sector if only “free to access and use” resources were allowed. Models do not really exist across the VET sector for freely sharing resources across public and private training organisations, especially when there is both stiff competition between training providers, and fully commercial exploitation of resources in terms of both course provision and publishing. So a simple thing like charging for a resource sets up a huge challenge for LORN, in terms of providing simple and immediate access via micro payments.

Other emerging issues include:

  • The desire from repository owners to make non-downloadable learning resources accessible via LORN
  • The need to develop a sustainable business model in terms of who pays for the ongoing maintenance and further development of the LORN infrastructure (at present it is project funded)
  • The need to provide access to a larger variety of repositories including commercial publishers
  • The need to cater for (smaller) repository owners who might struggle to meet the technical specifications entry requirements

So does the LORN model have any relevance to other sectors? 

Well first of all in order to develop and deliver the service, LORN has had to tackle key challenges that any resource sharing approach would need to tackle, including:

  • Agreed metadata standards
  • Agreed and consistent licensing
  • Agreed federated harvesting/search protocols
  • Persistent identifiers for materials
  • Authentication for users

It also provides a model for collaborative governance, especially across the public VET jurisdictions.

In designing a pay-per-access option LORN has provided a methodology for ensuring that learning resources can be “shared”, albeit with money sometimes changing hands. This sharing can occur across public and private and between private institutions. Mind you, this is not uncontested. There is a school of thought that says the teachers accessing learning resources should not be faced with barriers of “pay before access”. This should be sorted at the macro rather than the micro level. In other words, jurisdictions, institutions or repository publishers provide access to any individual teacher based on a bulk arrangement, either pre or post facto, for particular institutions or jurisdictions. (A simple Trust Federation may help in this regard.)

For 2010 LORN has a few key tasks to drive things forward, including: finalising the implementation of persistent identifiers, moving towards a smoother authentication approach, and incorporating non-downloadable learning resources into the network.

At the same time AEEYSOC (Australian Education, Early Childhood Development and Youth Affairs Senior Officials Committee) is apparently grappling with the importance of a national eLearning architecture plan for digital resource discovery, development, storage and sharing in the school sector. LORN might just have paved the way for such an approach with its hard won successes over the last six years. If nothing else it demonstrates that sharing learning resources was not meant to be easy.

 

Written by uldm

November 19, 2009 at 8:32 pm

Fluid identity in repositories

The business of libraries is to establish authoritative identities for the works they make available. That is why libraries put together authority files, as unambiguous names for authors: those are the names books are indexed under, and searched under in library catalogues. The advantages of having an unambiguous identity for an author are obvious. A researcher who wants credit for their work (or the department whose funding depends on it) doesn’t want credit to go to another researcher with the same name. Anyone collecting royalties on their published work will want their identity to be unambiguous as well, though not all fields of research make it as worthwhile to chase after residuals.

Library users also appreciate disambiguation: if I am looking for works by or about the contemporary German novelist Richard Wagner (1952- ), I’d like to avoid the deluge of works by or about the slightly more famous German composer Richard Wagner (1813-1883). And a library catalogue is being helpful when it includes the dates of birth to differentiate between the two Richard Wagners—just as Wikipedia is, when it refers to Richard_Wagner_(novelist).

Making those kinds of distinctions depends on having good enough metadata on the authors. If you’ve published a dead-tree book in the past few decades, your national library has been in cahoots with your publisher to make sure they have that metadata. *I* don’t remember giving the Library of Congress my year of birth, but it avoids a car dealer in Florida getting credit for any books I’ve written. (See Libraries Australia.)

Written by Nick Nicholas

October 21, 2009 at 6:54 am

IMS LODE: Discovery through Collection Descriptions

We have already discussed our development activities around the IMS LODE activity for discovery of learning objects. However, what we have described so far presupposes that learning object descriptions are already available to a user, because the user can access those descriptions in their local repository, or through a repository federation they have access to.

But there will not in the foreseeable future be a Super-Federation of all education repositories in the world, nor indeed does there need to be. Rather than unleashing users on all e-learning repositories in the world, it makes more sense for users to discover learning object collections that they don’t already have access to—but which are of direct interest to them. So users should be able to target their searches for content to the collections which will pay off, instead of doing an inefficient, iterative blanket search across Everything.

IMS LODE: Exchanging Objects

Over the last few months, the Australian Digital Futures Institute has been working with Link Affiliates to test the specifications coming out of the IMS Learning Object Discovery and Exchange activity. We have already posted about our testing work; now that our work is wrapping up, this is a summary of what we have done. This post goes into the work done on discovery of individual learning objects.

DEEWR has funded Link Affiliates to participate in the IMS activity on behalf of the Australian schools sector, with the aim of facilitating discovery and retrieval of learning content from repositories, by profiling standards for searching and harvesting learning content, and learning content repositories. That leads to better use and reuse of available resources in the domain, and is one of the areas prioritised by the Digital Education Revolution. Our main partners in the activity have been European Schoolnet, which is pursuing large-scale exchange of objects between repositories through the ASPECT project (see more details), and TÉLUQ, the distance education arm of the Université du Québec à Montréal.

The issues IMS LODE is seeking to address involve both search queries and search results.

Written by Nick Nicholas

September 14, 2009 at 12:37 pm

IMS Global Meeting: Curriculum Standards

We have already mentioned that the recent quarterly IMS meeting concentrated on developments in Common Cartridge, and how Common Cartridge is being aligned with other initiatives underway in IMS. One of those initiatives is Learning Tools Interoperability (LTI), which was the subject of a developer workshop there.

The other major initiative involving Common Cartridge is Curriculum Standards, which are being added to Common Cartridge as metadata. We have also discussed here the importance of machine-readable curricula, and how they can be exploited as metadata for learning objects, to enable more focused discovery of learning objects and better alignment of resources to a school’s curriculum. Including Curriculum Standards in Common Cartridge addresses these concerns expressly.

Building e-Humanities infrastructure

Reflections on e-Humanities workshop, Melbourne e-Research Scholarship Centre, 2009-08-12

Building generic ICT infrastructure to support humanities research seems to be a difficult task. The standard approach is to

  1. collect a bunch of usage stories from different communities
  2. infer common business processes based on those stories
  3. build infrastructure that supports those business processes

The theory is that a community would then take the generic infrastructure and customise it to meet their particular needs. The problem is that there is something about the humanities that makes generic business processes hard to find.

We’ve blogged previously about the Project Bamboo approach to finding generic e-Humanities business processes. Project Bamboo certainly had difficulty converting its scholarly narratives into common recipes. Maybe there aren’t any processes common to the different strands of humanities research? Unlikely. Rather, the fierce independence of humanities researchers makes it difficult to infer commonalities. Suggesting to a humanities researcher that she might have a research process in common with her peers carries with it an inference that her research is not unique. Even uttering the phrase “business process” can put humanities researchers offside (some of them conflate business and commerce).

In this context, there was a little nervousness leading up to the Interconnections and Services in the eHumanities: Reflecting on Current Initiatives workshop hosted by the University of Melbourne eScholarship Research Centre on 12 August.

Project Bamboo

Project Bamboo is an Andrew W Mellon Foundation-sponsored project that aims to dramatically improve the way digital technologies are used in humanities research, with a particular focus on shared services infrastructure. The main participants are humanities departments and libraries in major US universities such as Chicago and Berkeley, but overseas universities including Cambridge, Oxford, ANU and the University of Melbourne are represented.

Founded in March 2008, Project Bamboo has run five workshops to turn input from the e-Scholarship community into a proposal which it will submit to the Andrew W Mellon Foundation at the end of 2009. The proposal will describe a 7-10 year process, but will focus heavily on implementation in years 1 and 2.

As the project has developed, its thinking has evolved. The project began in more optimistic financial conditions, and implicitly supported a very wide agenda to be realised over ten years. This includes shared services, an extensive, ongoing business analysis model (scholarly narratives, recipes, activities in theme groups), and a marketplace for goods, services for labour (Bamboo Exchange). The project argues that with a solid service-based infrastructure supporting reusable applications and tools across different institutions, the cost and effort of using technology in humanities research will be reduced, with many new benefits. With the current global financial situation, the project’s immediate scope has become focused on two parts:

  • The Bamboo Services Platform is a cloud-based environment which will host shared services useful to researchers in e-Humanities. They will include existing services and applications re-engineered for the new platform, as well as novel services created to fill identified niches.
  • The Bamboo Commons is a broad discovery mechanism that allows Bamboo participants to find Bamboo services, tools, business analysis – and each other.

Link Affiliates has submitted two recipes and is using the e-Framework to model solutions to the problems they pose. The e-Framework with its principled binding of services analysis to business requirements is well positioned to offer a structured approach to the problem of interoperability of services, tools, content and business processes within the digital humanities sector.

National Curriculum, machine-readable

Establishing a National Curriculum in Australia has proven to be an elusive goal for the past forty years, for a variety of reasons. The Federal Government has decided to go ahead with establishing a National Curriculum, and has established the National Curriculum Board, to make it a reality [EDIT: now under the Australian Curriculum, Assessment and Reporting Authority (ACARA)]. The Board’s work is well underway: the Shaping Papers and Framing Papers for the curriculum, in the priority areas of English, maths, the sciences and history, have already been subject to stakeholder feedback and revised. The Board is now embarking on the work of scoping, sequencing, and filling in the curricula proper, and aims to publish the curricula after national consultation, in June through September 2010.

Although there has not been a national curriculum in Australia to date, there has already been agreement among the States as to the broad goals of national education, as affirmed by MCEETYA; these include the Melbourne Declaration on Educational Goals for Young Australians in December 2008, and the 1999 Adelaide Declaration on National Goals for Schooling in the Twenty-First Century before it. The National Assessment Program, benchmarking student performance in Literacy and Numeracy, became nationwide in 2008, but national benchmarking of State tests had already been in place for a decade before that. Independently, MCEETYA has adopted nationwide Statements of Learning for English, Maths, Science, Civics, and ICT since 2003, and these have served to bring the State curricula into at least some alignment. So there are already foundations in place for the National Curriculum Board to build on.

Curriculum objectives and outcomes are stated in prose paragraphs. The entire K-10 or K-12 curriculum can end up being quite a weighty book, given the detail in which a student’s thirteen years of education need to be laid out. By way of comparison, take the example of mathematics:

Even the broad, Foundation Statement summaries of desired outcomes are groups of multiple paragraphs per level. Compare:

But as teaching is transformed in the Information Age, the paragraphs need to be leveraged to maximise their use, beyond the realm of PDFs. In particular, teachers will not want to be guided by the curriculum just to plan what they will teach this semester. They will also want help in working out what resources they can use, to teach those curriculum objectives. With the quantity of resources available to them, and increasing all the time, teachers do not expect to have to sift through the descriptions of the resources one by one, to work out which fits the curriculum objectives best. They quite reasonably expect someone to have done so for them already, wherever the resources were registered and their metadata crafted. Given a curriculum objective, they should be able to search for resources which match it.

The expectation that curriculum objectives should drive resource discovery holds, whether there is a national curriculum, or lots of different state curricula. A single set of curriculum objectives makes things much easier for realising such discovery, because resources need to be searchable against only a single set of objectives, rather than eight. But the most critical requirement for resource discovery is not to have a single set of curriculum objectives per country: there is no national curriculum on the horizon for the United States, to take the most obvious example. The critical requirement is to have the curriculum objectives be machine readable. Once the paragraphs of learning objectives can be distilled into assertions to tag content with, content can be discovered through those tags.

If those assertions are machine readable, then the capabilities of the Semantic Web and Natural Language Processing can be brought to bear, to automate the tagging of content as much as possible. Making the tagging efficient is in itself a powerful argument for a consistent national curriculum, since it needs only deal with a single authoritative set of assertions about what students should learn. The efficiencies to be gained are obvious for the publishers and registry providers, who make the content available, as well as the teachers and administrators looking for the content.

Machine readable curriculum objectives have been a long running concern of the e-learning community. The disparity between State curricula, and the need to make content searchable against each state’s objectives, led to the development in 2003 of a Curriculum Organiser component, incorporated in the Basic E-Learning Tool Set (BELTS) tool by The Le@rning Federation. The challenge has been particularly acute in the US, as seen, and has led to initiatives like the Gateway to 21st Century Skills and the Achievement Standards Network. The ASN contains, as a resource, all 51 US curricula and several national curricula in machine-readable form. Based on that resource, members of the Gateway network can build tools to better navigate the assertions in resource discovery and learning pathways, as well as curricula. This leads to the paragraphs becoming active and flexible building blocks: they can be sequenced differently, broken down differently, and matched across streams to the same content, leading to more creative and responsive learning.

As the National Curriculum Board sets about its work of creating a new set of assertions, there is an opportunity to make these assertions dynamic, integrating them into the ICT-driven workflows that enable 21st century learning. This is especially timely with a national curriculum, which can drive the efficiencies required in those ICT-driven workflows most effectively.

Written by Nick Nicholas

July 20, 2009 at 6:39 pm

ANDS Persistent Identifier Service

Hyperlinks break, and there has long been a realisation that there is information online whose hyperlinks should not break. Repositories have been set up to ensure the ongoing availability of online information; but like any online data source, repositories too change servers and structures and platforms, and their hyperlinks too break. This is a problem that affects e-research as much as it does e-libraries and e-learning. With the increasing move to publish and cite research data online, the issue is becoming even more keenly felt.

The repository community has come to accept that persistent identifiers help deal with this issue; they do this through some mechanism of redirection to the current network location of a resource. In itself, this does not solve the issue: it’s no good having a persistent identifier redirect to the current network location, if the redirection is not updated, or the identifier server is down, or the data is tampered with. But removing the dependency on current location does at least allow procedures to be put in place, which can prevent foreseeable disruptions to the persistent access to a resource.
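
The redirection mechanism can be pictured as a lookup table that someone has to keep current. The following is a toy sketch under that assumption only; the class, its method names and the sample identifier are hypothetical, and it does not reflect how the Handle System or any real resolver is implemented.

```python
class IdentifierResolver:
    """Maps persistent identifiers to the current network location of a resource."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        """Create a new identifier; refuse to silently overwrite an existing one."""
        if identifier in self._locations:
            raise ValueError(f"{identifier} is already registered")
        self._locations[identifier] = url

    def update(self, identifier, new_url):
        """Point an existing identifier at the resource's new location."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier):
        """Return the current location; citations keep using the identifier."""
        return self._locations[identifier]


# The repository moves servers, but the cited identifier stays the same.
resolver = IdentifierResolver()
resolver.register("example/1234", "http://old-server.example.org/data/42")
resolver.update("example/1234", "http://new-server.example.org/datasets/42")
```

The sketch also shows the weak point noted above: if nobody calls `update` when the resource moves, `resolve` keeps returning a dead location.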

The PILIN project was tasked with exploring the policy and technological issues behind persisting identifiers; as a result, it produced quite a bit of text, and some code. Institutions that already host persistent identifiers can use these outputs to firm up their infrastructure. But PILIN was not an operational project, and it could not build a sustainable, backbone identifier infrastructure. That is the job for a national service supporting access to data: individuals and institutions should be able to rely on such a service to keep their identifiers around in the long run, even if they cannot host the identifiers themselves.

The Australian National Data Service fits that description, and has acknowledged from the beginning that providing persistent identifiers is a core part of its business. It has already piloted a persistent identifier service based on the Handle System, and is going live with it. The service is intended for B2B use through XML over HTTP, rather than manually administering each identifier: this encourages users to automate solutions to maintaining identifiers, which is sounder practice for ensuring that identifiers really are kept up to date.

The Link Affiliates team has also been writing training materials on persistent identifiers, building on the PILIN project work, and concentrating in particular on the range of policies needed to ensure persistence. Although ANDS is providing the infrastructure for hosting the identifiers, much of the policy implementation still has to happen on the side of the researchers and data managers, who have requested the identifiers and are responsible for keeping them up to date.


A brief note, by the by, that I have posted elsewhere on the UKOLN International Repository Workshop, and its work on persistent identifiers. The workshop came up with a model for identifier interoperability, as discoverable assertions of equivalence or difference of different identifier names. This is humble and unexciting enough to be feasible (particularly for author identity, a fraught issue which has already engaged much discussion).
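
One way to picture "discoverable assertions of equivalence or difference" is as a small store of pairwise statements that can be queried. The sketch below is an assumption-laden illustration, not the workshop's model: the class name, the methods and the identifier strings are all hypothetical, and equivalence is simply treated as transitive.

```python
class EquivalenceAssertions:
    """Stores 'same as' and 'different from' statements between identifier names."""

    def __init__(self):
        self._same = {}        # identifier -> identifiers asserted equivalent to it
        self._different = []   # (a, b) pairs asserted to name different things

    def assert_same(self, a, b):
        self._same.setdefault(a, set()).add(b)
        self._same.setdefault(b, set()).add(a)

    def assert_different(self, a, b):
        self._different.append((a, b))

    def equivalents(self, identifier):
        """Follow 'same as' assertions transitively from one identifier."""
        seen = {identifier}
        stack = [identifier]
        while stack:
            for other in self._same.get(stack.pop(), ()):
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        return seen

    def conflicts(self):
        """List 'different' assertions that contradict the 'same as' chains."""
        return [(a, b) for a, b in self._different if b in self.equivalents(a)]
```

Keeping the assertions this plain is what makes the approach feasible: each party only has to publish what it believes about pairs of names, and contradictions (as reported by `conflicts`) can be surfaced rather than resolved centrally.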

Written by Nick Nicholas

June 22, 2009 at 10:45 am
