Linking research & learning technologies through standards

Link Affiliates Blog

Archive for November 2009

The Visible Archive

The e-Research Australasia conference, which recently concluded in Sydney, had data visualisation as one of its major focuses. George Djorgovski’s plenary talk [abstract, presentation] on virtualisation in science posed the urgency for visualisation in stark terms: as the models that science is called on to build get more and more complex, not only will most data being gathered (in the “data avalanche”) never be seen by human eye; even the models needed to make sense of the data are starting to surpass human understanding, and can only be turned over to machines to deal with.

Someone will greet the notion that we will turn over our comprehension of the world to machines as a challenge. Your Humble Correspondent is more inclined to think of SkyNet.

The highlight of the conference for me, at least, was Mitchell Whitelaw’s Exploring Archival Collections with Interactive Visualisation, presenting two prototype visualisations of the contents of the National Archives of Australia. The first visualisation gives an overview of all 57,000 series in the National Archives, according to their sizes (in number of items and shelf-space), and starting date. That visualisation allows critical dates to emerge from the data—the disproportionate importance of 1901, 1914 and 1939 in Australian archive-gathering, for example; and the visualisation also highlights relations between different series graphically. But although getting 57,000 complex data points into a single 2-D diagram has its appeal, there were no real surprises there.

The second visualisation I found much more interesting. It uses tag clouds for the titles of the 65,000 records contained in a single archive series. Tag clouds have justly been called the mullets of Web 2.0 (as far back as 2005; strange to think there was a Web 2.0 as long ago as 2005!). But Whitelaw’s use of tag clouds, helped along with plenty of Java, is the most intelligent I’ve seen in a while.

One very handy piece of interactivity is that you can choose to ignore particular tags which are crowding out their peers in the cloud. If in a particular archive half the titles contain “Naturalisation” or “Citizenship”, then in most contexts those words will have no more interest for you as a researcher than instances of “the” or “and”: they become stop-words in the context of that archive. Choosing to eliminate those recurring words reveals the real diversity of topics in the archive, startlingly. The effect is like putting on glasses: fuzzy tags on the periphery of the tag cloud, blotted out by one or two stop-words, suddenly come into focus.
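The stop-word idea can be sketched in a few lines of Python. This is only an illustration, not Whitelaw’s Java code: the titles are invented, and `tag_counts` and its `ignored` parameter are hypothetical names. It counts how many titles each word appears in, then recounts with the dominant words switched off.

```python
from collections import Counter
import re

def tag_counts(titles, ignored=frozenset()):
    """Count how many titles each word appears in, skipping ignored tags."""
    counts = Counter()
    for title in titles:
        # Use a set so a word repeated within one title is counted once.
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        counts.update(words - set(ignored))
    return counts

# Invented titles, for illustration only.
titles = [
    "Naturalisation certificate Smith",
    "Naturalisation application Darwin",
    "Naturalisation Darwin cyclone",
    "Citizenship application Jones",
]

all_tags = tag_counts(titles)
filtered = tag_counts(titles, ignored={"naturalisation", "citizenship"})

print(all_tags.most_common(3))
print(filtered.most_common(3))
```

With the archive-specific stop-words ignored, the remaining tags are ranked only against each other, which is what lets the smaller ones come into focus.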

But the critical distinction in what Whitelaw does is that it can explore collocations of words in the titles. Clicking a tag draws lines to all the other tags it co-occurs with in the same title: the more frequent the co-occurrence, the thicker the line. So you can straightforwardly get a sense of what contexts a particular word comes up in, with an accompanying bar chart giving the chronological distribution of those contexts. As Whitelaw shows, clicking on “Darwin” in his example archive draws prominent lines to “1937” and “cyclone”, which brings the fact of the 1937 Darwin Cyclone [PDF] bubbling up out of the data. The visualisation allows the user to drill down to digitisations of the individual archive records that the tag cloud collocations expose. (The metadata to the 1937 cyclone archives are online.)
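The collocation counting behind those lines can be approximated as follows. Again this is a sketch under assumptions rather than the Visible Archive implementation: the titles are made up, and `cooccurrences` and `neighbours` are hypothetical helper names. The count for each pair of words stands in for the thickness of the line drawn between them.

```python
from collections import Counter
from itertools import combinations
import re

def cooccurrences(titles):
    """Count, for each unordered pair of words, how many titles contain both."""
    pairs = Counter()
    for title in titles:
        words = sorted(set(re.findall(r"[a-z0-9]+", title.lower())))
        pairs.update(combinations(words, 2))
    return pairs

def neighbours(pairs, tag):
    """Return {other_tag: weight} for every tag sharing a title with `tag`."""
    result = {}
    for (a, b), weight in pairs.items():
        if a == tag:
            result[b] = weight
        elif b == tag:
            result[a] = weight
    return result

# Invented titles, for illustration only.
titles = [
    "Darwin cyclone 1937 damage report",
    "Darwin cyclone 1937 relief",
    "Darwin wharf construction",
]

pairs = cooccurrences(titles)
links = neighbours(pairs, "darwin")

# Heaviest links first: these would be the thickest lines in the cloud.
for other, weight in sorted(links.items(), key=lambda kv: -kv[1]):
    print(other, weight)
```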

Which all means that intelligent navigation of tags and tag collocations can expose stories directly in the documents they are drawn from, without any prepping or mediation. All done with a highly engaging interface.

Whitelaw has blogged his visualisation work at The Visible Archive, which includes downloadable Java for both visualisations (with canned data). The Darwin 1937 Cyclone is not the only fact that emerges out of the tag clouds, and we encourage you to go exploring yourselves.

Written by Nick Nicholas

November 23, 2009 at 10:10 am

Sharing learning resources in the VET sector: The LORN way

The Learning Object Repository Network (LORN) has been a long time in gestation, sensibly so. The Australian Flexible Learning Framework spawned LORN using a measured and standards-based approach.

“In 2003, the Framework began developing structures and standards for managing access to quality electronic learning resources across Australia’s VET sector. In 2004 the Framework established LORN to facilitate exchange of learning objects between states and territories, based on a model of trust, cooperation and interoperability. LORN currently enables the sharing and sale of learning resources that support flexible delivery across the VET sector.” (from LORN website)

Much of what LORN has developed has leveraged another VET infrastructure service, AEShareNet, not least the standards-based licensing approach.

So what is LORN?

The Learning Object Repository Network (LORN) is an easy to use portal that allows teachers and trainers to access quality resources for the VET sector.

LORN consists of:

  • repository owner organisations that hold learning resources they are willing to share across the VET sector, and
  • consumer access providers (CAPs) that use the LORN search to display results within their organisation’s website.

So basically anyone can access and download learning objects; but in order to “advertise” that you have objects available to share, you must conform to some standards and specifications, both technical and non-technical.

Repository owners who participate in LORN have agreed to the following principles:

  • Commitment to working with other members—in a spirit of cooperation—to advance the interests of the whole sector especially in relation to gaining efficiencies from sharing teaching and learning resources.
  • Commitment to exposing a reasonable amount of content so that using OAI harvesting in the federation of repositories is a rewarding experience for the consumer.
  • Agreement to adhere to a minimum set of business and technical specifications.
  • Agreement to license learning objects to users so that they are reusable within the terms of the associated digital rights. Learning objects in the repositories should correspond with the AEShareNet-U (unrestricted), AEShareNet-S (share and return), AEShareNet-P (Preserve Content) and AEShareNet-FfE (Free for Education purposes) licences.

Technical specifications include:

  • Maintaining a repository of learning objects relevant to the VET sector
  • Providing a harvest file that includes descriptions of all learning objects and other resources using Vetadata (agreed VET specific metadata)
  • Using the AEShareNet instant licences (FfE, U, S & P) and the immediate C licences
  • Providing a pricing file in the approved format as required for the purpose of transacting immediate C licences.
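To give a feel for what harvesting involves on the consumer side, here is a minimal sketch. It assumes that “OAI harvesting” means the standard OAI-PMH `ListRecords` request, and it uses the protocol’s baseline Dublin Core format (`oai_dc`) as a stand-in, since the Vetadata schema is not shown here. The repository address and the sample record are invented.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace for Dublin Core elements carried inside oai_dc records.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request for a repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def titles_from_response(xml_text):
    """Pull the Dublin Core titles out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NS)]

# Invented response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:lo-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Workplace safety induction</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint; a real harvester would fetch this URL.
url = list_records_url("https://repository.example.org/oai")
print(url)
print(titles_from_response(SAMPLE))
```

A real harvester would also need to follow resumption tokens and read the licence and pricing information, which this sketch leaves out.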

I have been a member of various LORN reference groups since its inception in 2003, as part of different roles and contracts I have held. It has been fascinating to be involved in its slow (sometimes frustratingly slow) and steady progress.

So what are the upsides and where are the issues and where is it going?

The approach is basically driven by a bottom up agreement to cooperate and share. There is a small amount of national infrastructure funding that has enabled the development so far. But really the commitment put in by the repository owners has been the key to its growth and sustainability. And it is amazing that there has been so much agreement albeit hard won. The end result is a whole heap of learning objects accessible by teachers and trainers across the VET sector, which might otherwise have remained hidden within one institution or one jurisdiction. At the same time there has been a strenuous effort to keep it simple for the teachers or learning object seekers. The repositories do the hard yards behind the scenes to keep it simple and consistent for those looking to access learning resources. 

But with any such service there are issues.

One fundamental challenge has been the need to allow repositories to charge for learning objects. Basically only a relatively small number of objects would be released across the VET sector if only “free to access and use” resources were allowed. Models do not really exist across the VET sector for freely sharing resources across public and private training organisations, especially when there is both stiff competition between training providers, and fully commercial exploitation of resources in terms of both course provision and publishing. So a simple thing like charging for a resource sets up a huge challenge for LORN, in terms of providing simple and immediate access via micro payments.

Other emerging issues include:

  • The desire from repository owners to make non-downloadable learning resources accessible via LORN
  • The need to develop a sustainable business model in terms of who pays for the ongoing maintenance and further development of the LORN infrastructure (at present it is project funded)
  • The need to provide access to a larger variety of repositories including commercial publishers
  • The need to cater for (smaller) repository owners who might struggle to meet the technical specifications entry requirements

So does the LORN model have any relevance to other sectors? 

Well first of all in order to develop and deliver the service, LORN has had to tackle key challenges that any resource sharing approach would need to tackle, including:

  • Agreed metadata standards
  • Agreed and consistent licensing
  • Agreed federated harvesting/search protocols
  • Persistent identifiers for materials
  • Authentication for users
  • Also it provides a model for collaborative governance, especially across the public VET jurisdictions.

In designing a pay-per-access option, LORN has provided a methodology for ensuring that learning resources can be “shared”, albeit with money sometimes changing hands. This sharing can occur across public and private institutions, and between private institutions. Mind you, this is not uncontested. There is a school of thought that says teachers accessing learning resources should not be faced with “pay before access” barriers, and that payment should be sorted out at the macro rather than the micro level. In other words, jurisdictions, institutions or repository publishers would provide access to any individual teacher based on a bulk arrangement, settled either before or after the fact for particular institutions or jurisdictions. (A simple Trust Federation may help in this regard.)

For 2010 LORN has a few key tasks to drive things forward, including: finalising the implementation of persistent identifiers, moving towards a smoother authentication approach, and incorporating non-downloadable learning resources into the network.

At the same time AEEYSOC (Australian Education, Early Childhood Development and Youth Affairs Senior Officials Committee) is apparently grappling with the importance of a national eLearning architecture plan for digital resource discovery, development, storage and sharing in the school sector. LORN might just have paved the way for such an approach with its hard-won successes over the last six years. If nothing else, it demonstrates that sharing learning resources was not meant to be easy.

 

Written by uldm

November 19, 2009 at 8:32 pm

Live annotation at eResearch Australasia

For the last few years, tools to allow people to collaboratively annotate websites and other online objects have started to emerge as something researchers want. For example Annocryst is popular for collaboratively annotating 3D crystallographic models, and the University of Melbourne’s e-Scholarship Research Centre has identified online annotation as a highly desirable feature for their Online Heritage Resource Manager software.

However, there doesn’t seem to be a killer app for annotating web pages, and even Zotero — very popular in the e-Humanities — has limited uptake, since it only works in Firefox.

Ron Chernich (University of Queensland) gave a live demonstration of a new annotation tool called Danno at eResearch Australasia [abstract, presentation]. It was interesting for three reasons: he explained why browser extensions are bad; he demonstrated an alternative approach using cross-browser JavaScript; and people started using it, right there in the presentation!

What’s wrong with browser extensions

Most annotation tools used in e-research are browser extensions. While this has gotten the community a long way, there are limitations. In a nutshell:

  • They’re completely browser-specific, multiplying development effort: a Firefox plugin has to be completely rewritten for IE, Safari, Opera, etc.
  • They require installation and browser restart, increasing the barrier to entry. (Even that little bit matters.)
  • They run with a high level of privileges, potentially compromising user security.
  • As they can conflict with a group’s Standard Operating Environment, they may require the approval and support of the IT department to install.

Danno: using cross-browser JavaScript

The UQ team were asked to develop a collaborative annotation service for the Atlas of Living Australia with one rule: no browser plugins. They took up the challenge, finding a way to make JavaScript work for any website. Their solution, Danno, works with two different models:

  • “Danno-friendly” sites include some scripts at the top to add features like showing and editing any annotations on the current page.
  • Unenhanced sites can be seen through a “Repeater” – effectively a single-use proxy server that injects the required JavaScript on the way through. Using a bookmarklet makes this a one-click operation for any page.
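The “Repeater” idea, rewriting a page on its way through a proxy so that it loads the annotation scripts, can be illustrated with a small sketch. This is not Danno’s code: `inject_script` is a hypothetical helper, the script address is invented, and a real proxy would also have to fetch the page and rewrite relative links.

```python
import html
import re

def inject_script(page, script_url):
    """Insert a <script> tag just before </head>, or prepend it if no head exists."""
    tag = f'<script src="{html.escape(script_url, quote=True)}"></script>'
    closing_head = re.compile(r"</head\s*>", re.IGNORECASE)
    if closing_head.search(page):
        # Only touch the first closing head tag.
        return closing_head.sub(lambda m: tag + m.group(0), page, count=1)
    return tag + page

# Invented page and script address, for illustration only.
original = "<html><head><title>Frogs</title></head><body>Litoria</body></html>"
rewritten = inject_script(original, "https://danno.example.org/annotate.js")
print(rewritten)
```

A “Danno-friendly” site would simply carry the equivalent script tag in its own markup, so no rewriting step is needed.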

Getting JavaScript to work across all browsers is hard, of course. But they managed.

Result: people used it!

The really remarkable thing about the presentation was that no sooner had Ron shown the URL to the demo page, than audience members started spontaneously trying it out. It was pretty easy: hover over an annotation and click “Reply to Annotation”. Or find the “Dannotate” link (again, best used as a bookmarklet), and create a new one. You can even annotate regions within images. By taking away any requirement to install anything or even register as a user, participation just happened.

For comparison, there exists another tool, Diigo, with some of these features, and which can also operate without a plugin, but it is designed to require a username and password, retaining some barrier to entry.

No doubt, extensions like Zotero work well within an institution where there is IT support, a high level of engagement with a project, and everyone is using the same platform. But approaches like Danno might work better in distributed projects, with less engagement from prospective members (i.e., barrier to entry matters more), and where support for a given browser extension cannot be guaranteed.

IMS Global Meeting: Learner Information Services

The IMS Global quarterly meeting for late 2009 was hosted by Oracle at their Redwood City campus in California. During the meeting, Oracle and their partners gave a nice demonstration of systems integration using the emerging Learning Information Services specification.

About the LIS specification

The IMS Learning Information Services (LIS) specification supports:

sharing of learner and course information between Student Information Systems and Learning Environments

It supersedes the previous IMS specification in this space (IMS Enterprise), which specified data formats for exchanging learning information between systems. LIS takes things a step further: as well as specifying data formats, it defines services for exchanging and synchronising student and course information between systems. This represents a new direction for IMS specifications: a shift toward a service-oriented approach (SOA) rather than a data-oriented approach to system integration.

The LIS specification is large. It defines hundreds of operations in six services for managing updates to data about people, groups, memberships, courses and outcomes. It also has a bulk data exchange service that supports bulk provisioning of information between systems. Most of the services are defined using an IMS profile of the WS-I suite of specifications (WSDL, SOAP). There is also an LDAP binding for some of the services, and talk of RESTful bindings in future versions.

An implementation of the specification is not required to support each and every service. Neither is an implementation required to support each and every operation. Rather, it is expected that communities will define profiles of the specification and implement those.

The demonstration

The demonstration itself involved an implementation of a higher education profile of the LIS specification. In the demonstration, Oracle used its Campus Solutions to manage information about students, course offerings, classes, grades etc. in a mythical college. The product was essentially used as a “single source of truth” for student and course information.

Modelling identity for different purposes

Registries of data—whether in research, learning, government, or other domains, and whether repositories, data warehouses, Learning Management Systems, or libraries—typically contain metadata not just on the content itself, but on who the data came from. The people responsible for the data are of interest to the people consuming the data; so registries need to record information about them as well. The primary kind of people (or groups of people) that are of interest are the authors of the data—or, where that concept is not as applicable, the contributors or compilers of the data. (Because institutions and organisations can also claim authorship, we prefer to refer to parties rather than people, following the ISO 2146 information model for registries.) But many parties can be responsible for data ending up in a registry, in the form it does; a registry can track a range of parties involved with data, in a range of roles: publisher, editor, validator, annotator, designer.

Because it is important to record information about parties, lots of registries record that information, in lots of ways, and to varying extents of detail. That means that there are a variety of information models at play for parties in registries. That doesn’t mean that all information models are rigorous and well thought out. Whacking in just the login name of an uploader, as YouTube does, is itself an information model for a party involved with the content, even if the amount of thought that went into it was not overwhelming.

But that does not mean YouTube’s information model is wrong. How much information you capture on parties for a registry depends on what use that information will be put to in the registry. The information model for parties is driven by the business requirements of the registry.
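To make the contrast concrete, here is a rough sketch of how one data structure can cover both the “login name only” case and a richer registry. It is an illustration only and does not reproduce the ISO 2146 information model; the class names, field names and sample values are all invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class Role(Enum):
    AUTHOR = "author"
    UPLOADER = "uploader"
    PUBLISHER = "publisher"
    EDITOR = "editor"
    VALIDATOR = "validator"
    ANNOTATOR = "annotator"

@dataclass(frozen=True)
class Party:
    """A person or organisation; only the identifier is mandatory."""
    identifier: str
    display_name: Optional[str] = None
    is_organisation: bool = False

@dataclass
class RegistryItem:
    title: str
    parties: dict = field(default_factory=dict)  # Role -> list of Party

    def add_party(self, role, party):
        self.parties.setdefault(role, []).append(party)

# A video-sharing registry may need nothing beyond an uploader's login name.
video = RegistryItem("Holiday clip")
video.add_party(Role.UPLOADER, Party("someuser42"))

# A research data registry may track several parties in several roles.
dataset = RegistryItem("Rainfall survey")
dataset.add_party(Role.AUTHOR, Party("p-001", "J. Bloggs"))
dataset.add_party(Role.PUBLISHER, Party("o-007", "Example University", True))
print(video.parties, dataset.parties, sep="\n")
```

Which optional fields and roles a registry actually fills in is the business decision; the structure itself stays the same.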

That of course is no great surprise, and working out what information is required is not particularly onerous: people may not put a lot of thought into it when they put registries together, but often enough they don’t need to. Still, especially if you are shopping for standards on representing parties, it is worth spending a couple of minutes working out what you need—and as importantly, what you don’t need.

Written by Nick Nicholas

November 16, 2009 at 10:05 am

Technical Standards for Digital Education – Focus Groups

As referred to in an earlier post, Link Affiliates is working this year on supporting the Digital Education Revolution, through the Technical Standards for Digital Education project. A large part of this work is the establishment of Focus Groups, which are now in place for 6 of the 7 activities. Each Focus Group consists of representatives from various jurisdictions within the education sector: primarily schools-based representatives, but also including some representatives from the VET sector. Different groups may also include people from other relevant organisations, including government organisations.

So far we have identified three main purposes for the Focus Groups:

  1. Members bringing their own expertise and experience into the group to share
  2. Members acting as conduits back into their own organisations for the information that comes out of the Focus Group meetings
  3. Members utilising their own linkages (eg professional networks) to disseminate the information that comes out of group meetings, as well as utilising these linkages to bring further information into the group.

Each Focus Group operates a little differently from the others, based on the requirements of that activity. Initial meetings of most Focus Groups were held in August 2009, and regular meetings will continue to be held throughout the duration of the project until June 2010. The first meetings of the various Focus Groups have been very positive, and much discussion emerged on various topics. It also became apparent in a few of the groups that many of the members were delighted to have such a forum on which to discuss these pertinent matters with other members of their profession who, being in a variety of jurisdictions and organisations, were able to provide new perspectives. The Focus Groups are also making use of Edna groups in the form of wikis and forums to support group communication.

The Focus Groups are expected to help in providing a couple of important outputs for the Technical Standards for Digital Education project. Firstly, each Focus Group will provide input into a Briefing Paper, which has initially been created in draft by Link Affiliates but will eventually be an output of the whole group. This Briefing Paper will provide a snapshot of the state of play for each of the activities, and will benefit greatly from such a wide range of input from group members. In turn, it is hoped that the Briefing Papers will be of benefit to the education sector, providing resources for the sector as well. The papers are a work in progress, and are expected to be completed by June 2010.

Secondly, the Focus Groups provide a medium for cross-jurisdictional and cross-organisational discussion regarding the various activities, ranging from the new WCAG 2.0 guidelines and their impact upon content creation for the sector, to supporting schools in the use of ‘safe’ Web 2.0 content, to looking at the interoperability challenges for e-portfolios in the Australian schools sector, to name just a few. It is expected that this melding of experiences from each group member will also result in members being able to take away something positive from their participation: something that can be taken back to each jurisdiction and organisation involved and assist in supporting the development of these key areas within the sector.

Written by sophiaca

November 12, 2009 at 11:45 am

Metadata-less ANDS

Guest blog post by Lyle Winton, VERSI

I’ve seen quite a few ARDC (Australian Research Data Commons) ideas that will use existing digital records to create a nice metadata-full context around research datasets.  Many of these records will have to be “cleaned up” or involve new processes to ensure more complete metadata.  Having worked as a researcher, I realise institutions collect bits of this stuff already – people, grant, publication info – but there’s still a lot of activities and projects which probably don’t have corporate records.  So I fear the convergence of the metadata-full approach and normal research practice will be more reporting and/or more metadata entry for researchers. 

This leads me to an idea (still half baked) and it’s based on 2 premises: ARDC is essentially about good discovery, not necessarily good metadata; and heavy reliance on manual entry of metadata is either expensive or patchy.  (Feel free to disagree with my premises.) 

Somewhat following the Google approach of “linking text” being more important than metadata: at the time of dataset registration you could “link” (essentially attach) as much unstructured text around the dataset as possible.

A scenario: Joanne Bloggs registers a numerical dataset from a research survey. In the process she attaches an email thread between herself and the data collectors, a grant application that’s in progress, and several loosely related papers in PDF and Word formats. Provided these “attachments” are private and only used for text-based searches (e.g. free text search, semantic network analysis), the files you upload, how many, the structure, and possibly even exact relevance all wouldn’t matter so much. Let your (Google-like) search engine figure it out.
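A toy version of that scenario might look like the sketch below. It is only a rough illustration of the idea: `DatasetIndex` is a hypothetical class, the attachment text is invented, and it assumes the text has already been extracted from the emails, PDFs and Word files. The attached text is used to find the dataset, but only the title is ever shown.

```python
from collections import defaultdict
import re

def tokenize(text):
    """Lower-case a text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class DatasetIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # word -> ids of matching datasets
        self._titles = {}                  # id -> public title

    def register(self, dataset_id, title, attachments=()):
        """Index the title and the private attachment text; keep only the title."""
        self._titles[dataset_id] = title
        for text in (title, *attachments):
            for word in tokenize(text):
                self._postings[word].add(dataset_id)

    def search(self, query):
        """Return titles of datasets whose text contains every query word."""
        words = tokenize(query)
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(self._titles[d] for d in hits)

index = DatasetIndex()
index.register(
    "ds1",
    "Survey results 2009",
    attachments=[
        "Email thread: rainfall measurements collected in Kakadu",
        "Grant application on wetland ecology",
    ],
)
print(index.search("kakadu rainfall"))
print(index.search("glacier"))
```

A production search engine would rank results rather than just match words, but the private text cloud plays the same role.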

I think this addresses the issue of the time-poor researcher who doesn’t want to enter metadata, with a dataset that isn’t self descriptive, who doesn’t mind dumping a few files they have lying around their desktop into a private area.  So I could foresee two types of records in the ARDC, one is a curated record with structured metadata around valuable research datasets (the usual thinking), and the other is essentially a title plus a link or “contact Joanne Bloggs” message that can still be easily and effectively discovered through an associated (but hidden) text cloud.  Would people use that?

Written by lylewinton

November 7, 2009 at 6:10 pm
