Linking research & learning technologies through standards

Link Affiliates Blog

Posts Tagged ‘digital humanities’

The Visible Archive

The e-Research Australasia conference, which recently concluded in Sydney, had data visualisation as one of its major focuses. George Djorgovski’s plenary talk [abstract, presentation] on virtualisation in science posed the urgency for visualisation in stark terms: as the models that science is called on to build get more and more complex, not only will most data being gathered (in the “data avalanche”) never be seen by human eye; even the models needed to make sense of the data are starting to surpass human understanding, and can only be turned over to machines to deal with.

Someone will greet the notion that we will turn over our comprehension of the world to machines as a challenge. Your Humble Correspondent is more inclined to think of SkyNet.

The highlight of the conference for me, at least, was Mitchell Whitelaw’s Exploring Archival Collections with Interactive Visualisation, presenting two prototype visualisations of the contents of the National Archives of Australia. The first visualisation gives an overview of all 57,000 series in the National Archives, according to their sizes (in number of items and shelf-space), and starting date. That visualisation allows critical dates to emerge from the data—the disproportionate importance of 1901, 1914 and 1939 in Australian archive-gathering, for example; and the visualisation also highlights relations between different series graphically. But although getting 57,000 complex data points into a single 2-D diagram has its appeal, there were no real surprises there.

I found the second visualisation much more interesting. It uses tag clouds for the titles of the 65,000 records contained in a single archive series. Tag clouds have justly been called the mullets of Web 2.0 (as far back as 2005; strange to think there was a Web 2.0 as long ago as 2005!). But Whitelaw’s use of tag clouds, helped along with plenty of Java, is the most intelligent use of them I’ve seen in a while.

One very handy piece of interactivity is that you can choose to ignore particular tags which are crowding out their peers in the cloud; if in a particular archive half the titles contain “Naturalisation” or “Citizenship”, then in most contexts those words will have no more interest for you as a researcher than instances of “the” or “and”: they become stop-words in the context of that archive. Choosing to eliminate those recurring words reveals the real diversity of topics in the archive, startlingly. The effect is like putting on glasses: fuzzy tags on the periphery of the tag cloud, blotted out by one or two stop-words, suddenly come into focus.
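To make the stop-word idea concrete, here is a minimal Python sketch. It is emphatically not Whitelaw’s code (his is Java); the function name `tag_counts` and the three titles are invented for illustration, and I am assuming a tag’s weight is simply the number of titles it appears in.

```python
from collections import Counter
import re


def tag_counts(titles, ignored=frozenset()):
    """Count how many titles each word appears in, skipping ignored words."""
    ignored = {word.lower() for word in ignored}
    counts = Counter()
    for title in titles:
        # Use a set so a word repeated within one title is counted once.
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        counts.update(words - ignored)
    return counts


# Invented titles, for illustration only.
titles = [
    "Naturalisation certificate Darwin",
    "Naturalisation application Sydney",
    "Naturalisation cyclone relief Darwin",
]

# With no tags ignored, "naturalisation" dominates the cloud.
print(tag_counts(titles).most_common(3))

# Treating it as an archive-specific stop-word lets the other tags surface.
print(tag_counts(titles, ignored={"Naturalisation"}).most_common(3))
```

The point of passing `ignored` per call, rather than hard-coding a stop-word list, is that which words are noise depends on the archive you are looking at.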

But what really sets Whitelaw’s visualisation apart is that it lets you explore collocations of words in the titles. Clicking a tag draws lines to all the other tags it co-occurs with in the same title; the more frequent the co-occurrence, the thicker the line. So you can straightforwardly get a sense of what contexts a particular word comes up in, with an accompanying bar chart giving the chronological distribution of those contexts. As Whitelaw shows, clicking on “Darwin” in his example archive draws prominent lines to “1937” and “cyclone”, which brings the fact of the 1937 Darwin Cyclone [PDF] bubbling up out of the data. The visualisation allows the user to drill down to digitisations of the individual archive records that the tag cloud collocations expose. (The metadata to the 1937 cyclone archives are online.)
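The collocation step can be sketched the same way. Again, this is my own rough Python illustration rather than Whitelaw’s Java: `cooccurrences` and `neighbours` are hypothetical names, the titles are made up, and I am assuming the line thickness is driven by the number of titles two words share.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrences(titles):
    """Count, for each unordered pair of words, how many titles contain both."""
    pairs = Counter()
    for title in titles:
        # Sorting gives every pair a canonical (a, b) order with a < b.
        words = sorted(set(re.findall(r"[a-z0-9]+", title.lower())))
        pairs.update(combinations(words, 2))
    return pairs


def neighbours(pairs, tag):
    """Return {other_word: weight} for every word sharing a title with tag."""
    tag = tag.lower()
    links = {}
    for (a, b), weight in pairs.items():
        if a == tag:
            links[b] = weight
        elif b == tag:
            links[a] = weight
    return links


# Invented titles, for illustration only.
titles = [
    "Darwin cyclone 1937 damage report",
    "Darwin cyclone 1937 relief fund",
    "Darwin wharf repairs",
]

# "Clicking" on Darwin: the weights stand in for line thickness.
links = neighbours(cooccurrences(titles), "Darwin")
for word, weight in sorted(links.items(), key=lambda item: -item[1]):
    print(word, weight)
```

In this toy data “cyclone” and “1937” each share two titles with “Darwin”, so they would get the thickest lines, which is the same effect the real visualisation shows on the real archive.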

Which all means that intelligent navigation of tags and tag collocations can expose stories directly in the documents they are drawn from, without any prepping or mediation. All done with a highly engaging interface.

Whitelaw has blogged his visualisation work at The Visible Archive, which includes downloadable Java for both visualisations (with canned data). The Darwin 1937 Cyclone is not the only fact that emerges out of the tag clouds, and we encourage you to go exploring yourselves.

Written by Nick Nicholas

November 23, 2009 at 10:10 am

Project Bamboo

Project Bamboo is an Andrew W Mellon Foundation-sponsored project that aims to dramatically improve the way digital technologies are used in humanities research, with a particular focus on shared services infrastructure. The main participants are humanities departments and libraries in major US universities such as Chicago and Berkeley, but overseas universities including Cambridge, Oxford, ANU and the University of Melbourne are also represented.

Founded in March 2008, Project Bamboo has run five workshops to turn input from the e-Scholarship community into a proposal which it will submit to the Andrew W Mellon Foundation at the end of 2009. The proposal will describe a 7-10 year process, but will focus heavily on implementation in years 1 and 2.

As the project has developed, its thinking has evolved. The project began in more optimistic financial conditions, and implicitly supported a very wide agenda to be realised over ten years. This included shared services, an extensive, ongoing business analysis model (scholarly narratives, recipes, and activities in theme groups), and a marketplace for goods, services and labour (the Bamboo Exchange). The project argues that with a solid service-based infrastructure supporting reusable applications and tools across different institutions, the cost and effort of using technology in humanities research will be reduced, with many new benefits. With the current global financial situation, the project’s immediate scope has narrowed to two parts:

  • The Bamboo Services Platform is a cloud-based environment which will host shared services useful to researchers in e-Humanities. They will include existing services and applications re-engineered for the new platform, as well as novel services created to fill identified niches.
  • The Bamboo Commons is a broad discovery mechanism that allows Bamboo participants to find Bamboo services, tools, business analysis – and each other.

Link Affiliates has submitted two recipes and is using the e-Framework to model solutions to the problems they pose. The e-Framework, with its principled binding of services analysis to business requirements, is well positioned to offer a structured approach to the problem of interoperability of services, tools, content and business processes within the digital humanities sector.
