some technical thoughts on the importance of Google Scholar

05-December-2004

Graham has just written a little note on Google Scholar:

More important from my point of view is that it shows the potential power of distributed and associated metadata. We need something like this for e-learning materials - able to aggregate data on materials use in practice.

This would overcome the present problem when, to be standards compliant, materials developers are required to fill in endless metadata fields.

The Wales-Wide Web - Google Scholar

We've both been thinking about the implications of Google Scholar for the academic and research communities we support. I've been meaning to post a quick summary of some of the possibilities from a technical/architectural point of view:

In brief, we've been planning to design and implement a resource-repository system, taking over the kinds of functionality currently exposed (badly) through things like adding Annotated-Reference objects inside site or document content. We've even made some progress on the system: for the group-weblogging/team-tasks work in the smelearning site, we have an abstract resource-pointer type so that several discussions can collect the same resource, but with all commentary and trackbacks attaching to the original 'real' copy of the resource. We'll be working on making resource-management 'placeless' once we finish the present binge of knotes work. We've also implemented blogging categories in such a way as to be ready for generic structured metadata (and knotes' blog categories already allow categorical structures like 'development/knotes/weblogging/features'). In my previous work for the REM and Resource Locator projects and the IMS, I spent a lot of effort designing abstract repository systems, so we're not without ideas or motivation :O)
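To make the resource-pointer idea concrete, here is a minimal sketch in Python. It is not our actual Plone code: the class names Resource and ResourcePointer, the add_comment method and the example URL are all invented for illustration. The only point it shows is that a comment made through any pointer ends up on the one 'real' copy.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """The single 'real' copy of a resource (hypothetical type)."""
    url: str
    title: str
    comments: list = field(default_factory=list)
    trackbacks: list = field(default_factory=list)


@dataclass
class ResourcePointer:
    """A lightweight reference that a discussion collects (hypothetical type)."""
    target: Resource
    context: str  # which discussion collected the resource

    def add_comment(self, text):
        # Commentary made via a pointer is stored on the original resource,
        # tagged with the context it came from.
        self.target.comments.append((self.context, text))


# Two discussions collect the same resource through separate pointers.
paper = Resource("http://example.org/paper", "An example paper")
in_weblog = ResourcePointer(paper, "group-weblog")
in_tasks = ResourcePointer(paper, "team-tasks")

in_weblog.add_comment("Useful background")
in_tasks.add_comment("Cite this in the report")

for context, text in paper.comments:
    print(context, "->", text)
```

Both comments are listed against the single resource, whichever discussion they were made in.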

One of the most vivid and accessible use-cases for this kind of resource-repository approach is citation and scholarly referencing. For the NGRF site, we implemented a quick little content type called Annotated Reference in order to try to abstract referencing into independent objects (out of the html content of the documents which 'contain' the citations), and to allow special discursive content to be associated with bibliographic records. There are upwards of 1800 Annotated References in the NGRF site at present, but some number of these are duplicates, since the abstraction is one-level: the references are not in the body content of the documents, but they do live directly within each document's folder-wise content. Thus a site-wide list of references will contain duplicates. We would like to replace this with an extra abstraction, so that the record-details for a bibliographic record live placelessly in a repository, with citations within documents pointing to those. We would probably try to piggyback the trackback machinery to effect the linkage, so that we could have this working across CMS systems.
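The extra level of abstraction might look something like the following Python sketch. Everything here is an assumption made for illustration: the ReferenceRepository class, the record_key normalisation rule and the sample record are invented, and a real system would need a much more careful notion of when two records are 'the same'.

```python
import re


def record_key(authors, year, title):
    """Build a crude deduplication key from normalised record details."""
    def norm(text):
        # Lower-case and collapse punctuation/whitespace to single spaces.
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(authors), str(year), norm(title))


class ReferenceRepository:
    """Placeless store: one record per work, many citations pointing at it."""

    def __init__(self):
        self._records = {}    # key -> record details
        self._citations = {}  # key -> paths of the citing documents

    def cite(self, document, authors, year, title):
        key = record_key(authors, year, title)
        # Record details are stored once; later citations reuse them.
        self._records.setdefault(
            key, {"authors": authors, "year": year, "title": title})
        self._citations.setdefault(key, []).append(document)
        return key

    def record_count(self):
        return len(self._records)

    def cited_by(self, key):
        return list(self._citations.get(key, []))


repo = ReferenceRepository()
k1 = repo.cite("/reports/a", "Smith, J.", 2003, "Learning at Work")
k2 = repo.cite("/reports/b", "Smith, J.", 2003, "Learning at work.")

print(repo.record_count())  # one record, despite two citations
print(repo.cited_by(k1))    # both citing documents
```

The two documents cite slightly different spellings of the same made-up work, but the repository holds a single record and knows which documents point at it.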

So where does Google Scholar come into these plans? One of the attractions of a placeless, resource-centred way of handling citation and reference is that it removes the requirement of repeatedly filling in record details. We would of course have to provide some kind of user interface for searching the existing repository, selecting items there for referencing, and adding new items where required. Google Scholar may allow us to add an extra layer of search: if a record does not exist in the repository, we could allow users to search Google Scholar (using their XML-RPC API), and to one-click add to the repository and reference from the search results returned.
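The search-then-fall-back flow can be sketched as below. This is only an outline under stated assumptions: find_or_import and add_to_repository are invented names, the repository is a plain list of dicts, and fake_external_search is a stand-in that returns canned data. It does not call Google Scholar or any real remote service.

```python
def find_or_import(query, repository, external_search):
    """Search the local repository first; fall back to an external search."""
    q = query.lower()
    local = [rec for rec in repository if q in rec["title"].lower()]
    if local:
        return "repository", local
    return "external", external_search(query)


def add_to_repository(repository, record):
    """The 'one-click add': copy an external result into the repository."""
    if record not in repository:
        repository.append(record)
    return record


def fake_external_search(query):
    # Stand-in for a remote scholarly search; returns canned data only.
    return [{"title": "Example result for " + query, "source": "external"}]


repository = [{"title": "Metadata in practice", "source": "local"}]

# Found locally, so no external search is needed.
source, hits = find_or_import("metadata", repository, fake_external_search)

# Not found locally: search externally, then add the chosen hit.
source2, hits2 = find_or_import("trackback", repository, fake_external_search)
add_to_repository(repository, hits2[0])

# The same query is now answered from the repository.
source3, hits3 = find_or_import("trackback", repository, fake_external_search)
print(source, source2, source3)
```

The second user to look for the same item finds it in the repository and never has to re-enter the details.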

We also have hopes of developing a cross-CMS API for sharing repositories once we've implemented and trialed initial work on repositories in Plone. Google Scholar could present important constraints on the design of such an API, and in the case of scholarly materials may even make it redundant. Because Google will be harvesting its 'records' from real-world publications, it sidesteps the detailed entry of record details which presents such a barrier to getting structured content out of ordinary end-users.

We're great believers in thin, lightweight standards - for instance XML-RPC, trackback, RSS, Really Simple Discovery, the weblog-management APIs - as opposed to comprehensive, heavyweight standards like the IMS specifications of old. Many small standards interacting with a rich universe of freely-created content can deliver a lot more power than large-scale monolithic standards which try to envisage or constrain the way content will evolve. We're hoping that the advent and uptake of Google Scholar will create some new opportunities for leveraging the little standards.
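As a small illustration of how thin a standard like XML-RPC is, the Python sketch below encodes and decodes a complete method call using only the standard library's xmlrpc.client module. The method name repository.search and its parameters are invented for the example; no such service is assumed to exist, and nothing is sent over the network.

```python
import xmlrpc.client

# Encode a call to a hypothetical 'repository.search' method.
payload = xmlrpc.client.dumps(
    ("learning objects", 10), methodname="repository.search")

# The whole request is a short XML document.
print(payload)

# Decoding gives back the parameters and the method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The entire wire format is a method name plus a list of typed parameters, which is a large part of why small tools in any language can talk to each other with it.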


Mike Malloch; 05-December-2004 08:56:01

Linking and trackbacks

When linking to this weblog entry, please use the 'permalink', which is http://www.knownet.com/writing/elearning2.0/entries/3609206097

Some weblog systems will ask you for a "trackback link" (most systems will find this special 'hook' automatically, in the code for this page).

The trackback link for this entry is http://www.knownet.com/writing/elearning2.0/entries/3609206097/tb