Daily Archives: May 24, 2007

Time For Academic Librarians To Tune In To The Semantic Web

Editor’s Note: We present this guest post by Brett Bonfield, a graduate student in the LIS program at Drexel University, intern at the Lippincott Library at the University of Pennsylvania and an aspiring academic librarian.

What if your refrigerator automatically charged and discharged items, maintained standing orders, and weeded its collection? These are the sorts of functions the Semantic Web could make possible: a catalog for the rest of the world that incorporates the goals we have for our library collections and even raises the bar several notches. Fortunately, our catalogs have a head start on our refrigerators. Once they have incorporated the Semantic Web’s flexible encoding standards and ontologies that work both inside and outside the library, what might they be capable of by the time refrigerators are as sophisticated as today’s catalogs? One possibility would be a catalog that transcends the physical constraints of collocation, allowing users to follow connections from within one item’s content to related information within any other item’s content. That is, if the Semantic Web ever materializes.

The Semantic Web couldn’t have a greater champion. Tim Berners-Lee, the inventor of the World Wide Web, has spent the last ten years developing it alongside his colleagues at the World Wide Web Consortium (W3C), the organization he founded to steward Web standards. These standards are the lifeblood of the WWW: just as MARC and AACR2 support libraries, W3C specs like HTML, CSS, and XML, along with the IETF’s HTTP, support the WWW. The Semantic Web’s success hinges on developers adopting new W3C standards like RDF (Resource Description Framework), a model for describing resources that lets applications like library catalogs, open Web directories, news aggregators, and personal music collections share and combine their data.
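For readers who have not seen RDF, its core idea is that every statement is a subject, predicate, object triple, and that statements from different sources combine whenever they name the same subject. The sketch below is only an illustration in plain Python, not an RDF library: the example.org identifiers, the playCount term, and the sample values are hypothetical, while the Dublin Core element names (title, creator, format) are real.

```python
# Illustrative only: RDF-style triples modeled as plain Python tuples.
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
EX = "http://example.org/terms/"          # hypothetical vocabulary
WORK = "http://example.org/work/kind-of-blue"  # hypothetical identifier

# Statements a library catalog might publish about a recording.
catalog = {
    (WORK, DC + "title", "Kind of Blue"),
    (WORK, DC + "creator", "Miles Davis"),
}

# Statements a personal music collection might hold about the same recording.
music = {
    (WORK, DC + "format", "audio/mpeg"),
    (WORK, EX + "playCount", "12"),
}

# Because both sources use the same subject identifier, a set union
# is enough to merge them.
merged = catalog | music


def describe(graph, subject):
    """Collect every predicate and its values for one subject."""
    facts = {}
    for s, p, o in graph:
        if s == subject:
            facts.setdefault(p, []).append(o)
    return {p: sorted(values) for p, values in facts.items()}


for predicate, values in sorted(describe(merged, WORK).items()):
    print(predicate, values)
```

The point of the sketch is the shared identifier: neither application needs to know about the other's schema for their statements to line up.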

There was a time when RDF’s adoption would have been a given, when the W3C was seen as nearly infallible. Its standards had imperfections, but their openness, elegance, and ubiquity made it seem as though the Semantic Web was just around the corner. Unfortunately, that future has yet to arrive: we’re still waiting on the next iteration of basic specs like CSS; W3C bureaucracy persuaded the developers of Atom to publish their gorgeous syndication spec with IETF instead of W3C; and, perhaps most alarmingly, the perception that W3C’s HTML Working Group was dysfunctional encouraged Apple, Mozilla, and Opera to team with independent developers in establishing WHATWG to create HTML’s successor spec independently from the W3C. As more non-W3C protocols took on greater prominence, W3C itself seemed to be suffering a Microsoft-like death of a thousand cuts.

But then a marvelous thing happened: on April 9, WHATWG’s founders proposed to W3C that it build its HTML successor on WHATWG’s draft specification. On May 9, W3C agreed. W3C may never again be the standard bearer it once was, but this is compelling evidence that it is again listening to developers and that developers are responding. The payoff in immediate gratification—the increased likelihood of a new and better HTML spec—is important, but just as important is the possibility of renewed faith in W3C and its flagship project, the Semantic Web. Coincidentally, this agreement occurred around the same time that O’Reilly Media released two reports, “DocBook Elements in the Wild” and “DocBook in the Wild: A Look at Newer Content,” that provide encouraging glimpses into a more semantic future.

O’Reilly uses a schema called DocBook to structure its manuscripts, marking up content elements like index terms and image data, and even differentiating runnable code from the results of that code. In so doing, O’Reilly’s editors may be demonstrating the next stage in library content: catalog materials that cross-reference other items within our collections as well as the world of information outside the library. Although DocBook is not itself part of W3C’s Semantic Web specification, its developers work closely with W3C and it maps well to RDF.
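To make that kind of markup concrete, here is a minimal sketch of what a program can do with it. The fragment is a made-up example, not O’Reilly content, and it assumes DocBook 4-style markup without an XML namespace. The element names are standard DocBook: indexterm and primary for index terms, programlisting for code, screen for what the code prints, and imagedata for images.

```python
import xml.etree.ElementTree as ET

# A hypothetical DocBook fragment (assumed DocBook 4 style, no namespace).
SAMPLE = """
<section>
  <title>Hello, world</title>
  <para>A first program.
    <indexterm><primary>print function</primary></indexterm>
  </para>
  <programlisting>print("hello")</programlisting>
  <screen>hello</screen>
  <mediaobject>
    <imageobject><imagedata fileref="figs/hello.png"/></imageobject>
  </mediaobject>
</section>
"""


def summarize(docbook_xml):
    """Pull out the semantically tagged pieces of a DocBook fragment."""
    root = ET.fromstring(docbook_xml.strip())
    return {
        "index_terms": [e.text for e in root.iter("primary")],
        "code": [e.text for e in root.iter("programlisting")],
        "output": [e.text for e in root.iter("screen")],
        "images": [e.get("fileref") for e in root.iter("imagedata")],
    }


summary = summarize(SAMPLE)
for kind, items in summary.items():
    print(kind, items)
```

Because the markup says what each piece is rather than how it should look, the same few lines could feed an index, a code tester, or a catalog record.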

Three years ago, Campbell and Fast asked what academic libraries and the Semantic Web could offer each other. At the time, the Semantic Web was years away from offering anything tangible. Now, for those of us wondering if the successor to AACR2 will be RDA or something less library-specific, the events at W3C and O’Reilly are calls to prick up our ears.