In two weeks' time (January 27) we will be celebrating the first anniversary of our new University library. It is a good occasion for a quick evaluation of its impact on university life, and I am going to do so with a view to future online access to scholarly content. In particular, I will mention the efforts being made to turn the new library into a true library 2.0, but I will also suggest how to go beyond the 2.0 model towards a much more effective 3.0 model.
The actions taken to become 2.0 fall under the scope of a consultancy firm that cooperates with the University's own computing service. My proposals for approaching a 3.0 model would need partnerships with research teams (inside or outside the University) specialized in knowledge extraction and language engineering.
Beyond library 2.0
- Doctoral course I0703 on Web semantics
- Position paper for a round table at the workshop II Jornadas sobre Documentación y Gestión de los contenidos digitales
- Library 2.0 workshop with staff from University of Deusto
My colleague Lorena Fernández (the person in charge of the project at the University's computing office) attended the third, but missed the first. Shame! She has written a couple of posts on the topic: Catálogos caseros en la nube ("Home-made catalogues in the cloud", 2010.01.03) and Día de la Biblioteca ("Library Day", 2009.10.24). In the former she reviews aNobii and LibraryThing, two well-known social cataloguing tools for books. This shows she is looking at common 2.0 facilities, which is great. But I need to express my own wishes beyond 2.0:
- Digitization of the library's exclusive book collection and UD journals: yes, why not?, in cooperation with Google Books. Deusto already has a corporate license to use Google apps, so why not extend that agreement further?
- Rendering of digitized text into XML/TEI and RDF. This will need some explanation, but basically it means that the digitized text goes beyond mere plain ASCII text
- Named entity and specialized term extraction
- Mapping of those terms into folksonomies and taxonomies (i.e. catalogs)
- Mapping of taxonomies into relevant ontologies, e.g. DBpedia
- Automatic reference extraction (e.g. into BibTeX format, as Google Scholar does)
- Automatic cross-referencing, as CiteSeer, DBLP, WoK, and others do
- Cross-reference extraction from those services
- Integration with local reference management software (e.g. Zotero) or the user's own online services (CiteULike, Bibsonomy, Connotea)
- Integration with Google Scholar
- Development of a scholarly knowledge fragmentation tool
- Scientific conversation on the cloud
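To make the TEI point in the list above a little more concrete: this is a minimal sketch of what "beyond plain ASCII" means, wrapping digitized text in TEI-style XML so that title, author, and paragraph structure become machine-readable. The sample text, metadata, and the `to_tei` helper are invented for illustration; a real pipeline would follow the TEI P5 Guidelines in full.

```python
# Sketch: wrap plain digitized text in a minimal TEI-like XML document.
# All sample data is hypothetical; only structure is illustrated.
import xml.etree.ElementTree as ET

def to_tei(title: str, author: str, paragraphs: list[str]) -> str:
    """Build a minimal TEI-style document from plain text."""
    tei = ET.Element("TEI", xmlns="http://www.tei-c.org/ns/1.0")
    header = ET.SubElement(tei, "teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = title
    ET.SubElement(title_stmt, "author").text = author
    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    for p in paragraphs:  # one <p> element per digitized paragraph
        ET.SubElement(body, "p").text = p
    return ET.tostring(tei, encoding="unicode")

xml_doc = to_tei("Sample digitized book", "Anonymous",
                 ["First paragraph of the scanned text.",
                  "Second paragraph of the scanned text."])
print(xml_doc)
```

The gain over plain text is that catalog software can now address the title, the author, or any single paragraph directly, which is exactly what the later steps (term extraction, mapping to taxonomies) need.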
I am now going to explain briefly what I mean by a scholarly or scientific knowledge fragmentation tool.
The Twitter lesson
We have learned many things from Twitter recently, but to me one of the most relevant is that short texts work better on the web than longer ones: shorter is better on the web.
There is now a growing family of new short-text genres on the web. These are particularly well suited for the Linked Data project, which is very interesting in itself. But I am going to talk about their effect on a new trend in scientific conversation, one that owes a debt to Twitter.
Dynamic scientific discourse
What we are about to see in the near future is a new form of scientific discourse emerging from scientific snippets or tweets, much as conversations emerge in the twittersphere.
In order to achieve such a conversation, we will need to fragment scholarly papers and articles into more suitable knowledge pieces or atoms. Such atoms should include not only the content (sentences would be candidates for the atomic unit) but also contextual metadata (author, title, date, publication, e.g. as in Dublin Core) as well as cross-references (who quotes whom, why and how, as in CiteSeer).
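The atom just described can be sketched as a small data structure: a sentence carrying its own Dublin Core-style context and its outgoing references, so that it can circulate independently of the full paper. Everything here (the `KnowledgeAtom` type, the naive sentence splitter, the sample paper) is hypothetical, a sketch of the idea rather than an implementation.

```python
# Sketch of a "knowledge atom": one sentence plus its own context
# (Dublin Core-style fields) and cross-references. All names invented.
from dataclasses import dataclass, field

@dataclass
class KnowledgeAtom:
    content: str                     # the sentence itself
    creator: str                     # dc:creator of the source paper
    title: str                       # dc:title of the source paper
    date: str                        # dc:date of publication
    source: str                      # dc:source (journal, proceedings)
    cites: list[str] = field(default_factory=list)  # outgoing references

def fragment(paper_meta: dict, text: str) -> list[KnowledgeAtom]:
    """Naively split a paper into one atom per sentence.

    A real tool would use proper sentence segmentation and attach
    the cross-references extracted for each sentence.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [KnowledgeAtom(content=s + ".", **paper_meta) for s in sentences]

atoms = fragment(
    {"creator": "A. Author", "title": "On Scholarly Atoms",
     "date": "2010", "source": "Imaginary Journal"},
    "Short texts travel well. Context must travel with them.")
print(len(atoms))  # → 2
```

Because each atom repeats the contextual metadata, any single one can be quoted, retweeted, or linked on its own without losing its provenance, which is the whole point of letting the fragments fly.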
After fragmentation, we would just let them fly into the conceptual cloud: a global cloud of knowledge tweets in which a global scientific conversation will take place.