News and Updates on the KRR Group

The Botari application from the LarKC project has won the Open Track of the Semantic Web Challenge.

Botari is a LarKC workflow running on servers in Seoul, plus a user frontend that runs on a Galaxy Tab.

The workflow combines open data from the city of Seoul (OpenStreetMap, POIs) with Twitter traffic, and uses stream processing, machine learning, and querying over RDF datasets and streams to provide personalised restaurant information and recommendations, presented in an augmented reality interface on the Galaxy Tab.

For more info on Botari, see the website, the demo movie, the slide deck, or the paper.


Source: Semantic Web world for you

Over the last couple of years, we have engineered a fantastic data sharing technology based on open standards from the W3C: Linked Data. Using Linked Data, it is possible to express knowledge as a set of facts and to connect those facts together into a network. Having such networked data openly accessible is a source of economic and societal benefits. It enables sharing data in an unambiguous, open and standard way, just as the Web enabled document sharing. Yet the way we designed it deprives the majority of the world's population of the ability to use it.
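
To make the "facts connected into a network" idea concrete, here is a minimal sketch using Python's rdflib; the namespace and resource names are purely illustrative, not part of any real dataset:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, RDFS

    EX = Namespace("http://example.org/")  # hypothetical namespace

    g = Graph()
    # Two facts about the city of Amsterdam...
    g.add((EX.Amsterdam, RDF.type, EX.City))
    g.add((EX.Amsterdam, RDFS.label, Literal("Amsterdam")))
    # ...and one fact that links it to another resource: re-using the URI
    # EX.Netherlands is what turns isolated facts into a network.
    g.add((EX.Amsterdam, EX.capitalOf, EX.Netherlands))
    g.add((EX.Netherlands, RDF.type, EX.Country))

    print(g.serialize(format="turtle"))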

Doing “Web-less” Linked Data?

The problem may lie in the fact that Linked Data is based on Web technologies, or in the fact that Linked Data has been designed and engineered by people with easy access to the Web, or in a combination of both. Nowadays, Linked Data goes hand in hand with cloud-hosted data storage services, a set of (web-based) applications to interact with those services, and the infrastructure of the Web. As a result, if you don't have access to this Web infrastructure, you cannot use Linked Data. That is a pity, because an estimated 4.5 billion people lack such access for various reasons (lack of infrastructure, cost of access, literacy issues, …). Wouldn't it be possible to adjust our design choices so that they too could benefit from Linked Data, even without the Web? The answer is yes, and the best news is that it wouldn't be that hard either. But for it to happen, we need to adapt both our mindset and our technologies.

Changing our mindset

We tend to think of any data sharing platform as a combination of a cloud-based data store, client applications to access the data, and forms to feed new data into the system. This is not always applicable: central hosting of data may not be possible, and access to it from client applications may not be guaranteed. We should also think of the part of the world's population that is illiterate, for whom Linked Data, and the Web, are not accessible. In short, we need to think de-centralised, small and vocal in order to widen access to Linked Data.

Think de-centralised

Star-shaped networks can be hard to deploy. They imply setting up a central producer of a resource somewhere and connecting all the clients to it. Electricity networks have already found a better alternative: microgrids. Microgrids are small networks of producers/consumers of electricity (the "prosumers") that manage electricity needs locally. We could, and should, copy this approach to manage local data production and consumption. For example, think of a decentralised DBpedia whose content is the aggregation of several data sources, each producing part of the content – most likely, the content that is locally relevant to them.
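
As a rough sketch of this prosumer idea, the following Python fragment (with entirely hypothetical source URLs) aggregates locally produced RDF fragments into one queryable data set, the way such a decentralised DBpedia might be assembled:

    from rdflib import Graph

    # Hypothetical micro-server URLs, each publishing locally relevant RDF.
    local_sources = [
        "http://village-a.example.org/data.ttl",
        "http://village-b.example.org/data.ttl",
    ]

    aggregate = Graph()
    for url in local_sources:
        part = Graph()
        part.parse(url, format="turtle")  # fetch one locally produced fragment
        aggregate += part                 # merge it into the shared data set

    # The aggregate now plays the role of a decentralised DBpedia: no single
    # server ever produced or hosted the whole data set.
    print(len(aggregate), "triples aggregated from", len(local_sources), "sources")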

Think small

Big servers require more energy and more cooling. They usually end up racked in big cabinets that are in turn packed into cooled data centers, and these data centers need to be big in order to cope with issues of scale. Thinking decentralised allows us to think small, and we need to think small to provide alternatives where data centers are not available. As content production and creation become decentralised, several small servers can be used. To continue the microgrid analogy, we can call these small servers, which take care of locally relevant content, "micro-servers".

Think vocal

Unfortunately, not everyone can read and type. In some African areas, knowledge is shared through vocal channels (mobile phones, meetings, …) because there is no other alternative. Knowledge exchanged this way cannot be accessed through form-based data acquisition systems. We need to think about exploiting vocal conversations through Text To Speech (TTS) and Automatic Speech Recognition (ASR) rather than staying focused on forms.

Changing our technologies

Changing mindsets is not enough: if we aim to strip the Web away from Linked Data, we also need to pay attention to our technologies and adapt them. In particular, there are five upcoming challenges, which can be phrased as research questions:

  1. Dereferenceability: How do you get a route to the data if you want to avoid the routing system provided by the Web? For instance, how do you dereference host-name-based URIs if you don't have access to the DNS network? (A minimal sketch of one possible approach follows this list.)
  2. Consistency: In a decentralised setting where several publishers produce parts of a common data set, how do you ensure that URIs are re-used and do not collide? Two different producers could well use the same URI to describe different things.
  3. Reliability: Unlike centrally hosted data servers, micro-servers cannot be expected to provide 99% availability; they may go on and off unexpectedly. The first thing to determine is whether that is a problem at all; the second, if the data must remain available, is how to achieve that.
  4. Security: This challenge also stems from having a swarm of micro-servers serving a particular dataset. If any micro-server can produce a chunk of that dataset, how do you prevent a spammer from getting in and producing falsified content? If we want to avoid centralised networks, authority-based solutions such as a Public Key Infrastructure (PKI) are not an option; we need decentralised authentication mechanisms.
  5. Accessibility: How do we make Linked Data accessible to those who are illiterate? As highlighted earlier, not everyone can read and write, but illiterate persons can still talk. We need to take vocal technologies into account to make Linked Data accessible to them. We can also investigate graphics-based data acquisition techniques with visual representations of information.
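
To make challenge 1 tangible, here is a minimal, purely hypothetical Python sketch in which a pre-distributed lookup table, shared among micro-servers, takes the place of DNS when turning a host-name-based URI into a routable address. Nothing here is an established protocol; every prefix and address is an assumption:

    # Pre-distributed routing table, e.g. gossiped between micro-servers.
    PREFIX_TABLE = {
        "http://data.example.org/": "10.0.0.12",
        "http://stats.example.org/": "10.0.0.17",
    }

    def dereference_without_dns(uri: str) -> str:
        """Rewrite a host-name based URI to an IP-based one via the local table."""
        for prefix, ip in PREFIX_TABLE.items():
            if uri.startswith(prefix):
                host = prefix.split("//", 1)[1].rstrip("/")
                return uri.replace(host, ip, 1)
        raise LookupError("no local route known for " + uri)

    print(dereference_without_dns("http://data.example.org/resource/village-a"))
    # -> http://10.0.0.12/resource/village-a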

More about this

This is a presentation that Stefan Schlobach gave at ISWC2011 on this topic:

You are also invited to read the associated paper "Is data sharing the privilege of a few? Bringing Linked Data to those without the Web" and check out two projects working on the mentioned challenges: SemanticXO and Voices.

Source: Think Links

Yesterday, I had the pleasure of giving a tutorial at the NBIC PhD Course on Managing Life Science Information. This week-long course focused on applying semantic web technologies to get to grips with integrating heterogeneous life science information.

The tutorial I gave focused on exposing relational databases to the web using the awesome D2R Server. It's really a great piece of software that shows results right away. Perfect for a tutorial. I also covered how to get going with LarKC and where that software fits in the whole semantic web data management space.
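
For readers who want to try this at home, a minimal sketch of querying the SPARQL endpoint that D2R Server puts in front of a relational database; port 2020 is D2R's default, but the URL is otherwise an assumption about the setup:

    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://localhost:2020/sparql")  # D2R default port
    sparql.setQuery("""
        SELECT ?s ?p ?o
        WHERE { ?s ?p ?o }
        LIMIT 10
    """)
    sparql.setReturnFormat(JSON)

    for row in sparql.query().convert()["results"]["bindings"]:
        print(row["s"]["value"], row["p"]["value"], row["o"]["value"])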

On to the story…

The students easily exposed our test database (GPCR receptors) as RDF using D2R. Now the cool part: I found out just before the start of my tutorial that the day before they had set up an RDF repository (Sesame) with some personal profile information. So, on the fly, I had them take the RDF produced by the database conversion and load it into the repository from the day before. This took a couple of clicks. They were then able to query over both their personal information and this new GPCR dataset. Without much work we had munged together two really different data sets.
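
Programmatically, those couple of clicks correspond roughly to the sketch below, using Sesame's HTTP protocol; the host, repository name and file name are assumptions, not the course's actual setup:

    import requests

    # Hypothetical repository URL, following Sesame's HTTP protocol layout.
    REPO = "http://localhost:8080/openrdf-sesame/repositories/course"

    # 1. Upload the RDF that D2R exported from the GPCR database.
    with open("gpcr.ttl", "rb") as f:
        resp = requests.post(
            REPO + "/statements",
            data=f,
            headers={"Content-Type": "text/turtle"},
        )
    resp.raise_for_status()

    # 2. One query now spans both data sets sitting in the repository.
    resp = requests.get(
        REPO,
        params={"query": "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"},
        headers={"Accept": "application/sparql-results+json"},
    )
    for b in resp.json()["results"]["bindings"]:
        print(b["s"]["value"], b["p"]["value"], b["o"]["value"])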

This is old hat to any Semantic Web person, but it was a great reminder of how the flexibility of RDF makes it easy to mash up data. No messing about with tables or figuring out if the schema is right; just load it up into your triple store and start playing.


Source: Think Links

I've been a bit quiet for the past couple of months. First I was on vacation, and then we were finishing up the following demo for the Open PHACTS project. This is one of the main projects I'll be working on for the next 2.5 years. The project is about integrating and exposing data for pharmacology. The demo below shows the first results of what we've done in the first 6 months of the project. Eventually, we aim to have the platform we're developing be fully provenance enabled, so that all the integrated results can be checked and filtered based on their sources. Check it out and let me know what you think. Sorry for the poor voice over… it's me :-)
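
As a toy illustration of what "provenance enabled" means here, the sketch below keeps each source's triples in a separate named graph and filters results by trusted source. The URIs and figures are hypothetical, and Open PHACTS' actual provenance model is considerably richer than this:

    from rdflib import Dataset, Literal, Namespace

    EX = Namespace("http://example.org/")  # all URIs here are hypothetical

    ds = Dataset()
    chembl = ds.graph(EX.chembl)      # one named graph per data source
    drugbank = ds.graph(EX.drugbank)

    chembl.add((EX.aspirin, EX.ic50, Literal(12.3)))
    drugbank.add((EX.aspirin, EX.indication, Literal("pain relief")))

    # Filtering by source: only report facts coming from graphs we trust.
    trusted = [chembl]
    for source in trusted:
        for s, p, o in source.triples((EX.aspirin, None, None)):
            print(s, p, o, "from", source.identifier)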

Original Post


The LarKC project's development team would like to announce a new release (v3.0) of the LarKC platform, which is available for download here. The new release is a considerable improvement over the previous release (v2.5), with the following distinctive features. Platform: a new (plain) plug-in registry; light-weight plug-in loading, and thus a very low platform start-up time [...]

With effect from November 15th there is a vacancy for a

PhD student CEDAR – Linked data

38 hours a week (1.0 fte) (for 48 months in total)

(Vacancy number DANS-2011-CEDARPhD1) (repeated call)

DANS, in collaboration with the IISH and the VU University Amsterdam, is working on a project of the Computational Humanities programme of the KNAW: "Census data open linked – From fragment to fabric – Dutch census data in a web of global cultural and historic information (CEDAR)".

Project Background

This project builds a semantic data-web of historical information taking Dutch census data as a starting point. With such a web we will answer questions such as:

  • What kind of patterns can we identify and interpret in expressions of regional identity?
  • How to relate patterns of changes in skills and labor to technological progress and patterns of geographical migration?
  • How to trace changes of local and national policies in the structure of communities and individual lives?

This project also applies a specific web-based data model – exploiting Resource Description Framework (RDF) technology – to make census data inter-linkable with other hubs of historical socio-economic and demographic data and beyond. The project will result in generic methods and tools to weave historical and socio-economic datasets into an interlinked semantic data-web. Further synergy will be created by linking CEDAR to Data2Semantics, a COMMIT project.
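
As an illustration of the kind of model meant here (with hypothetical URIs and a made-up figure, not CEDAR's actual vocabulary), a single census observation in RDF might look as follows, with the link to DBpedia weaving it into the wider data web:

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    CEDAR = Namespace("http://example.org/cedar/")  # hypothetical vocabulary
    g = Graph()

    obs = CEDAR.observation_1889_utrecht_weavers
    g.add((obs, RDF.type, CEDAR.CensusObservation))
    g.add((obs, CEDAR.censusYear, Literal(1889)))
    g.add((obs, CEDAR.occupation, Literal("weaver")))
    g.add((obs, CEDAR.headCount, Literal(412)))      # made-up figure
    # The link that connects this observation to other data hubs:
    g.add((obs, CEDAR.place, URIRef("http://dbpedia.org/resource/Utrecht")))

    print(g.serialize(format="turtle"))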

Information on the position

The PhD project, titled "Linked Open Data curation model in social science history – the case of Dutch census data", will be supervised by Professor Frank van Harmelen (VU Amsterdam). You will work in a project team consisting of another PhD student (PhD project "Theory and Practice of data harmonization in social history", under the supervision of Professor Kees Mandemakers (Erasmus School of History, Culture and Communication, Erasmus University Rotterdam; IISH)) and a postdoc experienced in complex network analysis and visualization. The project will be coordinated by Dr Andrea Scharnhorst (DANS, e-humanities group). It is part of the Computational Humanities Programme of the KNAW, which will be hosted at the e-humanities group (ehumanities.nl) and in which further projects (with PhD students, postdocs and senior staff) in the area of computational and digital humanities will be carried out.

You will conduct research on:

  • Review of existing data models of census data, adaptation and modification, construction of the RDF model, and links to other semantic web sources
  • Query design (specific to different user communities)
  • Development of RDF models of census data (historical variables)
  • Mapping of different ontologies across domains and over time
  • Development of best practices to enable take-up of linking and re-use of data in other scientific disciplines and take-up in other KNAW institutes
  • Visual navigation through RDF-modelled information spaces

Position requirements

You should preferably have the following qualifications:

  • A Master's degree in computer science, artificial intelligence, information science or a related area
  • Interest in and knowledge of semantic technologies and their deployment on the Web
  • Fluency in spoken English and excellent written and verbal communication skills
  • Knowledge of Dutch would be an advantage
  • Willingness and proven ability to work in a team and to liaise with colleagues in an international and interdisciplinary research environment

Appointment and Salary

The position involves a temporary appointment with DANS for 4 years, with a 2-month probation period. Applicants should have the right to work in the Netherlands for the duration of the contract. The gross salary will be € 2.042,- per month in the first year, rising to € 2.612,- per month in the fourth year for a full-time appointment (scale P, for a PhD position, CAO-Dutch Universities).

DANS offers an extensive package of fringe benefits, such as an 8.3% year-end bonus, 8% holiday pay, a good pension scheme, 6 weeks of holiday on an annual basis, and the possibility to buy or sell vacation hours.

Place of employment will be DANS – Data Archiving and Networked Services. The main working location will be at the e-Humanities Group of the KNAW (location: Meertens Institute, Joan Muyskenweg, Amsterdam).

Information

For the text of the CEDAR proposal, follow the link from HSN News: http://www.iisg.nl/hsn/news/. For further information please contact Dr. Andrea Scharnhorst (andrea.scharnhorst@dans.knaw.nl) or call (+31) (0)6 23 63 32 93.

Applications

Please send an application including:

  1. a letter of motivation
  2. a CV, a copy of your Master's thesis, and a list of M.Sc. courses and grades
  3. the names and addresses of two referees

before October 15th, 2011 to DANS, t.a.v. Hetty Labots, Personnel Department, P.O. Box 95366, 2509 CJ Den Haag, The Netherlands, or (preferably) by e-mail to sollicitaties@dans.knaw.nl.

Interviews will probably take place at the end of October 2011 in Amsterdam. If you have already applied for this position, please do not apply again.

See also http://www.academictransfer.com/employer/KNAW/vacancy/11365/lang/en/
