News and Updates on the KRR Group

Source: Semantic Web world for you
Today I was attending an event entitled “Data-driven Visualization Symposium” in the beautiful Trippenhuis building of the KNAW in Amsterdam. There was a really rich schedule with 10 speakers showcasing some of their work in the area of big data and visualisation. Though I would have appreciated getting a bit more of the how instead […]

Source: Semantic Web world for you
Following the discussion I had after my previous posts, here is a slightly more structured explanation of the ideas. Please feel free to ping me and/or comment on this post if you too think it’s a good idea. Filed under: Visualisation

Source: Data2Semantics

Our website with additional material for our paper: “A Fast Approximation of the Weisfeiler-Lehman Graph Kernel for RDF Data” has won the Open Science Award at ECML/PKDD 2013. The jury praised the submission as “a perfect example of open science”.

A goal of the Data2Semantics project is to provide reusable software to support semantic enrichment of data. Therefore, the software used for the paper builds on existing well-known libraries (SESAME, LibSVM) and was split into three distinct projects from the start. The heart of the software is the proppred library, which contains all the code for doing property prediction using graph kernels on RDF data. Some additional support code for handling RDF is in the d2s-tools project. All the code to run the experiments from the paper(s) is in a separate project called kernelexperiments. This setup makes it easy to replicate the experiments (and to run new ones) and makes it easier to integrate the library for property prediction on RDF into other projects.
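For readers unfamiliar with the kind of kernel named in the paper title, the basic Weisfeiler-Lehman idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it is not the proppred code and not the fast approximation from the paper, it treats graphs as undirected with labelled nodes (RDF graphs are directed and have labelled edges), and the function names `wl_features` and `wl_kernel` are made up for this example.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count node labels over several rounds of Weisfeiler-Lehman relabelling.

    adjacency: dict mapping each node to a list of its neighbours
    labels:    dict mapping each node to its initial string label
    """
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted labels of the neighbours.
            neighbour_labels = sorted(current[n] for n in neighbours)
            updated[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = updated
        features.update(current.values())
    return features


def wl_kernel(graph_a, graph_b, iterations=2):
    """Kernel value = dot product of the two label-count vectors."""
    features_a = wl_features(*graph_a, iterations=iterations)
    features_b = wl_features(*graph_b, iterations=iterations)
    return sum(count * features_b[label] for label, count in features_a.items())


# Two toy graphs (hypothetical data): a path A-B-A and a single edge A-B.
path = ({1: [2], 2: [1, 3], 3: [2]}, {1: "A", 2: "B", 3: "A"})
edge = ({1: [2], 2: [1]}, {1: "A", 2: "B"})
similarity = wl_kernel(path, edge, iterations=1)
```

The resulting kernel values can be handed to a kernel method such as an SVM, which is the role LibSVM plays in the setup described above.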

For the future, we aim to provide even more scientific openness via the experimental machine learning platform that we are developing. One of the aims of the platform is to make experimentation easier, without introducing too much overhead. Furthermore, we will export provenance of the experiments in the Prov-O format. This provenance is visualized using Prov-O-Viz (also developed in Data2Semantics), allowing researchers to gain better insight into the experiments without having to study the code.
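To give an idea of what provenance of an experiment could look like in Prov-O, here is a minimal sketch in plain Python that writes a few Turtle statements. It is not the output of our platform: the `ex:` namespace, the function name `experiment_provenance` and the identifiers in the usage example are assumptions made for illustration. Only the `prov:` terms come from the Prov-O vocabulary.

```python
from datetime import datetime, timezone

PREFIXES = (
    "@prefix prov: <http://www.w3.org/ns/prov#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
    "@prefix ex: <http://example.org/> .\n"  # hypothetical namespace
)


def experiment_provenance(run_id, dataset, result, started, ended):
    """Describe one experiment run as a prov:Activity that used a dataset
    and generated a result, serialised as Turtle text."""
    lines = [
        PREFIXES,
        f"ex:{dataset} a prov:Entity .",
        f"ex:{run_id} a prov:Activity ;",
        f"    prov:used ex:{dataset} ;",
        f'    prov:startedAtTime "{started.isoformat()}"^^xsd:dateTime ;',
        f'    prov:endedAtTime "{ended.isoformat()}"^^xsd:dateTime .',
        f"ex:{result} a prov:Entity ;",
        f"    prov:wasGeneratedBy ex:{run_id} ;",
        f"    prov:wasDerivedFrom ex:{dataset} .",
    ]
    return "\n".join(lines)


# Usage with made-up identifiers and timestamps.
turtle = experiment_provenance(
    run_id="run1",
    dataset="dataset1",
    result="accuracyTable1",
    started=datetime(2013, 9, 23, 10, 0, tzinfo=timezone.utc),
    ended=datetime(2013, 9, 23, 10, 5, tzinfo=timezone.utc),
)
print(turtle)
```

A graph like this is the kind of input a provenance visualisation can work from, since it links every result back to the run and the data that produced it.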

Source: Semantic Web world for you
Yesterday I was sitting in a very interesting meeting with some experts in data visualisation. There were a lot of impressive things presented and the names of the Wii remote and Kinect were mentioned a couple of times. As I have observed so far, these devices are used as a cheap way to get sensors. And they certainly […]

Source: Data2Semantics

Paul Groth co-authored an article about altmetrics in the Elsevier Library Connect newsletter for librarians. The newsletter reaches 18,000 librarians in 138 countries around the world.

Academic research and publishing have transitioned from paper to online platforms, and that migration has continued to evolve from closed platforms to connected networks. With this evolution, there is growing interest in the academic community in how we might measure scholarly activity online beyond formal citation.

See more at: http://bit.ly/19bEVpD

Source: Data2Semantics

As a complement to two papers that we will present at the ECML/PKDD 2013 conference in Prague in September we created a webpage with additional material.

The first paper: “A Fast Approximation of the Weisfeiler-Lehman Graph Kernel for RDF Data” was accepted into the main conference and the second paper: “A Fast and Simple Graph Kernel for RDF” was accepted at the DMoLD workshop.

We include links to the papers, to the software and to the datasets used in the experiments, which are stored in figshare. Furthermore, we explain how to rerun the experiments from the papers using a precompiled JAR file, to make the effort required as minimal as possible.

Source: Semantic Web world for you
Anyone visiting the Netherlands will inevitably stumble upon some “BakFiets” in the streets. This Dutch speciality, which seems to be the result of cross-breeding a pick-up with a bike, can be used for many things, from getting the kids around to moving a fridge. Now, let’s consider a Dutch bike shop that sells some Bakfiets […]

Source: Semantic Web world for you
Yesterday was the closing event of the Pilot Linked Open Data project. A sizeable crowd of politicians, civil servants, hackers, SME owners, open data activists and researchers gathered in the very nice building of the RCE in Amersfoort to hear about what has been done within this one-year project led by Erwin Folmer. […]

Source: Data2Semantics

This June 10 and 11, the Data2Semantics team locked itself in a room in the Amsterdam Public Library to build a first version of the Data2Semantics Golden Demo: a pipeline for publishing enriched data (‘semantics’) directly from Dropbox to Figshare, integrated in the Linkitup webservice.

In two days, we built and integrated:

Watch the video!

 

Source: Think Links

I think I’ve been attending ESWC (Extended/European Semantic Web Conference) ever since I moved to Europe, and I always get something out of the event. There are plenty of familiar faces but also quite a few new people, and it’s a great environment for having chats. In addition, the quality of the content is always quite good. This year the event was held in Montpellier and was for the most part well organized: the main conference wifi worked!

The stats:

  • 300 participants
  • 42 accepted papers from 162 submissions
  • 26% acceptance rate
  • 11 workshops + 7 tutorials

So what was I doing there:

The VU Semantic Web group also had a strong showing:

  • Albert Meroño-Peñuela won the best PhD symposium paper for his work on digital humanities and the semantic web.
  • Datasets from the USEWOD workshop (led by Laura Hollink) were used by a number of main track papers for evaluation.
  • Stefan Schlobach and Laura Hollink were on the organizing committee. And we organized a couple of workshops & tutorials.
  • Posters/Demos:
    • Albert Meroño-Peñuela, Rinke Hoekstra, Andrea Scharnhorst, Christophe Guéret and Ashkan Ashkpour. Longitudinal Queries over Linked Census Data.
    • Niels Ockeloen, Victor de Boer and Lora Aroyo. LDtogo: A Data Querying and Mapping Framework for Linked Data Applications.
  • Several workshop papers.

I’ll try to pull out what I thought were the highlights of the event.

What is a semantic web application?

Can you escape Frank?

The keynotes from Enrico Motta and David Karger focused on trying to define what a semantic web application is. The discussion starts with the question: does a Semantic Web application need to use the Semantic Web set of standards (e.g. RDF, OWL, etc.)? From my perspective, the answer is no. These standards are great infrastructure for building these applications, but are they necessary? No (see the Google Knowledge Graph). Then what is a semantic web application?

From what I could gather, Motta would define it as an application that is scalable, uses the web and embraces Model Theoretic semantics. For me that’s rather limiting: there are many other semantics that may be appropriate… we can ground meaning in something other than model theory. I think a good example of this is the work on Pragmatic Semantics that my colleague Stefan Schlobach presented at the Artificial Intelligence meets the Semantic Web workshop. Or we can reach back into AI and see the discussions from Brooks’ classic paper Elephants Don’t Play Chess. I felt that Karger’s definition (in what was a great keynote) was getting somewhere. He defined a semantic web application essentially as:

An application whose schema is expected to change.

This seems to me to capture the semantic portion of the definition, in the sense that the semantics need to be understood on the fly. However, I think we need to roll the web back into this definition… Overall, I thought this discussion was worth having and helps the field define what it is that we are aiming at. To be continued…

Homebrew databases

As I said, I thought Karger’s keynote was great. He gave a talk within a talk, on the subject of homebrew databases from this paper in CHI 2011:

Amy Voida, Ellie Harmon, and Ban Al-Ani. 2011. Homebrew databases: complexities of everyday information management in nonprofit organizations. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’11). ACM, New York, NY, USA, 915-924. DOI=10.1145/1978942.1979078 http://doi.acm.org/10.1145/1978942.1979078

They define a homebrew database as “an assemblage of information management resources that people have pieced together to satisfice their information management needs.” This is just what we see all the time: the combination of Excel, Word, email, databases and, don’t forget, normal paper brought together to try to attack information management problems. A number of our use cases from the pharma industry as well as science reflect essentially this practice. It’s great to see a good definition of this problem grounded in ethnographic studies.

The Concerns of Linking

There were a couple of good papers on generating linkage across datasets (the central point of linked data). In Open PHACTS, we’ve been dealing with the notion of essentially context-dependent linkages. I think this notion is becoming more prevalent in the community. We had a lot of positive response on this in the poster session when presenting Open PHACTS. Probably my favorite paper was on linking the Smithsonian American Art Museum to the Linked Data cloud. They use PROV to drive their link generation, essentially proposing links to humans who then verify the connections. See:

I also liked the following paper on which hardware environment you should use when doing link discovery. Result: use GPUs, they’re fast!

Additionally, I think the following paper is cool because they use network statistics not just to measure but to do something, namely create links:

APIs

APIs were a growing theme of the event, with things like the Linked Data Platform working group and the successful SALAD workshop. (Fantastic acronym.) I was surprised, though, that people in the workshop hadn’t heard of the Linked Data API. We had a lot of good feedback on the Open PHACTS API. It’s just the case that there is more developer expertise for using web service APIs than for semweb tech. I’ve actually seen a lot of demand for semweb skills, and while we are doing our best to train people, there is still this gap. It’s good then that we are thinking about how these two technologies can play together nicely.

Random Notes

Filed under: events, linked data Tagged: conference, eswc2013, linked data, semantic web