News and Updates on the KRR Group

LarKC announces a new release of mpiJava (1.2.6) – a Message-Passing Interface (MPI) library, allowing a Java application to efficiently run on a distributed, parallel, and high-performance computer architecture. First introduced in the HPJava project and developed by

  • Pervasive Technology Labs, Indiana University,
  • Syracuse University, and
  • CSM, University of Portsmouth,

mpiJava@SourceForge is now managed and maintained by the High Performance Computing Center Stuttgart (HLRS) within the framework of LarKC.

The library is easy to deploy and use within application code, in particular for plug-ins. Among the new features are true multi-platform support (thanks to a CMake-based configuration procedure), very high performance (achieved by efficient utilisation of the underlying native MPI-C implementation), and support for the most widely used MPI implementations (MPICH, Open MPI, and MS-MPI).
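To give a flavour of how an application uses the library, here is a minimal "hello world" sketch following the standard mpiJava API. Note that it is not self-contained: it requires mpiJava on the classpath and a native MPI installation, and must be launched with the MPI job launcher (the `-np 4` below is just an illustrative process count).

```java
// Minimal mpiJava sketch. Requires mpiJava on the classpath and a native
// MPI (e.g. Open MPI, MPICH, or MS-MPI) installed; launch with your MPI's
// job launcher, for example: mpirun -np 4 java Hello
import mpi.MPI;

public class Hello {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                    // initialise the MPI environment
        int rank = MPI.COMM_WORLD.Rank();  // this process's id within the world
        int size = MPI.COMM_WORLD.Size();  // total number of processes
        System.out.println("Hello from rank " + rank + " of " + size);
        MPI.Finalize();                    // shut down MPI cleanly
    }
}
```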


Source: Think Links

This has been a great week if you think that it’s important to know the origins of content on the web. First, Google announced support for explicit metadata describing the origins of news article content, which will be used by Google News. Publishers can now identify, using two tags, whether they are the original source of a piece of news or are syndicating it from some other provider. Second, the New York Times now has the ability to do paragraph-level permalinks. (So this is the link to the third paragraph of an article on Starbucks recycling.) So one can link to the exact paragraph when quoting a piece. This was supported by some other sites as well, and there’s a WordPress plug-in for it, but having the Times support it is big news. Essentially, with a couple of tweaks these techniques could make the quote pattern that you see in blogs (shown below) machine readable.
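For concreteness, the two tags Google described are metatags placed in the page head, roughly along these lines (the URLs here are placeholders, not real articles):

```html
<!-- On a syndicated copy of an article: point at the canonical source -->
<meta name="syndication-source" content="http://example.com/original-article.html">

<!-- On the original article itself: assert that this URL is the origin -->
<meta name="original-source" content="http://example.com/original-article.html">
```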

In the W3C Provenance Incubator Group, which is just wrapping up, one of the main scenarios was how to support a news aggregator that makes use of provenance to help determine the quality of the articles it automatically creates. With these developments, we are moving one step closer to making this scenario possible.

To me, this is more evidence that with simple markup, and simple link structures, we can achieve the goal of having machines know where content on the web originates. However, like with a lot of the web, we need to agree on those simple structures so that everyone knows how to properly give credit on the web.

Filed under: provenance markup Tagged: google news syndication tags, new york times, permalinks, provenance

The 2010 LarKC PhD Symposium was held in Beijing on Nov 14th, 2010. More than 40 participants attended the symposium (most of them were participants in the 4th LarKC Early Adopters Tutorial).

The proceedings of the symposium can be downloaded here.

The LarKC PhD Symposium is an annual event organized by the Large Knowledge Collider (LarKC) Consortium. The main objective of this symposium series is to provide a communication platform for young researchers (especially PhD students) to present their recent progress in the EU FP-7 LarKC project, Web-scale reasoning, and the Semantic Web in general.

The seminar is open and free to everyone who is interested. The 2010 LarKC PhD Symposium is the second in the series; the first was held jointly with the STI PhD Seminar 2009 in Berlin. The participants of that event all agreed that they learned a lot from each other, which is one of the most important reasons for holding the event again this year. During year 2 of the LarKC project, the consortium has made a great deal of progress on Web-scale reasoning and search, ranging from new selection and reasoning strategies to real-world use cases. Much of this work comes from PhD students and young researchers in the consortium, and we are very proud to have these researchers report their recent results at the 2nd LarKC PhD Symposium.

In addition, we are very pleased to see external plug-in contributions from outside the LarKC consortium, made in close collaboration with LarKC members. The speakers at the 2nd LarKC PhD Symposium come from China, Germany, Italy, the Netherlands, the UK, and elsewhere, and we are pleased to have several talks covering a wide range of topics in the Semantic Web, machine learning, and AI in general.

The topics focus on, but are not limited to: natural language interfaces to ontologies, segmentation strategies for Web-scale data, machine learning meets the Semantic Web, selection strategies, parallel and contrastive reasoning for the Semantic Web, and Semantic Web-enabled recommender systems. Some of the speakers are still in their PhD programs, so we are very pleased to have several senior members from within and outside the LarKC consortium make comments and suggestions on their future research in the area of Web search and reasoning. More importantly, the speakers will learn from each other during the symposium.

by Yi Zeng
The 4th LarKC Early Adopters Tutorial took place in Beijing, China on Nov 13th, 2010.
LarKC 4th Early Adopters Tutorial

Approximately 90 participants attended the tutorial, which comprised 2 introductions, 4 hands-on sessions, and use-case demos from Urban Computing. The tutorial was bilingual (English and Chinese), with most of the talks translated in real time for the audience.

The participants agreed that LarKC is easy to use as a pluggable platform for Web-scale reasoning. The materials of the 4th LarKC Early Adopters Tutorial can be downloaded from the LarKC SourceForge site and the following addresses:

Source: Think Links

Current ways of measuring scientific impact are rather coarse-grained; they often don’t capture the many different ways that science and scientists might have impact. As science is increasingly done online and in the open, new metrics are being created to help measure this impact. Jason Priem, Dario Taraborelli, myself, and Cameron Neylon have recently put out a manifesto outlining a research direction for these new metrics, termed alt-metrics.

You can read the manifesto here:


Filed under: academia Tagged: alt-metrics, science impact

The fourth Early Adopters Tutorial will be held in conjunction with the next LarKC project meeting on the 13th of November, 2010 in the Gongda Jianguo Hotel, Beijing University of Technology, Beijing, China. This tutorial will enable participants to get access to early research results and technologies from the LarKC project, and will mainly focus on […]

Source: Think Links

I wrote a post a while back around the idea of Data DJs: how do we make it as easy to mix data as it is to mix music? This notion requires advances on several fronts, from data and knowledge integration to user interfaces, along with data provenance and semantics. Most of the research I do somehow relates to Data DJs in some form or another.

However, I always thought it would be fun to push the analogy as far as I could. Last Christmas, I got a DJ deck (specifically a Numark Stealth Control; fantastic name, right?) with the idea of actually using it to mix data sets. For a host of reasons, including time but also a lack of a clear vision of what an integration interface should look like, I never got past toying around with it. However, over the past couple of weekends I found time to revisit it and develop a super-alpha version of a data integration system using the deck. Here’s a video of what I’ve done; read on for more details.

What really got me going was the notion that events (or who, what, when, where, and why) are a perfect substrate for data integration. This is not my idea but something I’ve been hearing from a number of sources, including people in the VU’s Web and Media Group down the hall and Raphaël Troncy, and probably best summed up by Mor Naaman. With this as inspiration, I developed the preliminary interface around integrating and summarizing events (well, actually tweets, but hopefully this will expand to other event sources) that you saw in the video above. The components of the interface (shown in the picture below) are as follows:

  • On the top is a list of the search terms that were used to retrieve the tweets. The tweets for each search term can be hidden and unhidden.
  • On the right is a list of the users (i.e. sources) who made the tweets. Each source can be filtered in and out, impacting the term summary graph.
  • In the middle are all the tweets on the same timeline.
  • On the right, is a bar graph that summarizes the most common terms across the tweets.
  • Below the bar graph, is the time span of the tweets and the current time of the selected tweet.
  • On the far right are hashtags that are selected by the user.

As you saw in the video, it’s pretty fast to scroll through both sources and tweets. With a quick flick it’s easy to apply a filter, and it’s pretty natural to select and deselect search terms. Furthermore, we can easily delete tweets and data sources with the push of a button. There’s still much, much more to be done to make this a viable user interface for the kind of data mixing task we want to support. But standing in front of the projector today, scrolling through tweets, eliminating sources, and seeing an overview fly up really convinced me that this type of interaction is well suited to the data integration task. That being said, any advice or comments on the interface would be greatly appreciated. In particular, suggestions for good infographics pertaining to events would be appreciated.

Technical Details:

The interface was completely implemented using HTML5. In particular, I used the nice Protovis framework along with jQuery and jQuery Tools. To get fast updates from the deck, we use WebSockets. I have a small Java program that reads MIDI off the deck and then acts as a socket server for WebSockets, piping the MIDI signals (after translation to JSON) to the connected sockets. I’ve been using Google Chrome for development, so I don’t know how it works in other browsers. To get data, we use Twitter’s search interface and JSONP. In general, I was very impressed with what you can do in the browser. I felt like I wasn’t even pushing the capabilities, especially since I don’t do web programming every day.
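The MIDI-to-JSON translation step in that Java bridge can be sketched roughly as follows. This is my own illustrative reconstruction, not the actual code: the class name `MidiJson`, the method `toJson`, and the JSON field layout are all assumptions.

```java
// Illustrative sketch of translating a raw 3-byte MIDI short message
// (as produced by a controller like a DJ deck) into a JSON string that
// could be piped over a WebSocket to the browser. Hypothetical names.
public class MidiJson {

    /** Decode the status byte and pack the message as JSON. */
    public static String toJson(int status, int data1, int data2) {
        int command = status & 0xF0; // high nibble: message type
        int channel = status & 0x0F; // low nibble: MIDI channel
        String type;
        switch (command) {
            case 0x90: type = "noteOn";        break; // pads/buttons
            case 0x80: type = "noteOff";       break;
            case 0xB0: type = "controlChange"; break; // knobs and faders
            default:   type = "other";         break;
        }
        return String.format(
            "{\"type\":\"%s\",\"channel\":%d,\"data1\":%d,\"data2\":%d}",
            type, channel, data1, data2);
    }

    public static void main(String[] args) {
        // 0xB0 = control change on channel 0; controller 16 turned to value 64
        System.out.println(toJson(0xB0, 16, 64));
    }
}
```

A server would call `toJson` for each incoming MIDI event and write the resulting string to every connected WebSocket, so the browser-side JavaScript only ever deals with plain JSON objects.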

What’s next?

Lots! This was really just a proof of concept. There’s a bunch of directions to go in: improved graphics, better use of the decks, social interaction around integration (two DJs at once!), more data sources beyond Twitter, experiments on task performance, live mixing of an event… If you have any ideas, suggestions, or comments, I’d love to hear them.

How do you want to data DJ?

Filed under: data dj Tagged: data dj, decks, infographics, mixing data

LarKC – the Large Knowledge Collider – has been nominated as a ‘star project’ for the ICT 2010 event in Brussels. The project was presented with a stand showcasing the life science and urban computing demonstrators, and the newly released LarKC movie. The movie had its world premiere on the first morning of the ICT event and triggered a lot of very positive feedback over the course of the three-day event. The movie is published on the project Web site, and we invite you to enjoy the introduction to the LarKC project and its approach.

The LarKC consortium representatives thank the numerous visitors to the stand who showed general interest in the project, shared technical insights, intend to exploit some of the project results, or might have ideas and visions for collaborating with LarKC or its members in future activities and projects.

Watch the LarKC project movie and find out what it is all about!