News and Updates on the KRR Group

The LarKC development team is proud to announce the new release V2.5 of the LarKC platform. The new release is a considerable improvement over the previous V2.0 edition, with the following distinctive features:

  • V2.5 is fully compliant with the LarKC final architecture. You can now develop your workflows and plugins, and be assured that future updates won’t change the main APIs.
  • The Management Interface, which makes it possible to run LarKC from your browser, has an updated RESTful implementation. Besides RDF/XML, workflows can now be described in very readable N3 notation.
  • The endpoint for submitting queries to LarKC is now user-definable, and multiple endpoints are supported.
  • The Plug-in Registry has been improved, and is now coupled with the browser-based Management Interface.
  • LarKC now uses a Maven-based build system, giving improved version and dependency management, and a simplified procedure for new plug-in creation.
  • A number of extra tools have been introduced to make life a lot easier for LarKC users. Besides the Management Interface to run LarKC from your browser, V2.5 also contains:
    • A WYSIWYG Workflow Designer tool that allows you to construct workflows by drag-and-drop, right from your browser: click on some plugins, drag them to the workspace, click to connect them, and press run! (see screenshot below).
    • An updated plug-in wizard for Eclipse.
  • We have thoroughly updated the distributed execution framework. Besides deploying LarKC plugins through Apache (simply by dropping them in your Apache folder), it is now also possible to deploy plugins through JEE (for webservers) or GAT (for clusters).
  • The WYSIWYG Workflow Designer allows you to specify remote execution of a plugin simply by connecting a plugin to a remote host. Templates are provided for such remote host declaration.
  • LarKC now takes care of advanced data caching for plug-ins.
  • V2.5 comes with extended and improved JUnit tests.
  • Last but not least, we have considerably improved documentation and user manuals, including a quick-start guide, tutorial materials and example workflows.

The release can be downloaded from
The platform’s manual is available at

Bugs can be submitted using the bug tracker at

As usual, you are encouraged to use the discussion forums and mailing lists served by the LarKC@SourceForge development environment.
please see at

LarKC Workflow Editor

Should the semantic web be just about querying RDF? Or is it useful (or even feasible) to use the semantics of RDF, RDF Schema and OWL to derive additional information from the published RDF graphs? Both the feasibility and the usefulness of this depend on the number of additional triples that are derived by inference: when it is almost zero, there is little point to inference; when it is explosively large, inference might become infeasible.
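To make the idea of "additional triples derived by inference" concrete, here is a minimal sketch in Python. It is only an illustration: it applies just two RDFS entailment rules (subclass transitivity and type propagation along subclass links) to a tiny made-up graph, whereas the figures reported in this post were computed under the richer OWL-Horst semantics. The `ex:` names and the `rdfs_closure` function are hypothetical.

```python
# Toy forward-chaining closure under two RDFS rules (rdfs9 and rdfs11).
# Triples are plain (subject, predicate, object) tuples of strings.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(explicit):
    """Return explicit triples plus everything derivable by rdfs9/rdfs11."""
    closure = set(explicit)
    changed = True
    while changed:  # repeat until no rule produces a new triple
        changed = False
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if p2 != SUBCLASS or s2 != o:
                    continue
                if p == SUBCLASS:
                    # rdfs11: subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    # rdfs9: an instance of a class is an instance of its superclasses
                    new.add((s, TYPE, o2))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


# Hypothetical example graph with three explicit triples.
explicit = {
    ("ex:Accra", TYPE, "ex:City"),
    ("ex:City", SUBCLASS, "ex:PopulatedPlace"),
    ("ex:PopulatedPlace", SUBCLASS, "ex:Place"),
}
closure = rdfs_closure(explicit)
inferred = closure - explicit
print(len(explicit), "explicit,", len(inferred), "inferred")
```

Even in this tiny graph the number of inferred triples equals the number of explicit ones; deeper class hierarchies and more expressive rule sets are what push the ratio up for some datasets.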

LarKC researchers at OntoText produced an informative table showing the number of additional triples that can be inferred from some of the most popular datasets on the Web. It’s interesting to see how the datasets differ in their semantic richness, with the ratio of inferred triples to explicit triples ranging from close to zero (CIA Factbook) to a 16-fold increase (for DBpedia). Please let us know if you have similar statistics for other datasets.

All of the data below is taken from FactForge, which by itself now contains 1.5 billion triples, nearly four times as many as at the beginning of the LarKC project in 2008. All of the figures below were obtained with BigOWLIM 3.4, under the OWL-Horst semantics. Size is reported in thousands of triples (the explicit-triple column sums to roughly 1.5 billion).

| Dataset | Explicit Indexed Triples | Inferred Indexed Triples | Total of Indexed Triples | Entities (nodes in the graph) | Inferred closure ratio |
|---|---|---|---|---|---|
| Schemata (Proton, DC) and ontologies (DBpedia, Geonames) | 15 | 9 | 23 | 8 | 0.6 |
| DBpedia (SKOS | 2,915 | 47,837 | 50,751 | 1,135 | 16.4 |
| NY Times | 574 | 328 | 902 | 185 | 0.6 |
| UMBEL | 4,638 | 6,936 | 11,575 | 1,190 | 1.5 |
| Lingvoj | 20 | 182 | 201 | 18 | 9.2 |
| CIA Factbook | 76 | 4 | 80 | 24 | 0.1 |
| WordNet | 1,943 | 6,067 | 8,010 | 842 | 3.1 |
| Geonames | 142,011 | 194,191 | 336,202 | 42,738 | 1.4 |
| DBpedia core | 825,162 | 166,740 | 991,902 | 125,803 | 0.2 |
| Freebase | 494,344 | 52,411 | 546,754 | 123,511 | 0.1 |
| MusicBrainz | 45,492 | 36,572 | 82,064 | 15,585 | 0.8 |
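The "inferred closure ratio" column is not defined explicitly in the post, but the listed values are consistent with inferred triples divided by explicit triples. The short Python sketch below recomputes the total and the ratio for a few rows under that assumption; the `closure_ratio` helper is a hypothetical name, and the numbers are copied from the table as listed.

```python
# Recompute "total" and "inferred closure ratio" for a few table rows,
# assuming ratio = inferred / explicit (an assumption consistent with the table).
rows = {
    # dataset: (explicit triples, inferred triples), as listed in the table
    "NY Times": (574, 328),
    "CIA Factbook": (76, 4),
    "WordNet": (1_943, 6_067),
    "DBpedia core": (825_162, 166_740),
}


def closure_ratio(explicit, inferred):
    """Inferred triples per explicit triple."""
    return inferred / explicit


for name, (explicit, inferred) in rows.items():
    total = explicit + inferred
    ratio = closure_ratio(explicit, inferred)
    print(f"{name}: total={total:,} ratio={ratio:.1f}")
```

Small discrepancies of one unit in some totals in the table (for example Lingvoj) are what one would expect if each column was rounded independently.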

By Bosse Andersson
The first LarKC Pharma workshop was held in Stuttgart on April 19 and 20. An interesting mix of participants from pharmaceutical companies, semantic web companies and research/academia created an open atmosphere with many intense discussions and, hopefully, future interactions.

The workshop had an outline similar to previous LarKC tutorials with a twist from the pharma domain in presentations and examples.

Participants found the LarKC platform and the Linked Life Data repository useful:

  • From the pharma perspective, questions centred on what would be required for us to host and use LarKC as an internal experimental platform.
  • The semantic web companies were more interested in how to use components of LarKC or provide services that can leverage the LarKC platform.
  • The research/academia community had a specific need to learn how to quickly get LarKC up and running for the first iteration in the Innovative Medicines Initiative project OpenPhacts.

Many questions came up during lively discussions. Some were answered; others will be brought back to the consortium to address, e.g. how to lower the barrier to entry for new LarKC users.

Although LarKC is based in Europe, the project of building and applying web-scale reasoning is worldwide. One of the most exciting things about living in a connected world, and a world of abundant, location-independent computational resources, is that people anywhere in the world can do world-class AI research, and develop applications based on that research. The recent, and very rapid, increase in internet bandwidth going into Africa means that one can now use Shazam to get impromptu karaoke lyrics for the Texas country-and-western playing in a hotel bar in Accra. It also means that previously isolated African researchers can make a full contribution to the advance of semantic technology. In February, partially supported by the FP7 Active project, we had the opportunity to present LarKC, and the potential benefits of AI and human-computer collaboration, to students and researchers at the Ghana-India Kofi Annan Centre of Excellence in ICT in Ghana. Discussion following the talks was lively, with great local ideas for the application of AI in knowledge capture from small farmers, and resource allocation for rural health care. Video from some talks is being made available on, there was good coverage from the local media, and we look forward to building a collaboration with our new colleagues.


An interesting peek into Microsoft’s kitchen (the Beijing labs, by the looks of it): Probase, and the ReadWriteWeb write-up on it. It’s a very large web-fed knowledge base, including concept hierarchies (2.7 million concepts, 4.5 million subclass relations, 16 million instances). It includes all major knowledge sources (Freebase, WordNet, Cyc, DBpedia, Yago, among others), with pretty well researched quality measures. Unfortunately, none of the data is Linked in any way, and none of it is available, let alone in some standard format. This is interestingly different from IBM’s Watson knowledge base, which is mostly filled with knowledge extracted from linguistic sources (although structured data does play a limited role). Probase seems to rely much more on structured knowledge sources.

Web of Data Interpreter (WoDI) is a recently launched spin-off company from the LarKC project, currently located in Innsbruck, Austria. The targeted development segment of WoDI is the implementation of intelligent tools and methods for accessing, reasoning and consuming linked data. The main areas of WoDI innovation are scalable reasoning with rules and streams of […]

The LarKC team is proud to announce that its tutorial “Scalable Integration and Processing of Linked Data” was accepted at this year’s International World Wide Web Conference 2011 (WWW’11) that will take place from March 28th to April 1st, 2011 in Hyderabad, India.
Abstract of the Tutorial
The goal of this tutorial is to introduce, motivate and […]

Based on the great success of LarKC’s Early Adopters Tutorials, the next workshop is targeted to drug development and discovery.
The LarKC Pharma Workshop will take place on April 19th and 20th 2011 in Stuttgart. The workshop builds on previous tutorials and is customized for drug development/discovery participants. Having completed this workshop, participants will have the […]

We are glad to announce that the LarKC Platform Release v2.0 is now available in our repository on SourceForge.
The redistributable package can be downloaded via the following URL: (OS independent)

The source code belonging to this release can be checked out from SVN:

A complete manual for both users and developers can be found at:

If […]

LarKC announces a new release of mpiJava (1.2.6) – a Message-Passing Interface (MPI) library, allowing a Java application to efficiently run on a distributed, parallel, and high-performance computer architecture. First introduced in the HPJava project and developed by

  • Pervasive Technology Labs, Indiana University,
  • Syracuse University, and
  • CSM, University of Portsmouth,

mpiJava@SourceForge is now managed and maintained by the High Performance Computing Center Stuttgart (HLRS) in the framework of LarKC.

The library is easy to deploy and use within application code, in particular for plug-ins. Among the new features are true multi-platform support (thanks to a CMake-based configuration procedure), very high performance (achieved by efficient use of the underlying native MPI C implementation), and support for the most widely used MPI implementations (MPICH, Open MPI, and MS-MPI).
