News and Updates on the KRR Group

Software Carpentry Bootcamp @VU

Posted by data2semantics in collaboration | computer science | large scale | semantic web | vu university amsterdam

Source: Data2Semantics


VU University Amsterdam (Photo credit: Wikipedia)

Learn to build better code in less time.

Software Carpentry (http://www.software-carpentry.org) is a two-day bootcamp for researchers to learn how to be more productive with code and software creation. VU University Amsterdam brings Software Carpentry to the Netherlands for the first time. PhD students, postdocs and researchers in physics are cordially invited to this free, two-day workshop on May 2–3, 2013, in Amsterdam.

Data2Semantics is sponsoring the event to help address the issues scientists face in managing their data.

Go to http://www.data2semantics.org/bootcamp for more information and registration (max. 40 participants!).


Source: Semantic Web world for you
Via the ICT 4 Development course final presentations. Filed under: Linked Open Data, SemanticXO


Source: Semantic Web world for you
A few days ago I had the pleasure, and the luck, of taking part in the series of webinars organized by AIMS. The goal I had set myself for my presentation (in French), entitled “Clarifier le sens de vos données publiques avec le Web de données” (“Clarifying the meaning of your public data with the Web of Data”), was to demonstrate the advantage of using the Web of Data […]


Source: Think Links

Below is a post-it note summary made with our students in the Web Science course. This is the capstone class for students doing the Web Science minor here at the VU, and the summary highlights the topics they’ve learned about so far in four other courses.


Filed under: academia Tagged: summary, web science

Source: Think Links

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

600 people reached the top of Mt. Everest in 2012. This blog got about 4,900 views in 2012. If every person who reached the top of Mt. Everest viewed this blog, it would have taken 8 years to get that many views.

Click here to see the complete report.

Filed under: Uncategorized

Toekomst Kijken/Looking into the Future

Posted by data2semantics in collaboration | computer science | large scale | semantic web | vu university amsterdam

Last Wednesday, Frank van Harmelen appeared on the Dutch science TV program “Labyrint”, in which he interviewed George Dyson, Luc Steels and François Pachet about their ideas on the future of computers.

The program can be watched online (in Dutch):

And here’s the discussion session afterwards (in Dutch):

More information at the website of Labyrint.

YASGUI: Web-based SPARQL client with bells ’n whistles

Posted by data2semantics in collaboration | computer science | large scale | semantic web | vu university amsterdam

Source: Data2Semantics

A few months ago Laurens Rietveld was looking for a query interface from which he could easily query any other SPARQL endpoint.

But he couldn’t find any that fit his requirements.

So he decided to make his own!

Give it a try at: http://aers.data2semantics.org/yasgui/
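Under the hood, a web-based SPARQL client like YASGUI talks to an endpoint via the SPARQL 1.1 Protocol: the query text goes into a `query` parameter of an HTTP request. A minimal sketch in Python (the endpoint URL and query below are illustrative assumptions, not YASGUI’s actual code):

```python
from urllib.parse import urlencode

def build_sparql_request(endpoint: str, query: str,
                         fmt: str = "application/sparql-results+json") -> str:
    """Build the GET URL a web-based SPARQL client could issue.

    Per the SPARQL 1.1 Protocol, the query text is sent in the 'query'
    parameter; the result format is normally negotiated via the Accept
    header, though many endpoints also honor a 'format' parameter.
    """
    params = urlencode({"query": query, "format": fmt})
    return endpoint + "?" + params

# Illustrative endpoint and query; any public SPARQL endpoint would do.
url = build_sparql_request(
    "http://dbpedia.org/sparql",
    "SELECT ?s WHERE { ?s a ?o } LIMIT 10",
)
print(url)
```

A real client would then fetch this URL (or POST the query for large requests) and render the JSON result bindings, which is essentially what a browser-based tool does via AJAX.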

Future work (next year probably):

Comments are appreciated (including feature ideas / bug reports).

Sources are available at https://github.com/LaurensRietveld/yasgui


In recent days cyberspace has seen some discussion concerning the relationship of the EU FP7 project LDBC (Linked Data Benchmark Council) and sociotechnical considerations. It has been suggested that LDBC, to its own and the community’s detriment, ignores sociotechnical aspects.

LDBC, as research projects go, actually has an unusually large and, as of this early date, successful and thriving sociotechnical aspect, i.e. the involvement of users and vendors alike. I will discuss here why, insofar as the technical output of the project goes, sociotechnical metrics are in fact out of scope. Then again, the degree to which the benefits potentially obtained from LDBC outcomes are in fact realized does depend strongly on community building, a social process.

One criticism of big data projects we sometimes encounter is that data without context is not useful. Further, one cannot simply assume that throwing several data sets together will yield meaning, as there may be different semantics behind similar-looking things; just think of seven different definitions of blood pressure.

LDBC, in its initial user community meeting, was, in line with its charter, focusing mostly on cases where the data already exists and is of sufficient quality for the application at hand.

Michael Brodie, Chief Scientist at Verizon, is a well-known advocate of focusing on the meaning of data, not only on processing performance. There is a piece on this matter by him, Peter Boncz, Chris Bizer and myself in the SIGMOD Record: “The Meaningful Use of Big Data: Four Perspectives”.

I had a conversation with Michael at a DERRI meeting a couple of years ago about measuring the total cost of technology adoption, including sociotechnical aspects such as acceptance by users, learning curves of various stakeholders, and whether one could in fact demonstrate an overall gain in productivity from semantic technologies. ‘Can one measure the effectiveness of different approaches to data integration?’ I asked. ‘Of course one can,’ Michael answered. ‘It only involves carrying out the same task with two different technologies and two different teams, then doing a double-blind test with users. However, this never happens: doing the task even once in a large organization is enormously costly, and nobody will seriously consider doubling the expense.’ [paraphrased in my words]

LDBC does in fact intend to address technical aspects of data integration, i.e. schema conversion, entity resolution and the like. The sociotechnical aspects of this, such as whether one should integrate in the first place, whether the integration result adds value, whether it violates privacy or security concerns, whether users will understand the result, what the learning curves are, and so on, are so diverse and so thoroughly domain-dependent that a general-purpose metric cannot be developed, at least not within the time and budget constraints of the project. Further, adding a large human element to the experimental setting, e.g. how skilled the developers are, how well the stakeholders can explain their needs, how often these needs change, etc., would lead to experiments that are so expensive to carry out, and whose results contain so many unquantifiable factors, that they would constitute an insuperable barrier to adoption.

Experience demonstrates that even agreeing on the relative importance of quantifiable metrics of database performance is hard enough. Overreaching would compromise the project’s ability to deliver its core value. Let us turn to that next.

It is only a natural part of the political landscape that the EC’s research funding choices are criticized by some members of the public. Some criticism concerns the emphasis on big data. Big data is a fact on the ground, and research and industry need to deal with it. Of course there have been, and will be, critics of technology in general on moral or philosophical grounds. Instead of opening that topic, I will refer you to an article by Michael Brodie (http://www.michaelbrodie.com/michael_brodie_statement.asp). In a world where big data is a given, lowering the entry threshold for big data applications, thus making them available not only to government agencies and the largest businesses, seems ethical to me, as per Brodie’s checklist. LDBC will contribute to this by driving greater availability and performance of, and lower cost for, these technologies.

Once we accept that big data is there and is important, we arrive at the issue of deriving actionable meaning from it. A prerequisite of deriving actionable meaning from big data is the ability to flexibly process this data, and LDBC is about creating metrics for this. The prerequisites for flexibly working with data are fairly independent of the specific use case, whereas the criteria of meaning, let alone actionable analysis, are very domain-specific. Therefore, to provide the greatest service to the broadest constituency, LDBC focuses on measuring that which is most generic, yet underlies any decision support or other data processing deployment involving RDF or graph data.

I would say that LDBC is an exceptionally effective use of taxpayer money. LDBC will produce metrics that will drive technology innovation for years to come. The total money spent pursuing goals set forth by LDBC is likely to vastly exceed the budget of LDBC itself. Just think of the person-centuries, or even millennia, that have gone into optimizing for TPC-C and TPC-H. The vast majority of the money spent on these pursuits is paid by industry, not by research funding, and it is spent worldwide, not in Europe alone.

Thus, if LDBC is successful, a limited amount of EC research money will influence how much greater product development budgets are spent in the future. This multiplier effect applies, of course, to highly successful research outcomes in general, but it is especially clear with LDBC.

European research funding has played a significant role in creating the foundations of the RDF/linked data scene. LDBC is a continuation of this policy; however, the focus has now shifted to reflect the greater maturity of the technology. LDBC is about making the RDF and graph database sectors into mature industries whose products can predictably tackle the challenges out there.

Orri Erling
OpenLink Software, Inc.
