Discussing Open Data
Source: Think Links
This past Thursday, I had the opportunity to participate in a mini-symposium on open data for science held by VU University Amsterdam (where I work), titled “Open Data for Science: Will it hurt or help?”
The symposium consisted of three 15-minute talks followed by some lively discussion with an audience of, I think, about 60 people from the university. We were lucky to have Jos Engelen, the chairman of the NWO (the Dutch NSF), discuss the perspective of research policy makers. The main takeaway I got from his presentation and the subsequent discussion is that open data (despite all reservations) is a worthy endeavor to pursue and something that research funders should (and will) encourage. Furthermore, just his presence means that policy makers are reaching out to see what the academic community thinks and that the community will have a say in how (open) data management policies will be rolled out in the Netherlands.
The most difficult talk to give was by Eco de Geus, who was asked to reflect on the more negative aspects of open data. He raised important points about incentive structures (will I be scooped?), privacy, and the tendency towards one-size-fits-all open data policies. I think what made the reservations more poignant is that Prof. de Geus is not against open data; indeed, he is deeply involved in a large open data project in his domain.
I talked about the view from a scientist starting out in their career. I told two stories:
- how open data really benefited a collaborator of mine in her study of interdisciplinary work practices. For a consumer of data, open data really removes a number of barriers.
- in an analogy to open code, I discussed how open source code I produced during my PhD led to more citations, a new collaboration, and others comparing their work to mine. However, these benefits were contrasted with the need to provide support and having to be comfortable exposing my work practices.
I ended by making the following points about open data:
- Open data is a boon to young scientists when they are acting as consumers of data.
- It’s a more difficult position for producers of data. There are trade-offs including concerns about credit, time for support, and time to prepare data.
- Given the previous point, if we want to help scientists as consumers of data, we need to give support to producers.
- Clear simple guidelines for data publication are critical. Scientists shouldn’t need to be lawyers to either produce or consume data sets.
- Credit where credit is due. For open data to succeed, we need data citation on par with traditional citation.
You’ll find the slides to my talk below, although they are mostly images, so they may not make much sense on their own.
Overall, I thought the talks and discussion were excellent. It’s great to see this sort of discussion happening where I work. I hope it’s happening in many other institutions as well.