WELCOME TO COLOURING LONDON
DATA ETHICS POLICY
Trust in Colouring London and adherence to high data ethics standards are crucial to the platform's success and longevity. This includes trust that the stated aims and objectives are as described; that data are accurate (or, if not, that this is made clear); that the privacy of building occupants and of platform users is respected; that partnerships are properly looked after; and that the project is responsibly managed. A breakdown of trust in any of these areas is considered a major risk to the sustainability of the project.
Colouring London is designed for everyone. It looks to create a safe, positive, constructive space for users of all genders, ages, cultural backgrounds and abilities to enjoy and benefit from, and a place where the sharing of knowledge is supported. The Open Data Institute's Data Ethics Canvas is used below to help us identify and address potential ethical concerns.
THE DATA ETHICS CANVAS
Data ethics is described by the Open Data Institute (ODI) as:
"a branch of ethics that evaluates data practices with the potential to adversely impact on people and society - in data collection, sharing and use."
Ethical use of data builds trust and helps data work for everyone.
Colouring London is using the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of the project.
As part of the development process, existing and new features within the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to the core questions are given below.
WHERE ARE DATA FROM? ARE PERSONAL OR SENSITIVE DATA INVOLVED?
Colouring London plans to collect around fifty subcategories of data within the following twelve categories, to support research into the sustainability of London's building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability (including energy rating and estimated lifespan), design/construction team, planning/designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
Our data are accessible at no cost and will also be available on the GLA's London Datastore. Most data we are collating, collecting and/or generating relate to physical characteristics of buildings that can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, but access is restricted, so it is not available to the public or to academia. Some datasets, such as building designation and energy rating, are already publicly available. Property websites also contain large amounts of information on citizens' homes, including interior images.
We do not collect personal data, and actively discourage users on our 'sign up' page from contributing email addresses or real names to the site. Our sign up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring London they are creating a permanent, public record of all data they add, remove, or change; that the database will record the username and ID of the editor, along with the time and date of the change; and that all of this information will be made publicly available through the website and through bulk downloads of the edit history.
Data are gathered in three ways. First and foremost, by identifying and collating existing datasets held by central and regional government bodies and other organisations. Secondly, by harnessing knowledge held within the historic environment sector and community planning/heritage groups, and by those using and studying the building stock. Thirdly, through the use of computational approaches, which allow existing attribute data to be used to estimate building characteristics that can later be verified.
Our job is to bring together and visualise fragmented information, make it more accessible and increase data accuracy through verification. Some data may derive from observation of the building itself, some from historical texts and some from ready-to-go spatial statistics. 'Like me?' is different in that it asks for the user's personal view on whether the building contributes to the city. The platform is also being designed as a collaborative data maintenance project as described in the ODI's handbook at https://collaborative-data.theodi.org/.
HOW ARE WE ADDRESSING ACCURACY, BIAS AND INCOMPLETENESS?
To help address issues of accuracy and bias, a number of features are being included. Each subcategory will have an accessible edit history, a source box, a verification button, and a query button so that problems that cannot be addressed within the editing system can be raised. Moderated dropdown options for sources range from 'expert assessment viewed at first hand' to an option to reference a specific scientific paper. Edit histories also allow users to assess the accuracy of data. In certain cases, the phrasing of specific subcategory questions is also used to address uncertainty, with data collected, unusually, at building level.
Our main editing page contains a clear statement that data are derived from multiple sources and that accuracy of the data must be determined by the user.
As with the Wikipedia model, the project is being planned as a low-cost model, overseen by expert contributors. Our stewarding structure is currently being developed.
WHO ARE WE SHARING DATA WITH AND UNDER WHAT CONDITIONS?
Colouring London has been designed by University College London as a free knowledge exchange platform that collates, collects and generates open data on London's building stock that can be used by everyone. We do not sell data and we will not share users' personal data (such as their email address) with other organisations.
The user agreement, which must be accepted on our sign up page, is explicit about how contributed data can be used. Colouring London contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring London contributors.
Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.
As noted above, the sign up agreement emphasises that when you make a contribution to Colouring London, you are creating a permanent, public record of all data added, removed, or changed by you. It also states that UCL is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Users are encouraged to record data sources wherever possible.
Our platform code is also open and we encourage its use by other cities and towns. Code is available on our GitHub site at https://github.com/tomalrussell/colouring-london under the following licensing terms:
'Colouring London Copyright (C) 2018 Tom Russell and Colouring London contributors. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.'