WELCOME TO COLOURING LONDON
DATA ETHICS POLICY
Colouring London is designed for everyone. It looks to create a safe, positive, constructive space for users of all genders, ages, cultural backgrounds and abilities to enjoy and benefit from, where the sharing of knowledge is encouraged and supported.
User trust in Colouring London and adherence to high data ethics standards are essential to the platform's success and longevity. This includes trust: that the stated aims and objectives are actually as described; that user contributions will be treated with respect; that the GDPR principles (lawfulness, fairness and transparency; purpose limitation; data minimisation; accuracy; storage limitation; integrity and confidentiality (security); and accountability) will be met (we in fact actively discourage users from giving us personal data wherever possible); that building occupiers' privacy is prioritised in data collection, and that this is carefully balanced against the urgent need to collect information on the physical form of all London's buildings to help reduce emissions and increase the sustainability of the city as a whole; that ongoing efforts will be made to make data as accurate and accessible as possible, and available to the widest possible audience; and that the project is being responsibly and efficiently managed for the public good, is cost-effectively run, and is designed to last. A breakdown of trust in any of these areas is considered to pose a significant risk to the sustainability of the project.
Colouring London specifically requests permission from its public sector partners to display partner logos. These are considered helpful to users when assessing the trustworthiness of the project, and, in effect, act as trust marks. We use the Open Data Institute's (ODI) Data Ethics Canvas, discussed below, to identify and address potential ethical concerns, and we are also a member of the 'Facilitating Responsible Participation in Data Science' Special Interest Group at The Alan Turing Institute.
THE DATA ETHICS CANVAS
Data ethics is described by the Open Data Institute (ODI) as:
"A branch of ethics that evaluates data practices with the potential to adversely impact on people and society - in data collection, sharing and use."
Ethical use of data builds trust and helps data to work for everyone.
Colouring London is using the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of the project.
As part of the development process, existing and new features within the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to the core questions are given below.
WHERE ARE DATA FROM? ARE PERSONAL OR SENSITIVE DATA INVOLVED?
Colouring London collects over fifty subcategories of data to support research into the sustainability of London's building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability (including energy rating and estimated lifespan), design/construction team, planning/designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
Our data are accessible at no cost and will also be available on the GLA's London Datastore. Most data we are collating, collecting and/or generating relate to the physical characteristics of buildings that can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, but access to these is restricted for the public and academia. Some datasets, such as building designation and energy rating, are already publicly available. Detailed information on London's buildings is also provided by the commercial sector, with images of the interiors of homes, their prices and sale histories now commonly available on property websites.
We do not collect personal data, other than optional email addresses needed to enable users to reset site passwords, and we actively discourage users on our 'sign up' page from contributing even their real names. Although it is important for us to understand which sectors, disciplines and groups our users come from, to help us reach and remain relevant to as wide an audience as possible, we believe requests for such information should, if introduced, be optional only, with minimal information asked for. In the meantime we continue to explore how this issue can best be addressed by working and consulting across sectors, disciplines and community groups.
Our sign-up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring London they are creating a permanent, public record of all data they add, remove, or change; that the database will record the username and ID of the editor, along with the time and date of the change and that all of this information will be made publicly available through the website and through bulk downloads of the edit history.
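As a purely illustrative sketch of the kind of public edit record described above, the following Python shows one possible shape for a single entry in the edit history. The class name, field names and example values are assumptions made for illustration and are not taken from the Colouring London codebase. The record holds the username and ID of the editor and the time and date of the change, and deliberately has no email field.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class EditRecord:
    """One entry in a public edit history (hypothetical field names)."""
    user_id: int          # ID of the editor
    username: str         # public username; users are discouraged from using real names
    building_id: int      # assumed identifier for the edited building
    changed_at: datetime  # time and date of the change
    changes: dict         # assumed layout: field name -> new value (None if removed)

    def to_public_json(self) -> str:
        """Serialise the record as it might appear in a bulk download."""
        record = asdict(self)
        record["changed_at"] = self.changed_at.isoformat()
        return json.dumps(record, sort_keys=True)


# Example values are invented for illustration only.
example = EditRecord(
    user_id=42,
    username="pseudonym_123",
    building_id=1001,
    changed_at=datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc),
    changes={"date_year": 1890},
)
public_record = json.loads(example.to_public_json())
```

Keeping the record immutable (frozen) reflects the statement that each contribution creates a permanent record; how the platform actually stores its history may differ.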
Data are gathered in three ways: firstly, by identifying and collating existing datasets held by central and regional government bodies and other organisations; secondly, by harnessing knowledge held within the community, whether this be from, for example, building specialists, building users, civic societies or schools; and thirdly, through large-scale computational data generation programmes, the results of which the community is also encouraged to check.
Our job is to bring together and visualise fragmented information, make this more accessible and increase data accuracy through verification. Some data may derive from observation of the building itself, some from historical texts or ready-to-go spatial statistics. Our 'Community' section differs slightly in that it also asks users whether they think buildings contribute to the city and/or local area. Users are informed on sign-up that data cannot be accepted on the site where any restrictions on their open release apply.
The platform is also being designed as a collaborative data maintenance project as described in the ODI's handbook at https://collaborative-data.theodi.org/.
HOW ARE WE ADDRESSING ACCURACY, BIAS AND INCOMPLETENESS?
To help address issues of accuracy and bias, a number of features are being included. Each subcategory will have an accessible edit history, a source box, a verification button, and a query button to enable problems that cannot be addressed within the editing system to be raised. Moderated dropdown options for sources include 'expert assessment viewed at first hand', and a link option allows references to sources and routes to further information. Easy-to-access edit histories also allow users to assess the accuracy of data. In certain cases, the phrasing of specific subcategory questions is also used to address uncertainty. Detailed data are collected, usually at building level.
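The features listed above (edit history, source box, verification button and query button) could be modelled roughly as follows. This is a minimal sketch under stated assumptions: the class, method and field names are hypothetical, and the rule that a new edit clears earlier verifications is an assumption rather than documented platform behaviour.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class SubcategoryValue:
    """A single subcategory entry for one building (hypothetical model)."""
    name: str
    value: Optional[str] = None
    source_type: Optional[str] = None   # chosen from a moderated dropdown
    source_link: Optional[str] = None   # optional link to the source
    verified_by: Set[str] = field(default_factory=set)
    history: List[Tuple[str, Optional[str], Optional[str]]] = field(default_factory=list)
    queries: List[str] = field(default_factory=list)

    def edit(self, username: str, value: Optional[str],
             source_type: Optional[str] = None,
             source_link: Optional[str] = None) -> None:
        # Record (editor, old value, new value) so the change stays auditable.
        self.history.append((username, self.value, value))
        self.value = value
        self.source_type = source_type
        self.source_link = source_link
        # Assumption: a changed value needs fresh verification.
        self.verified_by.clear()

    def verify(self, username: str) -> None:
        # A set means each user counts at most once per value.
        self.verified_by.add(username)

    def raise_query(self, text: str) -> None:
        # For problems that cannot be resolved by editing alone.
        self.queries.append(text)


# Example usage with invented values.
year_built = SubcategoryValue("year built")
year_built.edit("user_a", "1890",
                source_type="expert assessment viewed at first hand")
year_built.verify("user_b")
year_built.verify("user_b")  # repeated verification by the same user is ignored
year_built.raise_query("Facade appears later than the recorded date")
```

Storing the source alongside the value, and the verifiers as a set, lets a reader judge accuracy from who supplied the data, where it came from and how many distinct users have confirmed it.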
Our landing page contains a clear statement that data are derived from multiple sources and that accuracy of the data must be determined by the user.
As with the Wikipedia model, the project is being planned as a low-cost model overseen by expert contributors. Our stewarding structure is currently being developed.
WHO ARE WE SHARING DATA WITH AND UNDER WHAT CONDITIONS?
Colouring London has been designed as a free knowledge exchange platform that collates, collects and generates open data on London's building stock, able to be used by everyone. We do not sell data and we will not share users' personal data (e.g. email addresses) with any other organisation.
The user agreement, which must be accepted on our sign-up page, is explicit about the way that contributed data can be used. Colouring London contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring London contributors.
Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.
The sign-up agreement emphasises that when you make a contribution to Colouring London, you are creating a permanent, public record of all data added, removed, or changed by you, as noted above. It also states that Colouring London is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Users are encouraged to record data sources wherever possible.
Our platform code is also open and we encourage its use by other cities and towns. Code is available on our GitHub site
https://github.com/colouring-london/colouring-london under the following licensing terms:
'Colouring London Copyright (C) 2018 Tom Russell and Colouring London contributors. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.'