WELCOME TO COLOURING LONDON

DATA ETHICS POLICY

INTRODUCTION

Trust in Colouring London and adherence to high data ethics standards are crucial to the platform's success and longevity. This includes trust that the stated aims and objectives are as described; that data are accurate (or, if not, that this is made clear); that the privacy of building occupants and platform users is respected; that partnerships are properly looked after; and that the project is responsibly managed. A breakdown of trust in any of these areas is considered a major risk to the sustainability of the project.

Colouring London is designed for everyone. It aims to create a safe, positive, constructive space for users of all genders, ages, cultural backgrounds and abilities to enjoy and benefit from, and a place where the sharing of knowledge is supported. The Open Data Institute's Data Ethics Canvas is used below to help us identify and address potential ethical concerns.

THE DATA ETHICS CANVAS

Data ethics is described by the Open Data Institute (ODI) as "a branch of ethics that evaluates data practices with the potential to adversely impact on people and society - in data collection, sharing and use." Ethical use of data builds trust and helps data to work for everyone. Colouring London is using the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of the project.
 
As part of the development process, existing and new features of the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to the core questions are given below.

WHERE ARE DATA FROM? ARE PERSONAL OR SENSITIVE DATA INVOLVED?

Colouring London plans to collect around fifty subcategories of data within the following twelve categories, to support research into the sustainability of London's building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability (including energy rating and estimated lifespan), design/construction team, planning/designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
 
Our data are accessible at no cost and will also be available on the GLA's London Datastore. Most of the data we are collating, collecting and/or generating relate to the physical characteristics of buildings, which can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, but access to it is restricted for the public and academia. Some datasets, such as building designation and energy rating, are already publicly available. Property websites also contain large amounts of information on citizens' homes, including interior images.
We do not collect personal data, and on our sign-up page we actively discourage users from contributing email addresses or real names to the site. Our sign-up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring London they are creating a permanent, public record of all data they add, remove or change; that the database will record the username and ID of the editor, along with the time and date of the change; and that all of this information will be made publicly available through the website and through bulk downloads of the edit history.
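As an illustration only, the kind of public edit record described above could be sketched as follows. This is a minimal sketch, not the platform's actual schema: the class EditRecord, the functions record_edit and public_export, and every field name are hypothetical. The record holds only a username, a user ID, a timestamp and the change itself; it has no email or real-name field.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class EditRecord:
    """One entry in a public edit history (hypothetical schema)."""
    building_id: int
    field: str
    old_value: Optional[str]  # None when the value is newly added
    new_value: Optional[str]  # None when the value is removed
    user_id: int
    username: str
    edited_at: datetime


def record_edit(history: List[EditRecord], building_id: int, field: str,
                old_value: Optional[str], new_value: Optional[str],
                user_id: int, username: str,
                now: Optional[datetime] = None) -> EditRecord:
    """Append an edit to the history; entries are never updated or deleted."""
    entry = EditRecord(building_id, field, old_value, new_value,
                       user_id, username,
                       now or datetime.now(timezone.utc))
    history.append(entry)
    return entry


def public_export(history: List[EditRecord]) -> List[dict]:
    """Return the full history as plain dictionaries, e.g. for a bulk download."""
    return [{**asdict(e), "edited_at": e.edited_at.isoformat()} for e in history]
```

Because the record has no fields for personal details, a bulk export of the history exposes only what the sign-up agreement says will be public.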
 
Data are gathered in three ways. First and foremost, by identifying and collating existing datasets held by central and regional government bodies and other organisations. Secondly, by harnessing knowledge held within the historic environment sector and community planning/heritage groups, and by those using and studying the building stock. Thirdly, through the use of computational approaches, which allow existing attribute data to be used to estimate building characteristics that can later be verified.
 
Our job is to bring together and visualise fragmented information, make it more accessible and increase data accuracy through verification. Some data may derive from observation of the building itself, some from historical texts and some from ready-to-go spatial statistics. 'Like me?' is different in that it asks for the user's personal view on whether the building contributes to the city. The platform is also being designed as a collaborative data maintenance project, as described in the ODI's handbook at https://collaborative-data.theodi.org/.

HOW ARE WE ADDRESSING ACCURACY, BIAS AND INCOMPLETENESS?

To help address issues of accuracy and bias, a number of features are being included. Each subcategory will have an accessible edit history, a source box, a verification button, and a query button so that problems that cannot be addressed within the editing system can be raised. Moderated dropdown options for sources range from 'expert assessment viewed at first hand' to an option to reference a specific scientific paper. Edit histories also allow users to assess the accuracy of data. In certain cases the phrasing of specific subcategory questions is also used to address uncertainty, with data collected, unusually, at building level.
Our main editing page contains a clear statement that data are derived from multiple sources and that accuracy of the data must be determined by the user.
As with the Wikipedia model, the project is being planned as a low-cost model overseen by expert contributors. Our stewarding structure is currently being developed.

WHO ARE WE SHARING DATA WITH AND UNDER WHAT CONDITIONS? 

Colouring London has been designed by University College London as a free knowledge exchange platform that collates, collects and generates open data on London's building stock, able to be used by everyone. We do not sell data and we will not share users' personal data (such as their email address) with other organisations.
 
The user agreement, which must be accepted on our sign-up page, is explicit about the way that contributed data can be used. Colouring London contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring London contributors.

Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.

The sign-up agreement emphasises that when you make a contribution to Colouring London, you are creating a permanent, public record of all data added, removed or changed by you, as noted above. It also states that UCL is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Users are encouraged to record data sources wherever possible.

 

Our platform code is also open and we encourage its use by other cities and towns. Code is available on our GitHub site, https://github.com/tomalrussell/colouring-london, under the following licensing terms:

'Colouring London Copyright (C) 2018 Tom Russell and Colouring London contributors. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.'

WHAT LEGISLATION & POLICY ARE SHAPING OUR DATA?

We closely monitor changes in UK policy with regard to the opening up of OS MasterMap data by the Treasury. Once this has happened, all our data will be available as open spatial statistics. The UK National Planning Policy Framework's requirement for greater transparency in planning and greater public engagement is also influential. Advances in the release of planning data by the GLA mean that much more structured planning portal information will also now be able to be streamed into our planning section.
 
In the sustainability context, our data are specifically designed to support the UK's net-zero greenhouse gas targets for 2050, the United Nations' 17 Sustainable Development Goals, and its New Urban Agenda. Our work also involves pressing government for the open release of currently restricted property tax datasets, which are now being opened up by many countries. These contain comprehensive building attribute data relating to many of our categories.

WHAT RIGHTS WILL THE SOURCE HAVE? 

As noted above, under the Open Data Commons Open Database License (ODbL), Colouring London contributors are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.

ARE WE SURE WE ARE NOT CONTRAVENING ETHICAL FRAMEWORKS?

Data privacy and data ethics are of the highest priority to the project. We work to the best of our ability to ensure we do not contravene existing ethical frameworks. We do this by trying to keep things simple. We actively discourage the contribution of personal data, avoid collecting data on the interiors of buildings, and incorporate controlled dropdown menus (with an internal moderation system for sources). Our working partnerships and discussion threads provide ways to capture comments regarding ethical issues, and we are actively seeking to learn from, and collaborate with, organisations advancing the data ethics agenda.

WHY ARE WE COLLECTING DATA? ARE WE REPLACING A SERVICE? ARE WE MAKING THINGS BETTER AND FOR WHOM?

We are collating and collecting open data on London's building stock to provide essential information for citizens, researchers, education providers and policy makers, to support sustainable development. We also want to assist those designing, constructing, caring for, managing and studying London's buildings to help solve urban problems and make the city more efficient.
 
Our aim is to provide the first port of call for open data on London's building stock. The release of open data is also designed to stimulate the production of innovative and efficient products within the academic, non-profit and commercial sectors that promote and support the UK's transition to a low carbon economy, the UN's Sustainable Development Goals and the UN New Urban Agenda.
 
We also believe that it is healthy for platforms collating data on the building stock to be curated by universities, and others whose stance is impartial and whose brief is to undertake research for the public good.

ARE WE CLEAR IN THE WAY THE DATA WILL BE USED?

We are already aware of areas of research, such as energy, where demand for accurate building-level attribute data is very high. We also know, from extensive consultation, that these data are important to the construction and property industry, housing suppliers, planning bodies and the education sector. We are therefore excited about the many ways in which the data might be used.
 
We are currently developing a data showcase facility to allow users to upload examples of how data from Colouring London can best be applied to solve urban problems and, in doing so, to inspire and inform.

WHO WILL BE POSITIVELY IMPACTED AND HOW? HOW CAN WE MAXIMISE AND MEASURE THIS?

As noted above, access to building attribute data for London at building level will provide free information for all those involved in the design, research, construction, management, maintenance and analysis of London's buildings, and in its sustainable development. Our project is also designed to encourage use by diverse audiences. This element is central to its design.
 
Our task at the moment is to begin to release data on all of our c. 50 subcategories. The second stage of the project will involve the introduction of features for monitoring the platform (using analytical software) and the Showcase section, which will illustrate how the data are being used.

WHO COULD BE NEGATIVELY AFFECTED BY THE PROJECT & HOW IS THIS BEING ADDRESSED?

The ODI's 10th item on its Data Ethics Canvas addresses the issue of negative project impact. Could the manner in which this data is collected, shared and used cause harm? Could it be used to target, profile or prejudice people, or to unfairly restrict access? Could people perceive it as harmful?
All spatial data projects that collect information able to be linked to specific addresses need to be very careful with regard to the type of data collected and how it is held and accessed. A number of checks have had to be put in place to ensure the safety and privacy of building occupants and platform users.
 
The main ways in which we are working to minimise negative impacts are by:
a) discouraging the submission of personal data (e.g. email addresses, real names),
b) not collecting data on the insides of homes,
c) not collecting free text and using only preset dropdowns, to prevent cyberbullying and security risks for occupants,
d) allowing each user only one vote on 'Like me?',
e) having no negative option for 'Like me?', again to prevent cyberbullying,
f) having a sign-up page that provides clear guidelines for responsible and ethical use of the site, and
g) only allowing the copy and paste tool to be used for a small number of buildings at once.
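Several of these safeguards are simple rules that can be enforced in code. The sketch below is illustrative only and is not taken from the Colouring London codebase: the names LikeRegister, validate_dropdown, check_copy_paste and MAX_COPY_PASTE_BUILDINGS are hypothetical, the copy and paste limit of 10 is an assumed value (the policy says only 'a small number'), and 'one vote' is assumed to mean one like per user per building.

```python
# Assumed value: the policy only says "a small number of buildings at once".
MAX_COPY_PASTE_BUILDINGS = 10


class LikeRegister:
    """Tracks 'Like me?' votes: positive only, one per user per building."""

    def __init__(self):
        self._likes = {}  # building_id -> set of user_ids who liked it

    def like(self, building_id, user_id):
        """Record a like; return False if this user has already liked the building."""
        users = self._likes.setdefault(building_id, set())
        if user_id in users:
            return False
        users.add(user_id)
        return True

    def count(self, building_id):
        """Number of distinct users who like the building. There is no dislike."""
        return len(self._likes.get(building_id, set()))


def validate_dropdown(value, allowed_options):
    """Accept only preset dropdown options, so no free text is ever stored."""
    if value not in allowed_options:
        raise ValueError("value is not one of the preset options")
    return value


def check_copy_paste(building_ids):
    """Reject copy and paste requests that target too many buildings at once."""
    unique_ids = set(building_ids)
    if len(unique_ids) > MAX_COPY_PASTE_BUILDINGS:
        raise ValueError("too many buildings selected for copy and paste")
    return sorted(unique_ids)
```

Keeping the register positive-only means a building can accumulate likes but can never be publicly marked down, which is the property items d) and e) are intended to guarantee.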
 
Owing to concerns raised during consultation with regard to privacy and ownership data, Colouring London collects ownership data only for buildings where the freehold is held by the state or third-sector owners, with private ownership simply shown as a default colour. UCL is also setting up an ethics advisory group.

HOW CAN PEOPLE APPEAL OR AFFECT CHANGES TO THE SERVICE?

Comments can be made on existing discussion threads. We are also looking at introducing site improvement forms.

HOW ARE WE BUILDING IN CONSIDERATIONS OF PEOPLE AFFECTED BY OUR PROJECT?

The ODI's 13th Ethics Canvas item addresses considerations of people affected by our project. Are we creating potential risks or issues? How are limitations being communicated to those the data is about, and those impacted by its use? And how are we doing this?
 
Colouring London has been designed from the outset in consultation with representatives from diverse sectors. These are listed on our 'Who's involved' page. Our aim is to work with our partners and their networks to allow possible risks, concerns and improvements to be raised at the earliest possible stage, as well as through our public discussion forum and feedback forms. These issues can then be addressed, and features added, adjusted and/or removed as appropriate.

HOW WILL ONGOING ISSUES RELATING TO DATA ETHICS BE MONITORED & DISCUSSED?

We will be discussing data ethics issues on an ongoing basis with our project partners, with our about-to-be-formed ethics committee and within our discussion threads. Relevant actions and updates to the canvas may be viewed on this page. Formal meetings of Colouring London's ethics committee will take place three times a year.

WHAT DATA ETHICS ACTIONS ARE OUR CURRENT PRIORITIES?

Our current data ethics priorities (October 2019) are setting up the Colouring London Data Ethics Committee and working on dropdowns, verification and alert features, and data moderation.
