WELCOME TO COLOURING LONDON

DATA ETHICS POLICY

INTRODUCTION

This page discusses Colouring London and the Colouring Cities Research Programme in the context of the data ethics agenda. It covers issues relating to data privacy, security, quality, accessibility, platform transparency, inclusivity, governance and sustainability.

Further information on privacy and security, and on our code of conduct, can be accessed on our Menu pages. Progress on open code development relating to privacy and security can also be viewed on our GitHub site at https://github.com/colouring-london/colouring-london/issues/687 and https://github.com/colouring-london/colouring-london/issues/688.


The Colouring Cities Research Programme (CCRP) has been designed to test a new type of free public information tool, to encourage knowledge and data sharing about buildings and cities, for the public good. Colouring London is the programme's live testing platform for open source code, which is released on our GitHub site. The platform is being designed as a safe, positive, constructive space for users of diverse ages, genders, cultural backgrounds, skills and abilities to enjoy and benefit from. Users need to be sure their contributions will be treated with respect and that GDPR principles of lawfulness, fairness and transparency, purpose limitation, data minimisation, accuracy, storage limitation, integrity and confidentiality (security), and accountability, will be met. In fact, we actively discourage users from giving us personal data wherever possible. The privacy of building occupiers is also an important issue for us and is prioritised in our data collection approach. It is carefully balanced against the increasingly urgent need to collect information on building stocks to aid emissions reduction and increase urban sustainability as a whole.

Our open code and open data licences mean that our data can be freely experimented with, subject to the attribution and share-alike terms described below. The CCRP has been set up at the Alan Turing Institute to support testing of the Colouring London prototype design with international research partners, and to promote ethical standards and principles for information management systems that deal with built environment data.
 

Ongoing effort is made to make data accessible to the widest possible audience, and to highlight uncertainty and sources of data wherever possible. Breakdown of trust in any of these areas is considered to pose a significant risk to the long-term sustainability of the project. One of the main questions asked by the CCRP is 'How do we balance the need to open up data on buildings, to increase sustainability, resilience and inclusivity in cities, with the need to protect the security and privacy of platform users, and of building users and occupiers as well?'

 

Key methods tested include:
a) making Colouring Cities platforms accessible to everyone in view-only mode, without sign-up being required;
b) requesting minimum personal data from platform editors but requiring them to adhere to a clear code of conduct;
c) avoiding collection of private/potentially sensitive data on buildings (e.g. on private space within homes) through ongoing research and consultation with stakeholders;
d) developing collaborative monitoring systems to pick up issues as quickly as possible;
e) using security software and firewalls where applicable to manage data and prevent malicious attacks;
f) working with national and international partners to test frameworks such as that set out in the Alan Turing Institute's DataSafeHaven model, designed to improve ethical standards in the management of built environment data as a whole;
g) constantly reassessing our security and privacy procedures and ethical framework.

 

Our programme's usefulness, success and longevity also rely on public trust. We try to be as transparent as possible regarding what our project is designed to do, what types of data it collects and why these are needed to support the public good, how the project is managed, and what security and privacy features/mechanisms are in place. We work with a time horizon of 100 years or more in mind. We believe that, though technologies will change, low-cost, accessible databases providing free, high quality, detailed information on national building stocks will always be required and desired. To prevent major breaches of privacy and security, and to ensure inclusivity, these databases must be built from the outset to rigorous ethical standards, which are constantly assessed.

 

 Below, information is first provided on existing principles we follow. The Open Data Institute's Data Ethics Canvas is then used to address specific questions.   

 

PRINCIPLES & FRAMEWORKS WE AIM TO PROMOTE

Below are frameworks and principles we promote. We also assess our platform against ethical standards set by them.
 

SETS OF ETHICAL PRINCIPLES THE COLOURING LONDON PROTOTYPE PLATFORM IS CHECKED AGAINST
 

  1. GENERAL DATA PROTECTION REGULATION (GDPR)
    Oversight: (UK) The Information Commissioner's Office (ICO)
    Link: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/principles/
    Colouring London is required to meet GDPR requirements with regard to personal data on individuals. GDPR principles are also applied to all types of data collected, as great care is needed when handling certain types of spatial data relating to people's homes, especially data relating to domestic building interior space/activities, and to ownership. (Domestic buildings make up the vast majority of buildings in national building stocks.)
    GDPR data principles:

  • Lawfulness

  • Fairness

  • Transparency

  • Purpose limitation

  • Data minimisation

  • Accuracy

  • Storage limitation

  • Integrity

  • Confidentiality (security)

  • Accountability
     

2. OPEN KNOWLEDGE FOUNDATION (OKF)  OPEN DEFINITION 2.1
https://opendefinition.org/od/2.1/en/
The OKF defines knowledge as 'open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness'.

 

3. THE OPEN DATA CHARTER
https://opendatacharter.net/principles/
- Open by default
- Timely and comprehensive
- Accessible and useable
- Comparable and Interoperable
- For improved governance and citizen engagement
- For inclusive development and innovation

 

4. THE OPEN DATA INSTITUTE'S DATA INFRASTRUCTURE PRINCIPLES
https://theodi.org/article/principles-for-strengthening-our-data-infrastructure/
- Design for Open
- Build with the web
- Respect privacy
- Benefit everyone
- Think big but start small
- Design to adapt
- Encourage open innovation

 

5. THE OPEN DATA INSTITUTE'S PERSONAL DATA QUESTIONS 
https://theodi.org/article/openness-principles-for-organisations-handling-personal-data/
In Colouring Cities, the following questions on handling personal data are also extended to data on people's homes:
- What are we collecting? 
- How are we using it? 
- How are we sharing it? 
- How are we securing it? 
- How are we making decisions about it? 
- How are we accountable? 
- How can users influence use?
- How can we make analysis/outputs accessible?

See also the ODI's Data Ethics Canvas below.

 

6. THE GEMINI PRINCIPLES 
https://www.cdbb.cam.ac.uk/DFTG/GeminiPrinciples
The CCRP promotes the Gemini Principles, developed by the Centre for Digital Built Britain at the University of Cambridge (2019) to provide a 'conscience' for the framework for information management systems on the built environment/infrastructure, and for national digital twins, and to ensure these remain focused on the public good.
- Public good
- Value creation
- Insight
- Security
- Openness
- Quality
- Federation
- Curation
- Evolution 

 

7. THE NEW URBAN AGENDA
https://www.un.org/sustainabledevelopment/blog/2016/10/newurbanagenda/  and https://habitat3.org/the-new-urban-agenda/
The CCRP promotes the UN New Urban Agenda, created to drive global commitment to the goal of sustainable, inclusive, healthy and resilient cities and stocks: 
- Provide basic services for all citizens (e.g. housing, water, sanitation, food, healthcare, education, culture, communication technologies)
- Ensure that all citizens have access to equal opportunities and face no discrimination
- Promote measures that support cleaner cities (air pollution, greenspaces, renewable energy/transport)
- Strengthen resilience in cities to reduce the risk and the impact of disasters (better urban planning, quality infrastructure and improving local responses)
- Take action to address climate change by reducing cities' greenhouse gas emissions
- Fully respect the rights of refugees, migrants and internally displaced persons regardless of their migration status
- Improve connectivity and support innovative and green initiatives (including supporting cross sector partnerships)
- Promote safe, accessible and green public spaces

 

8. THE UNIVERSAL DECLARATION OF HUMAN RIGHTS
https://www.un.org/en/about-us/universal-declaration-of-human-rights
The CCRP works to support the UDHR, and specifically the following (of 30 Articles):  
Article 1: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
Article 2: Everyone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status. Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.
Article 3: Everyone has the right to life, liberty and security of person.
Article 12: No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.
Article 19: Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers.
(Note: Such speech must also respect other UDHR Articles).
Article 21: Everyone has the right to take part in the government of his country, directly or through freely chosen representatives. Everyone has the right of equal access to public service in his country.
Article 25: Everyone has the right to a standard of living adequate for the health and well-being of himself and of his family, including food, clothing, housing and medical care and necessary social services, and the right to security in the event of unemployment, sickness, disability, widowhood, old age or other lack of livelihood in circumstances beyond his control.
Article 27: Everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits. Everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author.

 

 

THE DATA ETHICS CANVAS


Data ethics is described by the Open Data Institute (ODI) as "a branch of ethics that evaluates data practices with the potential to adversely impact on people and society - in data collection, sharing and use".

Ethical use of data brings about trust and helps allow data to work for everyone.
The Colouring Cities Research Programme uses the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of its prototype platform Colouring London. 
 
As part of the development process, existing and new features within the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to core questions are given below.

WHERE ARE DATA FROM? ARE PERSONAL OR SENSITIVE DATA INVOLVED?


Colouring London collects over fifty subcategories of data to support research into the sustainability of London's building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability (including energy rating and estimated lifespan), design/construction team, planning/ designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
 
Our data are available for download as open data from our platform. Most data we are collating, collecting and/or generating relate to physical characteristics of buildings that can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, though in many cases access to these is restricted for the public and academia. Some government datasets, such as those on building designation and protection and on energy rating, are already publicly available. Detailed information on London's buildings is also provided by the commercial sector, with images of the interiors of homes, and their price and sale history, now commonly available on property websites.

Colouring London does not collect personal data, other than optional email addresses needed to enable users to reset site passwords. We actively discourage users, on our 'sign up' page, from contributing even their real names. Though it is helpful to understand which sectors, disciplines and groups our users come from, to help us reach and remain relevant to as wide an audience as possible, we believe requests for information for this purpose should, if introduced, be optional only, with minimal information asked for. We will continue to explore how this issue can best be addressed by working and consulting across sectors, disciplines and community groups.
 
Our sign-up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring London they are creating a permanent, public record of all data they add, remove, or change; that the database will record the username and ID of the user/editor, along with the time and date of the change and that all of this information will be made public through the website and through bulk downloads of the edit history.
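As a purely illustrative sketch, and not the platform's actual database schema, the public edit record described above could be represented as follows. The class name, field names and example values are all assumptions made for illustration. The point shown is that an entry carries only the username, user ID, the change itself and its time and date, and no email address.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class EditRecord:
    """One public edit-history entry (all field names are hypothetical)."""
    user_id: int          # ID of the user/editor
    username: str         # public username; real names are discouraged
    building_id: int      # hypothetical identifier of the edited building
    attribute: str        # hypothetical name of the subcategory changed
    old_value: object     # value before the edit (None if newly added)
    new_value: object     # value after the edit (None if removed)
    timestamp: datetime   # time and date of the change


def to_public_dict(record: EditRecord) -> dict:
    """Return the entry in a form suitable for a public bulk download."""
    entry = asdict(record)
    # Store the timestamp as an ISO 8601 string so it is easy to export.
    entry["timestamp"] = record.timestamp.isoformat()
    return entry


# Hypothetical example entry.
example = EditRecord(
    user_id=42,
    username="brick_fan",
    building_id=1001,
    attribute="date_year",
    old_value=None,
    new_value=1905,
    timestamp=datetime(2021, 8, 1, 12, 0, tzinfo=timezone.utc),
)
public_entry = to_public_dict(example)
```

Keeping the email address out of the record type altogether, rather than filtering it at export time, is one way to apply the data minimisation principle by design.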
 
Data are gathered in four main ways. Firstly, by identifying and collating existing datasets held by central and regional government bodies and other organisations. Secondly, by harnessing knowledge held within the community through crowdsourcing at building level, whether this be from, for example, building professionals, local councils, local amenity societies, building users or schools. Thirdly, through large-scale computational data generation programmes, with local experts encouraged to verify the results. Fourthly, through live streaming of planning data, which will also be tested.
 
Our job is to bring together and visualise data on the building stock that are currently highly fragmented, restricted or unavailable, to make these more accessible, and to increase data accuracy through inclusion of sources and verification. Some data may derive from observation of the building itself, some need to be extracted from historical texts and some come in the form of ready-to-go datasets comprising city-wide spatial statistics. Users are informed on sign up that data cannot be accepted on the site where any restrictions to its open release may apply. Our 'Community' section differs slightly in that it also asks users how well buildings work and whether they contribute well to the city and/or local area.

The platform is also being designed as a collaborative data maintenance project as described in the ODI's handbook at https://collaborative-data.theodi.org/, with specific datasets encouraged to be added to, verified and updated by specialist sources (e.g. historians for age data, government energy specialists for energy data, and communities for workability data). Our stewarding structure is in the process of being developed.

HOW ARE WE ADDRESSING ACCURACY, BIAS AND INCOMPLETENESS?

Data on buildings are collected at building and property level (a property representing  either the whole, or part of, a building). Significant questions currently exist in the UK with regard to access to address/location data and substantial work is still required on how data can best be matched to footprint polygons.

To help address issues of accuracy and bias, a number of features are being included. Each subcategory has a source box and a verification button, and a query button is planned so that problems which cannot be addressed within the editing system can be raised. Moderated dropdown options, plus links to allow references to sources and routes to further information, are also included. Easy-to-access edit histories also allow users to assess the accuracy of data. Careful phrasing of subcategory questions is also required in certain cases to address uncertainty. Our plan is to update Ordnance Survey footprints every six months and to store demolished building data gathered through this process in our 'Dynamics' section.


As with the Wikipedia and OpenStreetMap models, Colouring Cities is designed as a low-cost model, overseen by expert contributors. Our landing page also contains a clear statement that data are derived from multiple sources and that accuracy of the data must, ultimately, be determined by the user.
 

WHO ARE WE SHARING DATA WITH AND UNDER WHAT CONDITIONS? 

Colouring London has been designed as a free knowledge exchange platform that collates, collects and generates open data on London's building stock, able to be used by everyone. We do not sell data and we will not share users' personal data (e.g. email addresses) with any other organisation.
 
The user agreement, which must be accepted on our sign up page, is explicit about the way that contributed data can be used. Colouring London contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring London contributors.

Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.

The sign up agreement emphasises that when you make a contribution to Colouring London, you are creating a permanent, public record of all data added, removed, or changed by you, as noted above. It is also explicitly stated that Colouring London is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Users are encouraged to record data sources wherever possible.

 

Our platform code is also open and we encourage its use by other cities and towns. The code is available on our GitHub site at https://github.com/colouring-london/colouring-london under the following licensing terms:

'Colouring London Copyright (C) 2018 Tom Russell and Colouring London contributors. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.'

WHAT LEGISLATION & POLICY ARE SHAPING OUR DATA?


We closely monitor changes to UK policy in relation to the opening up of mapping data at building/property level, particularly OS MasterMap (OSMM) footprints, Valuation Office Agency property tax databases (which contain comprehensive building attribute data relating to many of our categories) and OS historical map data. Our work also involves demonstrating to government why open release of these datasets is so important, given that footprints and property tax data are now freely available in many other countries. Once OSMM footprint geometry has been released, all users will be able to map all our data against footprints, as on our site, rather than as point data only.
 
Our data are also specifically designed to support the UK's net-zero greenhouse gas targets for 2050, the United Nations' 17 Sustainable Development Goals, and its New Urban Agenda. Changes to the UK National Planning Policy Framework, to energy legislation and to UK open spatial data policies are also particularly relevant to our research programme.
 

WHAT RIGHTS WILL THE SOURCE HAVE? 

Under the Open Data Commons Open Database License (ODbL), Colouring London contributors are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.

ARE WE SURE WE ARE NOT CONTRAVENING ETHICAL FRAMEWORKS?


Data privacy and data ethics are of the highest priority to the Colouring Cities Research Programme. We work to the best of our ability to ensure we do not contravene any existing ethical frameworks. Each data category is widely consulted on and rigorously checked for potential issues prior to release. Users can raise issues of contravention on our GitHub site, which is constantly monitored. We also actively discourage the contribution of personal data, avoid the collection of data on building interiors, incorporate controlled dropdown menus (with an internal moderation system for sources), and moderate all bulk uploads. We also state that all data uploaded must be from an open source or generated by the user themselves. Our ongoing work with internal data ethics working groups at the Alan Turing Institute, and with national and local partners, ensures diverse feedback routes, and we actively seek to learn from, and collaborate with, organisations advancing the data ethics agenda such as the ODI.

WHY ARE WE COLLECTING DATA? ARE WE REPLACING A SERVICE? ARE WE MAKING THINGS BETTER AND FOR WHOM?


We are collating and collecting open data on London's building stock to provide essential information for citizens, researchers, education providers and policy makers, to support the development of sustainable and resilient building stocks. We also want to assist those designing, constructing, caring for, managing and studying London's buildings to help solve urban problems, both by providing data and through interdisciplinary/collaborative work.
 
Our aim is to create a one-stop-shop for open data on London's stock. The release of these data is also designed to stimulate the production of innovative and efficient products  within the academic, non-profit and commercial sectors which promote and support the UK's transition to a low carbon economy, the UN's Sustainable Development Goals and the UN New Urban Agenda.
 
We also believe that it is healthy for Colouring Cities platforms collating data on the building stock to be curated by research institutions, whose stance is impartial and whose brief is to undertake research on the built environment for the public good. At the Alan Turing Institute we work within the Urban Analytics programme.

 

ARE WE CLEAR IN THE WAY THE DATA WILL BE USED?

We are already aware of areas of research, such as energy, where demand for accurate building level attribute data is very high. We also know, from extensive consultation, that these building attribute data are also important to the construction and property industry, housing suppliers, planning bodies and the education sector. We are therefore excited about the many ways in which the data might be used.
 
We are currently developing a curated data showcase facility to allow users to upload information, images and links showing how data from Colouring London are being applied to urban problems and, in doing so, to inspire and inform.
 

WHO WILL BE POSITIVELY IMPACTED AND HOW? HOW CAN WE MAXIMISE AND MEASURE THIS?


As noted above, building attribute data for London at building level are provided free for all those involved in the design, research, construction, management, maintenance and analysis of London's buildings, and in its sustainable development. Our project is also designed to encourage use and knowledge sharing by diverse audiences. This element is central to its design.
 
Our task at the moment is to begin to release data for all our c. 50 subcategories. The second stage of the project will concentrate on improving data quality, and on showing how data are being, and can be, used.

WHO COULD BE NEGATIVELY AFFECTED BY THE PROJECT? & HOW IS THIS BEING ADDRESSED?


The 10th item on the ODI's Data Ethics Canvas addresses the issue of negative project impact. Could the manner in which this data is collected, shared and used cause harm? Could it be used to target, profile or prejudice people, or to unfairly restrict access? Could people perceive it as harmful?

All spatial data projects that collect information able to be linked to specific addresses need to be very careful with regard to the type of data collected and how it is held and accessed. A number of checks have had to be put in place to ensure the safety and privacy of building occupants, and platform users. 
 
Examples of ways in which we are working to minimise negative impacts include:
a) discouraging the submission of personal data (e.g. email addresses, real names);
b) not collecting data on the insides of homes;
c) avoiding free text wherever possible and using preset dropdowns, to prevent cyberbullying and security risks for occupants;
d) allowing each user only one vote on 'Like me?';
e) having no negative option for 'Like me?', again to prevent cyberbullying;
f) having a sign up page that provides clear guidelines for responsible and ethical use of the site;
g) only allowing the copy and paste tool to be used on one building at a time, to deter malicious behaviour, and moderating all bulk uploads.
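Items d) and e) above can be illustrated with a minimal sketch. This is not the platform's implementation: the class and method names are hypothetical, and it assumes that 'one vote' means one like per user per building. The register stores positive votes only, so there is simply no way to record a negative one.

```python
class LikeRegister:
    """Hypothetical store of positive-only 'Like me?' votes."""

    def __init__(self):
        # Set of (user_id, building_id) pairs. A set cannot hold
        # duplicates, so each user counts at most once per building.
        self._likes = set()

    def like(self, user_id, building_id):
        """Record a like. Return False if this user already liked it."""
        key = (user_id, building_id)
        if key in self._likes:
            return False
        self._likes.add(key)
        return True

    def count(self, building_id):
        """Number of distinct users who like the given building."""
        return sum(1 for (_, b) in self._likes if b == building_id)


register = LikeRegister()
first_vote = register.like(user_id=1, building_id=10)    # accepted
repeat_vote = register.like(user_id=1, building_id=10)   # rejected as a repeat
other_vote = register.like(user_id=2, building_id=10)    # accepted
total = register.count(10)
```

Because there is deliberately no 'dislike' method, a building can only accumulate positive votes, which is the design choice the list above describes.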
 
Owing to concerns raised during consultation with regard to privacy and ownership data, Colouring London also only collects ownership data for buildings where the freehold is held by the state or third-sector owners.
 

HOW CAN PEOPLE APPEAL OR AFFECT CHANGES TO THE SERVICE?

Comments can be made on our existing discussion threads. We are also looking at introducing site improvement forms.

HOW ARE WE BUILDING IN CONSIDERATIONS OF PEOPLE AFFECTED BY OUR PROJECT?


The ODI's 13th Ethics Canvas item addresses considerations of people affected by our project. Are we creating potential risks or issues? How are limitations being communicated to those the data is about, and those impacted by its use? And how are we doing this?
 
Colouring London has been designed from the outset in consultation with representatives from diverse sectors. These are listed on our 'Who's involved' page. Our aim is to work with our partners and their networks to allow possible risks, concerns and improvements to be raised at the earliest possible stage, as well as through our public discussion forum and feedback forms. This will help us to anticipate problems, and allow these to be addressed and features to be added, adjusted and/or removed as appropriate.
 

HOW WILL ONGOING ISSUES RELATING TO DATA ETHICS BE MONITORED & DISCUSSED?


We discuss data ethics issues, on an ongoing basis, with colleagues from Turing's Data Ethics Group, with Turing's 'Facilitating responsible participation in data science' Special Interest Group, and with our project partners. Users are also able to raise issues for consideration on our discussion threads.

WHAT DATA ETHICS ACTIONS ARE OUR CURRENT PRIORITIES?


Our current data ethics priorities (last updated August 2021) are to improve our user feedback forms and alert features; to try to address security/privacy issues relating to free text boxes; and to identify areas of concern with our Colouring Cities Research Programme partners.