WELCOME TO COLOURING LONDON
DATA ETHICS POLICY
INTRODUCTION
This page discusses Colouring London and the Colouring Cities Research Programme in the context of the data ethics agenda. It covers issues relating to data privacy, security, quality, accessibility, platform transparency, inclusivity, governance and sustainability.
Further information on privacy and security, and on our code of conduct, can be accessed on our Menu pages. Progress on open code development relating to privacy and security can also be viewed on our GitHub site at https://github.com/colouring-london/colouring-london/issues/687 and https://github.com/colouring-london/colouring-london/issues/688.
The Colouring Cities Research Programme (CCRP) has been designed to test a new type of free public information tool, to encourage knowledge and data sharing about buildings and cities, for the public good. Colouring London is the programme's live testing platform for open source code, which is released on our GitHub site. The platform is being designed as a safe, positive, constructive space for users of diverse ages, genders, cultural backgrounds, skills and abilities to enjoy and benefit from. Users need to be sure that their contributions will be treated with respect and that the GDPR principles of lawfulness, fairness and transparency, purpose limitation, data minimisation, accuracy, storage limitation, integrity and confidentiality (security), and accountability will be met. We actively discourage users from giving us personal data wherever possible. The privacy of building occupiers is also an important issue for us and is prioritised in our data collection approach. It is carefully balanced against the increasingly urgent need to collect information on building stocks to aid emissions reduction and increase urban sustainability as a whole.
Our open code and open data licences mean that our data can be experimented with in any way. The CCRP has been set up at the Alan Turing Institute to support testing of the Colouring London prototype design with international research partners, and to promote ethical standards and principles for information management systems that deal with built environment data.
Ongoing effort is made to make data accessible to the widest possible audience, and to highlight uncertainty and sources of data wherever possible. Breakdown of trust in any of these areas is considered to pose a significant risk to the long-term sustainability of the project. One of the main questions asked by the CCRP is 'How do we balance the need to open up data on buildings, to increase sustainability, resilience and inclusivity in cities, with the need to protect the security and privacy of platform users, and of building users and occupiers as well?'
Key methods tested include: a) making Colouring City platforms accessible to everyone in view-only mode, without sign-up being required; b) requesting minimum personal data from platform editors but requiring them to adhere to a clear code of conduct; c) avoiding collection of private or potentially sensitive data on buildings (e.g. on private space within homes) through ongoing research and consultation with stakeholders; d) developing collaborative monitoring systems to pick up issues as quickly as possible; e) using security software and firewalls where applicable to manage data and prevent malicious attacks; f) working with national and international partners to test frameworks such as that set out in the Alan Turing Institute's DataSafeHaven model, designed to improve ethical standards in the management of built environment data as a whole; g) constantly reassessing our security and privacy procedures and ethical framework.
Our programme's usefulness, success and longevity also rely on public trust. We try to be as transparent as possible regarding what our project is designed to do, what types of data it collects and why these are needed to support the public good, how the project is managed, and what security and privacy features and mechanisms are in place. We work with a time horizon of more than 100 years in mind. We believe that, though technologies will change, low-cost, accessible databases providing free, high quality, detailed information on national stocks will always be required and desired. To prevent major breaches of privacy and security, and to ensure inclusivity, these databases must be built from the outset to rigorous ethical standards, which are constantly assessed.
Below, information is first provided on existing principles we follow. The Open Data Institute's Data Ethics Canvas is then used to address specific questions.
PRINCIPLES & FRAMEWORKS WE AIM TO PROMOTE
Below are frameworks and principles we promote. We also assess our platform against ethical standards set by them.
SETS OF ETHICAL PRINCIPLES THE COLOURING LONDON PROTOTYPE PLATFORM IS CHECKED AGAINST
1. GENERAL DATA PROTECTION REGULATION (GDPR)
Oversight: (UK) The Information Commissioner's Office (ICO)
Link: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/principles/
Colouring London is required to meet GDPR requirements with regard to personal data on individuals. GDPR principles are also applied to all types of data collected, as great care is needed when handling certain types of spatial data relating to people's homes, especially data relating to domestic interior space and activities, and to ownership. (Domestic buildings make up the vast majority of buildings in national building stocks.)
GDPR data principles:
- Lawfulness
- Fairness
- Transparency
- Purpose limitation
- Data minimisation
- Accuracy
- Storage limitation
- Integrity
- Confidentiality (security)
- Accountability
2. OPEN KNOWLEDGE FOUNDATION (OKF) OPEN DEFINITION 2.1
https://opendefinition.org/od/2.1/en/
The OKF defines knowledge as 'open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness'.
3. THE 'OPEN DATA CHARTER'
https://opendatacharter.net/principles/
- Open by default
- Timely and comprehensive
- Accessible and useable
- Comparable and Interoperable
- For improved governance and citizen engagement
- For inclusive development and innovation
4. THE OPEN DATA INSTITUTE'S DATA INFRASTRUCTURE PRINCIPLES
https://theodi.org/article/principles-for-strengthening-our-data-infrastructure/
- Design for Open
- Build with the web
- Respect privacy
- Benefit everyone
- Think big but start small
- Design to adapt
- Encourage open innovation
5. THE OPEN DATA INSTITUTE'S PERSONAL DATA QUESTIONS
https://theodi.org/article/openness-principles-for-organisations-handling-personal-data/
In Colouring Cities, the following questions on handling personal data are also extended to data on people's homes:
- What are we collecting?
- How are we using it?
- How are we sharing it?
- How are we securing it?
- How are we making decisions about it?
- How are we accountable?
- How can users influence use?
- How can we make analysis/outputs accessible?
See also the ODI's Data Ethics Canvas below.
6. THE GEMINI PRINCIPLES
https://www.cdbb.cam.ac.uk/DFTG/GeminiPrinciples
The CCRP promotes the Gemini Principles, developed by the Centre for Digital Built Britain at the University of Cambridge (2019) to provide a 'conscience' for the framework for information management systems on the built environment and infrastructure, and for national digital twins, and to ensure these remain focused on the public good.
- Public good
- Value creation
- Insight
- Security
- Openness
- Quality
- Federation
- Curation
- Evolution
7. THE NEW URBAN AGENDA
https://www.un.org/sustainabledevelopment/blog/2016/10/newurbanagenda/ and https://habitat3.org/the-new-urban-agenda/
The CCRP promotes the UN New Urban Agenda, created to drive global commitment to the goal of sustainable, inclusive, healthy and resilient cities and stocks:
- Provide basic services for all citizens (e.g. housing, water, sanitation, food, healthcare, education, culture, communication technologies)
- Ensure that all citizens have access to equal opportunities and face no discrimination
- Promote measures that support cleaner cities (air pollution, greenspaces, renewable energy/transport)
- Strengthen resilience in cities to reduce the risk and the impact of disasters (better urban planning, quality infrastructure and improving local responses)
- Take action to address climate change by reducing cities' greenhouse gas emissions
- Fully respect the rights of refugees, migrants and internally displaced persons regardless of their migration status
- Improve connectivity and support innovative and green initiatives (including supporting cross sector partnerships)
- Promote safe, accessible and green public spaces
8. THE UNIVERSAL DECLARATION OF HUMAN RIGHTS
https://www.un.org/en/about-us/universal-declaration-of-human-rights
The CCRP works to support the UDHR, and specifically the following (of 30 Articles):
Article 1: All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
Article 2: Everyone is entitled to all the rights and freedoms set forth in this Declaration, without distinction of any kind, such as race, colour, sex, language, religion, political or other opinion, national or social origin, property, birth or other status. Furthermore, no distinction shall be made on the basis of the political, jurisdictional or international status of the country or territory to which a person belongs, whether it be independent, trust, non-self-governing or under any other limitation of sovereignty.
Article 3: Everyone has the right to life, liberty and security of person.
Article 12: No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.
Article 19: Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers.
(Note: Such speech must also respect other UDHR Articles).
Article 21: Everyone has the right to take part in the government of his country, directly or through freely chosen representatives. Everyone has the right of equal access to public service in his country.
Article 25: Everyone has the right to a standard of living adequate for the health and well-being of himself and of his family, including food, clothing, housing and medical care and necessary social services, and the right to security in the event of unemployment, sickness, disability, widowhood, old age or other lack of livelihood in circumstances beyond his control.
Article 27: Everyone has the right freely to participate in the cultural life of the community, to enjoy the arts and to share in scientific advancement and its benefits. Everyone has the right to the protection of the moral and material interests resulting from any scientific, literary or artistic production of which he is the author.
THE DATA ETHICS CANVAS
Data ethics is described by the Open Data Institute (ODI) as:
"A branch of ethics that evaluates data practices with the potential to adversely impact on people and society - in data collection, sharing and use."
Ethical use of data brings about trust and helps allow data to work for everyone.
The Colouring Cities Research Programme uses the ODI Data Ethics Canvas to help identify and manage ethical issues throughout the lifecycle of its prototype platform Colouring London.
As part of the development process, existing and new features within the platform are checked against the questions posed by the Data Ethics Canvas. First-stage responses to core questions are given below.
WHERE ARE DATA FROM? ARE PERSONAL OR SENSITIVE DATA INVOLVED?
Colouring London collects over fifty subcategories of data to support research into the sustainability of London's building stock. These relate to building location, use, type, age and history, size, materials and construction, sustainability (including energy rating and estimated lifespan), design/construction team, planning/designation/demolition status, streetscape/green context, whether the building is community owned, and whether the user thinks it contributes to the city.
Our data are available for download as open data from our platform. Most data we are collating, collecting and/or generating relate to physical characteristics of buildings that can already be seen from the street or in satellite images. Much of this information is also already held within government or commercial databases, though in many cases access by the public and academia is restricted. Some government datasets, such as those on building designation and protection and on energy ratings, are already publicly available. Detailed information on London's buildings is also provided by the commercial sector, with images of the interiors of homes, their prices and their sale histories now commonly available on property websites.
Colouring London does not collect personal data, other than optional email addresses needed to enable users to reset site passwords. On our 'sign up' page, we actively discourage users from contributing even their real names. It is helpful to understand which sectors, disciplines and groups our users come from, as this helps us reach, and ensure relevance to, as wide an audience as possible. However, we believe requests for information for this purpose should, if introduced, be optional only, with minimal information asked for. We will continue to explore how this issue can best be addressed by working and consulting across sectors, disciplines and community groups.
Our sign-up agreement also tries to be as transparent as possible. It emphasises that when users make a contribution to Colouring London they are creating a permanent, public record of all data they add, remove, or change; that the database will record the username and ID of the user/editor, along with the time and date of the change and that all of this information will be made public through the website and through bulk downloads of the edit history.
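The public edit record described above could be sketched roughly as follows. This is an illustration only, not the platform's actual schema: the class name, field names and helper functions are all assumptions made for the example. The point it shows is that the public record carries the username, user ID, time and change, and nothing else (no email address).

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class EditRecord:
    """One public edit-history entry (hypothetical field names)."""
    user_id: int
    username: str
    building_id: int
    changed_at: str           # ISO 8601 timestamp in UTC
    changes: dict[str, Any]   # attribute name -> new value (None = removed)


def record_edit(user_id: int, username: str, building_id: int,
                changes: dict[str, Any],
                now: Optional[datetime] = None) -> EditRecord:
    """Build an edit record, stamping it with the current UTC time."""
    when = now or datetime.now(timezone.utc)
    return EditRecord(user_id, username, building_id,
                      when.isoformat(), dict(changes))


def public_view(record: EditRecord) -> dict[str, Any]:
    """Return the record as it might appear in a bulk download."""
    return asdict(record)
```

Because the record type has no field for an email address, personal contact details cannot leak into the published edit history by accident.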
Data are gathered in four main ways: firstly, by identifying and collating existing datasets held by central and regional government bodies and other organisations; secondly, by harnessing knowledge held within the community through crowdsourcing at building level, for example from building professionals, local councils, local amenity societies, building users or schools; thirdly, through large-scale computational data generation programmes, with local experts encouraged to verify the results; and fourthly, through live streaming of planning data, which will also be tested.
Our job is to bring together and visualise data on the building stock that are currently highly fragmented, restricted or unavailable, to make these more accessible, and to increase data accuracy through inclusion of sources and verification. Some data may derive from observation of the building itself, some need to be extracted from historical texts, and some come in the form of ready-to-go datasets comprising city-wide spatial statistics. Users are informed on sign-up that data cannot be accepted on the site where any restrictions to their open release may apply. Our 'Community' section differs slightly in that it also asks users how well buildings work and whether they contribute well to the city and/or local area.
The platform is also being designed as a collaborative data maintenance project as described in the ODI's handbook at https://collaborative-data.theodi.org/, with specific datasets encouraged to be added to, verified and updated by specialist sources (e.g. historians for age data, government energy specialists for energy data, and communities for workability data). Our stewarding structure is in the process of being developed.
HOW ARE WE ADDRESSING ACCURACY, BIAS AND INCOMPLETENESS?
Data on buildings are collected at building and property level (a property representing either the whole, or part of, a building). Significant questions currently exist in the UK with regard to access to address/location data and substantial work is still required on how data can best be matched to footprint polygons.
To help address issues of accuracy and bias, a number of features are being included. Each subcategory has a source box and a verification button, and a query button is planned so that problems that cannot be addressed within the editing system can be raised. Moderated dropdown options are also included, along with links that allow references to sources and routes to further information. Easy-to-access edit histories also allow users to assess the accuracy of data. Careful phrasing of particular subcategory questions is also required in certain cases to address uncertainty. In terms of updating, our plan is to update Ordnance Survey footprints every six months and to store demolished building data gathered through this process in our 'Dynamics' section.
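As a rough illustration of how a source box and a verification button might be represented in data, the sketch below pairs each attribute value with a free-text source and the set of users who have verified it. The class, field names and wording of the summary are assumptions made for this example and do not describe the platform's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeValue:
    """A building attribute with its source and verifications (hypothetical)."""
    value: object
    source: str = ""                                     # text from the 'source box'
    verified_by: set[str] = field(default_factory=set)   # usernames that verified

    def verify(self, username: str) -> None:
        """Record a verification; repeat clicks by one user count once."""
        self.verified_by.add(username)

    @property
    def verification_count(self) -> int:
        return len(self.verified_by)


def describe(attr: AttributeValue) -> str:
    """Summarise how well supported a value is, leaving judgement to the reader."""
    sourced = "sourced" if attr.source.strip() else "unsourced"
    return f"{sourced}, verified by {attr.verification_count} user(s)"
```

Storing verifiers as a set rather than a counter is one way to stop a single user inflating the apparent reliability of a value; the summary only reports the evidence and leaves the final assessment of accuracy to the person using the data.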
As with the Wikipedia and OpenStreetMap models, Colouring Cities is designed as a low-cost model, overseen by expert contributors. Our landing page also contains a clear statement that data are derived from multiple sources and that the accuracy of the data must, ultimately, be determined by the user.
WHO ARE WE SHARING DATA WITH AND UNDER WHAT CONDITIONS?
Colouring London has been designed as a free knowledge exchange platform that collates, collects and generates open data on London's building stock, which can be used by everyone. We do not sell data and we will not share users' personal data (e.g. email addresses) with any other organisation.
The user agreement, which must be accepted on our sign-up page, is explicit about the way contributed data can be used. Colouring London contributions are licensed under the Open Data Commons Open Database License (ODbL) by Colouring London contributors.
Users are free to copy, distribute, transmit and adapt our data, as long as they credit Colouring London and its contributors. If users alter or build on Colouring London data, they may distribute the result only under the same licence.
The sign up agreement emphasises that when you make a contribution to Colouring London, you are creating a permanent, public record of all data added, removed, or changed by you as noted above. It is also explicitly stated that Colouring London is unable to accept any data derived from copyright or restricted sources, other than as covered by fair use. Data sources are encouraged to be recorded wherever possible.
Our platform code is also open and we encourage its use by other cities and towns. The code is available on our GitHub site at https://github.com/colouring-london/colouring-london under the following licensing terms:
'Colouring London Copyright (C) 2018 Tom Russell and Colouring London contributors. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.'