There is far more data around today than there ever used to be, in every aspect of modern life. The increasing ubiquity of data, the vastness of datasets and the growth in new or alternative data sources mean that data can be used to predict which apps we want to use in the morning, what we want to eat at lunch, what we want to listen to online and what time we go to work.
Data affects almost every area of life in ways we don’t think much about, such as the satellite data used in the navigation maps in your car. But perhaps the most important thing about data is understanding its uses and its power (and the security issues that come with it).
To address this, there has been a lot of government legislation and procedure around the organisation of all this data, addressing both the law and the ethics of data collection and use.
We are going to look at the ethics of all this data and data handling, and tell you what you need to know to keep your information accurate and secure.
Although the UK data industry isn’t fully legislated, public protection has been a major priority. As such, some acts have been passed into law to protect data belonging to the public.
The Data Protection Act
The Data Protection Act 1998 is the main piece of legislation that governs the protection of personal data in the UK. It was enacted to bring British law into line with the 1995 EU Data Protection Directive, which required member states to protect people’s fundamental rights and freedoms, in particular their right to privacy with respect to the processing of personal data. The basic premise is that it gives individuals a way to control information about themselves. It also requires companies and individuals to keep the personal information they receive to themselves.
The Data Protection Act controls how your personal information is used by organisations, businesses or the government.
Everyone responsible for using data has to follow strict rules called ‘data protection principles’. They must make sure that the information is:
You can find out the details of the act here: http://www.legislation.gov.uk/ukpga/1998/29/contents
It’s a necessary read for anyone handling large amounts of sensitive data.
The law is indisputable, but the ethics of data is a much larger grey area. Who should have access to what information, and how should it be collected and used, not only legally but for the benefit of society?
The issues around these areas tend to fall into:
Big data presents complex issues. The use of data and evidence is increasingly recognised as a major opportunity, but it is one that also needs questioning. There are questions of who is able to access data and whether there are sufficient ethical frameworks in the governmental and public domain to guide access and use.
Ethical concerns include questions about personal privacy, individual consent, data ownership and transparency. There are also worries about algorithms making important social decisions where the moral compass may be slightly skewed by data.
In general, the public are unaware of how this data is being accessed, how much of it is being accessed, and what companies are doing with their data. This inherently makes them distrustful of data collection, research and progress in the field of data science and how we can use this data in a beneficial way.
Public views on companies using their data also change depending on several factors, such as how well informed people have been and how the media has covered the issue.
The relationship between ethics and public perception is not straightforward. Without suitable ethical frameworks in place for businesses, it may seem to the public that there is no structure for accountability other than public outcry.
So how do you do it? How can you collect and use all of this data legally and ethically? What are the principles?
Do our existing ethical, regulatory and legal frameworks need to change, or can they already accommodate all this data?
Do companies need to change their SOPs (standard operating procedures) in light of all this data?
How can we use the data for public good and with public support?
Some general rules are helpful:
Be transparent across everything you do, and choose a form to be transparent in, whether that is blogs, white papers or something else.
There are also some general principles and guidelines you can follow when collecting data from users or customers.
Start with a clear user need and benefit: this will help you justify the level of data you need and the method you use to obtain it.
Use the minimum level of data necessary to fulfil the benefit: there are many techniques for doing so, such as de-identification, aggregation or querying against data.
Be alert to public perceptions: put simply, what would a normal person on the street think about what you are doing with the data?
Be as open and accountable as possible: transparency is an antiseptic for unethical behaviour. Aim to be as open as possible (with explanations in plain English).
Keep data safe and secure: this is not restricted to data science projects. Studies have consistently shown that the public are most concerned about losing control of their data.
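As a rough illustration of the "minimum level of data" principle, the sketch below keeps only the fields that a hypothetical analysis needs and then aggregates them into counts, so that no individual row has to be passed on. The field names (name, email, city, plan), the sample records and the helper functions are all invented for illustration and are not taken from any particular system.

```python
from collections import Counter

# Hypothetical customer records; every field name and value here is made up.
records = [
    {"name": "A. Smith", "email": "a@example.com", "city": "Leeds", "plan": "basic"},
    {"name": "B. Jones", "email": "b@example.com", "city": "Leeds", "plan": "pro"},
    {"name": "C. Patel", "email": "c@example.com", "city": "York", "plan": "basic"},
]


def minimise(rows, needed_fields):
    """Keep only the fields the stated purpose actually requires."""
    return [{field: row[field] for field in needed_fields} for row in rows]


def aggregate(rows, field):
    """Replace individual rows with a count per value of one field."""
    return Counter(row[field] for row in rows)


# Names and email addresses are dropped before any analysis happens.
minimal = minimise(records, ["city", "plan"])

# Only the totals per city are kept for reporting.
customers_per_city = aggregate(minimal, "city")
print(customers_per_city)
```

Whether counts like these are coarse enough depends on the data: a group containing only one or two people can still point to an individual.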
One of the most important concepts in data ethics is informed consent.
The provisions of both European and UK law, such as the Data Protection Act, and the guidelines of many professional research organisations recommend that the following principles be followed to ensure that consent is informed:
Failure to properly and fully address issues of informed consent may restrict the opportunities for initial use of data, publishing your results and sharing data.
Before data can be shared, you may need to make it anonymous so that individuals, organisations or businesses cannot be identified. Doing so may be needed for ethical reasons to protect people’s identities, for legal reasons, or for commercial reasons.
Personal data from research should never be disclosed, unless a participant has given specific consent to do so, ideally in writing.
Anonymising research data can be time-consuming and therefore costly, but early planning helps reduce these costs.
You should have NO direct identifiers. Remove identifiers such as names, addresses, postcode information, telephone numbers or pictures.
Indirect identifiers also have to be removed. These are details which, when linked with other publicly available information sources, could identify someone: for example, information on workplace or occupation, or exceptional values of characteristics like salary or age.
Direct identifiers are often collected as part of the research administration process but are usually not essential research information and can therefore easily be removed from the data.
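The removal of direct and indirect identifiers described above could be sketched roughly as follows. The two sets of field names are assumptions for illustration only; each project would need to decide which of its own fields count as direct or indirect identifiers.

```python
# Assumed field names, for illustration only; adapt them to your own dataset.
DIRECT_IDENTIFIERS = {"name", "address", "postcode", "telephone", "photo"}
INDIRECT_IDENTIFIERS = {"workplace", "occupation", "salary", "age"}


def anonymise(record):
    """Return a copy of the record without direct or indirect identifiers."""
    blocked = DIRECT_IDENTIFIERS | INDIRECT_IDENTIFIERS
    return {key: value for key, value in record.items() if key not in blocked}


# A made-up research record.
participant = {
    "name": "Jane Doe",
    "postcode": "AB1 2CD",
    "telephone": "01234 567890",
    "occupation": "Surgeon",
    "salary": 250000,
    "survey_answer": "Agree",
}

print(anonymise(participant))  # {'survey_answer': 'Agree'}
```

The function builds a new dictionary and leaves the original record untouched, so the identifying version can be stored separately under tighter access controls if the research administration still needs it.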
Be careful also about data under an End User Licence.
For example, data disseminated by the UK Data Service are not in the public domain. Their use is restricted to specific purposes. Users sign an End User Licence, which has contractual force in law, in which they agree to certain conditions, such as not to disseminate any identifying or confidential information on individuals, households or organisations, and not to use the data to attempt to obtain information relating specifically to an identifiable individual.
Thus users can use data for research purposes, but cannot publish or use them in a way that would potentially reveal people’s or organisations’ identities.
Veber are experts in cloud hosting. If you have any questions about your data or the legal issues around cloud hosting and data, get in contact with one of our friendly team.