The identity / data divide

Proving who someone is online and letting them access their personal data – such as their tax, welfare, pension or medical records – often get lumped together as a single problem. Prove who someone is and, voilà, access to their data happens automatically.

If only reality were that simple. Reliably matching someone to their data often turns out to be a bigger problem than identifying who they are. As ever, the devil is in the detail.

Establishing trusted identity

The need to establish proof of someone’s identity before letting them access sensitive personal data has long been recognised:

Progress towards higher level services for government electronic service delivery will crucially depend on the development of appropriate electronic authentication and security processes for use by businesses and citizens.

The above statement isn’t new: it’s 17 years old, from the UK Online Annual Report of 2000. It went on to note that:

To ensure that this can take place the Government will need to:

  • work with a range of trusted service providers, to ensure interoperability with government processes; and
  • identify where the marketplace is adopting suitable technologies for secure transactions and access, and ensure that the Government makes full use of these to meet electronic service delivery targets.

The UK government experimented with third-party identity providers during its 1997 trial of smartcard technology from NatWest bank. Two years later, the 1999 report e-commerce@its.best.uk set out proposals to use trusted service providers to help identify and authenticate citizens and businesses online.

In 2000 the e-government Authentication Framework expanded on this idea. It encouraged an approach to online proof of identity that could work across both public and private sectors, with suppliers needing to conform to t-scheme accredited standards. That framework, and several related documents [1], are the predecessors of more recent guidance setting out the rules by which an online identity scheme should work, such as the Good Practice Guides.

However, while a trusted framework for proof of identity would help improve some aspects of our online lives, it won’t provide a magic bullet. One trumpeting pink elephant in the room remains the significant problem of matching a proven identity to data held by different organisations – often referred to either as “identity matching” or “data matching”.

Even a so-called “gold standard” of identity, involving both the biometric and biographic data – and the central register(s) – proposed by ID card enthusiasts, wouldn’t magically solve this difficult issue of matching an identity to existing personal data records maintained across multiple systems and organisations.

Identification and verification don’t solve the matching problem

The reason for this matching problem is that there’s no such thing as a single universal “identity” for most people. Even where a trusted third-party identity provider such as a bank is prepared to vouch that someone online is, say, “Joan Smith”, it doesn’t solve the problem of providing “Joan Smith” with automatic access to the right services and personal data.

After all, no service provider wants to risk giving an online user access to another user’s personal data records – particularly in sensitive areas such as our medical data. So they also need to establish proof of linkage between a claimed online identity and the data that person is trying to access.

This problem of matching is difficult: there is often a very poor data overlap between different systems in an organisation, and between organisations. An identity provider such as Experian can’t confirm that it’s the same “Joan Smith” as the particular “Joan Smith” or “J Smith” or “Smith, J” or even “J Brown” or other permutation associated with a specific set of personal data in another organisation, such as a government department, local authority or pension provider. This problem becomes even more thorny when trying to provide integrated services – such as bringing together in one place the various employer, personal and state pension schemes of a “Joan Smith”.
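
To make the ambiguity concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the Record type, the record identifiers, the name_key rule (surname plus first initial) and the extra “John Smith” entry are hypothetical, and it does not describe how any real organisation matches names. It simply shows that a name-only comparison can pull in a record belonging to someone else while missing a record held under a different surname.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str     # hypothetical internal identifier
    name_on_file: str  # name exactly as the holding organisation stored it


def name_key(name: str) -> tuple[str, str]:
    """Reduce a name to (surname, first initial), ignoring case and full stops."""
    cleaned = name.replace(".", " ").strip()
    if "," in cleaned:
        # "Smith, J" style: surname comes first
        surname, _, rest = cleaned.partition(",")
        forenames = rest.split()
    else:
        parts = cleaned.split()
        if not parts:
            return "", ""
        surname, forenames = parts[-1], parts[:-1]
    initial = forenames[0][0].lower() if forenames else ""
    return surname.strip().lower(), initial


def candidates(verified_name: str, records: list[Record]) -> list[Record]:
    """Return every record whose stored name is compatible with the verified name."""
    key = name_key(verified_name)
    return [r for r in records if name_key(r.name_on_file) == key]


records = [
    Record("A1", "Joan Smith"),
    Record("A2", "J Smith"),
    Record("A3", "Smith, J"),
    Record("A4", "J Brown"),     # assumed: the same person under another surname
    Record("A5", "John Smith"),  # assumed: a different person entirely
]

matches = candidates("Joan Smith", records)
print([r.record_id for r in matches])
```

Run as written, the candidate list includes the “John Smith” record and leaves out the “J Brown” one, so the name alone can’t decide which records belong to the verified user.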

Knowing that someone online is a “Joan Smith” certainly narrows some of the options. But the challenge is that the specific record or personal data to which an organisation is trying to link someone – such as a pension, tax account, or medical record – often only knows who the “user” is via a specific identifier, such as a national insurance number, unique tax reference number, NHS number or client account number.

Those unique identifiers are typically something a commercial organisation such as Experian will not know. There is no easy or consistent way of binding an online identity that a third party has asserted and risk-assessed to a specific record or set of personal data held by other organisations. It will often take time, and even manual processes, to build trusted linkages between an identity and the data and records that legitimately relate to that person.
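
The linkage step can be sketched in the same hedged way. The Python below assumes a hypothetical service that keys its records on its own identifier, and a hypothetical identity provider that vouches only for a name and date of birth. The Assertion and ServiceRecord types, the link function, its three outcomes and the “ID-0001” style identifiers are all invented for illustration. The user has to supply the identifier themselves, and the service then checks whether the asserted attributes agree with what it holds.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assertion:
    """What a hypothetical identity provider vouches for."""
    full_name: str
    date_of_birth: date


@dataclass(frozen=True)
class ServiceRecord:
    """What a hypothetical service holds, keyed by its own identifier."""
    identifier: str
    surname: str
    date_of_birth: date


def link(assertion: Assertion, supplied_identifier: str,
         records: dict[str, ServiceRecord]) -> str:
    """Decide whether a verified identity can be bound to a service record."""
    record = records.get(supplied_identifier)
    if record is None:
        return "no-match"  # the service has nothing under that identifier
    name_parts = assertion.full_name.split()
    surname_ok = bool(name_parts) and name_parts[-1].lower() == record.surname.lower()
    dob_ok = assertion.date_of_birth == record.date_of_birth
    if surname_ok and dob_ok:
        return "linked"
    if dob_ok:
        # Plausibly the same person under another surname: a human has to check
        return "manual-review"
    return "no-match"


records = {
    "ID-0001": ServiceRecord("ID-0001", "Smith", date(1960, 5, 17)),
    "ID-0002": ServiceRecord("ID-0002", "Brown", date(1960, 5, 17)),
}
joan = Assertion("Joan Smith", date(1960, 5, 17))

print(link(joan, "ID-0001", records))
print(link(joan, "ID-0002", records))
print(link(joan, "ID-9999", records))
```

The middle case is the one that matters here: the identity is proven and the identifier exists, yet the record is held under a different surname, so the sketch falls back to manual review rather than granting or refusing access automatically.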

This matching problem is doubtless behind some of the significant drop-out among early users of the latest attempt to use third-party identity providers, which is seeing just 40% success across all services (as of the time of blogging). Even if users successfully prove who they are to an identity provider, many still fail to prove to a service organisation that they are the specific user who should have access to a particular set of personal information.

This well-recognised matching problem is also likely to impact the assumptions underlying the “data sharing” agenda of Part 5 of the Digital Economy Act. It’s unclear whether the complex issues to be resolved in successfully, and reliably, matching “proven identity” with specific personal data sets spread across multiple organisations and systems have been properly understood, analysed, modelled and costed as part of the business case of these identity and data initiatives. This is why issues of identity and personal data need to be considered together.

Identity and data policy need to be designed together

The post-election period will provide an ideal opportunity for policy and technology options to be revisited across this complex identity and data landscape. There’s a clear case to be made for a better integrated strategy – and a move towards a genuine collaboration on tackling these common problems.

At the moment there appears to be a widespread expectation that getting identity right will also automatically improve access to personal data – when the experiences of the past 20 or so years tell us otherwise.


[1] see for example:

You can find more in the ‘Digital Government and e-Government Archives’.
