Monday, July 16, 2007
We all know that there is little agreement on the definition of identity, digital identity, identity (meta) system(s), user centricity, etc. There are probably as many definitions of these terms as there are actors in this market. While many of these definitions are somewhat similar, there is still a significant semantic gap between how the various "clans" use them and which terms they consider the "correct" ones: just think of the debates around relying party vs. service provider vs. consumer, or user agent vs. browser.

Traditional Digital Identity

I would like to go back to one of the more common definitions of digital identity. For some time now, I have been operating on the notion that a digital identity is - essentially - a collection of attributes. For example, your digital identity probably has attributes such as name(s), addresses, phone numbers, email addresses, etc. The collection of these attributes - accessible in a machine-processable form - constitutes a lot of knowledge about you.

On Identifiers and System Specific Attributes

While not central to the theme of this article, it should still be noted that the user name (or – more generally – the identifier) is yet another attribute in itself that might change. To limit the number of attributes, an identity system might also decide to use an existing attribute that can be taken to be sufficiently unique (e.g. an email address) for the user name/identifier.

In addition to the identifier there may be more attributes that arise through the use of a particular identity system. These can be system internal attributes guaranteeing uniqueness (e.g. GUIDs) or pseudonymous identifiers used with individual relying parties.

All these additional identifiers might be random. Yet, through their usage in the identity system, they are tightly coupled to a particular digital identity, and they should be treated with the same importance and privacy awareness as any other personally identifiable attribute.
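To make the two kinds of system-specific identifiers more concrete, here is a minimal sketch. It assumes that the identity system holds a secret key and derives one pseudonym per relying party with an HMAC; the function names, the secret, and the example.* URLs are hypothetical and only one of many possible constructions.

```python
import hashlib
import hmac
import uuid


def new_internal_id() -> str:
    """System-internal identifier: a random GUID with no meaning of its own."""
    return str(uuid.uuid4())


def pairwise_pseudonym(secret: bytes, internal_id: str, relying_party: str) -> str:
    """Derive a stable pseudonym that is different for every relying party."""
    message = f"{internal_id}|{relying_party}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Hypothetical key that only the identity system knows.
secret = b"identity-system-secret"
internal_id = new_internal_id()

shop = pairwise_pseudonym(secret, internal_id, "https://shop.example")
forum = pairwise_pseudonym(secret, internal_id, "https://forum.example")
```

The two relying parties see unrelated values, yet both values are bound to the same digital identity inside the identity system, which is why they deserve the same care as any other personal attribute.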

Cryptography

In many cases, this collection is accompanied by some cryptographic keying material, often in the form of a public/private key pair. As the 'owner' of this digital identity, you typically have access to the private key and you can use it in transactions to prove that it is you (i.e. the 'owner' of the digital identity and its keying material) who participated in this transaction.

Derived Statements

Depending on the context of this digital identity (some people might want to call this context an identity system, federation, or identity meta-system), you [1] can create statements about your collection of attributes that do not necessarily contain all the information about your digital identity, but only a subset: for example, you might be able to create a statement about your email address and name and nothing else. Or it might be handy to create a statement about the fact that you are over 21, without disclosing your actual age or birth date.
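The following sketch illustrates both kinds of derived statements under simple assumptions: the identity is a plain dictionary, and the attribute names, the sample values, and the two helper functions are made up for this illustration. A real identity system would also sign such statements.

```python
from datetime import date

# Hypothetical attribute collection of one digital identity.
identity = {
    "name": "Alice Example",
    "email": "alice@example.org",
    "birth_date": date(1980, 5, 17),
    "phone": "+1-555-0100",
}


def subset_statement(identity, requested):
    """Return only the requested attributes and nothing else."""
    return {key: identity[key] for key in requested if key in identity}


def is_over(identity, years, today):
    """Answer an age question without revealing age or birth date."""
    born = identity["birth_date"]
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    age = today.year - born.year - (0 if had_birthday else 1)
    return age >= years


today = date(2007, 7, 16)
contact = subset_statement(identity, ["name", "email"])
age_claim = {"over_21": is_over(identity, 21, today)}
```

The relying party receiving `age_claim` learns a single boolean; the birth date never leaves the component that created the statement.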

Issues

Overall, this concept of a digital identity was - and still is - quite useful in many cases. It has a lot of built-in flexibility and can be applied to a very large number of problems.

The problem with this view surfaces as soon as we become concerned about the privacy of the different actors in this definition. It is quite clear [2] that within the world of this definition privacy breaches are quite easy: as soon as parts of a digital identity become known, these parts (or attributes) can be collected in databases and sold to those who are interested. This fact has already resulted in the massive disruption of email through spammers. Going forward, it is all too easy to imagine a world where private data collectors or nosy governments collect more and more attributes and information about a person's digital identity [3].

Identity By Relation

I am starting to think about identity (and in particular digital identity) in a more dynamic way:

A digital identity is a collection of relations to (i) itself, (ii) other digital identities, (iii) external entities. These relations can, but do not have to be decorated with one or more attributes.

One of the benefits of this definition is that it becomes intuitively clear that a single digital identity is not necessarily stored in a single place, but much more commonly in a number of different places. This decentralization is a crucial building block for creating a world with strong privacy by segregating as much data as possible by design. At the end of the day, it will be (almost - see below) exclusively the 'owner' of a particular digital identity who is capable of correlating across different digital identity storage locations.
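As a rough sketch of this relation-based view, assume that every storage location knows the identity only under its own local identifier, and that the owner alone holds the list of these identifiers. The class names, store names, and identifiers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    local_id: str            # identifier under which this store knows the identity
    kind: str                # e.g. "self", "knows", "member-of"
    target: str              # itself, another digital identity, or an external entity
    attributes: tuple = ()   # optional decorations as (key, value) pairs


class Store:
    """One of several independent storage locations."""

    def __init__(self, name):
        self.name = name
        self.relations = []

    def add(self, relation):
        self.relations.append(relation)

    def lookup(self, local_id):
        return [r for r in self.relations if r.local_id == local_id]


mail = Store("mail-provider")
social = Store("social-network")
mail.add(Relation("id-a1", "self", "id-a1", (("email", "alice@example.org"),)))
social.add(Relation("id-b7", "knows", "id-c3"))
social.add(Relation("id-b7", "member-of", "https://club.example"))

# Only the owner knows which local identifier belongs to which store.
owner_handles = {"mail-provider": "id-a1", "social-network": "id-b7"}


def correlate(stores, handles):
    """Collect all relations of one identity across the given stores."""
    found = []
    for store in stores:
        local_id = handles.get(store.name)
        if local_id is not None:
            found.extend(store.lookup(local_id))
    return found


full_view = correlate([mail, social], owner_handles)
```

Neither store can produce `full_view` on its own, because it lacks the other store's local identifier.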

With such a definition in mind, you can gather a lot of data about someone by using their identity web services, but a lot of it may be very ephemeral (e.g., their current geolocation or presence status). As such, it is actually closer to their real 'in-the-world' identity.

Correlating Through Auditing

One might argue that this separation of identity data will in turn weaken the capability to effectively correlate information about a given digital identity for legitimate purposes, in particular when it comes to requirements such as "proof of source" or "non-repudiation". These concerns can be overcome by auditing: while different storage locations are typically not capable of correlating, a concerted action (e.g. based on a court warrant or subpoena) can evaluate audit trails and construct a comprehensive image of a digital identity.
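One way to picture such a concerted action is a join over audit trails. The sketch below assumes, purely for illustration, that the identity provider logs which pseudonym it issued to which relying party and that each relying party logs events under that pseudonym; the log layout and all names are hypothetical.

```python
# Audit trail of the identity provider: who received which pseudonym where.
idp_audit = [
    {"internal_id": "user-42", "relying_party": "shop", "pseudonym": "p-9f3"},
    {"internal_id": "user-42", "relying_party": "forum", "pseudonym": "p-1c8"},
    {"internal_id": "user-77", "relying_party": "shop", "pseudonym": "p-5aa"},
]

# Audit trails of the relying parties: each one knows only its own pseudonyms.
party_audit = {
    "shop": [
        {"pseudonym": "p-9f3", "event": "order placed"},
        {"pseudonym": "p-5aa", "event": "order placed"},
    ],
    "forum": [
        {"pseudonym": "p-1c8", "event": "post published"},
    ],
}


def reconstruct(internal_id, idp_audit, party_audit):
    """Join all audit trails for one identity (e.g. under a court warrant)."""
    events = []
    for entry in idp_audit:
        if entry["internal_id"] != internal_id:
            continue
        for record in party_audit.get(entry["relying_party"], []):
            if record["pseudonym"] == entry["pseudonym"]:
                events.append((entry["relying_party"], record["event"]))
    return events


trail = reconstruct("user-42", idp_audit, party_audit)
```

No single log is sufficient for the join; it only works when all parties hand over their trails.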


[1] More precisely: A component of the identity system can create such statements about the attributes of your digital identity on your behalf. This could be your identity provider, some active user agent, or another service separate from the identity provider.

[2] Actually from experience: probably all participants in electronic commerce or even simple electronic communication have had some of their digital identity disclosed to parties that should not have it, e.g. spammers or worse. Frequently, this happens through the sale of this information to marketeers.

[3] This scenario applies to loosely coupled, internet-scale identity systems. In more tightly coupled systems (e.g. in internal business applications or cross-enterprise collaborations) there are usually tight governance models that regulate how data is being handled through contracts and laws.


Monday, July 16, 2007 12:03:26 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Sunday, July 15, 2007

Germany recently changed its copyright and intellectual property laws, with a devastating effect on science and research: going forward, libraries will have the right to send out digital copies of a scientific article only under very limited circumstances. There are many other new and significant changes - most of them to the benefit of the "Content Community" (aka content mafia).

Maybe you are directly impacted, or maybe only tangentially. But ultimately, this kind of advantage for the content creator will continue nibbling away at our rights to private copies, fair use, and - eventually - free speech. And since we do live in a fairly globalized world (at least as far as lobbying by the content mafia goes), this will affect all of us. Therefore, I ask you to consider signing the "Göttingen Declaration", which asks for a reform of the latest changes in one of the biggest economies in the world.



Sunday, July 15, 2007 5:44:50 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Sunday, July 08, 2007
Here is a short little article by the German news magazine DER SPIEGEL on green datacenters. Interestingly enough, one of the biggest German hosting companies (1&1) has decided to go with the SunFire systems with the Niagara processor (8 core SPARC). Economy and ecology go hand in hand into the mainstream...

Saturday, July 07, 2007 11:33:31 PM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 
Monday, June 11, 2007

The MPAA has finally proved to the world what they really are: a criminal cartel that does not stop short of illegal means to advance their interest. CNET reports that TorrentSpy has filed a complaint against the MPAA, accusing them of hiring a professional data thief and anarchist (a.k.a. hacker) to steal private communication and trade secrets from TorrentSpy.

Protecting intellectual property and respecting copyrights? Yeah, sure...


Monday, June 11, 2007 9:33:42 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Wednesday, June 06, 2007
Today, our OpenID provider finally went into production. As some may have noticed, http://openid.sun.com/ has been live for some time now, and the team has been playing around with it. Last night, we (or more precisely: Hubert) flipped the switch, and we are now officially live.


Wednesday, June 06, 2007 10:04:44 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Friday, June 01, 2007
No, this post is entirely unrelated to LAMP or even technology. This is only about a bird nest in the lamp over our main entry door at home. There are two chicks in that nest that really make a lot of noise ...

And here is a closeup:

Does anyone have an idea what birds these are?


Friday, June 01, 2007 3:54:21 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Friday, May 25, 2007

This is quite astonishing: I am sitting in a public elementary school in Massachusetts, happily booting my laptop to finish reading some PDF document. After logging in I suddenly notice that my wireless adapter picks up a network: 'linksys'. Amazed that some neighboring home reaches into the school building with its WiFi access point, I quickly check the nameserver to see which ISP that access point is connected to: (name of town).mec.edu. What??? I am in the school network? No WPA/WEP, firewalls, proxy or anything.

Given the fact that the calendar shows the year 2007, I am now really astonished and shocked that the IT environment of an entire school system is exposed to the world through an unprotected WiFi AP.

The security, privacy, and potential ID theft implications are huge: I assume (though I cannot speak for certain, since I did not even try to touch any of the systems) that some of the systems in this infrastructure contain personally identifiable information about the school staff, teachers and even students. Even a well patched and maintained system that is monitored by advanced intrusion detection software cannot necessarily replace a firewall that blocks incoming traffic. I just hope that - going forward - things like this will never happen again.


Friday, May 25, 2007 1:32:12 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Thursday, May 24, 2007

In order to go through some exercise here, I recently needed to create a few Java classes from an XSD schema. "Well," I thought, "JAXB with its integrated XJC is your friend!" And so it is, but you might have to dig a little deeper.

The problem I was facing was a schema that had references to WS-Security, XML Encryption and XML Signature. As such, it imported all these schemas from the web using <xsd:import namespace="..." schemaLocation="http://..." />. Since xjc is pretty flexible, accessing these schemas on the web worked like a charm, even through the firewall. After all, this is much better than downloading all the referenced schemas (and all schemas they reference) and editing the imports to point to the right location in the file system.

Well, not so quick. In their infinite wisdom and foresight, the schema developers at OASIS and W3C decided to use different schema locations for XML Dsig. They reference the same schema (with identical namespace, obviously), but import through different schemaLocation URIs. That confuses xjc to no end, since it detects a re-definition of the same object and gives up.

In order to resolve this problem, you can create an XML Catalog that allows you to rewrite (or redefine) URLs referenced in your schema. Here is an example:

<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system
      systemId="http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd"
      uri="http://www.w3.org/TR/xmldsig-core/xmldsig-core-schema.xsd" />
</catalog>

This simple catalog redefines the URI used by the XML encryption schema to point to the one used by OASIS. The XML Catalog specification provides many more options, and it is good to know that xjc supports this.
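To illustrate what the catalog does during import resolution, here is a small Python sketch. It is not how xjc works internally; it only mimics exact-match `system` entries, and the helper names `load_system_entries` and `resolve` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# The same catalog as above, embedded as a string.
CATALOG = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system
      systemId="http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd"
      uri="http://www.w3.org/TR/xmldsig-core/xmldsig-core-schema.xsd" />
</catalog>"""


def load_system_entries(catalog_xml):
    """Read all <system> entries into a systemId -> uri mapping."""
    root = ET.fromstring(catalog_xml)
    entries = {}
    for element in root.findall(f"{{{CATALOG_NS}}}system"):
        entries[element.get("systemId")] = element.get("uri")
    return entries


def resolve(entries, schema_location):
    """Return the rewritten location, or the original one if nothing matches."""
    return entries.get(schema_location, schema_location)


entries = load_system_entries(CATALOG)
dated = "http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/xmldsig-core-schema.xsd"
undated = "http://www.w3.org/TR/xmldsig-core/xmldsig-core-schema.xsd"
resolved = {resolve(entries, dated), resolve(entries, undated)}
```

After resolution, both imports point to a single location, so the schema is no longer seen twice. With xjc itself, the catalog file is passed on the command line through the `-catalog` option.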

While this is quite simple, I found it relatively hard to find concrete examples on how to use this mechanism.


Thursday, May 24, 2007 3:17:18 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 

Copyright by Gerald Beuchelt.