What is spot2?
Conceptually, I think we can all understand what a "spot2" is. By definition, spot2 is a proteomics analyte confined within a region of a 2DE gel. But in practice, what do you expect to retrieve by hitting its URI?
Of course, our technology hasn't advanced to the Star Trek stage, so getting the real spot object is out of the question. Over the web, the URI can only lead us to a stream of electronic bits. The question, then, is what those bits should represent. Should it be an image (and in which format: TIFF, JPEG, PNG, or something else?) or an XML document (and conforming to which schema: AGML, HUP-ML, or another?)? I think an HTML page discussing the question "What is spot2?" is perhaps the least expected.
In an ideal RDF world, we would expect to see an RDF document about the resource. But the world of RDF is distributed: information about a particular resource doesn't have to be physically co-located with the resource. Collecting all of it in one place is technically impossible as well. For instance, consider what we should put in this web page if we knew everything about spot2. Molecular weight, pI, shape, and intensity are the obvious choices (from the perspective of a 2DE experiment). A protein identification would be nice, so do we need to attach a protein sequence? But what about the function of the identified protein? Or the molecular composition of the sequence, its chemical bonds, and so on? Where, and by which criteria, are we going to stop?
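To make the question concrete, here is a minimal sketch of one way a client could state which representation it wants: the HTTP Accept header. The URI http://example.org/gel/spot2 is a made-up placeholder, the media types are only examples, and the sketch assumes a server that honours Accept. The requests are built but never sent.

```python
from urllib.request import Request

# Hypothetical URI for the spot; not a real endpoint.
SPOT2_URI = "http://example.org/gel/spot2"


def build_request(uri, media_type):
    """Build (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": media_type})


# The same URI, asked for in four different forms.
candidates = {
    name: build_request(SPOT2_URI, media_type)
    for name, media_type in {
        "image": "image/png",
        "xml": "application/xml",
        "rdf": "application/rdf+xml",
        "html": "text/html",
    }.items()
}

for name, req in candidates.items():
    print(name, req.full_url, req.get_header("Accept"))
```

Every request targets the same URI, so the URI alone does not say which of these forms "is" spot2.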
Identity Crisis
At its core, the above problem is caused by the lack of a clear definition of a resource. By RDF's definition, a resource is anything that has a URI. But using a URL to represent a logical entity other than a web page leads to the Identity Crisis that has been the topic of an ongoing discussion in the web community (see Berners-Lee 2003, Booth 2003, Clark 2002, Hawke 2002).
Any solution?
Depending on the problem at hand, there are many practical solutions that don't require settling the philosophical argument. For instance, Pepper (2003) pointed out that part of the problem is caused by the mixed use of URIs to refer to resources both directly and indirectly. He suggests using subject indicators and identifiers to solve the problem.
But as pointed out by Berners-Lee, any resource can have multiple dimensions. That means that even if we know when a URI is used to label and when it is used to retrieve, we still don't know in which dimension the resource will or should show up. (Again, what do you expect to get from spot2?)
A practical solution is to avoid URLs and use URNs instead. For instance, using the Life Science Identifier (LSID) can turn the above logical argument into an implementation decision. Because LSID couples its naming scheme with a retrieval framework, it is up to the implementer to decide what kind of resource to offer. (IMHO, this is LSID's really important contribution to the RDF world.) With LSID, one doesn't have to figure out in advance what the resource is and in which form it will show up. A user can find out through simple querying, by asking questions like: Do you have spot2 in AGML format? The current LSID specification does not work as we suggested in the last section, though it can be twisted to do so. We certainly hope that a future version of LSID can be modified to suit this purpose.
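The kind of querying described above can be sketched as a toy in-memory resolver. This is only an illustration of the idea, not the LSID specification's actual interface: the identifier urn:lsid:example.org:gel:spot2, the function names, and the placeholder representations are all assumptions made for this example. The only part taken from LSID itself is the general shape of the name (urn:lsid:authority:namespace:object).

```python
# Hypothetical identifier for the spot; example.org is a placeholder authority.
SPOT2 = "urn:lsid:example.org:gel:spot2"

# Toy store mapping (identifier, format) to a representation.
# The values are placeholders, not real AGML or image data.
REPRESENTATIONS = {
    (SPOT2, "AGML"): "<agml>placeholder</agml>",
    (SPOT2, "PNG"): b"placeholder image bytes",
}


def parse_lsid(lsid):
    """Split an LSID-style URN into authority, namespace, and object."""
    parts = lsid.split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID-style URN: " + lsid)
    return {"authority": parts[2], "namespace": parts[3], "object": parts[4]}


def available_formats(lsid):
    """List the formats this resolver can offer for an identifier."""
    return sorted(fmt for (ident, fmt) in REPRESENTATIONS if ident == lsid)


def has_format(lsid, fmt):
    """Answer the question: do you have this resource in this format?"""
    return (lsid, fmt) in REPRESENTATIONS


def retrieve(lsid, fmt):
    """Return the representation, or None if the format is not offered."""
    return REPRESENTATIONS.get((lsid, fmt))


print(parse_lsid(SPOT2)["object"])
print(available_formats(SPOT2))
print(has_format(SPOT2, "AGML"), has_format(SPOT2, "HUP-ML"))
```

The name says nothing about the form of the resource. Which formats exist is decided entirely by whoever fills the store, which is the implementation decision described above.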
References
- Clark, Kendall Grant (2002) Identity Crisis, XML.com, September 11, 2002, http://www.xml.com/pub/a/2002/09/11/deviant.html
- Booth, David (2003) Four Uses of a URL: Name, Concept, Web Location and Document Instance, January 28 2003, http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm
- Hawke, Sandro (2002) Disambiguating RDF Identifiers, January 4 2003, http://www.w3.org/2002/12/rdf-identifiers/
- Berners-Lee, Tim (2003) What do HTTP URIs Identify?, February 15 2003, http://www.w3.org/DesignIssues/HTTP-URI
- Pepper, Steve (2003) Curing the Web's Identity Crisis. http://www.ontopia.net/topicmaps/materials/identitycrisis.html





