Charleston Core Gel Ontology
by Xiaoshu Wang
Introduction
The Charleston Core Gel Ontology (hereafter referred to as CCGel) defines concepts that can be used to describe the results of a two-dimensional gel experiment. CCGel defines the semantics of the basic vocabulary used daily by wet-bench scientists, such as "Gel" and "Spot". But CCGel also defines other types of "gel", such as "reference gel" and "image gel", together with the corresponding "reference spot" and "image spot". The reasons for "inventing" these terms are described in the following paragraphs.
Designing an ontology faces a debate that has challenged philosophers for centuries. Ever since the study of Ontology began, questions have been raised about what an ontology should be committed to. Ontological commitment here means that "the existence of one thing is presupposed or implied by asserting the existence of another"[1]. Historically, different camps of philosophers have held different views on ontological commitment. For realists, who believe that a world exists beyond human language and belief, ontological commitment should be made to this external world. For analytical philosophers, however, the world is only available through the glass of our language and belief. Thus, an ontology should be committed to the metaphysics of our internal world.
Designing an information technology ontology, though much more practical than philosophical Ontology, cannot avoid the conflict that arises from these split philosophical viewpoints on ontological commitment. An ontology for gel description is no exception. What is, for example, a "spot" of a "gel"? On one hand, it refers to the confined region where a protein or a set of proteins resides. On the other hand, it refers to a region of the image of a gel where the intensity differs from the background. Is the distinction important? The answer depends, well, on your philosophical point of view. Because an actual protein spot can give rise to multiple copies of gel images, a polyacrylamide gel is never the same as its image. When we say "a spot is cut out to run MALDI-TOF", the "spot" means the macromolecules being resolved rather than the "spot" on an image. A protein spot can have properties such as isoelectric point, molecular weight and perhaps sequence, whereas an image spot has coordinates, shape and intensity.
To model strictly according to reality, however, incurs unnecessary overhead. The objective of developing an IT ontology is to help data representation. In other words, our primary task is to encode data in such a way that machines can understand it with minimal human intervention. But we have to keep in mind that it is still humans who program the machine to act. It follows that, programmatically, a clear distinction is not that important as long as we clearly specify what we mean by a word.
"To create effective representations", as Barry Smith writes[2], "it is an advantage if one knows something about the things and processes one is trying to represent." Hence, let us follow this Ontologist's Credo and analyze the process of a 2DE experiment.
The process of a gel experiment can be divided into the following steps:
- Two-dimensional gel experiment: Sample tissue goes through a series of steps (preparation, solubilization, reduction, isoelectric focusing and SDS-PAGE separation), resulting in a "gel".
- Image acquisition: The gel resulting from the previous step is digitized into an image of a certain format. Depending on the dye used to stain the gel, a range of machines can be used, such as scanners, laser densitometers, charge-coupled device (CCD) cameras, and fluorescent and phosphor imagers.
- Image analysis: After the gel is acquired in digitized form, the image is analyzed in the following stages:
  - Preprocessing: This step deals with noise removal, background correction, streak artifacts, etc.
  - Spot detection: After the preprocessing stage, algorithms such as Gaussian fitting and Laplacian filtering can be applied to detect the location of spots.
  - Image registration: If a series of images needs to be compared, a pattern matching algorithm is used to establish corresponding spots in the reference and target images, often with user-defined landmarks.
Note: The spot detection and image registration steps can be reversed in certain cases. For instance, the Z3 system uses an intensity-based image registration mechanism first and feature detection later. But the end result of
Figure 1: General processing of a 2DE gel.
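The spot detection stage above mentions Laplacian filtering. The following is a minimal sketch of that idea, not the method of any particular analysis package. It applies a 4-neighbour discrete Laplacian to a small synthetic image and reports pixels whose response is strongly negative, which is what a local intensity peak produces. The image values, the threshold, the function names, and the assumption that spots are brighter than the background are all illustrative.

```python
def laplacian(image):
    """Apply the 4-neighbour discrete Laplacian to the interior of a 2D list.

    Border pixels are left at 0.0 because they lack a full neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1]
                         - 4 * image[r][c])
    return out


def detect_spots(image, threshold):
    """Return (row, col) of pixels whose Laplacian response is below -threshold.

    Assumption: spots are brighter than the background, so a spot centre is a
    local maximum and its Laplacian is strongly negative.
    """
    response = laplacian(image)
    return [(r, c)
            for r, row in enumerate(response)
            for c, value in enumerate(row)
            if value < -threshold]


# Hypothetical 5x5 image: flat background with one bright spot at (2, 2).
image = [
    [1, 1, 1, 1, 1],
    [1, 1, 2, 1, 1],
    [1, 2, 9, 2, 1],
    [1, 1, 2, 1, 1],
    [1, 1, 1, 1, 1],
]
spots = detect_spots(image, threshold=10)
```

A real pipeline would smooth the image first (the preprocessing stage) and would pick the threshold from the noise level; both are left out here to keep the sketch short.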
From the above process, it is clear that the word "Gel" can refer to a physical gel or sometimes to an image. But no matter which "gel" is meant, the "analog" polyacrylamide gel or the "digitized" image gel, conceptually we recognize a gel as a list of spots. The purpose of running a 2DE is to separate proteins (spots); the purpose of running image analysis is likewise to detect (or separate) "spots". With this perception, it is easier (or at least more pragmatic for us) to take the stand of the analytical philosopher. That is, we model our ontology according to how we think.
In this way, we can obtain a taxonomy of Gel concepts. A digitized gel is a gel because it is conceptually the same; only its spots have the properties of location, shape and intensity. On the other hand, a gel serving as a reference is also a gel. For example, the reference map for the federated 2DE database proposed by Appel et al.[3] is also a Gel. But on the reference map, the protein spot, or "annotated spot", is located on a pI-MW plane rather than by x- and y-coordinates. Therefore, a RefGel and an ImgGel are different, but both can be considered a Gel.
Figure 2: Hierarchy of Gel concepts.
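The taxonomy described above can be approximated with ordinary class inheritance. This is only a sketch: Python subclassing is a closed, single-program notion and does not carry the full semantics of an ontology language, and the `spots` attribute is an assumed way of expressing "a gel is a list of spots".

```python
class Spot:
    """Root concept for any kind of spot."""


class RefSpot(Spot):
    """Annotated spot located on the pI-MW plane."""


class ImgSpot(Spot):
    """Spot located by x- and y-coordinates on a digitized image."""


class Gel:
    """Root concept: conceptually, a gel is a list of spots."""

    def __init__(self, spots=None):
        # Assumed representation; the ontology itself only names the concept.
        self.spots = list(spots or [])


class RefGel(Gel):
    """A gel serving as a reference map."""


class ImgGel(Gel):
    """A digitized image of a gel."""
```

With these definitions, `isinstance(ImgGel(), Gel)` is true while `issubclass(RefGel, ImgGel)` is false, which mirrors the statement that the two kinds of gel differ but are both a Gel.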
With this pragmatic approach, we define the basic ontology for describing a Gel.
The overall ontology is shown in Figure 4.
Figure 4: Gel ontology (the syntax of the graphical language is DLG2).
Vocabulary
Classes
Gel and Spot
The concepts ccgel:Gel and ccgel:Spot are very general. They are devised to serve as the roots of their subclasses. Put another way, they are designed as concept holders for the concepts shown in Figure 2. No property restrictions are given, so that special types of Gel or Spot can be derived by extension.
Figure 5: Gel and Spot.
RefGel
A RefGel is a reference gel. In a RefGel, the two-dimensional plane is defined by the "pi" and "mw" (molecular weight) of a protein spot. A RefGel has a property restriction: if a hasSpot property is used, its object must be of type RefSpot.
RefSpot
A RefSpot is a "hypothetical spot" on the pi-mw plane. Each such spot should, in theory, represent exactly one protein entity.
ImgGel
An ImgGel is the image of a gel. An ImgGel is described with reference to a set of hypothetical x- and y-coordinates. The values of its hasSpot property must be of type ImgSpot, which carries various geometric properties.
ImgSpot
An ImgSpot has a center, a shape and an intensity.
OvalSpot
OvalSpot is a subclass of ImgSpot. The shape of an OvalSpot must be a ccb:Ellipse.
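The hasSpot restrictions on RefGel and ImgGel can be sketched as a simple validation routine. Note the assumption: an ontology reasoner would normally treat such a restriction as something to infer from, whereas this sketch treats it as a closed-world check that flags mismatches. The table names and the `violations` function are hypothetical.

```python
# Subclass links taken from the class descriptions above.
SUBCLASS_OF = {
    "RefSpot": "Spot",
    "ImgSpot": "Spot",
    "OvalSpot": "ImgSpot",
}

# Gel class -> class that every hasSpot value must belong to.
HAS_SPOT_RESTRICTION = {
    "RefGel": "RefSpot",
    "ImgGel": "ImgSpot",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def violations(gel_type, spots):
    """List the names of spots that break the hasSpot restriction of gel_type.

    spots is a list of (name, spot_type) pairs. A plain Gel has no
    restriction, so nothing is reported for it.
    """
    required = HAS_SPOT_RESTRICTION.get(gel_type)
    if required is None:
        return []
    return [name for name, spot_type in spots if not is_a(spot_type, required)]
```

For example, `violations("ImgGel", [("s1", "OvalSpot"), ("s2", "RefSpot")])` reports only `"s2"`, because an OvalSpot is also an ImgSpot.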
MGel and MSpot
MGel (think of it as MasterGel or MixedGel) is the intersection of RefGel and ImgGel. This concept was created to accommodate the demands of particular software. For instance, PDQuest stores every duplicate gel image (e.g., ImgGel) with spots that are labeled with both x- and y-coordinates and pi and mw values. In this case, MGel comes in handy, allowing users to record a gel independently without referring to an external reference gel.
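A rough sketch of what "intersection" means here: anything typed as both RefGel and ImgGel is also an MGel. The text does not spell out MSpot, so the record below simply assumes it is a spot that carries both kinds of coordinates; the field names and the `classify_gel` helper are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MSpot:
    """Assumed shape of an MGel spot: image and reference coordinates together."""
    x: float   # image x-coordinate
    y: float   # image y-coordinate
    pi: float  # isoelectric point
    mw: float  # molecular weight in Da


def classify_gel(asserted_types):
    """Return the asserted types plus 'MGel' when both operand classes are present."""
    inferred = set(asserted_types)
    if {"RefGel", "ImgGel"} <= inferred:
        inferred.add("MGel")
    return inferred
```

A gel asserted only as an ImgGel stays an ImgGel; one asserted as both RefGel and ImgGel is additionally classified as an MGel.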
Properties
hasSpot
The ccgel:hasSpot is an ObjectProperty. Its domain is a ccgel:Gel and its range is a ccgel:Spot.
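Under RDFS semantics, a domain and a range are used to infer types: whatever appears as the subject of ccgel:hasSpot is a ccgel:Gel, and whatever appears as its object is a ccgel:Spot. The sketch below applies just those two entailments to a list of triples. The `ex:` resources are hypothetical, and a real application would use an RDF library and reasoner rather than plain tuples.

```python
HAS_SPOT = "ccgel:hasSpot"
HAS_SPOT_DOMAIN = "ccgel:Gel"
HAS_SPOT_RANGE = "ccgel:Spot"


def infer_types(triples):
    """Apply the domain and range entailments for ccgel:hasSpot.

    triples is an iterable of (subject, predicate, object) strings. The
    result is the set of rdf:type triples that follow from them.
    """
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate == HAS_SPOT:
            inferred.add((subject, "rdf:type", HAS_SPOT_DOMAIN))
            inferred.add((obj, "rdf:type", HAS_SPOT_RANGE))
    return inferred


# Hypothetical data: one gel with one spot.
triples = [("ex:gel1", "ccgel:hasSpot", "ex:spot7")]
types = infer_types(triples)
```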
pi
Property "pi" is a DatatypeProperty that describes the isoelectric point (pI) of a reference protein spot or a set of reference protein spots (RefSpot).
mw
Property "mw" is a DatatypeProperty that describes the molecular weight of a reference protein spot or a set of reference protein spots (RefSpot). The standard unit is the Dalton (Da).
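Because mw is expected in Dalton, values reported in kilodaltons need converting before they are recorded. A small sketch, with a hypothetical record type and made-up example values:

```python
from dataclasses import dataclass

DALTONS_PER_KILODALTON = 1000.0


@dataclass(frozen=True)
class RefSpotValues:
    """Hypothetical holder for the two datatype properties of a RefSpot."""
    pi: float  # isoelectric point
    mw: float  # molecular weight in Dalton (Da)


def from_kilodaltons(pi, mw_kda):
    """Build a record from a molecular weight given in kDa."""
    return RefSpotValues(pi=pi, mw=mw_kda * DALTONS_PER_KILODALTON)


# Illustrative values only.
spot = from_kilodaltons(pi=5.5, mw_kda=66.5)
```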
center
Each ImgSpot should have a center. But the precise mathematical definition of this center may vary with the type of spot. For instance, for an "OvalSpot", the center of the spot is the center of its Ellipse.
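To illustrate how the definition of a center can depend on the spot type, the sketch below computes it two ways. The `Ellipse` class is a hypothetical stand-in for ccb:Ellipse, described here by its bounding box, and the intensity-weighted centroid is an assumed alternative for irregular spots; neither is taken from the ontology itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """Hypothetical stand-in for ccb:Ellipse: an axis-aligned bounding box."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def ellipse_center(ellipse):
    """Center of an axis-aligned ellipse: the midpoint of its bounding box."""
    return ((ellipse.x_min + ellipse.x_max) / 2,
            (ellipse.y_min + ellipse.y_max) / 2)


def weighted_centroid(pixels):
    """Intensity-weighted centroid of (x, y, intensity) triples."""
    total = sum(i for _, _, i in pixels)
    if total == 0:
        raise ValueError("spot has zero total intensity")
    return (sum(x * i for x, _, i in pixels) / total,
            sum(y * i for _, y, i in pixels) / total)
```

For a symmetric oval spot the two definitions coincide; for a skewed spot they generally do not, which is why the property leaves the exact definition to the spot type.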
spotShape
Each ImgSpot has a Shape.
Intensity
The opacity value of a spot. The precise meaning of Intensity may vary. But unless otherwise specified, it shall refer to the total quantity of the spot.
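One common reading of "total quantity" is the sum of the pixel values inside the spot after subtracting the background. The sketch below assumes that reading; the constant background level and the clamping of negative values to zero are illustrative choices, not part of the ontology.

```python
def total_intensity(pixel_values, background=0.0):
    """Sum background-subtracted pixel values of a spot.

    Pixels at or below the background contribute nothing (assumed clamping).
    """
    return sum(max(value - background, 0.0) for value in pixel_values)
```

For instance, pixel values 12, 15, 11 and 9 over a background of 10 give a total of 8.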
References
1. TheFreeDictionary.com, "Ontological commitment" [URL: http://encyclopedia.thefreedictionary.com/Ontological%20commitment]
2. Barry Smith, "Ontology and Information Systems" [URL: http://ontology.buffalo.edu/smith//articles/ontologies.htm]
3. R.D. Appel, A. Bairoch, J.C. Sanchez, J.R. Vargas, O. Golaz, C. Pasquali, D.F. Hochstrasser. Federated 2-DE database: a simple means of publishing 2-DE data. Electrophoresis 17, 1996, 540-546.
Schema
The definition of this ontology is available at "http://www.charlestoncore.org/ontology/gel".
