
Minimum Ontology for 2DE Gel Electrophoresis

by Xiaoshu Wang, Romesh Stanislaus, Liu Hong Jiang, Jonas S. Almeida

Introduction

This document serves as a human-readable guide to the Charleston Core Minimum Ontology for 2DE Gel Electrophoresis, hereafter referred to as MO2DG. MO2DG is an ontology designed for annotating the information necessary to describe the production of a 2DE experiment. In particular, the ontology is designed as the RDF counterpart of Minimum Information about 2DE Gel (MI2DG).

Ontology design differs from XML schema design in that the former is a process of modeling knowledge whereas the latter is a process of modeling data. Any modeling process must employ a language, and must therefore comply with the constraints of that language. To model data in XML, one must model the data as a tree. However, not all data are naturally expressed as a tree, nor can all of them be. Therefore, a data structure in an XML schema is sometimes designed not for the semantic relationships of the data but for practical, sometimes aesthetic, purposes. Modeling knowledge is different. First, the language to be used, OWL/RDF, employs a directed labeled graph (DLG). Second, ontology design should consider the potential (or expected) use of an inference engine. So the modeling should adhere to the semantic relationships as much as possible.

However, an all-RDF solution would make the ontology bear very little resemblance to its XML counterpart. The design of this ontology therefore takes a middle road. On the one hand, we try to ground it in existing ontologies, for instance BOSS; on the other hand, we try to mimic the data relationships of its XML counterpart as much as we can.

Schema

The RDF definition of MO2DG is located at http://www.charlestoncore.org/ontology/mo2dg

Design Overview

A 2DE Study

In general, MO2DG considers a piece of 2DE gel work to be a boss:Study. Every TwoDEStudy shall use some BioSample as boss:material and a GelProtocol as boss:method (see Figure 1).

Figure 1: Overview of MO2DG in DLG2.
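The relationship above can be sketched as a handful of triples. The following is a minimal illustration in plain Python, not part of the ontology itself: the prefixes "mo2dg:" and "ex:" and the instance names (ex:study1, ex:sample1, ex:protocol1) are assumptions made for the example, and the check only mirrors the sentence "every TwoDEStudy uses a BioSample as boss:material and a GelProtocol as boss:method".

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
# The "ex:" instances and the "mo2dg:" prefix are hypothetical.
triples = {
    ("ex:study1", "rdf:type", "mo2dg:TwoDEStudy"),
    ("ex:study1", "boss:material", "ex:sample1"),
    ("ex:study1", "boss:method", "ex:protocol1"),
    ("ex:sample1", "rdf:type", "mo2dg:BioSample"),
    ("ex:protocol1", "rdf:type", "mo2dg:GelProtocol"),
}


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def has_typed_object(subject, predicate, cls):
    """True if some object of (subject, predicate) is typed as cls."""
    return any(cls in objects(o, "rdf:type") for o in objects(subject, predicate))


def is_valid_study(study):
    """Check the two usage requirements stated for a TwoDEStudy."""
    return (
        "mo2dg:TwoDEStudy" in objects(study, "rdf:type")
        and has_typed_object(study, "boss:material", "mo2dg:BioSample")
        and has_typed_object(study, "boss:method", "mo2dg:GelProtocol")
    )


print(is_valid_study("ex:study1"))  # True
```

An inference engine working on the real OWL definition would express the same requirement as class restrictions rather than as procedural checks.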

Sample

A BioSample is the subject of study. Each BioSample must specify its species, tissue and cell (see Figure 2).

Figure 2: Design of BioSample in DLG2.

Note
Here species, tissue and cell are each specified as an xsd:string. A more RDF-like approach would use a URI. However, we do not think we are in a position to assign such URIs, so using xsd:string is a temporary solution. In the meantime, we recommend using controlled vocabularies (for example, MeSH) as much as possible.
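As a rough illustration of the string-based design, the sketch below models a BioSample as a small Python record with three mandatory string fields. The requirement that each value be non-empty is an assumption added for the example, and the sample values are hypothetical; in practice the strings would ideally be taken from a controlled vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BioSample:
    """Mirror of the three mandatory xsd:string properties of a BioSample."""

    species: str
    tissue: str
    cell: str

    def __post_init__(self):
        # Assumption for this sketch: a mandatory value may not be blank.
        for name in ("species", "tissue", "cell"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"BioSample.{name} must be a non-empty string")


# Hypothetical values, written the way a controlled vocabulary term might appear.
sample = BioSample(species="Homo sapiens", tissue="Liver", cell="Hepatocytes")
print(sample.species)  # Homo sapiens
```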

Gel Protocol

A typical 2DE gel experiment consists of the following steps:

  1. Prepare sample
  2. Load sample
  3. Run first dimension, typically isoelectric focusing
  4. Run second dimension, typically SDS-PAGE
  5. Stain the gel
  6. Digitize the gel

The above six steps are modeled in the ontology structure shown in Figure 3. Only three of them are modeled as subclasses of boss:Protocol. Two of them, staining and densitometry, are simply modeled as string properties, and sample loading is recorded by the proteinLoad property. As we mentioned earlier, this treatment accommodates the design of the XML counterpart. We will discuss a more RDF-like treatment in a future document.

Figure 3: Design of GelProtocol in DLG2.
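The mapping from the six experimental steps to ontology constructs can be summarized in a small lookup table. This is only a reading aid assembled from the class and property definitions in the Vocabularies section; the table layout and the "kind" labels are choices made for this sketch.

```python
# (experimental step, MO2DG construct, kind of construct)
GEL_STEPS = [
    ("Prepare sample", "SamplePrep", "protocol class"),
    ("Load sample", "proteinLoad", "Reagent-valued property"),
    ("Run first dimension", "IEF_Protocol", "protocol class"),
    ("Run second dimension", "SDS-PAGE_Protocol", "protocol class"),
    ("Stain the gel", "staining", "string property"),
    ("Digitize the gel", "densitometry", "string property"),
]

# Steps modeled as subclasses of boss:Protocol.
protocol_classes = [name for _, name, kind in GEL_STEPS if kind == "protocol class"]
# Steps flattened to xsd:string values.
string_properties = [name for _, name, kind in GEL_STEPS if kind == "string property"]

print(protocol_classes)   # ['SamplePrep', 'IEF_Protocol', 'SDS-PAGE_Protocol']
print(string_properties)  # ['staining', 'densitometry']
```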

Sample Preparation

Preparing samples for a 2DE gel usually involves two main processing steps. They are modeled as shown in Figure 4.

  1. Prefractionation: to reduce the complexity of samples and enrich proteins of interest. This step is modeled as an owl:DatatypeProperty of type xsd:string.
  2. Solubilization: to obtain consistent molecular weight and pI values for the proteins of interest, all noncovalent bonds should be disrupted. This is usually achieved by treating samples with a lysis buffer that contains chaotropes, detergents, reducing agents, ampholytes, protease inhibitors and alkylating agents.

Figure 4: Design of SamplePrep in DLG2.

Isoelectric Focusing Protocol

The first dimension of a 2DE gel usually involves separating protein mixtures according to their isoelectric point. The separation is done by applying an electric field to a medium with a given pH gradient. This process is modeled as shown in Figure 5.

Figure 5: Design of IEF_Protocol in DLG2.

SDS-PAGE Protocol

In the second dimension of a 2DE gel, proteins are separated by size. This is achieved by applying an electric field to an SDS polyacrylamide gel. This step is modeled as shown in Figure 6.

Figure 6: Design of SDS-PAGE Protocol in DLG2.

Reagent

A biological experiment needs chemical reagents. The Reagent class is created for this purpose. Once again, this design tries to accommodate its XML counterpart. A more reasonable approach would at least separate reagents into their own namespace. The design of Reagent and of some related classes used in this ontology is shown in Figure 7.

Figure 7: Design of Reagent in DLG2.

Additional Note

Many concepts that exist in MI2DG are not modeled in MO2DG, for two reasons. First, some of them are not necessary. For instance, "mi2dg/protocol_id" is not needed because an RDF resource is already identified by its URI. Second, many concepts have already been developed elsewhere. For example, most of the "description", "note", "date", "submitter" etc. can use "description", "date" and "creator" as developed by the Dublin Core Metadata Initiative.
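The reuse of Dublin Core terms can be sketched as a simple translation table. The mapping below is one possible reading of the paragraph above, not a normative part of MO2DG: in particular, sending "note" to dc:description is an assumption, and the record values are hypothetical. Fields with no Dublin Core counterpart, such as protocol_id, are simply dropped because the resource URI takes their place.

```python
# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC = "http://purl.org/dc/elements/1.1/"

# MI2DG field name -> Dublin Core property (the "note" row is an assumption).
MI2DG_TO_DC = {
    "description": DC + "description",
    "note": DC + "description",
    "date": DC + "date",
    "submitter": DC + "creator",
}


def to_dublin_core(record):
    """Translate MI2DG-style metadata fields into (property URI, value) pairs."""
    return [(MI2DG_TO_DC[key], value) for key, value in record.items() if key in MI2DG_TO_DC]


# Hypothetical metadata record.
pairs = to_dublin_core({"submitter": "J. Doe", "date": "2005-01-01", "protocol_id": "p1"})
for predicate, value in pairs:
    print(predicate, value)
```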

Vocabularies

Classes

TwoDEStudy

The general concept used to describe a study that uses 2DE gel electrophoresis as its methodology.

Constraints: A TwoDEStudy must use at least one BioSample as its boss:material and one GelProtocol as its boss:method.

Counterpart in MI2DG: "/mi2dg"

BioSample

The biological materials that will be processed and run on a gel during a TwoDEStudy.

Constraints: A BioSample has three mandatory properties: species, tissue and cell, all of which are of type xsd:string.

Counterpart in MI2DG: "/mi2dg/sample_type"

Reagent

This concept represents the chemicals used in an experimental protocol.

Constraints: A Reagent must be provided with the following three properties: chemName, concentration and unit, each of which is an owl:DatatypeProperty. As listed in the Properties section, chemName and unit have range xsd:string and concentration has range xsd:double.

Counterpart in MI2DG: none.
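A minimal sketch of the Reagent class and its subclasses, written as Python dataclasses, may help make the constraints concrete. The field types follow the ranges given in the Properties section (chemName and unit as strings, concentration as a number); the Python class hierarchy is only an analogy for rdfs:subClassOf, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    """A chemical with the three mandatory properties of mo2dg Reagent."""

    chemName: str
    concentration: float
    unit: str


# Python inheritance stands in for "A subClassOf Reagent".
class Chaotrope(Reagent):
    pass


class Detergent(Reagent):
    pass


class Reductant(Reagent):
    pass


# Hypothetical lysis buffer components.
buffer = [
    Chaotrope("urea", 8.0, "M"),
    Detergent("CHAPS", 4.0, "%"),
    Reductant("DTT", 65.0, "mM"),
]

# Every component can be treated uniformly as a Reagent.
print(all(isinstance(r, Reagent) for r in buffer))  # True
```

Because the subclasses add no properties of their own, a reagent such as a chaotrope can be reused outside sample solubilization, which is the point made in the class descriptions that follow.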

Chaotrope

Chaotrope refers to chemical reagents, such as urea at high concentration, that are used to destabilize proteins.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/chaotrope". The difference is that in MO2DG a Chaotrope can be used for something other than protein_solubilization.

Detergent

A chemical reagent that helps increase the solubility of hydrophobic proteins, such as CHAPS, Triton X-100 and NP-40.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/detergent". The reason for the difference is the same as described for chaotropes.

Ampholyte

The ampholytic reagent that helps solubilization.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/ampholyte". The reason for the difference is the same as described for chaotropes.

Reductant

A reducing agent, such as DTT, used to prevent the oxidation of proteins.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/reducing_agent". The reason for the difference is the same as described for chaotropes.

ProteaseInhibitor

A protease inhibitor, used to inhibit active proteases in the buffer.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/protease_inhibitor". The reason for the difference is the same as described for chaotropes.

Alkyl

An alkylating agent.

Constraints: A subClassOf Reagent.

Counterpart in MI2DG: roughly to: "/mi2dg/sample_prep/protein_solubalization/alkylization_agent". The reason for the difference is the same as described for chaotropes.

GelProtocol

Methodology used in a TwoDEStudy.

Constraints: GelProtocol is a subClassOf boss:Protocol.

It must have one of each of the following properties:

  • A SamplePrep as the object of its boss:method property.
  • A proteinLoad property
  • An IEF_Protocol as the object of its boss:method property.
  • An SDS-PAGE_Protocol as the object of its boss:method property.
  • A staining property
  • A densitometry property

Counterpart in MI2DG: None.
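The "one of each" constraint on GelProtocol can be illustrated with a small checker. This is a sketch under stated assumptions: a GelProtocol is represented as a plain dictionary, each boss:method value carries a "type" key naming its class, and all values are hypothetical. The actual ontology would express these requirements as OWL restrictions.

```python
from collections import Counter

# Classes that must each appear exactly once as a boss:method object.
REQUIRED_METHODS = ("SamplePrep", "IEF_Protocol", "SDS-PAGE_Protocol")
# Properties that must be present on the GelProtocol itself.
REQUIRED_PROPERTIES = ("proteinLoad", "staining", "densitometry")


def constraint_violations(gel_protocol):
    """Return the names of required parts that are missing or duplicated."""
    counts = Counter(m["type"] for m in gel_protocol.get("boss:method", []))
    problems = [name for name in REQUIRED_METHODS if counts[name] != 1]
    problems += [name for name in REQUIRED_PROPERTIES if name not in gel_protocol]
    return problems


# Hypothetical description that forgets to record the staining method.
gel_protocol = {
    "boss:method": [
        {"type": "SamplePrep"},
        {"type": "IEF_Protocol"},
        {"type": "SDS-PAGE_Protocol"},
    ],
    "proteinLoad": {"chemName": "total protein", "concentration": 100.0, "unit": "ug"},
    "densitometry": "flatbed scanner",
}

print(constraint_violations(gel_protocol))  # ['staining']
```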

SamplePrep

Procedures used to prepare a BioSample.

Constraints: A subClassOf boss:Protocol.

A SamplePrep must have the following properties:

  • One prefractionation.
  • A Solubilization as its boss:method.

Counterpart in MI2DG: "/mi2dg/sample_prep"

Solubilization

Constraints: A subClassOf boss:Protocol.

It must use one of the following as its materials:

  • Chaotrope
  • Detergent
  • Ampholyte
  • Reductant
  • Alkyl
  • ProteaseInhibitor

Counterpart in MI2DG: "/mi2dg/sample_prep/protein_solubilization".

IEF_Protocol

The protocol that describes the isoelectric focusing step.

Constraints: A subClassOf boss:Protocol.

It must have Rehydration and IEF_Focusing as its boss:method and FocusMedium as its boss:material.

Counterpart in MI2DG: "/mi2dg/conditions/first_dimension"

Rehydration

Protocols used to rehydrate the FocusMedium.

Constraints: A subClassOf boss:Protocol.

It must have the properties rehyBuffer and rehyTime.

Counterpart in MI2DG: "/mi2dg/conditions/first_dimension/rehydration_buffer"

IEF_Focusing

Protocols used to focus the isoelectric point.

Constraints: A subClassOf boss:Protocol.

It must have the properties focusBuffer, focusVolt and focusHour.

Counterpart in MI2DG: the combination of "/mi2dg/conditions/first_dimension/focusing_buffer_system" and "/mi2dg/conditions/first_dimension/electric_field"

FocusMedium

The material, i.e., the gel, used to perform IEF_Focusing.

Constraints: It must have the following properties:

  • typeOfIEF: for instance, whether the medium uses carrier ampholytes (CA) to generate the pH gradient or an immobilized pH gradient (IPG)
  • low_pH
  • high_pH
  • Must have a ccb:Rectangle as the object of its ccb:shape.
  • thickness

Note
The combination of thickness and a ccb:Rectangle seems to suggest another class, such as Cube. But as we mentioned earlier, this ontology is created to show the RDF version of its XML counterpart, so the thickness property is created for this ontology alone.

Counterpart in MI2DG: the combination of "/mi2dg/conditions/first_dimension/focusing_medium_dimension" and "/mi2dg/conditions/first_dimension/ph_range"
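The structure of a FocusMedium can be sketched as a record that combines a pH range, a rectangular shape and a thickness. Several details here are assumptions made for illustration only: the field names of the Rectangle stand-in are hypothetical (the properties of ccb:Rectangle are not described in this document), the check that low_pH is below high_pH within the conventional 0 to 14 range is not stated by the ontology, and the example values merely resemble a typical IPG strip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    """Stand-in for ccb:Rectangle; the field names are hypothetical."""

    width_mm: float
    length_mm: float


@dataclass(frozen=True)
class FocusMedium:
    typeOfIEF: str      # how the pH gradient is generated, e.g. "IPG" or "CA"
    low_pH: float       # lower end of the pH range
    high_pH: float      # higher end of the pH range
    shape: Rectangle    # object of ccb:shape
    thickness: float    # millimeters by default

    def __post_init__(self):
        # Assumed sanity check, not a constraint stated by MO2DG.
        if not (0.0 <= self.low_pH < self.high_pH <= 14.0):
            raise ValueError("expected 0 <= low_pH < high_pH <= 14")


# Hypothetical medium: an immobilized pH gradient strip covering pH 3 to 10.
medium = FocusMedium("IPG", 3.0, 10.0, Rectangle(3.0, 180.0), 0.5)
print(medium.high_pH - medium.low_pH)  # 7.0
```

Keeping thickness next to a two-dimensional shape reproduces the compromise described in the note above rather than introducing a separate three-dimensional class.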

SDS-PAGE_Protocol

The protocol used to do the second dimension of a 2DE Gel Electrophoresis.

Constraints: A subClassOf boss:Protocol. It must have the following properties:

  • equiBuffer
  • runBuffer
  • runVolt
  • runAmp
  • runHour
  • Must use an AcrylamideGel as its boss:material.

Counterpart in MI2DG: "/mi2dg/conditions/second_dimension".

AcrylamideGel

The material, i.e., the polyacrylamide gel, used to separate proteins by their size.

Constraints: It must have the following properties:

  • acrylamidePercent
  • crossLinker
  • manufacturer
  • productID
  • Must have a ccb:Rectangle as the object of its ccb:shape.
  • thickness

Counterpart in MI2DG: the combination of "/mi2dg/conditions/second_dimension/gel_composition" and "/mi2dg/conditions/second_dimension/gel_dimension"

Properties

species

A DatatypeProperty that indicates the species from which the biological sample comes.

domain: BioSample

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_type/species"

tissue

A DatatypeProperty that indicates the tissue from which the biological sample is obtained.

domain: BioSample

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_type/tissue"

cell

A DatatypeProperty that indicates the cell type from which the biological sample is obtained.

domain: BioSample

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_type/cell"

chemName

A DatatypeProperty that is used to indicate the name of a Reagent.

domain: Reagent

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_prep/protein_solubilization/*[@name]"

Note
We have to use a wildcard in the above XPath expression to indicate the mapping. However, the concept of chemName is more general than its design in MI2DG.
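The wildcard mapping can be tried out with Python's standard xml.etree.ElementTree module, which supports the "*" step and the "[@attribute]" predicate. The XML fragment below is a hypothetical MI2DG-like document assembled from the paths and attribute names quoted in this guide (name, conc, unit); its values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element and attribute names follow the quoted paths.
xml_text = """
<mi2dg>
  <sample_prep>
    <protein_solubilization>
      <chaotrope name="urea" conc="8" unit="M"/>
      <detergent name="CHAPS" conc="4" unit="%"/>
    </protein_solubilization>
  </sample_prep>
</mi2dg>
"""

root = ET.fromstring(xml_text)

# ElementTree paths are relative to the root element, so "/mi2dg" becomes ".".
elements = root.findall("./sample_prep/protein_solubilization/*[@name]")

# Each matched element yields the values of chemName, concentration and unit.
reagents = [
    (el.tag, el.get("name"), float(el.get("conc")), el.get("unit"))
    for el in elements
]
for reagent in reagents:
    print(reagent)
```

Whatever the element is called, its name attribute feeds the same chemName property, which is why the property is more general than the individual MI2DG elements.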

concentration

A DatatypeProperty that is used to indicate the concentration of a Reagent.

domain: Reagent

range: xsd:double

Counterpart in MI2DG "/mi2dg/sample_prep/protein_solubilization/*[@conc]"

unit

A DatatypeProperty that is used to indicate the unit of a measurement. In our case, it is the unit of concentration.

domain: Reagent

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_prep/protein_solubilization/*[@unit]"

Note
Units are always a problem when defining a data standard. We wish that an organization would develop an ontology so that a URI could be used to refer to a specific unit.

manufacturer

A DatatypeProperty that is used to indicate the manufacturer of a product.

domain: owl:Thing

range: xsd:string

Counterpart in MI2DG "/mi2dg/conditions/second_dimension/gel_composition[@manufacturer_name]"

Note
This property is created out of the need for one element in MI2DG. But once again, it is more general than its usage in MI2DG.

productID

A DatatypeProperty that indicates the product identifier used when ordering from a manufacturer.

domain: owl:Thing

range: xsd:string

Counterpart in MI2DG "/mi2dg/conditions/second_dimension/gel_composition[@product_id]"

densitometry

The property is used to record how the numerical values of a 2DE gel are obtained.

domain: GelProtocol

range: xsd:string

Counterpart in MI2DG "/mi2dg/conditions/second_dimension/detection_protocol"

staining

The property is used to record the method of staining.

domain: GelProtocol

range: xsd:string

Counterpart in MI2DG "/mi2dg/conditions/second_dimension/detection_protocol"

proteinLoad

The property is used to record the amount of protein loaded to run a 2DE gel.

domain: GelProtocol

range: Reagent

Counterpart in MI2DG "/mi2dg/conditions/first_dimension/protein_load"

Note
You may have noticed that MI2DG and MO2DG organize staining, densitometry and proteinLoad differently. This in part reflects a difference in design philosophy. MI2DG follows the experimental procedure, whereas MO2DG attempts to group things logically. For instance, although the sample protein is loaded during the preparation of the first dimension, its purpose is to run on both dimensions.

prefractionation

The property is used to record the method of prefractionation during sample preparation.

domain: SamplePrep

range: xsd:string

Counterpart in MI2DG "/mi2dg/sample_prep/prefractionation"

low_pH

The property is used to record the lower end of the pH range of IEF.

domain: FocusMedium

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/ph_range[@low]"

high_pH

The property is used to record the higher end of the pH range of IEF.

domain: FocusMedium

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/ph_range[@high]"

typeOfIEF

The property is used to indicate how the pH gradient is generated, i.e., whether through carrier ampholytes or via an immobilized pH gradient.

domain: FocusMedium

range: xsd:string

Counterpart in MI2DG "/mi2dg/condition/first_dimension/IEF_method"

rehyBuffer

The property is used to record the composition of rehydration buffer.

domain: Rehydration

range: xsd:string

Counterpart in MI2DG "/mi2dg/condition/first_dimension/rehydration_buffer/"

rehyTime

The property is used to record the time used to rehydrate the IEF strip.

domain: Rehydration

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/rehydration_buffer[@hours]"

Note
The unit of rehyTime shall be in hours.

focusBuffer

The property is used to record the composition of IEF focusing buffer.

domain: IEF_Focusing

range: xsd:string

Counterpart in MI2DG "/mi2dg/condition/first_dimension/focusing_buffer_system"

focusVolt

The property is used to record the electric voltage used to perform IEF focusing.

domain: IEF_Focusing

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/electric_field[@volts]"

focusHour

The property is used to record the length of time for which electric power is applied during IEF focusing.

domain: IEF_Focusing

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/electric_field[@hours]"

acrylamidePercent

The property is used to indicate the percentage of acrylamide used in the SDS-PAGE gel.

domain: AcrylamideGel

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/second_dimension/gel_composition[@percent_acrylamide]"

crossLinker

The property is used to indicate the proportion of cross-linking agent used in the SDS-PAGE gel.

domain: AcrylamideGel

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/second_dimension/gel_composition[@cross_linker]"

The unit of this value is percentage.

equiBuffer

The property is used to indicate the buffer used to equilibrate the SDS-PAGE gel.

domain: SDS-PAGE_Protocol

range: xsd:string

Counterpart in MI2DG "/mi2dg/condition/second_dimension/equillibration_buffer"

runBuffer

The property is used to indicate the buffer used to run SDS-PAGE gel.

domain: SDS-PAGE_Protocol

range: xsd:string

Counterpart in MI2DG "/mi2dg/condition/second_dimension/running_buffer_system"

runVolt

The property is used to indicate the electric voltage used to run SDS-PAGE gel.

domain: SDS-PAGE_Protocol

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/second_dimension/electric_field[@volts]"

runAmp

The property is used to indicate the electric current (in amperes) used to run the SDS-PAGE gel.

domain: SDS-PAGE_Protocol

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/second_dimension/electric_field[@amperage]"

runHour

The property is used to indicate the time (in hours) used to run SDS-PAGE gel.

domain: SDS-PAGE_Protocol

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/second_dimension/electric_field[@hours]"
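The attribute-to-property mapping for the second-dimension running conditions can be sketched in the same way. The fragment below is a hypothetical MI2DG-like document that follows the paths quoted in this section ("/mi2dg/condition/second_dimension/electric_field" with the attributes volts, amperage and hours); the numbers are invented, and converting each value to a float reflects the xsd:double range of runVolt, runAmp and runHour.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the element path and attribute names come from the guide.
xml_text = """
<mi2dg>
  <condition>
    <second_dimension>
      <electric_field volts="200" amperage="0.04" hours="5"/>
    </second_dimension>
  </condition>
</mi2dg>
"""

# MI2DG attribute name -> MO2DG property name.
ATTRIBUTE_TO_PROPERTY = {"volts": "runVolt", "amperage": "runAmp", "hours": "runHour"}

root = ET.fromstring(xml_text)
field = root.find("./condition/second_dimension/electric_field")

# Read each attribute and store it under the corresponding MO2DG property.
run = {prop: float(field.get(attr)) for attr, prop in ATTRIBUTE_TO_PROPERTY.items()}
print(run)  # {'runVolt': 200.0, 'runAmp': 0.04, 'runHour': 5.0}
```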

thickness

The property is used to indicate the thickness of a thing. In this context, millimeters are used as the default unit.

domain: owl:Thing

range: xsd:double

Counterpart in MI2DG "/mi2dg/condition/first_dimension/focus_medium_dimension[@height]" and "/mi2dg/condition/second_dimension/gel_dimension[@height]"