readCbModel

PURPOSE ^

readCbModel Read in a constraint-based model

SYNOPSIS ^

function model = readCbModel(fileName,defaultBound,fileType,modelDescription,compSymbolList,compNameList)

DESCRIPTION ^

readCbModel Read in a constraint-based model

 model = readCbModel(fileName,defaultBound,fileType,modelDescription)

 If no arguments are passed to the function, the user will be prompted for
 a file name.

OPTIONAL INPUTS
 fileName          File name for file to read in (optional)
 defaultBound      Default value for maximum flux through a reaction if
                   not given in the SBML file (Default = 1000)
 fileType          File type for input files: 'SBML', 'SimPheny', or
                   'SimPhenyPlus', 'SimPhenyText' (Default = 'SBML')
                   * SBML indicates a file in SBML format
                   * SimPheny is a set of three files in SimPheny 
                     simulation output format
                   * SimPhenyPlus is the same as SimPheny except with 
                     additionalfiles containing gene-protein-reaction 
                     associations andcompound information
                   * SimPhenyText is the same as SimPheny except with 
                     additionaltext file containing gene-protein-reaction
                     associations
 modelDescription  Description of model contents 
 compSymbolList    Compartment Symbol List
 compNameList      Name of compartments corresponding to compartment
                   symbol list 

OUTPUT
 Returns a model in the COBRA format:

 model         
  description      Description of model contents
  rxns             Reaction names
  mets             Metabolite names
  S                Stoichiometric matrix
  lb               Lower bounds
  ub               Upper bounds
  rev              Reversibility vector
  c                Objective coefficients
  subSystems       Subsystem name for each reaction (opt)
  grRules          Gene-reaction association rule for each reaction (opt)
  rules            Gene-reaction association rule in computable form (opt)
  rxnGeneMat       Reaction-to-gene mapping in sparse matrix form (opt)
  genes            List of all genes (opt)
  rxnNames         Reaction description (opt)
  metNames         Metabolite description (opt)
  metFormulas      Metabolite chemical formula (opt)

EXAMPLES OF USE:

 1)    Load a file to be specified in a dialog box:

           model = readCbModel;

 2)    Load model named 'iJR904' in SBML format with maximum flux set
       at 1000 (requires file named 'iJR904.xml' to exist)

           model = readCbModel('iJR904',1000,'SBML');

 3)    Load model named 'iJR904' in SimPheny format with maximum flux set
       at 500 (requires files named 'iJR904.rxn', 'iJR904.met', and 'iJR904.sto' to exist)

           model = readCbModel('iJR904',500,'SimPheny');

 4)    Load model named 'iJR904' in SimPheny format with gpr and compound information
       (requires files named 'iJR904.rxn', 'iJR904.met','iJR904.sto',
        'iJR904_gpr.txt', and 'iJR904_cmpd.txt' to exist)
,
           model = readCbModel('iJR904',500,'SimPhenyPlus');

 Markus Herrgard 7/11/06

 Richard Que 02/08/10 - Added inptus for compartment names and symbols

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function model = readCbModel(fileName,defaultBound,fileType,modelDescription,compSymbolList,compNameList)
0002 %readCbModel Read in a constraint-based model
0003 %
0004 % model = readCbModel(fileName,defaultBound,fileType,modelDescription)
0005 %
0006 % If no arguments are passed to the function, the user will be prompted for
0007 % a file name.
0008 %
0009 %OPTIONAL INPUTS
0010 % fileName          File name for file to read in (optional)
0011 % defaultBound      Default value for maximum flux through a reaction if
0012 %                   not given in the SBML file (Default = 1000)
0013 % fileType          File type for input files: 'SBML', 'SimPheny', or
0014 %                   'SimPhenyPlus', 'SimPhenyText' (Default = 'SBML')
0015 %                   * SBML indicates a file in SBML format
0016 %                   * SimPheny is a set of three files in SimPheny
0017 %                     simulation output format
0018 %                   * SimPhenyPlus is the same as SimPheny except with
0019 %                     additionalfiles containing gene-protein-reaction
0020 %                     associations andcompound information
0021 %                   * SimPhenyText is the same as SimPheny except with
0022 %                     additionaltext file containing gene-protein-reaction
0023 %                     associations
0024 % modelDescription  Description of model contents
0025 % compSymbolList    Compartment Symbol List
0026 % compNameList      Name of compartments corresponding to compartment
0027 %                   symbol list
0028 %
0029 %OUTPUT
0030 % Returns a model in the COBRA format:
0031 %
0032 % model
0033 %  description      Description of model contents
0034 %  rxns             Reaction names
0035 %  mets             Metabolite names
0036 %  S                Stoichiometric matrix
0037 %  lb               Lower bounds
0038 %  ub               Upper bounds
0039 %  rev              Reversibility vector
0040 %  c                Objective coefficients
0041 %  subSystems       Subsystem name for each reaction (opt)
0042 %  grRules          Gene-reaction association rule for each reaction (opt)
0043 %  rules            Gene-reaction association rule in computable form (opt)
0044 %  rxnGeneMat       Reaction-to-gene mapping in sparse matrix form (opt)
0045 %  genes            List of all genes (opt)
0046 %  rxnNames         Reaction description (opt)
0047 %  metNames         Metabolite description (opt)
0048 %  metFormulas      Metabolite chemical formula (opt)
0049 %
0050 %EXAMPLES OF USE:
0051 %
0052 % 1)    Load a file to be specified in a dialog box:
0053 %
0054 %           model = readCbModel;
0055 %
0056 % 2)    Load model named 'iJR904' in SBML format with maximum flux set
0057 %       at 1000 (requires file named 'iJR904.xml' to exist)
0058 %
0059 %           model = readCbModel('iJR904',1000,'SBML');
0060 %
0061 % 3)    Load model named 'iJR904' in SimPheny format with maximum flux set
0062 %       at 500 (requires files named 'iJR904.rxn', 'iJR904.met', and 'iJR904.sto' to exist)
0063 %
0064 %           model = readCbModel('iJR904',500,'SimPheny');
0065 %
0066 % 4)    Load model named 'iJR904' in SimPheny format with gpr and compound information
0067 %       (requires files named 'iJR904.rxn', 'iJR904.met','iJR904.sto',
0068 %        'iJR904_gpr.txt', and 'iJR904_cmpd.txt' to exist)
0069 %,
0070 %           model = readCbModel('iJR904',500,'SimPhenyPlus');
0071 %
0072 % Markus Herrgard 7/11/06
0073 %
0074 % Richard Que 02/08/10 - Added inptus for compartment names and symbols
0075 
0076 %% Process arguments
0077 
0078 if (nargin < 2)
0079     defaultBound = 1000;
0080 else
0081     if (isempty(defaultBound))
0082         defaultBound = 1000;
0083     end
0084 end
0085 if (nargin < 3)
0086     fileType = 'SBML';
0087     if (isempty(fileType))
0088        fileType = 'SBML'; 
0089     end
0090 end
0091 
0092 
0093 
0094 % Open a dialog to select file
0095 if (nargin < 1)
0096     [fileNameFull,filePath] = uigetfile({'*.xml';'*.sto'});
0097     if (fileNameFull)
0098         [t1,t2,t3,t4,tokens] = regexp(fileNameFull,'(.*)\.(\w*)');
0099         noPathName = tokens{1}{1};
0100         fileTypeTmp = tokens{1}{2};
0101         fileName = [filePath noPathName];
0102     else
0103         model = 0;
0104         return;
0105     end
0106     switch fileTypeTmp
0107         case 'xml'
0108             fileType = 'SBML';
0109         case 'sto'
0110             fileType = 'SimPheny';
0111         otherwise
0112             error('Cannot process this file type');
0113     end
0114 end
0115 
0116 if (nargin < 4)
0117     if (exist('filePath'))
0118         modelDescription = noPathName;
0119     else
0120         modelDescription = fileName;
0121     end
0122 end
0123 
0124 if (nargin < 5)
0125     compSymbolList = {};
0126     compNameList = {};
0127 end
0128 
0129 switch fileType
0130     case 'SBML',
0131         if isempty(regexp(fileName,'\.xml$', 'once'))
0132             model = readSBMLCbModel([fileName '.xml'],defaultBound,compSymbolList,compNameList);
0133         else
0134             model = readSBMLCbModel(fileName,defaultBound,compSymbolList,compNameList);
0135         end
0136     case 'SimPheny',
0137         model = readSimPhenyCbModel(fileName,defaultBound,compSymbolList,compNameList);
0138     case 'SimPhenyPlus',
0139         model = readSimPhenyCbModel(fileName,defaultBound,compSymbolList,compNameList);
0140         model = readSimPhenyGprCmpd(fileName,model);
0141     case 'SimPhenyText',
0142         model = readSimPhenyCbModel(fileName,defaultBound,compSymbolList,compNameList);
0143         model = readSimPhenyGprText([fileName '_gpra.txt'],model);
0144     otherwise,
0145         error('Unknown file type');
0146 end
0147 
0148 % Check reversibility
0149 model = checkReversibility(model);
0150 
0151 % Check uniqueness of metabolite and reaction names
0152 checkCobraModelUnique(model);
0153 
0154 model.b = zeros(length(model.mets),1);
0155 
0156 model.description = modelDescription;
0157 
0158 % End main function
0159 
0160 %% Make sure reversibilities are correctly indicated in the model
0161 function model = checkReversibility(model)
0162 
0163 selRev = (model.lb < 0 & model.ub > 0);
0164 model.rev(selRev) = 1;
0165 
0166 %% readSBMLCbModel Read SBML format constraint-based model
0167 function model =  readSBMLCbModel(fileName,defaultBound,compSymbolList,compNameList)
0168 
0169 if ~(exist(fileName,'file'))
0170     error(['Input file ' fileName ' not found']);
0171 end
0172 
0173 if isempty(compSymbolList)
0174     compSymbolList = {'c','m','v','x','e','t','g','r','n','p'};
0175     compNameList = {'Cytosol','Mitochondria','Vacuole','Peroxisome','Extra-organism','Pool','Golgi Apparatus','Endoplasmic Reticulum','Nucleus','Periplasm'};
0176 end
0177 
0178 % Read SBML
0179 modelSBML = TranslateSBML(fileName);
0180 
0181 % Convert
0182 model = convertSBMLToCobra(modelSBML,defaultBound,compSymbolList,compNameList);
0183 
0184 %%
0185 function model = readSimPhenyCbModel(baseName,defaultBound,compSymbolList,compNameList)
0186 %readSimPhenyCbModel Read a SimPheny metabolic model
0187 %
0188 % model = readSimPhenyCbModel(baseName,defaultBound)
0189 %
0190 % baseName      Base filename for models
0191 % vMax          Maximum flux through a reaction
0192 %
0193 % model.mets    Metabolite names
0194 % model.rxns    Reaction names
0195 % model.rev     Reversible (1)/Irreversible (0)
0196 % model.lb      Lower bound
0197 % model.ub      Upper bound
0198 % model.c       Objective coefficients
0199 % model.S       Stoichiometric matrix
0200 %
0201 % Markus Herrgard 8/3/04
0202 
0203 if (nargin < 2)
0204     defaultBound = 1000;
0205 end
0206 
0207 if ~(exist([baseName '.met'],'file') & exist([baseName '.rxn'],'file') & exist([baseName '.sto'],'file'))
0208     error('One or more input files not found');
0209 end
0210 
0211 if isempty(compSymbolList)
0212     compSymbolList = {'c','m','v','x','e','t','g','r','n','p'};
0213     compNameList = {'Cytosol','Mitochondria','Vacuole','Peroxisome','Extra-organism','Pool','Golgi Apparatus','Endoplasmic Reticulum','Nucleus','Periplasm'};
0214 end
0215 
0216 % Get the metabolite names
0217 fid = fopen([baseName '.met']);
0218 cnt = 0;
0219 while 1
0220     tline = fgetl(fid);
0221     if ~ischar(tline), break, end
0222     if (~isempty(regexp(tline,'^\d', 'once')))
0223         cnt = cnt + 1;
0224         fields = splitString(tline,'\t');
0225         mets{cnt} = fields{2};
0226 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0227 %        mets{cnt} = strrep(mets{cnt}, '(', '[');
0228 %        mets{cnt} = strrep(mets{cnt}, ')', ']');
0229         
0230         comp{cnt,1} = fields{4};
0231 %        compSymb = compSymbolList{strcmp(compNameList,comp{cnt})};
0232         compSymb = 0;
0233         TF = strcmp('comp{cnt}', compNameList);
0234             for n = 1:length(compNameList)
0235                 if TF(n)
0236                     compSymb = compSymbolList{n};
0237                 end
0238             end
0239         if (isempty(compSymb))
0240             compSymb = comp{cnt};
0241         end
0242         if (~isempty(regexp(mets{cnt},'\(', 'once')))
0243             mets{cnt} = strrep(mets{cnt},'(','[');
0244             mets{cnt} = strrep(mets{cnt},')',']');
0245         else
0246             mets{cnt} = [mets{cnt} '[' compSymb ']'];
0247         end
0248         metNames{cnt} = fields{3};
0249     end
0250 end
0251 fclose(fid);
0252 
0253 mets = columnVector(mets);
0254 metNames = columnVector(metNames);
0255 
0256 % Get the reaction names, lower/upper bounds, and reversibility
0257 fid = fopen([baseName '.rxn']);
0258 cnt = 0;
0259 startRxns = false;
0260 while 1
0261     tline = fgetl(fid);
0262     if ~ischar(tline), break, end
0263     if (regexp(tline,'^REACTION'))
0264         startRxns = true;
0265     end
0266     if (startRxns & ~isempty(regexp(tline,'^\d', 'once')))
0267         cnt = cnt + 1;
0268         fields = splitString(tline,'\t');
0269         rxns{cnt} = fields{2};
0270         rxnNames{cnt} = fields{3};
0271         revStr{cnt} = fields{4};
0272         lb(cnt) = str2num(fields{5});
0273         ub(cnt) = str2num(fields{6});
0274         c(cnt) = str2num(fields{7});
0275     end
0276 end
0277 fclose(fid);
0278 
0279 revStr = columnVector(revStr);
0280 rev = strcmp(revStr,'Reversible');
0281 rxns = columnVector(rxns);
0282 rxnNames = columnVector(rxnNames);
0283 lb = columnVector(lb);
0284 ub = columnVector(ub);
0285 c = columnVector(c);
0286 lb(lb < -defaultBound) = -defaultBound;
0287 ub(ub > defaultBound) = defaultBound;
0288 
0289 % Get the stoichiometric matrix
0290 fid = fopen([baseName '.sto']);
0291 fid2 = fopen('load_simpheny.tmp','w');
0292 while 1
0293     tline = fgetl(fid);
0294     if ~ischar(tline), break, end
0295     % This here might give some problems, but it worked for the iJR904
0296     % model
0297     if (~isempty(regexp(tline,'^[-0123456789.]', 'once')))
0298         fprintf(fid2,[tline '\n']);
0299     else
0300         % For debugging
0301         %tline
0302     end
0303 end
0304 fclose(fid);
0305 fclose(fid2);
0306 S = load('load_simpheny.tmp');
0307 
0308 % tmp = regexp(rxns,'deleted');
0309 % sel_rxn = ones(length(rxns),1);
0310 % for i = 1:length(tmp)
0311 %     if (~isempty(tmp{i}))
0312 %         sel_rxn(i) = 0;
0313 %     end
0314 % end
0315 %
0316 % tmp = regexp(mets,'deleted');
0317 % sel_met = ones(length(mets),1);
0318 % for i = 1:length(tmp)
0319 %     if (~isempty(tmp{i}))
0320 %         sel_met(i) = 0;
0321 %     end
0322 % end
0323 
0324 % Store the variables in a structure
0325 model.mets = mets;
0326 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
0327 model.metComps = comp;
0328 
0329 model.metNames = metNames;
0330 model.rxns = removeDeletedTags(rxns);
0331 model.rxnNames = rxnNames;
0332 model.rev = rev;
0333 model.lb = lb;
0334 model.ub = ub;
0335 model.c = c;
0336 model.S = sparse(S);
0337 
0338 % Delete the temporary file
0339 delete('load_simpheny.tmp');
0340 
0341 %%
0342 function list = removeDeletedTags(list)
0343 %removeDeletedTags Get rid of the [deleted tags in the SimPheny files
0344 %
0345 % list = removeDeletedTags(list)
0346 %
0347 % 5/19/05 Markus Herrgard
0348 
0349 for i = 1:length(list)
0350     item = list{i};
0351     ind = strfind(item,' [deleted');
0352     if (~isempty(ind))
0353         list{i} = item(1:(ind-1));        
0354     end
0355 end
0356 
0357 %% readSimPhenyGprCmpd Read SimPheny GPRA and compound data and integrate it
0358 % with the model
0359 function model = readSimPhenyGprCmpd(baseName,model)
0360 
0361 [rxnInfo,rxns,allGenes] = readSimPhenyGPR([baseName '_gpr.txt']);
0362 
0363 nRxns = length(model.rxns);
0364 
0365 % Construct gene to rxn mapping
0366 rxnGeneMat = sparse(nRxns,length(allGenes));
0367 h = waitbar(0,'Constructing GPR mapping ...');
0368 for i = 1:nRxns
0369     rxnID = find(ismember(rxns,model.rxns{i}));
0370     if (~isempty(rxnID))
0371         if mod(i,10) == 0
0372             waitbar(i/nRxns,h);
0373         end
0374         [tmp,geneInd] = ismember(rxnInfo(rxnID).genes,allGenes);
0375         rxnGeneMat(i,geneInd) = 1;
0376         rules{i} = rxnInfo(rxnID).rule;
0377         grRules{i} = rxnInfo(rxnID).gra;
0378         grRules{i} = regexprep(grRules{i},'\s{2,}',' ');
0379         grRules{i} = regexprep(grRules{i},'( ','(');
0380         grRules{i} = regexprep(grRules{i},' )',')');
0381         subSystems{i} = rxnInfo(rxnID).subSystem;
0382         for j = 1:length(geneInd)
0383             %rules{i} = strrep(rules{i},['x(' num2str(j) ')'],['x(' num2str(geneInd(j)) ')']);
0384             rules{i} = strrep(rules{i},['x(' num2str(j) ')'],['x(' num2str(geneInd(j)) '_TMP_)']);
0385         end
0386         rules{i} = strrep(rules{i},'_TMP_','');
0387     else
0388         rules{i} = '';
0389         grRules{i} = '';
0390         subSystems{i} = '';
0391     end
0392 end
0393 if ( regexp( version, 'R20') )
0394         close(h);
0395 end
0396 
0397 %% Read SimPheny cmpd output file
0398 [metInfo,mets] = readSimPhenyCMPD([baseName '_cmpd.txt']);
0399 
0400 baseMets = parseMetNames(model.mets);
0401 nMets = length(model.mets);
0402 h = waitbar(0,'Constructing metabolite lists ...');
0403 for i = 1:nMets
0404     if mod(i,10) == 0
0405         waitbar(i/nMets,h);
0406     end
0407     metID = find(ismember(mets,baseMets{i}));
0408     if (~isempty(metID))
0409        metFormulas{i} = metInfo(metID).formula; 
0410     else
0411        metFormulas{i} = '';
0412     end
0413 end
0414 if ( regexp( version, 'R20') )
0415         close(h);
0416 end
0417 
0418 model.rxnGeneMat = rxnGeneMat;
0419 model.rules = columnVector(rules);
0420 model.grRules = columnVector(grRules);
0421 model.genes = columnVector(allGenes);
0422 model.metFormulas = columnVector(metFormulas);
0423 model.subSystems = columnVector(subSystems);
0424 
0425

Generated on Thu 21-Jun-2012 15:39:23 by m2html © 2003