xls2model

PURPOSE

xls2model Builds a COBRA Toolbox model from an Excel spreadsheet.

SYNOPSIS

function model = xls2model(fileName,biomassRxnEquation)

DESCRIPTION

 xls2model Builds a COBRA Toolbox model from an Excel spreadsheet.

 model = xls2model(fileName,biomassRxnEquation)

 INPUT
 fileName      xls spreadsheet, with one 'reactions' and one 'metabolites' tab

 'reactions' tab: Required headers:
               col 1     Abbreviation    HEX1
               col 2     Name            Hexokinase
               col 3     Reaction        1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]
               col 4     GPR             b0001
               col 5     Genes           b0001 (optional: column can be empty)
               col 6     Protein         AlaS (optional: column can be empty)
               col 7     Subsystem       Glycolysis
               col 8     Reversible      0
               col 9     Lower bound     0
               col 10    Upper bound     1000
               col 11    Objective       0    (optional: column can be empty)
               col 12    Confidence Score 0,1,2,3,4
               col 13    EC. Number      1.1.1.1
               col 14    Notes           N/A  (optional: column can be empty)
               col 15    References      PMID: 1111111  (optional: column can be empty)

 'metabolites' tab: Required headers (must be a complete list of metabolites, i.e., a metabolite that appears in multiple compartments must be listed in one row per compartment; abbreviations must match those used in the 'reactions' tab):
               col 1     Abbreviation
               col 2     Name
               col 3     Formula (neutral)
               col 4     Formula (charged)
               col 5     Charge
               col 6     Compartment
               col 7     KEGG ID
               col 8     PubChem ID
               col 9     ChEBI ID
               col 10    InChI string
               col 11    Smiles

 OPTIONAL INPUT (may be required for input on unix machines)
 biomassRxnEquation        .xls may have a 255 character limit on each cell, 
                           so pass the biomass reaction separately if it hits this maximum.        
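
The truncation check described above can be sketched as follows. This is a minimal illustration and not the function's own code path: the reaction strings and the biomass equation are made-up placeholders, and a cell of exactly 255 characters is assumed to be a truncated biomass reaction.

```matlab
% Illustrative reaction strings; the second one stands in for a cell
% that was cut off at the 255-character limit.
rxnList = {'1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]'; ...
           repmat('x', 1, 255)};

% Hypothetical full biomass equation supplied separately by the caller.
biomassRxnEquation = '0.5 atp[c] + 0.2 g6p[c] --> 0.5 adp[c]';

% Flag every reaction string that is exactly 255 characters long.
truncated = cellfun(@(s) length(s) == 255, rxnList);

% Replace the flagged entries with the separately supplied equation.
rxnList(truncated) = {biomassRxnEquation};
```

Note that this heuristic cannot tell a truncated cell from a reaction that happens to be exactly 255 characters long.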

 OUTPUT
 model         COBRA Toolbox model

 Ines Thiele   01/02/09
 Richard Que   04/27/10    Modified reading of PubChemID and ChEBIID so that if met 
                           has multiple IDs, all are passed to model. Confidence Scores,
                           PubChemIDs, and ChEBIIDs are properly passed as cell arrays. 
 Ronan Fleming 08/17/10    Support for unix
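
A typical call might look like the sketch below. The file name `myModel.xls` is a hypothetical placeholder, and the guard only checks that the function and the file can be found; it is not part of the toolbox.

```matlab
% Hypothetical file name; replace with the path to your own spreadsheet.
fileName = 'myModel.xls';

if exist('xls2model', 'file') == 2 && exist(fileName, 'file') == 2
    % Build the model from the 'reactions' and 'metabolites' tabs.
    model = xls2model(fileName);
    fprintf('Loaded %d reactions and %d metabolites.\n', ...
            numel(model.rxns), numel(model.mets));
else
    disp('xls2model or the spreadsheet was not found on the MATLAB path.');
end
```

On unix, if the biomass reaction cell is cut off at 255 characters, the full equation would be passed as the second argument: `model = xls2model(fileName, biomassRxnEquation)`.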

SOURCE CODE

function model = xls2model(fileName,biomassRxnEquation)
% xls2model Builds a COBRA Toolbox model from an Excel spreadsheet.
%
% model = xls2model(fileName,biomassRxnEquation)
%
% INPUT
% fileName      xls spreadsheet, with one 'reactions' and one 'metabolites' tab
%
% 'reactions' tab: Required headers:
%               col 1     Abbreviation    HEX1
%               col 2     Name            Hexokinase
%               col 3     Reaction        1 atp[c] + 1 glc-D[c] --> 1 adp[c] + 1 g6p[c] + 1 h[c]
%               col 4     GPR             b0001
%               col 5     Genes           b0001 (optional: column can be empty)
%               col 6     Protein         AlaS (optional: column can be empty)
%               col 7     Subsystem       Glycolysis
%               col 8     Reversible      0
%               col 9     Lower bound     0
%               col 10    Upper bound     1000
%               col 11    Objective       0    (optional: column can be empty)
%               col 12    Confidence Score 0,1,2,3,4
%               col 13    EC. Number      1.1.1.1
%               col 14    Notes           N/A  (optional: column can be empty)
%               col 15    References      PMID: 1111111  (optional: column can be empty)
%
% 'metabolites' tab: Required headers (must be a complete list of metabolites,
%               i.e., a metabolite that appears in multiple compartments must be
%               listed in one row per compartment; abbreviations must match those
%               used in the 'reactions' tab):
%               col 1     Abbreviation
%               col 2     Name
%               col 3     Formula (neutral)
%               col 4     Formula (charged)
%               col 5     Charge
%               col 6     Compartment
%               col 7     KEGG ID
%               col 8     PubChem ID
%               col 9     ChEBI ID
%               col 10    InChI string
%               col 11    Smiles
%
% OPTIONAL INPUT (may be required for input on unix machines)
% biomassRxnEquation        .xls may have a 255 character limit on each cell,
%                           so pass the biomass reaction separately if it hits this maximum.
%
% OUTPUT
% model         COBRA Toolbox model
%
% Ines Thiele   01/02/09
% Richard Que   04/27/10    Modified reading of PubChemID and ChEBIID so that if met
%                           has multiple IDs, all are passed to model. Confidence Scores,
%                           PubChemIDs, and ChEBIIDs are properly passed as cell arrays.
% Ronan Fleming 08/17/10    Support for unix
%
warning('xls2model IS NOT SUPPORTED BY THE openCOBRA CORE TEAM AND WILL BE MOVED FROM THE CORE IN THE NEAR FUTURE');
warning off

if isunix
    %assumes that one has an xls file with two tabs
    [Numbers, Strings] = xlsread(fileName,'reactions');
    [MetNumbers, MetStrings] = xlsread(fileName,'metabolites');
    %trim empty row from Numbers and MetNumbers
    Numbers = Numbers(2:end,:);
    MetNumbers = MetNumbers(2:end,:);
    
    if isempty(MetStrings)
        error('Save .xls file as Windows 95 version using gnumeric not openoffice!');
    end
    
    nRxns=length(Strings(:,1))-1;
    nMets=length(MetStrings(:,1))-1;
    
    %[Numbers, Strings] = xlsread(fileName,'reactions',['A1:O' nRxns],'basic');
    %[MetNumbers, MetStrings] = xlsread(fileName,'metabolites',['A1:K' nMets],'basic');
else
    %assumes that one has an xls file with two tabs
    [Numbers, Strings] = xlsread(fileName,'reactions');
    [MetNumbers, MetStrings] = xlsread(fileName,'metabolites');
    % assumed that first row is header row
    nRxns=length(Strings(:,1))-1;
    nMets=length(MetStrings(:,1))-1;
    %add empty line to
end

% text columns of the 'reactions' tab (row 1 is the header row)
rxnAbrList = Strings(2:end,1); 
rxnNameList = Strings(2:end,2);
rxnList = Strings(2:end,3);
grRuleList = Strings(2:end,4);
Protein = Strings(2:end,6);
subSystemList = Strings(2:end,7);

if isunix
    % a reaction string of exactly 255 characters is treated as a truncated biomass reaction
    for n=1:length(rxnList)
        if length(rxnList{n})==255
            if exist('biomassRxnEquation','var')
                rxnList{n}=biomassRxnEquation;
            else
                error('biomassRxnEquation .xls may have a 255 character limit on each cell, so pass the biomass reaction separately if it hits this maximum.')
            end
        end
    end
end

% numeric columns of the 'reactions' tab: Reversible, Lower bound, Upper bound, Objective
c = size(Numbers,2);
if c >= 1
    revFlagList = Numbers(:,1);
else
    revFlagList = [];
end
if c >= 2
    lowerBoundList = Numbers(:,2);
else
    lowerBoundList = 1000*ones(length(rxnAbrList),1);
end
if c >= 3
    upperBoundList = Numbers(:,3);
else
    upperBoundList = 1000*ones(length(rxnAbrList),1);
end
if c >= 4
    Objective = Numbers(:,4);
else
    Objective = zeros(length(rxnAbrList),1);
end
model = createModel(rxnAbrList,rxnNameList,rxnList,revFlagList,lowerBoundList,upperBoundList,subSystemList,grRuleList);
if size(Numbers,2)>=5
    ConfidenceScore = Numbers(:,5);
    model.confidenceScores = regexprep(cellstr(num2str(ConfidenceScore)),'NaN| ','');
else
    model.confidenceScores = cell(length(model.rxns),1); %empty cell instead of NaN
end
if size(Strings,2)>=13
    model.rxnECNumbers = Strings(2:end,13);
end
if size(Strings,2)>=14
    model.rxnNotes = Strings(2:end,14);
end
if size(Strings,2)>=15
    model.rxnReferences = Strings(2:end,15);
end

%fill in opt info for metabolites
[nMetNum, mMetNum] = size(MetNumbers);
if mMetNum<5
    MetNumbers(:,mMetNum+1:5) = nan(nMetNum,5-mMetNum);
end

if ~isempty(Objective) && length(Objective) == length(model.rxns)
    model.c = (Objective);
end
model.proteins = Protein;

% case 1: all metabolites in List have a compartment assignment

if ~cellfun('isempty',(strfind(MetStrings(2,1),'[')))
    for i = 2 : length(MetStrings(:,1))% assumes that first row is header
        % finds metabolites in model structure
        MetLoc =  strmatch(MetStrings{i,1},model.mets,'exact');
        if ~isempty(MetLoc)
            model.metNames{MetLoc} = MetStrings{i,2};
            model.metFormulasNeutral{MetLoc} = MetStrings{i,3};
         %   model.metFormulas{MetLoc} = char(MetStrings{i,4});
            model.metFormulas{MetLoc} = MetStrings{i,4};
            model.metCompartment{MetLoc} = MetStrings{i,6};
            model.metKEGGID{MetLoc} = MetStrings{i,7};
            if size(MetStrings,2) >= 10
                model.metInChIString{MetLoc} = MetStrings{i,10};
            end
            if size(MetStrings,2) >= 11
                model.metSmiles{MetLoc} = MetStrings{i,11};
            end
            if ~isempty(MetNumbers)
                model.metCharge(MetLoc) = MetNumbers(i-1,1);
                if (~isnan(MetNumbers(i-1,4)))
                    model.metPubChemID(MetLoc) = num2cell(MetNumbers(i-1,4));
                else
                    model.metPubChemID(MetLoc) = MetStrings(i,8);
                end
                if (~isnan(MetNumbers(i-1,5)))
                    model.metChEBIID(MetLoc) = num2cell(MetNumbers(i-1,5));
                else
                    model.metChEBIID(MetLoc) = MetStrings(i,9);
                end
            end
        else
            warning(['Metabolite ' MetStrings{i,1} ' not in model']);
        end
        MetLoc=[];
    end
else
    % case 2: all metabolites in List have no compartment assignment
    for i = 2 : length(MetStrings(:,1))% assumes that first row is header
        % finds metabolites in model structure
        % this assumes that the compartment is shown with '[ ]'
        MetLoc =  strmatch(strcat(MetStrings{i,1},'['),model.mets);
        if ~isempty(MetLoc)
            for j = 1 : length(MetLoc)
                model.metNames{MetLoc(j)} = MetStrings{i,2};
                model.metFormulasNeutral{MetLoc(j)} = MetStrings{i,3};
                model.metFormulas{MetLoc(j)} = MetStrings{i,4};
                model.metCompartment{MetLoc(j)} = MetStrings{i,6};
                % KEGG ID is column 7 of the 'metabolites' tab, as in case 1
                model.metKEGGID{MetLoc(j)} = MetStrings{i,7};
                if size(MetStrings,2) >= 10
                    model.metInChIString{MetLoc(j)} = MetStrings{i,10};
                end
                if size(MetStrings,2) >= 11
                    model.metSmiles{MetLoc(j)} = MetStrings{i,11};
                end
                if ~isempty(MetNumbers)
                    model.metCharge(MetLoc(j)) = MetNumbers(i-1,1);
                    if (~isnan(MetNumbers(i-1,4)))
                        model.metPubChemID(MetLoc(j)) = num2cell(MetNumbers(i-1,4));
                    else
                        model.metPubChemID(MetLoc(j)) = MetStrings(i,8);
                    end
                    if (~isnan(MetNumbers(i-1,5)))
                        model.metChEBIID(MetLoc(j)) = num2cell(MetNumbers(i-1,5));
                    else
                        model.metChEBIID(MetLoc(j)) = MetStrings(i,9);
                    end
                end
            end
        else
            warning(['Metabolite ' MetStrings{i,1} ' not in model']);
        end
        MetLoc=[];
    end
end

%% Verify all vectors are column Vectors
model.lb = columnVector(model.lb);
model.ub = columnVector(model.ub);
model.rev = columnVector(model.rev);
model.c = columnVector(model.c);
model.b = columnVector(model.b);
model.rxns = columnVector(model.rxns);
model.rxnNames = columnVector(model.rxnNames);
model.mets = columnVector(model.mets);
model.metNames = columnVector(model.metNames);
model.metFormulas = columnVector(model.metFormulas);
% commented out since changing this will require updating all model versions used for testing
% model.metCharges = columnVector(model.metCharge); % all others have plural for vector
if isfield(model,'metCharge')
    model.metCharge = columnVector(model.metCharge);
end
model.metFormulasNeutral = columnVector(model.metFormulasNeutral);
model.subSystems = columnVector(model.subSystems);
model.rules = columnVector(model.rules);
model.grRules = columnVector(model.grRules);
model.genes = columnVector(model.genes);
model.confidenceScores = columnVector(model.confidenceScores);
model.proteins = columnVector(model.proteins);

% these fields are set above only when the corresponding spreadsheet data is present
optFields = {'rxnECNumbers','rxnNotes','rxnReferences','metPubChemID','metChEBIID', ...
             'metCompartment','metKEGGID','metInChIString','metSmiles'};
for k = 1:length(optFields)
    if isfield(model,optFields{k})
        model.(optFields{k}) = columnVector(model.(optFields{k}));
    end
end

warning on
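
The two metabolite-matching cases in the listing (exact match when the abbreviation carries a compartment, prefix match when it does not) can be illustrated in isolation. The sketch below uses made-up metabolite names and swaps in `strcmp` and `strncmp` for `strmatch`, which MathWorks documents as not recommended; it is an equivalent illustration, not the function's own code.

```matlab
% Illustrative metabolite list, not taken from a real model.
mets = {'atp[c]'; 'atp[m]'; 'adp[c]'; 'atpase[c]'};

% Case 1: the abbreviation includes the compartment, so match it exactly.
exactLoc = find(strcmp('atp[m]', mets));

% Case 2: the abbreviation has no compartment; appending '[' before the
% prefix match keeps 'atp' from also matching 'atpase[c]'.
prefix = ['atp' '['];
prefixLoc = find(strncmp(prefix, mets, length(prefix)));
```

Here `exactLoc` picks out the single mitochondrial entry, while `prefixLoc` picks out both `atp` entries, so one spreadsheet row can annotate the metabolite in every compartment.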

Generated on Thu 21-Jun-2012 15:39:23 by m2html © 2003