runHiLoExp

PURPOSE ^

runHiLoExp

SYNOPSIS ^

function [experiment] = runHiLoExp(experiment)

DESCRIPTION ^

 runHiLoExp
   takes an experiment with the following structure 
       and splits the sample space at the median of a target flux
       solves the two spaces with a given sugar and compares the
       resulting mdvs to provide a z-score.

   experiment -      
   model - S     = the stoichiometric matrix
         - rxns  = array of reaction names, corresponding the S
         - c     = optimization target 1, or -1
         - ub,lb = upper and lower bounds of reactions
   points = a #fluxes X #samples (~2000) array of the solution space
         if missing or empty, will generate a sample
   glcs = an array of sugars in isotopomer format, each column a separate sugar.
           Should not be in MDV format.  Conversion is done automatically.
         will default to generate 1 random sugar if set to []
   targets = an array of cells with string for the reaction to 
         split on the solution space, defaults to 'PGL'
   thresholds = # targets X 1 array of thresholds
   metabolites = an optional parameter fed to calcMDVfromSamp.m
      which only calculates the MDVs for the metabolites listed in this
      array.  e.g.
      - optionally, metabolites can also be a structure of fragments
   hilo = a #targets X #samples array of 0/1's, 0 id's the sample of
      fluxes as the lo side and 1 id's the sample for the hi side.
      hilo will only be calculated/recalculated if it's missing or 
      if the targets have been replaced using the param list
   mdvs = structure of mdv results
        - (name) = name of the run = t + glc#
                  e.g. t1, t2, glc# refers to the glc in the glcs array
                 - mdv = array of mdv results
                 - zscore = array of zscores from comparison btw the two mdvs
                 - totalz
        Note that the split of mdvs are not stored,
           also since the only time mdvs should be regen'd is when glcs
           has changed, but we have no way of knowing when this happens,
           the user will have to manually empty out mdvs to have it
           regenerated.
   zscores = array of zscores from each run, targets X glcs
   rscores = array of ridge scores from each run, targets X glcs

   target is an optional string for a specific rxn to target.
       if supplied, it will override and replace the targets field in the
       experiment structure.
   threshold is an optional number to apply on the solution space fluxes
       if supplied, it will be applied to the hilo field and replace the
       hilo splits.


   outputs the experiment array.

  basically, this code will loop thru one experiment
   per sugar, per target

 Wing Choi 3/14/08

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 function [experiment] = runHiLoExp(experiment)
0002 % runHiLoExp
0003 %   takes an experiment with the following structure
0004 %       and splits the sample space at the median of a target flux
0005 %       solves the two spaces with a given sugar and compares the
0006 %       resulting mdvs to provide a z-score.
0007 %
0008 %   experiment -
0009 %   model - S     = the stoichiometric matrix
0010 %         - rxns  = array of reaction names, corresponding the S
0011 %         - c     = optimization target 1, or -1
0012 %         - ub,lb = upper and lower bounds of reactions
0013 %   points = a #fluxes X #samples (~2000) array of the solution space
0014 %         if missing or empty, will generate a sample
0015 %   glcs = an array of sugars in isotopomer format, each column a separate sugar.
0016 %           Should not be in MDV format.  Conversion is done automatically.
0017 %         will default to generate 1 random sugar if set to []
0018 %   targets = an array of cells with string for the reaction to
0019 %         split on the solution space, defaults to 'PGL'
0020 %   thresholds = # targets X 1 array of thresholds
0021 %   metabolites = an optional parameter fed to calcMDVfromSamp.m
0022 %      which only calculates the MDVs for the metabolites listed in this
0023 %      array.  e.g.
0024 %      - optionally, metabolites can also be a structure of fragments
0025 %   hilo = a #targets X #samples array of 0/1's, 0 id's the sample of
0026 %      fluxes as the lo side and 1 id's the sample for the hi side.
0027 %      hilo will only be calculated/recalculated if it's missing or
0028 %      if the targets have been replaced using the param list
0029 %   mdvs = structure of mdv results
0030 %        - (name) = name of the run = t + glc#
0031 %                  e.g. t1, t2, glc# refers to the glc in the glcs array
0032 %                 - mdv = array of mdv results
0033 %                 - zscore = array of zscores from comparison btw the two mdvs
0034 %                 - totalz
0035 %        Note that the split of mdvs are not stored,
0036 %           also since the only time mdvs should be regen'd is when glcs
0037 %           has changed, but we have no way of knowing when this happens,
0038 %           the user will have to manually empty out mdvs to have it
0039 %           regenerated.
0040 %   zscores = array of zscores from each run, targets X glcs
0041 %   rscores = array of ridge scores from each run, targets X glcs
0042 %
0043 %   target is an optional string for a specific rxn to target.
0044 %       if supplied, it will override and replace the targets field in the
0045 %       experiment structure.
0046 %   threshold is an optional number to apply on the solution space fluxes
0047 %       if supplied, it will be applied to the hilo field and replace the
0048 %       hilo splits.
0049 %
0050 %
0051 %   outputs the experiment array.
0052 %
0053 %  basically, this code will loop thru one experiment
0054 %   per sugar, per target
0055 %
0056 % Wing Choi 3/14/08
0057 
0058 if (nargin < 1)
0059     disp '[experiment] = runHiLoExp(experiment,target,threshold)';
0060     return;
0061 end
0062 
0063 runzscore = true;      %binary flag for z-scores
0064 runrscore = false;      %binary flag for r-scores
0065 runkscore = false;      %binary flag for k-scores
0066 
0067 
0068 %% if no model, exit with error.
0069 % if model exists, but no points
0070 %   then points are generated
0071 %   existing mdvs in experiment, if any, are wiped
0072 if (not(isfield(experiment,'model')))
0073     disp 'ERROR: input structure experiment lacks the requisite model in the model field';
0074     return;
0075 end
0076 
0077 model = experiment.model;
0078 points = [];
0079 
0080 if (isfield(experiment,'points'))
0081     points = experiment.points;
0082 end
0083 
0084 if (isempty(points))
0085     disp 'WARN: experiment.points is empty or missing';
0086     disp ' generating a sample and empty out mdvs';
0087     pointSample = generateRandomSample(model,2000);
0088 
0089     points = pointSample.point; 
0090     experiment.mfrac = pointSample.mf; 
0091     %points = m.points;
0092     experiment.points = points;
0093     %experiment.mfrac = mf;
0094     % recalculate the mdvs since we have new points.
0095     experiment.mdvs = [];
0096 end
0097 
0098 
0099 %% if the rxns array is inverted
0100 %    display error message indicating that rxns array is inverted
0101 dr = size(model.rxns,2);
0102 if (dr > 1),
0103     disp 'ERROR: rxns array is inverted';
0104     return;
0105 end
0106 
0107 if (isfield(experiment,'metabolites'))
0108     metabolites = experiment.metabolites;
0109 else
0110     metabolites = [];
0111 end
0112     
0113 %% if no sugar, generate a random one and warn the user.
0114 %    existing mdvs are wiped
0115 glcs = [];
0116 if (isfield(experiment,'glcs'))
0117     glcs = experiment.glcs;
0118 end
0119 if (isempty(glcs))
0120     disp 'WARN: glcs not found, will generate 1 random sugar for experiment and empty out mdvs';
0121     glcs = getRandGlc();
0122     experiment.glcs = glcs;
0123     experiment.mdvs = [];
0124 end
0125 
0126 %% set glucose mixture name description;
0127 glcsnames = {};
0128 for i = 1:size(glcs,2)
0129     glcsnames{1,i} = '';
0130     glci = glcs(:,i);
0131     fglc = find(glci);
0132     for j = 1:length(fglc)
0133         if j ~=1
0134             glcsnames{1,i} = strcat(glcsnames{1,i}, '+');
0135         end
0136         if round(100*glci(fglc(j))) < 100
0137             glcsnames{1,i} = strcat(glcsnames{1,i}, num2str(round(100*glci(fglc(j)))), '% ');
0138         end
0139         if fglc(j)-1 == 0 % not labeled
0140             glcsnames{1,i} = strcat(glcsnames{1,i}, 'C0');
0141         elseif abs(log2(fglc(j)-1) - round(log2(fglc(j)-1)))<1e-8 % if it's a perfect power
0142             glcsnames{1,i} = strcat(glcsnames{1,i}, 'C', num2str(6-round(log2(fglc(j)-1))) );
0143         elseif fglc(j)-1 == 32+16 % C12
0144             glcsnames{1,i} = strcat(glcsnames{1,i}, 'C12');
0145         elseif fglc(j)-1 == 63 % fully labeled
0146             glcsnames{1,i} = strcat(glcsnames{1,i}, 'CU');
0147         else
0148             glcsnames{1,i} = strcat(glcsnames{1,i}, '#', dec2bin(fglc(j)-1, 6));
0149         end
0150     end
0151 end
0152 experiment.glcsnames = glcsnames;
0153 
0154 
0155 %% if hilo is defined in experiment
0156 %    then inform user and error out
0157 if (not(isfield(experiment,'hilo')))
0158     disp 'ERROR: hilo field not found in input structure experiment';
0159     return;
0160 else
0161     hilo = experiment.hilo;
0162 end
0163 
0164 ntarget = size(hilo,1);
0165 nglc = size(glcs,2);
0166 
0167 
0168 %% we don't care about the thresholds field from this point on
0169 if (not(isfield(experiment,'mdvs')))
0170     experiment.mdvs = [];
0171 end
0172 mdvs = experiment.mdvs;
0173 if (isempty(mdvs))
0174     disp('no mdvs found, recalculating mdvs and emptying out zscores');
0175     mdvs = struct;
0176     for iglc = 1:nglc
0177         fprintf('MDVS on glucose %d of %d\n',iglc, nglc);
0178         glc = glcs(:,iglc);
0179         % translate sugar from isotopomer to cuomer format
0180         if abs(sum(glc)-1)>1e-6 || any(glc <0)
0181             display('invalid glc.  needs to be idv');
0182             disp(iglc)
0183             glc
0184             pause;
0185         end
0186 
0187         mdv = calcMDVfromSamp(glc,experiment.points,metabolites);
0188         
0189         name = sprintf('t%d',iglc);
0190         mdvs.(name) = mdv;
0191     end
0192     experiment.mdvs = mdvs;
0193     experiment.zscores = [];
0194 end
0195 
0196 %% regen the zscores
0197 if (not(isfield(experiment,'zscores')))
0198     experiment.zscores = [];
0199 end
0200 zscores = experiment.zscores;
0201 if (isempty(zscores) && runzscore)
0202     disp('calculating zscores');
0203     for iglc = 1:nglc     
0204         fprintf('z-scores on glucose %d of %d\n',iglc, nglc);
0205         for itgt = 1:ntarget
0206 %             target = char(targets(itgt));
0207             hl = hilo(itgt,:);
0208             name = sprintf('t%d',iglc);
0209             mdv = mdvs.(name);
0210             [hiset,loset] = splitMDVbyTarget(mdv,hl);
0211 %             if ((size(loset,1)) ~= (size(hiset,1)))
0212 %                 zscores(itgt,iglc) = -1;
0213 %                 disp('problem with the hi or lo set, cannot calculate zscore');
0214 %                 continue;
0215 %             end
0216             mdv1.names = mdv.names;
0217             mdv1.ave = mean(loset,2);
0218             mdv1.stdev = std(loset,0,2);
0219             mdv2.ave = mean(hiset,2);
0220             mdv2.stdev = std(hiset,0,2);
0221 
0222             [totalz,zscore] = compareTwoMDVs(mdv1,mdv2);
0223             zscores(itgt,iglc) = totalz;
0224         end
0225     end
0226     experiment.zscores = zscores;
0227 end
0228 
0229 
0230 %% regen the ridge score
0231 
0232 if (not(isfield(experiment,'rscores')))
0233     experiment.rscores = [];
0234 end
0235 rscores = experiment.rscores;
0236 if (isempty(rscores) && runrscore)
0237     disp('calculating ridge scores');
0238     for iglc = 1:nglc        
0239         fprintf('r-scores on glucose %d of %d\n',iglc, nglc);
0240         for itgt = 1:ntarget
0241             hl = hilo(itgt,:)';
0242             name = sprintf('t%d',iglc);
0243             mdv = mdvs.(name).mdv;
0244             rscore = score_ridge(mdv,hl);
0245             rscores(itgt,iglc) = rscore;
0246         end
0247     end
0248     experiment.rscores = rscores;
0249 end
0250 
0251 %% regen the KS score
0252 if (not(isfield(experiment,'kscores')))
0253     experiment.kscores = [];
0254 end
0255 kscores = experiment.kscores;
0256 if (isempty(kscores) && runkscore)
0257     disp('calculating KS scores');
0258     for iglc = 1:nglc        
0259         fprintf('KS-scores on glucose %d of %d\n',iglc, nglc);
0260         for itgt = 1:ntarget
0261             hl = hilo(itgt,:)';
0262             name = sprintf('t%d',iglc);
0263             mdv = mdvs.(name).mdv;
0264             kscore = score_KS(mdv,hl);
0265             kscores(itgt,iglc) = kscore;
0266         end
0267     end
0268     experiment.kscores = kscores;
0269 end
0270 
0271 return;
0272 
0273 end
0274 
0275 
0276 %% findIndexToTarget
0277 %
0278 % function [targetind] = findIndexToTarget(model,target)
0279 %
0280 % % Given a model, find the index to the target in the model.rxns
0281 %
0282 % d = size(model.c);
0283 % %find index to target flux
0284 % found = false;
0285 % for r = 1:d(1),
0286 %     if ~isempty(findstr(char(model.rxns(r)),target))
0287 %         found = true;
0288 %         break;
0289 %     end
0290 % end
0291 % if (~found)
0292 %     disp(sprintf('could not locate %s flux',target));
0293 %     targetind = -1;
0294 %     return;
0295 % end
0296 % targetind = r;
0297 % disp(sprintf('found target flux for %s at: %d',target,targetind));
0298 %
0299 % return
0300 % end
0301 
0302 %% splitMDVbyTarget
0303 %
0304 function [hiset,loset] = splitMDVbyTarget(mdv,hilo)
0305 
0306 % Given an mdv set and a hilo discriminator, split the
0307 %   mdvset into 2 sets: a lo and hi set each with numinset cols.
0308 hiset = mdv.mdv(:,hilo==1);
0309 loset = mdv.mdv(:,~hilo);
0310 return;
0311 % nmdv = size(mdv.mdv,2);
0312 %
0313 % hisetcount = 0;
0314 % losetcount = 0;
0315 % hiset = [];
0316 % loset = [];
0317 % mdva = mdv.mdv;
0318 % mdvnan = mdv.mdvnan;
0319 % cnan = size(mdvnan,2);
0320 %
0321 % % if ((2*numinset) > nmdv)
0322 % %     disp('WARN: insufficient number of points to cover split into hi and lo set');
0323 % % end
0324 %
0325 %
0326 %
0327 % for i = 1:nmdv
0328 %     if (cnan >= i)
0329 %         % do nan check only if nan array is larger than i index
0330 %         if (sum(isnan(mdvnan(:,i))) > 1)
0331 %             continue;
0332 %         end
0333 %     end
0334 %     if (hilo(1,i) == 1)
0335 %         if (hisetcount <= numinset)
0336 %             hisetcount = hisetcount + 1;
0337 %             hiset(:,hisetcount) = mdva(:,i);
0338 %         end
0339 %     else
0340 %         if (losetcount <= numinset)
0341 %             losetcount = losetcount + 1;
0342 %             loset(:,losetcount) = mdva(:,i);
0343 %         end
0344 %     end
0345 %     if ((hisetcount >= numinset) && (losetcount >= numinset))
0346 %         break;
0347 %     end
0348 % end
0349 %
0350 % % might have read thru the entire set but lots of nan for mdv's
0351 % if ((hisetcount < numinset) || (losetcount < numinset))
0352 %     disp(sprintf('WARN: hisetcount = %d, losetcount = %d',hisetcount,losetcount));
0353 % end
0354 
0355 end

Generated on Thu 21-Jun-2012 15:39:23 by m2html © 2003