sjtul matlab8 v02: Bioinformatics, Lesson 8
Life Science Necessities: Flexibility and Data Gathering – Breadth and
  • Easily load common file
  • Excel, CSV and other
  • Image (jpeg, tiff, gif, bmp, png,
  • Access to many specialized
  • Sequence data (fasta, embl, genbank, etc.)
  • Microarray (Affymetrix, GenePix, GEO,
  • BLAST Reports, Mass Spec, Phylogenetic Trees,
  • Complete integration to SQL and ODBC
  • Direct Access to External
  • Video Cameras, Medical Equipment,

Example: Seamless Database
  • Visual Query
  • Access data without knowing
  • Scroll through tables and
  • Customize your
  • Built-in visualization
  • Plotting and
  • Creating HTML
  • Handling date
  • Reuse SQL statements in your own

Problems with insufficiently automated computational
  • Lack of
  • Inadequate metrics for quantification,
  • Slow,
  • Human error, transcription
  • Limited scientific

  • Perform a spectrum of analyses including nonlinear mixed-effects, sequence, microarray, phylogenetic tree, mass spectrometry, and gene ontology
  • Import data from multiple sources, such as databases, file formats, or
  • Share results with automatically generated HTML reports, data visualizations, or stand-alone tools
  • Parallelize data analysis to decrease computation
  • Automate analyses to implement batch processing of contiguous

Explore Products for Computational
Data Acquisition

The advantage of automated computational
  • Obtain objective
  • Reduce costs and
  • Decrease processing and analysis
  • Alleviate human errors and transcription

Consider this image from National Cancer
Goal: To quantify the amount of
Initial method: Post-doc sits behind microscope and counts the number of metastatic spots
  • not too time consuming for one image

Not a very convincing
Goal: To quantify the amount of tissue metastasis for
Initial method: Post-doc sits behind microscope and counts the number of metastatic spots

How automated computing
  • Obtain objective
  • Reduce costs and
  • Decrease processing and analysis
  • Alleviate human errors and transcription

Tectorial Membrane
Goal: Determine elasticity of Tectorial Membrane
Atomic Force Microscope
Initial method to
  • Analysis of 1 AFM file took 30-40
  • A realistic goal was to analyze 10 files in one
With automated computing, the obtainable amount of data increased
  • Analysis of 1 AFM now took 3-4
  • Now we could analyze 100s of files in a portion of a

Analysis of Fluorescein
Goal: Determine mean circulation time (MCT) and retinal blood flow
[Figure: intensity-vs-time curves with fitted curve]
Fit intensity-vs-time to lognormal parameterized by Io, Ip, tp, b (shape
MCT = tm,vein -
[The rest of the MCT and RBF formulas did not survive extraction.]

Analysis of Fluorescein
  • Manually track vessels, collecting time-intensity data (40 minutes in a dark room!)
  • Manually identify arteries,
  • Transfer intensity information to statistics package to calculate fit parameters
  • Determine
  • Manually measure vessel
  • Calculate
  • Log results in lab
Perfect application for neural networks
  • Automated the analysis with MATLAB and
  • Code currently used in labs
Let's take a

Goal: Determine mean circulation time (MCT) and retinal blood flow
Previous
  • Time
  • Very
Automated computing allowed us
  • Obtain objective
  • Decrease processing and analysis
  • Reduce costs and

Typical
  • Access
  • Analyze
  • Share

SimBiology, Systems

SimBiology® provides an app and programmatic tools to model, simulate, and analyze dynamic systems, focusing on pharmacokinetic/pharmacodynamic (PK/PD) and systems biology applications. It provides a block diagram editor for building models, or you can create models programmatically using the MATLAB® language. SimBiology includes a library of common PK models, which you can customize and integrate with mechanistic systems biology models. A variety of model exploration techniques let you identify optimal dosing schedules and putative drug targets in cellular pathways. SimBiology uses ordinary differential equations (ODEs) and stochastic solvers to simulate the time course profile of drug exposure, drug efficacy, and enzyme and metabolite levels. You can investigate system dynamics and guide experimentation using parameter sweeps and sensitivity analysis. You can also use single subject or population data to estimate model parameters.

SimBiology
  • User interface to facilitate building, simulating, and analyzing dynamic
  • Import, build, and export mechanistic or PKPD representation of system
  • Simulate responses to biological variability or different dosing conditions, scan parameter ranges, calculate sensitivities
  • Least-squares estimation of grouped or pooled data, and maximum likelihood estimation of population parameters
  • Deploy SimBiology models for standalone

Questions to
  • What is the value of modeling

Why Create Quantitative Biochemical Reaction
  • Biochemical pathways start out simple and quickly grow in
  • Testing pathways via experiment is expensive in both time and money. Quantitative modeling narrows the range of
  • Once created and validated with experiments, the quantitative model can be used as an in-silico sandbox to test new ideas dramatically faster than through experimentation.

Challenges within computing biochemical
  • Integrating knowledge from experimental data, intuition, literature, and other models is difficult
  • Modelers and scientists have difficulty communicating knowledge and sharing work
  • The mathematics for solving these models is evolving faster than the tools
  • Many different tools are needed to complete entire workflow

  • Model created by
  • Enter in chemical
  • Estimate parameters using experimental data
  • Isolate relevant parameters using sensitivity analysis
>>

Introduction to
  • Provides one environment for both graphical and programmatic
  • Provides one tool for modeling, simulating, and analyzing pathways
  • Used by modelers or programmers to gain insight into their pathway and to communicate their pathway with

Key
  • Building a
  • Tabular
  • Via MATLAB
  • Import SBML
  • Running a
  • Analyzing a
  • Sensitivity

Let's build a simple
A simple gene regulation model with translation, and negative feedback to suppress
  • Transcription: the process through which a DNA sequence is enzymatically copied by an RNA polymerase to produce a complementary RNA; the transfer of genetic information from DNA into RNA.
  • Translation: the second part of protein biosynthesis, in which an mRNA sequence is converted to a chain of amino acids to form a protein.
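As a rough illustration of the gene regulation model described above, the following sketch simulates transcription, translation, and negative feedback as two ODEs in plain MATLAB rather than in SimBiology. All rate constants and the form of the repression term are assumptions chosen for illustration; none of them come from the slides.

```matlab
% Sketch: transcription -> translation with negative feedback.
% State y = [mRNA; protein]. The protein represses its own transcription.
% All parameter values below are hypothetical.
kTx = 2.0;   % maximal transcription rate (assumed)
kTl = 1.0;   % translation rate per mRNA (assumed)
dM  = 0.5;   % mRNA degradation rate (assumed)
dP  = 0.2;   % protein degradation rate (assumed)
K   = 1.0;   % repression threshold (assumed)

% Right-hand side: repressed transcription minus mRNA decay,
% and translation minus protein decay.
rhs = @(t, y) [kTx/(1 + y(2)/K) - dM*y(1); kTl*y(1) - dP*y(2)];

% Start with no mRNA and no protein, simulate 60 time units.
[t, y] = ode45(rhs, [0 60], [0; 0]);

plot(t, y(:,1), t, y(:,2));
xlabel('Time');
ylabel('Amount');
legend('mRNA', 'Protein');
```

With these assumed values the feedback settles both species to a steady level instead of letting the protein grow unchecked; changing K or kTx shows how strongly the feedback suppresses expression.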

>>
Pharmacokinetics. The study of what the body does to a drug after administration.
  • The study of Absorption, Distribution, Metabolism and Excretion (ADME) of drugs in the body
Pharmacodynamics. The study of what the drug does to the body. For an antibiotic, this refers to the drug reaching the corresponding concentration at the site of infection.
  • The study of the biochemical and physiological effects of drugs
  • mechanisms of drug action
  • relationship between drug concentration and effect
PROBLEM: The effect of a drug is calculated from the amount in the biophase, which, unfortunately, cannot be directly measured. PK knowledge is needed to model transfer of drug from blood to effect site

Challenges in PK/PD
  • Many tools
  • NONMEM, Basic, Fortran, C: Building and maintaining models can be difficult.
  • Organ Specific or niche
  • Simulation tools are too complex and/or black box
  • Organ models not editable, methods are not
  • Flexibility is
  • Workflow is manual, not
  • Modelling, simulation, statistics, and visualization all require different tools
  • Manual integration is time

PK Example Transdermal Input
  • Nicotine patch is applied to the skin for 16
  • Overlapping zero-order input
  • Drug concentration monitored for 24
  • Single compartment

Rapid decrease in concentration when infusion rates drop
Total dose – Dose slow
dC/dt = (Ffast + Fslow –

PK Example…
[Plot residue: axis ticks only]
  • Fast infusion runs for time
  • Slow infusion runs for time
  • Initial nicotine concentration = 2
  • V = 140
  • Further parameter values whose labels were lost: 78, 6, 17
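The single-compartment model with two overlapping zero-order inputs can be sketched in plain MATLAB as follows. The slide equation is truncated after "Fslow –", so the elimination term CL*C used here is an assumption, as are the clearance, the input rates, and the two infusion durations. Only V = 140, the initial concentration of 2, and the 24-unit monitoring window are taken from the slides, and their units are not given.

```matlab
% Sketch: one-compartment PK model with overlapping zero-order inputs.
% Assumed form: dC/dt = (Ffast + Fslow - CL*C) / V
V  = 140;     % volume of distribution (from the slide; units not given)
C0 = 2;       % initial nicotine concentration (from the slide)
CL = 78;      % clearance (ASSUMED; the slide value 78 has no surviving label)
Ffast = 1500; % fast input rate (hypothetical)
Fslow = 800;  % slow input rate (hypothetical)
Tfast = 6;    % fast input duration (hypothetical)
Tslow = 16;   % slow input duration (hypothetical)

% Total input rate: each zero-order input is on until its own end time.
infusion = @(t) Ffast*(t < Tfast) + Fslow*(t < Tslow);
rhs = @(t, C) (infusion(t) - CL*C) / V;

% Limit the step size so the solver does not step over the switch-off times.
opts = odeset('MaxStep', 0.05);
[t, C] = ode45(rhs, [0 24], C0, opts);

plot(t, C);
xlabel('Time');
ylabel('Concentration');
```

Under these assumptions the concentration rises while both inputs are active, drops quickly when the fast input ends, and decays toward zero after the slow input stops, which matches the "rapid decrease in concentration when infusion rates drop" noted on the slide.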

Generic PBPK model of
From Poulin and Thiel; J Pharmaceutical Sciences. 91:5, May

PK Example – Let's show how we might implement this in
>>

Read, analyze, and visualize genomic and proteomic
Bioinformatics Toolbox® provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox functions, you can read genomic and proteomic data from standard file formats such as SAM, FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression Omnibus and GenBank®. You can explore and visualize this data with sequence browsers, spatial heatmaps, and clustergrams. The toolbox also provides statistical techniques for detecting peaks, imputing values for missing data, and selecting features.

Bioinformatics Toolbox -- Key
  • Next Generation Sequencing analysis and
  • Sequence analysis and visualization, including pairwise and multiple sequence alignment and peak detection
  • Microarray data analysis, including reading, filtering, normalizing, and
  • Mass spectrometry analysis including classification, and marker
  • Phylogenetic tree
  • Graph theory functions, including interaction maps, hierarchy plots, and pathways
  • Data import from genomic, proteomic, and gene expression files, including SAM, FASTA, CEL, and CDF, and from databases such as NCBI and GenBank

The microarray data for this example is from DeRisi, J.L., Iyer, V.R., and Brown, P.O. (Oct 24, 1997). Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 278(5338), 680–686. PMID: 9381177. The authors used DNA microarrays to study temporal gene expression of almost all genes in Saccharomyces cerevisiae during the metabolic shift from fermentation to respiration. Expression levels were measured at seven time points during the diauxic shift. The full data set can be downloaded from the Gene Expression Omnibus Web site at:

1. Load data into the MATLAB environment.
    load yeastdata.mat
2. Get the size of the data by
3. Access the entries using cell array
    % This displays the 15th row of the variable yeastvalues, which contains
    % expression levels for the open reading frame (ORF) YAL054C.
4. Use the function web to access information about this ORF in the Saccharomyces Genome Database (SGD).
    url =
5. A simple plot can be used to show the expression profile for this ORF (open reading frame).
    xlabel('Time (Hours)');
6. Plot the actual values.
    plot(times, 2.^yeastvalues(15,:))
    xlabel('Time (Hours)');
    ylabel('Relative Expression Level');
   The MATLAB software plots the figure. The gene associated with this ORF appears to be strongly up-regulated during the diauxic shift.
7. Compare other genes by plotting multiple lines on the same figure.
    hold on
    xlabel('Time (Hours)');
    ylabel('Relative Expression Level');
    title('Profile Expression Levels');
   The MATLAB software plots the

This procedure illustrates how to filter the data by removing genes that are not expressed or do not change. The data set is quite large and a lot of the information corresponds to genes that do not show any interesting changes during the experiment. To make it easier to find the interesting genes, reduce the size of the data set by removing genes with expression profiles that do not show anything of interest. There are 6400 expression profiles. You can use a number of techniques to reduce the number of expression profiles to some subset that contains the most significant genes.

1. Remove the spots marked 'EMPTY' from the data.
    emptySpots = strcmp('EMPTY', genes);
    yeastvalues(emptySpots,:) = [];
    genes(emptySpots) = [];
2. Use the isnan function to identify the genes with missing data and then use indexing commands to remove the genes.
    nanIndices = any(isnan(yeastvalues), 2);
    yeastvalues(nanIndices,:) = [];
    genes(nanIndices) = [];
3. Use the function genevarfilter to filter out genes with small variance over time. The function returns a logical array of the same size as the variable genes, with ones corresponding to rows of yeastvalues with variance greater than the 10th percentile and zeros corresponding to those below the threshold.
    mask = genevarfilter(yeastvalues);
    % Use the mask as an index into the values to remove the filtered genes.
    yeastvalues = yeastvalues(mask,:);
    genes = genes(mask);
4. The function genelowvalfilter removes genes that have very low absolute expression values. Note that the gene filter functions can also automatically calculate the filtered data and names.
    [mask, yeastvalues, genes] =
5. Use the function geneentropyfilter to remove genes whose profiles have low entropy:
    [mask, yeastvalues, genes] =

H = −ln(1/30) =
uniform

Now that you have a manageable list of genes, you can look for relationships between the profiles using some different clustering techniques from the Statistics and Machine Learning Toolbox®.

1. For hierarchical clustering, the function pdist calculates the pairwise distances between profiles, and the function linkage creates the hierarchical cluster tree.
    corrDist = pdist(yeastvalues, 'corr');
    clusterTree = linkage(corrDist, 'average');
2. The function cluster calculates the clusters based on either a cutoff distance or a maximum number of clusters. In this case the 'maxclust' option is used to identify 16 distinct clusters.
    clusters = cluster(clusterTree, 'maxclust', 16);
3. The profiles of the genes in these clusters can be plotted together using a simple loop and the function subplot.
    for c = 1:16
        subplot(4, 4, c);
        plot(times, yeastvalues((clusters == c),:)');
        axis tight
    end
    suptitle('Hierarchical Clustering of Profiles');
4. The Statistics and Machine Learning Toolbox software also has a K-means clustering function. Again, 16 clusters are found, but because the algorithm is different these are not necessarily the same clusters as those found by hierarchical clustering.
    for c =

The MATLAB software displays (the first iteration counts and the first total were lost in extraction):
    iterations, total sum of distances =
    iterations, total sum of distances = 8.62674
    26 iterations, total sum of distances = 8.86066
    22 iterations, total sum of distances = 9.77676
    26 iterations, total sum of distances = 9.01035
5. Instead of plotting all of the profiles, you can plot just the
    for c = 1:16
        axis tight
        axis off    % turn off the axis
    end
    suptitle('K-Means Clustering of Profiles');
6. You can use the function clustergram to create a heat map and dendrogram from the output of the hierarchical clustering.

Principal-component analysis (PCA) is a useful technique you can use to reduce the dimensionality of large data sets, such as those from microarray analysis. You can also use PCA to find signals in noisy data.

1. Use the pca function in the Statistics and Machine Learning Toolbox software to calculate the principal components of a data set.
    [pc, zscores, pcvars] = pca(yeastvalues)

The MATLAB software displays the matrix pc (the printed values were not preserved).
2. You can use the function cumsum to see the cumulative sum of the variances.
    cumsum(pcvars./sum(pcvars) * 100)
   This shows that almost 90% of the variance is accounted for by the first two principal components.
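To make the cumulative-variance step concrete without the yeast data or the Statistics and Machine Learning Toolbox, the sketch below computes principal components with a singular value decomposition in base MATLAB. This swaps the toolbox pca function for svd on centered data, which yields the same coefficients, scores, and variances up to sign. The matrix X is synthetic stand-in data, not the yeast expression values.

```matlab
% Sketch: PCA via SVD on synthetic data (stand-in for yeastvalues).
n = 50;                                % number of synthetic "genes"
t = linspace(0, 1, 7);                 % seven time points, as in the example
a = linspace(-1, 1, n)';               % loading on a sinusoidal pattern
b = cos(3*(1:n))';                     % loading on a linear trend
X = a*sin(2*pi*t) + 0.5*b*t + 0.01*sin((1:n)'*(1:7));

% Center each column (time point) before the decomposition.
Xc = X - repmat(mean(X, 1), n, 1);

[U, S, W] = svd(Xc, 'econ');
pc      = W;                           % principal component coefficients
zscores = U*S;                         % scores of each row on each component
pcvars  = diag(S).^2 / (n - 1);        % variance explained by each component

% Cumulative percentage of variance explained, as in the cumsum step above.
explained = cumsum(pcvars ./ sum(pcvars) * 100);
disp(explained');
```

Because the synthetic data is built from two patterns plus a small perturbation, the first two entries of explained already account for nearly all of the variance; with real expression data the curve rises more gradually.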

3. A scatter plot of the scores of the first two principal components shows that there are two distinct regions. This is not unexpected, because the filtering process removed many of the genes with low variance or low information. These genes would have appeared in the middle of the scatter plot.
    scatter(zscores(:,1), zscores(:,2));
    xlabel('First Principal Component');
    ylabel('Second Principal Component');
    title('Principal Component Scatter Plot');
4. The gname function from the Statistics and Machine Learning Toolbox software can be used to identify genes on a scatter plot. You can select as many points as you like on the scatter plot.
5. An alternative way to create a scatter plot is with the gscatter function from the Statistics and Machine Learning Toolbox software. gscatter creates a grouped scatter plot where points from each group have a different color or marker. You can use clusterdata, or any other clustering function, to group the points.
    pcclusters = clusterdata(zscores(:,1:2), 6);
    gscatter(zscores(:,1), zscores(:,2), pcclusters);
    xlabel('First Principal Component');
    ylabel('Second Principal Component');
    title('Principal Component Scatter Plot with Colored Clusters');
    gname(genes)    % Press enter when you finish selecting genes.

Supported Data
  • BLAST

  • Gene Expression
  • Other Data

Design of Primers for Automated DNA
  • Calculate properties of
  • Filter primers based on GC content or
  • Check for dimerization and hairpin
  • Retrieve primer
  • Find restriction enzyme that cut inside
  • Isolate primers lacking a GC
[Primer table fragment: Pos 50 cacatagcccttgccataag 11375054.37]

Applied Biosystems Develops a Crucial DNA Sequencing Algorithm in MATLAB®
The
  • To develop a robust yet flexible calibration algorithm to be included in a high-throughput DNA analysis instrument
The
  • Use MATLAB to test ideas and code a prototype, and then use the MATLAB
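For the primer-design steps listed above, the GC-content filter can be sketched in base MATLAB. The sequence is the one primer that survives in the table fragment. The acceptance window of 45 to 55 percent is a hypothetical threshold, and reading "lacking a GC" as a GC clamp at the 3' end is an assumption, since the slide text is cut off.

```matlab
% Sketch: GC content and GC-clamp check for a single primer.
primer = 'cacatagcccttgccataag';       % primer shown in the slide fragment

% Logical mask of positions that are G or C (case-insensitive).
bases = upper(primer);
isGC  = (bases == 'G') | (bases == 'C');
gcPercent = 100 * sum(isGC) / numel(primer);

% Hypothetical acceptance window for GC content.
gcMin = 45;
gcMax = 55;
keep = (gcPercent >= gcMin) && (gcPercent <= gcMax);

% Assumed meaning of "GC clamp": the last (3') base is G or C.
hasGCClamp = any(bases(end) == 'GC');

fprintf('GC content: %.1f%%, keep: %d, GC clamp: %d\n', ...
        gcPercent, keep, hasGCClamp);
```

The computed GC content for this primer is 50 percent, which agrees with a value of 50 appearing in the table fragment. Applying the same check across a list of candidate primers gives the filtering step described in the bullets.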
