




Additive Models, Trees, and Related Models
Prof. Liqing Zhang
Dept. Computer Science & Engineering, Shanghai Jiaotong University

Introduction
9.1: Generalized Additive Models
9.2: Tree-Based Methods
9.3: PRIM: Bump Hunting
9.4: MARS: Multivariate Adaptive Regression Splines
9.5: HME: Hierarchical Mixture of Experts

9.1 Generalized Additive Models
In the regression setting, a generalized additive model has the form
    E(Y | X_1, ..., X_p) = \alpha + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p).
Here the f_j's are unspecified smooth ("nonparametric") functions. Instead of using the linear basis expansions (LBE) of Chapter 5, we fit each function using a scatterplot smoother (e.g. a cubic smoothing spline).

GAM (cont.)
For two-class classification, the additive logistic regression model is
    log[ \mu(X) / (1 - \mu(X)) ] = \alpha + f_1(X_1) + \cdots + f_p(X_p),
where \mu(X) = Pr(Y = 1 | X).

GAM (cont.)
In general, the conditional mean \mu(X) of a response Y is related to an additive function of the predictors via a link function g:
    g[\mu(X)] = \alpha + f_1(X_1) + \cdots + f_p(X_p).
Examples of classical link functions:
- Identity: g(\mu) = \mu, used for linear and additive models of Gaussian data.
- Logit: g(\mu) = log[\mu / (1 - \mu)] and Probit: g(\mu) = \Phi^{-1}(\mu), for modelling binomial probabilities.
- Log: g(\mu) = log(\mu), for log-linear or log-additive models of Poisson count data.

Fitting Additive Models
The additive model has the form
    Y = \alpha + \sum_{j=1}^{p} f_j(X_j) + \varepsilon,
where the error term \varepsilon has mean zero. Given observations (x_i, y_i), a criterion like the penalized sum of squares can be specified for this problem:
    PRSS(\alpha, f_1, ..., f_p) = \sum_{i=1}^{N} \Big( y_i - \alpha - \sum_{j=1}^{p} f_j(x_{ij}) \Big)^2 + \sum_{j=1}^{p} \lambda_j \int f_j''(t_j)^2 \, dt_j,    (9.7)
where the \lambda_j \ge 0 are tuning parameters.

FAM (cont.)
Conclusions:
- The minimizer of PRSS is an additive cubic spline model: each f_j is a cubic spline in X_j. Without further restrictions, however, the solution is not unique.
- If the constraint \sum_{i=1}^{N} f_j(x_{ij}) = 0 for all j holds, it is easy to see that \hat{\alpha} = ave(y_i).
- If, in addition to this restriction, the matrix of input values has full column rank, then (9.7) is a strictly convex criterion and the minimizer is unique. If the matrix is singular, then the linear parts of the f_j cannot be uniquely determined (Buja et al. 1989).

Learning GAM: Backfitting
Algorithm 9.1  The Backfitting Algorithm for Additive Models.
1. Initialize: \hat{\alpha} = \frac{1}{N}\sum_{i=1}^{N} y_i, and \hat{f}_j \equiv 0 for all j.
2. Cycle over j = 1, 2, ..., p, 1, 2, ..., p, ... (repeated cycles):
       \hat{f}_j \leftarrow S_j\big[ \{ y_i - \hat{\alpha} - \sum_{k \ne j} \hat{f}_k(x_{ik}) \}_{1}^{N} \big],
       \hat{f}_j \leftarrow \hat{f}_j - \frac{1}{N}\sum_{i=1}^{N} \hat{f}_j(x_{ij}),
   until the functions \hat{f}_j change less than a prespecified threshold. A code sketch of this procedure follows below.
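As a concrete illustration, here is a minimal NumPy sketch of Algorithm 9.1. The cubic smoothing spline of the slides is replaced by a crude nearest-neighbour averaging smoother (`running_mean_smoother`, an assumption of this sketch) so the example stays self-contained; any scatterplot smoother S_j could be plugged in instead.

```python
import numpy as np

def running_mean_smoother(x, r, frac=0.3):
    """Crude stand-in for a scatterplot smoother: for each x_i, average the
    partial residuals r over roughly frac*N nearest neighbours in x."""
    n = len(x)
    k = max(1, int(frac * n))
    order = np.argsort(x)
    fitted = np.empty(n)
    for rank, i in enumerate(order):
        lo, hi = max(0, rank - k // 2), min(n, rank + k // 2 + 1)
        fitted[i] = r[order[lo:hi]].mean()
    return fitted

def backfit_additive_model(X, y, smoother=running_mean_smoother,
                           tol=1e-4, max_cycles=50):
    """Algorithm 9.1: backfitting for y ~ alpha + sum_j f_j(X_j)."""
    n, p = X.shape
    alpha = y.mean()                    # 1. initialise alpha_hat = ave(y_i)
    f = np.zeros((n, p))                #    and f_j = 0 for all j
    for _ in range(max_cycles):         # 2. cycle j = 1,...,p, 1,...,p, ...
        max_change = 0.0
        for j in range(p):
            partial_resid = y - alpha - f.sum(axis=1) + f[:, j]
            new_fj = smoother(X[:, j], partial_resid)
            new_fj -= new_fj.mean()     # re-centre so sum_i f_j(x_ij) = 0
            max_change = max(max_change, np.max(np.abs(new_fj - f[:, j])))
            f[:, j] = new_fj
        if max_change < tol:            # stop once the functions stop changing
            break
    return alpha, f
```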
Backfitting: Points to Ponder
- Computational advantage?
- Convergence?
- How to choose the fitting functions S_j?

Additive Logistic Regression
The generalized additive logistic model has the form
    log[ Pr(Y = 1 | X) / Pr(Y = 0 | X) ] = \alpha + f_1(X_1) + \cdots + f_p(X_p).
The functions f_j are estimated by a backfitting algorithm within a Newton-Raphson procedure.

Logistic Regression
Model the class posteriors in terms of K - 1 log-odds:
    log[ Pr(G = k | X = x) / Pr(G = K | X = x) ] = \beta_{k0} + \beta_k^T x,   k = 1, ..., K - 1.
The decision boundary between classes k and l is the set of points where their log-odds are equal, {x : (\beta_{k0} - \beta_{l0}) + (\beta_k - \beta_l)^T x = 0}, a hyperplane. Each class k has a linear discriminant function \delta_k(x) = \beta_{k0} + \beta_k^T x; classify x to the class with the largest value of its \delta_k(x).
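To make the K - 1 log-odds parameterisation concrete, the following small sketch recovers the class posteriors from the fitted linear predictors, with class K as the reference class, and classifies to the largest posterior. The coefficient layout in `B` is an assumption of this illustration.

```python
import numpy as np

def logistic_posteriors(X, B):
    """Class posteriors for the K-1 log-odds model.

    X : (n, p) inputs.
    B : (K-1, p+1) array; row k-1 holds (beta_k0, beta_k), modelling
        log Pr(G=k|x) / Pr(G=K|x) for k = 1, ..., K-1."""
    n = X.shape[0]
    X1 = np.hstack([np.ones((n, 1)), X])             # prepend intercept column
    eta = X1 @ B.T                                   # (n, K-1) log-odds vs class K
    num = np.hstack([np.exp(eta), np.ones((n, 1))])  # reference class K gets exp(0) = 1
    return num / num.sum(axis=1, keepdims=True)

# Classify each observation to the class with the largest posterior;
# columns 0..K-2 correspond to classes 1..K-1, the last column to class K.
# y_hat = logistic_posteriors(X, B).argmax(axis=1)
```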
Logistic Regression (con't)
Parameter estimation: the parameters are fitted by maximum likelihood, with objective function the conditional log-likelihood
    \ell(\theta) = \sum_{i=1}^{N} log Pr(G = g_i | X = x_i; \theta).
The score equations are solved by IRLS (iteratively reweighted least squares). In particular, for the two-class case with y_i \in \{0, 1\}, the objective function is
    \ell(\beta) = \sum_{i=1}^{N} \big[ y_i \beta^T x_i - log(1 + e^{\beta^T x_i}) \big],
and the Newton-Raphson algorithm solves the score equations by repeatedly updating
    \beta^{new} = (X^T W X)^{-1} X^T W z,   with   z = X\beta^{old} + W^{-1}(y - p)   and   W = diag\{\hat{p}_i(1 - \hat{p}_i)\},
so that each step is a weighted least-squares fit to the working response z.
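A compact sketch of the two-class Newton-Raphson / IRLS iteration described above. The design matrix is assumed to already contain an intercept column, and the weight clipping is a numerical safeguard added for this sketch rather than part of the algorithm.

```python
import numpy as np

def fit_logistic_irls(X, y, n_iter=25, tol=1e-8):
    """Two-class logistic regression by iteratively reweighted least squares.

    Each Newton step solves the weighted least-squares problem
    beta_new = (X^T W X)^{-1} X^T W z with z = X beta + W^{-1}(y - p)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))              # fitted probabilities p_i
        w = np.clip(prob * (1.0 - prob), 1e-10, None)  # weights W = diag(p_i(1 - p_i))
        z = eta + (y - prob) / w                       # working response z_i
        XtW = X.T * w                                  # X^T W (broadcast over columns)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)   # weighted least-squares step
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta
```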
Feature Selection
Find a subset of the variables that is sufficient for explaining their joint effect on the response. One way is to repeatedly drop the least significant coefficient and refit the model until no further terms can be dropped. Another strategy is to refit each model with one variable removed and perform an analysis of deviance to decide which variable to exclude.

Regularization
Maximum penalized likelihood: shrink the parameters via an L1 constraint, which imposes a margin constraint in the separable case.

Fitting Logistic Regression vs. Fitting Additive Logistic Regression
The two fitting procedures are parallel. Logistic regression: (1) start from initial estimates; (2) iterate, using weighted least squares to fit a linear model to the working responses z_i with weights w_i, giving new estimates; (3) continue step 2 until convergence. Additive logistic regression replaces the weighted linear fit in step 2 by a weighted backfitting algorithm that fits an additive model to the z_i with weights w_i, giving new estimates of \alpha and the f_j; again continue step 2 until convergence. The resulting procedure, local scoring, is spelled out below; a code sketch follows it.

Algorithm 9.2  Local Scoring Algorithm for the Additive Logistic Regression Model.
1. Compute starting values: \hat{\alpha} = log[\bar{y} / (1 - \bar{y})], where \bar{y} = ave(y_i) is the sample proportion of ones, and set \hat{f}_j \equiv 0 for all j.
2. Define \hat{\eta}_i = \hat{\alpha} + \sum_j \hat{f}_j(x_{ij}) and \hat{p}_i = 1 / [1 + exp(-\hat{\eta}_i)]. Iterate:
   (a) Construct the working target variable z_i = \hat{\eta}_i + (y_i - \hat{p}_i) / [\hat{p}_i(1 - \hat{p}_i)].
   (b) Construct the weights w_i = \hat{p}_i(1 - \hat{p}_i).
   (c) Fit an additive model to the targets z_i with weights w_i, using a weighted backfitting algorithm. This gives new estimates \hat{\alpha}, \hat{f}_j for all j.
3. Continue step 2 until the change in the functions falls below a prespecified threshold.
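A sketch of Algorithm 9.2. The weighted backfitting in step (c) is carried out here with a simple weighted nearest-neighbour smoother, a stand-in assumption for the weighted cubic smoothing splines used in the text; the fixed cycle counts are likewise choices of this sketch.

```python
import numpy as np

def weighted_backfit(X, z, w, n_cycles=10):
    """Small weighted backfitting pass used inside local scoring.
    The smoother is a weighted nearest-neighbour average (a stand-in
    for a weighted cubic smoothing spline)."""
    n, p = X.shape
    alpha = np.average(z, weights=w)
    f = np.zeros((n, p))
    k = max(1, n // 5)
    for _ in range(n_cycles):
        for j in range(p):
            r = z - alpha - f.sum(axis=1) + f[:, j]    # weighted partial residuals
            order = np.argsort(X[:, j])
            fj = np.empty(n)
            for rank, i in enumerate(order):
                lo, hi = max(0, rank - k // 2), min(n, rank + k // 2 + 1)
                idx = order[lo:hi]
                fj[i] = np.average(r[idx], weights=w[idx])
            f[:, j] = fj - np.average(fj, weights=w)   # re-centre each function
    return alpha, f

def local_scoring(X, y, n_iter=10):
    """Algorithm 9.2: local scoring for the additive logistic model."""
    n, p = X.shape
    ybar = y.mean()                                    # sample proportion of ones
    alpha = np.log(ybar / (1.0 - ybar))                # starting value alpha_hat
    f = np.zeros((n, p))
    for _ in range(n_iter):
        eta = alpha + f.sum(axis=1)                    # eta_i
        prob = 1.0 / (1.0 + np.exp(-eta))              # p_i
        w = np.clip(prob * (1.0 - prob), 1e-10, None)  # weights w_i
        z = eta + (y - prob) / w                       # working targets z_i
        alpha, f = weighted_backfit(X, z, w)           # step (c): weighted backfitting
    return alpha, f
```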
SPAM Detection via Additive Logistic Regression
Input variables (predictors):
- 48 quantitative predictors: the percentage of words in the email that match a given word. Examples include business, address, internet, free, and george. The idea was that these could be customized for individual users.
- 6 quantitative predictors: the percentage of characters in the email that match a given character. The characters are ch;, ch(, ch[, ch!, ch$, and ch#.
- The average length of uninterrupted sequences of capital letters: CAPAVE.
- The length of the longest uninterrupted sequence of capital letters: CAPMAX.
- The sum of the lengths of uninterrupted sequences of capital letters: CAPTOT.
Output variable: SPAM (1) or Email (0). The f_j's are taken as cubic smoothing splines.

SPAM Detection: Results
                     Predicted Email (0)   Predicted SPAM (1)
    True Email (0)         58.5%                  2.5%
    True SPAM (1)           2.7%                 36.2%
Sensitivity: probability of predicting spam given the true state is spam = 36.2 / (36.2 + 2.7), about 93.1%.
Specificity: probability of predicting email given the true state is email = 58.5 / (58.5 + 2.5), about 95.9%.

GAM: Summary
- Useful, flexible extensions of linear models.
- The backfitting algorithm is simple and modular.
- The interpretability of the predictors (input variables) is not obscured.
- Not suitable for very large data mining applications (why?).

9.2 Tree-Based Methods
Background: tree-based models partition the feature space into a set of rectangles, and then fit a simple model (like a constant) in each one. A popular method for tree-based regression and classification is called CART (classification and regression trees).

CART Example
Consider a regression problem with continuous response Y and inputs X_1 and X_2. To simplify matters, we consider the partition shown in the top right panel of Figure 9.2. The corresponding regression model predicts Y with a constant c_m in region R_m. For illustration, we choose c_1 = -5, c_2 = -7, c_3 = 0, c_4 = 2, c_5 = 4 in the bottom right panel of Figure 9.2.

Regression Tree
Suppose we have a partition into M regions R_1, R_2, ..., R_M. We model the response Y with a constant c_m in each region:
    f(x) = \sum_{m=1}^{M} c_m I(x \in R_m).
If we adopt as our criterion minimization of the residual sum of squares \sum_i (y_i - f(x_i))^2, it is easy to see that the best constant is the region average:
    \hat{c}_m = ave(y_i | x_i \in R_m).
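A tiny sketch of the piecewise-constant tree model above: each \hat{c}_m is just the average response in its region, and prediction looks up the constant of the region a point falls in. The `region_ids` labelling is a hypothetical stand-in for the partition of Figure 9.2.

```python
import numpy as np

def fit_region_constants(y, region_ids):
    """c_m = ave(y_i | x_i in R_m): the RSS-optimal constant per region."""
    return {m: y[region_ids == m].mean() for m in np.unique(region_ids)}

def predict_constant_model(region_ids, constants):
    """f(x) = sum_m c_m * I(x in R_m): return each point's region constant."""
    return np.array([constants[m] for m in region_ids])

# Example with made-up data split into two regions:
# y = np.array([1.0, 2.0, 7.0, 9.0]); regions = np.array([1, 1, 2, 2])
# fit_region_constants(y, regions)  ->  {1: 1.5, 2: 8.0}
```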
Regression Tree (cont.)
Finding the best binary partition in terms of minimum RSS is generally computationally infeasible, so a greedy algorithm is used. Starting with all of the data, consider a splitting variable j and split point s, and define the pair of half-planes
    R_1(j, s) = \{X | X_j \le s\}   and   R_2(j, s) = \{X | X_j > s\}.
We then seek the splitting variable j and split point s that solve
    \min_{j, s} \Big[ \min_{c_1} \sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_2(j,s)} (y_i - c_2)^2 \Big].
For any choice of j and s, the inner minimization is solved by
    \hat{c}_1 = ave(y_i | x_i \in R_1(j, s))   and   \hat{c}_2 = ave(y_i | x_i \in R_2(j, s)).
For each splitting variable X_j, the determination of the split point s can be done very quickly, and hence by scanning through all of the inputs, determination of the best pair (j, s) is feasible. Having found the best split, we partition the data into the two resulting regions and repeat the splitting process within each of them.
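A brute-force sketch of the greedy split search: for each candidate pair (j, s) the inner minimisation is solved by the two region means, so scanning the observed values of each X_j finds the best pair. The sums are recomputed from scratch for clarity; a production implementation would update them incrementally while scanning.

```python
import numpy as np

def best_split(X, y):
    """Return the (j, s) minimising the two-region residual sum of squares."""
    n, p = X.shape
    best = (None, None, np.inf)
    for j in range(p):                          # scan every splitting variable X_j
        for s in np.unique(X[:, j])[:-1]:       # candidate split points s
            left = X[:, j] <= s                 # R1(j, s) = {X | X_j <= s}
            right = ~left                       # R2(j, s) = {X | X_j >  s}
            # inner minimisation: c1_hat and c2_hat are the region means
            rss = (((y[left] - y[left].mean()) ** 2).sum()
                   + ((y[right] - y[right].mean()) ** 2).sum())
            if rss < best[2]:
                best = (j, s, rss)
    return best  # (variable index, split point, achieved RSS)
```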
Regression Tree (cont.)
To prune the tree, we index terminal nodes by m, with node m representing region R_m, and let |T| denote the number of terminal nodes in T. Letting
    N_m = \#\{x_i \in R_m\},   \hat{c}_m = \frac{1}{N_m} \sum_{x_i \in R_m} y_i,   Q_m(T) = \frac{1}{N_m} \sum_{x_i \in R_m} (y_i - \hat{c}_m)^2,
we define the cost-complexity criterion
    C_\alpha(T) = \sum_{m=1}^{|T|} N_m Q_m(T) + \alpha |T|.

Classification Tree
Let \hat{p}_{mk} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = k) be the proportion of class k in node m, and let k(m) = \arg\max_k \hat{p}_{mk} be the majority class in node m. Instead of the Q_m(T) defined in (9.15) for regression, we use different measures Q_m(T) of node impurity, including the following:
- Misclassification error: \frac{1}{N_m} \sum_{i \in R_m} I(y_i \ne k(m)) = 1 - \hat{p}_{m k(m)}.
- Gini index: \sum_{k \ne k'} \hat{p}_{mk} \hat{p}_{mk'} = \sum_k \hat{p}_{mk}(1 - \hat{p}_{mk}).
- Cross-entropy (deviance): -\sum_k \hat{p}_{mk} \log \hat{p}_{mk}.

Classification Tree (cont.)
Example: for two classes, if p is the proportion in the second class, these three measures are
    1 - \max(p, 1 - p),   2p(1 - p),   and   -p \log p - (1 - p)\log(1 - p).
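The three two-class impurity measures listed above, written out directly; the clipping inside the cross-entropy is only there so that 0 log 0 evaluates to 0.

```python
import numpy as np

def misclassification_error(p):
    """1 - max(p, 1 - p) for second-class proportion p."""
    return 1.0 - np.maximum(p, 1.0 - p)

def gini_index(p):
    """2 p (1 - p)."""
    return 2.0 * p * (1.0 - p)

def cross_entropy(p):
    """-p log p - (1 - p) log(1 - p), with 0 log 0 taken as 0."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -p * np.log(p) - (1.0 - p) * np.log(1.0 - p)

# At p = 0.5 all three are maximal: 0.5, 0.5, and log 2 (about 0.693).
```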
9.3 PRIM: Bump Hunting
The patient rule induction method (PRIM) finds boxes in the feature space in which the response average is high. Hence it looks for maxima in the target function, an exercise known as bump hunting.
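The slides describe PRIM only at a high level, so the following is a hedged sketch of one top-down peeling pass under common PRIM conventions: repeatedly peel a fraction `alpha` of the points off one face of the current box, keep the peel that leaves the highest response mean, and stop when the box would drop below `min_support` points. The parameter names and this particular stopping rule are assumptions of the example, not taken from the slides.

```python
import numpy as np

def prim_peel(X, y, alpha=0.1, min_support=10):
    """One top-down peeling pass: shrink an axis-aligned box so that the
    mean response inside stays as high as possible."""
    inside = np.ones(len(y), dtype=bool)                 # start with the full box
    lo, hi = X.min(axis=0).astype(float), X.max(axis=0).astype(float)
    while inside.sum() > min_support:
        best_mean, best_update = -np.inf, None
        for j in range(X.shape[1]):                      # try peeling each face
            xj = X[inside, j]
            for side, q in (("lo", np.quantile(xj, alpha)),
                            ("hi", np.quantile(xj, 1 - alpha))):
                keep = inside & ((X[:, j] >= q) if side == "lo" else (X[:, j] <= q))
                # require progress and enough support, then keep the best peel
                if (min_support <= keep.sum() < inside.sum()
                        and y[keep].mean() > best_mean):
                    best_mean, best_update = y[keep].mean(), (j, side, q, keep)
        if best_update is None:
            break
        j, side, q, inside = best_update
        if side == "lo":
            lo[j] = q
        else:
            hi[j] = q
    return lo, hi, y[inside].mean()                      # final box and its mean response
```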
9.4 MARS: Multivariate Adaptive Regression Splines
MARS uses expansions in piecewise linear basis functions of the form (x - t)_+ and (t - x)_+. We call the two functions a reflected pair.

MARS (cont.)
The idea of MARS is to form reflected pairs for each input X_j with knots at each observed value x_{ij} of that input. Therefore, the collection of basis functions is
    C = \{ (X_j - t)_+, (t - X_j)_+ : t \in \{x_{1j}, x_{2j}, ..., x_{Nj}\},  j = 1, ..., p \}.
The model has the form
    f(X) = \beta_0 + \sum_{m=1}^{M} \beta_m h_m(X),
where each h_m(X) is a function in C, or a product of two or more such functions.

MARS (cont.)
We start with only the constant function h_0(X) = 1 in the model set M, and all functions in the set C are candidate functions. At each stage we add to the model set M the term of the form
    \hat{\beta}_{M+1} h_\ell(X)\,(X_j - t)_+ + \hat{\beta}_{M+2} h_\ell(X)\,(t - X_j)_+,   h_\ell \in M,
that produces the largest decrease in training error.
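A sketch of the reflected-pair basis and a single greedy forward step: knots are placed at the observed values x_{ij}, and the pair whose least-squares refit gives the smallest training RSS is added. Products with existing basis functions, which the full MARS procedure also considers, are omitted here to keep the example short.

```python
import numpy as np

def hinge(x, t, sign=+1):
    """Reflected pair pieces: (x - t)_+ for sign=+1 and (t - x)_+ for sign=-1."""
    return np.maximum(sign * (x - t), 0.0)

def mars_forward_step(X, y, basis):
    """Add the reflected pair (X_j - t)_+, (t - X_j)_+ that most reduces RSS.

    `basis` is the current (n, M) matrix of evaluated basis functions,
    whose first column is the constant h_0(x) = 1."""
    n, p = X.shape
    best_rss, best_cols, best_knot = np.inf, None, None
    for j in range(p):
        for t in np.unique(X[:, j]):               # knots at observed values x_ij
            pair = np.column_stack([hinge(X[:, j], t, +1),
                                    hinge(X[:, j], t, -1)])
            B = np.hstack([basis, pair])
            beta, *_ = np.linalg.lstsq(B, y, rcond=None)   # refit by least squares
            rss = ((y - B @ beta) ** 2).sum()
            if rss < best_rss:
                best_rss, best_cols, best_knot = rss, pair, (j, t)
    return np.hstack([basis, best_cols]), best_knot, best_rss

# usage: basis = np.ones((len(y), 1))
#        basis, knot, rss = mars_forward_step(X, y, basis)
```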
The process is continued until the model set M contains some preset maximum number of terms.

MARS (cont.)
This process typically overfits the data, and so a backward deletion procedure is applied. The term whose removal causes the smallest increase in RSS is deleted from the model at each step, producing an estimated best model \hat{f}_\lambda of each size \lambda. Generalized cross-validation is applied to estimate the optimal value of \lambda:
    GCV(\lambda) = \frac{\sum_{i=1}^{N} (y_i - \hat{f}_\lambda(x_i))^2}{(1 - M(\lambda)/N)^2}.
The value M(\lambda) is the effective number of parameters in the model.

9.5 HME: Hierarchical Mixtures of Experts
The HME method can be viewed as a variant of the tree-based methods. The main difference is that the tree splits are not hard decisions but rather soft probabilistic ones. In an HME, a linear (or logistic regression) model is fitted in each terminal node, instead of a constant as in CART.

HME (cont.)
A simple two-level HME model is shown in Figure 9.13. It can be viewed as a tree with soft splits at each non-terminal node. The terminal nodes are called experts and the non-terminal nodes are called gating networks. The idea is that each expert provides a prediction about the response Y, and these predictions are combined together by the gating networks.
HME (cont.)
The top gating network has the output
    g_j(x, \gamma_j) = \frac{e^{\gamma_j^T x}}{\sum_{k=1}^{K} e^{\gamma_k^T x}},   j = 1, ..., K,
where each \gamma_j is a vector of unknown parameters. This represents a soft K-way split (K = 2 in Figure 9.13). Each g_j(x, \gamma_j) is the probability of assigning an observation with feature vector x to the j-th branch.

HME (cont.)
At the second level, the gating networks have a similar form:
    g_{\ell|j}(x, \gamma_{j\ell}) = \frac{e^{\gamma_{j\ell}^T x}}{\sum_{k=1}^{K} e^{\gamma_{jk}^T x}},   \ell = 1, ..., K.
At the experts, we have a model for the response variable of the form Pr(y | x, \theta_{j\ell}), which differs according to the problem: for regression, a Gaussian linear regression model is used; for classification, a linear logistic regression model.
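A sketch of the two-level soft partition: both gating levels are softmax functions of x, and the prediction mixes the expert outputs with products of gating probabilities. The linear-regression experts and the parameter array shapes are assumptions of this illustration.

```python
import numpy as np

def softmax_gate(x, Gamma):
    """g_j(x) = exp(gamma_j^T x) / sum_k exp(gamma_k^T x), for the rows of Gamma."""
    scores = Gamma @ x
    scores = scores - scores.max()         # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

def hme_predict(x, Gamma_top, Gamma_low, Beta):
    """Two-level HME mean prediction with linear-regression experts (assumed).

    Gamma_top : (K, p)    top-level gating parameters gamma_j
    Gamma_low : (K, K, p) second-level gating parameters gamma_{jl}
    Beta      : (K, K, p) expert regression coefficients beta_{jl}"""
    g_top = softmax_gate(x, Gamma_top)                 # soft K-way split at the top
    pred = 0.0
    for j, gj in enumerate(g_top):
        g_low = softmax_gate(x, Gamma_low[j])          # soft split within branch j
        expert_means = Beta[j] @ x                     # one linear prediction per expert
        pred += gj * (g_low @ expert_means)            # mixture of the expert outputs
    return pred
```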