




ObjectNet: A large-scale bias-controlled dataset for
pushing the limits of object recognition models
Andrei Barbu*
MIT, CSAIL & CBMM
David Mayo*
MIT, CSAIL & CBMM
Julian Alverio
MIT, CSAIL
William Luo
MIT, CSAIL
Christopher Wang
MIT, CSAIL
Dan Gutfreund
MIT-IBM Watson AI
Joshua Tenenbaum
MIT, BCS & CBMM
Boris Katz
MIT, CSAIL & CBMM
Abstract
We collect a large real-world test set, ObjectNet, for object recognition with controls
where object backgrounds, rotations, and imaging viewpoints are random. Most
scientific experiments have controls, confounds which are removed from the data,
to ensure that subjects cannot perform a task by exploiting trivial correlations in
the data. Historically, large machine learning and computer vision datasets have
lacked such controls. This has resulted in models that must be fine-tuned for new
datasets and perform better on datasets than in real-world applications. When
tested on ObjectNet, object detectors show a 40-45% drop in performance, with
respect to their performance on other benchmarks, due to the controls for biases.
Controls make ObjectNet robust to fine-tuning, showing only small performance
increases. We develop a highly automated platform that enables gathering datasets
with controls by crowdsourcing image capturing and annotation. ObjectNet is
the same size as the ImageNet test set (50,000 images), and by design does not
come paired with a training set in order to encourage generalization. The dataset
is both easier than ImageNet – objects are largely centered and unoccluded – and
harder, due to the controls. Although we focus on object recognition here, data
with controls can be gathered at scale using automated tools throughout machine
learning to generate datasets that exercise models in new ways, thus providing
valuable feedback to researchers. This work opens up new avenues for research
in generalizable, robust, and more human-like computer vision and in creating
datasets where results are predictive of real-world performance.
1 Introduction
Datasets are of central importance to computer vision and more broadly machine learning. Particularly
with the advent of techniques that are less well understood from a theoretical point of view, raw
performance on datasets is now the major driver of new developments and the major feedback about
the state of the field. Yet, as a community, we collect datasets in a way that is unusual compared to
other scientific fields. We rely almost exclusively on dataset size to minimize confounds (artificial
correlations between the correct labels and features in the input), to attest unusual phenomena, and
encourage generalization. Unfortunately, scale is not enough because of rare events and biases –
Sun et al. [1] provide evidence that we should expect to see logarithmic performance increases as a
function of dataset size alone. The sources of data that datasets draw on today are highly biased, e.g.,
object class is correlated with backgrounds [2], and omit many phenomena, e.g., objects appear in
stereotypical rotations with little occlusion. The resulting datasets themselves are similarly biased [3].
*Equal contribution. Website https://objectnet.dev. Corresponding author abarbu@
33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
[Figure 1 plot: accuracy (%) of detectors by year; curves for ImageNet, Overlap, and ObjectNet, each at top-1 and top-5; annotated "40-45% performance drop".]
Figure 1: Performance on ObjectNet for high-performing detectors trained on ImageNet in recent
years: AlexNet [4], VGG-19 [5], ResNet-152 [6], Inception-v4 [7], NASNET-A [8], and PNASNet-5
Large [9]. Solid lines show top-1 performance, dashed lines show top-5 performance. ImageNet
performance on all 1000 classes is shown in green. ImageNet performance on classes that overlap
with ObjectNet is shown in blue; the two overlap in 113 classes out of 313 ObjectNet classes, which
are only slightly more difficult than the average ImageNet class. ObjectNet performance is reported
for those overlapping classes. We see a 40-45% drop in performance. Object detectors have improved
substantially. Performance on ObjectNet tracks performance on ImageNet but the gap between the
two remains large.
In other areas of science, such issues are controlled for with careful data creation and curation that
intentionally covers phenomena and controls for biases – important ideas that do not easily scale to
large datasets. For example, models for natural language inference, NLI, that perform well on large
datasets fail when systematically varying aspects of the input [10], but these are not collected at scale.
In computer vision, datasets like CLEVR [11] do the same through simulation, but simulated data is
much easier for modern detectors than real-world data. We show that with significant automation and
crowdsourcing, you can have scale and controls in real-world data and that this provides feedback
about the phenomena that must be understood to achieve human-level accuracy.
ObjectNet is a new large crowdsourced test set for object recognition that includes controls for object
rotations, viewpoints, and backgrounds. Objects are posed by workers in their own homes in natural
settings according to specific instructions detailing what object class they should use, how and where
they should pose the object, and where to image the scene from. Every image is annotated with these
properties, allowing us to test how well object detectors work across these conditions. Each of these
properties is randomly sampled, leading to a much more varied dataset.
In effect, we are removing some of the brittle priors that object detectors can exploit to perform well
on existing datasets. Overall, current object detectors experience a large performance loss, 40-45%,
when such priors are removed; see fig. 1 for performance comparisons. Each of the controls removes
a prior and degrades the performance of detectors; see fig. 2 for sample images from the dataset.
Practically, this means that important feedback for the community about the limitations of models
is missing, and that performance on datasets is limited as a predictor of the performance users can
expect on their own unrelated tasks.
[Figure 2 panels: ImageNet chairs; ObjectNet chairs by rotation, chairs by background, chairs by viewpoint, teapots, T-shirts.]
Figure 2: ImageNet (left column) often shows objects on typical backgrounds, with few rotations, and
few viewpoints. Typical ObjectNet objects are imaged in many rotations, on different backgrounds,
from multiple viewpoints. The first three columns show chairs varying by the three properties that are
being controlled for: rotation, background, and viewpoint. One can see the large variety introduced
to the dataset because of these manipulations. ObjectNet images are lightly cropped for this figure
due to inconsistent aspect ratios. Most detectors fail on most of the images included in ObjectNet.
To encourage generalization, we make three other unusual choices when constructing ObjectNet.
First, ObjectNet is only a test set, and does not come paired with a training set. Separating training
and test set collection may be an important tool to avoid correlations between the two which are
easily accessible to large models but not detectable by humans. Since humans easily generalize
to new datasets, adopting this separation can encourage new machine learning techniques that do
the same. Second, while ObjectNet will be freely available, it comes with an important stipulation:
one cannot update the parameters of any model for any reason on the images present in ObjectNet.
While fine-tuning for transfer learning is common, it encourages overfitting to particular datasets
– we disallow fine-tuning but report such experiments in section 4.3 to demonstrate the robustness
of the dataset. Third, we mark every image with a one-pixel red border that must be removed on the
fly before testing. As large-scale web datasets are gathered, there is a danger that data will leak
between the training and test sets of different datasets. This has already happened, as Caltech-UCSD
Birds-200-2011, a popular dataset, and ImageNet were discovered to have overlap, putting into
question some results [12]. With test set images marked by a red border and available online, one can
perform reverse image search and determine if an image is included in any training set anywhere. We
encourage all computer vision datasets – not just ones for object detection – to adopt this standard.
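As an illustration of the border-removal step, the sketch below strips a one-pixel frame from an image held as nested lists of RGB tuples. The image representation, the helper names `strip_border` and `has_red_border`, and the choice of pure red (255, 0, 0) as the border colour are assumptions made for this example; they are not taken from the ObjectNet release.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)
Image = List[List[Pixel]]     # rows of pixels

# Assumed border colour; the text only says the border is red.
RED: Pixel = (255, 0, 0)


def has_red_border(image: Image) -> bool:
    """Return True if the outermost ring of pixels is entirely RED."""
    top_bottom = image[0] + image[-1]
    sides = [row[0] for row in image] + [row[-1] for row in image]
    return all(pixel == RED for pixel in top_bottom + sides)


def strip_border(image: Image, width: int = 1) -> Image:
    """Return a copy of `image` with `width` pixels removed from every side."""
    height = len(image)
    row_len = len(image[0]) if height else 0
    if height <= 2 * width or row_len <= 2 * width:
        raise ValueError("image is too small to strip the border")
    return [row[width:row_len - width] for row in image[width:height - width]]
```

A loader could call `has_red_border` to recognise a marked test image and `strip_border` immediately before passing the pixels to a model, so the border never reaches the network.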
While it includes controls, ObjectNet is not hard in arbitrary ways. It is in many ways intentionally
easy compared to ImageNet or other datasets. Objects are highly centralized in the image, they
are rarely occluded and even then lightly so, and many backgrounds are not particularly cluttered.
In other senses, ObjectNet is harder: a small percentage of viewpoints, rotations, and even object
instances are also difficult for humans. This demonstrates a much wider range of difficulty and
provides an opportunity to also test the limits of human object recognition – if object detectors are
to augment or replace humans, such knowledge is critical. Our overall goal is to test the bias of
detectors and their ability to generalize to specific manipulations, not to just create images that are
difficult for arbitrary reasons. Future versions of the dataset will ratchet up this difficulty in terms of
clutter, occlusion, lighting, etc. with additional controls for these properties.
Our contributions are:
1. a new methodology to evaluate computer vision approaches on datasets that have controls,
2. an automated platform to gather data at scale for computer vision,
3. a new object recognition test set, ObjectNet, consisting of 50,000 images (the same size as
the ImageNet test set) and 313 object classes, and
4. an analysis of biases at scale and the role of fine-tuning.
2 Related work
Many large datasets for object recognition exist such as ImageNet [13], MS COCO [14], and
OpenImages [15]. While the training sets for these datasets are huge, the test sets are comparable to
the size of the dataset presented here, with ImageNet having 50,000 test images, MS COCO having
81,434, and OpenImages having 125,436, compared to ObjectNet’s 50,000 test images. Such datasets
are collected from repositories of existing images, particularly Flickr, which consist of photographs –
images that users want to share online. This intent biases against many object instances, backgrounds,
rotations, occlusion, lighting conditions, etc. Biases lead simultaneously to models that do not transfer
well between datasets [3] – detectors pick up on biases inside a dataset and fail when those biases
change – and that achieve good performance with little fine-tuning on new datasets [16] – detectors
can quickly acquire the new biases even with only a few training images per class. In computer
vision applications, biases may not match those of any existing dataset, they may change over time,
adversaries may exploit the biases of a system, etc.
The dataset-dependent nature of existing object detectors is well-understood with several other
approaches – aside from scale – having been attempted to alleviate this problem. Some focus on
the datasets themselves, e.g., Khosla et al. [17] subdivide datasets into partitions that are sufficiently
different, something possible only if datasets have enough variety in them. Others focus on the
models, e.g., Zhu et al. [2] train models that separate foregrounds and backgrounds explicitly to
become more resilient to biases. Demonstrating the value of models that have robustness built into
them by design requires datasets that control for biases – controls are not just a sanity check, they
encourage better research.
Some datasets, such as MPII cooking [18], KITTI [19], TACoS [20], CHARADES [21],
Something-Something [22], AVA [23], and Partially Occluded Hands [24] collect novel data. Explicitly collecting
data is difficult, as evidenced by the large gap in scale between these datasets and those collected
from existing online sources. At the same time, explicit instructions and controls can lead to more
varied and interesting datasets. These datasets on the whole do not attempt to impose controls by
systematically varying some aspect of the data – users are prompted to perform actions or hold
objects but are not told how to do this or what properties those actions should have. Workers choose
convenient settings and manners in which to perform actions, leading to biases in datasets.
3 Dataset construction
Figure 3: Workers select one object that they have available from a small number of choices. They
are shown a rectangular prism, in blue, with two labeled orthogonal axes in red and yellow. These
labels are object-class specific, so that workers can register the object correctly against the rectangular
prism. We do not show workers images of desired objects to not bias them toward certain instances.
Workers see an animation of how the object should be manipulated, perform this manipulation, and
then align the object against the final rectangular prism rendered on their camera. Not shown above is
the post-capture review UI to ensure that images contain the right objects and are not blurry.
ObjectNet is collected by workers on Mechanical Turk who image objects in their homes; see fig. 3.
This gives us control over the properties of those objects while also ensuring that the images are
natural. We asked workers to image objects in 4 backgrounds (kitchens, living rooms, bedrooms,
washrooms), from 3 viewpoints (top, angled at 45 degrees, and side), and in 50 object rotations.
Rotations were uniformly distributed on a sphere, after which nearby points were snapped to the
equator and the poles. We found that workers are able to pose objects to within around 20 degrees of
rotation depending on the axis, although the uniformity of the resulting rotations varies by class. This
could be more accurate, but we intentionally did not show instances of object classes to workers in
order to avoid biasing them toward particular instances. In roughly one third of the trials we showed
a rotated 3D car (cars do not appear in our dataset) as an additional cue for the desired rotation.
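One way to generate target rotations of the kind described above can be sketched as follows: draw directions uniformly on the unit sphere, then snap any direction that lies close to a pole or to the equator. Treating a rotation as a single direction on the sphere, the 15-degree snapping threshold, and the function names are all assumptions of this sketch; the text does not state the threshold or procedure that was actually used.

```python
import math
import random
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sample_unit_vector(rng: random.Random) -> Vec3:
    """Draw a direction uniformly on the unit sphere by normalizing a Gaussian sample."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return (x / norm, y / norm, z / norm)


def snap_direction(v: Vec3, threshold_deg: float = 15.0) -> Vec3:
    """Snap v to the nearest pole or to the equator if within threshold_deg.

    The threshold is a hypothetical value chosen for illustration.
    """
    x, y, z = v
    latitude = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    if 90.0 - abs(latitude) <= threshold_deg:
        # Close to a pole: keep only the sign of z.
        return (0.0, 0.0, math.copysign(1.0, z))
    if abs(latitude) <= threshold_deg:
        # Close to the equator: project onto the z = 0 plane and renormalize.
        norm = math.hypot(x, y)
        return (x / norm, y / norm, 0.0)
    return v
```

Calling `snap_direction(sample_unit_vector(random.Random(0)))` repeatedly with a shared generator yields a set of directions in which poles and the equator are over-represented relative to a purely uniform draw, which matches the qualitative description in the text.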
Workers are transitioned to their phone using a QR code, an object is described to them (but no
example is shown), and they verify if an object that matches the description is available. A rectangular
prism is then presented with labeled faces that are semantically relevant to that object, e.g., the front
and top of a chair. Each object class was annotated with two semantically meaningful orthogonal
axes, a single axis if the object class was rotationally symmetric, or no axis if it was spherical. We
found that describing such parts in a manner that leads to little disagreement is difficult and requires
careful validation. While this provides a weak bias toward particular object instances – one might
imagine a chair with no distinctive front – it is necessary for explaining the desired object pose.
The rectangular prism is also animated to show the desired object pose. The animation starts with
the rectangular prism representing the object in a default and common pose, e.g., the front of a chair
facing a user and the top pointed upward, and then transitions it into the desired pose. Another
animation shows the viewpoint from which the object should be imaged. We found that animating
such instructions was critical in allowing workers to determine the desired object poses.
Workers are asked to move the object into a specific room, pose it, and image it from a certain
angle. The rectangular prism was overlaid on their phone camera in the final desired position
with the arrows marking the class-specific semantically-relevant faces. This also proved critical as
remembering the desired rotation for an object is too unreliable.
This process annotates every image with three properties (rotation, viewpoint, and background); it
controls for biases by sampling these properties randomly, thus allowing us to include objects in
rotations and scenes that are unusual. Each image is validated to ensure that it contains the correct
objects and that any identifying information is removed.
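The random sampling of the three controlled properties can be sketched as below. The counts (4 backgrounds, 3 viewpoints, 50 rotations) come from the text; the `CaptureTask` record, the `sample_task` function, and the representation of a rotation as an index into a fixed list are hypothetical choices made for this example and do not describe the actual collection platform.

```python
import random
from dataclasses import dataclass

BACKGROUNDS = ("kitchen", "living room", "bedroom", "washroom")
VIEWPOINTS = ("top", "angled at 45 degrees", "side")
NUM_ROTATIONS = 50  # rotations are referred to by index in this sketch


@dataclass(frozen=True)
class CaptureTask:
    """One instruction shown to a worker; doubles as the image annotation."""
    object_class: str
    background: str
    viewpoint: str
    rotation_index: int


def sample_task(object_class: str, rng: random.Random) -> CaptureTask:
    """Sample background, viewpoint, and rotation independently and uniformly."""
    return CaptureTask(
        object_class=object_class,
        background=rng.choice(BACKGROUNDS),
        viewpoint=rng.choice(VIEWPOINTS),
        rotation_index=rng.randrange(NUM_ROTATIONS),
    )
```

Because the three properties are drawn independently of the object class, any correlation between class and background, viewpoint, or requested rotation is removed at the instruction stage, before validation and rejection can reintroduce some skew.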
To select object classes for the dataset, we listed 420 common household objects. Of these, 55 classes
were eliminated because they are not easily movable, e.g., beds (16 classes), pose a safety concern,
e.g., firearms (8), were too confusing to subjects, e.g., we found little agreement on what armbands
are (10), posed privacy concerns, e.g., people (5), or were alive and cannot be manipulated safely,
e.g., plants (2); numbers do not add up because classes were excluded for multiple reasons. In addition,
52 object classes were too rare, e.g., golf clubs. Data was collected for 313 object classes, with ≈ 160
images per class on average with a standard deviation of 44.
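The class counts above are consistent with one another, as a quick check shows. The snippet uses only figures stated in the text (420 listed, 55 eliminated, 52 too rare, 50,000 images in the final dataset); the variable names are illustrative.

```python
listed = 420        # candidate household object classes
eliminated = 55     # not movable, unsafe, confusing, private, or alive
too_rare = 52       # e.g., golf clubs

remaining = listed - eliminated - too_rare
images = 50_000     # images in the final dataset
mean_per_class = images / remaining

print(remaining)              # 313
print(round(mean_per_class))  # 160
```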
Workers did not always have instances of every class. For each image to be collected, they were
given ten choices out of which to select one that is available or request ten other choices. This
naturally would lead to an extreme class imbalance as the easiest and most common classes would be
vastly overrepresented. To make the class distribution more uniform, we presented objects inversely
proportional to how frequent they are; the resulting distribution is fairly uniform, see fig. 4.
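A minimal sketch of inverse-frequency presentation, under stated assumptions: each class is weighted by 1 / (images collected so far + 1), and ten distinct classes are drawn without replacement according to those weights. The exact weighting used by the authors is not given in the text, so the formula, the add-one smoothing, and the function names here are hypothetical.

```python
import random
from typing import Dict, List


def inverse_frequency_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by 1 / (count + 1); the +1 avoids division by zero."""
    return {name: 1.0 / (n + 1) for name, n in counts.items()}


def propose_choices(counts: Dict[str, int], rng: random.Random, k: int = 10) -> List[str]:
    """Draw up to k distinct classes, favouring classes with few images so far."""
    remaining = inverse_frequency_weights(counts)
    chosen: List[str] = []
    for _ in range(min(k, len(remaining))):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # sample without replacement
    return chosen
```

With counts such as `{"Chair": 500, "Whisk": 0, "Trophy": 3}`, the rarely collected classes receive much larger weights than "Chair", so they are offered to workers more often and the class histogram flattens over time.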
Objects were described to workers using one to four words, depending on the class. Two exceptions
were made, for forks and spoons, as user agreement on how to label two orthogonal faces of these
object classes is very low; rough sketches were shown instead. When aligning their object and phone,
workers were instructed to ignore the aspect ratio of the rectangular prism. We found that having a
single aspect ratio, a cube for example, for all object classes was very confusing to workers. Each
object class is annotated with a rough aspect ratio for its rectangular prism. This again represents a
small bias toward particular kinds of objects, although this is alleviated by the fact that most objects
did not fit a rectangular prism anyway. Deformable objects were still rotated and users followed those
rotations, aligning the semantically meaningful axes with object parts, but other details of the object
pose were not controlled for.
No instructions were given about how to stabilize objects in the desired poses. When necessary, some
workers held the objects while others propped them up. For each image, workers were asked two
questions on their phone collection UI: to verify that the image depicts an object of the intended class
and that it is not too blurry. In many indoor lighting conditions, particularly with low-end cameras, it
is easy to take unrecognizable photos without careful stabilization. We estimate the task took around
1.5 minutes per object on average and workers were paid 10 dollars per hour on average.
In total, 95,824 images were collected from 5,982 workers, out of which 50,000 images were retained
after validation and included in the dataset. Each image was manually verified. About 48% of the
data collected was removed. In 10% of images, objects were placed in incorrect backgrounds, showed
faces (0.2% of images), or contained other private information (0.03% of images). We found that
despite instructions, many users took photos of screens if they did not have an object (23%) – these
were removed because on the whole they are very easy for models to recognize. Centralized locations
that employ workers on Mechanical Turk were eliminated from the dataset to ensure that objects are
not imaged on the same backgrounds across many workers (20%). Note that some problem categories
overlapped. So as not to bias the dataset toward images which are easy for humans, validators were
instructed to be permissive and only rule out an image of an object if it clearly violated the constraints.
Since workers who carry out the task correctly do so nearly perfectly, while workers who do not
carry out almost every trial incorrectly, we have additional confidence that images which are hard
to recognize depict the correct object classes.
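The retention figures can be checked directly from the counts in the text. The snippet below recomputes the removed fraction and shows that the three headline rejection rates sum to more than the total removed, which is what one would expect given that some problem categories overlapped. Variable names are illustrative.

```python
collected = 95_824
retained = 50_000

removed_fraction = 1 - retained / collected
print(round(removed_fraction * 100))  # 48

# Headline rejection rates quoted in the text, in percent of collected images.
rejection_rates = {
    "incorrect background or private content": 10,
    "photos of screens": 23,
    "centralized worker locations": 20,
}
total_if_disjoint = sum(rejection_rates.values())
print(total_if_disjoint)  # 53, above 48 because categories overlap
```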
This dataset construction method is not without its limitations. All objects are indoor objects which
are easy to manipulate; they cannot be too large or small, fixed to the wall, or dangerous. We cannot
ask workers to manipulate objects in ways that would damage or otherwise permanently alter them.
Some object classes which are rare can be difficult to gather and are more likely to have incorrect
images before validation. Not all undesirable correlations are removed by this process; for example,
some objects are more likely to be held than others while certain object classes are predisposed to
have particular colors. We are not guaranteed to cover the space of shapes or textures for each object
class. Finally, not all object classes are as easy to rotate, so the resulting poses are still correlated
with the object class.
4 Results
We investigate object detector performance on ObjectNet using an image labeling task; see section 4.1.
Then we explain this performance by breaking down how controls affect results; section 4.2. Finally
we demonstrate that the difficulty of ObjectNet lies in the controls, and not in the particular properties
of the images, by fine-tuning on the dataset; section 4.3.
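For readers who want to reproduce the top-1 and top-5 measurements used in this section, the following sketch computes top-k accuracy from ranked class predictions and optionally restricts the predictions to the classes shared by two label sets. Whether the reported numbers filter predictions to the overlapping classes in this way is an assumption of the sketch, and the function names are hypothetical.

```python
from typing import List, Sequence, Set


def restrict_to_overlap(ranked: Sequence[str], overlap: Set[str]) -> List[str]:
    """Drop predicted classes outside `overlap`, keeping the ranking order."""
    return [name for name in ranked if name in overlap]


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of examples whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(labels) or not labels:
        raise ValueError("need one ranking per label and at least one example")
    hits = sum(
        1
        for ranked, label in zip(ranked_predictions, labels)
        if label in ranked[:k]
    )
    return hits / len(labels)
```

A typical use is to rank each model's class scores from highest to lowest, pass the rankings through `restrict_to_overlap`, and then call `top_k_accuracy` with `k=1` and `k=5`.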
4.1 Transfer from ImageNet
We tested six object detectors published over the past several years on ObjectNet, choosing top
performers for each year: AlexNet (2012) [4], VGG-19 (2014) [5], ResNet-152 (2016) [6],
Inception-v4 (2017) [7], NASNET-A (2018) [8], and PNASNet-5L (2018) [9]. All detectors were pre-trained
[Figure 4 panels: Object class, Background, Rotation (*), Viewpoint.]
Figure 4: The distribution of the 313 object classes, backgrounds, rotations, and viewpoints in the
dataset. The class distribution is fairly uniform due to biasing workers toward low-frequency objects.
Object backgrounds, viewpoints, and rotations were sampled uniformly but rejected data can skew
the distribution. Each image is also labeled with a 3D rectangular prism and semantically meaningful
faces for each object. Spherical objects pop out of the rotation histogram as they have a single
rotation. (*) Note that object rotations are less reliable than this indicates: not all objects are equally
easy to rotate, the actual rotations of objects pictured in the dataset are less uniform. This represents
the object rotations that workers were asked to collect. While this is also true for background and
viewpoint, we expect that the true rotation graph is more skewed than the other two.
Air freshener, Alarm clock, Backpack, Baking sheet, Banana, Band aid, Baseball bat, Baseball glove, Basket,
Bathrobe, Bath towel, Battery, Bed sheet, Beer bottle, Beer can, Belt, Bench, Bicycle, Bike pump, Bills
(money), Binder (closed), Biscuits, Blanket, Blender, Blouse, Board game, Book (closed), Bookend, Boots,
Bottle cap, Bottle opener, Bottle stopper, Box, Bracelet, Bread knife, Bread loaf, Briefcase, Brooch, Broom,
Bucket, Butcher’s knife, Butter, Button, CD/DVD case, Calendar, Can opener, Candle, Canned food, Cellphone,
Cellphone case, Cellphone charger, Cereal, Chair, Cheese, Chess piece, Chocolate, Chopstick, Clothes hamper,
Clothes hanger, Coaster, Coffee beans, Coffee grinder, Coffee machine, Coffee table, Coin (money), Comb,
Combination lock, Computer mouse, Contact lens case, Cooking oil bottle, Cork, Cutting board, DVD player,
Deodorant, Desk lamp, Detergent, Dishrag or hand towel, Dish soap, Document folder (closed), Dog bed,
Doormat, Drawer (open), Dress, Dress pants, Dress shirt, Dress shoe (men), Dress shoe (women), Drill, Drinking
Cup, Drinking straw, Drying rack for clothes, Drying rack for plates, Dustpan, Earbuds, Earring, Egg, Egg carton,
Envelope, Eraser (whiteboard), Extension cable, Eyeglasses, Fan, Figurine or statue, First aid kit, Flashlight,
Floss container, Flour container, Fork, French press, Frying pan, Glue container, Hairbrush, Hair clip, Hair
dryer, Hair tie, Hammer, Hand mirror, Handbag, Hat, Headphones (over ear), Helmet, Honey container, Ice, Ice
cube tray, Iron, Ironing board, Jam, Jar, Jeans, Kettle, Keyboard, Keychain, Ladle, Lampshade, Laptop (open),
Laptop charger, Leaf, Leggings, Lemon, Letter opener, Lettuce, Light bulb, Lighter, Lipstick, Loofah, Magazine,
Makeup, Makeup brush, Marker, Match, Measuring cup, Microwave, Milk, Mixing/Salad Bowl, Monitor, Mouse
pad, Mouthwash, Mug, Multitool, Nail, Nail clippers, Nail file, Nail polish, Napkin, Necklace, Newspaper, Night
light, Nightstand, Notebook, Notepad, Nut for a screw, Orange, Oven mitts, Padlock, Paintbrush, Paint can,
Paper, Paper bag, Paper plates, Paper towel, Paperclip, Peeler, Pen, Pencil, Pepper shaker, Pet food container,
Landline phone, Photograph, Pill bottle, Pill organizer, Pillow, Pitcher, Placemat, Plastic bag, Plastic cup, Plastic
wrap, Plate, Playing cards, Pliers, Plunger, Pop can, Portable heater, Poster, Power bar, Power cable, Printer,
Raincoat, Rake, Razor, Receipt, Remote control, Removable blade, Ribbon, Ring, Rock, Rolling pin, Ruler,
Running shoe, Safety pin, Salt shaker, Sandal, Scarf, Scissors, Screw, Scrub brush, Shampoo bottle, Shoelace,
Shorts, Shovel, Skateboard, Skirt, Sleeping bag, Slipper, Soap bar, Soap dispenser, Sock, Soup Bowl, Sewing kit,
Spatula, Speaker, Sponge, Spoon, Spray bottle, Squeegee, Squeeze bottle, Standing lamp, Stapler, Step stool,
Still Camera, Sink Stopper, Strainer, Stuffed animal, Sugar container, Suit jacket, Suitcase, Sunglasses, Sweater,
Swimming trunks, T-shirt, TV, Table knife, Tablecloth, Tablet, Tank top, Tape, Tape measure, Tarp, Teabag,
Teapot, Tennis racket, Thermometer, Thermos, Throw pillow, Tie, Tissue, Toaster, Toilet paper roll, Tomato,
Tongs, Toothbrush, Toothpaste, Tote bag, Toy, Trash bag, Trash bin, Travel case, Tray, Trophy, Tweezers,
Umbrella, USB cable, USB flash drive, Vacuum cleaner, Vase, Video camera, Walker, Walking cane, Wallet,
Watch, Water bottle, Water filter, Webcam, Weight (exercise), Weight scale, Wheel, Whisk, Whistle, Wine bottle,
Wine glass, Winter glove, Wok, Wrench, Ziploc bag
Figure 5: The 313 object classes in ObjectNet. We chose object classes that were fairly common,
not too similar to one another, cover a wide range of objects available in homes, and can be safely
manipulated by workers. The 113 classes which overlap with ImageNet are marked in italics.
[Figure 6 panels: Object class, Background, Rotation, Viewpoint.]
Figure 6: Top-1 performance of ResNet-152 pretrained on ImageNet on the subset of ObjectNet
– 113 classes which overlap with ImageNet – as a function of controls used. No fine-tuning was
performed; see section 4.3. Classes