A Novel Knowledge Network Framework for Financial News Navigation
Lili Zhou1, Hanchao Wang1, Lei Zhang1, Enhong Chen1, and Jun Chen2
1 University of Science and Technology of China {zhoulili,hanchaow,stone}@
2 Xinhua News Agency cheneh@, chenjun2008@
Abstract. Nowadays, various financial news retrieval platforms are provided to help users, especially financial professionals and hobbyists, make the right decisions. On those platforms, users usually get information by searching for relevant news via keywords or by clicking recommended news on the same topic in the clicked webpage. However, such ways of obtaining financial information cannot effectively meet users’ further needs: they are eager to obtain relevant news from different domains in a short time. To address this problem, we propose a novel four-layers-based knowledge network framework for financial news navigation. Experiments on real datasets demonstrate the effectiveness and efficiency of our proposed framework.
Keywords: retrieval platform, knowledge network, financial news navigation.
1 Introduction
Currently, many financial news retrieval platforms, including general search engines (e.g., baidu) and financial-domain vertical websites (e.g., sina), have arisen to facilitate access to financial information for users, especially financial professionals and hobbyists, who are eager to get the latest financial information to help them make the right decisions. On these platforms, users can search for relevant news by keywords or click the corresponding label in the navigation menu to obtain the financial information they need. To learn the ins and outs of a clicked news report, users have to keep seeking related news through these two ways, which is time-consuming and laborious. Although the majority of news pages list some recommended news links on the same topic for extended reading, this does not alleviate the problem. Meanwhile, existing studies on financial news mainly focus on how to organize and present financial news in a more user-friendly way [1] or on mining the underlying information in financial news [2]. None of them solves this problem properly.
To this end, we propose a novel four-layers-based knowledge network framework for financial news navigation. Specifically, we first crawl large amounts of financial news from many popular financial web portals. Second, we apply a topic model and classification methods to extract knowledge from the collected news corpus. Based on these knowledge sets, we then construct a four-layers-based knowledge network for each news report. Finally, we implement and visualize this network with a popular library, D3.js.

F. Li et al. (Eds.): WAIM 2014, LNCS 8485, pp. 723–727, 2014. © Springer International Publishing Switzerland 2014

2 The Four-Layers-Based Knowledge Network
Constructing the knowledge network framework consists of two steps, an offline step and an online step. Next, we detail the techniques and implementations involved in each.

2.1 Knowledge Definition and Extraction
In the offline step, we crawl real datasets and extract knowledge from them. Here we define knowledge as the financial information contained in financial news, which can be expressed as an industry label, a topic, etc. Among these kinds of knowledge, the topics and the industry label of a news report are usually difficult to obtain. Fortunately, Latent Dirichlet Allocation (LDA) [3], a topic model, can extract topics from vast amounts of financial news effectively. Since the headline often represents the core idea of a news report, we consider both the title and the news content and give them different weights when using the LDA model. To recognize the industry label, we choose a Support Vector Machine [4] due to its high accuracy and efficiency.

2.2 Knowledge Network Construction and Visualization
In the online step, we construct the dynamic network composed of numerous pieces of knowledge. Fig. 1 shows the full view.
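As a rough illustration of the four layers in Fig. 1 (news clusters, topic clusters, topics, news), the network for one report can be held as a nested structure. The sketch below is not the authors' Java implementation: the function names `top_k_news` and `build_network`, the input layout, and the sample titles and probabilities are all hypothetical. It assumes that per-news topic probabilities (e.g., from LDA) and the topic clusters (e.g., from K-Means) have already been computed offline, and it emits the `name`/`children` nesting that D3's hierarchy utilities read by default.

```python
import json
from typing import Dict, List


def top_k_news(topic_probs: Dict[str, Dict[str, float]],
               topic: str, k: int = 5) -> List[str]:
    """Return the k news titles with the highest probability for `topic`."""
    ranked = sorted(topic_probs,
                    key=lambda title: topic_probs[title].get(topic, 0.0),
                    reverse=True)
    return ranked[:k]


def build_network(main_news: str,
                  news_clusters: Dict[str, Dict[str, List[str]]],
                  topic_probs: Dict[str, Dict[str, float]],
                  k: int = 5) -> dict:
    """Nest the layers: Main News > news clusters > topic clusters > topics > news."""
    return {
        "name": main_news,
        "children": [
            {
                "name": cluster,
                "children": [
                    {
                        "name": topic_cluster,
                        "children": [
                            {
                                "name": topic,
                                "children": [
                                    {"name": title}
                                    for title in top_k_news(topic_probs, topic, k)
                                ],
                            }
                            for topic in topics
                        ],
                    }
                    for topic_cluster, topics in topic_clusters.items()
                ],
            }
            for cluster, topic_clusters in news_clusters.items()
        ],
    }


# Hypothetical inputs: topic probabilities per news title and one news cluster.
probs = {
    "Title: xxx": {"Topic C": 0.7},
    "Title: yyy": {"Topic C": 0.2, "Topic A": 0.6},
    "Title: zzz": {"Topic C": 0.9},
}
clusters = {"macro economy": {"topic-cluster 3": ["Topic C"]}}

network = build_network("Main News", clusters, probs, k=2)
print(json.dumps(network, indent=2))
```

Keeping the ranking in a separate function means the number of news reports shown under a topic node (5 in Fig. 1(d)) stays a single parameter.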
Suppose a user opens a news page. The first layer of the knowledge network for this news report, shown in Fig. 1(a), is then presented after the news content. In Fig. 1(a), the center node denotes the clicked news (hereafter Main News), and each of the other three nodes connected to Main News stands for a set of news with a common nature. For example, if the industry label of Main News is real estate, the node named “industry policy” represents relevant news whose labels are all real estate control policy. If the user is interested in macro data after reading Main News, he or she can click the corresponding news-cluster node to get detailed information. Fig. 1(b) shows the second layer of the knowledge network, where each blue node denotes a topic set with a description of several words, and each topic in the set is contained in some news in its parent node. We use a simple clustering method, K-Means [5], to aggregate those topics. When the user clicks a blue node, the specific topics in this set are displayed, as shown in Fig. 1(c). Furthermore, the user can also find the top-n relevant news for a topic by clicking the topic node. As shown in Fig. 1(d), there are 5 macro economy news reports whose topics are all “Topic C”. We find the top-5 news reports according to the probability value of the topic in all news reports. Finally, the user can read one of these news reports by clicking it. Through the whole process, users can conveniently grasp information on different topics.

Fig. 1. The structure of a knowledge network for one financial news report. Panels: (a) Main News linked to three news clusters (industry policy, macro economy, upstream and downstream industries); (b) topic clusters under a news cluster; (c) topics under a topic cluster; (d) news titles under a topic.

We visualize each layer of the network with a popular library, D3.js (/), which can bring data to life using HTML, SVG and CSS.

3 Experiments
To validate the proposed framework, we implement it for the real estate industry according to Section 2. All implementations are in Java and run on a Windows 7 PC with an Intel 1-core i3 3.10 GHz CPU, 4 GB of main memory and a 64-bit operating system. On this basis, we design a scoring system comparing two different kinds of news presentation: the novel news page with the knowledge network embedded, denoted as Page1, and the original news page, denoted as Page2. We offer users four measures for marking the two pages; Table 1 presents their meanings. To avoid human bias, we deploy this scoring system on a server and let external users grade the pages after using them.
So far, there are more than two thousand scoring records. We randomly select 600 of them for the final statistics. Among these users, 399 express a preference for Page1.

Table 1. Details of four scoring indexes for two financial news presentations
Measure          Annotation
info-diversity   information diversity of extended news sets (1-10’)
relInfo-search   whether it helps users find related news and offers a positive user experience (1-10’)
user-confidence  users’ confidence in the scores they give (1-10’)
whether-like     whether the user likes such a way of news presentation (0/1)

As shown in Fig. 2(a), the average scores of Page1 are higher than those of Page2 for both info-diversity and relInfo-search, and the same holds in the other case, where users’ confidence in their ratings is combined with the scores. Higher user confidence means that the scores given are more credible. Meanwhile, the variances of the scores of Page1 are lower than those of Page2 for the two indexes in both cases. In addition, we apply a z-test and find that the differences between the ratings obtained by our proposed approach and the existing news presentations (e.g., sina) are statistically significant, with |z| ≥ 2.58 and thus p ≤ 0.01. Therefore, our proposed framework for financial news navigation outperforms the existing news presentations.
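The threshold |z| ≥ 2.58 corresponds to a two-sided significance level of 0.01 under the standard normal distribution. The paper does not state which form of z-test was applied, so the following is only a minimal sketch assuming an unpaired two-sample test on the means. The function name `two_sample_z` is hypothetical, and the two rating lists are made-up illustrative values, not the study's data (a z-test also presumes far larger samples, such as the 600 records above).

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_z(scores_a, scores_b):
    """Return (z, two-sided p) for the difference between two sample means."""
    n_a, n_b = len(scores_a), len(scores_b)
    # Standard error of the difference, using the sample variances.
    se = math.sqrt(variance(scores_a) / n_a + variance(scores_b) / n_b)
    z = (mean(scores_a) - mean(scores_b)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Critical value for a two-sided test at the 0.01 level (about 2.58).
z_crit = NormalDist().inv_cdf(1.0 - 0.01 / 2.0)

# Made-up ratings on a 1-10 scale, for illustration only.
page1 = [8, 9, 7, 8, 9, 8, 7, 9, 8, 8]
page2 = [6, 5, 7, 6, 5, 6, 7, 5, 6, 6]

z, p = two_sample_z(page1, page2)
print(f"z = {z:.2f}, p = {p:.2g}, significant at 0.01: {abs(z) >= z_crit}")
```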
Fig. 2. The scoring results: (a) mean and (b) variance

4 Conclusion
In this study, we presented a four-layers-based knowledge network framework for financial news navigation. The experiments focusing on the real estate industry demonstrated that the proposed framework can effectively satisfy users’ further needs. Furthermore, the idea of using a knowledge network to facilitate users’ access to financial information is generally applicable to other domains of news navigation (e.g., sports news). In the future, we would like to incorporate behavior data such as geographic information and users’ browsing history logs into the proposed framework for personalized recommendation.

Acknowledgments. This research was partially supported by grants from the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 201), the Science and Technology Development of Anhui Province, China (Grants No. 13Z02008-5 and 1301022064), the International Science & Technology Cooperation Plan of Anhui Province (Grant No. 1303063008), and the Nature Science Foundation of Anhui Education Department (Grant No. KJ2012A273).

References
1. Wang, H., Wang, Z.: Mobile financial news mashup development based on yql. In: 2013 Fifth International Conference on Computational and Information Sciences (ICCIS), pp. 1717–1720. IEEE (2013)
2. Alanyali, M., Moat, H.S., Preis, T.: Quantifying the relationship between financial news and the stock market. Scientific Reports 3 (2013)
3. Blei, D.M., Ng, A.Y., Jordan, M.: Latent dirichlet allocation. The Journal of Machine Learning Research 3, 993–1022 (2003)
4. Sebastiani, F.: Machine learning in automated text categorization. ACM Computing Surveys (CSUR) 34(1), 1–47 (2002)
5. Jain, A.K.: Data clustering: 50 years beyond k-means. Pattern Recognition Letters 31(8), 651–666 (2010)