Large-scale factorization of type-constrained multi-relational data

Bibliographic Details
Published in DSAA: 2014 International Conference on Data Science and Advanced Analytics, October 30, 2014 - November 1, 2014, pp. 18-24
Main Authors Krompass, Denis; Nickel, Maximilian; Tresp, Volker
Format Conference Proceeding
Language English
Published IEEE 01.10.2014
DOI 10.1109/DSAA.2014.7058046

Abstract The statistical modeling of large multi-relational datasets has increasingly gained attention in recent years. Typical applications involve large knowledge bases like DBpedia, Freebase, YAGO and the recently introduced Google Knowledge Graph that contain millions of entities, hundreds and thousands of relations, and billions of relational tuples. Collective factorization methods have been shown to scale up to these large multi-relational datasets, in particular in the form of tensor approaches that can exploit the highly scalable alternating least squares (ALS) algorithms for calculating the factors. In this paper, we extend the recently proposed state-of-the-art RESCAL tensor factorization to consider relational type-constraints. Relational type-constraints explicitly define the logic of relations by excluding entities from the subject or object role. In addition, we show that, in the absence of prior knowledge about type-constraints, local closed-world assumptions can be approximated for each relation by ignoring unobserved subject or object entities in a relation. In our experiments on representative large datasets (Cora, DBpedia) that contain up to millions of entities and hundreds of type-constrained relations, we show that the proposed approach is scalable. It also significantly outperforms RESCAL without type-constraints in both runtime and prediction quality.
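
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard RESCAL scoring form (scores for relation k given by A R_k A^T, with A holding one latent vector per entity), uses random factors in place of ALS-fitted ones, and all function and variable names are hypothetical. It shows (1) approximating a local closed-world assumption by taking each relation's allowed subjects and objects to be the entities observed in that role, and (2) scoring only the resulting sub-block instead of all entity pairs.

```python
import numpy as np

def local_closed_world(triples, num_relations):
    """Approximate each relation's domain/range from observed triples.

    Entities never seen as subject (object) of relation k are ignored
    for that role. This is an illustrative approximation only.
    """
    domains = [set() for _ in range(num_relations)]
    ranges = [set() for _ in range(num_relations)]
    for s, k, o in triples:
        domains[k].add(s)
        ranges[k].add(o)
    to_array = lambda ids: np.array(sorted(ids), dtype=int)
    return [to_array(d) for d in domains], [to_array(r) for r in ranges]

def constrained_scores(A, R_k, subj_idx, obj_idx):
    """RESCAL-style scores restricted to allowed subjects and objects."""
    return A[subj_idx] @ R_k @ A[obj_idx].T

# Toy data: 4 entities, 2 relations, triples as (subject, relation, object).
triples = [(0, 0, 2), (1, 0, 2), (2, 1, 3)]
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))     # entity factors (random stand-in, not fitted)
R = rng.normal(size=(2, 2, 2))  # one rank x rank core matrix per relation

domains, ranges = local_closed_world(triples, num_relations=2)
scores = constrained_scores(A, R[0], domains[0], ranges[0])
```

For relation 0 in this toy example, the constrained score matrix covers 2 subjects and 1 object rather than all 4 x 4 entity pairs, which hints at why restricting to type-compatible entities can reduce the work per relation.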
Author_xml – sequence: 1
  givenname: Denis
  surname: Krompass
  fullname: Krompass, Denis
  email: Denis.Krompass@campus.lmu.de
  organization: Ludwig Maximilian Univ., Munich, Germany
– sequence: 2
  givenname: Maximilian
  surname: Nickel
  fullname: Nickel, Maximilian
  email: mnick@mit.edu
  organization: Massachusetts Inst. of Technol., Cambridge, MA, USA
– sequence: 3
  givenname: Volker
  surname: Tresp
  fullname: Tresp, Volker
  email: Volker.Tresp@siemens.com
  organization: Corp. Technol., Siemens AG, Munich, Germany
Discipline Computer Science
EISBN 9781479969913
1479969915
URI https://ieeexplore.ieee.org/document/7058046