Negative Sampling for Learning Knowledge Graph Embeddings (doi:10.11588/data/YYULL2)

Document Description

Citation

Title:

Negative Sampling for Learning Knowledge Graph Embeddings

Identification Number:

doi:10.11588/data/YYULL2

Distributor:

heiDATA

Date of Distribution:

2019-08-19

Version:

1

Bibliographic Citation:

Kotnis, Bhushan, 2019, "Negative Sampling for Learning Knowledge Graph Embeddings", https://doi.org/10.11588/data/YYULL2, heiDATA, V1

Study Description

Citation

Title:

Negative Sampling for Learning Knowledge Graph Embeddings

Subtitle:

Analysis of the Impact of Negative Sampling on Link Prediction in Knowledge Graphs

Identification Number:

doi:10.11588/data/YYULL2

Authoring Entity:

Kotnis, Bhushan (Department of Computational Linguistics, Heidelberg University, Germany, 2016-2018; NEC Laboratories Europe GmbH, since 2018)

Date of Production:

2018

Distributor:

heiDATA

Date of Distribution:

2019-08-19

Study Scope

Keywords:

Arts and Humanities, Computer and Information Science, knowledge graphs, negative sampling, embedding models, link prediction

Topic Classification:

knowledge discovery in knowledge graphs

Abstract:

Reimplementation of four knowledge graph factorization methods and six negative sampling methods.

Knowledge graphs are large, useful, but incomplete knowledge repositories. They encode knowledge through entities and relations which define each other through the connective structure of the graph. This has inspired methods for the joint embedding of entities and relations in continuous low-dimensional vector spaces, which can be used to induce new edges in the graph, i.e., link prediction in knowledge graphs. Learning these representations relies on contrasting positive instances with negative ones. Knowledge graphs include only positive relation instances, leaving the door open for a variety of methods for selecting negative examples. In this paper we present an empirical study of the impact of negative sampling on the learned embeddings, assessed through the task of link prediction. We use state-of-the-art knowledge graph embeddings -- RESCAL, TransE, DistMult and ComplEx -- and evaluate on the benchmark datasets FB15k and WN18. We compare well-known methods for negative sampling and additionally propose embedding-based sampling methods. We note a marked difference in the impact of these sampling methods on the two datasets, with the "traditional" method of corrupting positives leading to the best results on WN18, while embedding-based methods benefit the task on FB15k.
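The "corrupting positives" strategy referenced in the abstract generates a negative example by replacing the head or tail entity of an observed triple with a randomly drawn entity, optionally skipping corruptions that happen to be known facts. The following is a minimal Python sketch of that idea; function and variable names, the toy entities, and the filtering option are illustrative and not taken from the archived kge-rl code.

    import random

    def corrupt_positive(triple, entities, known_positives=None):
        # Replace either the head or the tail of a positive
        # (head, relation, tail) triple with a random entity.
        head, rel, tail = triple
        while True:
            replacement = random.choice(entities)
            if random.random() < 0.5:
                candidate = (replacement, rel, tail)  # corrupt the head
            else:
                candidate = (head, rel, replacement)  # corrupt the tail
            # Optionally reject corruptions that are themselves known facts
            # ("filtered" sampling); a plain sampler would return right away.
            if candidate != triple and (known_positives is None
                                        or candidate not in known_positives):
                return candidate

    # Toy example (entities and facts are illustrative):
    entities = ["Paris", "France", "Berlin", "Germany"]
    facts = {("Paris", "capital_of", "France"),
             ("Berlin", "capital_of", "Germany")}
    print(corrupt_positive(("Paris", "capital_of", "France"), entities, facts))

Embedding-based samplers, which the abstract proposes in addition to this baseline, instead choose the replacement entity using the current model's scores rather than uniformly at random.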

Kind of Data:

program source code

Other Study-Related Materials

Label:

kge-rl-master.zip

Notes:

application/zip