Author Name: Heinzerling, Benjamin
Keyword Term: multilingual
Publication Year: 2019
Keyword Term: byte-pair encoding
Feb 6, 2019 - AIPHES
Heinzerling, Benjamin, 2019, "BPEmb: Pre-trained Subword Embeddings in 275 Languages (LREC 2018)", https://doi.org/10.11588/data/V9CXPR, heiDATA, V1
BPEmb is a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while r...