Jun 16, 2014 - Statistical Natural Language Processing Group
Sokolov, Artem; Jehl, Laura; Hieber, Felix; Ruppert, Eugen; Riezler, Stefan, 2014, "BoostCLIR: JP-EN Relevance Marked Patent Corpus", https://doi.org/10.11588/data/10001, heiDATA, V1
BoostCLIR is a bilingual (Japanese-English) corpus of patent abstracts extracted from the MAREC patent data and from the NTCIR PatentMT workshop collections, accompanied by relevance judgements for the task of patent prior-art search. Important: The English side of t... |
Jun 16, 2014 -
BoostCLIR: JP-EN Relevance Marked Patent Corpus
Gzip Archive - 241.8 MB -
MD5: 35fde8d24e6e80bf932490549c991a3f
data set |
Jun 16, 2014 -
BoostCLIR: JP-EN Relevance Marked Patent Corpus
Plain Text - 1.5 KB -
MD5: 544fa4db045f692d07a7d4596da99741
README |
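The MD5 values listed with each file can be used to check that a download arrived intact. The following is a minimal sketch, not part of the original listing: the function names and the local file name `boostclir.tar.gz` are hypothetical, and only the expected digest is taken from the archive entry above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives never sit in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 with the value shown in the listing."""
    return md5_of_file(path) == listed_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local file name; the digest is the one listed for the
    # BoostCLIR Gzip archive.
    local_path = "boostclir.tar.gz"
    listed = "35fde8d24e6e80bf932490549c991a3f"
    try:
        print("checksum ok" if matches_listed_md5(local_path, listed) else "checksum mismatch")
    except FileNotFoundError:
        print("file not found:", local_path)
```

A mismatch usually points to an incomplete or corrupted download, so fetching the file again is the first thing to try. MD5 is adequate for detecting accidental corruption, but it is not a safeguard against deliberate tampering.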
Jun 16, 2014 - Statistical Natural Language Processing Group
Wäschle, Katharina; Riezler, Stefan, 2014, "PatTR: Patent Translation Resource", https://doi.org/10.11588/data/10002, heiDATA, V3
PatTR is a sentence-parallel corpus extracted from the MAREC patent collection. The current version contains more than 22 million German-English and 18 million French-English parallel sentences collected from all patent text sections as well as 5 million German-French sentence pa... |
Jun 18, 2014 - Statistical Natural Language Processing Group
Hieber, Felix; Schamoni, Shigehiko; Sokolov, Artem; Riezler, Stefan, 2014, "WikiCLIR: A Cross-Lingual Retrieval Dataset from Wikipedia", https://doi.org/10.11588/data/10003, heiDATA, V1
WikiCLIR is a large-scale (German-English) retrieval data set for Cross-Language Information Retrieval (CLIR). It contains a total of 245,294 German single-sentence queries with 3,200,393 automatically extracted relevance judgments for 1,226,741 English Wikipedia articles as docu... |
Jun 18, 2014 -
WikiCLIR: A Cross-Lingual Retrieval Dataset from Wikipedia
Plain Text - 1.8 KB -
MD5: f2d15639b962977ea19a20308bccbfc4
|
Jun 18, 2014 -
WikiCLIR: A Cross-Lingual Retrieval Dataset from Wikipedia
Gzip Archive - 846.8 MB -
MD5: 8f51894ff1c6ba2987d07dde62b3143d
data set |
Tabular Data - 9.4 KB - 20 Variables, 162 Observations - UNF:5:P2FZD04x5K8+kaZwkl17XA==
|
Tabular Data - 27.1 KB - 30 Variables, 366 Observations - UNF:5:Lvm+8FYirprlYzeXpqMblA==
|
Unknown - 204.1 KB -
MD5: 36ffde871d68b4025549ccc6349317af
Original z-Tree file |