The Statistical Natural Language Processing Group is part of the Department of Computational Linguistics.
Our research uses statistical learning techniques to address various aspects of the problem of the confusion of languages.
Research topics include the following:
  • Statistical machine translation, statistical parsing, question answering, information retrieval, learning-to-rank.
  • Statistical machine learning methods, especially unsupervised, semi-supervised and discriminative learning techniques.

Jun 16, 2014
Wäschle, Katharina; Riezler, Stefan, 2014, "PatTR: Patent Translation Resource", https://doi.org/10.11588/data/10002, heiDATA, V3
PatTR is a sentence-parallel corpus extracted from the MAREC patent collection. The current version contains more than 22 million German-English and 18 million French-English parallel sentences collected from all patent text sections as well as 5 million German-French sentence pa...
Gzip Archive - 234.3 MB - MD5: 3bd140f68ab0eefe239e3e893012c991
de-en
data set de-en, Part 1/3 (License information: see part 1)
Gzip Archive - 1.3 GB - MD5: 2d1336fe8eecd100c01488f5e3e9bc97
de-en
data set de-en, Part 2/3
Gzip Archive - 1.3 GB - MD5: b838211b8ddc04001d79f7e1e2e066cb
de-en
data set de-en, Part 3/3 (License information: see part 1)
Gzip Archive - 669.7 MB - MD5: bf9d77a06ebd10d50648c2c8d300c5e2
en-fr
data set en-fr, Part 1/3
Gzip Archive - 1.0 GB - MD5: 421d98c4fea4eebd076044acffd77095
en-fr
data set en-fr, Part 2/3 (License information: see part 1)
Gzip Archive - 628.3 MB - MD5: a4a327f7104842bbc86ccb6bdfbc229e
en-fr
data set en-fr, Part 3/3 (License information: see part 1)
Gzip Archive - 645.5 MB - MD5: 120484093f5f930fe8646eb3b3be76e3
fr-de
data set fr-de, Part 1/1
Plain Text - 4.4 KB - MD5: f5b9aca3616904df6183f95c2cb36454
README
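
Each file above is listed with an MD5 checksum, so a download can be checked for corruption before use. The following is a minimal sketch of such a check in Python, not an official tool of the repository. The function names `md5_of_file` and `verify_download` and the local file name `README` are assumptions made for illustration; the checksum string is the one listed for the README file above.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for archives over 1 GB.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the checksum from the listing."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path: assumes the README was saved under this name.
    readme = Path("README")
    if readme.exists():
        ok = verify_download(readme, "f5b9aca3616904df6183f95c2cb36454")
        print("checksum matches" if ok else "checksum mismatch")
    else:
        print("README not found in the current directory")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.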