91 to 100 of 167 Results
Feb 24, 2023 - Ground truth data for HTR on South Asian Scripts
Tübingen University Library, 2023, "Ground Truth data for printed Malayalam", https://doi.org/10.11588/data/L2KRZO, heiDATA, V1
Ground Truth (GT) data (JPG, PAGE and ALTO XML files) which can be used to train OCR models that recognize printed text in Malayalam script. The training material is gathered from 19th- and 20th-century prints. The GT data was trained in Transkribus with the HTR+ and the PyLaia...
Oct 26, 2022 - Ground truth data for HTR on South Asian Scripts
Merkel-Hilf, Nicole, 2022, "Ground Truth data for printed Devanagari", https://doi.org/10.11588/data/EGOKEI, heiDATA, V1
Ground truth (GT) data (JPG and ALTO XML files) for an OCR model that recognizes printed text in Devanagari script. The GT data was trained on Transkribus with the HTR+ engine. The training was performed on approx. 220 pages with approx. 27,000 words. The validation set was 10% of th...
Sep 2, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Wiegand, Michael, 2019, "GermEval-2018 Corpus (DE)", https://doi.org/10.11588/data/0B5VML, heiDATA, V1
This dataset comprises the training and test data (German tweets) from the GermEval 2018 Shared Task on Offensive Language Detection.
Jan 20, 2021 - Empirical Linguistics and Computational Language Modeling (LiMo)
van den Berg, Esther; Korfhage, Katharina; Ruppenhofer, Josef; Wiegand, Michael; Markert, Katja, 2020, "German Twitter Titling Corpus", https://doi.org/10.11588/data/AOSUY6, heiDATA, V2, UNF:6:14BxjwJS7Q3mfI6ei7iBBw== [fileUNF]
The German Titling Twitter Corpus consists of 1904 stance-annotated tweets collected in June/July 2018 mentioning 24 German politicians with a doctoral degree. The Addendum contains an additional 296 stance-annotated tweets from each month of 2018 mentioning 10 politicians with a...
Mar 26, 2020 - Empirical Linguistics and Computational Language Modeling (LiMo)
Rehbein, Ines; Ruppenhofer, Josef, 2020, "German causal language annotations and lexicon (verbs, nouns, prepositions) (DE)", https://doi.org/10.11588/data/ZHI94V, heiDATA, V1
Annotations of causal verbs, nouns and prepositions in context and lexicon file for causal verbs, nouns and prepositions.
Dec 10, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Becker, Maria, 2019, "GER_SET: Situation Entity Type labelled corpus for German", https://doi.org/10.11588/data/BBQYD0, heiDATA, V1
Semantic clause types, also called Situation Entity (SE) types (Smith, 2003) are linguistic characterizations of aspectual properties shown to be useful for tasks like argumentation structure analysis (Becker et al., 2016), genre characterization (Palmer and Friedrich, 2014), and...
Dec 17, 2015 - Universitätsbibliothek Heidelberg
Antretter, Marlene; Eller, Dirk; Elstermann, Hannes; Geissler, Stefan; Grüning, Simon; Horn, Sebastian; Kohler, Matthias; Kuck, Kevin; Lingnau, Anna; Nozik, Alexandra; Odenwald, Jakob; Rieger, Felix; Schubert, Christopher; Wenze, Felix; Zimmermann, Karin, 2015, "Georeferencing of the "Lorscher Codex"", https://doi.org/10.11588/data/10063, heiDATA, V1
Collection of historic place names from the Lorscher Codex (1176-1200). Most of the (deserted) villages and cities named in the codex were in the possession of Lorsch Abbey, and many of them are mentioned for the first time here. In the project the property should be visualised in mo...
Oct 22, 2019 - Empirical Linguistics and Computational Language Modeling (LiMo)
Becker, Maria, 2019, "Genre-sensitive Neural Situation Entity classifier (DE, EN)", https://doi.org/10.11588/data/XXKWU0, heiDATA, V1
This is a classifier for situation entity types as described in Becker et al., 2017. These clause types depend on a combination of syntactic-semantic and contextual features. We explore this task in a deep learning framework, where tuned word representations capture lexical, synta...
Feb 16, 2018 - Cluster of Excellence - Asia and Europe in a Global Context
Franziska Koch, 2018, "GECCA mapped", https://doi.org/10.11588/data/6JTPWL, heiDATA, V1
GECCA mapped is a pilot project that visualizes and provides geo-referential metadata of sixty exhibition entries collected in the larger GECCA database (more than 700 entries). The exhibition sample is limited to Western, i.e. Western European and North American group exhibi...
Oct 10, 2017 - Propylaeum@heiDATA
Frommer, Sören, 2017, "Gammertingen, St. Michael. Auswertung der archäologischen Ausgrabungen insbesondere unter herrschafts-, siedlungs- und landesgeschichtlicher Fragestellung [Dataset]", https://doi.org/10.11588/data/MHGXU6, heiDATA, V1
The data collection comprises the database of features and finds for the archaeological analysis, published as a monograph, of the excavation in the St. Michael's Chapel in Gammertingen. It consists of an Access database with 25 relationally linked tables, a Stratify...