|
Persistent Identifier
|
perma:BSC/AQRKOH |
|
Publication Date
|
2025-06-26 |
|
Title
| AMSMB HTR model for Medieval Notarial Manuscripts |
|
Alternative Title
| AMSMB HTR model |
|
Author
| Berganzo-Besga, Iban (ROR: https://ror.org/05sd8tv96; ORCID: https://orcid.org/0000-0002-6161-2452)
Coll Ardanuy, Mariona (ROR: https://ror.org/05sd8tv96; ORCID: https://orcid.org/0000-0001-8455-7196) |
|
Point of Contact
|
Iban Berganzo-Besga (Barcelona Supercomputing Center) |
|
Description
| AMSMB HTR is a handwritten text recognition model trained with Kraken. It uses the Tridis v1 model as its base and is fine-tuned on the AMSMB HTR dataset (the model is named ft:v1 in the paper). Details on how the model was trained are reported in our GitLab repository and in our paper (see citation information below). (2025-06-25) |
|
Subject
| Arts and Humanities |
|
Keyword
| handwritten text recognition
Late Middle Ages
automatic transcription |
|
Topic Classification
| medieval history
artificial intelligence
archival history
diplomatics
paleography
document analysis and recognition |
|
Related Publication
| Is Supplement To: Mariona Coll Ardanuy, Iban Berganzo-Besga, Ramon Sarobe, and Coral Cuadrada. 2025 (forthcoming). Evaluating Handwritten Text Recognition in Medieval Notarial Manuscripts: A New Dataset and Comprehensive Analysis. In International Conference on Document Analysis and Recognition. |
|
Notes
| Related Datasets: AMSMB HTR dataset: https://dataverse.bsc.es/citation?persistentId=perma:BSC/0VB0MC |
|
Language
| Latin; Catalan, Valencian |
|
Producer
| Barcelona Supercomputing Center (BSC) https://bsc.es/ |
|
Production Date
| 2025-03-14 |
|
Production Location
| Barcelona |
|
Contributor
| Other: Arxiu dels Marquesos de Santa Maria de Barberà
Other: Arxiu Municipal de Vilassar de Dalt |
|
Funding Information
| AI4S fellowship within the “Generación D” initiative by Red.es, Ministerio para la Transformación Digital y de la Función Pública, for talent attraction, funded by NextGenerationEU through PRTR: C005/24-ED CV1 |
|
Distributor
| Barcelona Supercomputing Center (BSC) https://bsc.es/ |
|
Distribution Date
| 2025-06-25 |
|
Depositor
| Coll Ardanuy, Mariona |
|
Deposit Date
| 2025-06-25 |
|
Time Period
| Start Date: 1208-01-01; End Date: 1499-12-31 |
|
Date of Collection
| Start Date: 2024-08-01; End Date: 2025-03-14 |
|
Data Type
| Kraken handwritten text recognition model (.mlmodel) |
|
Software
| python, Version: 3.12.9
kraken, Version: 5.3.0
ketos, Version: 5.3.0 |
|
Origin of Historical Sources
| Arxiu dels Marquesos de Santa Maria de Barberà: https://arxiumarquesosdebarbera.cat/ |
|
Documentation and Access to Sources
| The digitized images used to train this model were provided by the Arxiu Municipal de Vilassar de Dalt (AMVD), through its agreement with the Arxiu dels Marquesos de Santa Maria de Barberà (AMSMB). For further details, see the dataset datasheet at: https://dataverse.bsc.es/citation?persistentId=perma:BSC/0VB0MC. |
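
The Data Type and Software fields above describe a Kraken recognition model (.mlmodel) produced with kraken 5.3.0. As a minimal sketch of how such a model might be applied to a page image, the Python helper below assembles a Kraken command line that runs segmentation followed by recognition. The helper `build_kraken_command` and the file names `page_001.jpg`, `amsmb_htr.mlmodel`, and `page_001.txt` are hypothetical (this record does not state the model's file name), and the choice of baseline segmentation is an assumption rather than something documented here.

```python
import shlex
from pathlib import Path


def build_kraken_command(image, model, output):
    """Return the argument list for segmenting and transcribing one page image."""
    return [
        "kraken",
        "-i", str(image), str(output),  # input image and output text file
        "segment", "-bl",               # baseline segmentation (assumed)
        "ocr", "-m", str(model),        # recognition with the given model file
    ]


if __name__ == "__main__":
    # Hypothetical file names, used for illustration only.
    cmd = build_kraken_command(
        Path("page_001.jpg"),
        Path("amsmb_htr.mlmodel"),
        Path("page_001.txt"),
    )
    print(shlex.join(cmd))
```

The printed command can be run in a shell where kraken is installed; passing the same list to `subprocess.run` would execute it directly.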