Barcelona Supercomputing Center

Metrics

33,620 Downloads

The BSC Dataverse is the institutional research data repository of the Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS). It aims to enable the storage, sharing, and discovery of research data produced by BSC researchers, collaborators, and affiliated projects.

Computational Social Sciences & Humanities Dataverse

Earth Sciences

Life Sciences

BSC AI Factory

Red Española de Supercomputación

BSC Dissemination


31 to 40 of 107 Results


Towards an open-access dataset of the flow over realistic urban geometries: high-fidelity simulations and validation

Nov 3, 2025 - RES Users' Conference 2025

Duró Diaz, Josep Maria, 2025, "Towards an open-access dataset of the flow over realistic urban geometries: high-fidelity simulations and validation", https://doi.org/10.82201/CWCU4Q, BSC Dataverse, V1

This work presents the development of a high-resolution, open-access dataset of urban airflow over a realistic district in Barcelona, based on large-eddy simulations (LES) performed for 16 different wind directions. The simulations are conducted over a highly detailed computational domain that faithfully reproduces the real urban geometry, using gr...

(Data Records) A Decade of News Forum Interactions: Threaded Conversations, Signed Votes, and Topical Tags

Nov 3, 2025 - A Decade of DerStandard Forum Interactions

Fraxanet Morales, Emma; Gómez, Vicenç; Kaltenbrunner, Andreas; Pellert, Max, 2025, "(Data Records) A Decade of News Forum Interactions: Threaded Conversations, Signed Votes, and Topical Tags", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/P32CXW, BSC Dataverse, V2, UNF:6:MmzkAl6KMTPYXLdJYALuKw== [fileUNF]

This dataset contains the full set of data records described in the "A Decade of News Forum Interactions: Threaded Conversations, Signed Votes, and Topical Tags" publication. It includes the data records for User-level metadata, Comment-level data, Voting behavior, Article metadata, and Pre-computed text embeddings. Additionally, it includes: Annot...

ES-OC_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Aleix Sant Savall; Melero, Maite; Villegas, Marta, 2025, "ES-OC_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/TTRGIC, BSC Dataverse, V2

The ES-OC Parallel Corpus is a synthetic Spanish-Aranese dataset created to support the use of under-resourced languages from Spain, such as Aranese, in NLP tasks, specifically Machine Translation. Aranese is a variant of the Occitan language spoken in the Val d'Aran, Spain, where it is recognised as a co-official language. The dataset can be used...

ES-AST_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Aleix Sant Savall; Melero, Maite; Villegas, Marta, 2025, "ES-AST_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/2BK1NZ, BSC Dataverse, V2

The ES-AST Parallel Corpus is a Spanish-Asturian dataset created to support the use of under-resourced languages from Spain, such as Asturian, in NLP tasks, specifically Machine Translation. This dataset aggregates both synthetic and authentic data, and can be used to train Bilingual Machine Translation models between Asturian and Spanish in any di...

ES-AN_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Aleix Sant Savall; Melero, Maite; Villegas, Marta, 2025, "ES-AN_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/ELCJXZ, BSC Dataverse, V2

The ES-AN Parallel Corpus is a mainly synthetic Spanish-Aragonese dataset created to support the use of under-resourced languages from Spain, such as Aragonese, in NLP tasks, specifically Machine Translation. The dataset can be used to train Bilingual Machine Translation models between Aragonese and Spanish in any direction, as well as Multilingual...

CA-ZH_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Melero, Maite; Villegas, Marta; Liao, Xixian, 2025, "CA-ZH_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/MFXKNG, BSC Dataverse, V2

The CA-ZH Parallel Corpus is a Catalan-Chinese textual dataset created to support Catalan in NLP tasks, specifically Machine Translation. The dataset is structured at the sentence level and can be used to train Bilingual Machine Translation models between Chinese and Catalan in any direction, as well as Multilingual Machine Translation models. The...

CA-IT_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Mash, Audrey; Melero, Maite; Villegas, Marta, 2025, "CA-IT_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/BTMN1V, BSC Dataverse, V2

The CA-IT Parallel Corpus is a Catalan-Italian textual dataset created to support Catalan in NLP tasks, specifically Machine Translation. The dataset is structured at the sentence level and can be used to train Bilingual Machine Translation models between Italian and Catalan in any direction, as well as Multilingual Machine Translation models.

CA-FR_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Mash, Audrey; Melero, Maite; Villegas, Marta, 2025, "CA-FR_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/V7AZKB, BSC Dataverse, V2

The CA-FR Parallel Corpus is a Catalan-French textual dataset created to support Catalan in NLP tasks, specifically Machine Translation. The dataset is structured at the sentence level and can be used to train Bilingual Machine Translation models between French and Catalan in any direction, as well as Multilingual Machine Translation models.

CA-DE_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Mash, Audrey; Melero, Maite; Villegas, Marta, 2025, "CA-DE_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/DYAZII, BSC Dataverse, V2

The CA-DE Parallel Corpus is a Catalan-German textual dataset created to support Catalan in NLP tasks, specifically Machine Translation. The dataset is structured at the sentence level and can be used to train Bilingual Machine Translation models between German and Catalan in any direction, as well as Multilingual Machine Translation models.

CA-PT_Parallel_Corpus

Nov 3, 2025 - Language Technologies Laboratory

De Luca Fornaciari, Francesca; Mash, Audrey; Melero, Maite; Villegas, Marta, 2025, "CA-PT_Parallel_Corpus", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/XYBQ8Q, BSC Dataverse, V2

The CA-PT Parallel Corpus is a Catalan-Portuguese textual dataset created to support Catalan in NLP tasks, specifically Machine Translation. The dataset is structured at the sentence level and can be used to train Bilingual Machine Translation models between Portuguese and Catalan in any direction, as well as Multilingual Machine Translation models...
