Metrics
33,620 Downloads
31 to 40 of 48 Results
Sep 29, 2025 - Language Technologies Laboratory
Ruiz-Fernández, Valle; Gonzalez-Agirre, Aitor; Villegas, Marta; Falcão, Júlia; Vasquez Reina, Luis Antonio, 2025, "CaBBQ", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/1OO2M0, BSC Dataverse, V1
CaBBQ is the Catalan adaptation of the BBQ benchmark, adjusted to the Catalan language and the social context of Spain. It aims to evaluate social bias in language models via a multiple-choice QA task, following the same 10 social categories as EsBBQ.
Sep 29, 2025 - Language Technologies Laboratory
Ruiz-Fernández, Valle; Gonzalez-Agirre, Aitor; Falcão, Júlia; Vasquez Reina, Luis Antonio; Villegas, Marta, 2025, "EsBBQ", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/MJGCT3, BSC Dataverse, V1
EsBBQ is an adaptation of the original BBQ benchmark to Spanish and the Spanish social context. It is used to evaluate social bias in language models via a multiple‑choice question answering task along 10 social categories (Age, Disability, Gender, LGBTQIA, Nationality, Physical Appearance, Race/Ethnicity, Religion, Socioeconomic Status, Spanis...
Sep 29, 2025 - Life Sciences
Filella Merce, Isaac; Guallar, Victor; Isaac Soul Garcia, 2025, "Composite Database of Ultra-Large Chemical Libraries", https://doi.org/10.82201/6043UT, BSC Dataverse, V1
The Composite Database consists of approximately 120 billion molecules sourced from five ultra-large (> 100 million compounds) and nine large publicly available chemical libraries. Developed to support early-stage drug discovery, it is the largest publicly available database of enlisted molecules, readily accessible for efficient analog searches an...
Sep 18, 2025 - Language Technologies Laboratory
Hernández Mena, Carlos Daniel; Armentano i Oller, Carme, 2025, "cv17_es_other_automatically_verified", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/2DFJYA, BSC Dataverse, V1
"cv17_es_other_automatically_verified" is, as the name suggests, the result of the automatic validation of the "other" portion of Common Voice 17.0. The validation process was carried out using OpenAI's Whisper large model. If Whisper produces the same text as the Common Voice prompt, the transcription is considered valid regardless of its votes....
Sep 18, 2025 - Language Technologies Laboratory
Külebi, Baybars, 2025, "parlament_parla", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/RBGNSJ, BSC Dataverse, V1
This is the ParlamentParla speech corpus of more than 600 hours of speech from Catalan Parliament sessions. The audio segments were extracted from recordings of the Catalan Parliament (Parlament de Catalunya) plenary sessions, which took place between 2007/07/11 and 2018/07/17. We aligned the transcriptions with the recordings and extracted the corpus....
Sep 18, 2025 - Language Technologies Laboratory
Armentano i Oller, Carme; Hernández Mena, Carlos Daniel; Külebi, Baybars, 2025, "commonvoice_benchmark_catalan_accents", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/WOYNOE, BSC Dataverse, V1
This is a new presentation of the corpus Catalan Common Voice v17 - metadata annotated version with the splits redefined to benchmark ASR models with various Catalan accents: From the validated recording split, we have selected, for each of the main accents of the language (balearic, central, northern, northwestern, valencian), the necessary male a...
Sep 18, 2025 - Language Technologies Laboratory
Solito, Sarah; Messaoudi, Abir; Külebi, Baybars, 2025, "parlament_parla_v3", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/JSUTRR, BSC Dataverse, V1
'parlament_parla_v3' is a speech corpus composed of Catalan Parliamentary Sessions. The v3 and last version of the corpus includes both clean and other quality segments, divided into short segments (less than 30 seconds) and long segments (more than 30 seconds). The total dataset encompasses 1059h 48m 04s of speech, including 945h 51m 06s for the sh...
Sep 18, 2025 - Earth Sciences
Bowdalo, Dene, 2025, "GHOST: A globally harmonised dataset of surface atmospheric composition measurements", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/1YNJTT, BSC Dataverse, V1
GHOST (Globally Harmonised Observations in Space and Time) represents one of the biggest collections of harmonised measurements of atmospheric composition at the surface. In total, 7,275,148,646 measurements from 1970-2023, of 227 different components, from 38 reporting networks, are compiled, parsed, and standardised. Components processed include g...
Sep 17, 2025 - Earth Sciences
Di Tomaso, Enza, 2025, "MONARCH high-resolution reanalysis data set of desert dust aerosol over Northern Africa, the Middle East and Europe", https://doi.org/10.82201/1APRWJ, BSC Dataverse, V1
This repository contains a high-resolution regional reanalysis data set of desert dust aerosols. It covers Northern Africa, the Middle East and Europe along with the Mediterranean Sea and parts of Central Asia, and the Atlantic and Indian Oceans between 2007 and 2016 at a horizontal resolution of 0.1° latitude × 0.1° longitude on a rotated grid, an...
Sep 15, 2025 - Earth Sciences
Bretonnière, Pierre-Antoine, 2025, "CORDEX data", https://dataverse.bsc.es/dataset.xhtml?persistentId=perma:BSC/EDVBJN, BSC Dataverse, V1
This is a partial replica of the "CORDEX" data (Coordinated Regional Climate Downscaling Experiment: https://cordex.org/), hosted at and downloadable from https://esgf-metagrid.cloud.dkrz.de/search?project=CORDEX . The data include multiple models and experiments from different downscaling domains, at different frequencies and for different variables.