Baselight

Hindi Audio Speech to Text, or Vice Versa

Hindi audio speech-to-text data for fine-tuning speech models for the Hindi language.

@kaggle.rohansinghjadoan_hindi_audio_speech_detection

About this Dataset

🗂️ Dataset Schema Description

This dataset provides paired audio and text samples for training and evaluating speech recognition (speech-to-text) or text-to-speech models. Each record describes one audio clip: its identifiers, language, duration, and URLs pointing to the audio file, its transcription, and its metadata.

Fields:

user_id – An anonymized identifier representing the speaker or user who provided the audio recording.

recording_id – A unique identifier assigned to each audio sample in the dataset.

language – ISO language code of the spoken audio (e.g., "hi" for Hindi, "en" for English).

duration – The total length of the audio clip in seconds, stored as an integer. Useful for filtering out very long or very short samples and for grouping clips of similar length into batches during model training.

rec_url_gcp – Direct URL to the raw audio file stored on cloud infrastructure (e.g., Google Cloud Storage). This serves as the main input for model training or inference.

transcription_url_gcp – URL to the ground-truth transcription text for the audio file. This is the label (target text) for supervised learning tasks.

metadata_url_gcp – URL to additional metadata about the recording (e.g., device type, accent, background noise, recording conditions). Not required for training, but useful for analysis, robustness testing, and domain adaptation.
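As a minimal sketch of how the duration field might be used for filtering and batching, the Python below models one record and groups clips into length buckets. The column names follow the table definition on this page; the sample rows, the example.com URLs, the 1 to 30 second window, and the 10 second bucket width are all illustrative assumptions, not values taken from the dataset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Recording:
    """One row of the table; field names mirror the published columns."""
    user_id: int
    recording_id: int
    language: str
    duration: int  # clip length in whole seconds
    rec_url_gcp: str
    transcription_url_gcp: str
    metadata_url_gcp: str


def filter_by_duration(records, min_s=1, max_s=30):
    """Keep clips whose length lies in [min_s, max_s] seconds (assumed limits)."""
    return [r for r in records if min_s <= r.duration <= max_s]


def bucket_by_duration(records, bucket_s=10):
    """Group clips of similar length so a batch needs less padding."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r.duration // bucket_s].append(r)
    return dict(buckets)


def make_row(user_id, recording_id, duration):
    """Build a hypothetical row; the URLs are placeholders, not real files."""
    base = f"https://example.com/{recording_id}"
    return Recording(user_id, recording_id, "hi", duration,
                     base + ".wav", base + ".txt", base + ".json")


rows = [make_row(1, 101, 4), make_row(1, 102, 12), make_row(2, 103, 95)]

kept = filter_by_duration(rows)     # drops the 95-second clip
buckets = bucket_by_duration(kept)  # bucket 0: 0-9 s, bucket 1: 10-19 s
```

Bucketing by length is one common choice for speech data; sorting by duration before batching would serve the same purpose.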

💡 Usage

This dataset is ideal for:

Speech-to-text (ASR) model training and evaluation

Audio feature extraction and preprocessing

Multilingual speech research

Acoustic environment analysis and speaker variation studies
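For the evaluation use case, ASR output is commonly scored with word error rate (WER): the word-level edit distance between the reference transcription and the model output, divided by the number of reference words. The page does not prescribe a metric, so the sketch below is an assumption about how evaluation might be done. The function name and the two Hindi sentences are illustrative, and tokenization is plain whitespace splitting with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcription must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences, not taken from the dataset.
reference = "मैं घर जा रहा हूँ"
hypothesis = "मैं घर आ रहा हूँ"  # one substituted word out of five
score = word_error_rate(reference, hypothesis)
```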

Tables

Hindi Audio Detection

@kaggle.rohansinghjadoan_hindi_audio_speech_detection.hindi_audio_detection
  • 13.93 kB
  • 104 rows
  • 7 columns
CREATE TABLE hindi_audio_detection (
  "user_id" BIGINT,
  "recording_id" BIGINT,
  "language" VARCHAR,
  "duration" BIGINT,
  "rec_url_gcp" VARCHAR,
  "transcription_url_gcp" VARCHAR,
  "metadata_url_gcp" VARCHAR
);
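To try the schema locally, the table definition above can be loaded into an in-memory SQLite database and queried from Python. This is a sketch under stated assumptions: SQLite stands in for whatever engine actually hosts the table, the three inserted rows and their example.com URLs are made up for illustration, and the 30 second cutoff is arbitrary.

```python
import sqlite3

# In-memory database; SQLite accepts the BIGINT and VARCHAR type names as-is.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE hindi_audio_detection (
  "user_id" BIGINT,
  "recording_id" BIGINT,
  "language" VARCHAR,
  "duration" BIGINT,
  "rec_url_gcp" VARCHAR,
  "transcription_url_gcp" VARCHAR,
  "metadata_url_gcp" VARCHAR
)
""")

# Hypothetical rows: (user_id, recording_id, language, duration in seconds, URLs...)
sample_rows = [
    (1, 101, "hi", 4, "https://example.com/101.wav",
     "https://example.com/101.txt", "https://example.com/101.json"),
    (1, 102, "hi", 12, "https://example.com/102.wav",
     "https://example.com/102.txt", "https://example.com/102.json"),
    (2, 103, "hi", 95, "https://example.com/103.wav",
     "https://example.com/103.txt", "https://example.com/103.json"),
]
conn.executemany(
    "INSERT INTO hindi_audio_detection VALUES (?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# Count clips and total audio time per language, ignoring clips over the cutoff.
query = """
SELECT language, COUNT(*) AS clips, SUM(duration) AS total_seconds
FROM hindi_audio_detection
WHERE duration <= ?
GROUP BY language
ORDER BY language
"""
summary = conn.execute(query, (30,)).fetchall()
print(summary)
```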
