added column not getting saved in dataset

#78
by SameerMahajan - opened

I created a dataset and pushed it to hub using my code: https://github.com/sameermahajan/whisper/blob/main/CreateDataset.py

However, when I load it and try to access the added "labels" column like this:

from datasets import Dataset, Audio, concatenate_datasets, load_dataset

audio_dataset = load_dataset("SameerMahajan/marathi_numbers-1-20")
print (audio_dataset["labels"])

I get this error:

Traceback (most recent call last):
  File "C:\ML\Tables\whisper\try.py", line 4, in <module>
    print (audio_dataset["labels"])
  File "C:\Users\GS-1316\AppData\Local\Programs\Python\Python310\lib\site-packages\datasets\dataset_dict.py", line 58, in __getitem__
    return super().__getitem__(k)
KeyError: 'labels'

I can access the column without problems during in-memory processing, as in: https://github.com/sameermahajan/whisper/blob/main/Retrain.py

Any ideas?

Hey @SameerMahajan - this kind of question is probably better suited to the Hugging Face forum under the "Datasets" category: https://discuss.huggingface.co

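I can see from your dataset that the "labels" column is present, so you should be able to load it. When no split is specified, load_dataset returns a DatasetDict keyed by split name rather than by column, so you need to select the "train" split first and then the column: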

from datasets import load_dataset

audio_dataset = load_dataset("SameerMahajan/marathi_numbers-1-20")
# Index by split name first, then by column name
print(audio_dataset["train"]["labels"])
