Error while loading dataset from HF into Kaggle notebook

#2
by Someet24 - opened

"data = load_dataset("datadrivenscience/movie-genre-prediction", use_auth_token=True)"

While loading the dataset, I got the following error:

"UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 7: invalid start byte"
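For context, Python raises this message whenever bytes that are not valid UTF-8 are decoded as text; byte 0x80 can never start a UTF-8 sequence. The sketch below reproduces the same wording with made-up bytes. The `sample` value and the helper name `decode_error_message` are hypothetical; the thread does not identify which file in the dataset actually triggered the error.

```python
def decode_error_message(raw: bytes) -> str:
    """Return the UnicodeDecodeError text for `raw`, or '' if it decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)
    return ""


# Hypothetical bytes: seven ASCII characters followed by 0x80 at index 7.
sample = b"1234567\x80"
print(decode_error_message(sample))
```

An error like this usually means a binary or differently encoded file was read as UTF-8 text, which is consistent with the maintainers fixing the dataset rather than the user's code.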

Competitions org

can you try again? we made a fix to the dataset

Competitions org

also, please use the latest datasets version

raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({type(e).__name__})")
ConnectionError: Couldn't reach 'datadrivenscience/movie-genre-prediction' on the Hub (SSLError)

@abhishek - I am not able to access the data.

Competitions org

Did you go to the dataset page and submit an access request? Did you use your token?

@abhishek Yes, I have tried accessing the data with my token, which I created on Hugging Face. Surprisingly, I am now able to access the data through load_dataset from the datasets library in Google Colab. Why am I not able to access the data on my local PC?

Competitions org

@SSwaminathan Glad to know that it worked in colab.
Your concern mentioned here: https://huggingface.co/spaces/competitions/movie-genre-prediction/discussions/4 is already being looked at internally.
Please don't create multiple posts for the same problem as it creates confusion :)

@abhishek Sure. My intention here is not to create multiple posts. My first post was related to the Hugging Face login issue and the second one was about the data. Since there was already a thread related to the data, I tried using that route.

Competitions org

Do you get an error while accessing the data?

image.png

Following image is from my local PC.

Competitions org

It seems like the same SSL error mentioned in the other post :)
We are looking into it!

Sure. Thanks

can you try doing:

pip install --upgrade certifi requests

and then try loading the dataset again?
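To narrow down an SSLError like the one above, it can help to see which CA certificate bundle Python is using and whether a plain TLS handshake with the Hub succeeds. This is a minimal diagnostic sketch, not a confirmed fix from the thread. The helper names `ca_bundle_info` and `tls_handshake_ok` are hypothetical, and `certifi` is treated as optional.

```python
import socket
import ssl


def ca_bundle_info():
    """Report where Python looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    info = {"openssl_cafile": paths.cafile, "openssl_capath": paths.capath}
    try:
        import certifi  # optional; requests uses this bundle by default
        info["certifi_bundle"] = certifi.where()
    except ImportError:
        info["certifi_bundle"] = None
    return info


def tls_handshake_ok(host, port=443, timeout=5.0):
    """Return True if a certificate-verified TLS handshake with `host` succeeds."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # covers ssl.SSLError, DNS failures, and timeouts
        return False


print(ca_bundle_info())
```

Calling `tls_handshake_ok("huggingface.co")` on the affected machine and on Colab would show whether the failure is specific to the local network or certificate setup.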

@abhishek The same issue still persists.

image.png

@abhishek Now I am having the issue in Google Colab too.

image.png

I am having the same issue on Kaggle. How can I solve it?

Image 16-06-2023 at 15.33 (1).jpeg
Image 16-06-2023 at 15.33.jpeg

Competitions org

You need to update datasets to the latest version.
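A quick way to confirm which version is actually installed in the notebook is to query the package metadata. This is a small sketch; the helper name `installed_version` is hypothetical, and the thread does not state which version contains the fix.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# After running `pip install --upgrade datasets`, restart the notebook kernel
# so that the upgraded package is the one that gets imported.
print("datasets:", installed_version("datasets"))
```

If the printed version does not change after upgrading, the notebook is likely still using the previously imported package until the kernel is restarted.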

OK, thank you, it works now.

@urielnguefack I am getting the same error. I also updated datasets.
What did you do to resolve the problem?
