UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte

Depressed Dormouse

Code: Whatever

2021-07-05 06:14:49

import pandas as pd
data = pd.read_csv(filename, encoding='unicode_escape')
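
One caveat worth checking before relying on 'unicode_escape': the codec decodes bytes as Latin-1 and also interprets backslash sequences, so data that contains literal backslashes can change. A small sketch with made-up sample bytes (not taken from the original file):

```python
# Made-up sample: a Windows-style path followed by an accented name in Latin-1 bytes.
raw = b"C:\\temp\\notes Ren\xe9"

escaped = raw.decode("unicode_escape")  # "\t" and "\n" become a tab and a newline
plain = raw.decode("latin-1")           # backslashes are kept as they are

print(repr(escaped))
print(repr(plain))
```

If the file has no backslashes the two results should match, in which case 'latin-1' is probably the more direct choice.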

James K

Code: Whatever

2021-01-23 10:26:42

import pandas as pd
movies = pd.read_csv('ml-100k/u.item', sep='|', names=m_cols, encoding='latin-1')  # m_cols: list of column names defined elsewhere

Andrey Morozov

Code: Whatever

2021-01-23 10:27:53

# Use 'ISO-8859-1' instead of 'utf-8' for decoding
with open(fn, 'rb') as f:  # the with block closes the file afterwards
    text = f.read().decode('ISO-8859-1')

chigusa

Code: Whatever

2021-01-23 10:25:09

with open(file, newline='', encoding="utf16") as MyFile:  # applies only if the file really is UTF-16
    text = MyFile.read()

FAF5CD5358EF24DC

Code: Whatever

2021-01-23 10:25:59

As suggested by Mark Ransom, I found the right encoding for this file.
It was "ISO-8859-1", so replacing

open("u.item", encoding="utf-8")
with
open("u.item", encoding="ISO-8859-1")

solves the problem.
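
A minimal sketch of why this works, assuming the file stores é as the single byte 0xE9 (the sample text below is made up for illustration). In ISO-8859-1 that byte is a complete character, while UTF-8 treats 0xE9 as the start of a three-byte sequence and rejects it when the next byte is not a continuation byte:

```python
# Hypothetical sample: "café au lait" saved as ISO-8859-1, so é is the single byte 0xE9.
raw = "caf\u00e9 au lait".encode("ISO-8859-1")

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc  # 0xE9 opens a 3-byte sequence, but a space (0x20) follows

text = raw.decode("ISO-8859-1")  # every byte maps to exactly one character

print(utf8_error)
print(text)
```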

Dave

Code: Whatever

2021-01-23 10:24:03

pd.read_csv("C:/Users/Admin/Desktop/Python/Past.csv", encoding='cp1252')

Stanley

Code: Whatever

2021-01-23 10:27:19

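# Open in binary mode ('rb') instead of text mode ('r'). Nothing is decoded,
# so read() returns bytes that still have to be decoded with the right encoding.
with open(path, 'rb') as f:
    raw = f.read()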
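
Reading in binary mode only postpones the decoding step. One possible way to finish it, sketched here with a hypothetical helper decode_with_fallback and an assumed list of candidate encodings, is to try UTF-8 first and then fall back to a single-byte encoding:

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes raw.

    Hypothetical helper; the candidate order is an assumption, not a rule.
    latin-1 accepts every byte value, so it acts as a last resort.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up sample bytes standing in for f.read() on a file opened with 'rb'.
sample = b"Caf\xe9 Society"
text, used = decode_with_fallback(sample)
print(used, text)
```

A successful decode does not prove the guess is right, so it is still worth looking at the accented characters in the result.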
