python - Broken Korean strings when reading DataFrame from CSV
I am a Korean user.
When I read a .csv file into a pandas DataFrame, the Korean strings come out broken like this: �����
English text is fine.
Sample of the input info:
unnamed: 0 �������� �������ε����� ��x��ǥ ��y��ǥ �����ڵ� ������ ������������� ����Ǽ� �������� 0 165244 20131201 �ٻ�62175541 962170 1955410 331 �������� 1 2 18224.03
Why is the Korean text corrupted?
Your text needs to be decoded to Unicode. Assuming the file is UTF-8 encoded, you can decode each field like this:
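    import csv

    # Python 2: csv.reader yields byte strings, so each cell is decoded explicitly.
    def unicode_csv_reader(csv_file, delimiter=',', **kwargs):
        spamreader = csv.reader(csv_file, delimiter=delimiter, **kwargs)
        for row in spamreader:
            # Decode every cell from UTF-8 bytes to a unicode object.
            yield [unicode(w, 'utf-8') for w in row]

    # Replace 'your_file_name' and the delimiter with your own values.
    reader = unicode_csv_reader(open('your_file_name', 'rb'), delimiter=',')
    for tex in reader:
        print tex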
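For Python 3, where `unicode` no longer exists, here is a minimal sketch of the same idea using only the standard library. It rests on assumptions that are not in the original post: the file may be in the legacy Korean encoding CP949 rather than UTF-8, and the helper `read_csv_rows`, the candidate list, and the sample data are hypothetical names made up for illustration. The sketch also shows where the � characters come from: decoding non-UTF-8 bytes as UTF-8 substitutes U+FFFD for every invalid byte.

```python
import csv
import os
import tempfile

# Hypothetical candidate list: strict UTF-8 first, then the legacy Korean encoding.
CANDIDATE_ENCODINGS = ("utf-8", "cp949")


def read_csv_rows(path, encodings=CANDIDATE_ENCODINGS, delimiter=","):
    """Return (encoding, rows) for the first candidate that decodes the whole file."""
    for encoding in encodings:
        try:
            with open(path, newline="", encoding=encoding) as f:
                return encoding, list(csv.reader(f, delimiter=delimiter))
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError("none of the candidate encodings could decode " + path)


# Build a small CP949-encoded sample file so the sketch is self-contained.
sample = "이름,값\n서울,1\n"
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(sample.encode("cp949"))
    sample_path = tmp.name

# Decoding CP949 bytes as UTF-8 yields U+FFFD replacement characters.
with open(sample_path, "rb") as f:
    broken = f.read().decode("utf-8", errors="replace")

used_encoding, rows = read_csv_rows(sample_path)
os.remove(sample_path)
print(used_encoding, rows)
```

UTF-8 is tried first because it rejects invalid byte sequences, whereas a legacy encoding can silently accept bytes it was never meant for. With pandas, the equivalent step is to pass the encoding directly, for example `pd.read_csv(path, encoding="cp949")`, if the file does turn out to be CP949.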
python unicode pandas