python - Broken Korean strings when reading DataFrame from CSV




I am a Korean user.

When I read a .csv file into a pandas DataFrame, the Korean strings come out broken, like this: �����

English text is fine.

Here is a sample of the input as it is displayed:

unnamed: 0 �������� �������ε����� ��x��ǥ ��y��ǥ �����ڵ� ������ ����߻��������� ����Ǽ� �������� 0 165244 20131201 �ٻ�62175541 962170 1955410 331 �������� 1 2 18224.03

Why is the Korean text corrupted?

Your text is encoded, so it needs to be decoded into Unicode before use. Assuming the file is UTF-8, decode it as UTF-8:

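    import csv

    def unicode_csv_reader(csv_file, delimiter=',', **kwargs):
        # csv_file must be opened in text mode with the right encoding,
        # so every cell comes back as an already-decoded str.
        reader = csv.reader(csv_file, delimiter=delimiter, **kwargs)
        for row in reader:
            yield row

    # Python 3: decode while opening the file.
    # (Python 2 needed unicode(cell, 'utf-8') on each cell instead.)
    with open('your_file_name', encoding='utf-8', newline='') as f:
        for row in unicode_csv_reader(f):
            print(row)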
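If decoding as UTF-8 fails or still gives broken characters, the file may be in a different encoding. The question does not say which encoding the file uses, so what follows is only a sketch under an assumption: that the file was saved as CP949, which is common for Korean CSV exports on Windows. The sample rows and the helper names `encode_csv` and `read_csv_bytes` are made up for illustration. The sketch first shows how CP949 bytes turn into � when decoded as UTF-8, then tries candidate encodings in order until one decodes cleanly.

```python
import csv
import io

# Hypothetical sample data: a header row and one data row with Korean text.
SAMPLE_ROWS = [["사고유형", "건수"], ["차대차", "2"]]


def encode_csv(rows, encoding):
    """Serialize rows to CSV text and return the raw bytes in `encoding`."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue().encode(encoding)


def read_csv_bytes(raw, encodings=("utf-8", "cp949")):
    """Return (encoding, rows) for the first candidate that decodes cleanly.

    UTF-8 is tried first because non-UTF-8 Korean text almost never
    passes strict UTF-8 decoding by accident.
    """
    for encoding in encodings:
        try:
            text = raw.decode(encoding)  # strict mode: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return encoding, list(csv.reader(io.StringIO(text, newline="")))
    raise ValueError("none of the candidate encodings could decode the data")


# Assumption: the file on disk was written as CP949, not UTF-8.
raw = encode_csv(SAMPLE_ROWS, "cp949")

# Decoding those bytes as UTF-8 and replacing the invalid ones reproduces
# the broken characters from the question.
broken = raw.decode("utf-8", errors="replace")

used_encoding, rows = read_csv_bytes(raw)
print(repr(broken))
print(used_encoding, rows)
```

Once the encoding is known, the same name can be passed to pandas through the `encoding` argument of `read_csv`, for example `pd.read_csv('your_file_name', encoding='cp949')`.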

python unicode pandas
