python - Make list of unicode words that are in a file
My code is:
import codecs
f = codecs.open(r'c:\users\admin\desktop\nepali.txt', 'r', 'utf-8')
nepali = f.read().split()
for word in nepali:
    print word
It displays the words in the file:
यो किताब टेबुल मा छ यो एक किताब हो केटा
But when I try to create a list of the words with this code:
import codecs
file = codecs.open(r'c:\users\admin\desktop\nepali.txt', 'r', 'utf-8')
nepali = list(file.read().split())
print nepali
the output is displayed like this:
[u'\ufeff\u092f\u094b', u'\u0915\u093f\u0924\u093e\u092c', u'\u091f\u0947\u092c\u0941\u0932', u'\u092e\u093e', u'\u091b', u'\u092f\u094b', u'\u090f\u0915', u'\u0915\u093f\u0924\u093e\u092c', u'\u0939\u094b',]
The output should look like:
[यो, किताब, टेबुल, मा, छ,यो, एक, किताब, हो]
You are looking at the output of the repr() function, which is used when displaying the contents of containers. That output is meant for debugging, not for end-user display.
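To make that concrete, here is a minimal sketch showing that a list builds its text from the repr() of each item. The `words` list is a made-up sample, not the asker's file. The check holds on both Python 2 and Python 3; only the look of repr() differs (Python 2 shows u'\u...' escapes, Python 3 shows the characters inside quotes).

```python
# Illustrative sample only; these two words stand in for the file contents.
words = [u'\u092f\u094b', u'\u091b']

# Converting the list to text formats every item with repr().
container_text = str(words)

# Rebuilding the same text by hand from the repr() of each item.
item_reprs = '[' + ', '.join(repr(w) for w in words) + ']'

# The two strings match, which is why printing a list never shows the
# bare words, only their debugging representation.
same_text = (container_text == item_reprs)
```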
You'll have to produce the output manually:
print u'[{}]'.format(u', '.join(nepali))
This produces a unicode string formatted like a list object, without using repr().
Demo:
>>> nepali = [u'\ufeff\u092f\u094b', u'\u0915\u093f\u0924\u093e\u092c', u'\u091f\u0947\u092c\u0941\u0932', u'\u092e\u093e', u'\u091b', u'\u092f\u094b', u'\u090f\u0915', u'\u0915\u093f\u0924\u093e\u092c', u'\u0939\u094b',]
>>> print u'[{}]'.format(u', '.join(nepali))
[यो, किताब, टेबुल, मा, छ, यो, एक, किताब, हो]
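One side note on the demo data: the first word starts with u'\ufeff', which is a byte order mark (BOM) that the 'utf-8' codec leaves in the decoded text. The sketch below assumes the file was saved with a UTF-8 BOM and shows that the 'utf-8-sig' codec drops it when reading. The temporary file and its contents are hypothetical stand-ins for nepali.txt.

```python
import codecs
import os
import tempfile

# Hypothetical sample file: a UTF-8 BOM followed by two words.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'wb') as handle:
    handle.write(codecs.BOM_UTF8 + u'\u092f\u094b \u091b'.encode('utf-8'))

# Decoding as 'utf-8' keeps the BOM as u'\ufeff' on the first word.
with codecs.open(path, 'r', 'utf-8') as handle:
    with_bom = handle.read().split()

# Decoding as 'utf-8-sig' removes a leading BOM if one is present.
with codecs.open(path, 'r', 'utf-8-sig') as handle:
    without_bom = handle.read().split()
```

If the file has no BOM, 'utf-8-sig' reads it the same way 'utf-8' does, so switching codecs should be safe in either case.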
python unicode utf-8