python - preclassified trained twitter comments for categorization
So I have 1,000,000 lines of Twitter comment data in CSV format. I need to classify them into categories according to what they are talking about: "product longevity", "cheap/costly", "on sale/discount", etc.
As you can see, I have multiple classes to classify these tweets into. The thing is how to generate/create training data for such huge data. It may be a silly question, but I was wondering whether there is any pre-classified/tagged comment data to train our model with? If not, what is the best approach to create training data for multi-class classification of text/comments?
While I have tried and tested Naive Bayes for sentiment classification on a smaller dataset, could you please suggest which classifier I should use for this problem (multiple categories to classify the comments into)?
Thanks!!!
"The thing is how to generate/create training data for such huge data"
I would suggest finding a training data set that matches the categories you are interested in. Let's say you care about cost: you might want to find a training data set of cost-related articles, and perhaps expand it using synonyms of keywords such as "inexpensive". You could also look at sentence construction to find out whether the structure of a sentence helps the classifier algorithm.
"If not, what is the best approach to create training data for multi-class classification of text/comments?" Start with keywords: pull articles related to your categories and go from there.
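The keyword idea can be sketched roughly as follows. This is only an illustration, not a tested pipeline: the category names come from the question, but the keyword lists and the helper names (`SEED_KEYWORDS`, `seed_labels`, `build_training_set`) are assumptions made up for the example. Tweets that match exactly one category become candidate training examples; the rest are left for manual labelling or a later pass.

```python
import re

# Hypothetical seed keywords per category. The category names are taken from
# the question; the keyword lists are illustrative and should be expanded
# with synonyms for real data.
SEED_KEYWORDS = {
    "product longevity": {"lasted", "durable", "broke", "sturdy", "lifespan"},
    "cheap/costly": {"cheap", "inexpensive", "expensive", "costly", "pricey"},
    "on sale/discount": {"sale", "discount", "coupon", "deal", "clearance"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def seed_labels(text, keywords=SEED_KEYWORDS):
    """Return the sorted list of categories whose keywords appear in the text."""
    tokens = set(tokenize(text))
    return sorted(cat for cat, words in keywords.items() if tokens & words)


def build_training_set(tweets):
    """Split tweets into (text, label) pairs and still-unlabeled texts.

    Only tweets matching exactly one category are kept as training
    candidates; ambiguous or unmatched tweets stay unlabeled.
    """
    labeled, unlabeled = [], []
    for text in tweets:
        labels = seed_labels(text)
        if len(labels) == 1:
            labeled.append((text, labels[0]))
        else:
            unlabeled.append(text)
    return labeled, unlabeled
```

Labels produced this way are noisy, so it is probably worth checking a sample by hand before training anything on them.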
Lastly, I suggest becoming familiar with NLTK's corpus library; it might help with retrieving training data as well.
As for the last question, I'm kind of confused about what you mean by "multiple categories to classify the comments into". Do you mean there are several categories and each comment belongs to exactly one of them, or can a comment belong to one or more categories at once?
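The distinction matters for how predictions are made. As a minimal sketch, assume some classifier has already produced a score per category for one comment (the `scores` values and both function names below are made up for illustration): in the single-label case you pick the best category, while in the multi-label case you keep every category above a threshold.

```python
def predict_multiclass(scores):
    """Single-label case: return the one highest-scoring category."""
    return max(scores, key=scores.get)


def predict_multilabel(scores, threshold=0.5):
    """Multi-label case: return every category scoring at or above threshold."""
    return sorted(cat for cat, score in scores.items() if score >= threshold)


# Hypothetical per-category scores for a single comment.
scores = {
    "product longevity": 0.10,
    "cheap/costly": 0.80,
    "on sale/discount": 0.65,
}

single = predict_multiclass(scores)   # one category per comment
several = predict_multilabel(scores)  # zero or more categories per comment
```

If comments can carry several categories, one common setup is to train one binary classifier per category and apply the thresholding step to their outputs.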
python twitter machine-learning classification nltk