Good idea in principle, however, did you even try to run a test before releasing...

jonathanbgn · on Nov 19, 2019

That's a fair criticism. Sentiment analysis is quite hard to get right on social media messages because of their diversity and subtlety, among other things. In my experience with similar commercial and (very!) expensive products, their accuracy is far from perfect too.

Also consider the lack of labeled data for HN and Reddit messages: I had to use Twitter messages to train the classifiers.

This is why I experimented with BERT, to see if I could get a model to generalize well from only Twitter messages. From my experiments, if you activate BERT (which makes the app much slower), you should be able to get 60-70% accuracy.

It's not perfect, but not too bad either if you are averaging over a large number of messages.

Overall it's still a work in progress. I expect to greatly improve the accuracy over the coming weeks!

chimi · on Nov 19, 2019

I came here to say the same thing as the GP. I don't understand why some words are red or green.

For example, you can type in non-brand words as well. I typed in "houses" and the word "homeless" came up in green!

With a brand, facebook, I got the word "amiriteguyze" in red, and clicking on it showed:

Negative 11/19/2019, 12:13:31 PM

facebook is bad amiriteguyze?!?!?!?

Why is that even a word that would show up in the word cloud? I can't imagine it was entered a bunch of times. I can't intuit any correlation between the colors, sizes, or words themselves that show up in the clouds.

jonathanbgn · on Nov 20, 2019

The algorithm tries to give more weight to words that appear rarely overall and are only used with the chosen brand name (similar to TF-IDF). This is why weird words can sometimes surface in the word cloud, especially when the sample of messages is small.
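
To illustrate the idea, here is a minimal sketch of a TF-IDF-style score. It is not the app's actual code: the function names, the smoothing formula, and the sample background messages are assumptions made up for the example. It shows how a one-off word can outrank a common one when the sample is small.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def brand_word_scores(brand_messages, background_messages):
    """Score words by count in brand messages, weighted by rarity overall."""
    all_messages = brand_messages + background_messages
    n_docs = len(all_messages)

    # Document frequency: number of messages (brand + background) containing the word.
    doc_freq = Counter()
    for message in all_messages:
        doc_freq.update(set(tokenize(message)))

    # Term frequency: raw counts within the brand messages only.
    term_freq = Counter()
    for message in brand_messages:
        term_freq.update(tokenize(message))

    # Smoothed IDF (an assumed variant): words found in every message keep weight 1.
    return {
        word: count * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
        for word, count in term_freq.items()
    }


# Hypothetical sample: two brand messages and three unrelated background messages.
brand = ["facebook is bad amiriteguyze?!?!?!?", "facebook is down again"]
background = ["the weather is bad today", "my phone is down again", "this is fine"]

scores = brand_word_scores(brand, background)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word:15s} {score:.2f}")
```

With only two brand messages, "amiriteguyze" appears once but nowhere else, so it scores above "is", which appears twice but in every message. The brand name itself would normally be excluded from the cloud.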

To prevent those words from appearing, I was thinking of implementing a dictionary check to only allow meaningful words. However, this approach also has a drawback: it restricts people's vocabulary and can miss important new concepts.
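
A rough sketch of what such a dictionary check could look like, and of its drawback. The vocabulary, the words, and the scores below are all made up for illustration, and `filter_by_dictionary` is a hypothetical helper, not something from the app.

```python
def filter_by_dictionary(scores, vocabulary):
    """Split scored words into those found in the vocabulary and those dropped."""
    kept = {word: score for word, score in scores.items() if word in vocabulary}
    dropped = sorted(word for word in scores if word not in vocabulary)
    return kept, dropped


# Hypothetical vocabulary and word scores (illustrative values only).
vocabulary = {"bad", "down", "slow", "privacy"}
scores = {"amiriteguyze": 2.1, "bad": 1.7, "deepfake": 2.4}

kept, dropped = filter_by_dictionary(scores, vocabulary)
print("kept:", kept)
print("dropped:", dropped)
```

The noise word is filtered out as intended, but so is "deepfake", a new term that is missing from the vocabulary yet might be exactly what people are talking about.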

Thanks for the feedback.

BubRoss · on Nov 19, 2019

To be clear you made something that doesn't work, posted it and got attention because you asserted that it worked, and when people point out it doesn't work, you say 'it is hard and other people's software also doesn't work'.

jonathanbgn · on Nov 20, 2019

This is not what I said. I said that the accuracy is not 100% perfect, but that you can improve it by turning on BERT in the menu bar.

BubRoss · on Nov 20, 2019

Everyone else said it for you.

screaminghawk · on Nov 19, 2019

Equally, I saw "dumb" as positive. In context, it was negative but the whole post was positive.

inamesh · on Nov 19, 2019

I guess it's one of those things where hate and love are good but mediocre is bad