reddit-sentiment-analysis
This program goes thru reddit, finds the most mentioned tickers and uses Vader SentimentIntensityAnalyzer to calculate the ticker compound value.
Program Parameters
subs = [] sub-reddit to search post_flairs = {} posts flairs to search || None flair is automatically considered goodAuth = {} authors whom comments are allowed more than once uniqueCmt = True allow one comment per author per symbol ignoreAuthP = {} authors to ignore for posts ignoreAuthC = {} authors to ignore for comment upvoteRatio = float upvote ratio for post to be considered, 0.70 = 70% ups = int define # of upvotes, post is considered if upvotes exceed this # limit = int define the limit, comments 'replace more' limit upvotes = int define # of upvotes, comment is considered if upvotes exceed this # picks = int define # of picks here, prints as "Top ## picks are:" picks_ayz = int define # of picks for sentiment analysis
Sample Output
It took 216.65 seconds to analyze 5862 comments in 80 posts in 4 subreddits.
Posts analyzed saved in titles
10 most mentioned picks:
GME: 197
BB: 72
FB: 56
PLTR: 36
TSLA: 25
PLUG: 17
RC: 15
NIO: 14
SPCE: 10
TLRY: 10
Bearish Neutral Bullish Total/Compound
GME 0.087 0.763 0.150 0.161
BB 0.058 0.768 0.175 0.261
FB 0.119 0.708 0.173 0.127
PLTR 0.062 0.804 0.134 0.235
TSLA 0.124 0.690 0.187 0.195
Data:
Includes US stocks with market cap > 100 Million, and price above $3. It doesn't include penny stocks.
You can download data from here:
Source (US stocks): https://www.nasdaq.com/market-activity/stocks/screener?exchange=nasdaq&letter=0&render=download\
Implementation: I am using sets for 'x in s' comparison, sets time complexity for "x in s" is O(1) compare to list: O(n).
Limitations: It depends mainly on the defined parameters for current implementation: It completely ignores the heavily downvoted comments, and there can be a time when the most mentioned ticker is heavily downvoted, but you can change that in upvotes variable.