cssupport
/

bert-tweet-to-stock-return-predictor

Feature Extraction

PyTorch

English

Model card Files Files and versions Community

cssupport commited on Sep 7, 2023

Commit

6b5c384

•

1 Parent(s): 647b3ea

Update README.md

Browse files

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -11,7 +11,7 @@ pipeline_tag: feature-extraction
 This is a very light weigh model and could be used in multiple analytical applications. -->
 This model is an example on how to handle multi-target regression problem using llms. Model takes in tweet,stock ticker, month, last_price and volume for a stock (around the tweet was publish) and returns 1,2,3 and 7 day returns and 10 day annualized volatility.  Model uses feature vectors output by the tweet text (mobile-bert output), numerical (last price and volume), and categorical(stock ticker and month) sub-components then are concatenated into a single feature vector which is fed into a final ouput layers.
-Used [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for text feature extraction (MobileBERT is a thin version of BERT_LARGE, while equipped with bottleneck structures and a carefully designed balance between self-attentions and feed-forward networks). This model detects SQLInjection attacks in the input string (check How To Below).
 This is again a very very light  model (100mb), used following dataset from [Kaggle](www.kaggle.com) called [Tweet Sentiment's Impact on Stock Returns (by THE DEVASTATOR)](https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns).
 **Disclaimer: This model should not be used for trading. Data source is not verified, assumption is that data is synthetically generated. This is just an example how to handle multi-target regression problem**.
 Contact us for more info: support@cloudsummary.com
@@ -57,7 +57,7 @@ tokenizer = AutoTokenizer.from_pretrained('google/mobilebert-uncased')
 model = torch.load('pytorch_model.pt')
 #load the stock enoder
 #list of ticker supported - ['21CF', 'ASOS', 'AT&T', 'Adobe', 'Allianz', 'Amazon', 'American Express', 'Apple', 'AstraZeneca', 'Audi', 'Aviva', 'BASF', 'BMW', 'BP', 'Bank of America', 'Bayer', 'BlackRock', 'Boeing', 'Burberry', 'CBS', 'CVS Health', 'Cardinal Health', 'Carrefour', 'Chevron', 'Cisco', 'Citigroup', 'CocaCola', 'Colgate', 'Comcast', 'Costco', 'Danone', 'Deutsche Bank', 'Disney', 'Equinor', 'Expedia', 'Exxon', 'Facebook', 'FedEx', 'Ford', 'GSK', 'General Electric', 'Gillette', 'Goldman Sachs', 'Google', 'Groupon', 'H&M', 'HP', 'HSBC', 'Heineken', 'Home Depot', 'Honda', 'Hyundai', 'IBM', 'Intel', 'JPMorgan', 'John Deere', "Kellogg's", 'Kroger', "L'Oreal", 'Mastercard', "McDonald's", 'Microsoft', 'Morgan Stanley', 'Nestle', 'Netflix', 'Next', 'Nike', 'Nissan', 'Oracle', 'P&G', 'PayPal', 'Pepsi', 'Pfizer', 'Reuters', 'Ryanair', 'SAP', 'Samsung', 'Santander', 'Shell', 'Siemens', 'Sony', 'Starbucks', 'TMobile', 'Tesco', 'Thales', 'Toyota', 'TripAdvisor', 'UPS', 'Verizon', 'Viacom', 'Visa', 'Vodafone', 'Volkswagen', 'Walmart', 'Wells Fargo', 'Yahoo', 'adidas', 'bookingcom', 'eBay', 'easyJet', 'salesforce.com']
-stock_encoder = joblib.load("data/stock_encoder.pkl")
 device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

 This is a very light weigh model and could be used in multiple analytical applications. -->
 This model is an example on how to handle multi-target regression problem using llms. Model takes in tweet,stock ticker, month, last_price and volume for a stock (around the tweet was publish) and returns 1,2,3 and 7 day returns and 10 day annualized volatility.  Model uses feature vectors output by the tweet text (mobile-bert output), numerical (last price and volume), and categorical(stock ticker and month) sub-components then are concatenated into a single feature vector which is fed into a final ouput layers.
+Used [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) for text feature extraction (MobileBERT is a thin version of BERT_LARGE, while equipped with bottleneck structures and a carefully designed balance between self-attentions and feed-forward networks).
 This is again a very very light  model (100mb), used following dataset from [Kaggle](www.kaggle.com) called [Tweet Sentiment's Impact on Stock Returns (by THE DEVASTATOR)](https://www.kaggle.com/datasets/thedevastator/tweet-sentiment-s-impact-on-stock-returns).
 **Disclaimer: This model should not be used for trading. Data source is not verified, assumption is that data is synthetically generated. This is just an example how to handle multi-target regression problem**.
 Contact us for more info: support@cloudsummary.com
 model = torch.load('pytorch_model.pt')
 #load the stock enoder
 #list of ticker supported - ['21CF', 'ASOS', 'AT&T', 'Adobe', 'Allianz', 'Amazon', 'American Express', 'Apple', 'AstraZeneca', 'Audi', 'Aviva', 'BASF', 'BMW', 'BP', 'Bank of America', 'Bayer', 'BlackRock', 'Boeing', 'Burberry', 'CBS', 'CVS Health', 'Cardinal Health', 'Carrefour', 'Chevron', 'Cisco', 'Citigroup', 'CocaCola', 'Colgate', 'Comcast', 'Costco', 'Danone', 'Deutsche Bank', 'Disney', 'Equinor', 'Expedia', 'Exxon', 'Facebook', 'FedEx', 'Ford', 'GSK', 'General Electric', 'Gillette', 'Goldman Sachs', 'Google', 'Groupon', 'H&M', 'HP', 'HSBC', 'Heineken', 'Home Depot', 'Honda', 'Hyundai', 'IBM', 'Intel', 'JPMorgan', 'John Deere', "Kellogg's", 'Kroger', "L'Oreal", 'Mastercard', "McDonald's", 'Microsoft', 'Morgan Stanley', 'Nestle', 'Netflix', 'Next', 'Nike', 'Nissan', 'Oracle', 'P&G', 'PayPal', 'Pepsi', 'Pfizer', 'Reuters', 'Ryanair', 'SAP', 'Samsung', 'Santander', 'Shell', 'Siemens', 'Sony', 'Starbucks', 'TMobile', 'Tesco', 'Thales', 'Toyota', 'TripAdvisor', 'UPS', 'Verizon', 'Viacom', 'Visa', 'Vodafone', 'Volkswagen', 'Walmart', 'Wells Fargo', 'Yahoo', 'adidas', 'bookingcom', 'eBay', 'easyJet', 'salesforce.com']
+stock_encoder = joblib.load("stock_encoder.pkl")
 device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")