model_v5 / README.md
license: mit
language:
  - en
tags:
  - bittensor

MIT License

Copyright (c) 2024 Taoshi Inc

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Background

The models provided here were created using the open source modeling techniques provided in https://github.com/taoshidev/time-series-prediction-subnet (TSPS). They were trained using runnable/miner_training.py and tested against existing models using runnable/miner_testing.py.

Note
This model requires the Feature Set Creator (FSC) functionality added in the latest release of the TSPS.

Build Strategy

This section outlines the strategy used to build the models.

Understanding Dataset Used

The dataset used to build the models can be generated using runnable/generate_historical_data.py. The model was trained on 5m interval data with a lookback period from June 2023 to January 2024. Recent data was used because it more closely reflects current market and macroeconomic conditions.

Data from January 2024 to February 2024 was held out as testing data to measure the performance of the models. Testing was run with runnable/miner_testing.py against live historical data sources.

Understanding Model Creation

The model currently uses only the following features to predict:

  • close
  • high
  • low
  • volume
  • time of day
  • time of week
  • time of month
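
The three time-based features can be illustrated with a small sketch. How the FSC actually encodes them is not described here, so the function name `time_features`, the millisecond UTC timestamp input, and the encoding as a fraction of the elapsed day, week, and month are all assumptions made for illustration.

```python
import calendar
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def time_features(ts_ms: int) -> tuple[float, float, float]:
    """Return (time of day, time of week, time of month) as fractions in [0, 1).

    Hypothetical encoding: each value is the elapsed share of the current
    UTC day, week (starting Monday), or calendar month.
    """
    dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    seconds_into_day = dt.hour * 3600 + dt.minute * 60 + dt.second

    time_of_day = seconds_into_day / SECONDS_PER_DAY

    # weekday() is 0 for Monday, so the week starts at Monday 00:00 UTC.
    seconds_into_week = dt.weekday() * SECONDS_PER_DAY + seconds_into_day
    time_of_week = seconds_into_week / (7 * SECONDS_PER_DAY)

    # monthrange() returns (first weekday, number of days in the month).
    days_in_month = calendar.monthrange(dt.year, dt.month)[1]
    seconds_into_month = (dt.day - 1) * SECONDS_PER_DAY + seconds_into_day
    time_of_month = seconds_into_month / (days_in_month * SECONDS_PER_DAY)

    return time_of_day, time_of_week, time_of_month


# 2024-01-01 12:00:00 UTC, a Monday
print(time_features(1_704_110_400_000))
```

Scaling each value to a fixed range keeps the time features on a scale that does not depend on price or volume.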

Features from a wide range of other sources will be added to the TSPS infrastructure in the near future as improvements to the FSC.

A variety of windows and parameters were tested and eliminated. The final strategy to derive this model was the following:

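model = BaseMiningModel(
    filename="model_v5_1.h5",
    mode="w",
    feature_count=7,  # the seven input features listed above
    sample_count=500,  # input samples per scenario
    prediction_feature_count=1,  # only the close is predicted
    prediction_count=10,  # predictions spaced along the prediction space
    prediction_length=100,  # closes covered by the prediction space
    layers=[
        [1024, 0],  # first LSTM layer: 1024 units, no dropout
        [1024, 0.3],  # second LSTM layer: 1024 units, 0.3 dropout
    ],
    learning_rate=0.000001,
    dtype=Policy("mixed_float16"),  # mixed-precision policy
)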

The LSTM model has two stacked layers of 1024 units each. The first layer uses no dropout, and the second uses a 0.3 dropout rate.
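
To give a sense of the model's size, the sketch below counts the trainable weights in the two recurrent layers. It assumes each layer is a standard LSTM (four gates, each with an input kernel, a recurrent kernel, and a bias) and that the first layer receives the 7 input features directly. Any output layers that BaseMiningModel adds are not counted, and the helper name `lstm_param_count` is hypothetical.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable weights in one standard LSTM layer.

    Each of the 4 gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units), and a bias (units).
    """
    return 4 * (units * (input_dim + units) + units)


FEATURE_COUNT = 7           # feature_count in the configuration
LAYER_UNITS = [1024, 1024]  # unit counts taken from the layers argument

input_dim = FEATURE_COUNT
counts = []
for units in LAYER_UNITS:
    counts.append(lstm_param_count(input_dim, units))
    input_dim = units  # the next layer consumes this layer's output

print(counts, sum(counts))  # [4227072, 8392704] 12619776
```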

Understanding Training Decisions

Training was done with 500 samples per scenario and 128 scenarios per batch, with 20 training epochs and 10 passes over the entire dataset. Additional epochs and passes were not found to improve the model's predictions.
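
The scenario and batch sizes can be pictured as sliding windows over the historical rows. The sketch below is not the TSPS training loop. It assumes a stride of one row between scenarios and that each scenario is followed by the 100 rows to be predicted. The names `iter_scenarios` and `iter_batches` are hypothetical.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

SAMPLE_COUNT = 500       # input rows per scenario
PREDICTION_LENGTH = 100  # future rows covered by each scenario
BATCH_SIZE = 128         # scenarios per batch

Row = Sequence[float]
Scenario = Tuple[Sequence[Row], Sequence[Row]]


def iter_scenarios(
    rows: Sequence[Row],
    sample_count: int = SAMPLE_COUNT,
    prediction_length: int = PREDICTION_LENGTH,
) -> Iterator[Scenario]:
    """Yield (input rows, future rows) for every possible window start."""
    last_start = len(rows) - sample_count - prediction_length
    for start in range(last_start + 1):
        split = start + sample_count
        yield rows[start:split], rows[split:split + prediction_length]


def iter_batches(
    scenarios: Iterable[Scenario], batch_size: int = BATCH_SIZE
) -> Iterator[List[Scenario]]:
    """Group scenarios into batches; the final batch may be smaller."""
    batch: List[Scenario] = []
    for scenario in scenarios:
        batch.append(scenario)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Synthetic stand-in data: 1000 rows of 7 features each.
rows = [[float(i)] * 7 for i in range(1000)]
batches = list(iter_batches(iter_scenarios(rows)))
print(len(batches), [len(b) for b in batches])
```

With 1000 rows there are 401 window starts, so the sketch produces three full batches of 128 scenarios and one partial batch of 17.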

Strategy to Predict

The strategy to predict 100 closes into the future was to make 10 predictions evenly spaced along the length of the prediction space and then linearly interpolate between them. This lets the model learn the general shape of the market movement rather than having to predict all 100 closes individually.
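
The interpolation step can be sketched as follows. The exact placement of the 10 predictions is not specified above, so this example assumes they land on steps 10, 20, ..., 100 of the prediction space and that the last observed close anchors step 0. The function name `interpolate_closes` is hypothetical.

```python
from typing import List, Sequence


def interpolate_closes(
    last_close: float,
    predictions: Sequence[float],
    prediction_length: int = 100,
) -> List[float]:
    """Expand evenly spaced predictions into one close per future step.

    Assumes predictions[i] is the close at step (i + 1) * spacing and that
    last_close is the most recent observed close (step 0).
    """
    if not predictions or prediction_length % len(predictions) != 0:
        raise ValueError("prediction_length must be a multiple of len(predictions)")

    spacing = prediction_length // len(predictions)
    anchors = [last_close, *predictions]

    closes: List[float] = []
    for left, right in zip(anchors, anchors[1:]):
        # Walk from just after the left anchor up to and including the right one.
        for k in range(1, spacing + 1):
            closes.append(left + (right - left) * k / spacing)
    return closes


# Synthetic example: ten predictions rising from 110 to 200.
predicted = [110.0 + 10.0 * i for i in range(10)]
closes = interpolate_closes(100.0, predicted)
print(len(closes), closes[:3], closes[-1])
```

Each predicted point is reproduced exactly at its own step, and the closes between two points lie on the straight line joining them.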