AndiB93/CosmicVoyage_RL

Reinforcement Learning

stable-baselines3


AndiB93 committed 20 days ago

Commit 8042066 • 1 Parent(s): 9ab19b3

Create README.md

Files changed (1)

README.md +49 -0

README.md ADDED

	@@ -0,0 +1,49 @@

+---
+license: apache-2.0
+language:
+- en
+pipeline_tag: reinforcement-learning
+tags:
+- web
+- game
+- CosmicVoyage
+---
+This model is a reinforcement learning agent trained to play the web-based game Cosmic Voyager autonomously. It was trained with the Proximal Policy Optimization (PPO) algorithm to maximize in-game performance.
+Training Configuration:
+- Algorithm: Proximal Policy Optimization (PPO)
+- Policy: Convolutional Neural Network (CnnPolicy)
+- Learning Rate: 5e-5
+- Batch Size: 256
+- Number of Steps per Update (n_steps): 2048
+- Number of Epochs: 20
+- Maximum Gradient Norm (max_grad_norm): 0.75
+- Discount Factor (gamma): 0.95
+- GAE Lambda (gae_lambda): 0.95
+- Clip Range: 0.1
+- Entropy Coefficient (ent_coef): 0.02
+- Target KL Divergence (target_kl): 0.025
+- Total Timesteps: 3,000,000
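As a rough reading of these numbers, the short calculation below works out how much optimization each rollout receives and how many rollouts the full run needs. It is only a sketch: the number of parallel environments is not stated in this README, so `N_ENVS = 1` is an assumed placeholder, and the gradient-step count is an upper bound because `target_kl` can end an update early.

```python
import math

# Values copied from the Training Configuration list.
N_STEPS = 2048
BATCH_SIZE = 256
N_EPOCHS = 20
TOTAL_TIMESTEPS = 3_000_000
GAMMA = 0.95

# ASSUMPTION: the number of parallel environments is not documented.
N_ENVS = 1

# Transitions collected before each policy update.
rollout_size = N_STEPS * N_ENVS

# Minibatches drawn from one rollout in a single epoch.
minibatches_per_epoch = rollout_size // BATCH_SIZE

# Upper bound on gradient steps per update (early stopping may cut it short).
max_gradient_steps_per_update = minibatches_per_epoch * N_EPOCHS

# Rollouts needed to reach the total timestep budget.
num_updates = math.ceil(TOTAL_TIMESTEPS / rollout_size)

# Rule-of-thumb horizon over which discounted rewards still matter.
effective_horizon = 1.0 / (1.0 - GAMMA)

print(minibatches_per_epoch, max_gradient_steps_per_update, num_updates)
```

With a single environment this gives 8 minibatches per epoch, at most 160 gradient steps per update, and 1465 updates. A discount factor of 0.95 corresponds to an effective horizon of about 20 steps, so the agent mainly optimizes short-term reward.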
+Policy Architecture:
+- Feature Extractor Dimensions: 1024
+- Network Architecture:
+  - Policy Network (pi): [1024, 512, 256]
+  - Value Function Network (vf): [1024, 512, 256]
+- Activation Function: LeakyReLU
+- Image Normalization: Disabled
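The settings above map fairly directly onto the constructor arguments of `stable_baselines3.PPO`. The following is a minimal sketch of that mapping, not the author's training script. It assumes that "Feature Extractor Dimensions" refers to `features_dim` of the default CNN extractor, that a recent stable-baselines3 release is used (one that accepts `net_arch` as a dict), and that `env` is an image-based environment supplied by the caller. `build_model` is a hypothetical helper name.

```python
# Hyperparameters from the Training Configuration list, keyed by the
# argument names of stable_baselines3.PPO.
PPO_KWARGS = {
    "learning_rate": 5e-5,
    "n_steps": 2048,
    "batch_size": 256,
    "n_epochs": 20,
    "gamma": 0.95,
    "gae_lambda": 0.95,
    "clip_range": 0.1,
    "ent_coef": 0.02,
    "max_grad_norm": 0.75,
    "target_kl": 0.025,
}

# Separate actor (pi) and critic (vf) heads, as listed under Policy Architecture.
NET_ARCH = {"pi": [1024, 512, 256], "vf": [1024, 512, 256]}

TOTAL_TIMESTEPS = 3_000_000


def build_model(env):
    """Create a PPO model for `env` (hypothetical helper, not from the repo).

    Imports are deferred so the constants above can be inspected
    without stable-baselines3 or PyTorch installed.
    """
    from stable_baselines3 import PPO
    from torch import nn

    policy_kwargs = dict(
        # ASSUMPTION: "Feature Extractor Dimensions: 1024" means features_dim.
        features_extractor_kwargs=dict(features_dim=1024),
        net_arch=NET_ARCH,
        activation_fn=nn.LeakyReLU,
        # "Image Normalization: Disabled": SB3 will not divide pixels by 255.
        normalize_images=False,
    )
    return PPO("CnnPolicy", env, policy_kwargs=policy_kwargs, **PPO_KWARGS)
```

Training would then be started with `build_model(env).learn(total_timesteps=TOTAL_TIMESTEPS)`. Because image normalization is disabled, any pixel scaling has to happen in the environment or its wrappers.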
+Environment Configuration:
+- Observation Dimensions: Adjusted to fit the game's requirements
+- Frame Stacking: Implemented to provide temporal context
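To illustrate what frame stacking provides, here is a small library-free sketch that keeps the most recent frames in a fixed-size buffer. The class name `FrameStacker` and the stack size of 4 are assumptions for illustration, since the README does not state how many frames were stacked. stable-baselines3 ships a `VecFrameStack` wrapper for this purpose, but whether it was used here is not documented.

```python
from collections import deque


class FrameStacker:
    """Keep the k most recent frames so one observation spans several time steps."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # Fill the buffer with the first frame so the stack is full from step 0.
        self.frames.clear()
        for _ in range(self.num_frames):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, new_frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(new_frame)
        return list(self.frames)


# ASSUMPTION: 4 frames is a placeholder; strings stand in for image arrays.
stacker = FrameStacker(num_frames=4)
first_obs = stacker.reset("frame0")
next_obs = stacker.step("frame1")
print(next_obs)  # ['frame0', 'frame0', 'frame0', 'frame1']
```

A single screenshot cannot show velocity or direction of movement. Several consecutive frames can, which is why stacking is a common substitute for recurrent memory in image-based agents.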
+Usage:
+This model is designed to be integrated into the Cosmic Voyager game, enabling autonomous gameplay. For integration details and deployment instructions, please refer to the accompanying documentation.
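Since the integration details live outside this README, the loop below is only a generic sketch of how a trained agent is typically driven. It assumes `model` exposes the `predict(obs, deterministic=True)` method of stable-baselines3 models and that `env` follows the Gymnasium `reset`/`step` interface. The helper name `run_episode` is hypothetical, and how the game itself is wrapped as an environment is not covered here.

```python
def run_episode(model, env, max_steps=10_000):
    """Play one episode with deterministic actions and return the summed reward."""
    # ASSUMPTION: Gymnasium-style API, where reset() returns (observation, info).
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # SB3 models return (action, recurrent_state); the state is unused here.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward
```

With stable-baselines3 installed, a model would typically be obtained via `PPO.load(...)` on the checkpoint file from this repository. The file name is not given in this README, so it is left unspecified here.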
+Training Monitoring:
+Training progress and metrics were tracked using Weights & Biases under the project 'Cosmic Voyager RL' by the entity 'andiB1293'.
+Disclaimer:
+This model is tailored specifically for the Cosmic Voyager game environment. Performance in different settings or games may vary. Users are advised to test the model thoroughly in their specific use cases.