FinRL Environment

A wrapper around FinRL stock trading environments that conforms to the OpenEnv specification.

Overview

This environment enables reinforcement learning for stock trading tasks by exposing FinRL’s StockTradingEnv through OpenEnv’s HTTP API. It supports:

Stock Trading: Buy/sell actions across multiple stocks
Portfolio Management: Track balance, holdings, and portfolio value
Technical Indicators: MACD, RSI, CCI, DX, and more
Flexible Configuration: Custom data sources and trading parameters

Quick Start

1. Build the Docker Image

First, build the base image (from OpenEnv root):

cd OpenEnv
docker build -t envtorch-base:latest -f src/openenv/core/containers/images/Dockerfile .

Then build the FinRL environment image:

docker build -t finrl-env:latest -f envs/finrl_env/server/Dockerfile .

2. Run the Server

Option A: With Default Sample Data

docker run -p 8000:8000 finrl-env:latest

This starts the server with synthetic sample data for testing.

Option B: With Custom Configuration

Create a configuration file config.json:

{
  "data_path": "/data/stock_data.csv",
  "stock_dim": 3,
  "hmax": 100,
  "initial_amount": 100000,
  "num_stock_shares": [0, 0, 0],
  "buy_cost_pct": [0.001, 0.001, 0.001],
  "sell_cost_pct": [0.001, 0.001, 0.001],
  "reward_scaling": 0.0001,
  "state_space": 19,
  "action_space": 3,
  "tech_indicator_list": ["macd", "rsi_30", "cci_30", "dx_30"]
}

Run with configuration:

docker run -p 8000:8000 \
  -v $(pwd)/config.json:/config/config.json \
  -v $(pwd)/data:/data \
  -e FINRL_CONFIG_PATH=/config/config.json \
  finrl-env:latest

3. Use the Client

from envs.finrl_env import FinRLEnv, FinRLAction
import numpy as np

# Connect to server
client = FinRLEnv(base_url="http://localhost:8000")

# Get configuration
config = client.get_config()
print(f"Trading {config['stock_dim']} stocks")
print(f"Initial capital: ${config['initial_amount']:,.0f}")

# Reset environment
result = client.reset()
print(f"Initial portfolio value: ${result.observation.portfolio_value:,.2f}")

# Trading loop
for step in range(100):
    # Get current state
    state = result.observation.state

    # Your RL policy here (example: random actions)
    num_stocks = config['stock_dim']
    actions = np.random.uniform(-1, 1, size=num_stocks).tolist()

    # Execute action
    result = client.step(FinRLAction(actions=actions))

    print(f"Step {step}: Portfolio=${result.observation.portfolio_value:,.2f}, "
          f"Reward={result.reward:.2f}")

    if result.done:
        print("Episode finished!")
        break

client.close()
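
The loop above prints the portfolio value at every step. To judge a whole episode, those values can be collected in a list and summarized afterwards. The helper below is a minimal sketch of that idea. `summarize_episode` is a hypothetical name, not part of the FinRL or OpenEnv API, and the sample values are made up.

```python
def summarize_episode(portfolio_values):
    """Return total return and maximum drawdown for a series of portfolio values."""
    if not portfolio_values:
        raise ValueError("need at least one portfolio value")
    start = portfolio_values[0]
    total_return = portfolio_values[-1] / start - 1.0

    peak = start
    max_drawdown = 0.0
    for value in portfolio_values:
        peak = max(peak, value)
        # Drawdown is the fractional drop from the highest value seen so far
        max_drawdown = max(max_drawdown, (peak - value) / peak)
    return {"total_return": total_return, "max_drawdown": max_drawdown}


# Hypothetical values, as if collected from result.observation.portfolio_value
values = [100_000.0, 120_000.0, 90_000.0, 110_000.0]
summary = summarize_episode(values)
print(f"Return: {summary['total_return']:.2%}, max drawdown: {summary['max_drawdown']:.2%}")
```

In the trading loop, appending `result.observation.portfolio_value` to a list after each `client.step(...)` call would produce the input for this helper.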

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    RL Training Framework                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ Policy Net   │  │ Value Net    │  │ Replay       │      │
│  │ (PyTorch)    │  │ (PyTorch)    │  │ Buffer       │      │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘      │
│         └──────────────────┴──────────────────┘              │
│                            │                                 │
│                   ┌────────▼────────┐                        │
│                   │ FinRLEnv        │ ← HTTP Client          │
│                   │ (HTTPEnvClient) │                        │
│                   └────────┬────────┘                        │
└────────────────────────────┼─────────────────────────────────┘
                             │ HTTP (JSON)
                    ┌────────▼────────┐
                    │ Docker Container│
                    │  Port: 8000     │
                    │                 │
                    │ ┌─────────────┐ │
                    │ │FastAPI      │ │
                    │ │Server       │ │
                    │ └──────┬──────┘ │
                    │        │        │
                    │ ┌──────▼──────┐ │
                    │ │ FinRL       │ │
                    │ │ Environment │ │
                    │ └──────┬──────┘ │
                    │        │        │
                    │ ┌──────▼──────┐ │
                    │ │ FinRL       │ │
                    │ │ StockTrading│ │
                    │ │ Env         │ │
                    │ └─────────────┘ │
                    └─────────────────┘

API Reference

FinRLAction

Trading action for the environment.

Attributes:

actions: list[float] - Array of normalized action values (-1 to 1) for each stock
- Positive values: Buy
- Negative values: Sell
- Magnitude: Relative trade size

Example:

# Buy stock 0, sell stock 1, hold stock 2
action = FinRLAction(actions=[0.5, -0.3, 0.0])
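
To see what "relative trade size" means in shares, recall that FinRL’s StockTradingEnv multiplies each normalized action by `hmax` and truncates the result to an integer. The sketch below reproduces that arithmetic, assuming the wrapper passes actions through unchanged. The clamping to [-1, 1] and the name `actions_to_shares` are illustrative assumptions, and the trade that is actually executed may be smaller if cash or holdings run out.

```python
def actions_to_shares(actions, hmax):
    """Convert normalized actions to signed share counts (positive buys, negative sells)."""
    shares = []
    for a in actions:
        a = max(-1.0, min(1.0, a))    # assumption: out-of-range actions are clamped
        shares.append(int(a * hmax))  # int() truncates toward zero
    return shares


# With hmax=100: buy stock 0, sell stock 1, hold stock 2
requested = actions_to_shares([0.5, -0.25, 0.0], hmax=100)
print(requested)
```

Because of the truncation, any action with magnitude below `1 / hmax` maps to zero shares, which amounts to holding.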

FinRLObservation

Observation returned by the environment.

Attributes:

state: list[float] - Flattened state vector
- Structure: [balance, prices..., holdings..., indicators...]
portfolio_value: float - Total portfolio value (cash + holdings)
date: str - Current trading date
done: bool - Whether episode has ended
reward: float - Reward for the last action
metadata: dict - Additional information

Example:

obs = result.observation
print(f"Portfolio: ${obs.portfolio_value:,.2f}")
print(f"Date: {obs.date}")
print(f"State dimension: {len(obs.state)}")
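
The flattened state can be split back into its parts with plain slicing. The following is a sketch that assumes the layout listed above and one indicator value per stock per indicator, giving a length of `1 + 2 * stock_dim + num_indicators * stock_dim`. `split_state` is a hypothetical helper, and the order of values inside the indicator block is not specified here, so that block is left flat.

```python
def split_state(state, stock_dim, num_indicators):
    """Split a flat state into balance, prices, holdings, and indicator values."""
    expected = 1 + 2 * stock_dim + num_indicators * stock_dim
    if len(state) != expected:
        raise ValueError(f"expected {expected} values, got {len(state)}")
    return {
        "balance": state[0],
        "prices": state[1:1 + stock_dim],
        "holdings": state[1 + stock_dim:1 + 2 * stock_dim],
        "indicators": state[1 + 2 * stock_dim:],
    }


# Made-up state for 2 stocks and 2 indicators: 1 + 2*2 + 2*2 = 9 values
parts = split_state([1000.0, 10.0, 20.0, 1, 2, 0.1, 0.2, 0.3, 0.4],
                    stock_dim=2, num_indicators=2)
print(parts["balance"], parts["prices"], parts["holdings"])
```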

Client Methods

reset() -> StepResult[FinRLObservation]

Reset the environment to start a new episode.

result = client.reset()

step(action: FinRLAction) -> StepResult[FinRLObservation]

Execute a trading action.

action = FinRLAction(actions=[0.5, -0.3])
result = client.step(action)

state() -> State

Get episode metadata (episode_id, step_count).

state = client.state()
print(f"Episode: {state.episode_id}, Step: {state.step_count}")

get_config() -> dict

Get environment configuration.

config = client.get_config()
print(config['stock_dim'])
print(config['initial_amount'])

Data Format

The environment expects stock data as a CSV file with the following columns, one row per date and ticker:

date	tic	close	high	low	open	volume	macd	rsi_30	cci_30	dx_30
2020-01-01	AAPL	100.0	102.0	98.0	99.0	1000000	0.5	55.0	10.0	15.0
2020-01-01	GOOGL	1500.0	1520.0	1480.0	1490.0	500000	-0.3	48.0	-5.0	20.0

Required columns:

date: Trading date
tic: Stock ticker symbol
close, high, low, open: Price data
volume: Trading volume
Technical indicators (as specified in tech_indicator_list)
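
A data file can be checked against these requirements before it is mounted into the container. The sketch below uses only the standard library. `check_stock_csv` is a hypothetical helper, and the check that every date lists the same set of tickers is an assumption about what the environment needs rather than a documented rule.

```python
import csv
import io
from collections import defaultdict

BASE_COLUMNS = ["date", "tic", "close", "high", "low", "open", "volume"]


def check_stock_csv(text, tech_indicator_list):
    """Report missing columns and dates whose ticker set differs from the full set."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in BASE_COLUMNS + list(tech_indicator_list) if c not in header]
    if missing:
        return {"missing_columns": missing, "tickers": [], "incomplete_dates": []}

    tickers_by_date = defaultdict(set)
    for row in reader:
        tickers_by_date[row["date"]].add(row["tic"])

    all_tickers = set().union(*tickers_by_date.values()) if tickers_by_date else set()
    incomplete = sorted(d for d, t in tickers_by_date.items() if t != all_tickers)
    return {"missing_columns": [], "tickers": sorted(all_tickers),
            "incomplete_dates": incomplete}


sample = """date,tic,close,high,low,open,volume,macd,rsi_30,cci_30,dx_30
2020-01-01,AAPL,100.0,102.0,98.0,99.0,1000000,0.5,55.0,10.0,15.0
2020-01-01,GOOGL,1500.0,1520.0,1480.0,1490.0,500000,-0.3,48.0,-5.0,20.0
"""
report = check_stock_csv(sample, ["macd", "rsi_30", "cci_30", "dx_30"])
print(report)
```

For a real file, pass its contents instead of `sample`, for example `check_stock_csv(open("your_data.csv").read(), indicators)`.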

Configuration Parameters

Parameter	Type	Description
`data_path`	str	Path to CSV file with stock data
`stock_dim`	int	Number of stocks to trade
`hmax`	int	Maximum shares per trade
`initial_amount`	int	Starting cash balance
`num_stock_shares`	list[int]	Initial holdings for each stock
`buy_cost_pct`	list[float]	Transaction cost for buying (per stock)
`sell_cost_pct`	list[float]	Transaction cost for selling (per stock)
`reward_scaling`	float	Scaling factor for rewards
`state_space`	int	Dimension of state vector
`action_space`	int	Dimension of action space
`tech_indicator_list`	list[str]	Technical indicators to include
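
Several of these parameters have to agree with each other: the per-stock lists should each have `stock_dim` entries, and in the sample configuration `action_space` equals `stock_dim`. The sketch below checks those relationships on a loaded config dict. `check_config` is a hypothetical helper, and treating `action_space == stock_dim` as a requirement is an assumption based on the one-action-per-stock format, not a documented server check.

```python
def check_config(config):
    """Return a list of human-readable consistency problems (empty if none found)."""
    problems = []
    n = config["stock_dim"]
    for key in ("num_stock_shares", "buy_cost_pct", "sell_cost_pct"):
        if len(config[key]) != n:
            problems.append(f"{key} has {len(config[key])} entries, expected {n}")
    # Assumption: one action per stock
    if config["action_space"] != n:
        problems.append(f"action_space is {config['action_space']}, expected {n}")
    return problems


config = {
    "stock_dim": 3,
    "num_stock_shares": [0, 0, 0],
    "buy_cost_pct": [0.001, 0.001, 0.001],
    "sell_cost_pct": [0.001, 0.001, 0.001],
    "action_space": 3,
}
problems = check_config(config)
print(problems or "config looks consistent")
```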

Integration with RL Frameworks

Stable Baselines 3

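import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

from envs.finrl_env import FinRLEnv, FinRLAction

# Gymnasium-style wrapper so SB3 can drive the HTTP client.
# Assumes SB3 2.x, which expects the Gymnasium reset/step signatures.
class SB3FinRLWrapper(gym.Env):
    def __init__(self, base_url):
        super().__init__()
        self.env = FinRLEnv(base_url=base_url)
        config = self.env.get_config()
        self.action_space = spaces.Box(
            low=-1, high=1,
            shape=(config['action_space'],),
            dtype=np.float32
        )
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(config['state_space'],),
            dtype=np.float32
        )

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        result = self.env.reset()
        # Gymnasium expects (observation, info)
        return np.array(result.observation.state, dtype=np.float32), {}

    def step(self, action):
        result = self.env.step(FinRLAction(actions=action.tolist()))
        # Gymnasium expects (obs, reward, terminated, truncated, info);
        # the client reports a single done flag, treated here as termination
        return (
            np.array(result.observation.state, dtype=np.float32),
            float(result.reward or 0.0),
            bool(result.done),
            False,
            result.observation.metadata or {}
        )

    def close(self):
        self.env.close()

# Train
env = SB3FinRLWrapper("http://localhost:8000")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)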

Troubleshooting

Server won’t start

Check if base image exists:
```
docker images | grep envtorch-base
```

Build base image if missing:

docker build -t envtorch-base:latest -f src/openenv/core/containers/images/Dockerfile .

Import errors

Make sure you’re in the src directory:

cd OpenEnv/src
python -c "from envs.finrl_env import FinRLEnv"

Configuration errors

Verify your data file has all required columns:

import pandas as pd
df = pd.read_csv('your_data.csv')
print(df.columns.tolist())

Examples

See the examples/ directory for complete examples:

examples/finrl_simple.py - Basic usage
examples/finrl_training.py - Full training loop with PPO
examples/finrl_backtesting.py - Backtesting a trained agent

License

BSD 3-Clause License (see LICENSE file in repository root)
