Online Environments¶
Online environments connect to real trading APIs for paper trading or live execution. They provide the same TorchTrade interface as offline environments but fetch real-time market data from exchanges. Each offline environment has a corresponding live counterpart:
| Offline (Training) | Live (Deployment) |
|---|---|
SequentialTradingEnv (spot, leverage=1) |
AlpacaTorchTradingEnv |
SequentialTradingEnv (futures, leverage>1) |
BinanceFutures / BitgetFutures / BybitFutures / OKXFuturesTorchTradingEnv |
| SequentialTradingEnvSLTP (spot) | AlpacaSLTPTorchTradingEnv |
| SequentialTradingEnvSLTP (futures) | BinanceFuturesSLTP / BitgetFuturesSLTP / BybitFuturesSLTP / OKXFuturesSLTPTorchTradingEnv |
| OneStepTradingEnv (spot) | AlpacaSLTPTorchTradingEnv |
| OneStepTradingEnv (futures) | BinanceFuturesSLTP / BitgetFuturesSLTP / BybitFuturesSLTP / OKXFuturesSLTPTorchTradingEnv |
Supported Exchanges:
- Alpaca - Commission-free US stocks and crypto with paper trading
- Binance - Cryptocurrency futures with high leverage and testnet
- Bitget - Cryptocurrency futures with competitive fees and testnet
- Bybit - Cryptocurrency derivatives with native bracket orders and testnet
- OKX - Global cryptocurrency exchange with bracket orders via attachAlgoOrds and demo trading
- Polymarket - Decentralized prediction markets on Polygon (CLOB), with dry-run paper trading
Overview¶
| Environment | Exchange | Asset Type | Futures | Leverage | Bracket Orders |
|---|---|---|---|---|---|
| AlpacaTorchTradingEnv | Alpaca | Crypto/Stocks | - | - | - |
| AlpacaSLTPTorchTradingEnv | Alpaca | Crypto/Stocks | - | - | Yes |
| BinanceFuturesTorchTradingEnv | Binance | Crypto | Yes | Yes | - |
| BinanceFuturesSLTPTorchTradingEnv | Binance | Crypto | Yes | Yes | Yes |
| BitgetFuturesTorchTradingEnv | Bitget | Crypto | Yes | Yes | - |
| BitgetFuturesSLTPTorchTradingEnv | Bitget | Crypto | Yes | Yes | Yes |
| BybitFuturesTorchTradingEnv | Bybit | Crypto | Yes | Yes | - |
| BybitFuturesSLTPTorchTradingEnv | Bybit | Crypto | Yes | Yes | Yes |
| OKXFuturesTorchTradingEnv | OKX | Crypto | Yes | Yes | - |
| OKXFuturesSLTPTorchTradingEnv | OKX | Crypto | Yes | Yes | Yes |
| PolymarketBetEnv | Polymarket | Prediction markets | - | - | - |
Fractional Position Sizing¶
Non-SLTP environments use action_levels for fractional position sizing — same as offline environments. Action values in [-1.0, 1.0] represent the fraction of balance to allocate. When adjusting positions (e.g., 100% to 50%), the environment only trades the delta, reducing transaction fees.
Live environments use a query-first pattern: they query the actual exchange position (source of truth), calculate the target based on the action, round to exchange constraints (lot size, min notional), and trade only the delta.
Timeframe Format - Critical for Model Compatibility
Always use canonical timeframe forms:
- Alpaca:
["1Min", "5Min", "15Min", "1Hour", "1Day"] - Binance/Bitget/Bybit/OKX:
["1m", "5m", "15m", "1h", "1d"] - Wrong:
["60min"],["60m"],["24hour"]— these create different observation keys and break model compatibility.
Alpaca Environments¶
Alpaca provides commission-free trading for US stocks and cryptocurrencies with paper trading support.
AlpacaTorchTradingEnv¶
from torchtrade.envs.alpaca import AlpacaTorchTradingEnv, AlpacaTradingEnvConfig
config = AlpacaTradingEnvConfig(
symbol="BTC/USD",
time_frames=["1Min", "5Min", "15Min", "1Hour"],
window_sizes=[12, 8, 8, 24],
execute_on="5Min",
paper=True, # Paper trading (recommended!)
)
env = AlpacaTorchTradingEnv(config)
AlpacaSLTPTorchTradingEnv¶
from torchtrade.envs.alpaca import AlpacaSLTPTorchTradingEnv, AlpacaSLTPTradingEnvConfig
config = AlpacaSLTPTradingEnvConfig(
symbol="BTC/USD",
time_frames=["1Min", "5Min", "15Min"],
window_sizes=[12, 8, 8],
execute_on="5Min",
stoploss_levels=[-0.02, -0.05],
takeprofit_levels=[0.05, 0.10],
paper=True,
)
env = AlpacaSLTPTorchTradingEnv(config)
# Action space: HOLD + 4 SL/TP combinations = 5 actions
Binance Environments¶
Binance provides cryptocurrency futures trading with up to 125x leverage and testnet support.
BinanceFuturesTorchTradingEnv¶
from torchtrade.envs.binance import BinanceFuturesTorchTradingEnv, BinanceFuturesTradingEnvConfig
config = BinanceFuturesTradingEnvConfig(
symbol="BTCUSDT",
intervals=["1m", "5m", "15m"],
window_sizes=[12, 8, 8],
execute_on="1m",
leverage=5,
quantity_per_trade=0.01,
demo=True, # Testnet (recommended!)
)
env = BinanceFuturesTorchTradingEnv(config)
BinanceFuturesSLTPTorchTradingEnv¶
from torchtrade.envs.binance import BinanceFuturesSLTPTorchTradingEnv, BinanceFuturesSLTPTradingEnvConfig
config = BinanceFuturesSLTPTradingEnvConfig(
symbol="BTCUSDT",
intervals=["1m", "5m"],
window_sizes=[12, 8],
execute_on="1m",
stoploss_levels=[-0.02, -0.05],
takeprofit_levels=[0.03, 0.06, 0.10],
leverage=5,
quantity_per_trade=0.01, # 0.01 BTC per trade (quantity mode)
trade_mode="quantity", # "quantity", "notional", or "fractional"
demo=True,
)
# For consistent USD-sized positions regardless of BTC price:
# config = BinanceFuturesSLTPTradingEnvConfig(
# ...
# quantity_per_trade=500.0, # $500 per trade
# trade_mode="notional",
# )
# For fractional portfolio-based sizing (10% of account per trade):
# config = BinanceFuturesSLTPTradingEnvConfig(
# ...
# trade_mode="fractional",
# position_fraction=0.1,
# )
env = BinanceFuturesSLTPTorchTradingEnv(config)
# Action space: HOLD + 2×(2 SL × 3 TP) = 13 actions (long + short)
Bitget Environments¶
Bitget provides cryptocurrency futures trading with competitive fees and testnet support. TorchTrade uses CCXT to interface with Bitget's V2 API.
CCXT Symbol Format
Bitget uses CCXT's perpetual swap format: "BTC/USDT:USDT" (not "BTCUSDT").
BitgetFuturesTorchTradingEnv¶
from torchtrade.envs.bitget import BitgetFuturesTorchTradingEnv, BitgetFuturesTradingEnvConfig
from torchtrade.envs.bitget.futures_order_executor import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = BitgetFuturesTradingEnvConfig(
symbol="BTC/USDT:USDT",
time_frames=["5min", "15min"],
window_sizes=[6, 32],
execute_on="1min",
product_type="USDT-FUTURES",
leverage=5,
quantity_per_trade=0.002,
margin_mode=MarginMode.ISOLATED, # ISOLATED (safer) or CROSSED
position_mode=PositionMode.ONE_WAY, # ONE_WAY (simpler) or HEDGE
demo=True,
)
env = BitgetFuturesTorchTradingEnv(
config,
api_key=os.getenv("BITGETACCESSAPIKEY"),
api_secret=os.getenv("BITGETSECRETKEY"),
api_passphrase=os.getenv("BITGETPASSPHRASE"),
)
BitgetFuturesSLTPTorchTradingEnv¶
from torchtrade.envs.bitget import BitgetFuturesSLTPTorchTradingEnv, BitgetFuturesSLTPTradingEnvConfig
from torchtrade.envs.bitget.futures_order_executor import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = BitgetFuturesSLTPTradingEnvConfig(
symbol="BTC/USDT:USDT",
time_frames=["5min", "15min"],
window_sizes=[6, 32],
execute_on="1min",
stoploss_levels=[-0.025, -0.05, -0.1],
takeprofit_levels=[0.05, 0.1, 0.2],
include_hold_action=True,
product_type="USDT-FUTURES",
leverage=5,
quantity_per_trade=0.002, # 0.002 BTC per trade (quantity mode)
trade_mode="quantity", # "quantity", "notional", or "fractional"
margin_mode=MarginMode.ISOLATED,
position_mode=PositionMode.ONE_WAY,
demo=True,
)
env = BitgetFuturesSLTPTorchTradingEnv(
config,
api_key=os.getenv("BITGETACCESSAPIKEY"),
api_secret=os.getenv("BITGETSECRETKEY"),
api_passphrase=os.getenv("BITGETPASSPHRASE"),
)
# Action space: HOLD + 2×(3 SL × 3 TP) = 19 actions (long + short)
Bybit Environments¶
Bybit provides cryptocurrency derivatives trading with native bracket order support (stop-loss/take-profit set directly on orders). TorchTrade uses pybit, Bybit's official Python SDK, for direct v5 API access.
Symbol Format
Bybit uses simple concatenated symbols: "BTCUSDT" (not "BTC/USDT:USDT").
BybitFuturesTorchTradingEnv¶
from torchtrade.envs.bybit import BybitFuturesTorchTradingEnv, BybitFuturesTradingEnvConfig
from torchtrade.envs.bybit import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = BybitFuturesTradingEnvConfig(
symbol="BTCUSDT",
time_frames=["5m", "15m"],
window_sizes=[6, 32],
execute_on="1m",
leverage=5,
margin_mode=MarginMode.ISOLATED, # ISOLATED (safer) or CROSSED
position_mode=PositionMode.ONE_WAY, # ONE_WAY (simpler) or HEDGE
demo=True, # Testnet (recommended!)
)
env = BybitFuturesTorchTradingEnv(
config,
api_key=os.getenv("BYBIT_API_KEY"),
api_secret=os.getenv("BYBIT_API_SECRET"),
)
BybitFuturesSLTPTorchTradingEnv¶
from torchtrade.envs.bybit import BybitFuturesSLTPTorchTradingEnv, BybitFuturesSLTPTradingEnvConfig
from torchtrade.envs.bybit import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = BybitFuturesSLTPTradingEnvConfig(
symbol="BTCUSDT",
time_frames=["5m", "15m"],
window_sizes=[6, 32],
execute_on="1m",
stoploss_levels=(-0.025, -0.05, -0.1),
takeprofit_levels=(0.05, 0.1, 0.2),
include_hold_action=True,
leverage=5,
quantity_per_trade=0.002, # 0.002 BTC per trade (quantity mode)
trade_mode="quantity", # "quantity", "notional", or "fractional"
margin_mode=MarginMode.ISOLATED,
position_mode=PositionMode.ONE_WAY,
demo=True,
)
env = BybitFuturesSLTPTorchTradingEnv(
config,
api_key=os.getenv("BYBIT_API_KEY"),
api_secret=os.getenv("BYBIT_API_SECRET"),
)
# Action space: HOLD + 2×(3 SL × 3 TP) = 19 actions (long + short)
OKX Environments¶
OKX provides global cryptocurrency trading with up to 125x leverage and demo trading support. TorchTrade uses python-okx, OKX's official Python SDK. Bracket orders (SL/TP) are placed atomically via OKX's attachAlgoOrds parameter.
Symbol Format
OKX uses dash-separated symbols with swap suffix: "BTC-USDT-SWAP". The environment auto-normalizes common formats ("BTCUSDT", "BTC/USDT", "BTC-USDT") to the correct OKX format.
Passphrase Required
OKX API requires a passphrase in addition to API key and secret.
OKXFuturesTorchTradingEnv¶
from torchtrade.envs.live.okx import OKXFuturesTorchTradingEnv, OKXFuturesTradingEnvConfig
from torchtrade.envs.live.okx import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = OKXFuturesTradingEnvConfig(
symbol="BTC-USDT-SWAP",
time_frames=["5m", "15m"],
window_sizes=[6, 32],
execute_on="1m",
leverage=5,
margin_mode=MarginMode.ISOLATED, # ISOLATED (safer) or CROSS
position_mode=PositionMode.NET, # NET (simpler) or LONG_SHORT
demo=True, # Demo trading (recommended!)
)
env = OKXFuturesTorchTradingEnv(
config,
api_key=os.getenv("OKX_API_KEY"),
api_secret=os.getenv("OKX_API_SECRET"),
passphrase=os.getenv("OKX_PASSPHRASE"),
)
OKXFuturesSLTPTorchTradingEnv¶
from torchtrade.envs.live.okx import OKXFuturesSLTPTorchTradingEnv, OKXFuturesSLTPTradingEnvConfig
from torchtrade.envs.live.okx import MarginMode, PositionMode
import os
from dotenv import load_dotenv
load_dotenv()
config = OKXFuturesSLTPTradingEnvConfig(
symbol="BTC-USDT-SWAP",
time_frames=["5m", "15m"],
window_sizes=[6, 32],
execute_on="1m",
stoploss_levels=(-0.025, -0.05, -0.1),
takeprofit_levels=(0.05, 0.1, 0.2),
include_hold_action=True,
leverage=5,
quantity_per_trade=0.002, # 0.002 BTC per trade (quantity mode)
trade_mode="quantity", # "quantity", "notional", or "fractional"
margin_mode=MarginMode.ISOLATED,
position_mode=PositionMode.NET,
demo=True,
)
env = OKXFuturesSLTPTorchTradingEnv(
config,
api_key=os.getenv("OKX_API_KEY"),
api_secret=os.getenv("OKX_API_SECRET"),
passphrase=os.getenv("OKX_PASSPHRASE"),
)
# Action space: HOLD + 2×(3 SL × 3 TP) = 19 actions (long + short)
Polymarket Environment¶
Polymarket is a decentralized prediction market on Polygon. TorchTrade currently exposes a single env, PolymarketBetEnv, tailored for short-cadence binary markets, Polymarket runs continuous 5-minute, 15-minute, 1-hour, and 4-hour crypto "up/down" markets (BTC, ETH, SOL) plus daily markets. Each step is an independent bet: place direction, wait for resolution, collect realized payoff, advance to the next market in the series. There is no carried position, so the observation deliberately omits any account_state.
Starter environment, more to come
PolymarketBetEnv is intentionally a starter env matching the most common Polymarket use case for TorchTrade (rolling binary up/down bets). Polymarket also has multi-strike daily price markets, sports, politics, and longer-horizon markets that benefit from a different env shape. Additional Polymarket envs will be added based on user requests and as new market types appear, open an issue describing the pattern you need.
Authentication
Polymarket uses a Polygon private key, not an API key/secret pair. The key is used to derive CLOB API credentials at runtime.
USDC Required
The wallet must hold USDC.e on Polygon to place real orders. Use dry_run=True to validate the pipeline without spending capital, dry_run works without py-clob-client installed.
PolymarketBetEnv¶
from torchtrade.envs.live.polymarket import PolymarketBetEnv, PolymarketBetEnvConfig
import os
from dotenv import load_dotenv
load_dotenv()
config = PolymarketBetEnvConfig(
market_slug_prefix="btc-updown-5m-", # discover via scan_markets.py
bet_fraction=0.01, # stake 1 % of cash per bet
max_steps=10, # 10 bets per episode
initial_cash=1_000.0, # for dry-run accounting
dry_run=True, # paper trading (recommended!)
)
env = PolymarketBetEnv(
config=config,
private_key=os.getenv("POLYGON_PRIVATE_KEY", ""),
)
# observation_spec: {"market_state": (4,)} → [yes_price, spread, vol_24h, liquidity]
# action_spec: Categorical(2) → 0 = Down, 1 = Up
Each step():
1. Submits the bet on the current market (skipped in dry_run).
2. Sleeps until the market's endDate plus a small grace period.
3. Polls Polymarket's CLOB at clob.polymarket.com/midpoint?token_id=… for each outcome token; the market is resolved once the YES midpoint is ≥ 0.99 and the NO midpoint is ≤ 0.01 (Up won), or vice versa (Down won). The CLOB is used here rather than Gamma's outcomePrices because Gamma evicts short-cadence markets within minutes of endDate and its prices field is a stale snapshot, not a live mid.
4. Computes realized payoff, a win pays stake × (1 − fill) / fill; a loss returns −stake.
5. Picks the next active market matching market_slug_prefix and returns its market_state.
Discovering markets and slug prefixes¶
MarketScanner is the discovery tool. Use it once to identify the slug prefix that points at the series you want, then plug it into the env. Two query strategies depending on your filters:
- Browsing mode (no
slug_prefixand nomax_time_to_resolution_minutes), sorts by 24-hour volume descending, surfaces high-volume markets first. Best for "what's hot right now?". - Targeting upcoming mode (either of those is set), sorts by
endDateascending and uses Gamma'send_date_min=nowfilter. Required for short-cadence series likebtc-updown-5m-, which typically have $0 24-hour volume and never make the volume-sorted top page.
from torchtrade.envs.live.polymarket import MarketScanner, MarketScannerConfig
# Generic discovery
scanner = MarketScanner(MarketScannerConfig(
min_volume_24h=10_000.0,
max_markets=10,
))
for m in scanner.scan():
print(m.slug, m.yes_price, m.volume_24h)
# Lock to a series (5-minute Bitcoin)
scanner = MarketScanner(MarketScannerConfig(slug_prefix="btc-updown-5m-"))
for m in scanner.scan():
print(m.slug, m.end_date, m.liquidity)
# Find any short-cadence market resolving in <30 minutes
scanner = MarketScanner(MarketScannerConfig(max_time_to_resolution_minutes=30))
# Fuzzy keyword filter (case-insensitive, OR-match across question + slug)
scanner = MarketScanner(MarketScannerConfig(keyword=["btc", "bitcoin", "crypto"]))
CLI flags¶
The companion CLI is examples/broker/polymarket/scan_markets.py. All flags map 1:1 to fields on MarketScannerConfig:
| Flag | Purpose |
|---|---|
--slug-prefix |
Exact, structural prefix match on the market slug (case-sensitive). This is the env-side primitive, once you've found a prefix you're happy with, it's the identifier you paste into PolymarketBetEnvConfig.market_slug_prefix. Examples: btc-updown-5m-, bitcoin-up-or-down-, eth-updown-15m-. |
--keyword |
Fuzzy, exploratory substring match (case-insensitive, applied to both question and slug). Pass one or more terms, a market passes the filter if any term appears: --keyword btc bitcoin crypto matches anything containing btc OR bitcoin OR crypto. Quote multi-word terms (--keyword "world cup"). Use this to find a series you don't already know the prefix of, then read the matching slug column to identify the prefix for the env. |
--min-volume |
Minimum 24-hour USDC volume. Default 0. Raise to skip thin markets when browsing. |
--min-liquidity |
Minimum order-book liquidity in USDC. Default 0. |
--min-resolution-hours |
Lower bound on time-to-resolution. Default 0. |
--max-resolution-minutes |
Upper bound on time-to-resolution. Default unset. Set to e.g. 30 to surface short-cadence markets that volume-sorted browsing would otherwise miss. |
--max |
Cap on rows printed. Default 20. |
Typical workflow: discover with --keyword, identify the slug stem, then verify with --slug-prefix:
# 1. Fuzzy: "what crypto markets exist resolving soon?"
python examples/broker/polymarket/scan_markets.py --keyword btc bitcoin --max-resolution-minutes 30
# 2. Identify a stem in the output (e.g. btc-updown-5m-) and verify
python examples/broker/polymarket/scan_markets.py --slug-prefix btc-updown-5m- --max 5
# 3. Plug that stem into PolymarketBetEnvConfig(market_slug_prefix="btc-updown-5m-")
Currently active short-cadence series¶
| Series | Cadence | Slug prefix |
|---|---|---|
| BTC up/down (5 min) | 5 min | btc-updown-5m- |
| BTC up/down (15 min) | 15 min | btc-updown-15m- |
| BTC up/down (1 hour) | 1 h | bitcoin-up-or-down- |
| BTC up/down (4 hour) | 4 h | btc-updown-4h- |
| ETH up/down (5 min) | 5 min | eth-updown-5m- |
| ETH up/down (15 min) | 15 min | eth-updown-15m- |
| SOL up/down (5 min) | 5 min | sol-updown-5m- |
| BTC daily up/down | 1 day | bitcoin-up-or-down-on- |
Polymarket adds and renames series occasionally; re-running scan_markets.py to verify the prefix before configuring the env is the safe move.
Adding external context (OHLCV, microfeatures, multi-timeframe)¶
PolymarketBetEnv's default observation is intentionally minimal: a 4-element market_state = [yes_price, spread, volume_24h, liquidity]. For "BTC up/down in 5 min?" markets, the obvious thing to add is live BTC OHLCV from a spot exchange, the policy needs price action to take an informed view, not just the implied probability already baked into yes_price.
The idiomatic TorchRL way to do this is TransformedEnv plus a small Transform that fetches your data each step and adds it to the observation TensorDict. TorchTrade ships BinanceOHLCVTransform for exactly this, wraps BinanceObservationClass (no API key required for public OHLCV), pulls multi-timeframe windows each step, and extends the observation spec to match.
Example: Binance OHLCV alongside the Polymarket observation¶
from torchrl.envs import TransformedEnv
from torchtrade.envs.live.polymarket import PolymarketBetEnv, PolymarketBetEnvConfig
from torchtrade.envs.transforms import BinanceOHLCVTransform
config = PolymarketBetEnvConfig(market_slug_prefix="btc-updown-5m-", dry_run=True)
env = TransformedEnv(
PolymarketBetEnv(config),
BinanceOHLCVTransform(), # defaults: BTCUSDT, 1m/5m/15m × 60/30/20
)
td = env.reset()
# td now has:
# market_state (4,)
# ohlcv_1Minute_60 (60, n_features)
# ohlcv_5Minute_30 (30, n_features)
# ohlcv_15Minute_20 (20, n_features)
# terminated, truncated
Override the symbol, timeframes, or window sizes to suit your market:
from torchtrade.envs.utils.timeframe import TimeFrame, TimeFrameUnit
env = TransformedEnv(
PolymarketBetEnv(PolymarketBetEnvConfig(market_slug_prefix="eth-updown-15m-")),
BinanceOHLCVTransform(
symbol="ETHUSDT",
time_frames=[TimeFrame(15, TimeFrameUnit.Minute), TimeFrame(1, TimeFrameUnit.Hour)],
window_sizes=[40, 24],
key_prefix="eth_ohlcv", # → keys ``eth_ohlcv_15Minute_40`` etc.
),
)
Customizing the features (microfeatures / technicals / cross-asset)¶
BinanceObservationClass accepts a feature_preprocessing_fn(df) -> df that returns a DataFrame with feature columns prefixed feature_. The default is normalized OHLC pct-change. Use it to inject:
- Technicals, RSI, MACD, Bollinger bands, etc.
- Microfeatures, order-book imbalance, recent trade-flow imbalance, realized variance.
- Cross-asset signals, ETH/BTC correlation features, dollar index, anything pulled from another
requestscall.
import numpy as np
from torchtrade.envs.transforms import BinanceOHLCVTransform
def my_features(df):
df = df.copy()
df["feature_log_return"] = np.log(df["close"] / df["close"].shift(1)).fillna(0)
df["feature_rolling_vol"] = df["feature_log_return"].rolling(20).std().fillna(0)
df["feature_volume_pct"] = df["volume"] / df["volume"].rolling(20).mean()
return df.dropna()
augment = BinanceOHLCVTransform(feature_preprocessing_fn=my_features)
Other data sources¶
For purely on-chain signals, custom microfeatures pipelines, or non-Binance exchanges, write your own Transform following the same pattern, _reset, _step, and transform_observation_spec. The BinanceOHLCVTransform source (torchtrade/envs/transforms/binance_ohlcv.py) is short enough to read and adapt.
Roadmap: text and news context
A natural next step for PolymarketBetEnv is augmenting the observation with text-based signals that the LLM actor can reason over directly: real-time news headlines (e.g. via the X / Twitter API or a news aggregator), on-chain whale-flow alerts, macro-event feeds, and similar. The same Transform skeleton applies, just with non-numeric output written into the TensorDict (e.g. as NonTensorData strings). Open an issue or PR with the data source you want and we can ship a NewsContextTransform (or similar) the same way BinanceOHLCVTransform was added.
Driving the env with an LLM actor¶
TorchTrade ships FrontierLLMActor (OpenAI-compatible) and LocalLLMActor (local HF model via vLLM). Both work directly with PolymarketBetEnv, pass an empty account_state_labels to skip that prompt section, point market_data_keys at "market_state", and override the action descriptions to use binary up/down language:
import os
import torch
from dotenv import load_dotenv
from torchtrade.actor import FrontierLLMActor
from torchtrade.envs.live.polymarket import (
PolymarketBetEnv,
PolymarketBetEnvConfig,
)
load_dotenv()
env = PolymarketBetEnv(
PolymarketBetEnvConfig(
market_slug_prefix="btc-updown-5m-",
max_steps=4,
dry_run=True,
),
)
actor = FrontierLLMActor(
model="gpt-5-nano", # any OpenAI-compatible model
market_data_keys=["market_state"], # the env's only obs key
account_state_labels=[], # PolymarketBetEnv has no account_state
action_levels=[0, 1], # binary action space
action_descriptions=[ # override "target exposure" wording
"Action 0 → bet DOWN (price will be lower at resolution)",
"Action 1 → bet UP (price will be higher at resolution)",
],
feature_keys=["yes_price", "spread", "volume_24h", "liquidity"],
symbol="BTC up/down (5 min)",
execute_on="5Minute",
)
td = env.reset()
while not bool(td.get("done", torch.zeros(1, dtype=torch.bool)).item()):
td = actor(td) # writes "action" into td
td = env.step(td)["next"]
env.close()
The actor expects responses in <answer>N</answer> format (handled by BaseLLMActor._extract_action); models that don't follow it default to action 0. Set debug=True on the actor to print the system prompt, user prompt, and raw model response each step, useful while tuning.
Combining with BinanceOHLCVTransform¶
The transform exposes its OHLCV windows under keys like ohlcv_1Minute_60. Add those to market_data_keys and the actor will render them alongside market_state using the standard 2D-window layout (one row per bar, one column per feature):
from torchrl.envs import TransformedEnv
from torchtrade.envs.transforms import BinanceOHLCVTransform
env = TransformedEnv(
PolymarketBetEnv(PolymarketBetEnvConfig(market_slug_prefix="btc-updown-5m-")),
BinanceOHLCVTransform(), # defaults: 1m/5m/15m × 60/30/20
)
actor = FrontierLLMActor(
model="gpt-5-nano",
market_data_keys=[
"market_state",
"ohlcv_1Minute_60",
"ohlcv_5Minute_30",
"ohlcv_15Minute_20",
],
account_state_labels=[],
action_levels=[0, 1],
action_descriptions=[
"Action 0 → bet DOWN",
"Action 1 → bet UP",
],
)
For local inference (no API key required), swap FrontierLLMActor for LocalLLMActor, the constructor kwargs are identical.
Position Sizing (SLTP Environments)¶
All live SLTP environments support three position sizing modes, matching the offline SLTP environments for train-deploy consistency.
| Mode | Config field | Formula | Use case |
|---|---|---|---|
"fractional" |
position_fraction |
account_balance × fraction × leverage / price |
Adaptive sizing (scales with account) |
"notional" |
quantity_per_trade |
quantity_per_trade / price |
Fixed USD per trade |
"quantity" |
quantity_per_trade |
quantity_per_trade directly |
Fixed base-asset units per trade |
# Fractional: risk 10% of account per bracket order
config = BinanceFuturesSLTPTradingEnvConfig(
trade_mode="fractional",
position_fraction=0.1, # 10% of account balance
leverage=5,
...
)
# Notional: always trade $500 USD worth
config = BinanceFuturesSLTPTradingEnvConfig(
trade_mode="notional",
quantity_per_trade=500.0, # $500 per trade
...
)
# Quantity: always trade exactly 0.001 BTC (default)
config = BinanceFuturesSLTPTradingEnvConfig(
trade_mode="quantity", # default
quantity_per_trade=0.001,
...
)
Alpaca
Alpaca SLTP supports "fractional" and "notional" modes only. "quantity" mode is not supported because Alpaca's bracket order API requires dollar amounts.
Position Locking¶
All live SLTP environments support lock_position_until_sltp. When enabled, agent actions are ignored while a position is open — positions can only exit via the exchange's bracket orders (SL/TP triggers).
config = BinanceFuturesSLTPTradingEnvConfig(
lock_position_until_sltp=True, # Positions exit only via SL/TP
...
)
This is useful for deploying policies trained on OneStepTradingEnv where positions are inherently locked to a single decision. See the offline Position Locking docs for details.
Replay Mode (Historical Data Simulation)¶
All live environments (both SLTP and non-SLTP) support replaying historical data through the live pipeline using ReplayObserver and ReplayOrderExecutor. This is useful for:
- Verifying that the live pipeline produces identical results to backtesting
- Data-driven integration tests with real price data
- Catching feature ordering, normalization, or action mapping bugs before deployment
from torchtrade.envs.replay import ReplayObserver, ReplayOrderExecutor
# Create replay components from historical data
executor = ReplayOrderExecutor(initial_balance=10000, leverage=5, transaction_fee=0.0004)
observer = ReplayObserver(
df=historical_df, # DataFrame with timestamp, open, high, low, close, volume
time_frames=config.time_frames,
window_sizes=config.window_sizes,
execute_on=config.execute_on,
feature_preprocessing_fn=feature_fn,
executor=executor,
)
# Inject into any live env — no code changes needed
env = BinanceFuturesSLTPTorchTradingEnv(config, observer=observer, trader=executor)
# Also works with non-SLTP envs:
# env = BinanceFuturesTorchTradingEnv(config, observer=observer, trader=executor)
td = env.reset()
# Run episode with historical data through the exact live pipeline
The ReplayOrderExecutor simulates bracket order execution with intrabar SL/TP detection (checks high/low, SL before TP).
End-of-data handling: The observer raises StopIteration when historical data is exhausted. Wrap your evaluation loop accordingly:
try:
while True:
action = policy(td)
td = env.step(action)["next"]
except StopIteration:
pass # End of historical data
API Key Setup¶
Each exchange requires API keys stored in a .env file:
# Alpaca (https://alpaca.markets/signup)
API_KEY=your_alpaca_api_key
SECRET_KEY=your_alpaca_secret_key
# Binance (https://www.binance.com/en/my/settings/api-management)
BINANCE_API_KEY=your_binance_api_key
BINANCE_SECRET_KEY=your_binance_secret_key
# Bitget (https://www.bitget.com/en/support/articles/360038859731)
BITGETACCESSAPIKEY=your_bitget_api_key
BITGETSECRETKEY=your_bitget_secret_key
BITGETPASSPHRASE=your_bitget_passphrase
# Bybit (https://www.bybit.com/app/user/api-management)
BYBIT_API_KEY=your_bybit_api_key
BYBIT_API_SECRET=your_bybit_api_secret
# OKX (https://www.okx.com/account/my-api)
OKX_API_KEY=your_okx_api_key
OKX_API_SECRET=your_okx_api_secret
OKX_PASSPHRASE=your_okx_passphrase
# Polymarket (Polygon wallet, fund with USDC.e)
POLYGON_PRIVATE_KEY=your_polygon_wallet_private_key
Always Start with Paper/Testnet Trading
Set paper=True (Alpaca), demo=True (Binance/Bitget/Bybit/OKX), or dry_run=True (Polymarket) before using real funds. Start with low leverage (2-5x) for futures.
Exchange Comparison¶
| Feature | Alpaca | Binance | Bitget | Bybit | OKX | Polymarket |
|---|---|---|---|---|---|---|
| Asset Types | Stocks, Crypto | Crypto | Crypto | Crypto | Crypto | Prediction markets |
| Futures | - | Yes | Yes | Yes | Yes | - |
| Max Leverage | 1x | 125x | 125x | 100x | 125x | 1x |
| Paper Trading | Yes | Yes (Testnet) | Yes (Testnet) | Yes (Testnet) | Yes (Demo) | Yes (dry_run) |
| Commission | Free | 0.02%/0.04% | 0.02%/0.06% | 0.02%/0.055% | 0.02%/0.05% | Variable (CLOB) |
| SDK | Alpaca SDK | CCXT | CCXT | pybit (native) | python-okx (native) | py-clob-client |
Requesting New Exchanges¶
Need support for another exchange (Interactive Brokers, Kraken, etc.)? Create an issue or email us at torchtradecontact@gmail.com.