Transforming complex, noisy, high-dimensional market data into compact latent representations that let reinforcement learning agents reason about regime, risk, and scenarios.
Reinforcement learning agents fail when the raw state space is too noisy, sparse, or unstable. The same price pattern means different things under different volatility regimes, liquidity conditions, macro contexts, or historical setups.
Instead of feeding agents raw variables, we learn compressed representations of market structure — latent states that preserve regime, trend, volatility, risk asymmetry, market memory, and scenario probability.
The market state is not a flat vector of indicators. It is encoded into a latent representation that summarizes the structure of the current market context — which the RL agent then uses to reason, act, and improve.
Different methods suit different tasks. A VAE is useful for probabilistic scenario generation. A SOM reveals regime topology. A transformer encoder captures long-range dependencies. A world model simulates future transitions.
Learn compressed market state by reconstructing input through a bottleneck. VAE adds a probabilistic latent space — ideal for uncertainty-aware encoding and scenario sampling.
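As a concrete illustration, here is a minimal VAE sketch in PyTorch. The `MarketVAE` class, feature count, and latent size are illustrative assumptions, not a prescribed architecture.

```python
import torch
import torch.nn as nn

class MarketVAE(nn.Module):
    """Illustrative VAE over a flat market feature vector."""
    def __init__(self, n_features: int = 64, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)       # mean of q(z|x)
        self.log_var = nn.Linear(32, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, n_features),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, log_var = self.mu(h), self.log_var(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        recon = self.decoder(z)
        # ELBO: reconstruction error plus KL(q(z|x) || N(0, I)).
        kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
        loss = nn.functional.mse_loss(recon, x) + kl
        return z, recon, loss
```

The probabilistic bottleneck is what enables the scenario sampling described later: one observed state encodes to a distribution, not a point.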
Project high-dimensional market states into a structured 2D topology preserving similarity. Each region maps to a distinct market condition: trending, ranging, volatile, compressing.
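A minimal sketch of fitting such a map, assuming the `minisom` package; the grid size and the placeholder feature matrix `X` are illustrative choices.

```python
import numpy as np
from minisom import MiniSom

X = np.random.randn(5000, 16)            # placeholder: 5000 states, 16 features
som = MiniSom(10, 10, X.shape[1], sigma=1.5, learning_rate=0.5)
som.random_weights_init(X)
som.train_random(X, num_iteration=10000)

# Each state maps to a cell; neighboring cells hold similar market structure.
cell = som.winner(X[0])                  # (row, col) of the best-matching unit
```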
Pull similar market states together, push dissimilar ones apart. Similarity defined by forward return distribution, volatility regime, or reward consequence — not just visual resemblance.
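One way to express this outcome-based similarity is a triplet objective, sketched below in PyTorch. The encoder shape, the bucketing rule, and the margin are illustrative assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
triplet = nn.TripletMarginLoss(margin=1.0)

def contrastive_step(anchor_x, positive_x, negative_x):
    # positive_x: a state from the same forward-return / volatility bucket
    # negative_x: a state that merely *looks* similar but resolved differently
    a, p, n = encoder(anchor_x), encoder(positive_x), encoder(negative_x)
    return triplet(a, p, n)
```

The key design choice is that positives and negatives are mined from forward outcomes, not from chart similarity.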
Capture long-range temporal dependencies, multi-timeframe context, and regime transitions. The meaning of a market state depends on the path that produced it.
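A minimal sketch of encoding a window of per-bar feature vectors with a transformer, assuming PyTorch; the sizes and the last-token pooling are illustrative, and positional encoding is omitted for brevity.

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
temporal_encoder = nn.TransformerEncoder(layer, num_layers=2)

window = torch.randn(1, 128, 32)       # (batch, 128 bars, 32 features per bar)
context = temporal_encoder(window)     # self-attention across the full path
latent = context[:, -1]                # state summary at the most recent bar
```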
Learn not just current state representation but how states evolve over time. Creates an internal market simulator for scenario rollout, risk estimation, and policy evaluation.
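A minimal sketch of the dynamics component, assuming PyTorch. Predicting a Gaussian over the next latent state alongside an expected-reward head is one common design; the `LatentDynamics` class and its sizes are illustrative.

```python
import torch
import torch.nn as nn

class LatentDynamics(nn.Module):
    """Illustrative latent transition model p(z_{t+1} | z_t, a_t)."""
    def __init__(self, latent_dim: int = 8, action_dim: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 64), nn.ReLU(),
        )
        self.mu = nn.Linear(64, latent_dim)       # mean of next latent state
        self.log_var = nn.Linear(64, latent_dim)  # uncertainty of transition
        self.reward = nn.Linear(64, 1)            # expected reward head

    def forward(self, z, a):
        h = self.net(torch.cat([z, a], dim=-1))
        return self.mu(h), self.log_var(h), self.reward(h)
```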
Train representations with auxiliary objectives tied directly to future reward. Similar market states should be close not because they look alike, but because they imply similar decision risk and return distributions.
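A minimal sketch of such an auxiliary objective: the encoder is trained to predict forward return and forward realized volatility from the latent state, so states with similar decision consequences land near each other. The encoder shape and target choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
aux_head = nn.Linear(8, 2)   # predict (forward return, forward realized vol)

def aux_loss(x, fwd_return, fwd_vol):
    z = encoder(x)
    pred = aux_head(z)
    target = torch.stack([fwd_return, fwd_vol], dim=-1)
    return nn.functional.mse_loss(pred, target)
```

In practice this term is added to the main representation loss, tying the geometry of the latent space to reward-relevant structure.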
A self-organizing map trained on historical SPX states. Each cell is a learned region of market structure. Color indicates dominant regime classification. The agent moves through this map as conditions evolve — learning distinct policies for each region.
Instead of mapping each market state to a fixed point, a variational autoencoder learns a distribution over latent states. The same observed conditions can imply a range of possible market structures — which maps naturally to scenario uncertainty.
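A minimal sketch of that sampling step, reusing the illustrative `MarketVAE` class from the autoencoder sketch above: draw several latents from the posterior for one observed state and decode each into a plausible market-structure variant.

```python
import torch

vae = MarketVAE(n_features=64, latent_dim=8)
x = torch.randn(1, 64)                       # placeholder observed state
h = vae.encoder(x)
mu, log_var = vae.mu(h), vae.log_var(h)
scenarios = [
    vae.decoder(mu + torch.exp(0.5 * log_var) * torch.randn_like(mu))
    for _ in range(100)                      # 100 posterior draws
]
```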
A world model learns latent dynamics: how states evolve, what transitions are likely, and what reward can be expected — without requiring live market interaction. The goal is not perfect forecasting. The goal is a structured model of possible transitions.
Encode the compressed market context at time T: regime, momentum, volatility structure.
Predict a distribution over next latent states, p(z_{t+1} | z_t, a_t).
Simulate multiple future trajectories in latent space — each branch is a scenario.
Estimate expected reward and risk along each trajectory without live market data.
Use simulated rollouts to update the agent's policy — safer than live exploration.
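A minimal sketch of the rollout step, reusing the illustrative `LatentDynamics` class sketched earlier: branch multiple trajectories from the current latent state and accumulate predicted reward per branch. The horizon, branch count, and placeholder policy are assumptions.

```python
import torch

def rollout(dynamics, z0, policy, horizon=10, n_branches=50):
    returns = []
    for _ in range(n_branches):
        z, total = z0, 0.0
        for _ in range(horizon):
            a = policy(z)                    # agent's action in latent space
            mu, log_var, r = dynamics(z, a)
            # Sample the next latent state; each branch is one scenario.
            z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
            total = total + r.mean()
        returns.append(total)
    returns = torch.stack(returns)
    return returns.mean(), returns.std()     # expected reward and risk

dyn = LatentDynamics()                            # class from the earlier sketch
z0 = torch.randn(1, 8)
hold = lambda z: torch.zeros(z.shape[0], 3)       # placeholder "do nothing" policy
mean_reward, risk = rollout(dyn, z0, hold)
```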
Latent clusters automatically identify market regimes — trending, mean-reverting, high-volatility, compressing, transition — without manual labeling.
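A minimal sketch of this regime discovery, assuming scikit-learn; the cluster count and the placeholder latent matrix `Z` are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

Z = np.random.randn(5000, 8)             # placeholder: encoded historical states
regimes = KMeans(n_clusters=5, n_init=10).fit_predict(Z)
# Clusters are labeled after the fact from their average trend/vol statistics,
# e.g. "trending", "mean-reverting", "high-volatility", and so on.
```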
Latent states estimate the probability of continuation, reversal, breakout, or volatility expansion — grounded in the structure of the encoded state.
Given a current latent state, retrieve historically similar states and their forward outcomes — providing empirical priors for scenario reasoning.
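A minimal sketch of that retrieval, assuming scikit-learn: find the k historical latent states nearest to the current one and read off their realized forward outcomes as an empirical prior. The data arrays here are illustrative placeholders.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

Z_hist = np.random.randn(5000, 8)        # historical latent states
fwd_returns = np.random.randn(5000)      # realized forward return per state
z_now = np.random.randn(8)               # current latent state

index = NearestNeighbors(n_neighbors=50).fit(Z_hist)
_, idx = index.kneighbors(z_now.reshape(1, -1))
prior = fwd_returns[idx[0]]              # empirical forward-return distribution
print(prior.mean(), np.percentile(prior, [10, 90]))
```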
Cleaner latent states improve offline reinforcement learning from historical episodes — reducing overfitting to surface-level patterns in raw data.
Agents trained on latent representations generalize better across regime changes because the representation abstracts away regime-specific surface features.
Combine numerical features with chart image embeddings in the same latent space — testing whether visual market structure adds decision-useful signal.
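A minimal sketch of one such fusion, assuming PyTorch: a small CNN embeds the chart image, a linear layer embeds the numeric features, and the two are concatenated into a joint latent. The CNN shape, sizes, and fusion-by-concatenation are illustrative choices, not a fixed design.

```python
import torch
import torch.nn as nn

image_encoder = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (batch, 8) image embedding
)
numeric_encoder = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
fusion = nn.Linear(16, 8)                    # joint latent from both modalities

chart = torch.randn(1, 1, 64, 64)            # placeholder chart image
feats = torch.randn(1, 16)                   # placeholder numeric features
z = fusion(torch.cat([image_encoder(chart), numeric_encoder(feats)], dim=-1))
```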