A good trader makes sound decisions using all the information available from the market, a skill that usually takes years to learn. Can a computer model achieve the same feat, namely, digest a massive amount of input data and then make decisions that maximize future reward? With recent advances in deep learning, it is now possible to train a neural-network-based model for this kind of task using reinforcement learning.
Below is a simple example: an agent moves, at first randomly, in a room with walls, with its eyesight pointing ahead of it. Red and green items are scattered randomly on the floor. If the agent 'eats' a red item, it gets a positive reward; if it 'eats' a green item, it gets a negative reward. The overall goal is for the agent to maximize its total reward. At the beginning, the agent doesn't know what policy it should follow to maximize its future reward, but over time it learns to avoid states that lead to low rewards and to pick actions that lead to higher rewards instead.
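To make the learning loop above concrete, here is a minimal sketch in Python. It is not the code behind the demo: the demo trains a neural network, while this sketch swaps in a plain lookup table (tabular Q-learning) to keep things short. The corridor world, the reward values, and every name in it (`step`, `train`, `greedy_policy`, the hyperparameters) are assumptions made purely for illustration.

```python
import random

# Hypothetical toy world (not the demo's room): a corridor of 5 cells.
# Cell 0 holds a green item (reward -1), cell 4 holds a red item (reward +1).
# Eating either item ends the episode.
N_STATES = 5
START = 2
ACTIONS = (-1, +1)            # index 0: move left, index 1: move right
REWARDS = {0: -1.0, 4: 1.0}


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = state + action
    reward = REWARDS.get(next_state, 0.0)
    done = next_state in REWARDS
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # q[state][action_index] estimates the discounted future reward.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)          # explore: random action
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Terminal moves have no future value; others bootstrap from
            # the best estimate available in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known move (-1 or +1) for each non-terminal cell 1..3."""
    return [ACTIONS[row.index(max(row))] for row in q[1:-1]]


if __name__ == "__main__":
    q_table = train()
    print("greedy moves for cells 1-3:", greedy_policy(q_table))
```

Early on, the table is all zeros, so the agent wanders at random. As rewards are observed, the estimates for moves toward the green item drop and those toward the red item rise, which is the same "avoid low reward, prefer high reward" behavior described above, only on a much smaller scale.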
The model is trained in the web browser. With the current settings, it will take a few minutes for the agent to learn a good policy. If you are impatient, you can load the trained agent by clicking the button at the bottom. Refresh the page to reload the 'naive' agent.
This model uses the open-source library ConvNetJS.
(Takes ~10 minutes to train with default settings. If you're impatient, scroll down and load an example pre-trained network from pre-filled JSON)