Details
This is a reinforcement learning project that trains an A.I. agent (the snake) to maximise its rewards (the pellets) in an environment built with OpenAI Gym.
How does it work?
I'm using a technique called Decision Tree Q-learning. It builds an expanding tree that looks through all possible moves and selects the route (a list of states) that collects the most rewards/points/wins in the fewest steps.
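
To make the idea concrete, here is a minimal sketch of the tree-expansion part only. It is not the project's actual code: the grid, the `step` and `best_route` helpers, and the reward of 1 per pellet are all assumptions made for illustration, the snake's body is ignored, and no Q-value update is shown. It simply expands every move sequence up to a fixed depth and keeps the route with the most reward, breaking ties by the fewest steps.

```python
from collections import deque

# Hypothetical moves on a toy grid (not the Gym environment's action space).
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state, action, pellets, width, height):
    """Return (next_state, reward), or None if the move leaves the board."""
    x, y = state
    dx, dy = MOVES[action]
    nxt = (x + dx, y + dy)
    if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
        return None
    # Assumed reward scheme: 1 for landing on a pellet, 0 otherwise.
    return nxt, (1 if nxt in pellets else 0)


def best_route(start, pellets, width, height, max_depth):
    """Expand the tree breadth-first and return (reward, steps, route)."""
    best = (0, 0, [start])
    # Each node: head position, pellets still on the board, reward so far, route.
    frontier = deque([(start, frozenset(pellets), 0, [start])])
    while frontier:
        state, remaining, total, route = frontier.popleft()
        steps = len(route) - 1
        # Prefer more reward first, then fewer steps.
        if (total, -steps) > (best[0], -best[1]):
            best = (total, steps, route)
        if steps == max_depth:
            continue  # do not expand below the depth limit
        for action in MOVES:
            result = step(state, action, remaining, width, height)
            if result is None:
                continue  # illegal move, prune this branch
            nxt, reward = result
            # An eaten pellet is removed so it cannot be collected twice.
            frontier.append((nxt, remaining - {nxt}, total + reward, route + [nxt]))
    return best


reward, steps, route = best_route((0, 0), {(2, 0)}, width=3, height=3, max_depth=3)
print(reward, steps, route)
```

The tree grows exponentially with depth (up to four children per node here), so a sketch like this is only practical for small depth limits.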