Only Coders - Where knowledge meets opportunity

python (12.9k questions)

javascript (9.2k questions)

reactjs (4.7k questions)

java (4.2k questions)

c# (3.5k questions)

html (3.3k questions)

Questions - markov-decision-process

Shaping theorem for MDPs

I need help understanding the shaping theorem for MDPs. Here's the relevant paper: https://people.eecs.berkeley.edu/~pabbeel/cs287-fa09/readings/NgHaradaRussell-shaping-ICML1999.pdf. It basically ...

Garrett Baker

reinforcement-learning

markov-decision-process

Votes: 0

Answers: 1

Latest Answer

You are missing the assumption that for every terminal state s_T and every starting state s_0 we have f(s_T) = f(s_0) = 0. (Note that in the paper there is an assumption that after a terminal state there is alwa...

lejlot
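
The role of that assumption can be checked numerically. The following is a minimal sketch, assuming the shaping reward has the potential-based form F(s, s') = gamma * f(s') - f(s), with f the potential function named in the answer. The state labels, the random potential values, and the helper name shaping_sum are hypothetical and chosen only for illustration. Along any trajectory the discounted shaping rewards telescope to gamma^T * f(s_T) - f(s_0), so once f(s_T) = 0 the shaped return differs from the original return by a term that does not depend on the path taken.

```python
import random

def shaping_sum(states, f, gamma):
    """Discounted sum of shaping rewards F = gamma*f(s') - f(s) along a trajectory."""
    total = 0.0
    for t in range(len(states) - 1):
        s, s_next = states[t], states[t + 1]
        total += gamma ** t * (gamma * f[s_next] - f[s])
    return total

random.seed(0)
gamma = 0.9

# Hypothetical potential over states 0..4; state 4 is treated as terminal.
f = {s: random.uniform(-1.0, 1.0) for s in range(5)}
f[4] = 0.0  # the assumption from the answer: f(s_T) = 0

# Two trajectories of different lengths, same start state, same terminal state.
trajectory_a = [0, 2, 1, 3, 4]
trajectory_b = [0, 3, 4]

diff_a = shaping_sum(trajectory_a, f, gamma)
diff_b = shaping_sum(trajectory_b, f, gamma)

# Both sums collapse to -f(s_0), regardless of the path taken.
print(diff_a, diff_b, -f[0])
```

If f(s_0) = 0 is assumed as well, the shaping terms vanish entirely and the two returns coincide.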

How should I code the Gambler's Problem with Q-learning (without any reinforcement learning packages)?

I would like to solve the Gambler's problem as an MDP (Markov Decision Process). Gambler's problem: A gambler has the opportunity to make bets on the outcomes of a sequence of coin flips. If the coin ...

Dalma Tóth-Lakits

python

reinforcement-learning

q-learning

coin-flipping

markov-decision-process

Votes: 0

Answers: 1

Latest Answer

The problem seems to be that your while i<n loop never terminates. It looks like you accidentally wait until the first win before incrementing i. (You forgot to increment i when the episode ends wi...

maxy
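
A minimal sketch of tabular Q-learning for this problem, using only the standard library, might look like the following. It is not the asker's code. The goal of 100, the heads probability of 0.4, the reward of 1 for reaching the goal, and all hyperparameters are assumptions taken from the usual textbook statement of the problem, and the function name q_learning_gambler is hypothetical. The point it illustrates is the one from the answer: the episode counter i is incremented once per episode, whether the episode ends in a win or a loss, so the while i < n loop terminates.

```python
import random

def q_learning_gambler(n_episodes=500, goal=100, p_heads=0.4,
                       alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning sketch for the Gambler's problem (assumed rules)."""
    rng = random.Random(seed)
    # Q[s][a]: capital s in 1..goal-1, stake a in 1..min(s, goal - s).
    Q = {s: {a: 0.0 for a in range(1, min(s, goal - s) + 1)}
         for s in range(1, goal)}
    wins = 0
    i = 0
    while i < n_episodes:
        s = rng.randint(1, goal - 1)       # random non-terminal start capital
        while 0 < s < goal:                # capital 0 or goal ends the episode
            actions = Q[s]
            if rng.random() < epsilon:     # explore
                a = rng.choice(list(actions))
            else:                          # exploit (ties go to the smallest stake)
                a = max(actions, key=actions.get)
            s_next = s + a if rng.random() < p_heads else s - a
            reward = 1.0 if s_next == goal else 0.0
            if 0 < s_next < goal:
                target = reward + gamma * max(Q[s_next].values())
            else:
                target = reward            # no bootstrapping from terminal states
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
        if s == goal:
            wins += 1
        i += 1  # count the episode on a win AND on a loss
    return Q, wins
```

Calling `Q, wins = q_learning_gambler(n_episodes=200)` returns the learned table and the number of episodes that reached the goal. A greedy policy can then be read off with `max(Q[s], key=Q[s].get)` for each capital s.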
