Only Coders - Where knowledge meets opportunity

python (12.9k questions)

javascript (9.2k questions)

reactjs (4.7k questions)

java (4.2k questions)

c# (3.5k questions)

html (3.3k questions)

Questions - reinforcement-learning

Compile time distribution strategy issue

I have the following code, which tries to implement a simple reinforcement learning environment with Keras: import gym from gym import Env import numpy as np from gym.spaces import Discrete,Box import random...

user466534

tensorflow

keras

reinforcement-learning

Votes: 0

Answers: 1

Latest Answer

I found a solution: instead of declaring the model before passing it to the DQNAgent, I just use the functional form, like this: dqn = build_agent(build_model(states,actions), actions) dqn.compile(optimiz...

user466534

Shaping theorem for MDPs

I need help understanding the shaping theorem for MDPs. Here's the relevant paper: https://people.eecs.berkeley.edu/~pabbeel/cs287-fa09/readings/NgHaradaRussell-shaping-ICML1999.pdf It basically ...

Garrett Baker

reinforcement-learning

markov-decision-process

Votes: 0

Answers: 1

Latest Answer

You are missing the assumption that for every terminal state s_T and starting state s_0 we have f(s_T) = f(s_0) = 0. (Note that in the paper there is an assumption that after the terminal state there is alwa...

lejlot
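
The role of that assumption can be checked numerically. The sketch below assumes that f in the answer denotes the potential function and that the shaping reward on a transition s -> s' has the potential-based form gamma * f(s') - f(s) used in the linked paper; the function name and the sample potentials are made up for illustration. The discounted sum of shaping rewards along a trajectory telescopes to gamma^T * f(s_T) - f(s_0), so it vanishes when both endpoint potentials are zero.

```python
def shaping_return(potentials, gamma):
    """Discounted sum of shaping rewards along one trajectory.

    potentials: [f(s_0), f(s_1), ..., f(s_T)] for the visited states.
    The shaping reward for step t is gamma * f(s_{t+1}) - f(s_t).
    """
    total = 0.0
    for t in range(len(potentials) - 1):
        shaping_reward = gamma * potentials[t + 1] - potentials[t]
        total += (gamma ** t) * shaping_reward
    return total


gamma = 0.9

# Hypothetical potentials with f(s_0) = f(s_T) = 0: the shaping terms cancel.
zero_ends = [0.0, 2.0, -1.5, 4.0, 0.0]
print(shaping_return(zero_ends, gamma))

# Without the assumption, gamma^T * f(s_T) - f(s_0) is left over.
nonzero_ends = [1.0, 2.0, 3.0]
T = len(nonzero_ends) - 1
print(shaping_return(nonzero_ends, gamma), gamma ** T * nonzero_ends[-1] - nonzero_ends[0])
```

The intermediate potentials can be arbitrary; only the endpoint values survive the telescoping sum, which is why the assumption is stated for starting and terminal states only.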

PPO Model not loading

I have trained an agent on Colab using OpenAI Gym and stable_baselines (PPO), but when I downloaded the model to my local computer I can't load it; it throws an error. model = PPO.load(TRAINED_...

Ashish Kumar

python

deep-learning

reinforcement-learning

Votes: 0

Answers: 0

Keras GradientTape: Calculating gradients with respect to the output node

For starters: this question does not ask for help with reinforcement learning (RL); RL is only used as an example. The Keras documentation contains an example actor-critic reinforcement learning i...

CLRW97

python

tensorflow

keras

reinforcement-learning

gradienttape

Votes: 0

Answers: 1

Latest Answer

Well, after some research I found the answer myself: It is possible to extract the trainable variables of a given layer based on the layer name. Then we can apply tape.gradient and optimizer.apply_gra...

CLRW97
