Home BlogAIAI and I: Explain “Gradient Descent Toward Reward Signals”