Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment -- Zhuokai Zhao -- Personal Webpage