The coexistence of a wide variety of applications with diverse Quality of Service (QoS) requirements calls for more sophisticated radio resource scheduling (RRS) in 5G networks than in previous generations. To address this challenge, a growing body of research formulates the RRS problem as a Markov decision process (MDP) and solves it using deep reinforcement learning (DRL). A key consideration when formulating an MDP is the choice of reward function, which defines the objective of the decision agent. Although the reward function is a critical component of an MDP, there is currently no systematic study of how different reward functions affect network performance. To fill this gap, we carry out a comparative study of delay and overflow performance under several reward functions designed to minimize packet delays. Through extensive simulations under diverse traffic and channel conditions, we identify a reward function that achieves near-optimal delay with 55–67% fewer packet drops than the other investigated options, and requires no tuning.