We study multi-armed bandit problems that use conditional value-at-risk as the underlying risk measure. In particular, we propose a new upper confidence bound algorithm and compare it with state-of-the-art alternatives with respect to various definitions of regret from the risk-averse online learning literature. For each comparison, we demonstrate that our algorithm achieves a regret bound that is either strictly better or comparable. Finally, we complement our theoretical findings with a numerical experiment that showcases the competitiveness of the proposed algorithm.
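For concreteness, a standard formulation of the risk measure named above is the Rockafellar--Uryasev representation of conditional value-at-risk; the tail-probability convention used here (with level $\alpha \in (0,1)$ and $X$ interpreted as a loss) is an assumption, since sign and tail conventions vary across the risk-averse bandit literature:
\[
\mathrm{CVaR}_{\alpha}(X) \;=\; \inf_{t \in \mathbb{R}} \left\{ t + \frac{1}{\alpha}\, \mathbb{E}\big[(X - t)_{+}\big] \right\},
\]
which, for a continuous loss distribution, coincides with the conditional tail expectation $\mathbb{E}\big[X \,\big|\, X \geq \mathrm{VaR}_{\alpha}(X)\big]$.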