HN comments for: QwQ-32B: Embracing the Power of Reinforcement Learning