The latest updates on all JetBrains products and topics
When training agents, we usually discard an entire run if its final outcome is imperfect, which throws away the valuable information in the steps that went well. To fix this, we developed Step Rejection Fine-Tuning.
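To make the contrast concrete, here is a minimal sketch of the difference between rejecting whole runs and rejecting individual steps. It is an illustration of the general idea only, not the actual Step Rejection Fine-Tuning procedure: the `Step` and `Run` types, the per-step `ok` label, and both filter functions are hypothetical names invented for this example, and how such step labels would be obtained is not covered here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str
    ok: bool  # hypothetical per-step quality label (an assumption of this sketch)


@dataclass
class Run:
    steps: List[Step]
    success: bool  # whether the final outcome of the run was acceptable


def keep_whole_runs(runs: List[Run]) -> List[Step]:
    """Run-level rejection: keep steps only from runs whose final outcome succeeded."""
    return [step for run in runs if run.success for step in run.steps]


def keep_good_steps(runs: List[Run]) -> List[Step]:
    """Step-level rejection: keep every step labeled ok, even from failed runs."""
    return [step for run in runs for step in run.steps if step.ok]


runs = [
    Run([Step("open file", True), Step("apply edit", True)], success=True),
    Run([Step("run tests", True), Step("apply wrong patch", False)], success=False),
]

print(len(keep_whole_runs(runs)))  # 2: the failed run is dropped entirely
print(len(keep_good_steps(runs)))  # 3: the good step from the failed run is kept
```

In this toy data, filtering at the step level recovers one extra training example that run-level filtering would have thrown away.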