# │ai-agents-mentals
A few pretty good ones are in development right now. For me, the most promising is CodeScore (https://arxiv.org/abs/2301.09043). There are also datasets of interview and olympiad tasks that go deeper than the 164 Python tasks, but they are still not closely related to general-purpose coding tasks.