Mathematicians have devised new problems to challenge the reasoning capabilities of the most advanced AI systems, and the systems failed almost every test