Correction: all benchmarks are only approximations that rest on several assumptions.
To date, there has yet to be a single valid benchmark for predicting improvements in user experience.
A benchmark that is good at predicting performance on some tasks is often terrible at predicting performance on others. [Numerical analysis often requires features and additions that are considerably expensive in terms of transistors. That would negatively affect all other scientific calculations.]