By result replication I didn't mean users actually retesting. What I meant is that if the test method and testbed are made more similar to the most common real-life setups, your results become more useful to a wider share of LC users as an actual guide for choosing a WB. If an uncommon setup is tested, the work is less useful; it becomes more of a theoretical exercise that still leaves a lot to guesswork and speculative estimation when making a purchase.