Write latency goes up without NCQ, but it is the 4K reads at QD64 that really take the hit. Without NCQ, read performance at QD64 stays about the same as at QD1. (As can be seen here, which I found randomly during a search.)

I still can't work out why reads get so much better at higher QDs (not that they are good at QD1). It must have something to do with NCQ, but I have no idea how that works.
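
For what it's worth, here is a toy model of one common explanation, not a measurement. The idea is that an SSD has several flash channels that can each serve a read at the same time, and NCQ (up to 32 outstanding commands on SATA) is what lets the host give the drive more than one command at once. The function name, the channel count of 8, and the 100 µs read latency are all made-up numbers for illustration, not figures from any real drive.

```python
def estimated_iops(queue_depth, channels=8, read_latency_us=100.0, ncq=True):
    """Toy estimate of 4K random read IOPS.

    Assumptions (hypothetical, for illustration only):
    - each flash channel serves one read at a time in read_latency_us
    - reads spread evenly across channels
    - without NCQ the drive sees only one outstanding command
    - SATA NCQ caps outstanding commands at 32
    """
    outstanding = min(queue_depth, 32) if ncq else 1
    # Only as many reads run in parallel as there are channels to serve them.
    parallel = min(outstanding, channels)
    return parallel * 1_000_000 / read_latency_us


if __name__ == "__main__":
    for qd in (1, 4, 64):
        print(qd, estimated_iops(qd), estimated_iops(qd, ncq=False))
```

In this sketch the QD64 figure without NCQ comes out identical to QD1, which matches the behaviour described above: the extra queued requests sit in the host and the drive never gets the chance to work on them in parallel.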