I saw this posted on another forum and wanted to see what people here think about it. This is from the AMD-supplied GCC machine description file for bdver3:

;; AMD bdver3 Scheduling
;;
;; The bdver3 contains three pipelined FP units and two integer units.
;; Fetching and decoding logic is different from previous fam15 processors.
;; Fetching is done every two cycles rather than every cycle and
;; two decode units are available. The decode units therefore decode
;; four instructions in two cycles.

;;
;; Three DirectPath instructions decoders and only one VectorPath decoder
;; is available. They can decode three DirectPath instructions or one
;; VectorPath instruction per cycle.
;;
;; The load/store queue unit is not attached to the schedulers but
;; communicates with all the execution units separately instead.
;;
;; bdver3 belong to fam15 processors. We use the same insn attribute
;; that was used for bdver3 decoding scheme.

I thought Zambezi and Vishera were supposed to do 4 instructions per clock cycle. Is this saying the Steamroller design will only do 2 instructions per clock cycle? I thought adding the second decoder, so that each core has its own again, was supposed to increase IPC, not decrease it. I don't understand it, which is why I'm hoping the smart people on this forum can help explain what this actually means.
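
To make the confusion concrete, here is the arithmetic behind the two ways I can read "four instructions in two cycles". This is only a sketch: the function name and the "shared" versus "per unit" readings are my own assumptions, not anything stated in the GCC file or by AMD.

```python
def per_cycle_rate(instructions, cycles, units=1):
    """Average instructions decoded per cycle across `units` decoders."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return units * instructions / cycles

# Reading 1 (assumption): the four instructions per two cycles is the total
# for both decode units together.
shared_reading = per_cycle_rate(4, 2)

# Reading 2 (assumption): each of the two decode units handles four
# instructions per two cycles, so the figure is per unit.
per_unit_reading = per_cycle_rate(4, 2, units=2)

print(shared_reading)    # 2.0
print(per_unit_reading)  # 4.0
```

Under the first reading the rate drops to 2 per cycle, which is what worries me. Under the second it stays at 4 per cycle across the two units. I can't tell from the comment which one is meant.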