PL is 8 for the SPi 1M run and 5 for the MaxxMem benchmark.
500MHz 4-3-2-3-3 PL8 with "loosened" sub-timings only requires 2.35V. To tighten the sub-timings that much, I had to feed the memory 2.65V.

Bandwidth is indeed low... Could it be because of the chipset? P35 and X48 have tighter internal latencies than X38.
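
For a rough yardstick on how "low" the bandwidth is, here is a minimal sketch of the theoretical peak for a 500MHz memory clock. It assumes double data rate, a 64-bit bus per channel, and dual-channel operation; none of that is stated above, and the function name is just illustrative. Measured numbers from MaxxMem will always land well under this ceiling.

```python
def peak_bandwidth_mb_s(mem_clock_mhz, channels=2, bus_width_bits=64):
    """Theoretical peak memory bandwidth in MB/s.

    Assumptions (not taken from the post): double data rate,
    a 64-bit bus per channel, and dual-channel mode by default.
    """
    # DDR memory performs two transfers per clock cycle.
    mega_transfers_per_s = mem_clock_mhz * 2
    # Each transfer moves bus_width_bits / 8 bytes on every channel.
    return mega_transfers_per_s * (bus_width_bits // 8) * channels


print(peak_bandwidth_mb_s(500))              # dual channel
print(peak_bandwidth_mb_s(500, channels=1))  # single channel
```

Comparing the benchmark result against this figure gives an efficiency percentage, which is easier to compare across chipsets than raw MB/s.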