Barcelona Opteron 2350(B1) arrived

**nemrod** · 11-05-2007, 04:24 PM

Originally Posted by accord99

It was overclocked to 3.4GHz, EIST probably kicked in while the cpu-z screenshot was taken.

Yes, you're right, so we're back to your previous comment:

Originally Posted by accord99

Cinebench R10 is significantly different. The stock Q6600 in 32-bit Cinebench R10 scores a bit higher than 2400/8600.

Or a little bit better in 64-bit mode perhaps?

(3881/3.4*2.4 = 2740, 13428/3.4*2.4 = 9480)

**mad_skills** · 11-05-2007, 04:33 PM

Originally Posted by mstp2009

Hold on a sec here. Am I reading this right: in C10 a 2.27GHz C2Q is faster than TWO Quads K10 at 2.0GHz?

:scratches head:

now this is interesting, how is this possible?

**nemrod** · 11-05-2007, 04:41 PM

Originally Posted by mad_skills

now this is interesting, how is this possible?

I missed the 3.4GHz at first too, just like you.

-----------------------------------------------------------------

So from http://www.adobeforums.com/webx/.3bc4aee5

Processor : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
MHz : 2660
Number of CPUs : 8
Operating System : WINDOWS 32 BIT 5.2.3790

Graphics Card : Quadro FX 1500/PCI/SSE2
Resolution : <fill this out>
Color Depth : <fill this out>

# ************************************************** *

Rendering (Single CPU): 2701 CB-CPU
Rendering (Multiple CPU): 15867 CB-CPU

Multiprocessor Speedup: 5.88

Shading (OpenGL Standard) : 3852 CB-GFX

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>..

Processor : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
MHz : 2660
Number of CPUs : 8
Operating System : WINDOWS 32 BIT 5.2.3790

Graphics Card : GeForce 8800 GTX/PCI/SSE2
Resolution : <fill this out>
Color Depth : <fill this out>

# ************************************************** *

Rendering (Single CPU): 2679 CB-CPU
Rendering (Multiple CPU): 15466 CB-CPU

Multiprocessor Speedup: 5.77

Shading (OpenGL Standard) : 4308 CB-GFX

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >.

64 bit OS/64 bit Cinebench:

Processor : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
MHz : 2660
Number of CPUs : 8
Operating System : WINDOWS 64 BIT 5.2.3790

Graphics Card : Quadro FX 1500/PCI/SSE2
Resolution : <fill this out>
Color Depth : <fill this out>

# ************************************************** *

Rendering (Single CPU): 3069 CB-CPU
Rendering (Multiple CPU): 18357 CB-CPU

Multiprocessor Speedup: 5.98

Shading (OpenGL Standard) : 3794 CB-GFX

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>

Processor : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
MHz : 2660
Number of CPUs : 8
Operating System : WINDOWS 64 BIT 5.2.3790

Graphics Card : GeForce 8800 GTX/PCI/SSE2
Resolution : <fill this out>
Color Depth : <fill this out>

# ************************************************** *

Rendering (Single CPU): 3038 CB-CPU
Rendering (Multiple CPU): 18314 CB-CPU

Multiprocessor Speedup: 6.03

Shading (OpenGL Standard) : 4506 CB-GFX

So the Multiprocessor Speedup is lower than on the eight-core Barcelona system, but the single CPU score should be higher even at 2GHz, leading to a higher multiple CPU score.

(3038/2.66*2.4 = 2741 and 3038/2.66*3.4 = 3883, similar to the 3881 shown by siyah at 3.4GHz on the Q6600. So at 2GHz the single CPU score should be about 2280, and the multi CPU score about 7900 for the Q6600 and 13770 for the eight-core Xeon system.)

Originally Posted by tictac

i guess 3.1ghz phenom b2 could break 12k on cinebench under 64bit mode...

If I do the same calculation with the Barcelona results, single CPU score = 1907*3.1/2 = 2955. Even with a Multiprocessor Speedup equal to 4, this would give only about 11800 for the multi CPU score...
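
The clock-scaling arithmetic used in this post can be sketched as below. It assumes, as the post does, that the Cinebench score scales linearly with core clock, which is only a rough approximation. The function name `scale_by_clock` is just for illustration.

```python
def scale_by_clock(score, from_ghz, to_ghz):
    """Linearly rescale a benchmark score from one core clock to another.

    Assumption: score is proportional to clock speed (ignores memory,
    FSB/NB speed and everything else that does not scale with the core).
    """
    return score * to_ghz / from_ghz


# siyah's Q6600 at 3.4GHz, scaled back to the stock 2.4GHz
q6600_single_stock = scale_by_clock(3881, 3.4, 2.4)
q6600_multi_stock = scale_by_clock(13428, 3.4, 2.4)

# Barcelona single CPU score at 2.0GHz, projected to a 3.1GHz Phenom
phenom_single = scale_by_clock(1907, 2.0, 3.1)
# Upper bound: assumes a perfect 4x Multiprocessor Speedup on four cores
phenom_multi = phenom_single * 4

print(round(q6600_single_stock), round(q6600_multi_stock))
print(round(phenom_single), round(phenom_multi))
```

Because the 4x speedup is an ideal figure, the projected multi CPU score is a ceiling rather than a prediction.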

**mad_skills** · 11-05-2007, 04:49 PM

Yeah, I got it almost instantly after I made the previous post.

**mstp2009** · 11-05-2007, 04:59 PM

Originally Posted by accord99

It was overclocked to 3.4GHz, EIST probably kicked in while the cpu-z screenshot was taken.

3.4GHz makes A LOT more sense.

Thanks.

**tictac** · 11-05-2007, 05:11 PM

I calculate it this way: 3.1GHz gives 9.7k on 32-bit C10 with a 3.88x speedup. In 64-bit, kyosen's results show a 20% improvement in the single-threaded score plus a 4x speedup, so my estimated score is 9.7k / 3.88 x 120% x 4.
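
Spelled out, the estimate above works like this. It is only a sketch of the arithmetic in the post: the 20% 64-bit gain and the 4x speedup are the post's assumptions, not measured values for a 3.1GHz part.

```python
# Inputs taken from the post above
multi_32bit = 9700.0      # 32-bit Cinebench R10 multi CPU score at 3.1GHz
speedup_32bit = 3.88      # reported Multiprocessor Speedup in 32-bit
gain_64bit = 1.20         # assumed single-threaded gain from 64-bit mode
speedup_64bit = 4.0       # assumed Multiprocessor Speedup in 64-bit

# Step 1: back out the 32-bit single CPU score
single_32bit = multi_32bit / speedup_32bit
# Step 2: apply the 64-bit gain, then the assumed speedup
estimate_64bit = single_32bit * gain_64bit * speedup_64bit

print(round(single_32bit), round(estimate_64bit))
```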

**accord99** · 11-05-2007, 05:45 PM

Originally Posted by nemrod

Or a little bit better in 64-bit mode perhaps?

(3881/3.4*2.4 = 2740, 13428/3.4*2.4 = 9480)

Yeah, roughly there:

http://www.techreport.com/articles.x/13470/11

**tictac** · 11-05-2007, 08:41 PM

Originally Posted by Lightman

I presume the multiplier is upwards locked in the new Barcelonas?
Also, is it possible to set vcore higher than 1.2V using C'n'Q, or is this the maximum allowed?

**KTE** · 11-06-2007, 12:20 AM

Originally Posted by savantu

Give 2 examples which respect the conditions imposed by the law.

There's the catch in where you start applying the law. It's not a law. It's a theoretical prediction. That's also how it's described in the link you posted to explain the so-called Amdahl's Law. Physical laws are absolutes, given system constraints and controlled factors.
The prediction holds true where you vary only one system component, like I mentioned, such as the CPU, but not where more than one is manipulated (such as increasing DDR speed and CPU speed together, in which case you will see more than 100% scaling). It also relies on you knowing the absolute performance possible of a CPU as your base marker to judge scaling. We do not have too many of these facts, so talking in absolute comparisons would be conjecture or estimates at best, and inaccurate. We need more information and testing to know the theoretical peak performance of CPU components; then we can judge the achieved performance as a percentage of the maximum possible performance of a given CPU. This is where you will find that the maximum always stays consistent and the performance scaling always stays below 100%.
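
For reference, the textbook form of Amdahl's Law under discussion can be sketched as below. The function names are made up for this example, and applying the formula to the Cinebench speedups posted earlier in the thread is only an illustration: it assumes the core count is the only thing that changes, which is exactly the condition being argued about here.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's Law: speedup when only the parallel part scales with cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def implied_parallel_fraction(speedup, cores):
    """Invert Amdahl's Law to get the parallel fraction a speedup implies."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# 8-core X5355 results posted earlier: speedups between 5.77 and 6.03
for measured in (5.77, 5.88, 5.98, 6.03):
    p = implied_parallel_fraction(measured, 8)
    print(f"speedup {measured}: implied parallel fraction {p:.3f}")
```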

**indiana_74** · 11-06-2007, 01:20 AM

Thank you for your welcome words and a great thank you to kyosen for the help.

The Opteron 2344 is the second Barcelona system I configured, and we got this board after reading this thread in your forum.

Now, with the help from kyosen, I can OC the 2344.
At the moment it runs @1.9GHz, so it outperforms my 2347 system.

**s7e9h3n** · 11-06-2007, 01:42 AM

Originally Posted by savantu

Umh...there are actually 2 64-bit channels which, put together, act as a 128-bit one. This improves performance when multiple threads access memory.
This is something Intel chipsets have done since the I865.

Ok, I was a bit confused what you were trying to say. A more appropriate descriptor would have been "DCT" and not "IMC". IMC is a generic term which most people use to describe AMD's memory controller as a whole. I apologize for assuming you were like "most people"

**KTE** · 11-06-2007, 01:43 AM

Originally Posted by siyah

hi my Quad 6600 B3

is faster than your barci

http://s6.directupload.net/images/071105/fd5j463u.jpg

At 3.4GHz/1134MHz and enough background processes, he got ~1550 CB higher than a Yorkfield at 3GHz (1333FSB) running DDR3-1333 RAM in Vista SP1 beta and nearly 3000 CB higher than the QX6850.

**nemrod** · 11-06-2007, 01:52 AM

Originally Posted by KTE

At 3.4GHz/1134MHz and enough background processes, he got ~1550 CB higher than a Yorkfield at 3GHz (1333FSB) running DDR3-1333 RAM in Vista SP1 beta and nearly 3000 CB higher than the QX6850.

The vr-zone result should be a 32-bit Cinebench score.

Vr-zone QX9650: 11857
techreport QX9650: 13256

http://www.techreport.com/articles.x/13470/11

If I take the results I posted earlier in this thread, there is around a 14% difference between the 32-bit and 64-bit scores. So
11857*1.14 ≈ 13517, which is not so far from techreport's 13256
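
As a quick check, the 64-bit over 32-bit ratios can be computed directly from the X5355 results posted earlier in this thread. This is only a sketch: the dictionary names are made up, and it assumes the Xeon ratio carries over to the QX9650.

```python
# (32-bit score, 64-bit score) pairs from the X5355 runs posted above
single_cpu = {"Quadro FX 1500": (2701, 3069), "GeForce 8800 GTX": (2679, 3038)}
multi_cpu = {"Quadro FX 1500": (15867, 18357), "GeForce 8800 GTX": (15466, 18314)}

single_ratios = [b64 / b32 for b32, b64 in single_cpu.values()]
multi_ratios = [b64 / b32 for b32, b64 in multi_cpu.values()]

print("single CPU gain:", [f"{(r - 1) * 100:.1f}%" for r in single_ratios])
print("multi CPU gain: ", [f"{(r - 1) * 100:.1f}%" for r in multi_ratios])

# Apply the ~14% figure to the vr-zone 32-bit QX9650 score
projected_64bit = 11857 * 1.14
print(round(projected_64bit))
```

The ~14% figure matches the single CPU ratios; the multi CPU ratios in those runs come out a little higher, so the projection is a rough one either way.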

**KTE** · 11-06-2007, 02:14 AM

Originally Posted by nemrod

The vr-zone result should be a 32-bit Cinebench score.

Yeh, it seems the score difference comes down to 32-bit (VR-Z) vs 64-bit (TR) AND P35 (TR) vs X38 (VR-Z). RAM remains the same.

Vr-zone QX9650: 11857
techreport QX9650: 13256

http://www.techreport.com/articles.x/13470/11

Good quick correction.

I was just about to mention it.

If I take the results I posted earlier in this thread, there is around a 14% difference between the 32-bit and 64-bit scores. So
11857*1.14 ≈ 13517, which is not so far from techreport's 13256

TR's score was 12% higher than VR-Z's with just those two differences, which means we need similar test setups to compare scores; otherwise we can't really judge one CPU over another.

**PhilDoc** · 11-06-2007, 08:13 AM

One thing we need to remember is that there is going to be some degree of variability in these benchmarks, which may explain some of the difference in the scores reported. Occasionally, the variance can be quite large even when running the same system. I've witnessed this multiple times as a former benchmark freak. Any review really needs to run these tests multiple times and take an average. Don't get me wrong, I'm not saying that AMD is suddenly going to take the lead, but it does explain some of the variability.

**s7e9h3n** · 11-06-2007, 04:55 PM

Originally Posted by s7e9h3n

Hate to say it, but you're right......here's my Cinebench on Server 2k3 x64. Kyosen-san's single threaded score is unusually low....

I just realized that I had absolutely no idea what I was saying here.....I re-read through Kyosen's posts and didn't see any results for CB10 @ 2.0G. Sheesh, I need to get more sleep

**Quintero** · 11-06-2007, 06:10 PM

Originally Posted by savantu

Umh...there are actually 2 64-bit channels which, put together, act as a 128-bit one. This improves performance when multiple threads access memory.

You're describing one of the two available DCT-modes, the so-called Ganged Mode. But that protocol is far from new - the very same one was used by socket 940/939: one controller operates two channels in lockstep, almost like in RAID 0.

K10 features two independent controllers, which allows it to send independent commands to each channel (in Unganged Mode). That's the most important new feature of the K10 IMC.

Originally Posted by savantu

This is something Intel chipsets have done since the I865.

Sorry, but that's just not a fair comparison. The i865 is unable to continuously access both channels in parallel, probably because the FSB is not fully dedicated to the RAM (unlike AMD's memory bus). And thanks to the dual independent controllers of the K10, loaded latency is reduced, unrelated data can be fetched simultaneously, and the channels can even operate in different directions at the same time. AFAIK, no comparable features are offered by any Intel chipset.

**kyosen** · 11-06-2007, 07:45 PM

CPU-Z latest beta screenshot
Now it can handle each core's clock separately...we can select the core with a mouse right-click.
And it also shows the Core VID

http://www.oohashi.jp/c-board/file/C...core_clock.png
Thanks to Franck, as usual

3DMark06 with GeForce8800GT on WinXP x64
2350(B1)-2.2G=220x10、GeForce8800GT,
3DMark Score, CPU Score = 9973, 3212
http://www.oohashi.jp/c-board/file/C...core_clock.png
In previous test on WinXP 32bit: 10148, 3267
http://www.oohashi.jp/c-board/file/3....2G-220x10.png
So, I couldn't find any benefit from WinXP x64 in 3DMark06, as expected.

NorthBridge multiplier again
This time I rebooted after changing the register,
and I could confirm on the BIOS display that the NB clock was down.
http://www.oohashi.jp/c-board/file/O...NB1.0_BIOS.jpg
And SuperPI4M time at same Core clock and different NB clock:

NB-1.8G: 3m48.406s, http://www.oohashi.jp/c-board/file/S...-1.8_WinXP.png
NB-1.6G: 3m48.016s, http://www.oohashi.jp/c-board/file/S...-1.6_WinXP.png
NB-1.4G: 3m50.500s, http://www.oohashi.jp/c-board/file/S...-1.4_WinXP.png
NB-1.2G: 3m53.532s, http://www.oohashi.jp/c-board/file/S...-1.2_WinXP.png
NB-1.0G: 3m58.359s, http://www.oohashi.jp/c-board/file/S...-1.0_WinXP.png

NB>=1.6G looks like enough for a single-threaded program with dual DDR2-667.
There may be a difference between NB-1.6G and NB-1.8G with a DDR2-800 setting.
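
To put the SuperPI 4M times above in relative terms, here is a small sketch that converts them to seconds and compares each NB clock against the NB-1.8G run. The helper name `parse_time` is made up for this example, and these are single runs, so differences of a fraction of a percent are probably within run-to-run noise.

```python
import re


def parse_time(text):
    """Convert a SuperPI time such as '3m48.406s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# NB clock in GHz -> SuperPI 4M time, copied from the list above
results = {
    1.8: "3m48.406s",
    1.6: "3m48.016s",
    1.4: "3m50.500s",
    1.2: "3m53.532s",
    1.0: "3m58.359s",
}

baseline = parse_time(results[1.8])
# Positive values mean slower than the NB-1.8G run
slowdown = {nb: (parse_time(t) / baseline - 1) * 100 for nb, t in results.items()}

for nb, pct in slowdown.items():
    print(f"NB-{nb}G: {parse_time(results[nb]):.3f}s ({pct:+.2f}% vs NB-1.8G)")
```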

BTW, I've learned that a reboot (warm reset) is needed for changing the HT Link multiplier too.
Yeah, tictac and macci were right, as usual

Folding@Home
Quick test on 2350(B1)-2.0G, DDR2-667.
Screenshot at 7% steps:
http://www.oohashi.jp/c-board/file/F...0_step-35k.png

**tictac** · 11-06-2007, 07:53 PM

Thanks for the test...

**Start** · 11-06-2007, 08:27 PM

Originally Posted by kyosen

CPU-Z latest beta screenshot
Now it can handle each core's clock separately...we can select the core with a mouse right-click.
And it also shows the Core VID

http://www.oohashi.jp/c-board/file/C...core_clock.png
Thanks to Franck, as usual

3DMark06 with GeForce8800GT on WinXP x64
2350(B1)-2.2G=220x10、GeForce8800GT,
3DMark Score, CPU Score = 9973, 3212
http://www.oohashi.jp/c-board/file/C...core_clock.png
In previous test on WinXP 32bit: 10148, 3267
http://www.oohashi.jp/c-board/file/3....2G-220x10.png
So, I couldn't find any benefit from WinXP x64 in 3DMark06, as expected.

NorthBridge multiplier again
This time I rebooted after changing the register,
and I could confirm on the BIOS display that the NB clock was down.
http://www.oohashi.jp/c-board/file/O...NB1.0_BIOS.jpg
And SuperPI4M time at same Core clock and different NB clock:

NB-1.8G: 3m48.406s, http://www.oohashi.jp/c-board/file/S...-1.8_WinXP.png
NB-1.6G: 3m48.016s, http://www.oohashi.jp/c-board/file/S...-1.6_WinXP.png
NB-1.4G: 3m50.500s, http://www.oohashi.jp/c-board/file/S...-1.4_WinXP.png
NB-1.2G: 3m53.532s, http://www.oohashi.jp/c-board/file/S...-1.2_WinXP.png
NB-1.0G: 3m58.359s, http://www.oohashi.jp/c-board/file/S...-1.0_WinXP.png

NB>=1.6G looks like enough for a single-threaded program with dual DDR2-667.
There may be a difference between NB-1.6G and NB-1.8G with a DDR2-800 setting.

BTW, I've learned that a reboot (warm reset) is needed for changing the HT Link multiplier too.
Yeah, tictac and macci were right, as usual

Folding@Home
Quick test on 2350(B1)-2.0G, DDR2-667.
Screenshot at 7% steps:
http://www.oohashi.jp/c-board/file/F...0_step-35k.png

Thanks for the SMP tests

now at least I have an idea of its performance.

**Sparky** · 11-06-2007, 08:39 PM

Something seems weird about that SMP test.... my Opteron is only 6 minutes slower than that. I would think it would be better. I guess my opteron is running 800MHz faster, but still I guess I was expecting it to be faster than that.

**kyosen** · 11-06-2007, 09:18 PM

Originally Posted by SparkyJJO

Something seems weird about that SMP test.... my Opteron is only 6 minutes slower than that. I would think it would be better. I guess my opteron is running 800MHz faster, but still I guess I was expecting it to be faster than that.

Just a quick guesstimate:
my result: 15m24s = 924s per 5000 steps with a 2.0G x4 cores K10 Opteron
your result: 15m24s + 6m = 1284s with a 2.8G x2 cores K8 Opteron
So, 1284/x * 2.8/2.0 /y = 924
...where x is the scaling efficiency of going from x2 to x4 cores, and y is the performance gain per core.
For example, if y is ~1.05, x is ~1.85 from the formula above...
...yeah, x should be below 2.0 in this case.
In my experience with SuperPI, y=~1.05 is feasible, at least on my board and current BIOS, so far.
I don't know the usual efficiency x for Folding@Home, but 1.85 looks feasible too...

**JohannesRS** · 11-06-2007, 09:56 PM

Are you saying that, from this comparison, the performance gain from K8 to K10 is about... 5%?

**STEvil** · 11-06-2007, 10:51 PM

2.0ghz Barcelona gets 924s
2.8ghz K8 Opteron gets 1284s

2.0GHz x 8 cores = 16,000MHz
2.8GHz x 4 cores = 11,200MHz

16,000/924 = 17.316
11,200/1284 = 8.723

17.316/8.723 = 1.985 speedup factor.

edit - i'm assuming a dual quad and a dual dual here, but half the numbers (1 proc each) and you get the same... unless there's a number out somewhere in which case my bad

Does the SMP client work one work unit across all cores or is it one per core? I assume it is one per core in this calculation. One across all cores will be different numbers.

**kyosen** · 11-06-2007, 11:19 PM

Originally Posted by JohannesRS

Are you saying that, from this comparison, the performance gain from K8 to K10 is about... 5%?

The ~5% performance gain from K8 to K10 is based on my own SuperPI4M results.
I don't know yet whether the gain is the same for both SuperPI and F@H.
The intention of my previous post was just to suggest a rough estimation formula.
Under the assumptions of that formula,
*if x = 2.0 (ideal scaling), y < 1.0, i.e. the gain from K8 to K10 is negative...that's not feasible.
*if x = ~1.95, y = 1.0, i.e. no gain from K8 to K10...that's not feasible either.
*if x = ~1.90, y = ~1.02
*if x = ~1.85, y = ~1.05
*if x = ~1.8, y = ~1.08
*if x = ~1.6, y = ~1.2
...
x=1.8~1.85 looks feasible to me for a multi-threaded program,
so y=1.08~1.05...and that's not inconsistent with my SuperPI1M&4M results.
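
The table above follows from rearranging the formula 1284/x * 2.8/2.0 /y = 924 into y = 1284 * (2.8/2.0) / (924 * x). A minimal sketch of that rearrangement is below; the constant and function names are made up for this example, and the 1284s figure is itself the rough "my time plus 6 minutes" estimate, so the outputs are only as good as that input.

```python
K10_SECONDS = 924.0      # 2.0G x4 cores K10 Opteron, per 5000 steps
K8_SECONDS = 1284.0      # 2.8G x2 cores K8 Opteron (924s + 6 minutes)
CLOCK_RATIO = 2.8 / 2.0  # K8 clock divided by K10 clock


def per_core_gain(x):
    """Solve 1284/x * 2.8/2.0 / y = 924 for y, given core scaling x."""
    return K8_SECONDS * CLOCK_RATIO / (K10_SECONDS * x)


for x in (2.0, 1.95, 1.90, 1.85, 1.8, 1.6):
    print(f"x = {x:<4}  y = {per_core_gain(x):.3f}")
```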
