
Thread: Research Question

  1. #1
    Xtreme Enthusiast
    Join Date
    Dec 2008
    Location
    Austin, Texas
    Posts
    599

    Research Question

    Let's say you had an application that takes advantage of AVX, SSE 4.1, and SSE 4.2.

    What is the greatest improvement in performance you might achieve?

    1. Define the application(s)
    2. Define the method used to test the difference
    3. Screenshots

    This is a research question, no actual benching necessary.

  2. #2
    Xtreme Cruncher
    Join Date
    Apr 2008
    Location
    Ohio
    Posts
    3,119
    Not sure if I can come up with something or not. Much smarter people will be in soon, I suspect.
    I know that without AVX support a 256-bit FP unit is pointless. Isn't Bulldozer going to have a Flex FP that functions without AVX support? I guess I'm going OT...
    ~1~
    AMD Ryzen 9 3900X
    GigaByte X570 AORUS LITE
    Trident-Z 3200 CL14 16GB
    AMD Radeon VII
    ~2~
    AMD Ryzen ThreadRipper 2950x
    Asus Prime X399-A
    GSkill Flare-X 3200mhz, CAS14, 64GB
    AMD RX 5700 XT

  3. #3
    Banned Movieman...
    Join Date
    May 2009
    Location
    illinois
    Posts
    1,809
    Quote Originally Posted by charged3800z24 View Post
    Not sure if I can come up with something or not. Much smarter people will be in soon, I suspect.
    I know that without AVX support a 256-bit FP unit is pointless. Isn't Bulldozer going to have a Flex FP that functions without AVX support? I guess I'm going OT...
    IDK, but I know that every time he posts, it seems to be a teaser right before he puts things out...

  4. #4
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    Hm, I think there could be some benefit in video encoding applications. I have heard that x264 supports AVX a bit. Beyond that, I don't know. Maybe searching for sequences in videos? It is still a new instruction set, and I think we will see later how it does with better support. For now we have to wait in this thread for a smarter answer.
    ROG Power PCs - Intel and AMD
    CPUs:i9-7900X, i9-9900K, i7-6950X, i7-5960X, i7-8086K, i7-8700K, 4x i7-7700K, i3-7350K, 2x i7-6700K, i5-6600K, R7-2700X, 4x R5 2600X, R5 2400G, R3 1200, R7-1800X, R7-1700X, 3x AMD FX-9590, 1x AMD FX-9370, 4x AMD FX-8350,1x AMD FX-8320,1x AMD FX-8300, 2x AMD FX-6300,2x AMD FX-4300, 3x AMD FX-8150, 2x AMD FX-8120 125 and 95W, AMD X2 555 BE, AMD x4 965 BE C2 and C3, AMD X4 970 BE, AMD x4 975 BE, AMD x4 980 BE, AMD X6 1090T BE, AMD X6 1100T BE, A10-7870K, Athlon 845, Athlon 860K,AMD A10-7850K, AMD A10-6800K, A8-6600K, 2x AMD A10-5800K, AMD A10-5600K, AMD A8-3850, AMD A8-3870K, 2x AMD A64 3000+, AMD 64+ X2 4600+ EE, Intel i7-980X, Intel i7-2600K, Intel i7-3770K,2x i7-4770K, Intel i7-3930KAMD Cinebench R10 challenge AMD Cinebench R15 thread Intel Cinebench R15 thread

  5. #5
    Xtreme Addict
    Join Date
    Sep 2010
    Location
    US, MI
    Posts
    1,680
    PCSX2 now uses a tiny bit of AVX. At first they said it was completely useless, but I guess they found some use for it. I've seen a few code updates with AVX support added; minor ones, but still, they are using it.

    "SSSE3" (not "SSE3") has been in use by the same program for a long time.
    The same goes for SSE4.
    Now, finally, they have added a few lines of code that can use SSE3, the stuff the AMDs have.

    Exact comment.
    Posted by gabest:
    GSdx: finally, some use for hsubps (SSE3).
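For anyone wondering what hsubps actually does: it is the SSE3 "horizontal subtract" on packed single-precision floats. Below is a rough scalar sketch of the result it produces for two 4-float inputs. The name hsubps_ref is made up for illustration, and this is not code from GSdx.

```c
#include <assert.h>

/* Scalar reference for the result of SSE3 HSUBPS on two 4-float vectors:
 * the two pairwise differences from a, then the two from b.
 * hsubps_ref is a hypothetical helper name, not taken from GSdx. */
void hsubps_ref(const float a[4], const float b[4], float out[4])
{
    /* This simple version does not support writing the result in place. */
    assert(out != a && out != b);

    out[0] = a[0] - a[1];
    out[1] = a[2] - a[3];
    out[2] = b[0] - b[1];
    out[3] = b[2] - b[3];
}
```

The real instruction does all four subtractions at once on 128-bit registers, which is why it can be handy for small reductions.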

  6. #6
    Xtreme Member
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    320
    Sounds like a can of floating point whoop ass. Wider registers and three-operand support, and it allows optimizing to one common standard if you have the latest Intel or AMD CPUs.

    So scientific apps and graphics rendering should benefit. I see SETI@home has an experimental AVX-optimized version, but I'm not familiar with using it as a benchmark. Could we have it run the same work units on the AVX version versus the others?

    Better yet, maybe Cinebench has AVX support? It doesn't look like 11.5 does. Also, I think I read that the latest SiSoft Sandra does...

    The trick will be making apples-to-apples comparisons, "app"-wise. Unless we can turn AVX on and off, we'd be limited to using different versions of benches with and without AVX support, and who knows what other differences there are between them.

    I'd say run Sandra with SSE 4.1, then SSE 4.2, then AVX enabled, and compare results. Downloading Sandra Lite to play with.

    Here we go, just to get the ball rolling. You can enable/disable AVX for the modules that support it, and sometimes 4.1/4.2:





    So, whether it's Sandra or some other bench, gather your data for each supported test with AVX (or whatever else you can toggle) enabled and disabled, then plot the results and the % improvement. We could throw CPU frequency into the mix to see how it scales. Now we need AVX-capable hardware.
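A minimal sketch of the % improvement arithmetic, assuming you have one baseline result and one result with the feature enabled for the same test. Both function names are made up for illustration. Scores where higher is better and timings where lower is better need slightly different formulas.

```c
#include <assert.h>

/* Percent improvement for a "higher is better" score (e.g. GFLOPS, MB/s)
 * relative to a baseline run of the same test. */
double percent_improvement(double baseline, double with_feature)
{
    assert(baseline > 0.0);
    return (with_feature - baseline) / baseline * 100.0;
}

/* Percent improvement for a "lower is better" result such as completion
 * time in seconds: halving the time counts as a 100% improvement. */
double percent_improvement_time(double baseline_s, double with_feature_s)
{
    assert(baseline_s > 0.0 && with_feature_s > 0.0);
    return (baseline_s / with_feature_s - 1.0) * 100.0;
}
```

For example, a score going from 100 to 150 is a 50% improvement, and a render dropping from 10 s to 5 s is a 100% improvement.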

    Nice article here : http://blogs.amd.com/developer/2009/...ing-a-balance/
    Last edited by Grinder; 03-30-2011 at 07:56 AM.
    *in progress*
    AMD FX-8350
    Asus Crosshair V Formula Z
    2X8GB G.Skill Trident X DDR3-2400 C10
    2X Sapphire Radeon R9 290 Tri-X
    D5|EK Res/top|2X Swiftech MCR320XP|EK Supremacy CPU|2X EK 290X Acetal Nickel
    Seasonic M12D 850w
    Fractal Design Arc Midi R2
    T-Balancer MiniNG
    Western Digital Caviar Black 2TB
    Windows 7 Home Premium 64 bit

    My last intel cpu was a celeron 300a. My first computer was a TI-99/4!

  7. #7
    Xtreme Member
    Join Date
    Oct 2010
    Posts
    141
    Only app I can think of that has been mentioned is x264

  8. #8
    Xtreme Cruncher
    Join Date
    Apr 2008
    Location
    Ohio
    Posts
    3,119
    I run WCG almost 24/7; it is only off when I am benching or decide to play a game. I know there should be advantages for this type of application, but I am not sure how to test it.

    Downloading Sandra lite though...
    Last edited by charged3800z24; 03-30-2011 at 02:12 PM. Reason: typos

  9. #9
    Xtreme Member
    Join Date
    Aug 2009
    Posts
    249
    AIDA64 has benchmarks with AVX acceleration, but I believe he is asking more about real-world usage scenarios. I don't think AMD is adding AVX just for our benchmarking pleasure.

  10. #10
    Xtreme Member
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    320
    I did notice AIDA64. I'm sure it would be fine as well.

    As long as it's measurable and creates excitement about the capability and the new hardware, it's win/win. Especially if it kicks Intel ass. I did see some numbers on the Sandra site comparing SB vs. the previous non-AVX generation.

  11. #11
    I am Xtreme FlanK3r's Avatar
    Join Date
    May 2008
    Location
    Czech republic
    Posts
    6,823
    AVX = x264 benchmark, as I said... and maybe mediaExpress. For the rest, I don't know; it's about support, marketing, and near-future software.

  12. #12
    Xtreme Addict
    Join Date
    Sep 2010
    Location
    US, MI
    Posts
    1,680
    I wouldn't worry so much about benching it.
    Unless you are coding, in which case you could find out for yourself which code path is best.

    The Intel Sandy Bridge CPUs support AVX.
    The new AM3+ CPUs supposedly support it.

    If a benchmark or other program has support for it, I would trust in good faith that it's for the best and think no more of it.

    In theory it could double performance.
    In all likelihood, performance is probably 5% better if it is used correctly, and nothing more.

    AVX is 256-bit while SSE is 128-bit.
    I'm not sure how many cycles it takes to process one 256-bit register-to-memory opcode compared to, say, four 64-bit general-purpose register-to-memory opcodes.
    Let alone the methods of turning it on and off, etc. With some of that FPU-like stuff you need to have everything aligned and make sure things are correct once you stop using it, which is wasted time.
    You have to use it in such a way as to make it worthwhile.
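The gap between "could double" and "probably 5%" can be put into numbers with Amdahl's law. A rough sketch, assuming only some fraction of the run time sits in code that gets faster with AVX and the rest is unchanged. The function name and both parameters are made up for illustration.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction `vec_fraction` of the
 * run time is sped up by a factor `vec_gain` and the rest is unchanged.
 * Both parameters are illustrative assumptions, not measured values. */
double overall_speedup(double vec_fraction, double vec_gain)
{
    assert(vec_fraction >= 0.0 && vec_fraction <= 1.0);
    assert(vec_gain > 0.0);
    return 1.0 / ((1.0 - vec_fraction) + vec_fraction / vec_gain);
}
```

If everything doubled, overall_speedup(1.0, 2.0) gives 2x. If only about 10% of the run time is in code that doubles in speed, overall_speedup(0.1, 2.0) comes out at roughly 1.05x, which is in the same ballpark as the 5% guess.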

  13. #13
    Xtreme Enthusiast
    Join Date
    Dec 2008
    Location
    Austin, Texas
    Posts
    599
    Questions. Questions that need answering.

  14. #14
    Xtreme Member
    Join Date
    Oct 2010
    Posts
    141
    List of supported CPU capability flags from x264:
    Code:
    #define X264_CPU_CACHELINE_32   0x000001  /* avoid memory loads that span the border between two cachelines */
    #define X264_CPU_CACHELINE_64   0x000002  /* 32/64 is the size of a cacheline in bytes */
    #define X264_CPU_ALTIVEC        0x000004
    #define X264_CPU_MMX            0x000008
    #define X264_CPU_MMXEXT         0x000010  /* MMX2 aka MMXEXT aka ISSE */
    #define X264_CPU_SSE            0x000020
    #define X264_CPU_SSE2           0x000040
    #define X264_CPU_SSE2_IS_SLOW   0x000080  /* avoid most SSE2 functions on Athlon64 */
    #define X264_CPU_SSE2_IS_FAST   0x000100  /* a few functions are only faster on Core2 and Phenom */
    #define X264_CPU_SSE3           0x000200
    #define X264_CPU_SSSE3          0x000400
    #define X264_CPU_SHUFFLE_IS_FAST 0x000800 /* Penryn, Nehalem, and Phenom have fast shuffle units */
    #define X264_CPU_STACK_MOD4     0x001000  /* if stack is only mod4 and not mod16 */
    #define X264_CPU_SSE4           0x002000  /* SSE4.1 */
    #define X264_CPU_SSE42          0x004000  /* SSE4.2 */
    #define X264_CPU_SSE_MISALIGN   0x008000  /* Phenom support for misaligned SSE instruction arguments */
    #define X264_CPU_LZCNT          0x010000  /* Phenom support for "leading zero count" instruction. */
    #define X264_CPU_ARMV6          0x020000
    #define X264_CPU_NEON           0x040000  /* ARM NEON */
    #define X264_CPU_FAST_NEON_MRC  0x080000  /* Transfer from NEON to ARM register is fast (Cortex-A9) */
    #define X264_CPU_SLOW_CTZ       0x100000  /* BSR/BSF x86 instructions are really slow on some CPUs */
    #define X264_CPU_SLOW_ATOM      0x200000  /* The Atom just sucks */
    #define X264_CPU_AVX            0x400000  /* AVX support: requires OS support even if YMM registers
                                               * aren't used. */
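Each flag in that list is a single bit, so a detected CPU is described by OR-ing flags together and tested with a bitwise AND. A small sketch of how a mask like this could be checked: the three flag values are copied from the list above, while has_all_flags, best_path, and the enum are hypothetical helpers and not part of x264.

```c
#include <stdint.h>

/* Flag values copied from the x264 list above. */
#define X264_CPU_SSE4   0x002000  /* SSE4.1 */
#define X264_CPU_SSE42  0x004000  /* SSE4.2 */
#define X264_CPU_AVX    0x400000

/* Illustrative labels only; x264 does not define this enum. */
enum code_path { PATH_BASELINE, PATH_SSE41, PATH_SSE42, PATH_AVX };

/* Returns 1 if every bit in `wanted` is set in `cpu_flags`, else 0. */
int has_all_flags(uint32_t cpu_flags, uint32_t wanted)
{
    return (cpu_flags & wanted) == wanted;
}

/* Picks the newest instruction set that the mask reports. */
enum code_path best_path(uint32_t cpu_flags)
{
    if (has_all_flags(cpu_flags, X264_CPU_AVX))   return PATH_AVX;
    if (has_all_flags(cpu_flags, X264_CPU_SSE42)) return PATH_SSE42;
    if (has_all_flags(cpu_flags, X264_CPU_SSE4))  return PATH_SSE41;
    return PATH_BASELINE;
}
```

So a mask of 0x006000 (SSE4.1 and SSE4.2, no AVX) would select PATH_SSE42.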

  15. #15
    Xtreme Enthusiast
    Join Date
    Jun 2008
    Location
    Northern Ohio
    Posts
    664
    I can't tell if this is a riddle, a challenge, or a "can you guess the numbers I already know" lol. All three of those instruction sets can bring very big performance gains when working with HD codecs as well as encryption (especially SSE 4.2).


    Work/Game System - ~24/7 WCG
    ASUS P8P67 PRO / i7 2600k @ 4.1Ghz / Gigabyte Radeon HD5870 / 4x4GB Corsair Vengeance @ 1600Mhz 9-9-9

    HTPC -~24/7 WCG
    Gigabyte GA-Z68AP-D3 / i7 2600k @ 4.0Ghz / Sapphire Radeon HD5830 / 2x2GB Mushkin Enhanced Essentials @ 1333Mhz 9-9-9

    XS WCG Team Forum - http://www.worldcommunitygrid.org/

  16. #16
    Xtreme 3D Team
    Join Date
    Mar 2007
    Location
    Rio de Janeiro, Brazil
    Posts
    445
    1. Linpack
    2. Test: i5 2500K running the latest Linpack library under Windows 7. Run the test without SP1 for the AVX-disabled result and with SP1 applied for the AVX-enabled result.
    3. Micro Screenshots

    - No AVX support ->

    - AVX support enabled ->
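Why the service pack works as an on/off switch: AVX code can only run when the OS saves the YMM registers across context switches, and Windows 7 gained that with SP1. Software is expected to check both the CPU and the OS before using AVX. Below is a sketch of just the decision logic, assuming the caller has already read CPUID leaf 1 ECX and XCR0 (via XGETBV). The function and macro names are made up for illustration.

```c
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */
#define XCR0_SSE_STATE     (1u << 1)   /* OS saves XMM state */
#define XCR0_AVX_STATE     (1u << 2)   /* OS saves upper YMM state */

/* Returns 1 only if both the CPU and the OS allow AVX to be used.
 * cpuid1_ecx: ECX from CPUID leaf 1; xcr0: value read with XGETBV(0).
 * XCR0 is only meaningful when the OSXSAVE bit is set. */
int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    uint32_t need_cpu = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX;
    uint64_t need_os  = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if ((cpuid1_ecx & need_cpu) != need_cpu)
        return 0;
    return (xcr0 & need_os) == need_os;
}
```

A library that performs this kind of check would presumably fall back to its SSE path on the same chip when the OS side is missing, which is what makes the before/after comparison possible.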

  17. #17
    Xtreme Cruncher
    Join Date
    Apr 2008
    Location
    Ohio
    Posts
    3,119
    I am looking for a way to run the BOINC benchmark with options to see if we can see the difference. I am having trouble finding anything online.

  18. #18
    Xtreme Addict
    Join Date
    May 2004
    Location
    Aland Islands, Finland
    Posts
    1,137
    Quote Originally Posted by NBF View Post
    1. Linpack
    2. Test: i5 2500K running the latest Linpack library under Windows 7. Run the test without SP1 for the AVX-disabled result and with SP1 applied for the AVX-enabled result.
    3. Micro Screenshots

    - No AVX support ->

    - AVX support enabled ->
    Almost double the GFLOPS, now that's a pretty neat speed-up.

    Can't think of anything other than crunching that I personally could benefit from with AVX.
    But at the same time I'm sure we all use some application that could take advantage of it; we just don't know about it. It would require the software developers to put some focus on it, which might not happen until enough people start putting in requests for it.
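"Almost double" lines up with the theoretical peak. A back-of-the-envelope sketch, assuming a Sandy Bridge core can retire 8 double-precision FLOPs per cycle with AVX (one 4-wide add plus one 4-wide multiply) versus 4 with 128-bit SSE, and assuming a 4-core part at 3.3 GHz. These figures are assumptions for illustration, not numbers taken from the screenshots.

```c
#include <assert.h>

/* Theoretical peak in GFLOPS = cores * clock (GHz) * FLOPs per cycle per core.
 * All three inputs are assumptions supplied by the caller. */
double peak_gflops(int cores, double ghz, int flops_per_cycle)
{
    assert(cores > 0 && ghz > 0.0 && flops_per_cycle > 0);
    return cores * ghz * flops_per_cycle;
}
```

With those assumptions, peak_gflops(4, 3.3, 8) is about 105.6 and peak_gflops(4, 3.3, 4) is about 52.8, so a well-tuned Linpack landing near 2x is what the hardware limit would suggest.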
    Asus Crosshair IV Extreme
    AMD FX-8350
    AMD ref. HD 6950 2Gb x 2
    4x4Gb HyperX T1
    Corsair AX1200
    3 x Alphacool triple, 2 x Alphacool ATXP 6970/50, EK D5 dual top, EK Supreme HF

  19. #19
    Xtreme Enthusiast
    Join Date
    Dec 2008
    Location
    Austin, Texas
    Posts
    599
    NBF is on track. Thanks Grinder.

  20. #20
    Xtreme Member
    Join Date
    Jun 2009
    Location
    Budapest, Hungary
    Posts
    262
    1. C-ray. According to Phoronix, c-ray benefits from AVX:



    2. Compile c-ray under Linux with AVX/SSE4.1/SSE4.2, then compare the completion time with the results of other c-ray binaries built with different FP math options.

    3. Well, I don't have Sandy Bridge, so the last benchmark failed without AVX:
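The same build-and-time method can be tried on any small kernel. A minimal sketch, using a stand-in loop rather than c-ray itself: build the same file several times with GCC, for example with -O3 -msse4.1, -O3 -msse4.2, and -O3 -mavx, then compare the reported times. The function names are made up for illustration.

```c
#include <stddef.h>
#include <time.h>

/* y[i] += a * x[i]; a simple loop that compilers can auto-vectorize.
 * This is a stand-in kernel, not code from c-ray. */
void saxpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Runs saxpy `reps` times and returns the elapsed CPU time in seconds. */
double time_saxpy(float a, const float *x, float *y, size_t n, int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        saxpy(a, x, y, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Use arrays large enough and enough repetitions that the run takes at least a few seconds, otherwise timer resolution swamps the difference. As post #20 found, a binary built with -mavx will not run on a CPU without AVX.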

    1090T | CH4F | HIS HD5850 | TT EvoBlue 750W | TT Spedo Advance | CM Aquagate Max | Samsung S27A350

  21. #21
    Xtreme Member
    Join Date
    Oct 2009
    Location
    Canada
    Posts
    320
    Quote Originally Posted by 64NOMIS View Post
    NBF is on track. Thanks Grinder.
    My pleasure
