Benchmarking Compression Ratio
Musho
12-31-2009, 07:46 AM
I'm not sure if this is the correct forum for this thread, but I've found a really interesting forum about compression and encryption. They have an image of a VM installation of Linux and are benchmarking programs and settings to see who can get it down as small as possible. This file is excellent for benchmarking, as an entire OS installation contains many different kinds of files, including pictures, text files, executables, etc.
This is the link to the general forum: http://encode.dreamhosters.com/
And here's the link to the benchmarking thread: http://encode.dreamhosters.com/showthread.php?t=507
Direct download link for the image: http://files.mail.ru/6GL31Q
Now let's see who can get that file down the smallest! Will be interesting to see some results! :up:
Edit: My results are already in the thread over there, but I decided it might be a good idea for me to add them here as well, so here you go:
My results:
-Original filesize: 3,95 GB (4.244.176.896 bytes)
-Freearc: 895 MB (939.307.719 bytes)
-Precomp 0.4 > Freearc: 779 MB (817.577.052 bytes)
-Precomp 0.4 > 7z > Freearc: 775 MB (812.958.875 bytes)
-Precomp 0.4 > SREP > Freearc: 771 MB (809.395.616 bytes)
-Precomp 0.4 > 7z > SREP > Freearc: 768 MB (805.338.865 bytes)
The best result is currently held by Skymmer over at that forum. He managed to get the file down to an astonishing 779 709 433 Bytes.
nice thread over there Musho!
Musho
01-03-2010, 02:26 PM
It seems that not many people were interested in compression benchmarks, so I decided to let this thread die. Since it's been brought back up, I'd like to post my recent achievement! These are the programs and switches I used:
precomp0.4 -slow -t-j
srep64 -l128
nanozip -nm -cc -m2g
Which gives us a final file size offffff........*drumroll*
667 MB (699.992.612 bytes)
Yay, the 700.000.000 bytes barrier is broken!
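For anyone who wants to compare the numbers above as ratios rather than raw sizes, here's a small Python sketch. The sizes are copied from the posts above; the labels and the function name `ratio_percent` are just naming made up for the illustration.

```python
# Sizes in bytes, copied from the results posted in this thread.
ORIGINAL_SIZE = 4_244_176_896

RESULTS = {
    "Freearc": 939_307_719,
    "Precomp 0.4 > Freearc": 817_577_052,
    "Precomp 0.4 > 7z > Freearc": 812_958_875,
    "Precomp 0.4 > SREP > Freearc": 809_395_616,
    "Precomp 0.4 > 7z > SREP > Freearc": 805_338_865,
    "Skymmer's result": 779_709_433,
    "precomp0.4 > srep64 > nanozip": 699_992_612,
}


def ratio_percent(compressed: int, original: int = ORIGINAL_SIZE) -> float:
    """Compressed size as a percentage of the original size (lower is better)."""
    return 100.0 * compressed / original


if __name__ == "__main__":
    # Print the results from smallest to largest output.
    for label, size in sorted(RESULTS.items(), key=lambda item: item[1]):
        print(f"{label:36s} {size:>13,d} bytes  {ratio_percent(size):6.2f}%")
```

By this measure the sub-700.000.000 result is roughly 16.5% of the original image, against roughly 22% for Freearc alone.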
o.k :up:,
congrats man!
these are GREAT results,
will probably try testing it soon enough on some other >4GB files and maybe even give the VM one a shot (hopefully on a RAM_D) :).
Musho
01-03-2010, 04:33 PM
Onex, you're brilliant! Why didn't I think of that! Precomp is still in alpha and doesn't make very good use of RAM, causing lots of HDD access and resulting in a bottleneck on the storage, even with my fast SSD. Doing the precomping on a RAMdrive would surely increase the speed greatly. Too bad I don't have enough RAM to fit those huge files on a RAMdrive, though. Precomping the VM file took me 17 hours with my current setup, although precomp is much faster on other kinds of files. Might invest some extra in more RAM with my future i7 setup ;) Let me know how it's going with your 4gb+ files :)
yeah, RAM_D is great for this purpose, though you'll need a lot of ram to take care of a 4GB file (and a few more GB for its output)..
shame most of the freeware in this category isn't decent enough, and the others cost money..
well...
cost money :D.
though,
that would be a hell of a benchmark :up:!
anyhow,
if there'll be some spare time here, will download a nice game file for this purpose though
it's a slow CPU, on an even relatively slower HDD so in about few weeks or a month,
you'll get the answer :shrug:
j/k though, it's probably going to take a while :shrug: :rolleyes:.
p.s - will you give it a shot? http://www.xtremesystems.org/forums/showthread.php?t=240679
curiosity is burning inside ;).
p.s 2 - http://freearc.org/HFCB.aspx u broke a multiple apps record ;).
pjakesmith
01-03-2010, 05:48 PM
Thanks a lot for the post!
Musho
01-04-2010, 07:00 AM
I'd love to give the HDD benchmark a shot, although I'm afraid my SSD (Vertex 120gb) is dying and corrupting files randomly. I've already requested RMA. In the meantime, I've ordered an X25-M G2 80gb to use. I'll prolly sell the Vertex on Ebay or something once I get it back from RMA and add 1 or 2 extra X25-M's in raid0 to increase those sequentials! If you're wondering how well my SSD used to perform, here's a benchmark someone else ran on their Vertex 120gb:
http://www.legitreviews.com/article/954/5/
yeah, i'm more interested in seeing how well it performs multi-threading.
the indexing service is relatively unnecessary for a 120GB HD, it lists u'r files for faster query,
u can enable it, run the test for a few seconds and then disable it, u don't need to run the whole indexing/unindexing for it,
even more, u usually don't even need to run the indexing service from the services.msc console.
anyway,
PM when u'd like to sell u'r Vertex, i might take it if it is for a reasonable price.
i'll try and download a small 1GB game file for the compression test,
even on a few hundreds of MB just to watch the compression percentage,
if u'r going to run a lot of tests on different compression apps, u should really consider adding some extra RAM for it,
a 8 or 12 GB would be great for that purpose, and u got RAM Disk utility which is free and is quite good,
u just need to be aware that u cannot uninstall a ram drive from Windows XP and then reinstall it as there is a conflict between the XP kernel and the RAM Disk driver which demands a reboot whenever u would like to cancel and re enable it.
basically, after u'r clear with the application issues, it could be definitely quite alright :).
Musho
01-04-2010, 07:42 AM
Because these kinds of high compression tasks are very calculation intensive, the bottleneck will normally be on the CPU, even with an old platter based HDD. The only reason precomp is so slow is that it's doing most of the work on the HDD instead of in RAM, as it's still in alpha status. The creator of that program is also active on that forum and the next release will most likely use your RAM, which will move the bottleneck back to the CPU. I don't really feel like spending loads of money to speed up the precomp process, which is usually _much_ faster on other kinds of files (the VM file is just really complex, as it contains an entire OS), just to get faster speeds till the next release :) Hopefully I'll be back up and running with my main rig in a few hours on an old platter based HDD.
I could test the compression on a game ISO if you can name me a game whose ISO is around 1gb in size. I will download it and test the compression. I know downloading pirated games is frowned upon, but I'm sure it's okay to test some compression on the files and delete the ISO afterwards. Alternatively, I could rip one of my own games and compress that. I have Age of Mythology lying around somewhere. I will rip both disc 1 and 2 and compress them together. Those are relatively small ISOs, so the test won't take 12 hours+ to perform.
Will report the results once I am done. That will most likely be tomorrow, as I will have to reinstall my rig first :)
yeah,
these game files are not being used for playing, so it should be fine with everyone ;).
though actually Skymmer has said in the thread u posted that some game files (COD modern warfare 2 for example) are already compressed through several different compression algorithms so it could be a waste of time..
though maybe if u do have some time, maybe other combinations are still worth a try.
about the ramdisk, u could probably set the output to the ssd to save some $ on extra ram, maybe try it with a small 1GB file, i'll try looking up one game file for testing for u if u won't find though, u can take any file maybe even openoffice torrent or linux ISO distribution which should be ~700MB.
as for the VM file u'r right, it's probably better to wait for the actual release if u'r not going to be using it a lot anyways ;).
and tnx for the PM.
Musho
01-04-2010, 10:47 AM
Yes, but that's exactly what precomp is for. Precomp decompresses already compressed data and is capable of re-compressing it later to a state in which it is bit-identical to the input file. This lets the user decompress already compressed data first, and then use stronger compression programs and algorithms to compress the data to a better ratio.
I agree with Skymmer that COD won't compress as well as the same amount of uncompressed data, but I could surely get the size down quite a bit by using precomp and compressing it with something stronger (srep64 + nanozip).
To get really great compression ratios, you should try to compress installed programs. A Linux ISO or Openoffice installer is already compressed data, so even though you can compress it further by decompressing it with precomp first, it wouldn't compress quite as well as uncompressed data like an installed program.
COD is slightly different, as some parts of the installed data are still compressed and get decompressed on the fly as you're loading levels/textures. Some games which compress really well once they are installed are: Grid and Devil May Cry 4.
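To make the idea concrete, here's a rough Python sketch using only the standard library. It is not how precomp works internally, and it doesn't use precomp, srep or nanozip at all: `zlib` merely stands in for the weaker compression already inside a file, and `lzma` stands in for the stronger compressor applied afterwards. The sample data and the names `make_sample` and `compare` are made up for the illustration.

```python
import lzma
import random
import zlib


def make_sample(seed: int = 0) -> bytes:
    """Text-like data whose repeats lie further apart than Deflate's 32 KB window."""
    rng = random.Random(seed)
    vocab = [("word%03d" % i).encode() for i in range(300)]
    block = b" ".join(rng.choice(vocab) for _ in range(10_000))  # about 80 KB
    return block * 5  # the same block five times in a row


def compare(raw: bytes) -> dict:
    """Compress an already-deflated stream directly vs. after inflating it first."""
    deflated = zlib.compress(raw, 6)           # stand-in for data compressed inside a file
    direct = lzma.compress(deflated)           # strong compressor on compressed data
    inflated = zlib.decompress(deflated)       # the "decompress first" step
    unpacked_first = lzma.compress(inflated)   # strong compressor on the unpacked data
    return {
        "deflated": len(deflated),
        "inflated": len(inflated),
        "direct": len(direct),
        "unpacked_first": len(unpacked_first),
    }


if __name__ == "__main__":
    for name, size in compare(make_sample()).items():
        print(f"{name:15s} {size:>9,d} bytes")
```

The intermediate (inflated) data is bigger than what you started with, but the stronger compressor can now see the redundancy that the weak compression was hiding, so the final output should come out smaller.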
E:
ohh,
your point is more obvious now,
so what u'r saying is that precomp is capable of actually eliminating earlier compression for certain algo's being used and re-compressing the data in its own file format.
that enables the user to decompress the file and experiment with it through different compression methods.
though the only problem with it is that precomp is aimed at decompressing zlib, Deflate and GIF files so others shouldn't get much affected by it.
it isn't perfect, Skymmer has said that only about half of the files are zlibed and the others are zipped (should mean Deflate which is used by precomp) and the rest are ~2GB BIK.
still, it's really worth a shot,
this is a nice benchmark,
if it won't take more than couple of days, i'll definitely give it a shot,
or you would ?
aside from that,
did u hear about that guy from France (or somewhere else in Eu) who invented a compression algorithm a few years back claiming it could shrink files of a few hundred MB to only a few KB?
he was found dead later on and people were saying some sort of a conspiracy of DVD makers is related to that.
thank u for all the info,
this was an interesting input!
Musho
01-05-2010, 03:19 AM
E:
ohh,
your point is more obvious now,
so what u'r saying is that precomp is capable of actually eliminating earlier compression for certain algo's being used and re-compressing the data in its own file format.
that enables the user to decompress the file and experiment with it through different compression methods.
though the only problem with it is that precomp is aimed at decompressing zlib, Deflate and GIF files so others shouldn't get much affected by it.
it isn't perfect, Skymmer has said that only about half of the files are zlibed and the others are zipped (should mean Deflate which is used by precomp) and the rest are ~2GB BIK.
still, it's really worth a shot,
Precomp doesn't actually recompress the data it managed to decompress. You have to use other programs for that, like 7zip, freearc, srep, rar, zip, nanozip, etc. What this means is that precomping a compressed file will actually _increase_ the size. The 4.24gb VM file, for example, was a little over 5gb after precomping, but compressed a lot better (about 100mb better, which is huge when the file size is only around 700-800mb). The .BIK files in COD are actually renamed .ZIP files, and precomp decompresses them just fine. What Skymmer meant to say was that COD is actually already compressed, although with an inferior algorithm, so you won't get a ratio as good as you would if you compressed an uncompressed game. I'll give you an example (note, the sizes and numbers are completely made up, they are just there to prove a point):
COD (8gb) > Precomp (12gb) > Nanozip (4gb)
Result: Precomp managed to decompress the data into 12gb of uncompressed data, which then compressed to 4gb. You managed to shrink the game in half (50%), although the actual uncompressed data was shrunk to 33% of its original size (4gb out of 12gb).
Game X (8gb of uncompressed data) > Precomp (still 8gb, because there was nothing to decompress) > Nanozip (2.67gb)
Result: You managed to shrink the uncompressed data to 33% of its original size, which is exactly the same as in the COD example.
Of the two examples, Game X looks far more impressive, as you managed to compress 8gb of data to 2.67gb, although the uncompressed data in both examples compressed just as well! That's why it's harder to get a good compression ratio with COD: it's harder to shrink 8gb of compressed data (12gb of actual uncompressed data) than it is to shrink 8gb of uncompressed data (8gb of actual data). Hope this explains it all :)
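The arithmetic of the two made-up examples can be checked with a few lines of Python. The numbers are the same invented ones as above, and the function name `ratios` and its parameter names are just for illustration.

```python
def ratios(stored_gb: float, unpacked_gb: float, final_gb: float) -> tuple:
    """Return (apparent ratio, real ratio) for one compression run.

    apparent: final size relative to the size you started with on disk
    real:     final size relative to the fully decompressed data
    """
    return final_gb / stored_gb, final_gb / unpacked_gb


# COD: 8gb on disk, 12gb after precomp, 4gb after the final compressor.
cod = ratios(stored_gb=8, unpacked_gb=12, final_gb=4)

# Game X: 8gb on disk, nothing for precomp to unpack, one third of that after compression.
game_x = ratios(stored_gb=8, unpacked_gb=8, final_gb=8 / 3)

if __name__ == "__main__":
    print(f"COD:    apparent {cod[0]:.0%}, real {cod[1]:.0%}")
    print(f"Game X: apparent {game_x[0]:.0%}, real {game_x[1]:.0%}")
```

The apparent ratio is what a benchmark table shows, while the real ratio is what the compressor actually achieved on the underlying data, and the latter is identical in both cases.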
this is a nice benchmark,
if it won't take more than couple of days, i'll definitely give it a shot,
or you would ?
You could easily give it a shot. Precomping around 6-8gb of data takes me around 2-3 hours for game data. The VM file only took me 17 hours because it's way more complex, as it's an entire OS. The srep64 process takes around 15-30 minutes and last but not least, nanozip with the switches I provided takes me around 2-3 hours. You could definitely try it out in 8 hours max. (And don't forget you can still use your PC while it's busy. None of these programs are multithreaded yet, so it will only load 1 core.)
aside from that,
did u hear about that guy from France (or somewhere else in Eu) who invented a compression algorithm a few years back claiming it could shrink files of a few hundred MB to only a few KB?
he was found dead later on and people were saying some sort of a conspiracy of DVD makers is related to that.
thank u for all the info,
this was an interesting input!
Yes, that guy's name is "Jan Sloot" and he's from The Netherlands, the same country I am from :) He actually got me interested in compression techniques :) While I don't believe the claims that were made, as it can be proven mathematically that what they claimed is impossible, I certainly believe he found out something important. I think he demoed his system in 1998, showing 16+ movies running simultaneously while being able to speed up the playback, reverse, etc in realtime! Now even if he didn't have some kind of compression algorithm, what he showed there was VERY impressive with the hardware they had back then. Sure, he _could_ have faked it, but how? Letting a powerful machine do all the work and transmit it wirelessly? That alone would be a huge invention, I think WIFI was released later, not to mention so many movies would require quite some bandwidth. He stored it on expensive solidstate memory? Even then the machine must have been extremely powerful, since the processing power required to do what he demoed would be enormous for the kind of hardware they had back then. It would have been a great multimediabox invention.
No, what I think happened was that the claims were greatly exaggerated, but that he actually did have a powerful and efficient compression algorithm. It's a pity all the information was lost when he died.
yeah, it is well understood,
the only thing left is that these BIK files are RAD Game Tools video files for games.
they are compressed already, and it's doubtful but possible that precomp can decompress them and recompress them again.. (a guess).
it still doesn't look good enough, 700MB from a 4GB file seems fine yet another 100MB isn't quite worth it,
there is here X3 terran conflict which is some 9.3 GB rared, 12.1 unrared and 6.2 raw waiting to install :confused:..
probably, the same files have overwritten one another while extracting the ISO's.
will have a look soon, adding precomp, srep and NZ with the switches u went with while taking the VM file to under 700MB.
should be interesting,
will post u back the results :).
p.s - was looking for "jean salutte" through google.. ;) and couldn't find it,
i knew this was the guy's name,
poor fellow, strange case,
well, thanks for the reminder,
now it is there ;).
E: PC crashed due to a bugged driver during run time,
@ ~32hrs 92% of the load,
currently seems as if the OP has lost interest in the tests,
might update after a full run later on.