Big milestone:
Name: XtremeSystems
URLs crawled: 30,011,997,950
Data (MB)*: 766,566,997
Congratulations to everyone who helped us get here!!
oh come on I can't be the only one impressed
Crunch with us, the XS WCG team
The XS WCG team needs your support.
A good project with good goals.
Come join us, get that warm fuzzy feeling that you've done something good for mankind.
It's alright I guess...
Yay! missed this completely ...WELL DONE TEAM
My Biggest Fear Is When I die, My Wife Sells All My Stuff For What I Told Her I Paid For It.
79 SB threads and 32 IB Threads across 4 rigs 111 threads Crunching!!
There we go
We need to catch up with refic at least... he has reached the PB
"Study hard my young friend".
---------------------------------------
Woody: It's not a laser! It's a... [sighs in frustration]
Either way, it's still good to have you back, Hixie
I've been crawling for quite a while ... just not large figures, and been too busy to post here.
BTW dave, why aren't you going to vegas? I'll be there with a booth this year.
Finally have conquered a quarter of the pie Tasty URLs!
1.7 final node released:
You can increase max reserve buckets from 1 to 3 for more sustained crawling
v1.7.0 5/02/11
+ Support for new generation of central server in parallel with current
! Better handling of redirects
! Better control of domain counts during crawling
! Improved analysis of crawl errors
! Fixed rare issue with empty indexed data written incorrectly
! Mono - support for alternative spawning of archiver, .NET 2.0 build is now the only one available
! Mono builds - Less junk in log on communication errors
! Mono - better logic for handling multiple crawlers on same box
! Added MaxPriorityBuckets parameters to options
! Max reserved (pre-cache) buckets is forced to 1 in order to enable more efficient crawling on whole distributed network
! New SQLite build used (once run database won't be backwards compatible with 1.6.x series)
! Bundled 64-bit SQLite build with Mono distributions
! Mono builds now support https protocol crawling
! Reduced number of messages printed by default (can still be shown if Warnings mode is on)
! Put a limit on barrel archiving to avoid creating too many temporary files in odd data sets
I expect he is in the same boat as the rest of us...apart from being away just now.... waiting for the new server to be up in order to redo settings to maximise output. I see we can start in on this tomorrow ...he said hopefully.
Well done on the numbers by the way
Two issues:
First is I haven't been able to get more than 6mbit total from 3 machines since Alex made those changes.
Second is I have one machine down.
I swapped in CPUs that had been in that machine before and were fine, and now it won't POST.
I'm clueless as to why, and that was my main machine.
Dave: use a linux box (ubuntu). Installation takes 5 mins at most with no previous knowledge.
Also: refic has made a script to install mono (the toughest part of the linux install).
We can guide you, and you can call DF, who knows it quite a bit :P.
Setting up the ubuntu box (for mj12) could take you 30-40 mins, and you can use all of the 30 mbit from a single quad core (an old q6000 would do it with no sweat).
Dave, under the 'More Crawler' tab in options, try setting 'Maximum deep crawl buckets' to 0 and 'Maximum priority buckets' to 50 (or even 100 if 50 works ok). Also, Alex just released the 1.7 final version, which lets you raise your reserved buckets from 1 to 3. That should use up more of your connection while crawling, so upgrading should give you a boost.
Regarding linux, it's very easy to install and run, BUT it's a bit of a pain in the a$$ because mono is less stable than .NET under Windows, so you constantly have to watch for bugs (not fun).
Congrats on 11 Billion Dave!
I got hit by a nasty concoction of viruses and malware last week, and just finished painstakingly reinstalling Windows and everything else (though I have made a clone image of my HD and am going to do monthly backups in case something goes wrong again!).
sh.. DF.. I always get that kind of sh.. I wonder why Alex doesn't include an AV + malware scanner... he could even sell a list of websites containing that sh..