Google completes new web indexing system

by Scott Bicheno on 9 June 2010, 11:12

Tags: Google (NASDAQ:GOOG)

Search stimulus

Internet giant Google has announced the completion of its new web indexing system, which it's calling Caffeine, promising it will index web content sooner after it's published.

In a blog post, software engineer Carrie Grimes describes how the current Google indexing system - from which all Google search results are derived - has several layers, some of which are refreshed more frequently than others. This means there is a 'significant' delay between Google finding a page and presenting it in a search result. The video below explains the current system more thoroughly.

Caffeine, however, updates the whole index continually, which means newly published content will appear in Google search results sooner than before. "Caffeine provides 50 percent fresher results for web searches than our last index," blogged Grimes.

To further illustrate how effective Caffeine is, Grimes offers some tangible facts and figures:

"Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second.

"Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles."
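For anyone who wants to sanity-check the iPod comparison, here is a rough sketch of the arithmetic in Python. It uses only the figures quoted above; the function names are ours, and the decimal conversion (1 TB = 1,000 GB) and the metres-per-mile constant are assumptions we have added, not something Google specified.

```python
# Figures as quoted in Google's blog post (all approximate)
INDEX_SIZE_GB = 100_000_000   # "nearly 100 million gigabytes"
IPOD_COUNT = 625_000          # "625,000 of the largest iPods"
STACK_MILES = 40              # "more than 40 miles"

# Assumptions added for this sketch
METERS_PER_MILE = 1609.344    # international mile
GB_PER_TB = 1000              # decimal units


def implied_ipod_capacity_gb(total_gb, count):
    """Storage each iPod would need to hold for the comparison to work."""
    return total_gb / count


def implied_ipod_length_cm(miles, count):
    """Length of each iPod if `count` of them laid end-to-end span `miles`."""
    return miles * METERS_PER_MILE * 100 / count


def gb_to_tb(gb):
    """Convert gigabytes to terabytes using decimal units."""
    return gb / GB_PER_TB


if __name__ == "__main__":
    print(implied_ipod_capacity_gb(INDEX_SIZE_GB, IPOD_COUNT), "GB per iPod")
    print(round(implied_ipod_length_cm(STACK_MILES, IPOD_COUNT), 1), "cm per iPod")
    print(gb_to_tb(INDEX_SIZE_GB), "TB in total")
```

On these numbers each iPod would hold 160 GB and measure roughly 10 cm, so the comparison appears to be internally consistent, and the index works out to about 100,000 terabytes.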

HEXUS Forums :: 2 Comments

Is this part of Google's trend towards real-time search then? The stage after the whole twitter/news feed feature that they kept hyping up?
100,000 terabytes? That's even more than me :eek: