Google released the next Search Off the Record podcast, which was actually recorded at least two months ago. In it, Gary Illyes from Google broke down what the Google Caffeine index and system actually does.
If you remember, a problem with Caffeine was one of the reasons something broke in Google Search a little while ago.
Here is the recording; this part of the conversation starts at about 9 minutes in:
Here is what Gary said:
Martin then stops Gary and asks him to explain what the conversion part means. Gary goes on to explain that it does convert the protocol buffer into a different format, but it also has to normalize the HTML.
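To give a rough feel for what "normalizing the HTML" can mean, here is a minimal sketch in Python. It is not how Caffeine works, since Google has not published that code. It assumes normalization means lowercasing tag and attribute names, quoting attribute values, and closing tags the page left open. The names `Normalizer`, `normalize_html`, and `VOID_TAGS` are made up for this illustration.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, for illustration only).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Normalizer(HTMLParser):
    """Hypothetical normalizer: re-emits markup in one consistent form.

    Comments and doctype declarations are dropped because their handlers
    are not overridden.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # normalized output fragments
        self.open_tags = []  # stack of elements still waiting for a close tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        rendered = ""
        for name, value in attrs:
            quoted = escape(value or "", quote=True)
            rendered += f' {name}="{quoted}"'
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag with no matching open tag: ignore it
        # Close everything opened after the matching tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def normalize_html(raw):
    parser = Normalizer()
    parser.feed(raw)
    parser.close()  # flushes any buffered text
    # Close whatever the page never closed.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)


print(normalize_html("<P CLASS=intro>Hello <B>world"))
# prints: <p class="intro">Hello <b>world</b></p>
```

A real pipeline would have to deal with far more than this, such as scripts, character encodings, and deeply broken markup, but the idea is the same: messy input goes in, and one predictable format comes out for the next step to work with.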
So as you can see, it does a lot, really, a lot.
It is definitely worth listening to. The whole section goes on for about 10 minutes.
Oh, and Gary might do some sort of recording of his Life of a Query talk, not just for internal use but for the public.
Forum discussion at Twitter.