How do search engines search their entire cache for terms in .1 seconds?
It's obviously exabytes of data, how is that even possible?
Because it's not actually that many results.
Go to Google and search for cheese. It'll come up with some bullshit number like 600 million results found, but the results cap out at around page 35, with fewer than 400 actually displayed.
google exponential growth
>cheese
pizza?
no just cheese
how do they get away with this
yeah but is google searching its whole cache for "cheese" for you or not?
searching through fifty million terabytes of data should take a few minutes no matter what hardware you have
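Rough napkin math on that claim (the throughput and parallelism numbers below are made up, Python just for the arithmetic):
[code]
corpus_bytes = 50e6 * 1e12       # "fifty million terabytes" from the post above
disk_throughput = 500e6          # assumed ~500 MB/s sequential read per disk
disks = 10_000                   # assumed number of disks scanned in parallel

scan_seconds = corpus_bytes / (disk_throughput * disks)
print(f"brute-force scan: ~{scan_seconds / 3600:.0f} hours")   # thousands of hours, not minutes

# vs. an index lookup, which only reads the posting list for the query term
posting_list_bytes = 100e6       # assumed size of a very popular term's postings
print(f"index lookup: ~{posting_list_bytes / disk_throughput:.2f} s")
[/code]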
ha ha ha that IS a Jow Forums reference!
There's room for a better search engine than Google for specialized web forensics
such as
it is the power of billions of dollars
now google hashtables instead of cheese
habeeb it
Google literally has well-published, well-documented papers on this. I don't fucking understand why you would ever ask on Jow Forums when this place is just full of software engineer larpers
Algorithms.
hashtables
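Toy sketch of what anon means by hashtables, i.e. an inverted index built offline so a query is a single lookup (documents and terms here are made up):
[code]
from collections import defaultdict

# three fake documents; a real index is built from the crawl, not at query time
docs = {
    1: "cheese pizza recipe",
    2: "how to make cheese at home",
    3: "pizza dough basics",
}

# the "hashtable": term -> set of doc ids, built once, offline
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

# at query time it's a single hash lookup, independent of corpus size
print(index["cheese"])   # {1, 2}
[/code]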
God I wish that were me
The girl or the bag of spaghetti?
you can't hash everything, only certain things
>spaghetti without sauce
but for real, I think this is cute
No, it doesn't search its whole cache, why would it? It's already got weighted results ready to display on the first page, only something like 20. The secret is to have results ready to respond with before anyone searches for them. Everything is already sorted and ready to go when you press enter. They don't actually go through 6*10^20 results because they don't have to; they just have to fill up the first page. If you go to the second page, they display the next 20 results from the sorted cache for cheese. If you have multiple keywords in your search, it's basic set algebra to produce a new 20 results.
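Toy sketch of that post's point (scores and postings are invented): everything gets sorted offline, serving page N is just a slice, and a multi-keyword query is an intersection of posting lists:
[code]
# term -> list of (doc_id, score), sorted by score, precomputed offline
ranked_postings = {
    "cheese": [(d, 1000 - d) for d in range(1, 601)],    # pretend 600 hits
    "pizza":  [(d, 900 - d) for d in range(1, 401, 2)],  # pretend 200 hits
}

PAGE_SIZE = 20

def search(terms, page=1):
    """Serve one page without walking the whole index."""
    hits = ranked_postings.get(terms[0], [])
    if len(terms) > 1:
        # multiple keywords: intersect the doc id sets, keep the scores
        common = set.intersection(
            *({d for d, _ in ranked_postings.get(t, [])} for t in terms))
        hits = [(d, s) for d, s in hits if d in common]
    start = (page - 1) * PAGE_SIZE
    return hits[start:start + PAGE_SIZE]     # a slice, not a scan

print(len(search(["cheese"], page=1)))           # 20
print(len(search(["cheese", "pizza"], page=2)))  # 20, still just a slice
[/code]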
Unironically image related. I know it's a buzzword, but it's true.
Instead of designing your system to scale vertically (one large, powerful machine with a few replicas), you spread the data across hundreds of thousands of machines.
ScyllaDB (a Cassandra alternative written in C++) is the closest thing we have to Google's BigTable that's also FOSS.
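Toy sketch of the horizontal-scaling idea (shard count and in-process dicts stand in for real machines): partition the index by hashing each term, so a query only touches the shard that owns it:
[code]
NUM_SHARDS = 8   # stand-in for thousands of real machines

shards = [dict() for _ in range(NUM_SHARDS)]   # each dict pretends to be one machine

def shard_for(term):
    # partition the index by term so every machine only holds a slice of it
    return hash(term) % NUM_SHARDS

def index_term(term, doc_id):
    shards[shard_for(term)].setdefault(term, []).append(doc_id)

def query(term):
    # only the one shard that owns this term does any work
    return shards[shard_for(term)].get(term, [])

index_term("cheese", 1)
index_term("cheese", 2)
index_term("pizza", 3)
print(query("cheese"))   # [1, 2]
[/code]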
(((they))) are merely pretending, in reality everything you see is hand picked to support narrative
It matches a pattern and ranks results only for the page you requested.