Language and Big Data

But on the other hand, it really is true that the probability for most grammatical sequences of words actually having turned up on the web really is approximately zero, so grammaticality cannot possibly be reduced to probability of having actually occurred.

Language Log: The sparseness of linguistic data
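
The sparseness point in the quoted passage can be made concrete with a rough count. The sketch below is only an illustration: the vocabulary size, sequence length, and corpus size are assumed round figures, not numbers from the Language Log post, and the function name is hypothetical. It counts every possible word sequence, grammatical or not, so it does not estimate the grammatical subset the quote is about; it only shows how small a share of possible sequences any corpus can contain.

```python
# Assumed, illustrative figures (not taken from the source).
VOCAB_SIZE = 10_000       # hypothetical number of distinct words
SEQ_LEN = 10              # hypothetical sequence length, in words
CORPUS_TOKENS = 10**12    # hypothetical corpus size, in words


def max_fraction_observed(vocab_size, seq_len, corpus_tokens):
    """Upper bound on the share of all length-seq_len word sequences
    that a corpus of corpus_tokens words could possibly contain."""
    # Number of distinct sequences of seq_len words over the vocabulary.
    possible = vocab_size ** seq_len
    # A corpus of N tokens has at most N - seq_len + 1 windows of that length.
    windows = max(corpus_tokens - seq_len + 1, 0)
    return min(windows / possible, 1.0)


fraction = max_fraction_observed(VOCAB_SIZE, SEQ_LEN, CORPUS_TOKENS)
print(f"At most {fraction:.1e} of possible sequences could have occurred")
```

Under these assumptions the bound is on the order of 10^-28, since 10^12 windows are spread over 10^40 possible sequences. Changing the assumed figures shifts the exponent but not the overall picture for sequences of any appreciable length.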