Docs/manual.texi +16 −4
@@ -36507,6 +36507,18 @@
carefully tuned up this way).

For very small tables, word distribution does not adequately reflect semantic value, and this model may sometimes produce bizarre results.

For example, a search for the word "search" produces no results in the above example. The word "search" is present in more than half of the rows and, as such, is effectively treated as a stopword (that is, a word with zero semantic value). This is, in fact, the desired behaviour: a natural language query should not return every second row of a 1GB table.

A word that matches 50% of the rows has little ability to locate relevant documents (and will also find plenty of irrelevant ones, as happens all too often when searching the Internet with a search engine) and, as such, has low semantic value in @strong{this particular dataset}.

@page
@cindex environment variables, list of
@node Environment variables, Users, MySQL internals, Top
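The "more than half of the rows" rule described in the added paragraph can be sketched with a toy model. This is only an illustration, not MySQL's ranking code: the function names, the `threshold` parameter, and the sample rows are all hypothetical, and real full-text ranking weighs words rather than simply dropping them.

```python
# Toy illustration only: NOT MySQL's actual full-text implementation.
# All names and sample rows here are hypothetical.

def selective_words(rows, threshold=0.5):
    """Return the words that occur in at most `threshold` of the rows."""
    doc_freq = {}
    for row in rows:
        # Count each word once per row (document frequency).
        for word in set(row.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    limit = threshold * len(rows)
    return {w for w, n in doc_freq.items() if n <= limit}


def search(rows, word, threshold=0.5):
    """Return matching rows, treating overly common words as stopwords."""
    word = word.lower()
    if word not in selective_words(rows, threshold):
        # Word is absent, or present in more than half of the rows.
        return []
    return [r for r in rows if word in r.lower().split()]


rows = [
    "full text search basics",
    "search tuning tips",
    "search engine internals",
    "storage engine notes",
    "backup and restore",
]

print(search(rows, "search"))  # "search" is in 3 of 5 rows
print(search(rows, "engine"))  # "engine" is in 2 of 5 rows
```

In this sketch "search" occurs in three of the five rows and so matches nothing, while "engine" occurs in two rows and returns both of them.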