Docs/manual.texi +45 −201
@@ -411,6 +411,7 @@ MySQL Language Reference
* SET OPTION:: @code{SET OPTION} syntax
* SET TRANSACTION:: @code{SET TRANSACTION} syntax
* GRANT:: @code{GRANT} and @code{REVOKE} syntax
* HANDLER:: @code{HANDLER} syntax
* CREATE INDEX:: @code{CREATE INDEX} syntax
* DROP INDEX:: @code{DROP INDEX} syntax
* Comments:: Comment syntax
@@ -13653,6 +13654,7 @@ to restart @code{mysqld} with @code{--skip-grant-tables} to run
* SET OPTION:: @code{SET OPTION} syntax
* SET TRANSACTION:: @code{SET TRANSACTION} syntax
* GRANT:: @code{GRANT} and @code{REVOKE} syntax
* HANDLER:: @code{HANDLER} syntax
* CREATE INDEX:: @code{CREATE INDEX} syntax
* DROP INDEX:: @code{DROP INDEX} syntax
* Comments:: Comment syntax
@@ -22615,7 +22617,7 @@ You can set the default isolation level for @code{mysqld} with
@findex GRANT
@findex REVOKE
@node GRANT, CREATE INDEX, SET TRANSACTION, Reference
@node GRANT, HANDLER, SET TRANSACTION, Reference
@section @code{GRANT} and @code{REVOKE} Syntax
@example
@@ -22843,11 +22845,52 @@ dropped only with explicit @code{REVOKE} commands or by manipulating the
@strong{MySQL} grant tables.
@end itemize

@findex HANDLER
@node HANDLER, CREATE INDEX, GRANT, Reference
@section @code{HANDLER} Syntax

@example
HANDLER table OPEN [ AS alias ]
HANDLER table READ index @{ = | >= | <= | < | > @} (value1, value2, ... ) [ WHERE ... ] [LIMIT ... ]
HANDLER table READ index @{ FIRST | NEXT | PREV | LAST @} [ WHERE ... ] [LIMIT ... ]
HANDLER table READ @{ FIRST | NEXT @} [ WHERE ... ] [LIMIT ... ]
HANDLER table CLOSE
@end example

The @code{HANDLER} statement provides direct access to the @strong{MySQL}
table interface, bypassing the SQL optimizer. Thus, it is faster than
@code{SELECT}.

The first form of the @code{HANDLER} statement opens a table, making it
accessible via the subsequent @code{HANDLER ... READ} statements.

The second form fetches one row (or more, as specified by the @code{LIMIT}
clause) where the specified index satisfies the given condition and the
@code{WHERE} condition is met. If the index consists of several parts
(spans several columns), the values are specified as a comma-separated
list; it is possible to provide values for only the first few columns.

The third form fetches one row (or more, as specified by the @code{LIMIT}
clause) from the table in index order, matching the @code{WHERE} condition.

The fourth form (without an index specification) fetches one row (or more,
as specified by the @code{LIMIT} clause) from the table in natural row
order (as stored in the data file), matching the @code{WHERE} condition.
It is faster than @code{HANDLER table READ index} when a full table scan
is desired.

The last form closes a table that was opened with @code{HANDLER ... OPEN}.

@code{HANDLER} is a somewhat low-level statement. For example, it does not
provide consistency. That is, @code{HANDLER ... OPEN} does @strong{not}
take a snapshot of the table and does @strong{not} lock the table. This
means that after @code{HANDLER ... OPEN} is issued, table data can be
modified (by this or another thread), and these modifications may appear
only partially in @code{HANDLER ... NEXT} or @code{HANDLER ... PREV} scans.

@cindex indexes
@cindex indexes, multi-part
@cindex multi-part index
@findex CREATE INDEX
@node CREATE INDEX, DROP INDEX, GRANT, Reference
@node CREATE INDEX, DROP INDEX, HANDLER, Reference
@section @code{CREATE INDEX} Syntax
@example
@@ -40814,205 +40857,6 @@ started to read and apply updates from the master.
@code{mysqladmin processlist} only shows the connection, @code{INSERT DELAYED},
and replication threads.

@cindex searching, full-text
@cindex full-text search
@cindex FULLTEXT
@node MySQL full-text search, MySQL test suite, MySQL threads, MySQL internals
@section MySQL Full-text Search

Since Version 3.23.23, @strong{MySQL} has support for full-text indexing
and searching.
Full-text indexes in @strong{MySQL} are an index of type
@code{FULLTEXT}. @code{FULLTEXT} indexes can be created from @code{VARCHAR}
and @code{TEXT} columns at @code{CREATE TABLE} time or added later with
@code{ALTER TABLE} or @code{CREATE INDEX}. For large datasets, adding a
@code{FULLTEXT} index with @code{ALTER TABLE} (or @code{CREATE INDEX}) is
much faster than inserting rows into an empty table that already has a
@code{FULLTEXT} index.

Full-text search is performed with the @code{MATCH} function.

@example
mysql> CREATE TABLE t (a VARCHAR(200), b TEXT, FULLTEXT (a,b));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t VALUES
    -> ('MySQL has now support', 'for full-text search'),
    -> ('Full-text indexes', 'are called collections'),
    -> ('Only MyISAM tables','support collections'),
    -> ('Function MATCH ... AGAINST()','is used to do a search'),
    -> ('Full-text search in MySQL', 'implements vector space model');
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0

mysql> SELECT * FROM t WHERE MATCH (a,b) AGAINST ('MySQL');
+---------------------------+-------------------------------+
| a                         | b                             |
+---------------------------+-------------------------------+
| MySQL has now support     | for full-text search          |
| Full-text search in MySQL | implements vector space model |
+---------------------------+-------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT *,MATCH a,b AGAINST ('collections support') as x FROM t;
+------------------------------+-------------------------------+--------+
| a                            | b                             | x      |
+------------------------------+-------------------------------+--------+
| MySQL has now support        | for full-text search          | 0.3834 |
| Full-text indexes            | are called collections        | 0.3834 |
| Only MyISAM tables           | support collections           | 0.7668 |
| Function MATCH ... AGAINST() | is used to do a search        |      0 |
| Full-text search in MySQL    | implements vector space model |      0 |
+------------------------------+-------------------------------+--------+
5 rows in set (0.00 sec)
@end example

The function @code{MATCH} matches a natural language query @code{AGAINST}
a text collection (which is simply the columns that are covered by a
@strong{FULLTEXT} index). For every row in a table it returns relevance -
a similarity measure between the text in that row (in the columns that are
part of the collection) and the query. When it is used in a @code{WHERE}
clause (see example above) the rows returned are automatically sorted by
decreasing relevance. Relevance is a non-negative floating-point number.
Zero relevance means no similarity. Relevance is computed based on the
number of words in the row, the number of unique words in that row, the
total number of words in the collection, and the number of documents (rows)
that contain a particular word.

MySQL uses a very simple parser to split text into words. A ``word'' is
any sequence of letters, numbers, @samp{'}, and @samp{_}. Any ``word''
that is present in the stopword list or is just too short (3 characters
or fewer) is ignored.

Every correct word in the collection and in the query is weighted
according to its significance in the query or collection. This way, a
word that is present in many documents will have a lower weight (and may
even have a zero weight), because it has lower semantic value in this
particular collection. Otherwise, if the word is rare, it will receive a
higher weight. The weights of the words are then combined to compute the
relevance of the row.

Such a technique works best with large collections (in fact, it was
carefully tuned this way). For very small tables, word distribution does
not adequately reflect the semantic value of the words, and this model may
sometimes produce bizarre results.

For example, a search for the word "search" will produce no results in the
above example.
The word "search" is present in more than half of the rows and, as such,
is effectively treated as a stopword (that is, a word with zero semantic
value). This is really the desired behavior - a natural language query
should not return every second row from a 1GB table.

A word that matches half of the rows in a table is less likely to locate
relevant documents. In fact, it will most likely find plenty of irrelevant
documents. We all know this happens far too often when we are trying to
find something on the Internet with a search engine. It is with this
reasoning that such words have been assigned a low semantic value in
@strong{a particular dataset}.

@menu
* Fulltext Fine-tuning::
* Fulltext features to appear in MySQL 4.0::
* Fulltext TODO::
@end menu

@node Fulltext Fine-tuning, Fulltext features to appear in MySQL 4.0, MySQL full-text search, MySQL full-text search
@subsection Fine-tuning MySQL Full-text Search

Unfortunately, full-text search has no user-tunable parameters yet,
although adding some is very high on the TODO. However, if you have a
@strong{MySQL} source distribution (@xref{Installing source}.), you can
somewhat alter the full-text search behavior.

Note that full-text search was carefully tuned for the best searching
effectiveness. Modifying the default behavior will, in most cases, only
make the search results worse. Do not alter the @strong{MySQL} sources
unless you know what you are doing!

@itemize
@item
The minimal length of a word to be indexed is defined in the
@code{myisam/ftdefs.h} file by the line
@example
#define MIN_WORD_LEN 4
@end example
Change it to the value you prefer, recompile @strong{MySQL}, and rebuild
your @code{FULLTEXT} indexes.

@item
The stopword list is defined in @code{myisam/ft_static.c}.
Modify it to your taste, recompile @strong{MySQL}, and rebuild your
@code{FULLTEXT} indexes.

@item
The 50% threshold is caused by the particular weighting scheme chosen.
To disable it, change the following line in @code{myisam/ftdefs.h}:
@example
#define GWS_IN_USE GWS_PROB
@end example
to
@example
#define GWS_IN_USE GWS_FREQ
@end example
and recompile @strong{MySQL}. There is no need to rebuild the indexes in
this case.
@end itemize

@node Fulltext features to appear in MySQL 4.0, Fulltext TODO, Fulltext Fine-tuning, MySQL full-text search
@subsection New Features of Full-text Search to Appear in MySQL 4.0

This section includes a list of the fulltext features that are already
implemented in the 4.0 tree. It explains the
@strong{More functions for full-text search} entry of @ref{TODO MySQL 4.0}.

@itemize @bullet
@item
@code{REPAIR TABLE} with @code{FULLTEXT} indexes, @code{ALTER TABLE} with
@code{FULLTEXT} indexes, and @code{OPTIMIZE TABLE} with @code{FULLTEXT}
indexes are now up to 100 times faster.

@item
@code{MATCH ... AGAINST} now supports the following
@strong{boolean operators}:

@itemize @bullet
@item @code{+}word means that the word @strong{must} be present in every
row returned.
@item @code{-}word means that the word @strong{must not} be present in any
row returned.
@item @code{<} and @code{>} can be used to decrease and increase word
weight in the query.
@item @code{~} can be used to assign a @strong{negative} weight to a noise
word.
@item @code{*} is a truncation operator.
@end itemize

Boolean search uses a simpler way of calculating the relevance, which does
not have a 50% threshold.

@item
Searches are now up to 2 times faster due to an optimized search algorithm.

@item
The utility program @code{ft_dump} has been added for low-level
@code{FULLTEXT} index operations (querying/dumping/statistics).
@end itemize

@node Fulltext TODO, , Fulltext features to appear in MySQL 4.0, MySQL full-text search
@subsection Full-text Search TODO

@itemize @bullet
@item Make all operations with @code{FULLTEXT} index @strong{faster}.
@item Support for braces @code{()} in boolean fulltext search.
@item Support for "always-index words". They could be any strings the user
wants to treat as words; examples are "C++", "AS/400", "TCP/IP", etc.
@item Support for fulltext search in @code{MERGE} tables.
@item Support for multi-byte charsets.
@item Make the stopword list depend on the language of the data.
@item Stemming (dependent on the language of the data, of course).
@item Generic user-supplied UDF (?) preparser.
@item Make the model more flexible (by adding some adjustable parameters
to @code{FULLTEXT} in @code{CREATE/ALTER TABLE}).
@end itemize

@cindex mysqltest, MySQL Test Suite
@cindex testing mysqld, mysqltest
@node MySQL test suite, , MySQL threads, MySQL internals
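
The 50% threshold described in the full-text section above can be illustrated with a minimal Python sketch over the manual's five example rows. This is not the MyISAM implementation. The weight formula log((N - n) / n), clamped at zero, is an assumption chosen only because it drops to zero once a word occurs in half of the rows; the helper names `words`, `word_weights`, and `relevance` are hypothetical; and the sketch has no stopword list and no length normalization, so absolute scores will not match the values printed in the manual's example, only which rows match and how their scores relate to each other.

```python
import math
import re
from typing import Dict, List, Tuple

MIN_WORD_LEN = 4  # same value as the MIN_WORD_LEN define quoted in the manual text

# Rows (a, b) of the example table t from the manual text.
ROWS: List[Tuple[str, str]] = [
    ("MySQL has now support", "for full-text search"),
    ("Full-text indexes", "are called collections"),
    ("Only MyISAM tables", "support collections"),
    ("Function MATCH ... AGAINST()", "is used to do a search"),
    ("Full-text search in MySQL", "implements vector space model"),
]


def words(text: str) -> List[str]:
    """Split text into words made of letters, digits, ' and _.

    Words shorter than MIN_WORD_LEN are dropped. Lower-casing is an
    assumption of this sketch; no stopword list is applied.
    """
    found = re.findall(r"[A-Za-z0-9_']+", text)
    return [w.lower() for w in found if len(w) >= MIN_WORD_LEN]


def word_weights(rows: List[Tuple[str, str]]) -> Dict[str, float]:
    """Weight each word by how few rows contain it (assumed formula)."""
    n_rows = len(rows)
    doc_count: Dict[str, int] = {}
    for a, b in rows:
        for w in set(words(a + " " + b)):
            doc_count[w] = doc_count.get(w, 0) + 1
    weights: Dict[str, float] = {}
    for w, n in doc_count.items():
        if n >= n_rows:
            weights[w] = 0.0  # present in every row: carries no information
        else:
            # Assumed weight: zero or below once n reaches half of the rows,
            # which is clamped to zero here.
            weights[w] = max(0.0, math.log((n_rows - n) / n))
    return weights


def relevance(query: str, rows: List[Tuple[str, str]]) -> List[float]:
    """Return one score per row: the summed weights of matching query words."""
    weights = word_weights(rows)
    query_words = set(words(query))
    scores = []
    for a, b in rows:
        present = set(words(a + " " + b))
        scores.append(sum(weights.get(w, 0.0) for w in query_words if w in present))
    return scores


if __name__ == "__main__":
    for q in ("MySQL", "search", "collections support"):
        print(q, [round(s, 4) for s in relevance(q, ROWS)])
```

Under these assumptions, "search" occurs in three of the five rows and so scores zero everywhere, while "collections support" gives the third row twice the score of the first two, mirroring the pattern in the manual's example output.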
mysql-test/r/handler.result 0 → 100644 +26 −0
a b
14 aaa
a b
15 bbb
a b
16 ccc
a b
15 bbb
a b
22 iii
a b
21 hhh
a b
20 ggg
a b
14 aaa
a b
a b
22 iii
a b
21 hhh
a b
22 iii
a b
a b
15 bbb
mysql-test/t/handler.test 0 → 100644 +65 −0
#
# test of HANDLER ...
#
drop table if exists t1;
create table t1 (a int, b char(10), key a(a), key b(a,b));
insert into t1 values
(17,"ddd"),(18,"eee"),(19,"fff"),(19,"yyy"),
(14,"aaa"),(15,"bbb"),(16,"ccc"),(16,"xxx"),
(20,"ggg"),(21,"hhh"),(22,"iii");
handler t1 open as t2;
handler t2 read a first;
handler t2 read a next;
handler t2 read a next;
handler t2 read a prev;
handler t2 read a last;
handler t2 read a prev;
handler t2 read a prev;
handler t2 read a first;
handler t2 read a prev;
handler t2 read a last;
handler t2 read a prev;
handler t2 read a next;
handler t2 read a next;
handler t2 read a=(15);
handler t2 read a=(16);
!$1070 handler t2 read a=(19,"fff");
handler t2 read b=(19,"fff");
handler t2 read b=(19,"yyy");
handler t2 read b=(19);
!$1109 handler t1 read a last;
handler t2 read a=(11);
handler t2 read a>=(11);
handler t2 read a=(18);
handler t2 read a>=(18);
handler t2 read a>(18);
handler t2 read a<=(18);
handler t2 read a<(18);
handler t2 read a first limit 5;
handler t2 read a next limit 3;
handler t2 read a prev limit 10;
handler t2 read a>=(16) limit 4;
handler t2 read a>=(16) limit 2,2;
handler t2 read a last limit 3;
handler t2 read a=(19);
handler t2 read a=(19) where b="yyy";
handler t2 read first;
handler t2 read next;
handler t2 read next;
handler t2 read last;
handler t2 close;
drop table if exists t1;
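
The first fourteen READ statements in handler.test line up with the fourteen result sets in handler.result. As a rough illustration of the index-order cursor that the manual text describes, here is a minimal Python sketch that replays those statements against a sorted list. It is a toy model, not the server code: the class `IndexCursor`, the function `replay`, the stable ordering of rows with equal keys, and the handling of positions before the first or after the last entry are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple, Union

Row = Tuple[int, str]

# Rows inserted by handler.test, in insertion order.
ROWS: List[Row] = [
    (17, "ddd"), (18, "eee"), (19, "fff"), (19, "yyy"),
    (14, "aaa"), (15, "bbb"), (16, "ccc"), (16, "xxx"),
    (20, "ggg"), (21, "hhh"), (22, "iii"),
]


class IndexCursor:
    """Hypothetical model of HANDLER ... READ over the single-column index a."""

    def __init__(self, rows: List[Row]) -> None:
        # sorted() is stable, so rows with equal keys keep insertion order
        # (an assumption about how duplicate keys are ordered).
        self.entries = sorted(rows, key=lambda r: r[0])
        # -1 means "before the first entry"; len(entries) means "past the last".
        self.pos = -1

    def _current(self) -> Optional[Row]:
        if 0 <= self.pos < len(self.entries):
            return self.entries[self.pos]
        return None  # stands in for an empty result set

    def first(self) -> Optional[Row]:
        self.pos = 0
        return self._current()

    def last(self) -> Optional[Row]:
        self.pos = len(self.entries) - 1
        return self._current()

    def next(self) -> Optional[Row]:
        self.pos = min(self.pos + 1, len(self.entries))
        return self._current()

    def prev(self) -> Optional[Row]:
        self.pos = max(self.pos - 1, -1)
        return self._current()

    def eq(self, key: int) -> Optional[Row]:
        # Position on the first entry whose key equals the given value.
        for i, row in enumerate(self.entries):
            if row[0] == key:
                self.pos = i
                return row
        return None


def replay() -> List[Optional[Row]]:
    """Replay the first fourteen READ statements of handler.test."""
    steps: List[Union[str, Tuple[str, int]]] = [
        "first", "next", "next", "prev", "last", "prev", "prev",
        "first", "prev", "last", "prev", "next", "next", ("eq", 15),
    ]
    cursor = IndexCursor(ROWS)
    results = []
    for step in steps:
        if isinstance(step, tuple):
            results.append(cursor.eq(step[1]))
        else:
            results.append(getattr(cursor, step)())
    return results


if __name__ == "__main__":
    for row in replay():
        print(row)
```

Stepping before the first entry or past the last one yields `None` here, which stands in for the two result sets in handler.result that contain only the column header.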
sql/Makefile.am +1 −1
@@ -56,7 +56,7 @@ noinst_HEADERS = item.h item_func.h item_sum.h item_cmpfunc.h \
sql_select.h structs.h table.h sql_udf.h hash_filo.h\
lex.h lex_symbol.h sql_acl.h sql_crypt.h md5.h \
log_event.h mini_client.h sql_repl.h slave.h
mysqld_SOURCES = sql_lex.cc \
mysqld_SOURCES = sql_lex.cc sql_handler.cc \
item.cc item_sum.cc item_buff.cc item_func.cc \
item_cmpfunc.cc item_strfunc.cc item_timefunc.cc \
thr_malloc.cc item_create.cc \
sql/lex.h +6 −1
@@ -82,6 +82,7 @@ static SYMBOL symbols[] = {
{ "CHANGED", SYM(CHANGED),0,0},
{ "CHECK", SYM(CHECK_SYM),0,0},
{ "CHECKSUM", SYM(CHECKSUM_SYM),0,0},
{ "CLOSE", SYM(CLOSE_SYM),0,0},
{ "COLUMN", SYM(COLUMN_SYM),0,0},
{ "COLUMNS", SYM(COLUMNS),0,0},
{ "COMMENT", SYM(COMMENT_SYM),0,0},
@@ -152,6 +153,7 @@ static SYMBOL symbols[] = {
{ "GRANTS", SYM(GRANTS),0,0},
{ "GROUP", SYM(GROUP),0,0},
{ "HAVING", SYM(HAVING),0,0},
{ "HANDLER", SYM(HANDLER_SYM),0,0},
{ "HEAP", SYM(HEAP_SYM),0,0},
{ "HIGH_PRIORITY", SYM(HIGH_PRIORITY),0,0},
{ "HOUR", SYM(HOUR_SYM),0,0},
@@ -185,6 +187,7 @@ static SYMBOL symbols[] = {
{ "KEY", SYM(KEY_SYM),0,0},
{ "KEYS", SYM(KEYS),0,0},
{ "KILL", SYM(KILL_SYM),0,0},
{ "LAST", SYM(LAST_SYM),0,0},
{ "LAST_INSERT_ID", SYM(LAST_INSERT_ID),0,0},
{ "LEADING", SYM(LEADING),0,0},
{ "LEFT", SYM(LEFT),0,0},
@@ -226,11 +229,12 @@ static SYMBOL symbols[] = {
{ "MYISAM", SYM(MYISAM_SYM),0,0},
{ "NATURAL", SYM(NATURAL),0,0},
{ "NATIONAL", SYM(NATIONAL_SYM),0,0},
{ "NEXT", SYM(NEXT_SYM),0,0},
{ "NCHAR", SYM(NCHAR_SYM),0,0},
{ "NUMERIC", SYM(NUMERIC_SYM),0,0},
{ "NO", SYM(NO_SYM),0,0},
{ "NOT", SYM(NOT),0,0},
{ "NULL", SYM(NULL_SYM),0,0},
{ "NUMERIC", SYM(NUMERIC_SYM),0,0},
{ "ON", SYM(ON),0,0},
{ "OPEN", SYM(OPEN_SYM),0,0},
{ "OPTIMIZE", SYM(OPTIMIZE),0,0},
@@ -245,6 +249,7 @@ static SYMBOL symbols[] = {
{ "PASSWORD", SYM(PASSWORD),0,0},
{ "PURGE", SYM(PURGE),0,0},
{ "PRECISION", SYM(PRECISION),0,0},
{ "PREV", SYM(PREV_SYM),0,0},
{ "PRIMARY", SYM(PRIMARY_SYM),0,0},
{ "PROCEDURE", SYM(PROCEDURE),0,0},
{ "PROCESS" , SYM(PROCESS),0,0},