I just implemented SQL Server 2000 Full-Text Search (FTS) on my MSSQL 2000
 database, which contains Simplified Chinese characters.
 FTS works much faster than the "LIKE" clause, often returning
 hundreds of thousands of records in seconds or a split second.
 However, the search results are not 100% accurate all the time, as all of my
 searches require exact matches to the specified words, but FTS seems to
 return any record containing each character/phrase in my search string (i.e.
 it behaves like A near B near C instead of ABC).
 I wonder if there is a way to enable exact matches instead of fuzzy matches?
 i.e. search for "ABC", return "ABC", but not "A near B near C".
 My query string is as follows:
 SELECT ID, title FROM mytable WHERE CONTAINS(content, ' "chinese_words" ');
 Please advise.|||Hi,
Welcome to the MSDN Managed Newsgroup!
From your description, I understand you would like to know how to exactly
match Chinese characters with FTS. If I have misunderstood your concern,
please feel free to point it out.
As far as I know, you could use CONTAINSTABLE and wrap your phrase in
double quotes. For example:
SELECT t.* FROM <Table Name> AS t INNER JOIN CONTAINSTABLE(<Table Name>, *, '"<chinese characters>"') AS k ON t.<Key Column> = k.[KEY]
For more information, you could refer to the Knowledge Base article below:
INF: Correctly Parsing Quotation Marks in Full Text Search Queries
http://support.microsoft.com/kb/246800
and BOL topic: "CONTAINSTABLE"
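For the quoting step, a minimal Python sketch of wrapping a user-supplied term in double quotes before it is passed to CONTAINS or CONTAINSTABLE might look like this. The function name `fts_phrase` is hypothetical, and replacing embedded double quotes with spaces is an assumption made here for simplicity, not necessarily the handling the KB article recommends.

```python
def fts_phrase(term: str) -> str:
    """Return term wrapped in double quotes so it is searched as one phrase.

    Assumption: embedded double quotes are replaced with spaces rather than
    escaped; check the KB article for the recommended handling.
    """
    # Replace embedded double quotes, then collapse runs of whitespace.
    cleaned = " ".join(term.replace('"', " ").split())
    if not cleaned:
        raise ValueError("search term is empty")
    return '"' + cleaned + '"'
```

The returned string is the full-text search condition only; it still has to be placed inside a T-SQL string literal by the calling code.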
Thank you for your patience and cooperation. If you have any questions or
concerns, don't hesitate to let me know. We are always here to be of
assistance!
Sincerely yours,
Michael Cheng
Microsoft Online Partner Support
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
=====================================================
This posting is provided "AS IS" with no warranties, and confers no rights.|||Hi Michael, thanks for the reply. I had already reviewed the Knowledge Base
article you referenced, but it simply deals with programmatically enclosing
search strings in double quotes. My problem seems to be related to
the linguistic algorithm used by the "word breakers" when I indexed
the table using Simplified Chinese. FTS seems to treat some Chinese
characters as space-delimited words. One way to resolve this is to index the
table using the neutral language, but this causes FTS to exclude search
results that contain the search string enclosed in punctuation (such as
double quotes or Chinese punctuation such as book quotes). There has been
quite some discussion in the newsgroups about this:
C# is ignored on 2000 Server in mssearch
http://groups-beta.google.com/group/microsoft.public.sqlserver.fulltext/browse_thread/thread/86d4fb070fbe7bbe/f066285f170dbff7?q=langwrbk+infosoft+punctuation+double+quote&rnum=1#f066285f170dbff7
Full text search giving incomplete result
http://groups-beta.google.com/group/microsoft.public.sqlserver.fulltext/browse_thread/thread/31b0d021515c956f/b2d525d9bc74fea9?q=langwrbk+infosoft+punctuation&rnum=12#b2d525d9bc74fea9
Execution of a full-text operation failed. A clause of the query contained
only ignored words.
http://www.mcse.ms/printthread.php?threadid=1088004
I wonder if Microsoft might have a way to disable ignoring punctuation for
the Chinese language and just allow exact matches similar to the "LIKE
'%searchstring%'" SQL clause? I have both Windows 2000 and Windows 2003, and
I see the problem on both platforms.
For example:
Using "LIKE '%searchstring%'", my search returns 664 records, all of which are
exact matches.
However, the following return only exact matches if I index my table using
the neutral language, but note the significant decrease in search results, as
FTS excludes all results enclosed in English or Chinese punctuation.
Using "CONTAINS(*, ' "*searchstring*" ')", my search returns 118 records.
Using "CONTAINSTABLE(myTableName, *, ' "*searchstring*" ')", my search returns
118 records.
Using "CONTAINS(*, 'searchstring')", my search returns 29 records.
Using "CONTAINSTABLE(myTableName, *, 'searchstring')", my search returns 29
records.|||Hi,
Thanks for your detailed research into this issue.
I understand you would like to use FTS instead of LIKE for better
performance on both Windows 2000 and Windows Server 2003. If I have
misunderstood your concern, please feel free to point it out.
Unfortunately, I am afraid we do not have a proper way "to disable
ignoring punctuation for chinese language and just allow us exact matches
similar to the "LIKE '%searchstring%'" SQL clause".
I have noticed you are using the neutral language for full-text indexing. Please also
place N before the search string to search for Unicode characters, and
enclose the search string in double quotes. For example:
select product_name from product
where contains(product_name, N' "*searchstring*" ')
where searchstring contains some Chinese characters.
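As a sketch of the N-prefixed form in the example, the following Python helpers build the Unicode literal and the surrounding query text. The helper names are hypothetical, the table and column names are assumed to be trusted identifiers, and only the search term is escaped.

```python
def unicode_literal(value: str) -> str:
    """Return value as a T-SQL Unicode string literal (N'...')."""
    # Inside a T-SQL string literal a single quote is written as two single quotes.
    return "N'" + value.replace("'", "''") + "'"


def contains_query(table: str, column: str, term: str) -> str:
    """Build a SELECT that searches column for term as a quoted phrase.

    Assumption: table and column are trusted identifiers and are not escaped.
    """
    if '"' in term:
        # This sketch does not handle double quotes inside the phrase.
        raise ValueError("term must not contain double quotes")
    condition = '"' + term + '"'
    return (
        f"SELECT {column} FROM {table} "
        f"WHERE CONTAINS({column}, {unicode_literal(condition)})"
    )
```

For instance, `contains_query("product", "product_name", term)` yields a statement of the same shape as the query shown, with the term wrapped in double quotes inside an N'...' literal.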
Sincerely yours,
Michael Cheng
Microsoft Online Partner Support