Wednesday, March 7, 2012

Can't put duplicate words in different expansion sets?

Can't you have duplicate words with different meanings in different
expansion sets? Is this a bug?
Please try this repro below. It uses a thesaurus with 2 expansion sets.
Each set contains the word kind. The sets are like this:
1. kind, sort, class
2. kind, caring, considerate
When I set up a thesaurus with the above two sets (and restart the FTS
service) I only get one row from this query:
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
I would have expected to get five rows (each row has a single word): kind,
sort, class, caring, considerate
-- THESAURUS REPRO --
CREATE TABLE [dbo].[fts_bug](
[id] [int] IDENTITY(1,1) NOT NULL,
[txt] [varchar](50) NULL,
CONSTRAINT [PK_fts_bug] PRIMARY KEY CLUSTERED ([id] ASC)
)
-- catalog and index
create fulltext catalog testCat as default;
create fulltext index on dbo.fts_bug(txt) key index PK_fts_bug;
-- populate
insert into fts_bug(txt) values ('kind')
insert into fts_bug(txt) values ('sort')
insert into fts_bug(txt) values ('class')
insert into fts_bug(txt) values ('caring')
insert into fts_bug(txt) values ('considerate')
-- see data
select * from fts_bug
-- Use this thesaurus (restart the FTS service!):
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>kind</sub>
<sub>sort</sub>
<sub>class</sub>
</expansion>
<expansion>
<sub>kind</sub>
<sub>caring</sub>
<sub>considerate</sub>
</expansion>
</thesaurus>
</XML>
-- I would EXPECT this query to return "kind", "sort", "class", "caring" and
-- "considerate"
-- but it only returns one row "kind"
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
-- Now use this one: (I only changed the first "kind" to "kinds") (restart
-- FTS service)
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>kinds</sub>
<sub>sort</sub>
<sub>class</sub>
</expansion>
<expansion>
<sub>kind</sub>
<sub>caring</sub>
<sub>considerate</sub>
</expansion>
</thesaurus>
</XML>
-- Running the same query now returns 3 rows: "kind", "caring",
-- "considerate", which you would expect
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
-- clean up
drop fulltext index on dbo.fts_bug;
drop fulltext catalog testCat;
delete from fts_bug
drop table fts_bug
-- END THESAURUS REPRO --
This is SQL 2005, correct?
I can repro this on a SQL 2005 box. It looks like each word can appear in
only one expansion set in your thesaurus file.
I suspect this is by design rather than an actual bug. Use Connect to
raise it as a bug, and Microsoft will confirm whether it is by design or a bug.
"spencer" <jimsjbox_xspm_@.yahoo.com> wrote in message
news:F0C05863-DBC0-4643-A473-A2BD7801C023@.microsoft.com...
> Can't you have duplicate words with different meanings in different
> expansion sets? Is this a bug?
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:uSo9%23lngHHA.4772@.TK2MSFTNGP05.phx.gbl...
> This is SQL 2005 correct?
I'm using SQL 2005 Express.

> I can repro this on a SQL 2005 box. It looks like you must have distinct
> words in your thesaurus file.
> I suspect this is by design as opposed to an actual bug. use connect to
> raise it as a bug and Microsoft will acknowledge it as design or a bug.
>
Assuming it's not a bug, how do you get around it? There are plenty of
words that have dual meanings.
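One possible query-side workaround, offered only as a sketch and not as a confirmed fix: skip FORMSOF(THESAURUS, ...) for the ambiguous word and list both senses explicitly with OR inside CONTAINS. It assumes the fts_bug table and full-text index from the repro are still in place; the list of terms is simply the two expansion sets merged by hand.

```sql
-- Sketch of a workaround (an assumption, not verified on SQL 2005 Express):
-- spell out both senses of "kind" in the query instead of relying on the
-- thesaurus file, so the duplicate-word limitation never comes into play.
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, '"kind" OR "sort" OR "class" OR "caring" OR "considerate"');
```

The trade-off is that the synonym list now lives in the query text rather than in the thesaurus file, so every query that needs both meanings has to carry it.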
I wasn't familiar with "connect" to raise the bug. I found it here:
http://connect.microsoft.com/SQLServer
[this is beside the point: It took a 1x1 inch image on that page a while to
load. So I looked at the size--it was almost a half a megabyte! 3239x2432
pixels! My 22 inch monitor couldn't even display the whole image width at
100%! How could that happen? Some of us don't have T1s at home, you know
;-) ]
