Showing posts with label bugplease. Show all posts
Showing posts with label bugplease. Show all posts

Wednesday, March 7, 2012

Can't put duplicate words in different expansion sets?

Can't you have duplicate words with different meanings in different
expansion sets? Is this a bug?
Please try this repro below. It uses a thesaurus with 2 expansion sets.
Each set contains the word kind. The sets are like this:
1. kind, sort, class
2. kind, caring, considerate
When I set up a thesaurus with the above two sets (and resart the FTS
service) I only get one row from this query:
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
I would have expected to get five rows (each row has a single word): kind,
sort, class, caring, considerate
-- THESAURUS REPRO --
CREATE TABLE [dbo].[fts_bug](
[id] [int] IDENTITY(1,1) NOT NULL,
[txt] [varchar](50) NULL,
CONSTRAINT [PK_fts_bug] PRIMARY KEY CLUSTERED ([id] ASC)
)
-- catalog and index
create fulltext catalog testCat as default;
create fulltext index on dbo.fts_bug(txt) key index PK_fts_bug;
-- populate
insert into fts_bug(txt) values ('kind')
insert into fts_bug(txt) values ('sort')
insert into fts_bug(txt) values ('class')
insert into fts_bug(txt) values ('caring')
insert into fts_bug(txt) values ('considerate')
-- see data
select * from fts_bug
-- Use this thesaurus (restart the FTS service!):
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>kind</sub>
<sub>sort</sub>
<sub>class</sub>
</expansion>
<expansion>
<sub>kind</sub>
<sub>caring</sub>
<sub>considerate</sub>
</expansion>
</thesaurus>
</XML>
-- I would EXPECT this query to return "kind", "sort", "class", "caring" and
"considerate"
-- but it only returns one row "kind"
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
-- Now use this one: (I only changed the first "kind" to "kinds") (restart
FTS service)
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>kinds</sub>
<sub>sort</sub>
<sub>class</sub>
</expansion>
<expansion>
<sub>kind</sub>
<sub>caring</sub>
<sub>considerate</sub>
</expansion>
</thesaurus>
</XML>
-- Running the same query now returns 3 rows: "kind", "caring",
"considerate", which you would expect
SELECT *
FROM fts_bug
WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
-- clean up
drop fulltext index on dbo.fts_bug;
drop fulltext catalog testCat;
delete from fts_bug
drop table fts_bug
-- END THESAURUS
REPRO --
This is SQL 2005 correct?
I can repro this on a SQL 2005 box. It looks like you must have distinct
words in your thesaurus file.
I suspect this is by design as opposed to an actual bug. use connect to
raise it as a bug and Microsoft will acknowledge it as design or a bug.
"spencer" <jimsjbox_xspm_@.yahoo.com> wrote in message
news:F0C05863-DBC0-4643-A473-A2BD7801C023@.microsoft.com...
> Can't you have duplicate words with different meanings in different
> expansion sets? Is this a bug?
> Please try this repro below. It uses a thesaurus with 2 expansion sets.
> Each set contains the word kind. The sets are like this:
> 1. kind, sort, class
> 2. kind, caring, considerate
> When I set up a thesaurus with the above two sets (and resart the FTS
> service) I only get one row from this query:
> SELECT *
> FROM fts_bug
> WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
> I would have expected to get five rows (each row has a single word): kind,
> sort, class, caring, considerate
>
> -- THESAURUS
> REPRO --
> CREATE TABLE [dbo].[fts_bug](
> [id] [int] IDENTITY(1,1) NOT NULL,
> [txt] [varchar](50) NULL,
> CONSTRAINT [PK_fts_bug] PRIMARY KEY CLUSTERED ([id] ASC)
> )
> -- catalog and index
> create fulltext catalog testCat as default;
> create fulltext index on dbo.fts_bug(txt) key index PK_fts_bug;
> -- populate
> insert into fts_bug(txt) values ('kind')
> insert into fts_bug(txt) values ('sort')
> insert into fts_bug(txt) values ('class')
> insert into fts_bug(txt) values ('caring')
> insert into fts_bug(txt) values ('considerate')
> -- see data
> select * from fts_bug
> -- Use this thesaurus (restart the FTS service!):
> <XML ID="Microsoft Search Thesaurus">
> <thesaurus xmlns="x-schema:tsSchema.xml">
> <diacritics_sensitive>0</diacritics_sensitive>
> <expansion>
> <sub>kind</sub>
> <sub>sort</sub>
> <sub>class</sub>
> </expansion>
> <expansion>
> <sub>kind</sub>
> <sub>caring</sub>
> <sub>considerate</sub>
> </expansion>
> </thesaurus>
> </XML>
> -- I would EXPECT this query to return "kind", "sort", "class", "caring"
> and "considerate"
> -- but it only returns one row "kind"
> SELECT *
> FROM fts_bug
> WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
> -- Now use this one: (I only changed the first "kind" to "kinds") (restart
> FTS service)
> <XML ID="Microsoft Search Thesaurus">
> <thesaurus xmlns="x-schema:tsSchema.xml">
> <diacritics_sensitive>0</diacritics_sensitive>
> <expansion>
> <sub>kinds</sub>
> <sub>sort</sub>
> <sub>class</sub>
> </expansion>
> <expansion>
> <sub>kind</sub>
> <sub>caring</sub>
> <sub>considerate</sub>
> </expansion>
> </thesaurus>
> </XML>
> -- Running the same query now returns 3 rows: "kind", "caring",
> "considerate", which you would expect
> SELECT *
> FROM fts_bug
> WHERE CONTAINS (txt, 'FORMSOF(THESAURUS , "kind")');
> -- clean up
> drop fulltext index on dbo.fts_bug;
> drop fulltext catalog testCat;
> delete from fts_bug
> drop table fts_bug
> -- END THESAURUS
> REPRO --
>
|||"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:uSo9%23lngHHA.4772@.TK2MSFTNGP05.phx.gbl...
> This is SQL 2005 correct?
I'm using SQL 2005 Express

> I can repro this on a SQL 2005 box. It looks like you must have distinct
> words in your thesaurus file.
> I suspect this is by design as opposed to an actual bug. use connect to
> raise it as a bug and Microsoft will acknowledge it as design or a bug.
>
Assuming it's not a bug, how do you get around it? There are plenty of
words that have dual meanings.
I wasn't familiar with "connect" to raise the bug. I found it here:
http://connect.microsoft.com/SQLServer
[this is beside the point: It took a 1x1 inch image on that page a while to
load. So I looked at the size--it was almost a half a megabyte! 3239x2432
pixels! My 22 inch monitor couldn't even display the whole image width at
100%! How could that happen? Some of us don't have T1s at home, you know
;-) ]

> "spencer" <jimsjbox_xspm_@.yahoo.com> wrote in message
> news:F0C05863-DBC0-4643-A473-A2BD7801C023@.microsoft.com...
>