CQL / ZING-SRW
MIYAZAWA Akira
National Institute of Informatics
2003/09/11
Environment

Used in the query element of SRW

<SRW:query>CQL query</SRW:query>
Written inside XML
In lineage it belongs to the ISO 8777 (JIS X 0803) family, but it is a new language
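Because the query travels inside XML, characters such as < in an order relation have to be escaped. The following is a minimal sketch of that step, assuming the SRW prefix is declared on an enclosing element; the helper name srw_query_element is hypothetical.

```python
from xml.sax.saxutils import escape


def srw_query_element(cql: str) -> str:
    """Wrap a CQL string in an <SRW:query> element (hypothetical helper).

    escape() replaces &, < and > with entity references so that
    relations such as <= do not break the surrounding XML.
    """
    return f"<SRW:query>{escape(cql)}</SRW:query>"


print(srw_query_element("temperature <= 100"))
# prints <SRW:query>temperature &lt;= 100</SRW:query>
```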

2003/07/31
MIYAZAWA Akira
Grammar (1)
cql-query ::=
cql-query boolean search-clause |
search-clause
boolean ::= "and" | "or" | "not" | prox
search-clause ::= "(" cql-query ")" |
[index-name relation] term
Grammar (2)
index-name ::=
[ index-prefix "."]index-base-name
relation ::= base-relation{"/"qualifier}
base-relation ::= order-relation | "=" | "exact" |
"all" | "any" | "scr"
qualifier ::= "relevant" | "fuzzy" | "stem" |
"phonetic"
order-relation ::= "<" | ">" | "<=" | ">=" | "<>"
Grammar (3) (prox)
prox ::= "prox" [ "/" prox-qualifiers ]
prox-qualifiers ::= [ prox-relation ] "/" [distance] "/" [ unit ]
"/" ordering |
[ prox-relation ] "/" [ distance ] "/" unit |
[ prox-relation ] "/" distance |
prox-relation
unit ::= "word" | "sentence" | "paragraph" | "element"
prox-relation ::= order-relation | "="
distance ::= non-negative-integer
ordering ::= "ordered" | "unordered"
Grammar (4) (basic components)
index-prefix ::= identifier
index-base-name ::= identifier
identifier ::= string
term ::= string | '"' string '"'
string ::= a character string
(a string containing a space or any of / = < > ( ) " must be enclosed in double quotes)
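The grammar on these slides is small enough to sketch as a recursive-descent parser. The code below is an illustration under stated assumptions, not a reference implementation: the tuple-shaped parse tree and all function names are hypothetical, unrecognised characters are silently skipped by the tokenizer, and qualifier and unit names are not validated. A bare term is expanded to srw.serverChoice scr term.

```python
import re

# A bare string may not contain the characters the grammar says must be quoted.
TOKEN_RE = re.compile(r'"([^"]*)"|(<=|>=|<>|[<>=()/])|([^\s/=<>()"]+)')
BOOLEANS = {"and", "or", "not", "prox"}
NAMED_RELATIONS = {"exact", "all", "any", "scr"}
SYMBOL_RELATIONS = {"=", "<", ">", "<=", ">=", "<>"}


def tokenize(text):
    """Return (kind, value) pairs; kind is 'quoted', 'sym' or 'word'."""
    tokens = []
    for quoted, sym, word in TOKEN_RE.findall(text):
        if sym:
            tokens.append(("sym", sym))
        elif word:
            tokens.append(("word", word))
        else:
            tokens.append(("quoted", quoted))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def is_boolean(self):
        kind, value = self.peek()
        return kind == "word" and value.lower() in BOOLEANS

    def is_relation(self):
        kind, value = self.peek()
        return (kind == "sym" and value in SYMBOL_RELATIONS) or (
            kind == "word" and value.lower() in NAMED_RELATIONS)

    def slash_items(self):
        """Collect '/'-separated items; an empty slot becomes None."""
        items = []
        while self.peek() == ("sym", "/"):
            self.take()
            if self.peek() in ((None, None), ("sym", "/")):
                items.append(None)
            else:
                items.append(self.take()[1])
        return items

    def query(self):
        # cql-query ::= cql-query boolean search-clause | search-clause
        node = self.search_clause()
        while self.is_boolean():
            op = self.take()[1].lower()
            if op == "prox":
                op = "/".join([op] + [item or "" for item in self.slash_items()])
            node = (op, node, self.search_clause())  # left-associative
        return node

    def search_clause(self):
        # search-clause ::= "(" cql-query ")" | [index-name relation] term
        if self.peek() == ("sym", "("):
            self.take()
            node = self.query()
            if self.take() != ("sym", ")"):
                raise ValueError("expected ')'")
            return node
        kind, first = self.take()
        if kind not in ("word", "quoted"):
            raise ValueError(f"expected a term, got {first!r}")
        if kind == "word" and self.is_relation():
            relation = "/".join(
                [self.take()[1]] + [q or "" for q in self.slash_items()])
            kind, term = self.take()
            if kind not in ("word", "quoted"):
                raise ValueError("expected a term after the relation")
            return (first, relation, term)
        return ("srw.serverChoice", "scr", first)


def parse(text):
    parser = Parser(text)
    tree = parser.query()
    if parser.peek() != (None, None):
        raise ValueError(f"unexpected token {parser.peek()[1]!r}")
    return tree
```

For example, parse("dc.title = 吾輩") returns ("dc.title", "=", "吾輩"), and parse("犬 or 猫 and 動物") nests the or clause inside the and clause because booleans are combined left to right.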
Term and search-clause
term
猫
"犬 猫"
search-clause
subject = 猫
dc.title = 吾輩
srw.resultSetName = 001
temperature <= 100
Search-clause (2)
title = "犬 猫" (the words 犬 and 猫, in this order)
title all "犬 猫" (both of the words 犬 and 猫)
title any "犬 猫" (either of the words 犬 or 猫)
title exact "犬 猫" (the exact string "犬 猫")
title scr "犬 猫" (server choice relation)
(the bare term "犬 猫" is equivalent to srw.serverChoice scr "犬 猫")
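The difference between these relations can be illustrated with a toy matcher over a whitespace-tokenised field value. This is only a sketch: the function name matches is hypothetical, "=" is assumed here to require the words to be adjacent as well as in order, and scr is left out because its meaning is the server's choice.

```python
def matches(relation: str, field: str, term: str) -> bool:
    """Toy check of one relation against a whitespace-tokenised field value."""
    field_words = field.split()
    term_words = term.split()
    if relation == "exact":
        # The whole field must equal the term string.
        return field == term
    if relation == "all":
        return all(word in field_words for word in term_words)
    if relation == "any":
        return any(word in field_words for word in term_words)
    if relation == "=":
        # Assumption: the term words must appear adjacent and in the given order.
        n = len(term_words)
        return any(field_words[i:i + n] == term_words
                   for i in range(len(field_words) - n + 1))
    raise ValueError(f"unsupported relation: {relation}")


print(matches("all", "猫 と 犬", "犬 猫"))   # True
print(matches("=", "猫 と 犬", "犬 猫"))     # False
print(matches("exact", "犬 猫", "犬 猫"))    # True
```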
Qualifiers (of Relations)
title =/stem "these completed dinosaurs"
(matches "The Complete Dinosaur")
subject any/relevant "fish frog"
(matches "tuna", "coelacanth", "toad", "amphibian", etc.)
author all/fuzzy "kernaghan richie"
(matches Kernighan & Ritchie's book)
subject =/phonetic rose
(matches rows, rhos, roes)
-- algorithm is implementation dependent.
Pattern matching
dinosaur*
??動物
^動物
(the word begins with 動物)
動物^
(the word ends with 動物)
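One way to picture the masking characters is to translate a term into a regular expression. The sketch below assumes the usual CQL reading in which * stands for zero or more characters and ? for exactly one character, and it matches the pattern against a single whole word. The ^ anchor is deliberately left out, and the function names are hypothetical.

```python
import re


def pattern_to_regex(term: str) -> re.Pattern:
    """Translate * and ? in a CQL term into a regular expression."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def word_matches(term: str, word: str) -> bool:
    """True if the whole word matches the masked term."""
    return pattern_to_regex(term).fullmatch(word) is not None


print(word_matches("dinosaur*", "dinosaurs"))  # True
print(word_matches("??動物", "海洋動物"))        # True
print(word_matches("??動物", "動物"))            # False
```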
Boolean
犬 or 猫
author = 夏目漱石 and 猫
title = 猫 not subject = *動物
犬 or 猫 and 動物
is the same as
(犬 or 猫) and 動物
(left to right, no precedence)
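The effect of left-to-right evaluation can be checked with sets of record identifiers. This is a minimal sketch with made-up result sets; the function name evaluate is hypothetical, and not is read here as a binary "and not", as in the title/subject example.

```python
def evaluate(operands, operators):
    """Combine result sets strictly left to right, with no precedence."""
    result = operands[0]
    for op, operand in zip(operators, operands[1:]):
        if op == "and":
            result = result & operand
        elif op == "or":
            result = result | operand
        elif op == "not":
            result = result - operand  # binary "and not"
        else:
            raise ValueError(f"unknown boolean: {op}")
    return result


# Made-up record ids matching each term.
dog, cat, animal = {1, 2}, {2, 3}, {3, 4}

print(evaluate([dog, cat, animal], ["or", "and"]))  # {3}, i.e. (dog or cat) and animal
print(dog | (cat & animal))                         # {1, 2, 3}, what and-before-or would give
```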
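Prox (proximity operator)
犬 prox/<=/3/word/ordered 猫
srw.serverChoice scr 犬 holds, and
srw.serverChoice scr 猫 holds within 3 words after it
- prox is treated as a boolean (it sits alongside and / or)
Besides "within" (<=), the relations =, <, >, >=, <> can be used
Besides word, the units sentence, paragraph, element can be used
Besides ordered, unordered can be used (default)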
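A proximity test with the word unit can be sketched over a list of words. The code below is an illustration only: the function name prox_match is hypothetical, distance is assumed to be the difference between word positions, and the operands are plain words rather than full search clauses.

```python
import operator

RELATIONS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
             "<=": operator.le, ">=": operator.ge, "<>": operator.ne}


def prox_match(words, left, right, relation, distance, ordered=False):
    """True if some occurrence of left and right satisfies the distance test."""
    compare = RELATIONS[relation]
    for i, left_word in enumerate(words):
        if left_word != left:
            continue
        for j, right_word in enumerate(words):
            if right_word != right or i == j:
                continue
            gap = j - i
            if ordered and gap < 0:
                continue  # right operand must follow the left one
            if compare(abs(gap), distance):
                return True
    return False


words = "犬 と 白い 猫".split()
print(prox_match(words, "犬", "猫", "<=", 3, ordered=True))  # True
print(prox_match(words, "猫", "犬", "<=", 3, ordered=True))  # False
print(prox_match(words, "猫", "犬", "<=", 3))                # True
```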
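Issues (multilingual)
- CQL assumes that words are determined by syntax (delimited by spaces or other special characters), but the text being searched does not always work that way (Japanese, Chinese, Thai, and so on), which is likely to cause problems.
- The prox unit probably also needs a character option.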
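The word-boundary issue is easy to reproduce: splitting on whitespace yields one "word" for a whole Japanese sentence. The sketch below also shows what a character-based distance could look like; char_distance is a hypothetical helper, not part of CQL.

```python
english = "I am a cat"
japanese = "吾輩は猫である"

# Whitespace splitting finds four words in English but only one in Japanese.
print(english.split())   # ['I', 'am', 'a', 'cat']
print(japanese.split())  # ['吾輩は猫である']


def char_distance(text, left, right):
    """Distance in characters between the first occurrences, or None."""
    i, j = text.find(left), text.find(right)
    if i < 0 or j < 0:
        return None
    return abs(j - i)


print(char_distance(japanese, "吾輩", "猫"))  # 3
```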