26th Research Colloquium (Speaker: Sihem Amer-Yahia)

Title
"XML Full-Text Search and Scoring"
Speaker
Sihem Amer-Yahia (Database Research Department, AT&T Labs-Research)
Date
Tuesday, May 10, 2005, 15:10 to 16:10
Location
University of Tsukuba, Kasuga Campus, Information Media Union, 3rd floor, Joint Research Conference Room I
Abstract
One of the key benefits of XML is its ability to represent a mix of structured and text data. Querying XML is a well-explored topic with powerful database-style query languages such as XPath and XQuery set to become W3C standards. However, these languages are not powerful enough to express full-text queries on XML documents. The first part of the talk will describe TeXQuery, a full-text extension to XPath and XQuery which provides a rich set of fully composable full-text search primitives, such as keyword and Boolean search, proximity distance, stemming and regular expressions. TeXQuery is the precursor of XQuery Full-Text, the current full-text extension to XPath 2.0 and XQuery 1.0 that is being developed by the W3C. I will describe its syntax and semantics. The second part of the talk contains recent research I have been doing on scoring answers to XML queries. Due to structural heterogeneity in XML documents, queries are often interpreted approximately and their answers are returned ranked on scores. XML scoring ranges from pure content scoring to scoring on both content and structure. However, none of the existing methods fully accounts for structure and combines it with content to score query answers. I will describe novel XML scoring methods that are inspired by tf*idf and that account for both structure and content for scoring answers to XML queries. I will finish the talk by describing some of the challenges that we are facing in trying to incorporate scoring into the syntax and semantics of XQuery Full-Text.
Participation
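No advance registration is required. Everyone is welcome to attend, whether student or faculty, from inside or outside the university (free of charge).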
Files
Download

Notes