• Modern query processors rely heavily on sophisticated optimizer components to make optimization decisions such as the choice of access paths, join orders, and materialization strategies. Estimating the sizes of query results and intermediate results is a crucial part of any effective query optimization process.
  • Selectivity estimation in the XML domain is more complicated than in the relational domain, for several reasons:
    • The absence of a strict notion of schema in XML data.
    • The dualism between structural and value-based querying.
    • The high expressiveness of XML query languages.
    • The non-uniform distribution of tags and data.
    • The correlations and dependencies between the occurrences of elements.
  • The goal of this work is to contribute XSelMark, an XML micro-benchmark focused mainly on exercising the selectivity estimation aspects of XML queries.
  • XSelMark (A Micro-Benchmark for Selectivity Estimation Approaches of XML Queries) is a first step toward providing an overview of the state of the art in selectivity estimation of XML queries, along with the strengths and weaknesses of the available approaches. It aims to guide researchers and implementors in benchmarking and improving their work in this domain. XSelMark consists of 25 queries organized into seven groups, where each group is intended to address the challenges posed by different aspects of XML query result size estimation.
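
To make the estimation problem concrete, the sketch below shows how correlation between element occurrences, one of the reasons listed above, can mislead a simple estimator. It is only an illustration: the toy document, the function name `estimate_vs_actual`, and the independence-based estimator are assumptions made for this example and are not taken from XSelMark or from any of the approaches it evaluates.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document: <author> and <price> always occur together,
# so their occurrences under <book> are perfectly correlated.
DOC = """
<lib>
  <book><author/><price/></book>
  <book><author/><price/></book>
  <book/>
  <book/>
</lib>
"""


def estimate_vs_actual(xml_text):
    """Return (independence-based estimate, actual count) for the query
    'book elements that have both an author child and a price child'."""
    root = ET.fromstring(xml_text)
    books = root.findall("book")
    n = len(books)
    if n == 0:
        return 0.0, 0

    has_author = [b.find("author") is not None for b in books]
    has_price = [b.find("price") is not None for b in books]

    # Naive estimate: treat the two predicates as statistically independent
    # and multiply their individual selectivities.
    estimate = n * (sum(has_author) / n) * (sum(has_price) / n)

    # Ground truth: count the books that satisfy both predicates.
    actual = sum(a and p for a, p in zip(has_author, has_price))
    return estimate, actual


estimate, actual = estimate_vs_actual(DOC)
print(f"estimated: {estimate}, actual: {actual}")
```

On this toy input the independence assumption predicts 1.0 matching element while the true result size is 2, an underestimate by a factor of two caused purely by the correlation between the two child elements.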