What is the difference between xdmp:estimate and fn:count?

 Ans: Both functions are used for counting in MarkLogic, but they do not count the same thing. xdmp:estimate is fast because it resolves the expression from the indexes alone and returns the number of matching fragments. fn:count evaluates the expression against the documents themselves, which filters out anything the index matched incorrectly, and returns the exact number of matching items. The two results are therefore not guaranteed to be the same, and xdmp:estimate can sometimes return an inaccurate number.

Because of that extra work, fn:count is slower than xdmp:estimate.

Let's look in depth at why the fn:count and xdmp:estimate results can differ.

Step-1: Upload the book catalog XML below to the MarkLogic DB with the URI catalog.xml.

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <book id="bk101">
    <author>Ralls, Kim</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>An in-depth look at creating applications with XML.</description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
  </book>
  <book id="bk103">
    <author>Corets, Eva</author>
    <title>Maeve Ascendant</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-11-17</publish_date>
    <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
  </book>
  <book id="bk104">
    <author>Corets, Eva</author>
    <title>Oberon's Legacy</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2001-03-10</publish_date>
    <description>In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant.</description>
  </book>
</catalog>
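
One way to do the upload is from QConsole with xdmp:document-load. This is only a minimal sketch: the file path /tmp/catalog.xml is an assumption (any path the MarkLogic host can read would do), while the URI catalog.xml is the one the queries in this post use.

```xquery
xquery version "1.0-ml";

(: Load the catalog from the server's filesystem into the database.
   The path below is hypothetical; adjust it to where the file was saved.
   The <uri> option sets the document URI that fn:doc() will look up. :)
xdmp:document-load(
  "/tmp/catalog.xml",
  <options xmlns="xdmp:document-load">
    <uri>catalog.xml</uri>
  </options>
)
```

Running fn:doc('catalog.xml') afterwards should return the catalog if the load succeeded.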

Step-2: Run the query below from the MarkLogic QConsole:

xquery version "1.0-ml";

fn:concat("xdmp:estimate Result=", xdmp:estimate(fn:doc('catalog.xml')/catalog/book[author = 'Ralls, Kim']))
,
fn:concat("fn:count Result=", fn:count(fn:doc('catalog.xml')/catalog/book[author = 'Ralls, Kim']))

Results are:

xdmp:estimate Result=1

fn:count Result=2

Because:

If a fragment contains more than one item matching the specified XPath, xdmp:estimate undercounts by treating them as a single item, whereas fn:count counts each of them individually.
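
The item-versus-fragment distinction can be made visible with plain fn:count. The sketch below assumes the catalog is stored at the URI catalog.xml with no fragment roots configured, so the whole document is one fragment. It counts the matching book elements, then counts the distinct documents that contain them.

```xquery
xquery version "1.0-ml";

(: All book elements whose author matches. :)
let $books := fn:doc('catalog.xml')/catalog/book[author = 'Ralls, Kim']
return (
  (: Number of matching items, which is what fn:count reports. :)
  fn:concat("matching items=", fn:count($books)),
  (: The path step removes duplicate nodes, so this counts each
     containing document only once. :)
  fn:concat("containing documents=", fn:count($books/fn:root(.)))
)
```

With the sample catalog, the first line corresponds to the fn:count result and the second to the single fragment that xdmp:estimate sees.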

The query below returns the 2 matching book elements:

xquery version "1.0-ml";

fn:doc('catalog.xml')/catalog/book[author = 'Ralls, Kim']

As:

<book id="bk101">
  <author>Ralls, Kim</author>
  <title>XML Developer's Guide</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
  <description>An in-depth look at creating applications with XML.</description>
</book>

<book id="bk102">
  <author>Ralls, Kim</author>
  <title>Midnight Rain</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-12-16</publish_date>
  <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
</book>

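But xdmp:estimate returns 1 because both matching book elements live in the same fragment. By default a document is stored as a single fragment, so catalog.xml is one fragment, and xdmp:estimate counts matching fragments from the index rather than matching items.

fn:count, on the other hand, does not rely on the index alone. It counts the actual matching book elements and returns 2.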
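
If each book is stored as its own document, every matching item sits in its own fragment and the two functions should agree. The following is a sketch under assumptions, not the only way to do it: the /books/ URI prefix and the books collection name are hypothetical, and the two statements are separated by a semicolon so the inserts are committed before the counts run.

```xquery
xquery version "1.0-ml";

(: Statement 1: store every book as a separate document (one fragment each).
   The '/books/' URI prefix and the 'books' collection are illustrative names. :)
for $book in fn:doc('catalog.xml')/catalog/book
return xdmp:document-insert(
  fn:concat('/books/', $book/@id, '.xml'),
  $book,
  xdmp:default-permissions(),
  'books'
);

xquery version "1.0-ml";

(: Statement 2: run the same comparison against the per-book documents. :)
fn:concat("xdmp:estimate Result=",
  xdmp:estimate(fn:collection('books')/book[author = 'Ralls, Kim'])),
fn:concat("fn:count Result=",
  fn:count(fn:collection('books')/book[author = 'Ralls, Kim']))
```

With default index settings both lines would be expected to report 2, since there is now one fragment per matching book.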

I hope this is clear. If not, please leave a comment with your question and I'll reply with the solution.
