Discussion:
[bdbxml] slow loading/indexing
wimDeVries
2006-05-29 09:59:23 UTC
Hi,
I am trying to load 5000 XML docs of approx. 50 KB each, all using the same DTD.
System: 3 GHz CPU, 1 GB memory.
A default "edge-element-substring-string" index is added (at creation time).
Cache is 600*1024*1024 bytes (600 MB).
The first 1000 docs load in 21 minutes; after that, loading keeps slowing
down.
After 5 days it was still at 4400 docs.
Without the index, loading time is no problem.
Can I tune the cache to improve this?
A smaller cache was even slower.

regards, wim de vries

--
View this message in context: http://www.nabble.com/slow+loading-indexing-t1698261.html#a4608652
Sent from the Berkeley DB Xml - General forum at Nabble.com.



------------------------------------------
To remove yourself from this list, send an
email to xml-unsubscribe-***@public.gmane.org
George Feinberg
2006-05-30 15:52:41 UTC
Wim,

Default indexes are expensive. Substring indexes are expensive.
Therefore, default substring indexes are *really* expensive.
The only reason I can think of to use them is for random fn:contains()
queries, anywhere in a document.
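
To get a feel for why this adds up, here is a rough sketch in Python. It
assumes a substring index stores one key per 3-character window of each
indexed string value; that window width, and the helper names, are
assumptions for illustration and not a description of the actual on-disk
format.

```python
def substring_keys(text, width=3):
    """Return every `width`-character window of `text`.

    Assumption: a substring index keeps one key per window. Strings
    shorter than `width` produce no keys at all.
    """
    return [text[i:i + width] for i in range(len(text) - width + 1)]


def estimate_key_count(text_values, width=3):
    """Estimate how many substring keys a list of string values generates."""
    return sum(max(len(value) - width + 1, 0) for value in text_values)
```

Under that assumption a string of n characters yields about n keys, so a
default substring index on a 50 KB document can produce tens of thousands
of index entries, where an equality index would produce one per node.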

That said, you should still be able to load your container in a finite
period of time.

Are you trying to load in a single transaction? That'd be a problem.
Bulk loading is best done outside of any transactions, and if using
transactions, you should do a relatively small number of documents
at a time.
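
The batching idea can be sketched as follows, in Python. The `store`
object and its begin/put/commit/abort methods are hypothetical stand-ins
for whatever your load code uses to start a transaction, add a document,
and commit; they are not the BDB XML API. The batch size of 100 is also
just a placeholder to tune.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def load_in_batches(docs, store, batch_size=100):
    """Load (name, content) pairs, committing once per batch.

    `store` is a hypothetical wrapper exposing begin(), put(txn, name,
    content), commit(txn) and abort(txn). Returns the number of
    documents committed.
    """
    loaded = 0
    for chunk in batches(docs, batch_size):
        txn = store.begin()
        try:
            for name, content in chunk:
                store.put(txn, name, content)
        except Exception:
            # Roll back only the current batch; earlier batches stay committed.
            store.abort(txn)
            raise
        store.commit(txn)
        loaded += len(chunk)
    return loaded
```

Committing per batch keeps each transaction small, so a failure costs at
most one batch of work instead of the whole load.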

Can you send along an example document, so I can see if there's
a problem in the system? Also, what does your load code look like?
Perhaps there's something suspicious there, too.

Regards,

George
