This is the mail archive of the
xsl-list@mulberrytech.com
mailing list .
Re: Paging using XSLT
- To: xsl-list at mulberrytech dot com
- Subject: Re: Paging using XSLT
- From: Larry_Mason at i2 dot com
- Date: Wed, 1 Nov 2000 18:21:49 -0400 (EST)
- Reply-To: xsl-list at mulberrytech dot com
Now throw in user sorting capability and your row number concept is shot.
Pages are in the eye of the consumer not the producer.
Cheers!
Larry
"Terris"
<terris@terris.com> To: <xsl-list@mulberrytech.com>
Sent by: cc:
owner-xsl-list@mulber Subject: Re: Paging using XSLT
rytech.com
11/01/00 10:19 AM
Please respond to
xsl-list
Ah-ha. One of my favorite topics. Strap
yourself in...
Easier said than done. Databases can not directly
perform
select * from A
return rows 50-60
This is a simple example. Most queries contain joins.
Queries can be expensive. The query cost can
prohibit executing the same query
over and over again, scrolling through
the results for the desired subset. And anyway
the result set might change between
executions. You would be surprised
how many sites actually do this though.
You either consume disk or memory, take
your pick. If you go the disk route (disk
is much cheaper than memory),
you either use the database or
the file system. For example, you could
store XML data sets in files.
One database-based solution is to
create a temporary table
that holds the query results. This temporary table
has an additional "row number" column which
is indexed.
There are several problems with this solution...
1. It consumes database resources
2. If you use a "real" temporary table you
have to associate a database connection with
the session .. don't even try it. So you
have no choice but to create real tables
on the fly.
3. Tables have to be cleaned up after
a session times out
4. Need to make sure the same session can only
have one active temporary table at a time,
otherwise one session can consume an
excessive amount of disk
5. Database engines were not designed
for this -- engines like the schema to
stay constant, and creating and
dropping tables every few minutes is
disastrous
Instead of using temporary tables,
create one "query results" table that is
also indexed on session ID. You still
have to garbage collect the rows
at some point, but managing rows is easier
than managing entire tables.
A database solution is appealing because
you can leverage indexes. A database
can be accessed from any machine on
a server farm, so sessions don't have
to be tied to particular servers.
The devil is always in the details.
Where does the row number come from?
Unfortunately, you usually can't use
the select INTO clause to populate
the query results table. This is because
you can't tell SQL to generate an
incrementing row number based on
the ORDER BY clause. Some
dialects of SQL might actually support
this. Without such support you have to
write code that iterates through
the result set, adds an incrementing
row number column, and
inserts the row into the query results table.
This can be very slow for large result
sets.
It seems natural to use XML to represent
temporary result sets, but generally
this approach doesn't scale.
XML files are fundamentally
slow because the file system can't
index XML documents. Stores like
eXcelon can index XML but all of
the undesirable features of temporary
tables described earlier will creep up.
If your data sets are small,
you could get away with managing
temporary XML files in the file system.
You could try to forego files and
cache documents in memory.
Row-based access is fast when the document
has an ID attribute whose
value indicates the row number (r1, r2,
r3, etc.). XML parsers automatically index ID
attributes for fast access. However,
this solution doesn't work on server farms.
In conclusion...
The short answer is this is not easy.
The longer answer is that the
best solution seems to be to store
query results in a permanent table that
is specifically indexed for page-based access.
The best implementation depends
on the size of the result sets. Database-based
solutions scale up to very large result
sets and scale out to web farms.
You can even pre-populate the
query results table with your most popular
queries. This out-perfoms all possible
implementations and that's an
understatement. Of course the assumption is
that the underlying data doesn't change frequently,
which may or may not be correct.
I wish I could say that raw XML technologies
somehow solve this problem. There are limitations
to all tools. XML could be a facilitator,
however, You could build a component that
manages result sets in a database and returns "pages" as
XML. NOW you're talking.
Oops, I just leaked my business plan :^)
http://xmleverywhere.com
----- Original Message -----
From: "Steve Muench" <Steve.Muench@oracle.com>
To: <xsl-list@mulberrytech.com>
Sent: Wednesday, November 01, 2000 5:54 AM
Subject: Re: Paging using XSLT
> Much more efficient is to, as you suggest, query-up/pass-in
> only the 15 rows in the current "window" you want to format
> and not to use the in-memory filtering capability of XSLT
> as a replacement for the fast indexes that a database might
> be able to use to quickly return just the rows you want
> to "see" on the current page.
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list