turn off lazyIO on disk files

Emil Jerabek jerabek at math.cas.cz
Fri Jul 4 12:02:59 CEST 2008

On Thu, Jul 03, 2008 at 06:22:33PM +0200, Frank Heckenbach wrote:
> Wolfgang Helbig wrote:
> > the program, Don Knuth's TeX adapted to GPC version 2.1, does heavy
> > I/O on text files. I noticed a very high percentage of system time.
> > I believe this is due to lazy I/O, which seems to be unbuffered.
> > 
> > GPC should apply lazy I/O when the text file comes from the keyboard,
> > but use buffered I/O when it comes from the disk.
> > 
> > Is there a way to turn off lazy I/O on nonterminal text files?
>
> GPC uses buffers on reading, but only reads as much data as is
> available at any time. E.g., with a default buffer size of $4000, it
> tries to read that many bytes when it needs more input, but if the
> underlying read() system call returns less (e.g., because input from
> a terminal, pipe, socket or other device has fewer bytes available),
> it doesn't call read() again (to fill the buffer completely) until
> more input is actually required. So there shouldn't be a problem
> here (i.e., disk input should be fast because the buffer is usually
> filled completely, while terminal input doesn't block
> unwarrantedly).
>
> On writing, GPC doesn't buffer at all; that's not implemented yet.
> It's not that it couldn't be done, but it's more tricky, because GPC
> supports various ways of seeking, pre-reading, getting the file
> position etc., which all would have to take account of the buffers.
> If it's a serious problem you could kludge it by installing a
> user-defined file write routine which can do the buffering. If you
> know that your application only does sequential writes, this would
> work. I've done this once (see RewriteBuffer in cgi.pas in
> http://fjf.gnu.de/misc/cgiprogs.tar.bz2). Let me know if you need
> more details.
>
> Frank

The bottleneck in TeX's I/O is the output to the dvi file, which is
done byte by byte (there's no other choice in standard Pascal). Knuth
himself notes that this is inefficient; the porter is expected to
optimize it:

---- tex.web -----
@ The actual output of |dvi_buf[a..b]| to |dvi_file| is performed by calling
|write_dvi(a,b)|. For best results, this procedure should be optimized to
run as fast as possible on each particular system, since it is part of
\TeX's inner loop. It is safe to assume that |a| and |b+1| will both be
multiples of 4 when |write_dvi(a,b)| is called; therefore it is possible on
many machines to use efficient methods to pack four bytes per word and to
output an array of words with one system call.
@^system dependencies@>
@^inner loop@>

@p procedure write_dvi(@!a,@!b:dvi_index);
var k:dvi_index;
begin for k:=a to b do write(dvi_file,dvi_buf[k]);
end;
------------------

With GPC, it is trivial to replace the loop in write_dvi with a single
call to BlockWrite, so that the whole range dvi_buf[a..b] goes out at
once instead of as one unbuffered write per byte. This makes a huge
difference in the running time.
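
For illustration, here is a minimal sketch of the idea as a stand-alone
program. It is not the code from any actual port: the buffer size, the
file name 'sketch.dvi' and the use of an untyped file opened with
record size 1 (Borland style) are assumptions made for this example.
Whether BlockWrite can be applied directly to TeX's own dvi_file type
depends on how the port declares that file.

```pascal
program BlockWriteSketch;

{ Sketch only: dvi_buf_size, the file name and the untyped file are
  assumptions made for this example, not taken from a real port. }

const
  dvi_buf_size = 800;

type
  dvi_index = 0..dvi_buf_size;

var
  dvi_file: file;   { untyped file, so BlockWrite counts in records }
  dvi_buf: array [dvi_index] of Byte;
  k: dvi_index;

{ Write dvi_buf[a..b] with one call instead of one write per byte. }
procedure write_dvi(a, b: dvi_index);
begin
  BlockWrite(dvi_file, dvi_buf[a], b - a + 1);
end;

begin
  Assign(dvi_file, 'sketch.dvi');
  Rewrite(dvi_file, 1);   { record size 1: counts are in bytes }
  for k := 0 to dvi_buf_size do
    dvi_buf[k] := k mod 256;   { fill the buffer with dummy data }
  write_dvi(0, 399);      { a and b+1 are multiples of 4, as tex.web assumes }
  Close(dvi_file);
end.
```

The only part that matters is the body of write_dvi; the rest just sets
up a file and some dummy data so that the call has something to write.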

Emil Jerabek
