This is the mail archive of the guile@cygnus.com mailing list for the guile project.



Guile regexp timings.



I ran some regexp timing tests.  The setup was as follows.

1. Did an anon cvs checkout of the latest guile.
2. Built it.
3. Grabbed regex.c & regex.h from gawk 3.0.
4. Built regex.o in the guile directory (needed config.h & custom.h to
   do this).
5. Built guile.old normally & built guile.hacked by linking in gawk's
   regex.o.

Did the following tests:

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ ./guile.hacked
   guile> (load "/home/hjstein/scwm/wrappers.scm")
   guile> (define r (make-regexp "^r"))
   guile> (define (t n) (do ((i 0 (1+ i))) ((>= i n)) (regexp-exec r "asdr")))
   guile> (with-profiling (t) (t 10000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     0.730
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     7.160

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ ./guile.old 
   guile> (load "/home/hjstein/scwm/wrappers.scm")
   guile> (define r (make-regexp "^r"))
   guile> (define (t n) (do ((i 0 (1+ i))) ((>= i n)) (regexp-exec r "asdr")))
   guile> (with-profiling (t) (t 10000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     1.260
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0    12.160

So, right off the bat, gawk's regex.c saves about 40% of our runtime
(7.16s vs. 12.16s for 100000 matches).

However, compare this to:

   hjstein@bacall:~$ echo "" | time awk '{for (i=0; i<100000; i++) { "asdr" ~ /^r/ }}'
   0.99user 0.00system 0:00.99elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k

Even with gawk's regex.c linked in, guile's still about 7x slower than
gawk itself (7.16s vs. 0.99s).
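
The percentages and ratios quoted here are plain arithmetic on the
timings in the transcripts.  A minimal sketch for checking them with
awk from the shell; the function names pct_saved and ratio are made up
for this illustration and aren't part of any tool used in the tests:

```shell
# pct_saved OLD NEW: percent of OLD's runtime saved by NEW (hypothetical helper)
pct_saved() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a - b) / a * 100 }'
}

# ratio SLOW FAST: how many times longer SLOW takes than FAST (hypothetical helper)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

pct_saved 12.16 7.16   # guile.old vs. guile.hacked, 100000 matches: prints 41
pct_saved 1.26 0.73    # same comparison, 10000 matches: prints 42
ratio 7.16 0.99        # guile.hacked vs. gawk: prints 7.2
```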

Also compare this to STk:

   hjstein@bacall:~$ snow
   Welcome to the STk interpreter version 3.1.1 [Linux-2.X-ix86]
   Copyright  1993-1996 Erick Gallesio - I3S - CNRS / ESSI <eg@unice.fr>
   STk> (define r (string->regexp "^r"))
   #[undefined]
   STk> (define (t n) (do ((i 0 (1+ i))) ((>= i n)) (r "asdr")))
   #[undefined]
   STk> (time (t 100000))
   ;;    Time: 3960.00ms
   ;; GC time: 630.00ms
   ;;   Cells: 1000289
   #[undefined]

STk's still almost 2x faster than guile, even with gawk's regex.c
linked in (3.96s vs. 7.16s).

Turning debugging off in guile:

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ ./guile.old 
   guile> (debug-disable 'debug)
   (stack 20000 depth 20 maxdepth 1000 frames 3 indent 10 procnames cheap)
   guile> (load "/home/hjstein/scwm/wrappers.scm")
   guile> (define r (make-regexp "^r"))
   guile> (define (t n) (do ((i 0 (1+ i))) ((>= i n)) (regexp-exec r "asdr")))
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0    10.850
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0    10.890
   guile> 

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ ./guile.hacked
   guile> (debug-disable 'debug)
   (stack 20000 depth 20 maxdepth 1000 frames 3 indent 10 procnames cheap)
   guile> (load "/home/hjstein/scwm/wrappers.scm")
   guile> (define r (make-regexp "^r"))
   guile> (define (t n) (do ((i 0 (1+ i))) ((>= i n)) (regexp-exec r "asdr")))
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     5.990
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     5.900
   guile> 

Guile's still slow, but guile + gawk's regexps + no debugging is
"only" about 50% slower than STk (5.9s vs. 3.96s; put the other way,
STk takes about a third less time).

Timing nothing:

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ ./guile.hacked
   guile> (debug-disable 'debug)
   (stack 20000 depth 20 maxdepth 1000 frames 3 indent 10 procnames cheap)
   guile> (load "/home/hjstein/scwm/wrappers.scm")
   guile> (define (t n) (do ((i 0 (1+ i))) ((>= i n))))
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     1.430
   guile> (with-profiling (t) (t 100000))
   Function            Called     Time
   ------------------- ---------- ---------
   t                          1.0     1.400


Vs STk:

   hjstein@bacall:~$ snow
   Welcome to the STk interpreter version 3.1.1 [Linux-2.X-ix86]
   Copyright  1993-1996 Erick Gallesio - I3S - CNRS / ESSI <eg@unice.fr>
   STk> (define (t n) (do ((i 0 (1+ i))) ((>= i n))))
   #[undefined]
   STk> (time (t 100000))
   ;;    Time: 2610.00ms
   ;; GC time: 460.00ms
   ;;   Cells: 900288
   #[undefined]
   STk> (time (t 100000))
   ;;    Time: 2630.00ms
   ;; GC time: 470.00ms
   ;;   Cells: 900140
   #[undefined]

Guile manages to execute the empty loop in about 46% less time than
STk (1.4s vs. 2.6s), but they *both* take longer than gawk takes to
execute the loop & do 100000 regexp comparisons (0.99s).
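
Since the empty loop was timed separately, one rough way to isolate
the cost of the regexp call itself is to subtract the loop time from
the total and divide by the iteration count.  The sketch below does
that arithmetic on the figures reported above; per_call_us is a
hypothetical helper name, and it assumes the loop overhead is the same
with and without the regexp call in the body:

```shell
# per_call_us TOTAL_SEC BASELINE_SEC ITERATIONS
# Microseconds per call after subtracting the empty-loop baseline.
# (Hypothetical helper for illustration only.)
per_call_us() {
    awk -v total="$1" -v base="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", (total - base) / n * 1000000 }'
}

per_call_us 5.90 1.40 100000   # guile.hacked, debugging off: prints 45.0
per_call_us 3.96 2.61 100000   # STk: prints 13.5
per_call_us 0.99 0 100000      # gawk, loop included (no baseline measured): prints 9.9
```

On these numbers a regexp call in guile costs a bit over three times
what it costs in STk even after discounting the loop, and gawk's whole
iteration (loop plus match) is cheaper than either.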

For reference, here's perl:

   hjstein@bacall:~/remote-cvs-pkgs/guile-core/libguile$ echo 'for ($i=0; $i<100000; $i++) {asdr =~ /^r/};' | time perl
   0.39user 0.01system 0:00.39elapsed 101%CPU (0avgtext+0avgdata 0maxresident)k
   0inputs+0outputs (162major+16minor)pagefaults 0swaps


Conclusions:

1. Guile's regexp package is slow.
2. Guile's wrapping of the regexp package is slow.
3. Guile *can* manage to interpret things quickly (at least quicker
   than STk), but it's still substantially slower than gawk.
4. Guile's got a *long* way to go before its speed can be competitive
   with other popular text hacking languages.

-- 
Harvey J. Stein
BFM Financial Research
hjstein@bfr.co.il