Tine: A Module for Measuring Maple Performance

Tine: A Module for Measuring Maple Performance A previous entry in this book, Tools of the Maple Masters, described an idiosyncratic technique for measuring the time and space (memory) usage incurred while evaluating expressions in Maple. It used Maple preprocessor macros so that the restart command could be executed between measurements. As explained there, restarts are used to ensure that each evaluation starts with a minimal usage of memory. While that method was suitable for measuring a procedure at a few data points, the lack of looping with the available macros made it impractical for more intensive analyses. Furthermore, macros are only usable with tty Maple (cmaple), are painful to debug, and usually involve creating two files: one that assigns the procedures to be tested, the other that runs the tests.

I wanted something better. In particular, I wanted to be able to easily plot the performance of a procedure, or several procedures, as the size of the input increased, with just a few lines of Maple code. The key to doing this is controlling, and possibly restarting, an independent Maple process. Preliminary attempts, using the Sockets package and OpenMaple, to communicate with a Maple process, were not entirely successful. Each has limitations.

Communicating with an external Maple process is not difficult, the hard part is restarting it. The Sockets package, which uses network sockets

to communicate, does not provide a way to programmatically restart the kernel. Furthermore, once one end of a socket disconnects (for example, after killing the process) it is not possible, apparently, to reconnect. OpenMaple, which permits accessing Maple through compiled code, has a RestartMaple command. Alas, OpenMaple cannot be directly used by another Maple kernel, at least not via the ExternalCalling procedures. From the help page:
These functions [StartMaple, StopMaple, RestartMaple] can be used in external code with OpenMaple. They do not work as part of define_external.

During an evening walk, while considering methods for bypassing these obstacles, I realized a simple technique for accomplishing the task. I had taken the left fork of a path that led up a hill back to my home, noticed that it was a bit steeper than I cared to climb, so chose an alternate path that returned to the main fork. The technique I wanted was analogous to my path, it uses a process fork.

Caveats

Before describing the method and implementation, there are a few caveats to be considered. First, forks are available only on some operating systems. They do not work on Windows systems. Second, as of Maple 11, the process package is deprecated. From the process[fork] help page:
The process package has been deprecated for Maple 11. The fork command may be removed from future versions of Maple. For similar functionality see the Threads package.
Alas, I do not believe that the Threads package provides equivalent functionality. A third caveat is that, to increase the utility, you will need to compile a small external C program. A makefile is provided, but I have only used it on my Linux system. The package works without the external program, but having it increases the precision of the timing measurements.

Description

A Unix process fork causes the current process to fork into two branches: the parent branch, which we identify with the original process, and the child branch, which we consider a new process. At the time of the fork, the two processes are identical, but independent. Well, not completely identical. A variable is set to one value (0) in the child process and to another value (the process ID, or pid, of the child) in the parent process. The code in each process inspects this variable and branches accordingly.

To use a fork to measure the performance of a Maple procedure, we first assign the procedure and variables in the main branch, open a pipe, then fork the process. The child process records the time and memory usage, applies the test procedure to the arguments, measures the time and memory usage, and computes the differences. It converts the measured data to a string, transmits it on the pipe, then kills itself.

Meanwhile, the parent process reads the other end of the pipe. When the transmitted data arrives, it closes the pipe, waits for the child process to terminate, parses the string, and returns the measured data as the output. Because any memory allocated by Maple during the evaluation of the test procedure only affects the child process, the memory usage in the parent process is minimal—this ensures that the child process started by the next measurement has essentially the same start conditions. Killing the child process also permits the OS to reclaim its memory, something that is not done when Maple is merely restarted.

Implementation

The Maple module Tine exports several procedures:
ApplyFork
Apply a procedure to arguments in a child process and return the result.
ApplyMeas
Measure the time, memory usage, and number of garbage collection cycles required to apply a procedure to arguments.
EvalFork
Evaluate an expression in a child process and return the result.
EvalMeas
Measure the time, memory usage, and number of garbage collection cycles required to evaluate an expression.
PlotApplyMeas
Plot the measurements of a procedure, or procedures, as a parameter varies.
SeqLog
Return a logarithmic sequence. Useful for generating domain points that are evenly spaced on a logarithmic scale.
Time
Return the time, with microsecond resolution, since the epoch. This is available only if the accompanying C code has been properly compiled.

Example

Compare catenating strings with the cat function, the || operator, and StringTools[Join]. The use_catbin function uses cat as a purely binary procedures, to give a useful comparison with ||.
# Assign procedures to measure.  Each procedure catenates all its arguments.

use_bar    := proc() foldl(`||`,_passed) end proc:
use_catbin := proc() foldl(cat,_passed) end proc:
use_join   := proc() StringTools:-Join([_passed],"") end proc:
use_cat    := proc() cat(_passed) end proc:

# Load the Tine package
with(Tine):

# Assign N a list of integers, [ni], logarithmically spaced.
# These are used to generates sequences of that many strings.
# The Loops variable is assigned a list of integers approximately
# inversely proportional to the corresponding integers in N; they are
# used to set the number of times the test procedure is evaluated.
# That permits reasonably accurate measurements for small ni.

(N,Loops) := SeqLog(10..10^3,30,andinverse):

# Generate a list of lists of strings to catenate.

Pargs := [seq([cat("a",1..n)], n=N)]:

# Measure and plot the bytes-used and total-time required by each of the
# procedures.  The 'seed = true' argument causes an integer to be appended
# to each sequence of arguments passed to the test procedures so that the
# results from consecutive calls, when multiply evaluated, differ.
 
PlotApplyMeas([use_bar, use_catbin, use_join, use_cat]
              , N, Loops, Pargs, 'seed=true'
              , field = [bytesused, totaltime]
             );

Installation

The package is distributed as a zip file. It contains a README file that briefly explains how to install it.
}