Download JMail 1.1 / Mail me, jyelon@uiuc.edu / Back to my Software Page


JMail Mailing-List Archiver and Web-Based Viewer

JMail's primary function is to archive a mailing list and export that mailing list to the world-wide web. However, it can be used for many mail-archiving needs, or even for some mail handling needs.

JMail is a single program with many functions. These separate functions can be assembled together into a complete mail-handling system.

To demonstrate the abilities of JMail, here is a JMail archive in my account. It contains messages from the ars-magica@csua.berkeley.edu mailing list. It is active: messages are added continuously as they arrive in my mailbox, without my intervention.

JMail's first argument is always a word selecting a function to execute. The available functions are listed below.

Usage: jmail arc-build archive < mailfile

This function converts messages in UNIX mail file format into a JMail archive. JMail archives use only moderate disk space, yet they support highly efficient retrieval of messages by message-number, message-date, or by word search. This makes them an ideal format for long-term storage of messages.

If archive already exists, the messages are appended to the archive. If it does not already exist, it is created.

Warning: if you invoke jmail (or any mail-archiving software) from a UNIX forward file, you need to be aware of locking issues. See jmail claim-lockfile below.

Usage: jmail arc-dump archive > mailfile

This function converts a JMail archive back into a UNIX mailbox file.

Usage: jmail arc-lookup-text archive < commandlist > messages

This function retrieves selected messages from a JMail archive. The messages are retrieved in UNIX mailbox format. The retrieved messages are sent to standard output.

Messages are selected based on the retrieval commands read from standard input. Each retrieval command is a single letter followed by a set of parameters. The currently supported retrieval commands are:

l number
- retrieve all messages from the last number seconds.
d datelo datehi
- retrieve all messages in the specified date-range.
w word1 word2 ...
- retrieve messages containing all the specified words.
n msglo msghi
- retrieve messages whose message-numbers are in the specified range.

Several notes:

1. Dates are in seconds, as returned by the UNIX time function.
2. A single integer on a line by itself is also a retrieval command; it retrieves the message with that number.
3. To perform a word search, the JMail archive must be indexified; see jmail arc-indexify below.
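As a rough illustration of the command format described above, here is a small shell sketch that builds a command list. The helper function names and the ARCHIVE path are assumptions introduced for this example; they are not part of JMail. The lookup itself is only attempted if jmail and the archive are actually present.

```shell
#!/bin/sh
# Hypothetical helpers: each prints one retrieval command per line.

# "l number": messages from the last N seconds.
cmd_last_seconds() {
    printf 'l %s\n' "$1"
}

# "d datelo datehi": messages in a date range (seconds, as from time()).
cmd_date_range() {
    printf 'd %s %s\n' "$1" "$2"
}

# "w word1 word2 ...": messages containing all the given words.
cmd_words() {
    printf 'w %s\n' "$*"
}

# "n msglo msghi": messages whose numbers fall in the given range.
cmd_number_range() {
    printf 'n %s %s\n' "$1" "$2"
}

# Example command list: the last 24 hours (86400 seconds), then message 17.
build_commands() {
    cmd_last_seconds 86400
    printf '%s\n' 17
}

# ARCHIVE is an assumed location; run the lookup only if everything exists.
ARCHIVE=${ARCHIVE:-$HOME/mail/archive}
if command -v jmail >/dev/null 2>&1 && [ -e "$ARCHIVE" ]; then
    build_commands | jmail arc-lookup-text "$ARCHIVE"
fi
```

The same command list can be fed unchanged to the other arc-lookup functions, since they all read the same retrieval commands.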

Usage: jmail arc-lookup-html archive < commandlist > messages

This function retrieves selected messages from a JMail archive. The messages are retrieved in HTML format. Typically, this function would be used inside a CGI script, making it possible to export the contents of a JMail archive to the world-wide web.

Messages are selected based on the retrieval commands read from standard input. Retrieval commands are described above, under jmail arc-lookup-text.

Usage: jmail arc-lookup-text-index archive < commandlist > messages

This function retrieves descriptions of selected messages from a JMail archive. The descriptions are printed on standard output. Each description is a comma-separated list with these fields:

There may be comments interspersed with the descriptions. Each comment is a line beginning with '#'.

Messages are selected based on the retrieval commands read from standard input. Retrieval commands are described above, under jmail arc-lookup-text.

Usage: jmail arc-lookup-html-index archive < commandlist > messages

This function retrieves descriptions of selected messages from a JMail archive. The descriptions are in a tabular HTML format, suitable for inclusion on a web-page. Typically, this function would be used inside a CGI script as a means to export a JMail archive to the Web. The HTML contains the word SCRIPT on each line, and this word should be replaced with the URL of the CGI script.
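The SCRIPT substitution could be done with sed, for instance. The sketch below is one possible approach, not necessarily how the author's own CGI script does it; the function name fill_script_url and the example URL and paths are assumptions.

```shell
#!/bin/sh
# Replace every occurrence of the placeholder word SCRIPT on standard
# input with the URL given as $1, writing the result to standard output.
fill_script_url() {
    # Escape characters that are special in a sed replacement
    # ('&', the '|' delimiter used below, and backslash).
    url=$(printf '%s' "$1" | sed 's/[&|\\]/\\&/g')
    # '|' is the delimiter so slashes in the URL need no escaping.
    sed "s|SCRIPT|$url|g"
}

# Inside a CGI script one might then write something like this
# (the archive path and URL are placeholders):
#
#   echo "Content-type: text/html"
#   echo
#   echo "l 604800" |
#       jmail arc-lookup-html-index /path/to/archive |
#       fill_script_url "http://example.org/cgi-bin/archive.cgi"
```

Note that this replaces the word SCRIPT wherever it appears in the output, which is what the description above calls for.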

Messages are selected based on the retrieval commands read from standard input. Retrieval commands are described above, under jmail arc-lookup-text.

Usage: jmail arc-indexify archive threshold minoccur

To perform word searches on a JMail archive, the archive must be indexified. Indexification is the process of building a large table indicating which words occur in which messages. To reduce the size of the table, the indexification software discards two kinds of words:

1. Words that are extremely common, like "the".
2. Words that are extremely rare, and therefore probably misspelled.

The parameter threshold is the percentage of messages a word must occur in before it is considered too common. A reasonable value for threshold is 40, meaning that words occurring in more than 40% of the messages are ignored. The other parameter, minoccur, is the number of messages a word must occur in to be included in the index. A reasonable value for minoccur is 2, meaning that a word must occur in at least 2 messages to be included.

After indexifying an archive, all messages in the archive can be retrieved via word search. If you subsequently add messages to the archive using arc-build, those new messages are not available for word search: a word search only retrieves messages that were already in the archive at the time of indexification. Since indexification is slow (maybe 5 minutes for 25000 messages), it isn't feasible to indexify after each message addition. We therefore suggest indexifying an archive about once per day.

While indexifying, jmail shows you how many messages it has finished by printing msg #100, msg #200, msg #300, and so forth. When using jmail arc-indexify in a cron job, it is probably desirable to redirect this status report to /dev/null.
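A daily re-index could be wrapped in a small script and run from cron. The following is a sketch under stated assumptions: the function name reindex and the JMAIL variable are invented for this example, and both output streams are silenced because it is not stated here which one carries the status report.

```shell
#!/bin/sh
# JMAIL is an assumed override hook; by default the jmail on PATH is used.
JMAIL=${JMAIL:-jmail}

# Rebuild the word index for the archive given as $1.
# $2 and $3 are threshold and minoccur, defaulting to the
# values suggested above (40 and 2).
reindex() {
    # Discard the "msg #100, msg #200, ..." progress report.
    # Caution: 2>&1 also hides any error messages.
    "$JMAIL" arc-indexify "$1" "${2:-40}" "${3:-2}" >/dev/null 2>&1
}

# Run only when an archive path is given on the command line.
if [ "$#" -ge 1 ]; then
    reindex "$@"
fi
```

Saved as, say, reindex.sh (a hypothetical name), this could be called from a crontab entry with the archive path as its argument.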

Usage: jmail claim-lockfile filename

It is possible to set up a UNIX forward file so that messages are automatically received, processed, and added to a JMail archive. However, there is a small problem: the UNIX mail handling software does not wait for one message to be delivered before it tries to deliver the next one. It is therefore possible for two messages to be added to a single archive at the same time, corrupting the archive. To avoid this, you must use locking.

JMail claim-lockfile is a simple means to achieve locking in a shell-script. JMail claim-lockfile creates the specified lockfile. If the lockfile is already there, jmail waits for it to be removed before creating it. Therefore, to achieve mutual exclusion in a shell-script, simply use this sequence:

	jmail claim-lockfile LOCK
	perform file manipulations that must be exclusive
	rm -f LOCK
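Filled in for the archiving case, the sequence above might look like the sketch below. The function name archive_message and the JMAIL variable are assumptions, and the script assumes that jmail claim-lockfile exits with a nonzero status if it fails, which is not documented here.

```shell
#!/bin/sh
# JMAIL is an assumed override hook; by default the jmail on PATH is used.
JMAIL=${JMAIL:-jmail}

# Append the message(s) on standard input to an archive, under a lock.
#   $1 = archive path, $2 = lockfile path
archive_message() {
    archive=$1
    lock=$2
    # Wait for the lock and claim it (assumed: nonzero exit on failure).
    "$JMAIL" claim-lockfile "$lock" || return 1
    # The exclusive section: add the incoming mail to the archive.
    "$JMAIL" arc-build "$archive"
    status=$?
    # Always release the lock, even if arc-build failed.
    rm -f "$lock"
    return "$status"
}

# Run only when both an archive and a lockfile path are given.
if [ "$#" -eq 2 ]; then
    archive_message "$1" "$2"
fi
```

If the script is killed between claiming the lock and removing it, the lockfile is left behind and later deliveries will wait on it, so a stale lock may need to be removed by hand.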

Usage: jmail unmangle < pseudo-mailfile > true-mailfile

JMail unmangle is used when your mail has previously been archived in some vaguely mailbox-like format. JMail uses some simple heuristics to identify message boundaries. It then converts the messages it has identified into genuine UNIX mailbox format, and sends them to standard output. The heuristics used to identify message boundaries are:

In particular, these heuristics were designed to split apart the mail digests created by "majordomo". They work equally well on simple UNIX mail files which have been "stripped down" for the sake of saving disk space, and also on the files created by "mh".

Usage: jmail keep-headers header1 header2... < mailfile > stripped-mailfile

This function is used to remove unwanted RFC headers from a mail-file. You specify a list of the headers you want to keep, and jmail removes the rest. Both the input and output are in UNIX mailbox format.

Usage: jmail divide defaultfile string1 file1 string2 file2 ... < mailmsgs

This function divides a UNIX mailbox file into several components. Each message is read from the standard input. If the message contains string1, it is added to file1. If the message contains string2, it is added to file2, and so forth. If the message contains none of the specified strings, it is added to the defaultfile.

Admittedly, this function needs to be more powerful. I'm working on it.

