This preface is a way of apologizing before the fact for the way this thing is
presented for the initial release.

Also, it lets me explain that the anticipated audience ranges from the home
computer network 'hobbyist', to more serious home networks, to (be still, my
heart) small to medium networks in a more commercial or organizational setting.

Also, it lets me explain why I think such a project is useful. Mainly because
networks, large and small, are becoming more commonplace and network 
monitoring is becoming more necessary. And because I noticed early on that 
a system like this is much like business cards. You pay a fortune for that
first card and the next 999 cost little more than the card stock they are 
printed on. Likewise, once you have a system like this set up for a single 
client reporting to a server, the next 100 clients are almost effortless to 
set up. 

My numbers so far suggest that a 350 MHz machine with, say, 64 megabytes of RAM 
and a LARGE disk could handle the log output of at least 250 machines. 

This is actually misleading. The system's performance is best measured in 
terms of log entries per unit of time, and the bottleneck is inserting the log 
data into the database. For example, I am running (at the time of this writing) 
the system as released here on a 350 MHz AMD with 364 MB of RAM. It will 
consistently load 22-23 log entries per second into the database with other
processes (the X Window System, SETI) using 30 percent of the processor.
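
Entries per second is easy to measure for yourself. Below is a minimal,
hypothetical sketch in Python, timing row-at-a-time inserts against SQLite;
the real system's database and schema live in the prepLogs4DBMS_* scripts,
so the one-column table here is purely illustrative:

    import sqlite3
    import time

    # Hypothetical one-column table standing in for the real log schema.
    db = sqlite3.connect("throughput_test.db")
    db.execute("CREATE TABLE IF NOT EXISTS log (entry TEXT)")

    entries = ["Jan  1 00:00:%02d host daemon: sample entry %d" % (i % 60, i)
               for i in range(500)]

    start = time.time()
    for e in entries:
        db.execute("INSERT INTO log (entry) VALUES (?)", (e,))
        db.commit()    # a commit per entry is what makes inserts the bottleneck
    elapsed = time.time() - start

    print("%.1f entries/second" % (len(entries) / elapsed))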

Now, after processing raw log data from my machines (each of which produces 
approximately one log entry every three seconds), only about 0.01 of the raw
entries remain to be loaded into the database. The bulk of this reduction is 
accounted for by my cutting login information every few minutes. Massive 
redundancy. The number of records to load into the database is reduced further
(to 0.25) because the database maintains a single copy of any text data and 
3/4 of the candidate records are rejected as duplicate. That is not completely 
accurate. 3/4 of the candidate record _information_ is rejected as duplicate. 
If this begins to look like some sort of data compression, rest assured it is 
completely lossless. The log entries can be reconstructed verbatim from the 
database.
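
To make 'a single copy of any text data' concrete, here is a minimal sketch
of the idea (Python with SQLite; the table and column names are mine for
illustration, not the released schema). Each distinct message text is stored
once, every log entry just points at it, and any entry still reconstructs
verbatim:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE msg (id INTEGER PRIMARY KEY, text TEXT UNIQUE)")
    db.execute("CREATE TABLE log (stamp TEXT, host TEXT, msg_id INTEGER)")

    def load(stamp, host, text):
        # Duplicate text is rejected by the UNIQUE constraint; we reuse the
        # existing row, so each distinct message is stored exactly once.
        db.execute("INSERT OR IGNORE INTO msg (text) VALUES (?)", (text,))
        (msg_id,) = db.execute("SELECT id FROM msg WHERE text = ?",
                               (text,)).fetchone()
        db.execute("INSERT INTO log VALUES (?, ?, ?)", (stamp, host, msg_id))

    load("Jan  1 00:00:01", "alpha", "sshd: connection from 10.0.0.2")
    load("Jan  1 00:00:04", "beta", "sshd: connection from 10.0.0.2")  # stored once

    # Lossless: every entry reconstructs verbatim from the joined tables.
    for row in db.execute("SELECT stamp, host, text FROM log "
                          "JOIN msg ON msg_id = msg.id"):
        print("%s %s %s" % row)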

If we ignore the reduction in log entries accounted for by login info,
the system will still handle about 90 client log entries per second (the 22-23
database inserts per second divided by the 0.25 duplicate-rejection factor). If 
every machine generates 1/3 of a log entry per second, then 90 entries per 
second is the output of about 270 machines, which squares with the 250-machine 
estimate above. Well, it _feels_ right. And if you don't ignore the effect of 
the login info, you are looking at the possibility of thousands of clients and 
that is absurd. Well, maybe not...  :->
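
For anyone who wants to check me, here is that back-of-the-envelope arithmetic
in Python, using the figures above (the 0.01 login factor is my rough estimate,
not a measured constant):

    inserts_per_sec = 22.5     # measured database insert rate (22-23/sec above)
    dedup_factor    = 0.25     # 1/4 of candidate record information survives
    login_factor    = 0.01     # rough reduction from cutting redundant login info
    entry_rate      = 1.0 / 3  # one log entry every three seconds per machine

    client_entries = inserts_per_sec / dedup_factor    # ~90 client entries/sec
    print(client_entries / entry_rate)                 # ~270 machines, login info ignored
    print(client_entries / login_factor / entry_rate)  # thousands of machines with it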

Please follow this 'trail' to get a handle on what this thing is about:

1. Look over the flowcharts. They are in HTML format and they diagram the basic
operation of the system per the initial release.

2. Read the 'intralog_#_*' docs.

3. Read the comments in the code (as well as the code, I suppose) in this 
(recommended) order:
    a. archive_logs
    b. log_logins
    c. prepLogs4DBMS_Main, _AuxFuncs, _Vars.
    d. The 'prepLogs4DBMS_do <...>' files. These are the modules for the
	various log 'families'.
    e. prepLogs4DBMS_FixMessage.
    f. The various files in /utils.
    
Please note: In some of the files you will find text beginning at column 
88. These are monitoring/debugging statements that are turned on at the top 
of the file. Some statements always run. Together they make up a fairly 
extensive debugging facility that should let you find any setup/config/parsing 
errors pretty quickly. FYI, with 'everything' turned on, processing 40 lines 
of raw log data generated over 2 megabytes of debug info.
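
The actual toggles live at the top of each released file; purely as an
illustration of the pattern, here is a sketch in Python with made-up names:

    DEBUG = 1   # flip to 0 at the top of the file to silence debug output

    def dbg(*parts):
        # Monitoring/debugging statement: runs only when DEBUG is on.
        if DEBUG:
            print("DBG:", *parts)

    line = "Jan  1 00:00:01 alpha sshd: sample entry"
    dbg("raw line:", line)
    fields = line.split()
    dbg("parsed fields:", fields)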
    
Getting the documentation in order is one of the higher priorities. 

DEVELOPMENT PATH:

The initial release should be 0.1. Releases 0.2, 0.3, and 0.4 should bring 
'analysis', 'httpd', and 'security' to the same point of development as this 
initial release. Releases 0.5 through 0.9 could go one of two ways, or both:
1) support for various operating systems, web servers, RDBMSs, and 
scripting interfaces and 2) porting to other languages. The C++ interface
to MySQL or Pro*C/OCI for Oracle come to mind. 

Hopefully, release 1.0 will be something to be really proud of.

If you have any questions, comments, suggestions or even ridicule, please do
not hesitate to contact me. I have been frankly starving for feedback.

Thanks,

Ken