This preface is a way of apologizing in advance for the way this thing is presented in the initial release. It also lets me explain who the anticipated user is: anyone from a home computer network 'hobbyist', to someone running a more serious home network, to (be still, my heart) someone running a small to medium network in a more commercial or organizational setting.

It also lets me explain why I think such a project is useful. Mainly because networks, large and small, are becoming more commonplace, and network monitoring is becoming more necessary. And because I noticed early on that a system like this is much like business cards: you pay a fortune for that first card, and the next 999 cost little more than the card stock they are printed on. Likewise, once you have a system like this set up for a single client and a server, the next 100 clients are almost effortless to set up.

My numbers so far suggest that a 350MHz machine with, say, 64 megabytes of RAM and a LARGE disk could handle the log output of at least 250 machines. That figure is actually misleading. The system's performance is best measured in log entries per unit of time, and the bottleneck is inserting the log data into the database. For example, at the time of this writing I am running the system as released here on a 350MHz AMD with 364MB of RAM. It will consistently load 22-23 log entries per second into the database while other processes (X Windows, SETI) use 30 percent of the processor.

Now, preprocessing the raw log data from my machines (each of which produces approximately one log entry every three seconds) reduces the number of log entries to load into the database to roughly 0.01 of the raw total. The bulk of this is accounted for by my cutting the login information that gets logged every few minutes (massive redundancy). There is a further reduction (to 0.25) in the number of records to load, because the database maintains a single copy of any text data and 3/4 of the candidate records are rejected as duplicates.
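The single-copy-of-text idea can be sketched as follows. This is only an illustration of the technique, not the project's actual schema: the table names (msg_text, log_entry), the columns, and the load_entry helper are all hypothetical, and SQLite is used here purely to keep the demo self-contained.

```python
import sqlite3

# Hypothetical two-table schema: each distinct piece of log text is stored
# exactly once, and each log entry just points at its text.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE msg_text (
        id   INTEGER PRIMARY KEY,
        text TEXT UNIQUE              -- one copy of any given text
    );
    CREATE TABLE log_entry (
        stamp   TEXT,                 -- timestamp from the raw log line
        host    TEXT,
        text_id INTEGER REFERENCES msg_text(id)
    );
""")

def load_entry(stamp, host, text):
    """Insert a log entry, reusing the stored copy of its text if one exists."""
    row = db.execute("SELECT id FROM msg_text WHERE text = ?", (text,)).fetchone()
    if row is None:
        text_id = db.execute("INSERT INTO msg_text (text) VALUES (?)",
                             (text,)).lastrowid
    else:
        text_id = row[0]              # duplicate text rejected, id reused
    db.execute("INSERT INTO log_entry VALUES (?, ?, ?)", (stamp, host, text_id))

def reconstruct():
    """Rebuild the original log lines verbatim -- the 'compression' is lossless."""
    return [f"{s} {h} {t}" for s, h, t in db.execute(
        "SELECT stamp, host, text FROM log_entry "
        "JOIN msg_text ON text_id = msg_text.id")]

lines = [("Jan 1 00:00:01", "hostA", "sshd: connection refused"),
         ("Jan 1 00:00:04", "hostA", "sshd: connection refused"),
         ("Jan 1 00:00:07", "hostB", "sshd: connection refused"),
         ("Jan 1 00:00:10", "hostB", "named: zone transfer ok")]
for line in lines:
    load_entry(*line)

# Four entries are recorded, but only two copies of message text are stored:
print(db.execute("SELECT COUNT(*) FROM log_entry").fetchone()[0])  # 4
print(db.execute("SELECT COUNT(*) FROM msg_text").fetchone()[0])   # 2
```

Every entry is still individually recoverable, so nothing is lost; only the duplicated text *information* is collapsed.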
That is not completely accurate: 3/4 of the candidate record _information_ is rejected as duplicate. If this begins to look like some sort of data compression, rest assured it is completely lossless. The log entries can be reconstructed verbatim from the database.

If we ignore the reduction in log entries accounted for by the login info, the system will still handle about 90 client log entries per second. If each machine generates one log entry every three seconds (1/3 entry per second), then 90 entries per second works out to about 270 machines, which squares with the 250-machine figure above. Well, it _feels_ right. And if you do not ignore the effect of the login info, you are looking at the possibility of thousands of clients, and that is absurd. Well, maybe not... :->

Please follow this 'trail' to get a handle on what this thing is about:

1. Look over the flowcharts. They are in html format and they diagram the basic operation of the system per the initial release.

2. Read the 'intralog_#_*' docs.

3. Read the comments in the code (as well as the code, I suppose) in this (recommended) order:

   a. archive_logs
   b. log_logins
   c. prepLogs4DBMS_Main, _AuxFuncs, _Vars
   d. The 'prepLogs4DBMS_do <' '>' files. These are the modules for the various log 'families'.
   e. prepLogs4DBMS_FixMessage
   f. The various files in /utils

Please note: In some of the files you will find text beginning at column 88. These are monitoring/debugging statements that are turned on at the top of the file. There are also some statements that always run. Together they make up a fairly extensive debugging facility that should let you find any setup/config/parsing errors pretty quickly. FYI, with 'everything' turned on, processing 40 lines of raw log data generated over 2 megabytes of debug info.

Getting the documentation in order is one of the higher priorities.

DEVELOPMENT PATH:

The initial release should be 0.1. Releases 0.2, 0.3, and 0.4 should carry 'analysis', 'httpd', and 'security' to the point of their own initial release.
Releases 0.5 through 0.9 could go one of two ways, or both: 1) support for various operating systems, web servers, RDBMSs, and scripting interfaces, and 2) porting to other languages. The C++ interface to MySQL, or Pro*C/OCI for Oracle, come to mind. Hopefully, release 1.0 will be something to be really proud of.

If you have any questions, comments, suggestions, or even ridicule, please do not hesitate to contact me. I have frankly been starving for feedback.

Thanks,
Ken