This document contains information on downloading, installing, and configuring the RPM Web Server. It also contains a note about log file structure and a typical list of related files.
From the bottom up, the following things are required to build and run RPM:
- A Unix machine on the Internet. (Tested platforms include Linux 1.x and 2.x, HP/UX 9.0.x, SunOS 4.1.x, and Solaris 2.5.)
- An installed and running Web server that understands the Common Gateway Interface (CGI), or the standard inetd to invoke RPM for you. (This is what we do at CFHT.)
- Precompiled RPM binary
Now you have to choose between running RPM as an independent HTTP server on its own port number, or as a CGI script running under some other Web server. At CFHT, we run RPM in inetd mode on port 911. The binary is automatically installed in /usr/local/cfht/bin/ by the makefile, and /etc/inetd.conf needs to be manually set up to point there, as described below.
For running under inetd
You will have to add the following line to /etc/inetd.conf:
    rpm stream tcp nowait root /usr/local/bin/rpm /usr/local/bin/rpm
The file "rpm" should be a sym-link in the file system pointing to whatever version you want to run. Next, add this line to /etc/services:
    rpm 911/tcp # rpm http server
Or set "911" to whatever port you want it to answer on. "80" is the standard HTTP port. Finally, send your inetd process the HUP signal so it re-reads its configuration files (or if desperate or lazy... reboot, and the changes will take effect for sure). Typical URLs with this scheme will now look something like this:
    http://rpm.cfht.hawaii.edu:911/some.file.on.your.server.xyz
For running under another httpd as a CGI script
Copy the binary into your /cgi-bin directory, but rename the file to something that starts with "nph-". This little bit of filename magic tells the server not to attempt to parse RPM's headers, since it can handle all that itself. For example, install the file with
    cp httpserver-OSNAME /usr/local/httpd/cgi-bin/nph-rpm
Which would allow you to access files with RPM by accessing the virtual URL
    https://www.cfht.hawaii.edu/cgi-bin/nph-rpm/some.file.on.your.server.xyz
The server should now run. But there are other features that can be controlled by editing the file /etc/rpm.conf.
A Note about Security
Anyone who connects to RPM has the ability to not only read files in the specified directories, but also to execute programs, write files, and delete files on your system. If you are not careful when setting up RPM, you can open yourself up to some very big security holes, so read this section carefully.
NOTE: Many useful defaults are already compiled into the server. You can probably skip to the next section and take the defaults to get started...
Use the {MIME-TYPE ...} section to tell the server what kind of file something is according to its extension. Suppose you wanted to add "*.htm" as a valid extension for an HTML file. You would add:
    {MIME-TYPE ... htm = text/html ... }
NOTE: The defaults built into the server for this section are probably good enough, so unless you want to experiment you can probably skip to the next section...
Not all browsers support the same features. The formats that browsers can accept, and the types of tricks they can do, were originally intended to be negotiated by headers specific to each feature (for example, if a client sends "Connection: Keep-Alive" it's supposed to mean they support some kind of persistent connection mode). Unfortunately this system is really quite broken, because many features don't have a useful or consistent header that corresponds, and even ones that do, like Keep-Alive, are not always to be trusted. Netscape 2.0, for example, claims that it can do Keep-Alive, but its implementation is really broken. Because of this great mess, the {FEATURE...} section is used to tell RPM which versions of which browsers support certain features. The easiest way to override defaults in the {FEATURE...} section is to set a particular feature to "*" (=on for everyone) or to "X" (=off for everyone). See the source file "rpm_config.cc" for more details.
Supported {FEATURE ...}'s
last-modified- Client properly uses "If-Modified-Since/Last-Modified" headers, and the server can safely tell the browser to "use cached copy" if the dates agree. This is set to "*" by default since no browsers are known to have any problems with this.
frames- This is a list of browsers that support Netscape's "frames".
javascript- This usually contains a list of browsers that have half-way working support for JavaScript. If a browser is not included in the list, RPM will not bother to include javascript generated from ".rpm" files. JavaScript in normal HTML documents will still be sent out, however.
java- A list of browsers that support java.
dynamic- A list of browsers that understand how to handle server-push dynamic documents. If a browser is not on this list, it will be sent a single, snap-shot frame of the page, rather than a self updating display.
keep-alive- List of browsers that RPM's keep-alive implementation is known to work with. Keep-Alive can speed up loading of pages, especially if many small gifs are involved. It is not, however, necessary.
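The "*" and "X" overrides described above can be illustrated with a short Python sketch. This is not RPM's implementation: the function name feature_enabled is hypothetical, and treating any other value as a comma-separated list of User-Agent prefixes is an assumption made only for illustration (the real list syntax is defined in rpm_config.cc).

```python
def feature_enabled(setting: str, user_agent: str) -> bool:
    """Decide whether a feature applies to the given browser.

    "*" (on for everyone) and "X" (off for everyone) follow the text
    above. Any other value is ASSUMED here to be a comma-separated list
    of User-Agent prefixes; RPM's real syntax may differ.
    """
    setting = setting.strip()
    if setting == "*":
        return True
    if setting == "X":
        return False
    prefixes = [p.strip() for p in setting.split(",") if p.strip()]
    return any(user_agent.startswith(p) for p in prefixes)


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    print(feature_enabled("*", "Mozilla/2.0 (X11)"))
    print(feature_enabled("Mozilla/3,Mozilla/4", "Mozilla/2.0 (X11)"))
```

With a setting of "X" the function returns False for every browser, which mirrors switching a feature off globally.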
All other setup parameters for the server fall into the {CONFIG...} section. Most can be omitted, and the server will use built-in defaults. The table lists all the ones that can be overridden. Note that many are obscure and would not need to be changed under most circumstances.
The most useful {CONFIG ...}'s
local_access- Specifies the access level given to hosts connecting from the local domain. For full protection against name spoofing, rpm should be run with a tcp_wrapper, a firewall, or an enhanced inetd. Possible settings are demo (the default, won't let anyone execute any ACTIONs, but server-side script tags are executed and all files are fair game and can be viewed), none (completely shut down the server), authorized (prompts for a valid username and password before allowing any access, including ACTIONs), open (no passwords, anyone can load and execute what they like, including ACTIONs).
remote_access- Takes the same settings as local_access, but authorized is the default setting. This applies to all hosts outside of the local internet domain.
host- It is possible to have multiple {CONFIG...} sections in the rpm.conf file. This can be useful if you want several servers to read the same global configuration file, or if a single server machine is hosting several virtual sites (even with IP aliasing!). If a config section contains a host=... then it only applies to a server accessed at that hostname. In this way you can specify different document directories or access permissions for different servers in the same configuration file. To guarantee that this works with all browsers, you must use IP aliasing, but since most browsers send a "Host:" header, you can specify additional virtual hosts for those browsers without using IP aliasing.
port- This setting can be used to create virtual servers in the same way that the host setting can. This setting can be used instead of or in combination with the host setting.
webmaster- Place the webmaster's email address here
core_dir- If you don't set this one, or if you set it to some string that's not a real directory on your file system, then all signals will be trapped, and some kind of message will hopefully be sent to the client that something went wrong. It may be more useful, in some cases, to allow a core to be generated, in which case you should set this to the directory where the core should be left (for example "/tmp"). You can use the rpmcrashtest server directive to test core generation.
log_dir- Path to the directory where RPM will build an HTML browseable log structure. The directory should be writable ONLY BY ROOT.
script_dir- Path to the directory where javascript and server-side script tags will be installed.
default_dir- Path to the "document root" directory on your server
default_user- This should be an unprivileged username (like "nobody") to use when accessing pages in the default_dir.
default_home- Set this to something like "~/rpm" to automatically give ALL users RPM directories without having to add them to the SECURE section one-by-one.
default_index- A list of ',' separated filenames to look for in a directory before generating an automatic directory listing. Default value is "index.rpm,index.cgi,index.html".
local_list (new in RPM 2.0.8)- A list of ',' separated filename patterns which should never be served to a remote host, regardless of the setting of remote_access. Default is "/LocalAccess/*,/local_access/*".
log_list- A list of ',' separated filename patterns for directories that contain RPM log files. The default is "/logs/*".
cgi_list- A list of ',' separated filename patterns which should be executed as CGI scripts instead of being sent out normally. The default is "*.cgi,/cgi-bin/*". NOTE: This allows a CGI script to exist ANYWHERE on the server, so long as the filename ends in ".cgi".
cgi_nph_list- If and only if a filename was matched by cgi_list, it is then checked against this list. If it also matches this list, it is executed as a "no-parse-header" CGI script. The default is "*/nph-*".
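The two-step check described for cgi_list and cgi_nph_list can be sketched in Python as follows. This is an illustration, not RPM's own code: the helper names are hypothetical, and it assumes the patterns behave like shell-style globs in which "*" may also match "/" (the behaviour of Python's fnmatch module).

```python
from fnmatch import fnmatchcase


def matches_any(path: str, pattern_list: str) -> bool:
    """True if path matches one of the ','-separated patterns."""
    return any(fnmatchcase(path, p.strip()) for p in pattern_list.split(","))


def classify(path: str,
             cgi_list: str = "*.cgi,/cgi-bin/*",
             cgi_nph_list: str = "*/nph-*") -> str:
    """Classify a request path using the documented default lists.

    cgi_nph_list is consulted only if cgi_list matched first, as the
    description of cgi_nph_list states.
    """
    if not matches_any(path, cgi_list):
        return "static"
    if matches_any(path, cgi_nph_list):
        return "nph-cgi"
    return "cgi"


if __name__ == "__main__":
    for p in ("/cgi-bin/nph-rpm", "/docs/form.cgi", "/docs/nph-readme.html"):
        print(p, "->", classify(p))
```

Note that a file such as /docs/nph-readme.html is treated as an ordinary document in this sketch: it matches the nph pattern but never reaches that check, because it was not matched by cgi_list.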
Rarely Used {CONFIG}'s
verbose- If set to on, more information will be logged to syslog/cfhtlog. The default is off.
redirect_links- If set to on, a request for a URL that ends up being a symlink to another file at the same level only will result in an HTTP redirect. The default is off.
timings- If set to on, an approximation of how long it takes each document to reach a user with a graphical browser will be logged, and displayed at the bottom of each document in fine print. If set to hidden, approximate transfer times will be logged, but not shown to the user. If set to off, no attempt will be made to time performance.
proxy- If set to on, the server will act as both a regular HTTP server and a (non-caching) proxy server.
profile- If set to on, the server will enable profiling timers to measure server performance. A table of how long it took to complete various parts of document preparation and server setup will be included with the HTML sent to the browser. If set to off, you can still get most of the timings by appending the rpmprofile directive to a URL.
image_generation- The string with which all special paths that access the built-in server-side image generation begin. The string "/imggen" is the default. You can't have a real file or directory by this particular name on your server.
icon_generation- The string with which all special paths that access the built-in server-side icon selection begin. The string "/icongen" is the default. You can't have a real file or directory by this particular name on your server.
server_id- Don't change this. It identifies the version of the server that is running. The value that was precompiled into the binary is what we want to see here.
server_date_fmt- Internal stuff for the HTTP protocol that RPM uses. This is a string that strftime should use for formatting the dates on the HTTP headers. If you mess this up, some clients' last-modified caching might break!
keep_alive_max- Maximum number of requests to serve sequentially on a single keep-alive connection. Default is 30.
keep_alive_timeout- If no new requests come in for this many seconds, break the connection during keep-alive mode. Default is 45. Note: This is different from RPM dynamic (server-push) documents. These will keep a connection open indefinitely.
network_timeout- Pauses longer than this when reading data from the client will cause the server to close the connection. Default is 45.
exec_timeout- If set to 0 (the default), processes that the server forks (such as CGIs and handlers) have unlimited time to complete their functions. Otherwise, an error will result after however many seconds you set this to, in case of a runaway process.
bgcolor- The default background color to use for generated forms. For pegasus, this defaults to "#BBBFAA".
bgcolor_light- A lighter version of the background color for grey-ed out items. Default is "#C0C0C0".
demo_head- Full path to a ".rpm" file to process and send before the body of a file in "demo" mode. This demo header should contain only valid tags normally found in the {BODY} section of RPM files.
demo_tail- Full path to a ".rpm" file to process and send after the body of a file in "demo" mode.
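How keep_alive_max and keep_alive_timeout from the list above interact can be shown with a small Python simulation. It is only a sketch under stated assumptions: the function name requests_served is hypothetical, and treating an idle gap equal to the timeout as a disconnect is a guess about the boundary case, not documented behaviour.

```python
def requests_served(idle_gaps, keep_alive_max=30, keep_alive_timeout=45):
    """Count the requests one keep-alive connection would serve.

    idle_gaps holds the idle seconds before each follow-up request.
    The first request on a connection is always served.
    """
    served = 1
    for gap in idle_gaps:
        if served >= keep_alive_max:
            break  # request limit reached; the client must reconnect
        if gap >= keep_alive_timeout:
            break  # connection sat idle too long (boundary is assumed)
        served += 1
    return served


if __name__ == "__main__":
    # A page with many small images requested back to back.
    print(requests_served([0] * 100))
    # The second follow-up request arrives after a 60 second pause.
    print(requests_served([1, 60, 1]))
```

The defaults used here (30 requests, 45 seconds) are the ones given in the list above.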
These {CONFIG}'s only apply to CFHT's version
CFHTLOGU- This defaults to "~/.,session.np". It tells the server what to store in the environment variable $CFHTLOGU before initializing the standard CFHT logging functions.
CFHTLOGS- This defaults to "/tmp/pipes/syslog.np". It tells the server what to store in the environment variable $CFHTLOGS before initializing the standard CFHT logging functions.
hform_netscape_exec- Complete path to the netscape executable to run when an hform is invoked.
hform_netscape_version- The version number of the netscape being used, exactly as it appears in the _MOZILLA_VERSION X-property. The default is "3.0", but other possibilities are things like "2.01" or "4.0b6"
hform_default_dir- A directory that contains default copies of customized versions of the files preferences (to control some aspects of netscape's appearance), and Netscape (to control still more aspects, through X-resources).
hform_new_window- The text to appear on the title bar of a new hform while it's busy loading the real thing.
hform_empty- The url to load into an hform window while it's busy loading the real thing.
hform_http_server- The http server where RPM is running. Note that settings in a user's .,net.par file will override this.
hform_version_prop- The name of the X-Property that netscape makes available to advertise its version. By default, netscape uses "_MOZILLA_VERSION", but at CFHT it has been changed to "_MOZILLA_RPMVERS" so that hform will not interact with the user's own netscape window, if they happen to be running it at the same time.
An example /etc/rpm.conf
All of the above may seem complex, but an /etc/rpm.conf file would rarely need to use most of the features described above. They are provided only to allow you to keep RPM up-to-date with the latest browsers without having to recompile the binary. A typical /etc/rpm.conf file can be pretty short:
    {CONFIG
        local_access = authorized
        remote_access = none
        webmaster = me@my.machine.net
        log_dir = /usr/local/rpm/logs
        default_dir = /usr/local/rpm/documents
        default_home = ~/rpm
    }
Note how all the sections have been completely omitted, except for {CONFIG}. The example above allows anyone with a subdirectory called "rpm" off of their home directory to put out RPM documents on the server (default_home=~/rpm), and allows anyone with a valid user-id and password on the machine running RPM to get access to any of the pages. Passwords are sent unencrypted across the network. You have been warned.
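For tooling that needs to inspect a file like the example above, the {SECTION key = value ...} layout can be read with a few lines of Python. This is a rough sketch, not RPM's parser: the function name parse_sections is hypothetical, and it assumes that values contain no whitespace and that "}" appears only as a section terminator.

```python
import re

# "{NAME ... }" blocks; NAME may contain a hyphen, as in MIME-TYPE.
SECTION_RE = re.compile(r"\{\s*(\w[\w-]*)\s+(.*?)\}", re.DOTALL)
# "key = value" pairs; values are ASSUMED to contain no whitespace.
PAIR_RE = re.compile(r"([\w-]+)\s*=\s*(\S+)")


def parse_sections(text):
    """Return a list of (section_name, settings_dict) tuples.

    A list is used rather than a dict because rpm.conf may contain
    several {CONFIG ...} sections.
    """
    return [(name, dict(PAIR_RE.findall(body)))
            for name, body in SECTION_RE.findall(text)]


if __name__ == "__main__":
    example = """
    {CONFIG
        local_access = authorized
        remote_access = none
        webmaster = me@my.machine.net
        log_dir = /usr/local/rpm/logs
        default_dir = /usr/local/rpm/documents
        default_home = ~/rpm
    }
    """
    for name, settings in parse_sections(example):
        print(name, settings)
```

Keeping each section as a separate entry leaves room for handling host= or port= sections later, without guessing here how RPM merges them.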
If the directory specified in {CONFIG log_dir="..."} is writable by root, RPM will create an HTML browseable log of all transactions in this directory. You can point any browser at this directory locally to see the logs, or you can load them through RPM if a link called "logs" that points to the directory has been set up (in {CONFIG default_dir="..."}). In the latter case, RPM will update the log screen dynamically if you leave it up. Accesses to the /logs/ directory are not themselves logged, to prevent obvious feedback problems.
Here are some typical places RPM-related files might show up with the installation we have at CFHT:
    /etc/services
    /etc/inetd.conf
    /etc/rpm.conf
    /usr/local/cfht/bin/rpm-neptune -> (sym-link to appropriate version)
    /usr/local/cfht/conf/rpm/home/* (document root)
    ~user/rpm/* (each user's document root)