CFHT FITS Guidelines

Date: Monday, 2001-04-23

From: Sidik Isani

To: Software group and astros

Subject: Proposed Long Term FITS goals for CFHT

Please refer to the on-line copy of this document, http://software.cfht.hawaii.edu/fits_guide.html. It contains links to related information.

About 10 years ago, Steve Smith wrote a library for CFHT which we still use to generate and update FITS keywords. With "Multi-Extension FITS", this library has now reached its limitations. Also, file locking, which would permit updating FITS headers from two or more programs at once, and elimination of header "templates", would simplify and speed up our acquisition system. I have completed a new library to address these issues. It is called "libfh", and I've tried to make it portable and easy to incorporate into stand-alone programs.

Use of this library could give all our tools a common behavior, especially when it comes to accessing extensions within a file. Documentation and examples showing how to use the library itself are at:

http://software.cfht.hawaii.edu/libfh/

I came up with six points that I think we should set as goals for the next couple of years. If you have an interest in how FITS data gets generated and modified by our ever-growing data pipeline, please read this over and give some feedback.

I. Follow FITS Standards and Precedents

  1. We intend to follow all requirements of the FITS standard.

  2. When ingesting data, our tools should not rely on anything in the FITS standard that is just a recommendation.

  3. When we create files, we should still follow those recommendations whenever possible.

  4. For anything pertaining to CCD mosaic data or image extensions which is not specifically addressed in the FITS standard, we should follow NOAO's convention, as set forth in the NOAO master keyword dictionary.

  5. Finally, if there's no precedent or other way to do it, we make it up as we go along. As has been pointed out, this is the nature of FITS.

II. Exchange 1 file per exposure between subsystems

By the time MegaCam comes along, data passed between major components of our system should only be MEF format. FLIPS and Elixir are to be considered separate major components.

This point is going to take some discussion, and there may have to be some exceptions to the rule, but it would be wise to keep the exceptions to a minimum (and document them).

III. All tools should at least read MEF directly

We should assess, as soon as possible, how difficult it would be to upgrade every tool that requires a FITS product as input so that it can read directly from MEF format files rather than splitting them. MEF support can be added one tool at a time, without breaking backward compatibility:

  1. At the command level, a filename that is a single basic FITS file should continue to be processed as it is now. For example, if your program is called `reduce' and you want it to process the file `basic.fits', then
    reduce "basic.fits"
    
    does now, and always should, do what one would expect.

  2. When such a filename is passed to libfh, the library will open the file for your program, read the FITS cards into a table (you have the option to modify and fh_rewrite() them later) and leave a file pointer at the start of data.

    BITPIX, NAXIS, or any other FITS cards must be examined by looking them up in the table (use fh_get_int, etc., instead of reading the file yourself.) [What problems does this currently pose for the way the existing tools work? Is the conversion really such a problem?]

  3. Upgraded or not, each program probably already understands this syntax:
    reduce "filename/filename03.fits"
    
    However, once we know that all FITS files are being opened by the library, we should discontinue this syntax (even though it would still be valid) in favor of the New Way below:

  4. Reading a named extension from a Multi-Extension FITS file would look like this:
    reduce "filename[chip03]"      OR
    reduce "filename.fits[chip03]"
    
    (the two would be equivalent.) Be sure to use "" around the filename so your shell does not try to do file-matching with the [...] expression. Since the library will take care of finding the correct EXTNAME, and will return only those cards in the corresponding table, your programs can now read one extension at a time from an MEF file without any modification. If EXTNAME is not found, it's the same error (as far as your program is concerned) as a file not being found. If it is found, the file pointer is left at the start of the correct image.

    In order to always be able to use this syntax, even when data is actually saved as split files, the library will do more than just search for "filename" or "filename.fits" ... if those aren't found, it will also try "filename/filename+chipno.fits". This means that the syntax "filename[extname]" will work regardless of the format of the data.

  5. The final step in the conversion would be to add proper support for the syntax:
    reduce "meffile.fits"
    
    to any programs for which it makes sense. Until then, that type of invocation will just continue to do whatever it does now. (The preferred behavior, though, is to add a quick check after each library call that opens a file, to see whether the file is MEF, and to print a friendly message if it is.)

See the section of the libfh documentation on porting programs for more information on how all this will work.
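
To make the lookup order in point 4 concrete, here is a minimal sketch in Python (chosen for brevity; it is not libfh code and does not call libfh). The names parse_spec and candidate_paths are hypothetical, and the rule that the chip number is the trailing digits of the extension name is an assumption based on the `chip03' and `filename03.fits' examples above. The real library may resolve names differently.

```python
import posixpath
import re

# Matches "path" or "path[extname]"; brackets are not allowed inside either part.
_SPEC = re.compile(r"^([^\[\]]+)(?:\[([^\[\]]+)\])?$")


def parse_spec(spec):
    """Split 'filename[extname]' into (path, extname); extname is None if absent."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("malformed file specification: %r" % spec)
    return m.group(1), m.group(2)


def candidate_paths(spec):
    """Return, in order, the paths a resolver could try for this specification."""
    path, extname = parse_spec(spec)
    candidates = [path]
    if path.endswith(".fits"):
        base = path[: -len(".fits")]
    else:
        base = path
        candidates.append(path + ".fits")
    if extname is not None:
        # ASSUMPTION: the chip number is the trailing digits of the extension name.
        digits = re.search(r"(\d+)$", extname)
        if digits:
            name = posixpath.basename(base) + digits.group(1) + ".fits"
            candidates.append(base + "/" + name)
    return candidates
```

With this sketch, candidate_paths("filename[chip03]") lists "filename", "filename.fits", and then "filename/filename03.fits", which is the search order described in point 4.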

IV. Templates Bad

Templates, and having upstream components reserve slots for specific FITS cards, are both bad. Instead, insert calls to fh_reserve() in any code that creates a new FITS file.
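
As a rough illustration of what reserving space means at the file level, here is a minimal Python sketch. It is not how fh_reserve() is necessarily implemented, and the helper names card and build_header are hypothetical. It relies only on the FITS layout itself: a header is a sequence of 80-character cards ending with END, padded with spaces to a multiple of 2880 bytes (36 cards). One way to leave room for later additions, assumed here, is to write blank cards ahead of END.

```python
CARD = 80     # bytes per header card
BLOCK = 2880  # bytes per FITS block, i.e. 36 cards


def card(keyword, value):
    """Format one fixed-format card for an integer or logical ('T'/'F') value."""
    return "{:<8}= {:>20}".format(keyword, value).ljust(CARD)


def build_header(cards, reserve=0):
    """Join cards, add `reserve` blank cards before END, and pad to a full block."""
    body = list(cards) + [" " * CARD] * reserve + ["END".ljust(CARD)]
    text = "".join(body)
    padding = (-len(text)) % BLOCK
    return (text + " " * padding).encode("ascii")
```

No specific keyword gets a slot here: the reserved cards are anonymous, so whichever tool runs later can claim one. Reserving 40 cards on top of three real ones, for instance, yields a two-block (5760-byte) header instead of a one-block one.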

V. Avoid re-writing whole file as much as possible

Changes in the FITS header size should be minimized to help keep tools which only need to update or add a few cards as simple as possible, and also to optimize the efficiency of the pipeline. Satisfying IV. and V. at the same time is possible. Here is a proposal. Once installed, the system will require some (very easy) tuning from time to time, but it should not be a burden. Most other schemes seem to violate IV. or V.

At first glance, this may seem to imply that all individual tools then have to be able to deal with the possibility that they might have to grow the header and re-write the file. Even if this task is in a library, we probably agree that this is a bit heavy. So let's add one more requirement:

VI. Tools (like handlers) can assume there is space to add keywords

Individual tools may (and perhaps should) be allowed to assume that if they only update or add cards to a header, there will be space, and they can FAIL if there is not. (This is similar to current Pegasus handlers failing if their template values do not already exist in a FITS file.)
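
The behavior such a tool would rely on can be sketched as follows, in Python and independent of libfh. The names NoSpaceError and add_card_in_place are hypothetical, and treating blank cards directly before END as reserved slots is an assumption of this sketch. The header size never changes: the new card goes into a reserved blank card, or into the padding after END, and otherwise the call fails.

```python
CARD = 80     # bytes per header card
BLOCK = 2880  # bytes per FITS block, i.e. 36 cards


class NoSpaceError(Exception):
    """Raised when a card cannot be added without growing the header."""


def add_card_in_place(header, new_card):
    """Return a header of the same length with new_card added, or raise NoSpaceError."""
    if len(header) % BLOCK != 0 or len(new_card) != CARD:
        raise ValueError("header must be whole blocks and card must be 80 characters")
    text = header.decode("ascii")
    cards = [text[i:i + CARD] for i in range(0, len(text), CARD)]
    end = None
    for i, c in enumerate(cards):
        if c.rstrip() == "END":
            end = i
            break
    if end is None:
        raise ValueError("no END card found")
    # ASSUMPTION: blank cards immediately before END are reserved slots.
    slot = end
    while slot > 0 and cards[slot - 1].strip() == "":
        slot -= 1
    if slot < end:
        cards[slot] = new_card               # fill the first reserved blank card
    elif end + 1 < len(cards):
        cards[end] = new_card                # use the padding that follows END
        cards[end + 1] = "END".ljust(CARD)
    else:
        raise NoSpaceError("header is full; the file would have to be rewritten")
    return "".join(cards).encode("ascii")
```

Because the returned header has the same length as the input, a tool built this way only ever overwrites header bytes and never has to move image data.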

A first, working version of the library is already done, and is being used to generate all 12K and skyprobe FITS data. Working versions of "loggerh" (for datalogger status) and "tcsh" (for TCSIV information) have been created as well, so we will now have logger data in the header for 12K data, and TCS data in the header for skyprobe's images.

If you have suggestions of how the system could be improved, or requests for additional routines that would make the library useful to you, please let me know.

The Pegasus source tree has a "properly integrated" version of the library. At the request of those who would like to provide their code to other sites without having to include the Pegasus Makefile structure or extra libraries, I have also made this library available as two files: "fh.c", the full C source code, except for the validation routine, and "fh.h", a header file to include in any program which uses the routines. You can download these files from the libfh web page.