Class CGIServlet

All Implemented Interfaces:
Serializable, Servlet, ServletConfig

public final class CGIServlet extends HttpServlet
CGI-invoking servlet for web applications, used to execute scripts which comply with the Common Gateway Interface (CGI) specification and are named in the path-info used to invoke this servlet.

Note: This code compiles and even works for simple CGI cases. Exhaustive testing has not been done. Please consider it beta quality. Feedback is appreciated to the author (see below).

Example:
If an instance of this servlet was mapped (using <web-app>/WEB-INF/web.xml) to:

<web-app>/cgi-bin/*

then the following request:

http://localhost:8080/<web-app>/cgi-bin/dir1/script/pathinfo1

would result in the execution of the script

<web-app-root>/WEB-INF/cgi/dir1/script

with the script's PATH_INFO set to /pathinfo1.
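The mapping in this example can be sketched as a walk over the request's path-info: segments are appended to the CGI search root until an existing file is found, and whatever is left over becomes PATH_INFO. This is a minimal illustration only, not the servlet's actual lookup code. The class `CgiPathSketch`, its `resolve` method, the `isFile` predicate, and the `/webapp` root are hypothetical names chosen for the example, and `/` is assumed as the path separator.

```java
import java.io.File;
import java.util.function.Predicate;

/** Hypothetical sketch; not the servlet's actual lookup code. */
final class CgiPathSketch {

    /** Holds the resolved script path and the leftover PATH_INFO. */
    static final class Resolved {
        final String scriptPath;
        final String pathInfo;

        Resolved(String scriptPath, String pathInfo) {
            this.scriptPath = scriptPath;
            this.pathInfo = pathInfo;
        }
    }

    /**
     * Appends path-info segments to searchRoot one at a time until isFile
     * accepts the accumulated path. The remaining segments form PATH_INFO.
     * Returns null when no segment prefix names a script.
     */
    static Resolved resolve(String searchRoot, String requestPathInfo,
                            Predicate<String> isFile) {
        String[] segments = requestPathInfo.split("/");
        StringBuilder candidate = new StringBuilder(searchRoot);
        for (int i = 0; i < segments.length; i++) {
            if (segments[i].isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            candidate.append('/').append(segments[i]);
            if (isFile.test(candidate.toString())) {
                StringBuilder rest = new StringBuilder();
                for (int j = i + 1; j < segments.length; j++) {
                    if (!segments[j].isEmpty()) {
                        rest.append('/').append(segments[j]);
                    }
                }
                return new Resolved(candidate.toString(), rest.toString());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "/webapp" is an assumed web application root used only for this demo.
        Resolved r = resolve("/webapp/WEB-INF/cgi", "/dir1/script/pathinfo1",
                p -> new File(p).isFile());
        System.out.println(r == null
                ? "no script found"
                : r.scriptPath + " PATH_INFO=" + r.pathInfo);
    }
}
```

Passing the file check as a predicate keeps the sketch testable without touching the file system.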

Recommendation: House all your CGI scripts under <webapp>/WEB-INF/cgi. This ensures that you do not accidentally expose your CGI scripts' code to the outside world and that your scripts are cleanly ensconced underneath the WEB-INF (i.e., non-content) area.

The default CGI location is mentioned above. You have the flexibility to put CGIs wherever you want, however:

The CGI search path will start at webAppRootDir + File.separator + cgiPathPrefix (or webAppRootDir alone if cgiPathPrefix is null).

cgiPathPrefix is defined by setting this servlet's cgiPathPrefix init parameter.
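The search-path rule above amounts to a single null check. The following is a minimal sketch of that rule. `CgiSearchRootSketch` and `searchRoot` are hypothetical names, and in a real servlet the prefix would presumably come from the servlet's init parameters rather than a method argument.

```java
import java.io.File;

/** Hypothetical sketch of the search-root rule described above. */
final class CgiSearchRootSketch {

    /**
     * Returns webAppRootDir alone when cgiPathPrefix is null, otherwise
     * webAppRootDir + File.separator + cgiPathPrefix.
     */
    static String searchRoot(String webAppRootDir, String cgiPathPrefix) {
        if (cgiPathPrefix == null) {
            return webAppRootDir;
        }
        return webAppRootDir + File.separator + cgiPathPrefix;
    }

    public static void main(String[] args) {
        // "/opt/app" is an assumed web application root used only for this demo.
        System.out.println(searchRoot("/opt/app", "WEB-INF/cgi"));
        System.out.println(searchRoot("/opt/app", null));
    }
}
```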

CGI Specification:
Derived from http://cgi-spec.golux.com, a work-in-progress and expired Internet Draft. Note that no actual RFC describing the CGI specification exists. Where the behavior of this servlet differs from the specification cited above, it is either documented here, a bug, or an instance where the specification cited differs from Best Community Practice (BCP). Such instances should be well-documented here. Please email the Tomcat group with amendments.

Canonical metavariables:
The CGI specification defines the following canonical metavariables:
[excerpt from CGI specification]

  AUTH_TYPE
  CONTENT_LENGTH
  CONTENT_TYPE
  GATEWAY_INTERFACE
  PATH_INFO
  PATH_TRANSLATED
  QUERY_STRING
  REMOTE_ADDR
  REMOTE_HOST
  REMOTE_IDENT
  REMOTE_USER
  REQUEST_METHOD
  SCRIPT_NAME
  SERVER_NAME
  SERVER_PORT
  SERVER_PROTOCOL
  SERVER_SOFTWARE
 

Metavariables with names beginning with the protocol name (e.g., "HTTP_ACCEPT") are also canonical in their description of request header fields. The number and meaning of these fields may change independently of this specification. (See also section 6.1.5 [of the CGI specification].)

[end excerpt]
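By the usual CGI convention, a request header name is turned into a protocol-prefixed metavariable by upper-casing it, replacing hyphens with underscores, and prepending "HTTP_". The sketch below illustrates that convention only. `HeaderMetavariableSketch` and `toMetavariable` are hypothetical names, and whether this servlet performs the conversion in exactly this way is an assumption.

```java
import java.util.Locale;

/** Hypothetical sketch of the conventional header-to-metavariable mapping. */
final class HeaderMetavariableSketch {

    /** Maps a header name such as "Accept" to "HTTP_ACCEPT". */
    static String toMetavariable(String headerName) {
        // Locale.ROOT avoids locale-specific upper-casing surprises.
        return "HTTP_" + headerName.toUpperCase(Locale.ROOT).replace('-', '_');
    }

    public static void main(String[] args) {
        System.out.println(toMetavariable("Accept"));
        System.out.println(toMetavariable("User-Agent"));
    }
}
```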

Implementation notes

Standard input handling: If your script accepts standard input, then the client must start sending input within a certain timeout period; otherwise the servlet will assume no input is coming and carry on running the script. The script's standard input will be closed, and handling of any further input from the client is undefined. Most likely it will be ignored. If this behavior becomes undesirable, then this servlet needs to be enhanced to handle threading of the spawned process' stdin, stdout, and stderr (which should not be too hard).
If you find your cgi scripts are timing out receiving input, you can set the init parameter stderrTimeout of your webapps' cgi-handling servlet.

Metavariable Values: According to the CGI specification, implementations may choose to represent null or missing values in an implementation-specific manner, but must define that manner. This implementation chooses to always define all required metavariables, but sets the value to "" for all metavariables whose value is either null or undefined. PATH_TRANSLATED is the sole exception to this rule, as per the CGI Specification.
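The rule above can be sketched as a small helper that replaces null with the empty string before a value is placed in the script's environment. This is an illustration under stated assumptions: `MetavariableDefaultsSketch`, `nullToBlank`, and `buildEnv` are hypothetical names, only three metavariables are shown, and the PATH_TRANSLATED exception is assumed to mean that the variable is left unset when it has no value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the "define everything, default to empty" rule. */
final class MetavariableDefaultsSketch {

    /** Returns "" for null so that the metavariable is always defined. */
    static String nullToBlank(String value) {
        return value == null ? "" : value;
    }

    /** Builds a partial CGI environment from possibly-null request values. */
    static Map<String, String> buildEnv(String queryString, String remoteUser,
                                        String pathTranslated) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("QUERY_STRING", nullToBlank(queryString));
        env.put("REMOTE_USER", nullToBlank(remoteUser));
        // Assumption: PATH_TRANSLATED is omitted rather than set to "".
        if (pathTranslated != null) {
            env.put("PATH_TRANSLATED", pathTranslated);
        }
        return env;
    }

    public static void main(String[] args) {
        System.out.println(buildEnv(null, "alice", null));
    }
}
```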

NPH -- Non-parsed-header implementation: This implementation does not support the CGI NPH concept, whereby the server ensures that the data supplied to the script are precisely as supplied by the client and unaltered by the server.

A servlet container (including Tomcat) is specifically designed to parse and possibly alter CGI-specific variables, which makes NPH functionality difficult to support.

The CGI specification states that compliant servers MAY support NPH output. It does not state that servers MUST support NPH output to be unconditionally compliant. Thus, this implementation maintains unconditional compliance with the specification even though NPH support is not present.

The CGI specification is located at http://cgi-spec.golux.com.

TODO:

  • Support for setting headers (for example, Location headers don't work)
  • Support for collapsing multiple header lines (per RFC 2616)
  • Ensure handling of POST method does not interfere with 2.3 Filters
  • Refactor some debug code out of core
  • Ensure header handling preserves encoding
  • Possibly rewrite CGIRunner.run()?
  • Possibly refactor CGIRunner and CGIEnvironment as non-inner classes?
  • Document handling of cgi stdin when there is no stdin
  • Revisit IOException handling in CGIRunner.run()
  • Better documentation
  • Confirm use of ServletInputStream.available() in CGIRunner.run() is not needed
  • [add more to this TODO list]
Author:
Martin T Dengler [root@martindengler.com], Amy Roh
See Also: