Mar 4, 2009

What is CGI

A lot of people have Web pages but most feel that CGI scripts are "over their head". Nonsense! If you know basic HTML and know how to use an FTP program like WS_FTP to transfer files, chances are you can be using a CGI script on your Web pages in about 15 minutes. With so many free CGI scripts available (including Bestdam Logger Lite), you are really short-changing yourself if you are not using the CGI capabilities offered by your ISP or host provider.

While there are a lot of "CGI Tutorial" pages out there, most deal with how to write CGI scripts. For those who just want to know how to use CGI scripts, information is pretty scarce. That's why this page was created.

What Is CGI ?


There are a number of different methods Web developers can use to enhance the content of Web pages over and above what simple HTML provides. Most of these methods involve writing little programs or routines using one scripting language or another. There are two basic differences between these methods:

1. Where is the script code located ?
2. Where is the script code executed ?

The following table summarizes the above two points for the various methods:

Method                 Where is the script       Where is the script      Required HTML
                       code located ?            code executed ?          file extensions
---------------------  ------------------------  -----------------------  ---------------
CGI                    In files in the CGI-BIN   On the server.           .shtm or .shtml
                       directory on the server.
PHP, ColdFusion, ASP   Embedded in the           On the server.           .php .cfm .asp
                       HTML document.
Javascript             Embedded in the           On the user's PC         n/a
                       HTML document.            by their browser.
Java                   In files on               On the user's PC         n/a
                       the server.               by their browser.

Note: Java is not a scripting language. Java files are pre-compiled "applets". The applet files are stored on the server and downloaded by the browser for execution.

Even though the script code for PHP, ColdFusion, and ASP is embedded in the HTML code, it's not visible at the browser (when using View/Source). Before sending the page to the browser, the server strips out the script code, executes it, and puts in its place the results of executing that code. For example, a script command to return the current date will be stripped out and the text of the current date will be put in its place in the HTML that's sent to the browser. As such, the location of the script code embedded in the HTML is the position of the execution results on the Web page.

The process is somewhat the same with CGI scripts. CGI is utilized by placing an appropriate HTML tag (called an SSI directive tag) in your HTML code. (The author of the script you wish to use should provide you with the appropriate HTML tag needed to run that script.) When the page is requested by a browser, the server reads the tag (and strips it out), executes the server-located script file that's specified by the tag, and puts in the tag's place the results of the execution of the script file. A common example is a hit counter script. The script execution increments the counter and the text of the resulting count is put in the HTML that's sent to the browser so that it appears on the page in the same place where the SSI directive tag was located.
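To make that substitution step concrete, here is a minimal sketch in Python (used purely for illustration; it is not how any real Web server is written). The tag path /cgi-bin/counter.pl, the starting count, and the run_script function are all made up for the example, and the tag uses the common exec form of an SSI directive, which may differ from the tag your script's author gives you.

```python
import re

# Matches an SSI directive tag of the form <!--#exec cgi="/path/to/script" -->
SSI_EXEC = re.compile(r'<!--#exec\s+cgi="([^"]+)"\s*-->')

hits = {"count": 41}  # pretend this number lives in a file on the server


def run_script(path):
    """Hypothetical stand-in for the server executing a hit counter script."""
    hits["count"] += 1
    return str(hits["count"])


def process_page(html):
    """Replace each SSI directive tag with the output of the script it names."""
    return SSI_EXEC.sub(lambda m: run_script(m.group(1)), html)


page = '<p>Visitors so far: <!--#exec cgi="/cgi-bin/counter.pl" --></p>'
sent_to_browser = process_page(page)
print(sent_to_browser)  # <p>Visitors so far: 42</p>
```

The browser only ever receives the last line's output, so the tag itself never shows up in View/Source.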

If you've ever looked at your browser's settings, you've probably seen check boxes or radio buttons to enable or disable Javascript and Java but haven't seen anything for CGI or PHP. That's because Javascript and Java are executed by the browser (or not, if you disable them). Your browser doesn't know anything about CGI or PHP. It just gets pure HTML from the server after the scripts are executed.

The embedded script method (PHP, ColdFusion, ASP) is mainly used by developers writing "front end" web pages that will access "back end" databases (i.e. client/server Web applications). The big advantage of the CGI method is that the scripts are stored in files and there are literally thousands of freely available scripts already written and ready for you to download and use on your Web site. This means that you don't have to learn a scripting language in order to get the benefits of scripts. Someone has already done the work for you.

You've undoubtedly visited Web pages and seen "cgi-bin" appear in the location line of your browser. CGI stands for "Common Gateway Interface". When you see that "cgi-bin" appear on the location line, you probably executed a CGI script on the server when you requested the page.

Two of the methods shown in the above table have the code executed by the server. But how does the server know to look for a tag which calls a CGI script or to look for embedded script code in a PHP page? It's done using different extensions when naming HTML files. If a browser requests a page (an HTML file) with a .shtml extension, the Web server knows it should "parse" (i.e. look through) the page for a tag which calls a CGI script and execute that script before sending the page to the browser. If the requested page has a .php extension, it knows to look for and execute any embedded PHP code it finds in the page before sending it to the browser.
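The extension check can be pictured as a simple lookup. The sketch below is only an illustration in Python; the HANDLERS table and the wording of each action are assumptions, since real servers let the administrator configure which extensions get parsed.

```python
import os

# Assumed mapping of file extensions to what the server does before sending.
HANDLERS = {
    ".shtml": "parse for SSI directive tags",
    ".shtm": "parse for SSI directive tags",
    ".php": "execute embedded PHP code",
    ".cfm": "execute embedded ColdFusion code",
    ".asp": "execute embedded ASP code",
}


def server_action(filename):
    """Pick the server's action from the requested file's extension."""
    ext = os.path.splitext(filename)[1].lower()
    return HANDLERS.get(ext, "send the file as-is")


for name in ("index.html", "index.shtml", "guestbook.php"):
    print(name, "->", server_action(name))
```

A plain .html page falls through to "send the file as-is", which is why an SSI directive tag in such a page is normally never looked at.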


CGI and Perl

You will often see the term "Perl" used with the term "CGI". The two are NOT the same. CGI programs, or scripts, can be written in a variety of computer languages, including C. CGI is the process by which scripts are run. Perl is the most common language used for writing CGI scripts, and for very good reason. (See the Messin' Around with Perl section below).

Because Perl has its roots in UNIX, many people think that Perl CGI scripts cannot be used on Windows NT Web servers. Not true! Perl CGI scripts can not only run on UNIX and NT servers, but with a little tweaking for "AppleScript", many can run on Macintosh servers as well.


Your CGI

Most ISPs that offer Web space and Website "hosting" companies support the use of CGI scripts. It is so common, in fact, that if your ISP or host provider doesn't offer it, you should consider taking your business elsewhere. The two questions that need to be answered are:
1. Do I have the capability of running my own CGI scripts ?

2. Does my CGI capability include support for SSI (Server Side Includes) ?

Note: Don't confuse SSI with SSL (the Secure Sockets Layer protocol used with browsers); they're two entirely different things.

If the answer to both of these questions is "Yes", you're good to go. You can run most of the scripts available on the Web. There are some scripts that don't require SSI but a lot do, so having SSI support will allow you to run more scripts. However, if you don't have it you can still run some scripts. The documentation (readme file) should come with the script and state whether it requires SSI support or not.

"Server-Side Includes" are just that, commands (aka "directives") to the Web server to include some information the server has in the displayed Web page. A common use of SSI is to display the current date and time on a Web page. These commands are enclosed within HTML comment tags (<!-- and -->) in a Web page so the browser ignores them. These comment tags with an enclosed server command are the "SSI directive tags" that are mentioned below. When these SSI directive tags are used with scripts, the "...information the server has..." is whatever output was generated by the execution of the script. The script will dictate whether this information is displayed on the Web page (as with a hit counter) or written to some file (as with a logger). As an example, here is the HTML (SSI directive) tag for Bestdam Logger:



Back to your CGI situation. Most Web hosting services and ISPs have a Technical Support section on their Website that may also have a "FAQ" (Frequently Asked Questions) page. Snoop around their support pages and see if you can find anything related to CGI and SSI that may answer the above two questions.

If all this talk of Perl, SSI, servers, etc. has you a little confused, here's a diagram to show the inter-relationship of the various components of a typical Internet server. Note that "Apache" and "Sendmail" are like brand names. There are other Web-server and e-mail-server software packages available.

Big-bucks setup, right? WRONG! You can buy a set of Debian Linux CDs for $15 and it includes the Apache (which has CGI and Perl modules) and Sendmail software. Linux will run on an old Pentium with 32 MB of RAM. So if you've got an old system collecting dust and a broadband connection to the Internet, for $15 you can have your own Internet server and eliminate the need for a Web hosting service. Instructions on how to set up a server, and the issues involved with having your own Internet server, are covered at our Beginners Guide To Linux site.

There are both UNIX/Linux and NT versions of Apache. However, most NT servers use IIS (Internet Information Server) which is included as part of the NT Server software. There is also a freeware NT Server e-mail program called Blat that will allow you to send e-mail using scripts. There is a link to the Blat Website on the Bestdam Logger Setup page.

One way to tell if you have CGI capability is if you have a sub-directory (folder) in your root Web directory called cgi-bin. If you do, you very likely do have CGI capability. The only question then is, does your CGI capability include SSI support ? If you didn't find an answer to this on their Website, you will have to check with your host or ISP.



If you don't have a cgi-bin sub-directory, that doesn't necessarily mean you don't have CGI capability. It could simply mean it hasn't been set up yet.

Normally you cannot simply create the cgi-bin sub-directory yourself. It is a special sub-directory that must be set up by a system administrator. However, I have seen hosts that allow you to create the sub-directory using an FTP program and use their Website "admin" function to enable it. If you are able to create it, how to CHMOD this sub-directory is given at the end of the Transferring Files and Permissions section below.

Note that your host provider or ISP may set up a sub-directory called simply cgi rather than cgi-bin. This is the same thing and you would just need to make the necessary changes to any tags you add to your Web pages to run scripts.

In some cases, hosts or ISPs that do not offer CGI capability with their base package will offer it as an "add-on" or optional service for an additional fee.


Three Steps To Using a Script

Once you've established that you do have CGI and SSI capability, and you've downloaded the script you want to use, there are three basic steps you need to take in order to use the script on your Website:

1. Set any options that the script may need

2. Transfer the script's files (the script file itself and any necessary data files) to your Web server and set the permissions

3. Add the script's HTML tag to the page(s) where you want to use the script

Perl scripts typically have a .PL extension, but they may also have a .CGI extension. (Files with other extensions, or no extensions, will likely be data files used by the script.) "Setting options" in scripts is typically just a matter of opening the .PL (or .CGI) file in a text editor and entering values for some of the script's variables. For example, you may need to enter your e-mail address if the script sends e-mail notifications of some event. Information relating to Step 1 (setting options in scripts) is covered in general in the next section, Using Free Scripts Found On The Web.

Note: Some hosts or ISPs may require that scripts have a .cgi extension. It is normally not a problem to just rename the file to comply. If you do so, remember to also change the extension in the HTML tag provided in the script's documentation.

Once the script is all set up and ready to go, the next step is to transfer the script files to your Web server and set the proper "permissions" on the files. These permissions are necessary so your Website visitors can access them properly. Step 2 is covered in detail in the Transferring Files and Permissions section below.

That takes care of the script side of things. The final step is to add the appropriate HTML tag to your Web page (HTML file) to call the script and then transfer that updated page to the server. The documentation that came with the script, or the comments in the script file itself, should include the appropriate tag to use. But remember that you may have to modify this tag. Most tags assume your CGI sub-directory is called cgi-bin. If it's called simply cgi, or something else, make the necessary change in the tag. Step 3 is also covered in detail in the Transferring Files and Permissions section below.
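Modifying a tag is just a matter of swapping the script path inside it. Here is a small Python sketch of the idea; the tag, the counter.pl name, and the adjust_tag helper are all hypothetical and stand in for whatever your script's documentation provides.

```python
def adjust_tag(tag, old_path, new_path):
    """Swap the script path inside a tag, leaving the rest of the tag alone."""
    if old_path not in tag:
        raise ValueError("path %r not found in tag" % old_path)
    return tag.replace(old_path, new_path)


# Hypothetical tag as it might appear in a script's documentation.
documented = '<!--#exec cgi="/cgi-bin/counter.pl" -->'

# Suppose the host calls the directory "cgi" and requires a .cgi extension.
mine = adjust_tag(documented, "/cgi-bin/counter.pl", "/cgi/counter.cgi")
print(mine)  # <!--#exec cgi="/cgi/counter.cgi" -->
```

In practice you would make the same change by hand in a text editor; the point is that only the path between the quotes changes.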

The discussion in the next section (Step 1) will be generic so that it applies to most scripts available on the Web. Steps 2 and 3 (transferring files and adding the tag to an HTML page) are covered in greater detail in a later section using the free Lite Edition of Bestdam Logger as an example. However, once you've seen the process in action you can easily apply it to other scripts.

If you're not familiar with loggers, they collect information about those who visit your Website. Bestdam Logger logs date and time, page viewed, visitor IP address, domain and client info, and the page they came from, called the "referrer". This information can be valuable in answering questions such as

Who (via what domain) is visiting my site ?
Where are they coming from (i.e. who's sending me traffic) ?
What are the peak traffic times ?
Which pages are the most popular ?
Which browser is most often used to view my site ?
Which search engines are "spidering" my site ?
Which search keywords did visitors use to find my site ?

Knowing search keywords can be helpful in determining which META keywords are effective. If the logger has "multi-page support", and you enable the logging function on all of your pages, you can track the paths visitors took through your site.

Taking the above Internet server diagram and adding another box and a few more lines, you can see how your HTML file completes the process.



Using Free Scripts Found On The Web

As mentioned previously, most CGI scripts are written using the Perl language. When a Perl programmer writes a script they may choose to make it freely available to everyone on the Web. However, you should use caution when selecting these free scripts. I have seen many instances where these free scripts do not adequately "lock" files (which is important in a Web environment where multiple people could be viewing the same Web page simultaneously). Some poorly written scripts could also actually pose a security risk by allowing unauthorized access to the server. Unfortunately, there is no easy way for the untrained eye to determine if adequate file locking is used or if the script represents a security hole. And most Websites that offer scripts from third parties for download do not perform any sort of quality control checks. When in doubt, ask your host or ISP to look over the script you want to use. Most would much rather take a couple minutes to evaluate a script than clean up the mess a poorly-written script could cause.
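To see what "locking" a file means, consider a hit counter that two visitors trigger at the same moment. Without a lock, both copies of the script could read the same old count and one hit would be lost. The sketch below shows the general idea in Python rather than Perl; it assumes a UNIX-like server (the fcntl module is not available on Windows), and the count.txt file name is made up.

```python
import fcntl
import os
import tempfile


def increment_counter(path):
    """Add 1 to the number stored in `path` while holding an exclusive lock."""
    # Open for reading and writing, creating the file if it does not exist.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # wait until nobody else holds the lock
        text = f.read().strip()
        count = int(text) + 1 if text else 1
        f.seek(0)
        f.truncate(0)                  # discard the old number
        f.write(str(count))
        f.flush()
        fcntl.flock(f, fcntl.LOCK_UN)  # let the next visitor's request proceed
    return count


counter_file = os.path.join(tempfile.mkdtemp(), "count.txt")
for _ in range(3):
    latest = increment_counter(counter_file)
print(latest)  # 3
```

A script that reads and rewrites a data file with nothing like the two flock calls is the kind of thing worth asking your host or ISP about.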

The Files

Perl scripts typically have a .PL extension, but I have also seen them with a .CGI extension. Files with other extensions, or no extensions, will likely be data files used by the script. Serious Perl script programmers will also include a readme file that contains information about the script and how to set it up. This readme file is intended for those who will be using the script and does not have to be transferred to the server with the script and data files. Throughout this section I refer to .PL files. However, the same would apply to .CGI files if your script has that extension instead.

More complex scripts may have more than one .PL file which may require different HTML tags for each one. (It's also possible that one script may "call" another script so that only one tag is needed.) Also, be on the lookout for additional .PL files with names like config.pl or cfg.pl. These are script files where all of the user-settable options are entered and stored. If a file like this is included in the download file, you typically don't have to open the main script file to set options. The main script will refer to this configuration script each time it is executed.

All of these files are typically combined into a single .ZIP file for you to download from a Website.

The Setup

Perl scripts are simply plain old run-of-the-mill ASCII text files. However, instead of containing sentences that make sense to humans, they mostly contain commands that make sense to servers. There is an advantage to this though. Because Perl scripts are ASCII text files, the Perl programmer can also put human-understandable instructions in the script, and many often do, locating this text right at the beginning (top) of the file. It is easy to spot the information that is meant for you to read because the line will start with a # character.

The # character is the "comment" character in Perl. Any line that begins with a # does not get executed by the server. (There's one exception to this which you will see shortly.) In addition, the programmer can also put a # character after a script statement to add comments. For example, you could see the following statement in a script which acts as a user-settable option:

$counthits = 1; # 1 = Yes 0 = No

Once you have downloaded and un-ZIPped a script that you would like to try, use a text editor like NOTEPAD to open the main .PL file (or the configuration .PL file, if there is one) and check the top of the file for any information or setup instructions. If there is a readme file, open that in a text editor and look for setup instructions also.

One key piece of information you should find either in the comments in the script file, or in the readme file, is the HTML tag you need to add to your Web pages to execute the script. Using the file transfer example in the next section, the tag for Bestdam Logger Lite would be:


Note that the HTML tag provided with some scripts may assume you are putting the script in the cgi-bin sub-directory, not a separate sub-directory under it. If this is the case, and you want to put the script in its own sub-directory, just modify the tag. For example, say you downloaded a script called "GigCount" and the tag specified in the script's comments or documentation was:


If you wanted to put the script in its own sub-directory named "counter" you would simply modify the tag to


CGI scripts that do not require SSI might have a more common type of tag. For instance, a script to take on-line polls may use a link to execute the script. In this case the tag would be something like

Vote here

In addition to a tag, the information near the top of the script file or readme file should also contain instructions for you on what values to use to CHMOD the files (i.e. set the permissions), as well as setting any user-settable options the script may offer.
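If CHMOD values are new to you, the following Python sketch shows what a three-digit value turns into. The value 755 is used only because it is commonly seen for scripts; it is an assumption here, not the value any particular script requires, so always use the values given in your script's instructions. The counter.pl file is a throwaway created just for the demonstration.

```python
import os
import stat
import tempfile

# Stand-in for an uploaded script; the file name is hypothetical.
script = os.path.join(tempfile.mkdtemp(), "counter.pl")
with open(script, "w") as f:
    f.write("#!/usr/bin/perl\n")

os.chmod(script, 0o755)  # same effect as the shell command: chmod 755 counter.pl

mode = stat.filemode(os.stat(script).st_mode)
print(mode)  # -rwxr-xr-x
```

Reading the result left to right after the leading dash: the owner can read, write, and execute (7), while the group and everyone else can read and execute (5 and 5).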

The very first line of any Perl script is a user-settable option and is always going to be the path to your host's Perl installation, preceded by the characters "#!". This line is commonly referred to as the "shebang". Typical shebangs can be:

#!/usr/bin/perl (often the Perl 4 location)
#!/usr/local/bin/perl (often the Perl 5 location)

Note that Perl 4 scripts will work with a path to a Perl 5 installation but the reverse may not be true. If you're having problems getting a script to work with the first one, try the second. Note also that this shebang line may not be necessary with Windows NT servers.

If you haven't yet done so, now would be a good time to snoop around your host's or ISP's Website Technical Support pages and FAQs looking for anything related to "CGI". There you may find not only the path to their Perl installation (and possibly the version of Perl they have installed), but to their e-mail programs and other paths as well. If you can't verify this information, just leave it at the default value, but verify it with your host or ISP if you run into problems trying to use the script.
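Before changing a shebang it helps to know what the script currently says. The Python sketch below pulls the interpreter path out of a script's first line; the shebang_interpreter helper and the sample script text are illustrative only.

```python
def shebang_interpreter(script_text):
    """Return the interpreter path from the first line, or None if no shebang."""
    first_line = script_text.splitlines()[0] if script_text else ""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


sample = "#!/usr/bin/perl\n$counthits = 1; # 1 = Yes 0 = No\n"
print(shebang_interpreter(sample))  # /usr/bin/perl
```

If the path printed does not match the one your host or ISP gives for their Perl installation, that first line is the one to edit.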

If you know how to use the telnet program to access the shell of your UNIX/Linux server, you can use the whereis command to find out the paths to your Perl and sendmail installations. At the shell prompt, simply type in

whereis perl

and

whereis sendmail

and the paths will be displayed. (If whereis gives you an error message, try using which in its place.) Note that you will often get multiple paths displayed, some ending with things like /perl5.003 and /sendmail.cf, but you are only interested in the paths that end with /perl and /sendmail, i.e. with no extensions. There may even be multiple path listings to these, but that just means there are different versions installed.
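If your server happens to have Python installed (an assumption; many hosts will not), the same lookup that which performs can be sketched in a few lines. Note that this only searches the directories on your PATH, so a program such as sendmail that lives elsewhere may be reported as not found even when it is installed.

```python
import shutil

for program in ("perl", "sendmail"):
    location = shutil.which(program)  # full path, or None if not on the PATH
    print(program, "->", location if location else "not found on PATH")
```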

You may also be able to find the system path (see the paragraph below) to your root Web directory by using the pwd command (print working directory).

Setting options usually involves entering values for script variables. These values can be a '1', 'Y', 'y', 'YES', etc. to enable an option and a '0', 'N', 'n', 'NO', etc. to disable it. Certain user or system information may be needed for some variables. You may be asked to enter path information or an e-mail address. For example, near the top of the Bestdam Logger Lite file you are asked for your e-mail address (so the site visitor data can be e-mailed to you) and the path to your server's e-mail program. The comments in the script or configuration file should clearly indicate what the option is and what the valid optional values are.

Some scripts will ask you to enter the system path. This is not the URL. The system path is the path from the root of the server which is hosting your site and will look something like this:
/usr/local/etc/usersites/(your website identifier)/
You'll have to ask your host or ISP if you don't know what your path is. The problem is that they may not be too quick to give this information up for security reasons. If that's the case, there's not much you can do. You can try asking them to take a look at the script you want to use so they can see how the path is used.
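The difference between a URL and a system path can be sketched as follows. Everything specific here is hypothetical: the DOCUMENT_ROOT value uses "example" in place of your Website identifier, and www.example.com and counter.pl are placeholders. On some servers the CGI directory is also mapped from a location outside the Web directory, so treat this as a picture of the idea rather than a way to work out your real path.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical value; only your host or ISP knows the real system path.
DOCUMENT_ROOT = "/usr/local/etc/usersites/example/"


def system_path(url, document_root=DOCUMENT_ROOT):
    """Map the path part of a URL onto the server's file system."""
    url_path = urlparse(url).path.lstrip("/")
    return posixpath.join(document_root, url_path)


print(system_path("http://www.example.com/cgi-bin/counter.pl"))
# /usr/local/etc/usersites/example/cgi-bin/counter.pl
```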

If some of what you read at the beginning of the .PL file doesn't make any sense to you, don't feel like you're doing something wrong. Many Perl programmers will write these comments for other Perl programmers, or worse, for others who are well-versed in UNIX. If you find that the instructions are not clearly written for those without Perl programming or UNIX experience, and there is no accompanying readme file with easy-to-understand instructions, you may want to forget about using that script and find something with clearly written, understandable instructions.

Detailed setup instructions for Bestdam Logger Lite are given on the Setup & Installation page as well as in the readme.txt file contained in the .ZIP download file available on the Features & Download page. However, the top of the bdlogger.pl file is heavily commented with instructions for setting the options so you may be able to set them just by opening the file in a text editor and reading the comments.

With everything set up in the script file it's time to transfer the files and, with UNIX/Linux servers, set the permissions (detailed in the next section). The most common mistake people make when using ftp to transfer script files to the server is not using ASCII mode to transfer the files. Be sure to use ASCII mode when transferring the script files ! A script will not work if it is transferred in binary mode. You will see how to do this in the next section.
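The usual reason behind the ASCII-mode rule (an explanation added here, not one taken from any script's documentation) is line endings: a file edited on Windows ends each line with a carriage return plus a line feed, and a binary transfer copies those bytes untouched, so a UNIX server sees a stray character at the end of the shebang line. An ASCII-mode transfer converts the line endings for you. The Python sketch below shows the difference; the sample script bytes are made up.

```python
def has_dos_line_endings(data):
    """True if the raw bytes contain Windows-style CRLF line endings."""
    return b"\r\n" in data


def to_unix_line_endings(data):
    """Convert CRLF to LF, roughly what an ASCII-mode transfer to UNIX does."""
    return data.replace(b"\r\n", b"\n")


edited_on_windows = b"#!/usr/bin/perl\r\n$counthits = 1;\r\n"
converted = to_unix_line_endings(edited_on_windows)

print(has_dos_line_endings(edited_on_windows))  # True
print(has_dos_line_endings(converted))          # False
```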

Troubleshooting

The vast majority of script problems are due to configuration issues. When you consider that there are so many different flavors of UNIX out there, multiplied by the number of possible configuration options, multiplied by the number of Web server software packages out there, multiplied by the configuration options they have, you can see why this is the case. That's not to say you should go running to your host or ISP if your script doesn't work right off. The truth is, trying to solve problems is one of the best learning experiences there is. There are a lot of things you can check, and if you do go to your host or ISP and say "I have checked....." they'll know you have some knowledge of what you're doing and take you seriously.

If you've set up and installed a script and you feel it may not be working, there are certain steps you can take using your browser to try and track down the problem.

Given the large number of possible server configurations, check your host's or ISP's Website tech support pages to see if any of these CGI restrictions apply:

If you're using a script that requires SSI, the Web page(s) with the SSI directive tag may be required to have a .shtml or .shtm extension. Check with them regarding any such requirement. This could present a problem if your page is already indexed or referenced on a lot of search engines or directories with a .html or .htm extension. However, there may be a way around this requirement if your host allows the use of an .htaccess file. Details on how to use a .htaccess file are given on the About htaccess & XBitHack companion page.

You may need to use a relative path in your tag. For example,

instead of use

Scripts may be required to have a .cgi rather than a .pl extension. This should not be a concern. It is normally not a problem to just rename the file (and make the corresponding modification to the tag) to comply.

If, when you try and view the page containing the script's tag, a "server error" Web page comes up telling you to contact the system administrator, try using the alternative Perl path. I've seen cases where switching from the Perl 4 path to the Perl 5 path clears this up.

If the script is called with an SSI directive tag, bring up the page with the tag in your browser and view the page source (with Netscape click on "View" on the menu bar and select "Page source") and look in the HTML code for the SSI directive tag.

If the SSI directive tag does show up in the page source listing, it is not being processed by the server.

If it does not show up, the server is trying to execute the script, so it is likely bombing during execution due to the way it's configured.

Symptoms and possible causes of problems are outlined on the Bestdam Logger Help page but most of the information would apply to other scripts as well.
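The page-source check described above boils down to looking for a leftover directive in what the browser received. Here is that test as a small Python sketch; the received string is a made-up page and the counter.pl path is hypothetical.

```python
import re

# An SSI directive tag starts with "<!--#" followed by the command name.
SSI_TAG = re.compile(r"<!--#\w+")


def ssi_tag_unprocessed(page_source):
    """True if an SSI directive tag is still visible in the received page."""
    return bool(SSI_TAG.search(page_source))


received = '<p>Hits: <!--#exec cgi="/cgi-bin/counter.pl" --></p>'
if ssi_tag_unprocessed(received):
    print("Tag still present: the server is not parsing this page for SSI.")
else:
    print("Tag gone: the server ran (or tried to run) the script.")
```

Ordinary HTML comments do not trip the check, because they lack the # that immediately follows the opening of a directive tag.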

You may also have another troubleshooting tool at your disposal. If your host or ISP generates an individual error log for your domain, it will likely contain error messages which will indicate what problems were encountered when the server tried to execute the script. The error logs are typically just ASCII text files so you can ftp them to your local hard-drive and open them using a text editor.
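Error logs can be long, so it helps to pull out only the lines that mention your script. A minimal Python sketch follows; the log lines are invented for illustration (real formats vary by server) and counter.pl is a hypothetical script name.

```python
def lines_mentioning(log_lines, script_name):
    """Return only the log lines that mention the given script file name."""
    return [line.rstrip("\n") for line in log_lines if script_name in line]


# Made-up log lines for illustration only.
sample_log = [
    "[error] File does not exist: /favicon.ico\n",
    "[error] Premature end of script headers: counter.pl\n",
]
for line in lines_mentioning(sample_log, "counter.pl"):
    print(line)
```

With a real log you would read the lines from the downloaded file instead of the sample_log list.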