Apr 7, 2023

HTaccess

.htaccess is a very old configuration file that controls the Web Server running your website, and is one of the most powerful configuration files you will ever come across. .htaccess can control access and settings for the HyperText Transfer Protocol (HTTP) through Password Protection, 301 Redirects, rewrites, and much more. The format dates back to the earliest days of the web, when it was written for one of the first Web Servers. The sites served by those early Web Servers became known as the World Wide Web, which runs on top of the Internet we use today.

Table of Contents

  1. Introduction
    1. Htaccess - Evolved
    2. AskApache Htaccess Journey
    3. What Is .htaccess
      1. Creating Htaccess Files
      2. Htaccess Scope
    4. Htaccess File Syntax
    5. Htaccess Directives
    6. Main Server Config Examples
    7. Example .htaccess Code Snippets
      1. Redirect Everyone Except IP address to alternate page
      2. When developing sites
      3. Fix double-login prompt
      4. Set Timezone of the Server (GMT)
      5. Administrator Email for ErrorDocument
      6. ServerSignature for ErrorDocument
      7. Charset and Language headers
      8. Disallow Script Execution
      9. Deny Request Methods
      10. Force "File Save As" Prompt
      11. Show CGI Source Code
      12. Serve all .pdf files on your site using .htaccess and mod_rewrite with the php script.
      13. Rewrite to www
      14. Rewrite to www dynamically
      15. 301 Redirect Old File
      16. 301 Redirect Entire Directory
      17. Protecting your php.cgi
      18. Set Cookie based on Request
      19. Set Cookie with env variable
      20. Custom ErrorDocuments
      21. Implementing a Caching Scheme with .htaccess
      22. Password Protect single file
      23. Password Protect multiple files
      24. Send Custom Headers
      25. Blocking based on User-Agent Header
      26. Blocking with RewriteCond
      27. .htaccess for mod_php
      28. .htaccess for php as cgi
      29. Shell wrapper for custom php.ini
      30. Add values from HTTP Headers
      31. Stop hotlinking
    8. Example .htaccess Files
    9. Advanced Mod_Rewrites
      1. Directory Protection
      2. Password Protect wp-login.php
      3. Password Protect wp-admin
      4. Protect wp-content
      5. Protect wp-includes
      6. Common Exploits
      7. Stop Hotlinking
      8. Safe Request Methods
      9. Forbid Proxies
      10. Real wp-comments-post.php
      11. HTTP PROTOCOL
      12. SPECIFY CHARACTERS
      13. BAD Content Length
      14. BAD Content Type
      15. Missing HTTP_HOST
      16. Bogus Graphics Exploit
      17. No UserAgent, Not POST
      18. No Referer, No Comment
      19. Trackback Spam
      20. Map all URIs except those corresponding to existing files to a handler
      21. Map any request to a handler
      22. And for CGI scripts:
      23. Map URIs corresponding to existing files to a handler instead
      24. Deny access if var=val contains the string foo.
      25. Removing the Query String
      26. Adding to the Query String
      27. Rewriting For Certain Query Strings
      28. Modifying the Query String
    10. Best .htaccess Articles
      1. .htaccess for Webmasters
      2. Mod_Rewrite URL Rewriting
      3. 301 Redirects without mod_rewrite
      4. Secure PHP with .htaccess
      5. .htaccess Cookie Manipulation
      6. .htaccess Caching
      7. Password Protection and Authentication
      8. Control HTTP Headers
      9. Blocking Spam and bad Bots
      10. PHP htaccess tips
      11. HTTP to HTTPS Redirects with mod_rewrite
      12. SSL in .htaccess
      13. SetEnvIf and SetEnvIfNoCase in .htaccess
      14. Site Security with .htaccess
      15. Merging Notes
    11. My Favorite .htaccess Links
    12. Htaccess Directives
    13. Htaccess Variables
    14. Htaccess Modules
    15. Htaccess Software
    16. Technical Look at .htaccess
      1. Per-directory configuration structures
      2. Command handling
        1. mod_autoindex
        2. mod_rewrite
      3. Side notes --- per-server configuration, virtual servers, etc.
      4. Litespeed Htaccess support

For Help see HTaccess.guru

Aug 17, 2016

Htaccess Guide

htaccess is a very ancient configuration file that controls the Web Server running your website, and is one of the most powerful configuration files you will ever come across. To learn how to take htaccess to the next level, beyond what you thought possible, memorize this: Htaccess Guide





Common Htaccess error, more at

What Is .htaccess

Specifically, .htaccess is the default file name of a special configuration file that provides a number of directives (commands) for controlling and configuring the Apache Web Server, and also to control and configure modules that can be built into the Apache installation, or included at run-time like mod_rewrite (for htaccess rewrite), mod_alias (for htaccess redirects), and mod_ssl (for controlling SSL connections).
Htaccess allows for decentralized management of Web Server configurations, which makes life very easy for web hosting companies and especially for their savvy customers. Hosting companies set up and run "server farms" where hundreds or even thousands of web hosting customers are all put on the same Apache Server. This type of hosting is called "virtual hosting", and without .htaccess files every customer would have to use exactly the same settings as everyone else on their segment. That is why any half-decent web host (DreamHost, Powweb, MediaTemple, GoDaddy) enables .htaccess files, though few people are aware of it. Let's just say that if I was a customer on your server-farm, and .htaccess files were enabled, my websites would be a LOT faster than yours, as these configuration files allow you to take full advantage of the resources allotted to you by your host. If even 1/10 of the sites on a server-farm took advantage of what they are paying for, the providers would go out of business.

My .htaccess files are being ignored.

This is almost always due to your AllowOverride directive being set incorrectly for the directory in question. If it is set to None then .htaccess files will not even be looked for. That is a good thing. If you have access to edit the httpd.conf, you should not use .htaccess files, ever. If your customers do need support for .htaccess, make sure that AllowOverride is set to something sensible (i.e., not All). Be certain it covers the directory you are trying to use the .htaccess file in. This is normally accomplished by ensuring it is inside the proper <Directory> container.

You can tell if this is your problem by adding nonsense text to your .htaccess file and reloading the page. If you do not get a server error, then Apache httpd is not reading your .htaccess file.

Mar 4, 2009

What is CGI

A lot of people have Web pages but most feel that CGI scripts are "over their head". Nonsense! If you know basic HTML and know how to use an FTP program like WS_FTP to transfer files, chances are you can be using a CGI script on your Web pages in about 15 minutes. With so many free CGI scripts available (including Bestdam Logger Lite) you are really short-changing yourself if you are not using the CGI capabilities offered by your ISP or host provider to use CGI scripts.

While there are a lot of "CGI Tutorial" pages out there, most deal with how to write CGI scripts. For those who just want to know how to use CGI scripts, information is pretty scarce. That's why this page was created.

What Is CGI ?


There are a number of different methods Web developers can use to enhance the content of Web pages over and above what simple HTML provides. Most of these methods involve writing little programs or routines using one scripting language or another. There are two basic differences between these methods:

Where is the script code located ?
Where is the script code executed ?

The following table summarizes the above two points for the various methods:

Method                Where is the script code located ?                Where is the script code executed ?  Required HTML file extensions
--------------------  ------------------------------------------------  -----------------------------------  -----------------------------
CGI                   In files in the CGI-BIN directory on the server.  On the server.                       .shtm or .shtml
PHP, ColdFusion, ASP  Embedded in the HTML document.                    On the server.                       .php .cfm .asp
Javascript            Embedded in the HTML document.                    On the user's PC by their browser.   n/a
Java                  In files on the server.                           On the user's PC by their browser.   n/a

Note: Java is not a scripting language. Java files are pre-compiled "applets". The applet files are stored on the server and downloaded by the browser for execution.

Even though the script code for PHP, ColdFusion, and ASP is embedded in the HTML code, it's not visible at the browser (when using View/Source). Before sending the page to the browser, the server strips out the script code, executes it, and puts in its place the results of executing that code. For example, a script command to return the current date will be stripped out and the text of the current date will be put in its place in the HTML that's sent to the browser. As such, the location of the script code embedded in the HTML is the position of the execution results on the Web page.

The process is somewhat the same with CGI scripts. CGI is utilized by placing an appropriate HTML tag (called an SSI directive tag) in your HTML code. (The author of the script you wish to use should provide you with the appropriate HTML tag needed to run that script.) When the page is requested by a browser the server reads the tag (and strips it out), executes the server-located script file that's specified by the tag, and puts in the tag's place the results of the execution of the script file. A common example is a hit counter script. The script execution increments the counter and the text of the resulting count is put in the HTML that's sent to the browser so that it appears on the page in the same place where the SSI directive tag was located.

If you've ever looked at your browser's settings, you've probably seen check boxes or radio buttons to enable or disable Javascript and Java but haven't seen anything for CGI or PHP. That's because Javascript and Java are executed by the browser (or not, if you disable them). Your browser doesn't know anything about CGI or PHP. It just gets pure HTML from the server after the scripts are executed.

The embedded script method (PHP, ColdFusion, ASP) is mainly used by developers writing "front end" web pages that will access "back end" databases (i.e. client/server Web applications). The big advantage of the CGI method is that the scripts are stored in files and there are literally thousands of freely available scripts already written and ready for you to download and use on your Web site. This means that you don't have to learn a scripting language in order to get the benefits of scripts. Someone has already done the work for you.

You've undoubtedly visited Web pages and seen "cgi-bin" appear in the location line of your browser. CGI stands for "Common Gateway Interface". When you see that "cgi-bin" appear on the location line, you probably executed a CGI script on the server when you requested the page.

Two of the methods shown in the above table have the code executed by the server. But how does the server know to look for a tag which calls a CGI script or to look for embedded script code in a PHP page? It's done using different extensions when naming HTML files. If a browser requests a page (an HTML file) with a .shtml extension, the Web server knows it should "parse" (i.e. look through) the page for a tag which calls a CGI script and execute that script before sending the page to the browser. If the requested page has a .php extension, it knows to look for and execute any embedded PHP code it finds in the page before sending it to the browser.
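As a rough illustration of the parsing step described above, here is a small Python sketch of what a server conceptually does with a .shtml page. It is a toy model, not how any real Web server is implemented: the function names, the SCRIPTS table and the /cgi-bin/hitcounter.pl path are all hypothetical, and only the `<!--#exec cgi="..." -->` form of directive is handled.

```python
import re

# Hypothetical stand-ins for script files on the server: each entry maps a
# script path to a function that returns that script's output.
_hits = {"count": 0}

def hit_counter():
    _hits["count"] += 1
    return str(_hits["count"])

SCRIPTS = {"/cgi-bin/hitcounter.pl": hit_counter}

# Matches directives of the form <!--#exec cgi="/path/to/script" -->
SSI_EXEC = re.compile(r'<!--#exec\s+cgi="([^"]+)"\s*-->')

def serve(filename, html):
    """Return the HTML a browser would receive for this file."""
    # Only pages with an SSI extension are parsed; others go out untouched.
    if not filename.endswith((".shtml", ".shtm")):
        return html

    def run(match):
        script = SCRIPTS.get(match.group(1))
        # Replace the tag with the script's output (or an error marker).
        return script() if script else "[an error occurred]"

    return SSI_EXEC.sub(run, html)

page = 'You are visitor <!--#exec cgi="/cgi-bin/hitcounter.pl" -->.'
print(serve("index.shtml", page))  # tag replaced by the current count
print(serve("index.html", page))   # tag left in place, page not parsed
```

The same page text gives two different results depending only on the file extension, which is the point made in the paragraph above.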


CGI and Perl

You will often see the term "Perl" used with the term "CGI". The two are NOT the same. CGI programs, or scripts, can be written in a variety of computer languages, including C. CGI is the process by which scripts are run. Perl is the most common language used for writing CGI scripts, and for very good reason. (See the Messin' Around with Perl section below).
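To show that CGI is a way of running programs rather than a language, here is a minimal sketch of a CGI program written in Python instead of Perl. It assumes the server passes the standard CGI environment variables REQUEST_METHOD and QUERY_STRING; the function name build_response is made up for this example.

```python
import html
import os

def build_response(environ):
    """Build a complete CGI response: headers, a blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    body = (
        "<html><body>"
        f"<p>Method: {html.escape(method)}</p>"
        f"<p>Query string: {html.escape(query)}</p>"
        "</body></html>"
    )
    # The blank line after the header block tells the server where the
    # headers end and the page content begins.
    return "Content-Type: text/html\n\n" + body

if __name__ == "__main__":
    # A CGI program simply writes its response to standard output.
    print(build_response(os.environ))
```

Whatever the language, the pattern is the same: read the request details from the environment, then print a header block, a blank line, and the output.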

Because Perl has its roots in UNIX, many people think that Perl CGI scripts cannot be used on Windows NT Web servers. Not true! Perl CGI scripts can not only run on UNIX and NT servers, but with a little tweaking for "AppleScript", many can run on Macintosh servers as well.


Your CGI

Most ISPs that offer Web space and Website "hosting" companies support the use of CGI scripts. It is so common in fact, that if your ISP or host provider doesn't offer it, you should consider taking your business elsewhere. The two questions that need to be answered are:
1. Do I have the capability of running my own CGI scripts ?

2. Does my CGI capability include support for SSI (Server Side Includes) ?

Note: Don't confuse SSI with SSL (the Secure Sockets Layer protocol used with browsers); they're two entirely different things.

If the answer to both of these questions is "Yes", you're good to go. You can run most of the scripts available on the Web. There are some scripts that don't require SSI but a lot do, so having SSI support will allow you to run more scripts. However, if you don't have it you can still run some scripts. The documentation (readme file) should come with the script and state whether it requires SSI support or not.

"Server-Side Includes" are just that, commands (aka "directives") to the Web server to include some information the server has in the displayed Web page. A common use of SSI is to display the current date and time on a Web page. These commands are enclosed within HTML comment tags (<!-- -->) in a Web page so the browser ignores them. These comment tags with an enclosed server command are the "SSI directive tags" that are mentioned below. When these SSI directive tags are used with scripts, the "...information the server has..." is whatever output was generated by the execution of the script. The script will dictate whether this information is displayed on the Web page (as with a hit counter) or written to some file (as with a logger). As an example, here is the HTML (SSI directive) tag for Bestdam Logger:



Back to your CGI situation. Most Web hosting services and ISPs have a Technical Support section on their Website that may also have a "FAQ" (Frequently Asked Questions) page. Snoop around their support pages and see if you can find anything related to CGI and SSI that may answer the above two questions.

If all this talk of Perl, SSI, servers, etc. has you a little confused, here's a diagram to show the inter-relationship of the various components of a typical Internet server. Note that "Apache" and "Sendmail" are like brand names. There are other Web-server and e-mail-server software packages available.

Big-bucks setup, right? WRONG! You can buy a set of Debian Linux CDs for $15 and it includes the Apache (which has CGI and Perl modules) and Sendmail software. Linux will run on an old Pentium with 32 MB of RAM. So if you've got an old system collecting dust and a broadband connection to the Internet, for $15 you can have your own Internet server and eliminate the need for a Web hosting service. Instructions on how to set up a server, and the issues involved with having your own Internet server, are covered at our Beginners Guide To Linux site.

There are both UNIX/Linux and NT versions of Apache. However, most NT servers use IIS (Internet Information Server) which is included as part of the NT Server software. There is also a freeware NT Server e-mail program called Blat that will allow you to send e-mail using scripts. There is a link to the Blat Website on the Bestdam Logger Setup page.

One way to tell if you have CGI capability is if you have a sub-directory (folder) in your root Web directory called cgi-bin. If you do, you very likely do have CGI capability. The only question then is, does your CGI capability include SSI support ? If you didn't find an answer to this on their Website, you will have to check with your host or ISP.



If you don't have a cgi-bin sub-directory that doesn't necessarily mean you don't have CGI capability. It could mean it hasn't been set up.

Normally you cannot simply create the cgi-bin sub-directory yourself. It is a special sub-directory that must be set up by a system administrator. However, I have seen hosts that allow you to create the sub-directory using an FTP program and use their Website "admin" function to enable it. If you are able to create it, how to CHMOD this sub-directory is given at the end of the Transferring Files and Permissions section below.

Note that your host provider or ISP may set up a sub-directory called simply cgi rather than cgi-bin. This is the same thing and you would just need to make the necessary changes to any tags you add to your Web pages to run scripts.

In some cases, hosts or ISPs that do not offer CGI capability with their base package will offer it as an "add-on" or optional service for an additional fee.


Three Steps To Using a Script

Once you've established that you do have CGI and SSI capability, and you've downloaded the script you want to use, there are three basic steps you need to take in order to use the script on your Website:

1. Set any options that the script may need

2. Transfer the script's files (the script file itself and any necessary data files) to your Web server and set the permissions

3. Add the script's HTML tag to the page(s) where you want to use the script

Perl scripts typically have a .PL extension, but they may also have a .CGI extension. (Files with other extensions, or no extensions, will likely be data files used by the script.) "Setting options" in scripts is typically just a matter of opening the .PL (or .CGI) file in a text editor and entering values for some of the script's variables. For example, you may need to enter your e-mail address if the script sends e-mail notifications of some event. Information relating to Step 1 (setting options in scripts) is covered in general in the next section, Using Free Scripts Found On The Web.

Note: Some hosts or ISPs may require that scripts have a .cgi extension. It is normally not a problem to just rename the file to comply. If you do so, remember to also change the extension in the HTML tag provided in the script's documentation.

Once the script is all set up and ready to go, the next step is to transfer the script files to your Web server and set the proper "permissions" on the files. These permissions are necessary so your Website visitors can access them properly. Step 2 is covered in detail in the Transferring Files and Permissions section below.

That takes care of the script side of things. The final step is to add the appropriate HTML tag to your Web page (HTML file) to call the script and then transfer that updated page to the server. The documentation that came with the script, or the comments in the script file itself, should include the appropriate tag to use. But remember that you may have to modify this tag. Most tags assume your CGI sub-directory is called cgi-bin. If it's called simply cgi, or something else, make the necessary change in the tag. Step 3 is also covered in detail in the Transferring Files and Permissions section below.

The discussion in the next section (Step 1) will be generic so that it applies to most scripts available on the Web. Steps 2 and 3 (transferring files and adding the tag to an HTML page) are covered in greater detail in a later section using the free Lite Edition of Bestdam Logger as an example. However, once you've seen the process in action you can easily apply it to other scripts.

If you're not familiar with loggers, they collect information about those who visit your Website. Bestdam Logger logs date and time, page viewed, visitor IP address, domain and client info, and the page they came from, called the "referrer". This information can be valuable in answering questions such as

Who (via what domain) is visiting my site ?
Where are they coming from (i.e. who's sending me traffic) ?
What are the peak traffic times ?
Which pages are the most popular ?
Which browser is most often used to view my site ?
Which search engines are "spidering" my site ?
Which search keywords did visitors use to find my site ?

Knowing search keywords can be helpful in determining which META keywords are effective. If the logger has "multi-page support", and you enable the logging function on all of your pages, you can track the paths visitors took through your site.

Taking the above Internet server diagram and adding another box and a few more lines, you can see how your HTML file completes the process.







Using Free Scripts Found On The Web

As mentioned previously, most CGI scripts are written using the Perl language. When a Perl programmer writes a script they may choose to make it freely available to everyone on the Web. However, you should use caution when selecting these free scripts. I have seen many instances where these free scripts do not adequately "lock" files (which is important in a Web environment where multiple people could be viewing the same Web page simultaneously). Some poorly written scripts could also actually pose a security risk by allowing unauthorized access to the server. Unfortunately, there is no easy way for the untrained eye to determine if adequate file locking is used or if the script represents a security hole. And most Websites that offer scripts from third parties for download do not perform any sort of quality control checks. When in doubt, ask your host or ISP to look over the script you want to use. Most would much rather take a couple minutes to evaluate a script than clean up the mess a poorly-written script could cause.
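To make the file-locking point concrete, here is a minimal Python sketch of a hit counter that takes an exclusive lock before it reads and rewrites its count file. It assumes a UNIX/Linux server (the fcntl module is not available on Windows), and the function name and count.txt file name are made up for this example. A Perl script would normally do the equivalent with Perl's own flock function.

```python
import fcntl
import os

def increment_counter(path):
    """Add one to the number stored in the file at path and return it."""
    # Open for reading and writing, creating the file if it does not exist.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Block until no other process holds the lock on this file.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            text = f.read().strip()
            count = (int(text) if text else 0) + 1
            # Rewrite the file from the start with the new count.
            f.seek(0)
            f.truncate()
            f.write(str(count))
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return count

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as folder:
        count_file = os.path.join(folder, "count.txt")
        print(increment_counter(count_file), increment_counter(count_file))
```

Without the lock, two visitors arriving at the same moment could both read the same old count and one of the hits would be lost, or the file could be left half-written.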

The Files

Perl scripts typically have a .PL extension, but I have also seen them with a .CGI extension. Files with other extensions, or no extensions, will likely be data files used by the script. Serious Perl script programmers will also include a readme file that contains information about the script and how to set it up. This readme file is intended for those who will be using the script and does not have to be transferred to the server with the script and data files. Throughout this section I refer to .PL files. However, the same would apply to .CGI files if your script has that extension instead.

More complex scripts may have more than one .PL file which may require different HTML tags for each one. (It's also possible that one script may "call" another script so that only one tag is needed.) Also, be on the lookout for additional .PL files with names like config.pl or cfg.pl. These are script files where all of the user-settable options are entered and stored. If a file like this is included in the download file, you typically don't have to open the main script file to set options. The main script will refer to this configuration script each time it is executed.

All of these files are typically combined into a single .ZIP file for you to download from a Website.

The Setup

Perl scripts are simply plain old run-of-the-mill ASCII text files. However, instead of containing sentences that make sense to humans, they mostly contain commands that make sense to servers. There is an advantage to this though. Because Perl scripts are ASCII text files, the Perl programmer can also put human-understandable instructions in the script, and many often do, locating this text right at the beginning (top) of the file. It is easy to spot the information that is meant for you to read because each such line will start with the # character.

The # character is the "comment" character in Perl. Any line that begins with a # does not get executed by the server. (There's one exception to this which you will see shortly.) In addition, the programmer can also put a # character after a script statement to add comments. For example, you could see the following statement in a script which acts as a user-settable option:

$counthits = 1; # 1 = Yes 0 = No

Once you have downloaded and un-ZIPped a script that you would like to try, use a text editor like NOTEPAD to open the main .PL file (or the configuration .PL file, if there is one) and check the top of the file for any information or setup instructions. If there is a readme file, open that in a text editor and look for setup instructions also.

One key piece of information you should find either in the comments in the script file, or in the readme file, is the HTML tag you need to add to your Web pages to execute the script. Using the file transfer example in the next section, the tag for Bestdam Logger Lite would be:


Note that the HTML tag provided with some scripts may assume you are putting the script in the cgi-bin sub-directory, not a separate sub-directory under it. If this is the case, and you want to put the script in its own sub-directory, just modify the tag. For example, say you downloaded a script called "GigCount" and the tag specified in the script's comments or documentation was:


If you wanted to put the script in its own sub-directory named "counter" you would simply modify the tag to


CGI scripts that do not require SSI might have a more common type of tag. For instance, a script to take on-line polls may use a link to execute the script. In this case the tag would be something like

Vote here

In addition to a tag, the information near the top of the script file or readme file should also contain instructions for you on what values to use to CHMOD the files (i.e. set the permissions), as well as setting any user-settable options the script may offer.
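As a sketch of what setting permissions amounts to, the following Python snippet applies a mode to a file and reports the result. It assumes you can run Python on the server itself (with FTP-only access you would use your FTP program's CHMOD feature instead). The default mode of 755 is only a commonly used value for executable scripts and is an assumption here; the values given in your script's own documentation take priority.

```python
import os
import stat

def make_executable(path, mode=0o755):
    """Set the permission bits on path and return them in ls -l style."""
    # 0o755 = owner may read/write/execute, everyone else may read/execute.
    os.chmod(path, mode)
    return stat.filemode(os.stat(path).st_mode)

if __name__ == "__main__":
    import tempfile
    with tempfile.NamedTemporaryFile(suffix=".pl") as script:
        print(make_executable(script.name))
```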

The very first line of any Perl script is a user-settable option and is always going to be the path to your host's Perl installation, preceded by the characters "#!". This line is commonly referred to as the "shebang". Typical shebangs can be:

#!/usr/bin/perl (often the Perl 4 location)
#!/usr/local/bin/perl (often the Perl 5 location)

Note that Perl 4 scripts will work with a path to a Perl 5 installation but the reverse may not be true. If you're having problems getting a script to work with the first one, try the second. Note also that this shebang line may not be necessary with Windows NT servers.

If you haven't yet done so, now would be a good time to snoop around your host's or ISP's Website Technical Support pages and FAQs looking for anything related to "CGI". There you may find not only the path to their Perl installation (and possibly the version of Perl they have installed), but to their e-mail programs and other paths as well. If you can't verify this information, just leave it at the default value, but verify it with your host or ISP if you run into problems trying to use the script.
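If you have shell access, a shebang line can be checked mechanically. The Python sketch below reads the first line of a script and tests whether the interpreter it names exists and is executable on the machine where the check runs. The function names are made up for this example, and it has to be run on the server itself to say anything about the server's Perl path.

```python
import os

def shebang_interpreter(script_path):
    """Return the interpreter path from a script's first line, or None."""
    with open(script_path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        return None
    # Drop the "#!" and keep only the path, ignoring options such as -w.
    parts = first[2:].decode("ascii", "replace").split()
    return parts[0] if parts else None

def shebang_is_valid(script_path):
    """True if the shebang names a file that exists and is executable."""
    interpreter = shebang_interpreter(script_path)
    return bool(interpreter) and os.access(interpreter, os.X_OK)
```

For a script whose first line is #!/usr/bin/perl -w, shebang_interpreter returns /usr/bin/perl, and shebang_is_valid reports whether that file can actually be executed.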

If you know how to use the telnet program to access the shell of your UNIX/Linux server, you can use the whereis command to find out the paths to your Perl and sendmail installations. At the shell prompt, simply type in

whereis perl

and

whereis sendmail

and the paths will be displayed. (If whereis gives you an error message try using which in its place.) Note that you will often get multiple paths displayed, some ending with things like /perl5.003 and /sendmail.cf but you are only interested in the paths that end with /perl and /sendmail - i.e. with no extensions. There may even be multiple path listings to these, but that just means there are different versions installed.
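The same lookup can be sketched in Python using the standard library's shutil.which, which behaves like the which command: it searches only the directories on your PATH, so a program installed elsewhere (sendmail is sometimes kept outside a normal user's PATH) may be reported as not found even though it exists. The function name find_programs is made up for this example.

```python
import shutil

def find_programs(names):
    """Map each program name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}

if __name__ == "__main__":
    for name, path in find_programs(["perl", "sendmail"]).items():
        print(name, "->", path or "not found on PATH")
```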

You may also be able to find the system path (see the paragraph below) to your root Web directory by using the pwd command (print working directory).

Setting options usually involves entering values for script variables. These values can be a '1', 'Y', 'y', 'YES', etc. to enable an option and a '0', 'N', 'n', 'NO', etc. to disable it. Certain user or system information may be needed for some variables. You may be asked to enter path information or an e-mail address. For example, near the top of the Bestdam Logger Lite file you are asked for your e-mail address (so the site visitor data can be e-mailed to you) and the path to your server's e-mail program. The comments in the script or configuration file should clearly indicate what the option is and what the valid optional values are.

Some scripts will ask you to enter the system path. This is not the URL. The system path is the path from the root of the server which is hosting your site and will look something like this:
/usr/local/etc/usersites/(your website identifier)/
You'll have to ask your host or ISP if you don't know what your path is. The problem is that they may not be too quick to give this information up for security reasons. If that's the case, there's not much you can do. You can try asking them to take a look at the script you want to use so they can see how the path is used.

If some of what you read at the beginning of the .PL file doesn't make any sense to you, don't feel like you're doing something wrong. Many Perl programmers will write these comments for other Perl programmers, or worse, for others who are well-versed in UNIX. If you find that the instructions are not clearly written for those without Perl programming or UNIX experience, and there is no accompanying readme file with easy-to-understand instructions, you may want to forget about using that script and find something with clearly written, understandable instructions.

Detailed setup instructions for Bestdam Logger Lite are given on the Setup & Installation page as well as in the readme.txt file contained in the .ZIP download file available on the Features & Download page. However, the top of the bdlogger.pl file is heavily commented with instructions for setting the options so you may be able to set them just by opening the file in a text editor and reading the comments.

With everything set up in the script file it's time to transfer the files and, with UNIX/Linux servers, set the permissions (detailed in the next section). The most common mistake people make when using ftp to transfer script files to the server is not using ASCII mode to transfer the files. Be sure to use ASCII mode when transferring the script files ! A script edited on a Windows PC will typically not work on a UNIX/Linux server if it is transferred in binary mode, because the Windows line endings are carried over unchanged. You will see how to do this in the next section.
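The reason the transfer mode matters is line endings. Files saved on Windows end each line with a carriage return plus a line feed, while UNIX/Linux expects a line feed only. ASCII mode converts one to the other, and binary mode copies the bytes untouched, which leaves a stray carriage return at the end of the shebang line. The Python sketch below shows the check and the conversion on a made-up two-line script; the function names are hypothetical.

```python
def has_crlf(data):
    """True if the raw bytes contain Windows (CR LF) line endings."""
    return b"\r\n" in data

def to_unix_line_endings(data):
    """Convert CR LF line endings to LF, as an ASCII-mode transfer would."""
    return data.replace(b"\r\n", b"\n")

# A tiny script as it would be saved by a Windows text editor.
script = b'#!/usr/bin/perl\r\nprint "hello\\n";\r\n'

print(has_crlf(script))                        # True
print(has_crlf(to_unix_line_endings(script)))  # False
```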

Troubleshooting

The vast majority of script problems are due to configuration issues. When you consider that there are so many different flavors of UNIX out there, multiplied by the number of possible configuration options, multiplied by the number of Web server software packages out there, multiplied by the configuration options they have, you can see why this is the case. That's not to say you should go running to your host or ISP if your script doesn't work right off. The truth is, trying to solve problems is one of the best learning experiences there is. There are a lot of things you can check, and if you do go to your host or ISP and say "I have checked....." they'll know you have some knowledge of what you're doing and take you seriously.

If you've set up and installed a script and you feel it may not be working, there are certain steps you can take using your browser to try and track down the problem.

Given the large number of possible server configurations, check your host's or ISP's Website tech support pages to see if any of these CGI restrictions apply:

If you're using a script that requires SSI, the Web page(s) with the SSI directive tag may be required to have a .shtml or .shtm extension. Check with them regarding any such requirement. This could present a problem if your page is already indexed or referenced on a lot of search engines or directories with a .html or .htm extension. However, there may be a way around this requirement if your host allows the use of an .htaccess file. Details on how to use a .htaccess file are given on the About htaccess & XBitHack companion page.

You may need to use a relative path in your tag. For example,

instead of use
Scripts may be required to have a .cgi rather than a .pl extension. This should not be a concern. It is normally not a problem to just rename the file (and make the corresponding modification to the tag) to comply.

If, when you try to view the page containing the script's tag, a "server error" Web page comes up telling you to contact the system administrator, try using the alternative Perl path. I've seen cases where switching from the Perl 4 path to the Perl 5 path clears this up.
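
For the SSI extension restriction described above, here is a minimal sketch of the sort of .htaccess workaround that can keep pages on a .html extension. It assumes Apache 2.x with mod_include loaded and a host whose AllowOverride setting permits Options (and FileInfo for the second variant). Neither assumption is confirmed for any particular host, so treat this as a starting point rather than the definitive setup.

```apache
# Allow server-side includes in this directory.
# Needs "AllowOverride Options" in the main server config.
Options +Includes

# Variant A: XBitHack. Apache parses any text/html file whose
# user-execute bit is set, so pages keep their .html extension.
# Mark each SSI page with: chmod u+x page.html
XBitHack on

# Variant B (alternative to A): run every .html and .htm file through
# the SSI filter. Simpler, but all such pages pay the parsing cost.
# Needs "AllowOverride FileInfo".
#AddOutputFilter INCLUDES .html .htm
```

Variant A has the advantage that only the pages you explicitly mark are parsed. If the server returns an error after adding these lines, the host most likely does not allow one of the directives in .htaccess files.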

If the script is called with an SSI directive tag, bring up the page with the tag in your browser, view the page source (with Netscape, click on "View" on the menu bar and select "Page source"), and look in the HTML code for the SSI directive tag.

If the SSI directive tag does show up in the page source listing, it is not being processed by the server.

If it does not show up, the server is trying to execute the script, so it is likely bombing during execution due to the way it's configured.
Symptoms and possible causes of problems are outlined on the Bestdam Logger Help page but most of the information would apply to other scripts as well.

You may also have another troubleshooting tool at your disposal. If your host or ISP generates an individual error log for your domain, it will likely contain error messages which indicate what problems were encountered when the server tried to execute the script. The error logs are typically just ASCII text files, so you can FTP them to your local hard drive and open them using a text editor.

Oct 25, 2008

RFuzz: samples

RFuzz: samples: "How’s it used?

RFuzz is really easy to use for just basic HTTP requests and running simple tests. It’s even easier if you follow the REST paradigm.

The following are the examples that come with the gem.
hpricot_pudding.rb

Simple client that lets you search google from the command line. Run it with:

ruby hpricot_pudding.rb 'ruby zed'

and it will return all the results printed out. (requires hpricot)

require 'rubygems'
require 'hpricot'
require 'rfuzz/session'
include RFuzz

agent = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.4) Firefox/1.5.0.4'
google = HttpClient.new('www.google.com', 80)

# Send the search query given as the first command-line argument
r = google.get('/search',
  :head => {'User-Agent' => agent},
  :query => {'q' => ARGV[0], 'hl' => 'en', 'btnG' => 'Google Search'})

if r.http_status != '200'
  # Double quotes are needed here so that #{...} is interpolated
  puts "Wrong Status: #{r.http_status}?"
else
  doc = Hpricot(r.http_body)
  # Print the href and text of every result link (class "l")
  (doc/:a).each do |link|
    if link.attributes['class'] == 'l'
      puts link.attributes['href']
      puts ' -- ' + link.children.join
    end
  end
end

kill_routes.rb

Demonstrates hitting a Ruby on Rails application with randomly generated long and insanely long URIs to see if you can choke the Rails routes system. When running under Mongrel you’ll find that Mongrel’s explicit limit of 512 characters on URIs protects Rails quite well."

Aug 11, 2008

htaccess-file example



http://z.askapache.com/htaccess-example2.html

########################################################################
# SECURITY / ACCESS CONTROL #
# If the web server's AllowOverride allows AUTHCONFIG to be overridden #
########################################################################
#
# Save both .htpasswd and .htgroup files in a directory above "documentroot" directory
# (e.g. not in or below /apache/htdocs) but could be below "serverroot" directory
# (e.g. below /apache).

# This will pop-up a user/password dialog box saying Realm =
AuthName "Restricted Area"

# AuthType is normally basic. Not very secure until "Digest" type becomes prevalent
AuthType basic

# If value of AuthUserFile doesn't begin with a slash, it is treated as
# relative to the ServerRoot (not DocumentRoot!)
AuthUserFile "/userhome/blahBlah/.htpasswd"
AuthGroupFile "/userhome/blahBlah/.htgroup"

# Each line of the user file contains a username followed by a colon, followed by the crypt()
# encrypted password. The behavior of multiple occurrences of the same user is undefined.
# You can generate a password file on your system by typing commands on the OS prompt as follows:
# htpasswd -c Filename username # Creates a password file 'Filename' with 'username'
# # as the first user. It will prompt for the new password.
# htpasswd Filename username2 # Adds or modifies in password file 'Filename' the 'username2'.
#
# Each line of the group file contains a groupname followed by a colon, followed by
# the member usernames separated by spaces. For example, put this on one line in the .htgroup file:
# mygroup: bob joe anne

# This set to off will forward a not-found userid to the next-in-line module for authentication.
# 'On' is the default. It is better that way.
#AuthAuthoritative off

# Now, we allow specific users or groups to get in.
# require user joe john mary
require valid-user
require group family friends

# More Authentication related, rarely used
# AuthDBGroupFile
# AuthDBUserFile
# AuthDBAuthoritative
# AuthDBMGroupFile
# AuthDBMUserFile
# AuthDBMAuthoritative
# AuthDigestFile
# AuthDigestGroupFile
# AuthDigestQop
# AuthDigestNonceLifetime
# AuthDigestNonceFormat
# AuthDigestNcCheck
# AuthDigestAlgorithm
# AuthDigestDomain
# Using Digest Authentication


###############################################################################
# From here on, if something is not working as you might expect, try to make sure that
# the corresponding AllowOverride is enabled in the relevant <Directory> section
# of the server configuration files (generally httpd.conf, can be access.conf or srm.conf).
# Allowoverride could be:
# 1. AuthConfig (allows AuthName, AuthUserFile, require etc. in .htaccess file)
# 2. FileInfo (allows AddType, DefaultType, ErrorDocument etc. in .htaccess file)
# 3. Indexes (allows DirectoryIndex, FancyIndexing, IndexOptions etc. in .htaccess file)
# 4. Limit (allows use of allow, deny and order directives which control access by host)
# 5. Options (allows use of options directive in .htaccess file - see below)
# 6. All (allows all of the above in .htaccess file. Rare)
# 7. None (allows none of the above in .htaccess file. Rare)
# Usually, AuthConfig is allowed. Rest is up to the particular web host company.
#
# If you get server errors after putting this file in, try disabling
# each section below one-by-one to see what your web hosting company
# allows (or you can ask them :)
###############################################################################


################### THIS IS IMPORTANT! #####################
# AddHandler allows you to map certain file extensions to "handlers",
# actions unrelated to filetype. These can be either built into the server
# or added with the Action command (see below).
# If you want to use server side includes, or CGI outside
# ScriptAliased directories, uncomment the following lines.

# To use CGI scripts:
AddHandler cgi-script cgi pl

# To use server-parsed HTML files
AddType text/html .shtml
AddHandler server-parsed .shtml

# Example of a file whose contents are sent as is so as to tell the client that a file has redirected.
# Status: 301 Now where did I leave that URL
# Location: http://xyz.abc.com/foo/bar.html
# Content-type: text/html
#
# Fred's exceptionally wonderful page has moved to
# <a href="http://xyz.abc.com/foo/bar.html">Joe's</a> site.
#
# Server always adds a Date: and Server: header to the data returned to the client,
# so don't include these in the file.
#AddHandler send-as-is asis

# If you wish to use server-parsed imagemap files, use
AddHandler imap-file map

# For content negotiation use
#AddHandler type-map var

# Action lets you define media types that will execute a script whenever
# a matching file is called. This eliminates the need for repeated URL
# pathnames for oft-used CGI file processors.
# Format: Action action-type cgi-script
# Format: Action media/type /cgi-script/location
# Format: Action handler-name /cgi-script/location
#Action cgi-script /cgi-bin/default.cgi

# Redirect [status] ABSOLUTE-path-of-old-url new-url. Default status is temp.
# Status is one of permanent (returns 301), temp (returns 302),
# seeother (returns 303, see other document in same place),
# gone (returns 410, no longer available at all) - Don't specify new-URL
# Here, if the client requests http://myserver/service/foo.txt, it will be told
# to access http://foo2.bar.com/service/foo.txt instead.
#Redirect /service http://foo2.bar.com/service

######################################################################
# If the web server's AllowOverride allows FILEINFO to be overridden #
######################################################################
# CookieTracking, AddType, DefaultType, AddHandler, Action, ErrorDocument
# Redirect, Redirectmatch, RedirectPermanent, RedirectTemp
# AddEncoding, AddCharset, AddLanguage, LanguagePriority, DefaultLanguage


#### Comment it out if UserTrack module is not loaded in the server
#CookieName "woiqatty"
#CookieTracking on

# Tweak mime.types without actually editing it, or make certain files to be certain types.
#AddType application/x-httpd-php3 .phtml
AddType application/x-httpd-php3 .php
AddType application/x-httpd-php3 .php3
AddType application/x-httpd-php3-source .phps
AddType application/x-tar .tgz

# In this directory, default filetype is this one if Server cannot
# otherwise determine from filename extensions.
# Mostly text or HTML - "text/plain", gif images - "image/gif",
# compiled programs - "application/octet-stream"
DefaultType text/plain
# DefaultType image/gif
# DefaultType application/octet-stream



# Customizable error response. Three styles:
# 1. Plain Text - the (") marks it as text, it does not get output
#ErrorDocument 500 "The server made a boo boo.
# 2. Local Redirects - e.g. To redirect to local URL /missing.html
#ErrorDocument 404 /missing.html
#ErrorDocument 404 /cgi-bin/missing_handler.pl
# 3. External Redirects (All env. variables don't go to the redirected location)
#ErrorDocument 402 http://some.other_server.com/subscription_info.html


# Mosaic/X 2.1+ browsers can uncompress information on the fly
AddEncoding x-compress Z
AddEncoding x-gzip gz tgz

#Content negotiation directives
#AddLanguage fr .fr
# Just list the languages in decreasing order of preference.
LanguagePriority en fr it

.htaccess examples

Apache configuration file syntax

Apache configuration file syntax: "
" Core and mpm
syn keyword apacheDeclaration AccessFileName AddDefaultCharset AllowOverride AuthName AuthType ContentDigest DefaultType DocumentRoot ErrorDocument ErrorLog HostNameLookups IdentityCheck Include KeepAlive KeepAliveTimeout LimitRequestBody LimitRequestFields LimitRequestFieldsize LimitRequestLine LogLevel MaxKeepAliveRequests NameVirtualHost Options Require RLimitCPU RLimitMEM RLimitNPROC Satisfy ScriptInterpreterSource ServerAdmin ServerAlias ServerName ServerPath ServerRoot ServerSignature ServerTokens TimeOut UseCanonicalName
if s:av < '002000000'
  syn keyword apacheDeclaration AccessConfig AddModule BindAddress BS2000Account ClearModuleList CoreDumpDirectory Group Listen ListenBacklog LockFile MaxClients MaxRequestsPerChild MaxSpareServers MinSpareServers PidFile Port ResourceConfig ScoreBoardFile SendBufferSize ServerType StartServers ThreadsPerChild ThreadStackSize User
endif
if s:av >= '002000000'
  syn keyword apacheDeclaration AcceptPathInfo CGIMapExtension EnableMMAP FileETag ForceType LimitXMLRequestBody SetHandler SetInputFilter SetOutputFilter
  syn keyword apacheOption INode MTime Size
endif"

Jul 14, 2008

SEO and the importance of 503

SEO and the importance of 503: "SEO and the importance of 503

July 14th, 2008

Today I arrived at work to find one of our externally hosted clients plummeting down the SERPs for terms that they would normally rank very highly for. As we always employ an ethical approach to our SEO campaigns, I was a little bewildered by this sudden drop in rankings and decided to investigate. After a brief search through the Google results, there was the answer in the returned search listings.

Could not connect to database

This was the description on the returned results for our client’s website.

It would seem that when the site was last crawled the database was down and the client’s developer had returned the aforementioned error message with a 200 OK header. So instead of our nicely optimised pages, the search engines found zero relevant content and a server response header that said ‘Yeah sure! This is fine, this is what we want you to see’.

So what should we really be doing if a database connection fails?

We should be looking to our SEO friendly 503 (Service Temporarily Unavailable) header to let the search engines know that there is a problem and to come back later. For details of implementing the 503 header there’s an excellent tutorial at http://www.askapache.com/htaccess/503-service-temporarily-unavailable.html"
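
As a rough sketch of the idea (this is not the linked tutorial's code), an .htaccess file can answer every request with a 503 while a flag file exists. The flag file name maintenance.enable and the page /maintenance.html are hypothetical, and the rules assume Apache 2.2 or later with mod_rewrite available. Note that .htaccess cannot detect a failed database connection on its own; in that case the application itself has to send the 503 status instead of a 200.

```apache
# Page served as the body of the 503 response (hypothetical path).
ErrorDocument 503 /maintenance.html

<IfModule mod_rewrite.c>
RewriteEngine On
# Only act while the flag file exists in the document root (hypothetical name).
RewriteCond %{DOCUMENT_ROOT}/maintenance.enable -f
# Leave the error page itself alone, to avoid a rewrite loop.
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
# Stop processing and return 503 instead of the normal page.
RewriteRule ^ - [R=503,L]
</IfModule>
```

Creating the flag file switches the 503 on and deleting it switches it off, so nothing else in the configuration has to change during planned downtime.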

Jul 4, 2008

More ways to stop spammers and unwanted traffic

Comment spammers, trackback spam, stupid bots and AVG linkscanner eating into your bandwidth and server resources? Here’s how to put a dent in their activities with a few mod_rewrite rules.

I hate those blogs that send me fake trackbacks and pingbacks. Unfortunately it’s impossible to stop but this morning I figured out a way of stopping some of them.

Look through the log files of your web server for the string ' "-" "-"'. Lots of requests there, aren’t there? I found 914 requests yesterday. Those are requests without a USER_AGENT or HTTP_REFERER, and almost all of them are suspicious because they weren’t followed by requests for images, stylesheets, or JavaScript files. Unfortunately the WordPress cron server also falls into this category, so you need to filter out requests from your own server’s IP address.




This morning I checked up on a spam trackback that came in.

http://ocaoimh.ie/2005/03/01/i-am-bored-sites-for-when-youre-bored/all-comments/


I looked through my log files for that IP address and discovered the following:
85.177.33.196 - - [03/Jul/2008:06:40:01 +0000] "GET /2005/02/18/10-more-ways-to-make-money-with-your-digital-cameras/ HTTP/1.0" 200 36151 "-" "-"
85.177.33.196 - - [03/Jul/2008:07:04:18 +0000] "GET /2007/06/07/im-not-the-only-one-to-love-the-alfa-147/ HTTP/1.0" 200 44967 "-" "-"
85.177.33.196 - - [03/Jul/2008:08:09:40 +0000] "GET /2005/03/01/i-am-bored-sites-for-when-youre-bored/all-comments/ HTTP/1.0" 200 410423 "-" "-"
85.177.33.196 - - [03/Jul/2008:08:09:44 +0000] "POST /xmlrpc.php HTTP/1.0" 200 249 "-" "XML-RPC for PHP 2.2.1"
85.177.33.196 - - [03/Jul/2008:09:00:09 +0000] "GET /2007/10/28/what-time-is-it-wordpress/ HTTP/1.0" 200 63332 "-" "-"

So, the spammer grabs “/2005/03/01/i-am-bored-sites-for-when-youre-bored/all-comments/” at 8:09am and 4 seconds later sends a trackback spam to the same blog post. Annoying, isn’t it? The following mod_rewrite rules will kill those fake GET requests dead.

# stop requests with no UA or referrer
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteCond %{REMOTE_ADDR} !^64\.22\.71\.36$
RewriteRule ^(.*) - [F]

Replace “64\.22\.71\.36” with the IP address of your own server. If you don’t know what it is, look through your logs for requests for wp-cron.php, run ifconfig from the command line, or check with your hosting company.
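
These rules only take effect where mod_rewrite is switched on. A WordPress .htaccess with permalinks enabled normally contains RewriteEngine On already, but in a file without it the block could look like the sketch below. The address 192.0.2.10 is a placeholder from the documentation range, not a real server, and the rules are assumed to sit above the WordPress block so that they run first.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Forbid requests that send neither a Referer nor a User-Agent,
# unless they come from the server itself (e.g. wp-cron.php).
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
# 192.0.2.10 is a placeholder; substitute your own server's IP address.
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.10$
RewriteRule ^ - [F]
</IfModule>
```
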
Here are a few of the requests already stopped this morning:
72.21.40.122 - - [03/Jul/2008:09:59:59 +0000] "GET /2005/04/02/photo-matt-a-response-to-the-noise/ HTTP/1.1" 403 248 "-" "-"
216.32.81.66 - - [03/Jul/2008:10:00:11 +0000] "GET /2006/12/14/bupa-to-leave-irish-market/ HTTP/1.1" 403 240 "-" "-"
66.228.208.166 - - [03/Jul/2008:10:03:18 +0000] "GET /2008/05/23/youre-looking-so-silly-wii-fit HTTP/1.1" 403 212 "-" "-"
216.32.81.74 - - [03/Jul/2008:10:04:52 +0000] "GET /1998/03/22/for-the-next-month-o/ HTTP/1.1" 403 234 "-" "-"
69.46.20.87 - - [03/Jul/2008:10:06:06 +0000] "GET /2006/10/01/killing-off-php/ HTTP/1.1" 403 229 "-" "-"
72.21.58.74 - - [03/Jul/2008:10:07:54 +0000] "GET /2005/08/12/thunderbird-feeds-and-messages-duplicates/ HTTP/1.1" 403 255 "-" "-"

Some spam bots are stupid. They don’t know where your wp-comments-post.php is. That’s the file your comment form feeds when a comment is made. If your blog is installed in the root, “/”, of your domain you can add this one line to stop the 404 requests generated:

RewriteRule ^(.*)/wp-comments-post.php - [F,L]

Trackbacks and pingbacks almost always come from sane looking user agents. They usually have the blog or forum software name to identify them. Look for “/trackback/” POSTs in your logs. Notice how 99% of them have browser names in them? Here’s how to stop them, and this has been documented for a long time:
RewriteCond %{HTTP_USER_AGENT} ^.*(Opera|Mozilla|MSIE).*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteCond %{REQUEST_METHOD} ^POST$
RewriteRule ^(.*)/trackback/ - [F,L]

I’ve been using that chunk of code for ages. It works exceptionally well. This was prompted by a deluge of 40,000 spam trackbacks this site received in one day a few months ago.

If you use my Cookies for Comments plugin, check your browser for the cookie it leaves and use the following code to block almost all of your comment spam:
RewriteCond %{HTTP_COOKIE} !^.*put_cookie_value_here.*$
RewriteRule ^wp-comments-post.php - [F,L]

That will block the spammers even before they hit any PHP script. Your server will breeze through the worst spam attempts. It blocked 2308 comment spam attempts yesterday. Unfortunately it also stops the occasional human visitor leaving a comment but I think it’s worth it.

Do something different. That’s what you have to do. Place a hurdle before the spammers and they’ll fall. On that note, I shouldn’t really be blogging all this, but almost all these ideas can be found elsewhere already and the spammers still haven’t adapted.

Unwanted traffic? What’s that? Surely all visitors are good? Nope, unfortunately not. Robert alerted me to the fact that AVG anti-virus now includes an AJAX powered browser plugin called “Linkscanner” that scans all the links on search engine result pages for viruses and malicious code. Unfortunately that generates a huge number of requests for pages that are never even seen by the visitor. I counted over 7,000 hits yesterday. Thankfully Padraig Brady has a solution. I hope he doesn’t mind if I reprint his mod_rewrite rules here (the code is also available on Padraig’s page).

#Here we assume certain MSIE 6.0 agents are from linkscanner
#redirect these requests back to avg in the hope they'll see their silliness
RewriteCond %{HTTP_USER_AGENT} ".*MSIE 6.0; Windows NT 5.1; SV1.$" [OR]
RewriteCond %{HTTP_USER_AGENT} ".*MSIE 6.0; Windows NT 5.1;1813.$"
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP:Accept-Encoding} ^$
RewriteRule ^.* http://www.avg.com/?LinkScannerSucks [R=307,L]