WordPress Default .htaccess Rules Explained


May 2013
Maybe you've noticed the mysterious code inserted in a file in WordPress' root installation folder and wondered what its purpose is. In short, it allows permalinks to work. If that's not an adequate explanation, and you'd like to really understand what every character means and what its purpose is, this article is for you. We're going to put this code under a big magnifying glass. In the end, you'll understand everything there is to know about this block of code.

During installation, WordPress enters the following rules into the .htaccess file in its installation root. If there is no .htaccess file in the installation folder, one is created. This assumes you are using Apache server software, which is the most commonly used server software. Other server software uses different methods to achieve permalink functionality, and those methods are outside the scope of this article. The rules are not inserted, however, if you chose the "default" (none) permalink structure. Some themes or plugins may cause additional lines to be entered. These are the rules inserted upon a fresh installation:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
Each of the above code lines is explained in detail below. The official reference for every directive is at apache.org. If your WordPress installation is not in your site's public html root, the RewriteBase and final RewriteRule will include the path to your WordPress installation folder.
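As a sketch of the subfolder case, assume WordPress is installed in a folder named myblog directly below the public html root (myblog is only an example name). The block in that folder's .htaccess file would then be expected to look like the following. The two comment lines inside the block are added here for illustration and are not written by WordPress:

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
# The base is the installation folder instead of the site root
RewriteBase /myblog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Permalinks are handed to the index.php inside that folder
RewriteRule . /myblog/index.php [L]
</IfModule>
# END WordPress
```

Only the RewriteBase line and the final RewriteRule line differ from the root version.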

The main purpose of this rule set is to allow arbitrary paths (permalinks) to be sent to the proper WordPress script without causing file/path not found 404 errors. If your permalinks work but you see no .htaccess file on your server, it is because the file is hidden. The Linux operating system treats any filename beginning with a dot (.) as a hidden file. Your FTP client has a setting somewhere to enable viewing of hidden files. Select that setting and .htaccess will be visible.

The first and last lines starting with # are comments so you know how these rules ended up in your .htaccess file. WordPress core code uses these comments to locate the related code when it needs to alter the rules for various reasons. They do nothing on their own and serve no other purpose. Any line where the first character is a # is a comment.

The <IfModule mod_rewrite.c> and </IfModule> lines work as a pair and prevent errors if your server does not have the mod_rewrite.c module installed. If the module is missing, the lines in between are ignored. Even if you're sure your server will always have this module available, you may wish to leave these lines in place so if the module becomes unavailable for any reason, your site will not break.
 RewriteEngine On
This line turns the rewrite engine on, so that the lines beginning with Rewrite* that follow it are processed. You could set the engine to Off to not process any rules. During testing, this is much easier than commenting out every rule.
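For example, while troubleshooting you might temporarily switch the whole block off as sketched here. The comment line is added for illustration only. Keep in mind that permalinks are not handled while the engine is off:

```apache
<IfModule mod_rewrite.c>
# Temporarily disabled for testing; change back to On when done
RewriteEngine Off
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
```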
 RewriteBase /
This line defines the base from which all relative references are taken. A lone / means the base is your site's public html root. The actual value will be the path to the installation folder for WordPress. If your blog index page is accessed by http://example.com/myblog/index.php, the line would read RewriteBase /myblog/.
 RewriteRule ^index\.php$ - [L]
This rule prevents any reference to index.php from being processed farther down as a permalink, because it's not one. This will make more sense in a while. As this is the first rewrite rule, with no conditions above it, this rule is always processed. The rule basically says if the text index.php matches the request portion after the base definition, pass the request along unchanged and do not process any more rules. Let's take a closer look.

All RewriteRule directives have two arguments and an optional flag setting. The first argument is a regexp (regular expression) to attempt to match against the request. The second argument is what to rewrite if there is a match of the first argument. In our case, [L] is a "Last" flag telling the rewrite engine to not process any more rules if there's a match to the first argument. The second argument in our case is -, which is a special character telling the engine to not actually rewrite anything, to only do what the flags indicate if there's a match.

The first argument, ^index\.php$, is a regexp, where several characters have special meaning. In this case, they are ^, \, . and $. All other characters are literal matches against the portion of the request after the base definition. So the regexp index alone means the request must have the corresponding text index somewhere in its content. The ^ character means the match must start at the beginning of the request. Since our base is / (root), ^index matches any file or folder in root whose name begins with index. So indexfolder/file.php would match this example regexp fragment. So would indexfile.php. However, sindex.php would not match, since the i is not the first character. In order to match, index has to be the first 5 characters.

In a regexp, the dot . means match any character exactly once. Since we want to only match an actual dot, not any character, we indicate so with a \ backslash preceding the dot. So the regexp ^index. (with a dot but without the \) will match any character after the x, such as indexQmore.php, but ^index\. will only match something with an actual dot after the x, like index.more.php. So now, ^index\.php means the request must start with index.php, such as index.php.moretext. However, the $ means match the very end of the request. So now we have the full ^index\.php$, where the only possible match is index.php. Nothing else will match: index.php.moretext will not match, and neither will pre.index.php.
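The build-up of this regexp can be summarized in one place. In this sketch, the comment lines list each intermediate pattern together with example requests that it matches. Only the final line is the rule WordPress actually uses:

```apache
# Pattern         Example requests that match
# index           index.php, sindex.php (the text may appear anywhere)
# ^index          index.php, indexfile.php, indexfolder/file.php
# ^index.         indexQmore.php, index.more.php (any character after the x)
# ^index\.        index.php, index.more.php (a literal dot after the x)
# ^index\.php     index.php, index.php.moretext
# ^index\.php$    index.php only
RewriteRule ^index\.php$ - [L]
```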
 RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond means a condition must be true in order for the next RewriteRule to be processed. %{REQUEST_FILENAME} is a variable set by the server. In a .htaccess file it contains the full local filesystem path that the request maps to, not just a filename as it may appear. The -f test without the ! means the condition is true if the first argument resolves to an existing file. The ! negates the test, so now with the full !-f the condition is true if the first argument does NOT resolve to an existing file. Since permalinks should not point to any valid system filename, a typical permalink will cause this condition to be true. A reference to a valid server file will cause this condition to be false.
 RewriteCond %{REQUEST_FILENAME} !-d
This is similar to the first condition, except now we're checking for valid directories (the -d test) instead of files. Both conditions must be true for the next rewrite rule to be processed. The following rule will only be processed if the request is not a valid file and is not a valid folder either. If the request is a valid folder or a valid file, the following rule is not processed and the request is passed on unchanged, meaning the file is served normally as though there were no rewrite rule in place.
 RewriteRule . /index.php [L]
This rule is essentially only processed if the request is some sort of permalink. Any other valid file system path is passed on without change. Once again, there are two arguments and a flag. The [L] flag once again means do not process any other rules after this one. There are often no other rules after this, but it is placed here just to be safe. The dot . means match any one character, so any request that is not empty will match. The /index.php means replace the entire original request with /index.php. If your WordPress installation is in a subfolder of your public html root, you will see the actual folder path here as well. This rule basically says send any permalink requests to index.php for further processing by WordPress. WordPress gets the original request from a different variable, so it doesn't matter if the request gets rewritten here. WordPress will still know what the original request was.

Though the request is sent to the new location (index.php), the user's browser continues to display the original request in its address bar. This is by design for internal references. External references (rewrites to a URL that includes the http://www.domain.com portion and points to a different host, or rules that carry the [R] flag) result in a 302 server response (temporary redirect) and the address bar changes.
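For contrast, here is a hedged sketch of an external redirect. The old-page path and the target URL are hypothetical and are not part of the WordPress rules. The R=302 flag is written out explicitly so the temporary redirect does not depend on any default behavior:

```apache
RewriteEngine On
# Hypothetical example: the browser receives a 302 response and its
# address bar changes to the new URL
RewriteRule ^old-page$ http://www.example.com/new-page [R=302,L]
```

The WordPress rule has no such flag and its second argument is a plain path, so the rewrite stays internal and the address bar is left alone.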

The / preceding index.php means this is not a relative link, so the initial RewriteBase / directive does not affect the resulting path. If we omitted the / and just had index.php, we would have a relative link, so the base would be applied before it. Since the base is /, we end up with the same file path either way in this particular case.
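The difference only becomes visible when the base is not /. As a minimal sketch, assume a hypothetical installation in a folder named /myblog/ and a final rule written in the relative form:

```apache
RewriteEngine On
RewriteBase /myblog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Relative path: the base is applied, so this resolves to /myblog/index.php
RewriteRule . index.php [L]
```

With the leading / form, the folder has to be spelled out in the rule itself as /myblog/index.php.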

Why do we match just one character with the dot (.) argument? Shouldn't we match the entire request by using .*? [In regexp, * means repeat the previous match (with the . meaning match anything) any number of times, including 0 times. Thus .* will match the entire request, no matter what it is.] We could. But there's no point. Since the match is not in parentheses, and there is no $1 style back reference in the /index.php parameter back to any such parentheses, no part of the original request is carried forward. The entire request is discarded, to be replaced in its entirety with the next argument, /index.php. It's a bit more efficient to match any one character than all of them. Since we are required to provide some kind of regexp, a simple dot is the most expedient.

If the original request is always discarded, and previous RewriteCond lines determine if the rewrite is performed, what is the point of requiring a regexp then? In this case, it serves no purpose, thus the dot . is the most expedient regexp we can provide. But in many other situations, the regexp is used to match particular portions of the request. Those portions can then be back referenced in the rewrite parameter to feed useful information from the original request to the new rewritten page. The RewriteRule functionality is much more powerful than what we need here, but it's getting off topic to discuss in any more detail. That's for another day.
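As a brief illustration of that more powerful use, here is a sketch with a back reference. The product path, the product.php script, and the id parameter are all hypothetical and have nothing to do with WordPress:

```apache
RewriteEngine On
# The parentheses capture one or more digits; $1 inserts them into the
# rewritten request, so product/42 is served by /product.php?id=42
RewriteRule ^product/([0-9]+)$ /product.php?id=$1 [L]
```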

Now that we see how all permalink requests get sent to index.php, the reason for the earlier line RewriteRule ^index\.php$ - [L] can be more clearly seen. Rewritten requests are also evaluated by .htaccess rules just as a fresh request would be. The first RewriteRule line keeps the rewritten request (or a basic initial index.php request) from being re-evaluated for possible permalink handling, as it can't possibly be a permalink. Besides preventing an endless loop, it's just more efficient.

If not for these rules, all permalinks would throw a 404 error before WordPress ever got a chance to process the links. What was once mysterious gibberish should at least make some sense by now. As with any computer script, once you understand what the various code bits mean and what is done with the information, it's not all that complicated.


Comments, feedback, and questions are always welcome.