Suppose you are building a small database-driven website or web-based application that needs search engine coverage. That is not the only reason to build a web application with search engine friendly URLs; usability matters just as much. One way to achieve this is Apache's mod_rewrite module. Another is to implement a URL rewrite engine inside your web application itself.
Reasons
Two main reasons should drive you towards rewriting your URLs into friendly URLs. The most important one is so that search engines can index your pages, because most search engines do not index dynamic URLs that contain a question mark (?) or an equals sign (=). The other main reason is usability: friendly URLs make a site easier to use. If you want to read more about the usability side, see Adaptive Path's User-Centered URL Design or Adam Baker's How to make URLs user-friendly.
Update: Someone pointed out in the comments that the scripts I wrote initially were flawed. After looking over the code again, it turns out they were. I have fixed the scripts here in the post, so go ahead and look over them again. I have also prepared a live example of this method so you can see it in action, and the files in the example are available for download (ZIP file, 2 Kb).
Implementation
Although the idea of rewriting URLs is now old, and a MUST have in a dynamic web site or web-based application, I still see sites using the index.php?page=X URL structure.
I think every web-based application or site should use a global loading and unloading script. By that I mean that every script that is called by the browser should call the loading script at the beginning and the unloading script at the end. This way you wrap your scripts with a customizable set of operations both at the start and at the end of them.
Using $_SERVER['PATH_INFO'] is the method I rely on when creating friendly URLs. When a request is made to a PHP-enabled server, PHP fills the $_SERVER global array with several variables describing the server's environment and the request made. Among these variables, PATH_INFO contains the part of the requested URL that sits between the actual URL path and the query string. Let me explain with some examples. Take this URL:
/path/to/script.php?var1=value1&var2=value2
For this request, the URL path is /path/to/script.php and the query string is ?var1=value1&var2=value2. Note that there is nothing that would go in the PATH_INFO variable. But take this next example:
/path/to/script.php/foo/bar/?var1=value1&var2=value2
For this request, the URL path and query string are the same as the above, but also the PATH_INFO variable contains /foo/bar/.
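To make the three parts concrete, here is a minimal sketch that splits a request URI by hand. The function name splitRequestUri() and its parameters are made up for this illustration; on a real server PHP fills $_SERVER['PATH_INFO'] for you, so you would not normally need this.

```php
<?php
// Hypothetical helper: splits a request URI into the three parts discussed above.
// $scriptPath is the URL path of the script that handles the request.
function splitRequestUri($requestUri, $scriptPath)
{
    $path  = (string) parse_url($requestUri, PHP_URL_PATH);
    $query = (string) parse_url($requestUri, PHP_URL_QUERY);

    // Whatever follows the script path (and precedes "?") is the PATH_INFO part.
    $pathInfo = '';
    if (strncmp($path, $scriptPath, strlen($scriptPath)) === 0) {
        $pathInfo = (string) substr($path, strlen($scriptPath));
    }

    return array(
        'script'    => $scriptPath,
        'path_info' => $pathInfo,
        'query'     => $query,
    );
}

$parts = splitRequestUri(
    '/path/to/script.php/foo/bar/?var1=value1&var2=value2',
    '/path/to/script.php'
);
echo $parts['path_info'], "\n"; // prints "/foo/bar/"
```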
Planning our URLs
Now that you know which tool will help you, decide what your new URLs are going to look like. Suppose you have URLs like these:
http://www.example.com/categories.php?cat_id=3
http://www.example.com/articles.php?art_id=15&page=2
You would want to try and convert them to something like this:
http://www.example.com/categories.php/cat_id/3/
http://www.example.com/articles.php/art_id/15/page/2/
But wait, that is not enough, because many search engines would still not index you properly. The problem is with the .php extension in the URL followed by /. So what you need to do is transform them into the following:
http://www.example.com/categories/cat_id/3/
http://www.example.com/articles/art_id/15/page/2/
To do this, you need to copy or move the categories.php and articles.php scripts to files named categories and articles. Then you have to tell the server that those 2 files need to be parsed by PHP. You can do this by creating a .htaccess file in your root directory (or in whichever directory holds the files you want parsed by PHP). In that file you write:
<Files ~ "categories|articles">
ForceType application/x-httpd-php
</Files>
Ok, those 2 files are now parsed by PHP. Give it a try: access http://www.example.com/categories?cat_id=3. It will have the same effect as accessing categories.php?cat_id=3.
Parsing the URL
How you parse the URL (that is, the data contained in $_SERVER['PATH_INFO']) is the most important part. Put the code below in the global loading script.
if (isset($_SERVER['PATH_INFO'])) {
    // Drop the leading slash and split the path into segments.
    $url = substr($_SERVER['PATH_INFO'], 1);
    $urlParts = explode('/', $url);
    // A trailing slash leaves an empty last segment; discard it.
    if (end($urlParts) === '') {
        array_pop($urlParts);
    }
    // Segments come in name/value pairs; a name without a value gets ''.
    $urlPartsCount = count($urlParts);
    for ($i = 0; $i < $urlPartsCount; $i += 2) {
        $_GET[$urlParts[$i]] = isset($urlParts[$i + 1]) ? $urlParts[$i + 1] : '';
    }
}
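If you want to try the parsing step outside of a web request, it can be wrapped in a function. This is only a sketch: parsePathInfo() is a name invented for this example, and it returns the pairs instead of writing them into $_GET.

```php
<?php
// Hypothetical wrapper around the parsing logic: takes a PATH_INFO string
// and returns name => value pairs instead of writing to $_GET directly.
function parsePathInfo($pathInfo)
{
    $params = array();
    // trim() drops the leading and trailing slashes in one step.
    $trimmed = trim($pathInfo, '/');
    if ($trimmed === '') {
        return $params;
    }
    $parts = explode('/', $trimmed);
    for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
        // A name without a value is stored with an empty string.
        $params[$parts[$i]] = isset($parts[$i + 1]) ? $parts[$i + 1] : '';
    }
    return $params;
}

print_r(parsePathInfo('/art_id/15/page/2/'));
```

In the loading script the result could then be merged into $_GET, for example with array_merge($_GET, parsePathInfo($_SERVER['PATH_INFO'])).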
The next level
Now you have a functional PHP-based engine for interpreting friendly URLs. You have two choices: either you hard code every URL in your links to the new pattern (example: /articles/art_id/15/page/2/), or you leave every link intact and build a PHP-based URL rewrite engine. Since that is what this post is about, I will continue by describing how to achieve the second option.
It is done simply by using the preg_replace_callback function. You feed it an array of URL patterns, which you have to construct to cover every URL in your application that you wish to convert to a friendly URL, and a replace callback function. First the URL patterns array:
$urlPatterns = array(
    '~'.preg_quote(BASE_DIR, '~').'([^\.]+)\.php(\?([0-9a-zA-Z]+[^#"\']*))?~i',
);
The BASE_DIR constant I used in this example is the base directory of your application. It can be either / or the entire http://www.example.com/, depending on how you wrote the links in the application.
The next step is to catch the entire output of a request before it is sent to the browser. You can do this with the help of the output buffering functions built into PHP. For more on output buffering, read Output Control Functions on PHP.net. First we insert in the global loading script:
ob_start();
And in the global unloading script:
$pageContents = ob_get_contents();
ob_end_clean();
echo preg_replace_callback($urlPatterns, 'urlRewriteCallback', $pageContents);
The final task is creating the callback function to handle the replacement of the URLs.
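function urlRewriteCallback($match) {
    $extra = '';
    // $match[3] holds the query string without the "?"; it is absent when the link has none.
    if (!empty($match[3])) {
        $params = explode('&', $match[3]);
        if ($params[0] == '') array_shift($params);
        foreach ($params as $param) {
            // Turn each name=value pair into name/value/.
            $paramEx = explode('=', $param, 2);
            $extra .= $paramEx[0].'/'.(isset($paramEx[1]) ? $paramEx[1] : '').'/';
        }
    }
    return BASE_DIR.$match[1].'/'.$extra;
}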
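To check the pattern and the callback without output buffering, you can run them against a plain string. The sketch below assumes BASE_DIR is / and uses two names invented for the demo, demoRewriteCallback() and rewriteUrls(); it also accepts the HTML-escaped &amp;amp; separator, which the callback above does not.

```php
<?php
// Assumption for this demo: links in the application are written relative to "/".
define('BASE_DIR', '/');

// Same idea as the callback in the post, with guards for missing pieces.
function demoRewriteCallback($match)
{
    $extra = '';
    if (!empty($match[3])) {
        // Accept both "&" and the HTML-escaped "&amp;" as separators.
        $params = preg_split('~&(?:amp;)?~', $match[3], -1, PREG_SPLIT_NO_EMPTY);
        foreach ($params as $param) {
            $pair = explode('=', $param, 2);
            $extra .= $pair[0] . '/' . (isset($pair[1]) ? $pair[1] : '') . '/';
        }
    }
    return BASE_DIR . $match[1] . '/' . $extra;
}

// Hypothetical helper that applies the pattern to a chunk of HTML.
function rewriteUrls($html)
{
    $pattern = '~' . preg_quote(BASE_DIR, '~')
             . '([^\.]+)\.php(\?([0-9a-zA-Z]+[^#"\']*))?~i';
    return preg_replace_callback($pattern, 'demoRewriteCallback', $html);
}

echo rewriteUrls('<a href="/articles.php?art_id=15&amp;page=2">Page 2</a>'), "\n";
// prints: <a href="/articles/art_id/15/page/2/">Page 2</a>
```

Keep in mind that the pattern is not limited to href attributes, so page text that happens to contain a matching path would be rewritten as well.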
Now you should have a ready-to-work PHP-based URL rewriting engine.
Thinking ahead
Of course you shouldn't stop here. You can do a lot more to improve the URLs.
For example, as a first step you could drop the variable names. You then have to map the values you get in a request to the right variables in your application. That way you get:
http://www.example.com/categories/3/
http://www.example.com/articles/15/2/
For this to work you must modify the parsing of the PATH_INFO in the loading script. Using a switch on the basename($_SERVER['SCRIPT_NAME']) you can map the URL data into different variables your application uses.
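A rough sketch of such a mapping follows. The function name mapPositionalParams() and the lists of variable names per script are assumptions made for this example; adapt them to the variables your own scripts expect.

```php
<?php
// Hypothetical mapping: which variable names each script expects, in URL order.
function mapPositionalParams($scriptName, $pathInfo)
{
    switch (basename($scriptName)) {
        case 'categories':
            $names = array('cat_id');
            break;
        case 'articles':
            $names = array('art_id', 'page');
            break;
        default:
            $names = array();
    }

    // Split the PATH_INFO string into its positional values.
    $trimmed = trim($pathInfo, '/');
    $values = ($trimmed === '') ? array() : explode('/', $trimmed);

    // Pair each expected name with the value found at the same position.
    $params = array();
    foreach ($names as $i => $name) {
        if (isset($values[$i])) {
            $params[$name] = $values[$i];
        }
    }
    return $params;
}

print_r(mapPositionalParams('/articles', '/15/2/'));
```

In the loading script you would call it with $_SERVER['SCRIPT_NAME'] and $_SERVER['PATH_INFO'] and merge the result into $_GET.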
Another good thing to do is to lose the numeric ids and replace them with words. For example, instead of 3 put the title of the category, Web development. This way you will have even friendlier URLs:
http://www.example.com/categories/web-development/
http://www.example.com/articles/rewriting-dynamic-urls/2/
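One possible way to derive such a word from a title is sketched below. slugify() is a name chosen for this example and it assumes plain ASCII titles; the application would then have to look the record up by this text instead of by its numeric id.

```php
<?php
// Hypothetical helper: turns a title into a lowercase, hyphen-separated slug.
// Assumes ASCII titles; accented characters would need transliteration first.
function slugify($title)
{
    $slug = strtolower($title);
    // Collapse every run of characters other than a-z and 0-9 into one hyphen.
    $slug = preg_replace('~[^a-z0-9]+~', '-', $slug);
    return trim($slug, '-');
}

echo slugify('Web development'), "\n";        // prints "web-development"
echo slugify('Rewriting dynamic URLs'), "\n"; // prints "rewriting-dynamic-urls"
```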
I hope this post helps everyone who reads it. And let's pray together for a web made of pretty URLs. Any comments on my method would be appreciated.
Comments
at 19:06 on 03/Nov/2005
- 1, 2 - correct
- 3, 4 - read the paragraphs before the code; it is specified there.
- 5 - simply, instead of creating a key on a numeric column in a database table, create a unique key on a text field.
at 21:28 on 05/Nov/2005
this is a really nice howto ;-) Nonetheless, every time I try it out I get the following error:
PHP Warning: preg_replace_callback() [function.preg-replace-callback]: requires argument 2, 'urlRewriteCallback', to be a valid callback in contentm.php on line 221
So would it be possible for you to send me a working example of your code as a zip file?
Thank you very much
Ciao
Chris ;-)
at 15:27 on 09/Nov/2005
<?php
$urlPatterns = array(
    '~href="'.preg_quote(BASE_DIR).'([^\.]+).php(\?([0-9a-zA-Z]+[^"\']*))?~',
);
?>
at 15:48 on 09/Nov/2005
I don't understand what to do with those 5 scripts.
1. Is the first script (16 lines) to be included in all the pages?
2. Does the third (3 lines) go in the first script [global loading script]?
3. Is the fourth (5 lines) to be included at the end of the page?
4. Are the five scripts one script, or must we save each one separately?
Please make an example.
Please help me...
at 21:36 on 27/Nov/2005
If you have troubles with the scripts, try to do the following: you must define the replacing callback function urlRewriteCallback() before the following line
echo preg_replace_callback($urlOld, 'urlRewriteCallback', $pageContents);
at 02:38 on 06/Dec/2005
but could you please put a little downloadable sample together so we can see how things work exactly?
anyway, i appreciate your work!
at 06:37 on 11/Dec/2005
at 22:52 on 06/Jan/2006
I've seen this solution to avoid using an extensionless file:
Options +MultiViews
DirectoryIndex index index.php index.html
MultiViews takes the request for "index" and tries to load "index.*" if "index" is not found.
at 00:47 on 28/Jan/2006
Thank You
at 06:08 on 21/Feb/2006
www.blablabla.com/products/snickers
and i want to redirect it to
www.blablabla.com/?section_id=8
and i also have
www.blablabla.com/products/
and i want to redirect it to
www.blablabla.com/?section_id=4
?
at 17:28 on 05/Apr/2006
at 07:40 on 28/Apr/2006
at 16:02 on 28/Apr/2006
@adam: Yes, you do have to rebuild your sitemap.xml file. But if you already have it dynamically generated (which is a very good idea), you just add the URL replace function at the bottom of the script. It will then treat the sitemap file as a normal file and replace all the URLs with friendly ones.
This, the whole technique, is especially good for indexing.
at 18:03 on 16/May/2006
at 00:28 on 19/Jun/2006
at 12:02 on 21/Jun/2006
I use win xp and apache.
My error:
Not Found
The requested URL /categories/id/music/ was not found on this server.
Apache/1.3.34 Server at localhost Port 80
please help me!
Thx!
at 19:57 on 21/Jun/2006
There's a setting you need to set in the httpd.conf of Apache. Under the
<Directory "path/to/localhost">
tag, you should write
AllowOverride All
at 12:40 on 26/Feb/2007
I will be very thankful to you.
at 17:06 on 14/Apr/2007
at 04:09 on 18/Apr/2007
<Files ~ "test">
ForceType application/x-httpd-php
</Files>
at 04:22 on 18/Apr/2007
at 19:20 on 19/Apr/2007
at 18:06 on 16/Oct/2007
at 10:43 on 17/Oct/2007
at 04:55 on 24/Dec/2008
When I move, for example, the categories.php script to categories, then categories becomes a downloadable file when I click on the link! It works fine on my localhost, but not on the hosting site.
Thanks in advance for your help.
at 08:05 on 19/Mar/2009
Thanks very much,
Zeck
at 20:34 on 20/Mar/2009
suppose my link is
index.php?page=OngoingAnime
using your script (which is great).. it rewrites my url to
/index/page/OngoingAnime/
what i want is to remove the index/page ... so that it's only
http://www.mysite.com/OngoingAnime/
instead of http://www.mysite.com/index/page/OngoingAnime/
can you help me plz :(
at 11:21 on 25/Sep/2009
Thanks for this article.
at 13:04 on 26/Dec/2009
I need help to rewrite my URL for better promotion.
My link is
http://www.domain.com/cat_sell.php?cid=1
where cid=1 is Agriculture
Now i want url like this
http://www.domain.com/category/Agriculture.html
Plz help.
at 20:03 on 26/Oct/2005
Comment by dantefoxfox
Thanks