Lately I started to notice that my little VPS had more and more difficulty keeping up with the amount of data it has to send to web browsers. I knew that the number of requests had increased, and after some tracking I found out that the average page size had increased as well.
Keep in mind that the average page size is not only the size of the HTML but also includes the external CSS, JavaScript and images, which can dramatically increase the amount of data users download. So I started looking into the compression options of Apache 2.0.
First I looked at the official Apache documentation, but as usual the information is probably there, just not in a very user-friendly form. So after some testing and crashing I arrived at the following procedure, which seems to work fine for enabling compression per Virtual Host.
Firstly enable the module that supports compression by executing:
:> ln -s /etc/apache2/mods-available/deflate.load /etc/apache2/mods-enabled/deflate.load
This instructs Apache to load the module (mod_deflate) needed for gzip compression. The second thing you need to do is add the following lines to every single Virtual Host you want to use compression on.
<Location / >
SetOutputFilter DEFLATE
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
</Location>
This enables compression for output (everything sent to the end user) but not for incoming requests, which in my case is enough for now. It also excludes some older browsers that do not handle compression properly.
Now restart or reload Apache by running the command below and your website should support compression. This will make pages load faster, though the client software has to decompress them from this point on.
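To make the three BrowserMatch lines a bit less cryptic, here is a rough Python simulation of how they combine. It is only a sketch: the function name `compression_flags` and the sample user-agent strings are my own, and it assumes the rules are applied in order, each one setting or unsetting (the `!` prefix) an environment variable when its regular expression matches the User-Agent header.

```python
import re

# The three BrowserMatch rules from the config above, in the same order.
# Each entry is (regex, variables to set, variables to unset).
RULES = [
    (r"^Mozilla/4", {"gzip-only-text/html"}, set()),
    (r"^Mozilla/4\.0[678]", {"no-gzip"}, set()),
    (r"\bMSI[E]", set(), {"no-gzip", "gzip-only-text/html"}),
]


def compression_flags(user_agent):
    """Return the variables that are still set after all rules have run."""
    flags = set()
    for pattern, to_set, to_unset in RULES:
        if re.search(pattern, user_agent):
            flags |= to_set
            flags -= to_unset
    return flags


if __name__ == "__main__":
    # Hypothetical user-agent strings, for illustration only.
    samples = [
        "Mozilla/4.08 [en] (Win98; I)",
        "Mozilla/4.75 [en] (X11; U; Linux)",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/2008 Firefox/3.0",
    ]
    for ua in samples:
        print(ua, "->", sorted(compression_flags(ua)) or "no restrictions")
```

An empty result means the browser gets everything compressed, `gzip-only-text/html` limits compression to HTML, and `no-gzip` switches it off entirely. Note how Internet Explorer also announces itself as Mozilla/4, which is why the third rule has to undo the first one.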
:> /etc/init.d/apache2 reload
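If you want a feel for how much there is to gain, the small Python sketch below gzip-compresses a chunk of HTML and reports the size ratio. It does not talk to Apache at all; it only illustrates the kind of saving gzip gives on repetitive markup. The sample HTML, the function name `compression_ratio` and the compression level of 6 are assumptions of mine, not values taken from the Apache setup above.

```python
import gzip


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    compressed = gzip.compress(payload, compresslevel=level)
    return len(compressed) / len(payload)


# Made-up, very repetitive markup, roughly like a listing page.
sample_html = (
    "<html><body>"
    + "<div class='row'><p>Hello world</p></div>\n" * 200
    + "</body></html>"
).encode("utf-8")

if __name__ == "__main__":
    ratio = compression_ratio(sample_html)
    print(f"original: {len(sample_html)} bytes, ratio: {ratio:.3f}")
```

Real pages are less repetitive than this sample, so expect a smaller saving in practice, and little to none on images that are already compressed.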
If you have any trouble enabling compression just leave a message and I’ll try and help you where I can.
OK, today I came across something really weird. I was trying to find information on setting up my own mail server using Postfix. No problem really, I just used Google search to find the information.
The strange crashes started happening once I found a page that didn’t contain the information I needed and I hit the back button. Every single time I did this, Google crashed my tab with some type of cross-site scripting warning. Even when IE tried to recover the tab, it crashed again. After the second crash IE just said, slightly paraphrased, ‘f*ck it the website keeps crashing go somewhere else instead!’.
So here are the steps to reproduce (it crashed every single time):
Always fun to see how some JavaScript can crash an Internet Explorer tab. At least, I am presuming it is caused by JavaScript.
Just yesterday Adobe published a press release indicating that they will start working with the major search engines, like Google and Yahoo, to help them index Flash pages.
As most developers who have tried to get their Flash pages indexed know, this is almost impossible. In the past the search engines were unable to read the content of a Flash page, simply because they don’t understand Flash.
For now only the future will tell whether this cooperation between Adobe and the search engines will help get Flash pages into the index, and even better, into the index with decent results. But given the dedication of Adobe I have no doubt that it will be a success.
Read the official Adobe press release
Read Matt Cutts reply to the announcement
We all know that search engines have strange quirks when it comes to filtering their indexes. Well, very recently it came to my attention that Google has added some new extensions to the filter list.
We already knew that files ending in .exe, .dll and .lib were being filtered from the search results, which I think is a good thing, as it protects visitors from potential harm. But just a few days back I got word that Google is now also blocking pages ending in .0.
Some examples are:
After some chatter about the issue around the internet and blogs Matt Cutts wrote a quick entry in his blog as to why they have been removed from the search results. Read it at http://www.mattcutts.com/blog/dont-end-your-urls-with-exe/.
So for now try to avoid ending URLs with .0 or any of the already known blocked extensions.
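If you generate a lot of URLs and want to check them automatically, something like the Python sketch below could do it. Keep in mind this is only an illustration: the list holds just the endings mentioned in this post, the real filter Google uses is not public, and the function name `has_blocked_ending` and the example.com URLs are made up.

```python
from urllib.parse import urlparse

# Only the endings mentioned in this post; not an official or complete list.
BLOCKED_ENDINGS = (".exe", ".dll", ".lib", ".0")


def has_blocked_ending(url: str) -> bool:
    """Return True if the path part of the URL ends with a listed ending."""
    # Look at the path only, so query strings and fragments are ignored.
    path = urlparse(url).path.lower()
    return path.endswith(BLOCKED_ENDINGS)


if __name__ == "__main__":
    for url in (
        "http://example.com/release-2.0",
        "http://example.com/release-2.0/",
        "http://example.com/index.html",
    ):
        print(url, "->", has_blocked_ending(url))
```

In this sketch a trailing slash is enough for a URL to stop counting as ending in .0. Whether Google treats it the same way is something I have not verified.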
I’ve been working on a demo website called MovServDex for quite some years now. I’m calling it a demo website, but it’s really a fully featured website on TV shows and movies. In the latest version I have decided to add a search engine. In this post I’ll shed some light on how you can create a PHP script that will ‘crawl’ the web for pages.
Before I continue please note that this is not meant to be a replacement for a real search engine like Google. But it may be useful for you to use on your own website.