HostnameLookups and other DNS considerations
Prior to Apache 1.3, HostnameLookups defaulted to On. This adds latency to every request because it requires a DNS lookup to complete before the request is finished. In Apache 1.3 this setting defaults to Off. If you need to have addresses in your log files resolved to hostnames, use the logresolve program that comes with Apache, or one of the numerous log reporting packages which are available.
It is recommended that you do this sort of postprocessing of your log files on some machine other than the production web server machine, in order that this activity not adversely affect server performance.
If you use any Allow from domain or Deny from domain directives (i.e., using a hostname, or a domain name, rather than an IP address) then you will pay for two DNS lookups (a reverse, followed by a forward lookup to make sure that the reverse is not being spoofed). For best performance, therefore, use IP addresses, rather than names, when using these directives, if possible.
Note that it's possible to scope the directives, such as within a <Location /server-status> section. In this case the DNS lookups are only performed on requests matching the criteria. Here's an example which disables lookups except for .html and .cgi files:
HostnameLookups off
<Files ~ "\.(html|cgi)$">
HostnameLookups on
</Files>
Even so, if you only need DNS names in a few CGIs, you could consider doing the gethostbyname call in the specific CGIs that need it.
FollowSymLinks and SymLinksIfOwnerMatch
Wherever in your URL-space you do not have an Options FollowSymLinks, or you do have an Options SymLinksIfOwnerMatch, Apache will have to issue extra system calls to check up on symlinks: one extra call per filename component. For example, if you had:
DocumentRoot /www/htdocs
<Directory />
Options SymLinksIfOwnerMatch
</Directory>
and a request is made for the URI /index.html. Then Apache will perform lstat(2) on /www, /www/htdocs, and /www/htdocs/index.html. The results of these lstats are never cached, so they will occur on every single request. If you really desire the symlinks security checking you can do something like this:
DocumentRoot /www/htdocs
<Directory />
Options FollowSymLinks
</Directory>
<Directory /www/htdocs>
Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>
This at least avoids the extra checks for the DocumentRoot path. Note that you'll need to add similar sections if you have any Alias or RewriteRule paths outside of your document root. For highest performance, and no symlink protection, set FollowSymLinks everywhere, and never set SymLinksIfOwnerMatch.
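If you serve content through an Alias or RewriteRule target outside the DocumentRoot, the idea is the same; a minimal sketch, assuming a hypothetical /www/icons tree:

```apacheconf
# Hypothetical alias outside /www/htdocs -- adjust paths to your layout.
Alias /icons/ /www/icons/
<Directory /www/icons>
    # Skip the per-component symlink lstat() checks for this tree as well.
    Options FollowSymLinks
</Directory>
```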
Wherever in your URL-space you allow overrides (typically .htaccess files) Apache will attempt to open .htaccess for each filename component. For example, if you had:
DocumentRoot /www/htdocs
<Directory />
AllowOverride All
</Directory>
and a request is made for the URI /index.html. Then Apache will attempt to open /.htaccess, /www/.htaccess, and /www/htdocs/.htaccess. The solutions are similar to the previous case of Options FollowSymLinks. For highest performance use AllowOverride None everywhere in your filesystem.
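Following the documentation's advice, the fast configuration for the example above would be:

```apacheconf
DocumentRoot /www/htdocs
<Directory />
    # Never search the filesystem for .htaccess files.
    AllowOverride None
</Directory>
```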
If at all possible, avoid content negotiation if you're really interested in every last ounce of performance. In practice the benefits of negotiation outweigh the performance penalties. There's one case where you can speed up the server. Instead of using a wildcard such as:
DirectoryIndex index
Use a complete list of options:
DirectoryIndex index.cgi index.pl index.shtml index.html
where you list the most common choice first.
If your site needs content negotiation consider using type-map files, rather than the Options MultiViews directive to accomplish the negotiation. See the Content Negotiation documentation for a full discussion of the methods of negotiation, and instructions for creating type-map files.
Also note that explicitly creating a type-map file provides better performance than using MultiViews, as the necessary information can be determined by reading this single file, rather than having to scan the directory for files.
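As a sketch, a type-map file (say, foo.var, served with AddHandler type-map .var) might look like the following; the filenames here are illustrative:

```
URI: foo

URI: foo.en.html
Content-type: text/html
Content-language: en

URI: foo.fr.html
Content-type: text/html
Content-language: fr
```

Each blank-line-separated record describes one variant, so Apache reads this single file instead of scanning the directory.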
In situations where Apache 2.x needs to look at the contents of a file being delivered--for example, when doing server-side-include processing--it normally memory-maps the file if the OS supports some form of mmap(2).
On some platforms, this memory-mapping improves performance. However, there are cases where memory-mapping can hurt the performance or even the stability of the httpd:
On some operating systems, mmap does not scale as well as read(2) when the number of CPUs increases. On multiprocessor Solaris servers, for example, Apache 2.x sometimes delivers server-parsed files faster when mmap is disabled.
If you memory-map a file located on an NFS-mounted filesystem and a process on another NFS client machine deletes or truncates the file, your process may get a bus error the next time it tries to access the mapped file content.
For installations where either of these factors applies, you should use EnableMMAP off to disable the memory-mapping of delivered files. (Note: This directive can be overridden on a per-directory basis.)
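For example, to keep memory-mapping on in general but switch it off just for NFS-backed content (the /www/nfs-content path is an assumption for illustration):

```apacheconf
<Directory /www/nfs-content>
    # Files here live on an NFS mount; a truncate from another client
    # could otherwise turn a mapped read into a bus error.
    EnableMMAP Off
</Directory>
```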
In situations where Apache 2.x can ignore the contents of the file to be delivered -- for example, when serving static file content -- it normally uses the kernel sendfile support to deliver the file, if the OS supports the sendfile(2) operation.
On most platforms, using sendfile improves performance by eliminating separate read and send mechanics. However, there are cases where using sendfile can harm the stability of the httpd:
Some platforms may have broken sendfile support that the build system did not detect, especially if the binaries were built on another box and moved to such a machine with broken sendfile support.
With NFS-mounted files, the kernel may be unable to reliably serve the network file through its own cache.
For installations where either of these factors applies, you should use EnableSendfile off to disable sendfile delivery of file contents. (Note: This directive can be overridden on a per-directory basis.)
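For example, disabling sendfile server-wide while re-enabling it for a tree known to be on a local filesystem (the /www/local-static path is an assumption for illustration):

```apacheconf
# Platform or NFS sendfile support is suspect: turn it off globally.
EnableSendfile Off
<Directory /www/local-static>
    # Local (non-NFS) static files where sendfile is known to be safe.
    EnableSendfile On
</Directory>
```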
1. Disable RedirectMatch directives temporarily
All the Apache servers had directives such as:
RedirectMatch /abc/xyz/data http://admin.mysite.com/abc/xyz/data
This was done so administrators who visited a special URL would be redirected to a special-purpose admin server. Since the servers were pretty much serving static pages, and they were under considerable load due to a special event, I disabled the RedirectMatch directives temporarily, for the duration of the event. Result? Apache was a lot faster.
2. Increase MaxClients and ServerLimit
This is a well-known Apache performance optimization tip. Its effect is to increase the number of httpd processes available to service the HTTP requests.
However, when I tried to increase MaxClients over 256 in the prefork.c directives and I restarted Apache, I got a message such as:
WARNING: MaxClients of 1000 exceeds ServerLimit value of 256 servers, lowering MaxClients to 256. To increase, please see the ServerLimit directive.
There is no ServerLimit entry by default in httpd.conf, so I proceeded to add one just below the MaxClients entry. I restarted httpd, and I still got the message above. The 2 entries I had in httpd.conf in the IfModule prefork.c section were:
MaxClients 1000
ServerLimit 1000
At this point I resorted to all kinds of Google searches in order to find out how to get past this issue, only to notice after a few minutes that the number of httpd processes HAD been increased to well over the default of 256!
So, lesson learned? Always double-check your work and, most importantly, know when to ignore warnings :-)
Now I have a procedure for tuning the number of httpd processes on a given box:
1. Start small, with the default MaxClients (150).
2. If Apache seems sluggish, start increasing both MaxClients and ServerLimit; restart httpd every time you do this.
3. Monitor the number of httpd processes; you can use something like:
ps -def | grep httpd | grep -v grep | wc -l
If the number of httpd processes becomes equal to the MaxClients limit you specified in httpd.conf, check your CPU and memory (via top or vmstat). If the system is not yet overloaded, go to step 2. If the system is overloaded, it's time to put another server in the server farm behind the load balancer.
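Step 3 above can be wrapped in a small script; MAX_CLIENTS is an assumption here and should be set to the MaxClients value from your own httpd.conf:

```shell
# Compare the number of running httpd workers against the configured limit.
# MAX_CLIENTS is an assumption -- use the value from your own httpd.conf.
MAX_CLIENTS=256
# The "[h]ttpd" pattern keeps grep from matching its own process entry.
RUNNING=$(ps -def | grep '[h]ttpd' | wc -l)
if [ "$RUNNING" -ge "$MAX_CLIENTS" ]; then
    echo "httpd is at MaxClients; check CPU and memory with top or vmstat"
else
    echo "headroom: $RUNNING of $MAX_CLIENTS workers running"
fi
```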
Our httpd uses several modules: URL rewriting, server info, PHP5, GeoIP, and other basic modules. We could optimize much further by using an Apache 2.2.3 worker MPM with only the useful modules, or even by serving static pages directly and proxying the dynamic pages. All of this depends on your applications and your server usage. Here we will focus only on the Apache prefork MPM.
Nowadays, it's important to keep the KeepAlive functionality enabled. It speeds up page delivery for most modern browsers (it's supported by IE, Firefox, Safari, Opera, etc.). The only thing to adjust is the default timeout value: if your keepalive timeout is too big, you will keep an entire Apache slot open for a user who is probably gone! A 4-second timeout is enough to deliver a full web page and absorb any network congestion. MaxKeepAliveRequests defines the maximum number of requests an Apache slot will handle during a keepalive session. Unless your web pages load a lot of pictures, you don't really need a big value here.
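Put together, the keepalive fragment of httpd.conf looks like this (MaxKeepAliveRequests 100 is simply the Apache default, kept here as a starting point):

```apacheconf
KeepAlive On
# A short timeout avoids pinning a slot on a visitor who has already left.
KeepAliveTimeout 4
# Raise this if your pages pull in many images over one connection.
MaxKeepAliveRequests 100
```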
As I don't have a lot of memory available on the server, I'm forced to drastically decrease the number of running servers from 150 to 60. Since each Apache process uses approximately 13 MB of memory (minus 3 MB of shared memory), I need approximately 600 MB of available memory when all the Apache child processes are running. For our further tuning we have to consider this memory as already used. In our case it's really important to dedicate this memory, to avoid swapping too much and losing the box in a freeze. You can follow your memory usage with top, looking at your apache/httpd processes (do a quick "man top" to learn more). If you have a little more free memory, take a look at the Apache documentation for further tuning.
Our server is often overloaded, with a lot of traffic. When I need to restart Apache, or after any crash, the server starts with only 5 child processes and adds 1 new child a second later, 2 more the second after that, 4 more the third second, and so on. That's far too long when you are in a peak! So I configured StartServers to let us start directly with 30 child processes. That helps us serve clients quickly and minimizes the impact of a server restart.
MinSpareServers and MaxSpareServers are used in the same spirit as StartServers. When your Apache server isn't loaded, idle children sit waiting for connections. It's not useful to keep all your children alive, but in case of a new peak the best way to minimize its impact is to deliver web pages as quickly as possible, so keeping some idle child processes waiting for clients isn't so stupid. Furthermore, on our touchy server we have already decided we can allocate 600 MB of RAM, so we may as well use it, even for idle child processes, since that RAM is dedicated to Apache. To avoid module memory leaks, and to keep fully healthy children, I set MaxRequestsPerChild to 1000, which means that after every 1000 requests a child is killed and the Apache server spawns a new one. You'll probably have to set this value higher; it depends on the structure of your web pages. Monitor your server a little after these changes to make sure it isn't spending more time killing and spawning children than delivering web pages.
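The prefork settings from the last three paragraphs can be gathered into one block; the MinSpareServers and MaxSpareServers values below are assumptions to illustrate the idea, since the RAM for idle children is budgeted anyway:

```apacheconf
<IfModule prefork.c>
    # Start 30 children immediately to absorb a restart during a peak.
    StartServers 30
    # Spare-pool bounds: illustrative values, tune to your own load.
    MinSpareServers 15
    MaxSpareServers 30
    # 60 children x ~10 MB unshared each fits the 600 MB budget.
    MaxClients 60
    # Recycle each child after 1000 requests to contain module leaks.
    MaxRequestsPerChild 1000
</IfModule>
```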
Following some security concerns, we don't display too much information about our server. Since we don't need reverse lookups on client IPs, we keep HostnameLookups at its default value of Off, and this way we save some network traffic and server load.
To speed up our page generation and save some CPU, we use the PHP extension eAccelerator. Take a look at its documentation to install it.
We dedicate 32 MB of our RAM to eAccelerator (shm_size) and use it with both shared memory and a file cache (the "shm_and_disk" value for the keys, sessions, and content variables). (Memory is really useful in our case, because all the mail, Apache log, and MySQL disk access generates too much I/O and slows the whole server down considerably.) Since we don't change the PHP scripts on the server very often, we don't need the check_mtime functionality: when set to "1", it does a stat on each PHP script to check its last-modification date. We don't need this because we want to save disk accesses and we don't update the running scripts very often; we just have to clean the cache directory after an update.
eaccelerator.keys = "shm_and_disk"
eaccelerator.sessions = "shm_and_disk"
eaccelerator.content = "shm_and_disk"
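The other eAccelerator settings discussed above sit alongside these lines in php.ini; the directive names follow the eAccelerator documentation, but verify them against your installed version:

```ini
; 32 MB of shared memory for the cache.
eaccelerator.shm_size = "32"
; Skip the stat() on each script; clean the cache dir after updates instead.
eaccelerator.check_mtime = "0"
```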