Caching is a method of improving server performance by allowing the temporary storage of commonly requested content in a way that allows faster access. This speeds up processing and delivery by reducing some resource-intensive operations.
With effective caching rules, content that is suitable for caching is stored to shorten response times, conserve resources, and minimize load. Apache provides a variety of caches suited to speeding up different types of operations. In this guide, we will discuss how to configure Apache 2.4 on Ubuntu 14.04 using its various caching modules.
To complete this tutorial, you need to have the following:
An Ubuntu server with a non-root account that can use the sudo command, and with the firewall turned on.
SSL certificate: How you set this up depends on whether you have a domain name that resolves to your server.
If you have a domain name, the easiest way to secure your site is with the Tencent Cloud SSL Certificate Service, which provides free trusted certificates. Follow the [Tencent Cloud SSL Certificate Installation Operation Guide](https://cloud.tencent.com/document/product/400/6814?from=10680) to set it up.
If you don't have a domain name, it is recommended that you register one first. If this configuration is only for testing or personal use, you can use a self-signed certificate instead of purchasing a domain name. A self-signed certificate provides the same type of encryption, but without domain validation. For details, see the articles "Create a self-signed SSL certificate for Apache" and "How to create a self-signed SSL certificate for Nginx".
Apache can cache content with different levels of complexity and scalability. The project divides these into three groups based on the way the content is cached. The general breakdown is file caching, key-value caching (also called shared object caching), and standard HTTP caching.
A quick glance at these methods may reveal some overlap, but it can be helpful to use more than one strategy at the same time. For example, using a key-value store for SSL sessions and enabling a standard HTTP cache for responses can significantly reduce the load on your data sources and speed up many content delivery operations for your clients.
Now that you have a broad understanding of each of Apache's caching mechanisms, let's take a look at these systems in more detail.
Module involved: mod_file_cache
The mod_file_cache module is mainly used to speed up file access on servers with slow file systems. It provides a choice of two configuration directives, both of which speed up the serving of static files by doing some of the work when the server starts rather than when the files are requested.
The CacheFile directive is used to specify the paths of files on disk that you want to speed up access to. When Apache starts, it opens the specified static files and caches the file handles, avoiding the need to open the files when they are requested. The number of files that can be opened this way is subject to the limits set by your operating system.
The MMapFile directive also opens files when Apache first starts. However, MMapFile caches the file's contents in memory rather than just the file handle. This allows faster performance for those pages, but it has some serious limitations. No record is kept of the amount of memory used, so it is possible to run out of memory. Also note that child processes copy any allocated memory, which can lead to resource exhaustion faster than you might initially anticipate. Use this directive sparingly.
These directives are evaluated only when Apache starts. This means that you cannot rely on Apache to pick up changes made after it has started. Use these directives only on static files that will not change for the lifetime of the Apache session. Depending on how a file is modified, the server may be notified of the change, but this is not expected behavior and will not always work correctly. If you must make changes to files passed to these directives, restart Apache after the changes have been made.
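To make the difference between caching a file handle and mapping a file into memory more concrete, here is a minimal Python sketch. It is not Apache's implementation. It assumes a Unix-like system, uses a throwaway temporary file, and the helper names open_handle and map_contents are hypothetical.

```python
import mmap
import os
import tempfile


def open_handle(path):
    """Open the file once and keep the descriptor (the CacheFile idea)."""
    return os.open(path, os.O_RDONLY)


def map_contents(path):
    """Map the whole file into memory (the MMapFile idea)."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; the mapping outlives the file object.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def demo():
    # Hypothetical static file standing in for /var/www/html/index.html.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"<h1>hello</h1>")
        path = tmp.name

    fd = open_handle(path)       # done once, "at startup"
    mapped = map_contents(path)  # done once, "at startup"
    try:
        # Per request: no open() call is needed in either case.
        from_handle = os.pread(fd, 1024, 0)  # still reads from the file
        from_map = mapped[:]                 # reads straight from memory
        return from_handle, from_map
    finally:
        mapped.close()
        os.close(fd)
        os.unlink(path)
```

In both cases the expensive step happens once, up front, which is also why later changes to the file are not reliably picked up.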
File caching is provided by the mod_file_cache module. To use this feature, you need to enable the module.
On Ubuntu 14.04, the module is installed along with Apache but is disabled by default. You can enable the module by typing:
sudo a2enmod file_cache
After that, you should edit the main configuration file to set file caching directives. Enter the following command to open the file:
sudo nano /etc/apache2/apache2.conf
To set up file handle caching, use the CacheFile directive. This directive takes a list of file paths, separated by spaces, as shown below:
CacheFile /var/www/html/index.html /var/www/html/somefile.index
After restarting the server, Apache will open the listed files and store their file handles in the cache for faster access.
If, instead, you want to map a few files directly into memory, you can use the MMapFile directive. Its syntax is basically the same as that of the previous directive, in that it simply takes a list of file paths:
MMapFile /var/www/html/index.html /var/www/html/somefile.index
In practice, there is no reason to configure both CacheFile and MMapFile for the same set of files, but you can use them on different sets of files at the same time.
When you are done, you can save and close the file. Type the following command to check the configuration file syntax:
sudo apachectl configtest
If the last line reads Syntax OK, you can safely restart the Apache instance:
sudo service apache2 restart
Apache will restart and cache the file contents or handles, depending on the directives you used.
Provider modules: mod_socache_dbm, mod_socache_dc, mod_socache_memcache, mod_socache_shmcb
Additional modules involved: mod_authn_socache, mod_ssl
Key-value caching is more complex than file caching, but its benefits are more focused. Apache's key-value cache, also known as the shared object cache, is mainly used to avoid repeating the expensive operations involved in setting up a client's access to content, rather than to cache the content itself. Specifically, it can be used to cache authentication details and SSL sessions, and to provide SSL stapling.
**Note:** Currently, each shared object cache provider has some issues. References to these issues are outlined below. Please take them into account when evaluating whether to enable this feature.
The actual caching is done by one of the shared object caching provider modules. These are mod_socache_dbm, mod_socache_dc, mod_socache_memcache, and mod_socache_shmcb.
mod_socache_dbm uses the dbm database engine, which is a file-based key-value store that uses hashing and fixed-size buckets. This provider suffers from some memory leaks, so for most cases it is recommended to use mod_socache_shmcb instead.
Along with the above provider modules, additional modules are required depending on the objects being cached. For example, to cache SSL sessions or configure SSL stapling, you must enable mod_ssl, which provides the SSLSessionCache and SSLStaplingCache directives respectively. Similarly, to set up authentication caching, the mod_authn_socache module must be enabled so that the AuthnCacheSOCache directive can be set.
With the above bugs and caveats in mind, if you still want to configure this type of caching in Apache, follow along below.
The method used to set up the key-value cache depends on what it will be used for and the provider you use. We will introduce the basics of authentication caching and SSL session caching below.
Currently, a bug in authentication caching prevents parameters from being passed to the cache provider. Therefore, any provider that does not offer default settings to fall back on will have issues.
If you use expensive authentication methods (such as LDAP or database authentication), the authentication cache is useful. If the backend must be hit every time an authentication request is issued, these types of operations can have a significant impact on performance.
Setting up the cache involves modifying your existing authentication configuration (we will not cover how to set up authentication in this guide). The modifications themselves are much the same regardless of the backend authentication method. We will use mod_socache_shmcb to demonstrate.
First, enable the authn_socache module and the mod_socache_shmcb provider module by typing:
sudo a2enmod authn_socache
sudo a2enmod socache_shmcb
Open the main Apache configuration file to specify this shared cache backend for authentication:
sudo nano /etc/apache2/apache2.conf
Inside, toward the top of the file, add the AuthnCacheSOCache directive and specify shmcb as the provider. If the bug discussed earlier that prevents options from being passed has been fixed by the time you read this, you can also specify a location and size for the cache. The number is in bytes, so the commented-out example would result in a 512 KB cache:
AuthnCacheSOCache shmcb
# If the bug preventing passed arguments to the provider gets fixed,
# you can customize the location and size like this
# AuthnCacheSOCache shmcb:${APACHE_RUN_DIR}/auth_cache(512000)
Save and close the file when you are done.
Next, open the virtual host configuration file in which authentication is configured. We assume that you are using the 000-default.conf virtual host configuration, but you should modify this to reflect your environment:
sudo nano /etc/apache2/sites-enabled/000-default.conf
In the location where you have configured authentication, modify the block to add caching. Specifically, you need to add AuthnCacheProvideFor to tell it which authentication sources to cache, add a cache timeout with AuthnCacheTimeout, and add socache to the AuthBasicProvider list ahead of your conventional authentication method. The result will look something like this:
<VirtualHost *:80>
    ...
    <Directory /var/www/html/private>
        AuthType Basic
        AuthName "Restricted Files"
        AuthBasicProvider socache file
        AuthUserFile /etc/apache2/.htpasswd
        AuthnCacheProvideFor file
        AuthnCacheTimeout 300
        Require valid-user
    </Directory>
</VirtualHost>
The example above is for file authentication, which probably will not benefit much from caching. However, the implementation should be very similar for other authentication methods. The only substantial difference is that every "file" specification in the example above would be replaced with the other authentication method.
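The reason a timeout-based cache helps with expensive backends can be sketched in a few lines of Python. This is a simplified model and not how mod_authn_socache is implemented: the class name AuthCache, the backend callable, and the choice to cache only successful checks are all assumptions made for illustration.

```python
import time


class AuthCache:
    """Toy cache in front of an expensive credential check."""

    def __init__(self, backend, timeout=300, clock=time.monotonic):
        self._backend = backend   # callable(user, password) -> bool
        self._timeout = timeout   # plays the role of AuthnCacheTimeout
        self._clock = clock
        self._entries = {}        # (user, password) -> time it was cached
        self.backend_calls = 0

    def check(self, user, password):
        now = self._clock()
        # Illustration only: a real cache would not key on plain passwords.
        key = (user, password)
        cached_at = self._entries.get(key)
        if cached_at is not None and now - cached_at < self._timeout:
            return True           # served from the cache, backend untouched
        self.backend_calls += 1
        ok = self._backend(user, password)
        if ok:
            self._entries[key] = now
        return ok
```

With a 300-second timeout, repeated requests from the same user within five minutes reach the backend only once.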
Save and close the file. Restart Apache to implement cache changes:
sudo service apache2 restart
The handshake that must be performed to establish an SSL connection carries significant overhead. Caching the session data so that this initialization step can be skipped on subsequent requests can avoid much of that penalty. The shared object cache is a perfect place for this.
If SSL is already configured for your Apache server, mod_ssl will be enabled. On Ubuntu, this means that an ssl.conf file is present in the /etc/apache2/mods-enabled directory. This file already sets up the cache. Inside, you will see some lines like this:
...
SSLSessionCache shmcb:${APACHE_RUN_DIR}/ssl_scache(512000)
SSLSessionCacheTimeout 300
...
This is actually enough to set up session caching. To test it, you can use OpenSSL's connection client. Type:
openssl s_client -connect 127.0.0.1:443 -reconnect -no_ticket | grep Session-ID
If the session IDs in all results are the same, the session cache is operating normally. Press CTRL-C to exit to the terminal.
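If you would rather check the result programmatically than by eye, the comparison can be sketched in Python. The function names are hypothetical, and the code assumes you have saved the command's output to a string and that it contains lines of the form "Session-ID: <hex>".

```python
import re

# Matches "Session-ID: <hex>" but not the "Session-ID-ctx:" lines that the
# grep filter also lets through.
SESSION_ID = re.compile(r"Session-ID:\s*([0-9A-Fa-f]+)")


def session_ids(output):
    """Return every session ID found in the captured output."""
    return SESSION_ID.findall(output)


def cache_is_working(output):
    """True when several IDs were seen and they are all identical."""
    ids = session_ids(output)
    return len(ids) > 1 and len(set(ids)) == 1
```

A single ID proves nothing, which is why the check insists on more than one occurrence.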
Module involved: mod_cache
Provider modules: mod_cache_disk, mod_cache_socache
The HTTP protocol encourages, and provides mechanisms for, caching responses all along the content delivery path. Any computer that touches the content can potentially cache each item for a certain amount of time, depending on the caching policies set at the content's origin and the computer's own caching rules.
The Apache HTTP caching mechanism caches responses according to the HTTP caching policies it sees. This is a general-purpose caching system that follows the same rules that any intermediary server with a hand in the delivery would follow. This makes the system very flexible and powerful, and lets you take advantage of the headers that you should already be setting on your content (we will cover how to do this below).
Apache's HTTP cache is also known as a "three-state" cache, because the content it stores can be in one of three states. It can be fresh, meaning it can be served to clients without further checking; stale, meaning the TTL on the content has expired; or non-existent, meaning the content is not found in the cache.
If content becomes stale, the cache can revalidate it on the next request by checking with the origin. If the content has not changed, the cache can reset the freshness date and serve the current content. Otherwise, it fetches the changed content and stores it for as long as its caching policy allows.
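The three states can be expressed as a small decision function. The Python sketch below is a conceptual model only: the dictionary layout and the expires_at field are assumptions for illustration and do not reflect how mod_cache stores entries.

```python
def classify(cache, url, now):
    """Return the state of a URL in a toy cache.

    cache maps URL -> {"body": ..., "expires_at": unix_timestamp}.
    """
    entry = cache.get(url)
    if entry is None:
        return "non-existent"  # not found in the cache: fetch from the origin
    if now < entry["expires_at"]:
        return "fresh"         # serve directly, no further checking
    return "stale"             # TTL expired: revalidate with the origin first
```

Only the stale case triggers revalidation; a fresh hit never contacts the origin.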
The HTTP caching logic is available through the mod_cache module. The actual caching is done by one of the caching providers. Typically, the cache is stored on disk using the mod_cache_disk module, but shared object caching is also available through the mod_cache_socache module.
The mod_cache_disk module caches on disk, so it is useful if you are proxying content from a remote location, generating it from a dynamic process, or just trying to speed things up by caching on a faster disk than the one your content normally resides on. This is the most thoroughly tested provider and should be your first choice in most cases. The cache is not cleaned automatically, so a tool called htcacheclean must be run occasionally to trim it. This can be run manually, set up as a regular cron job, or run as a daemon.
The mod_cache_socache module caches to one of the shared object providers (the same ones discussed in the previous section). Depending on which shared cache provider is chosen, this may perform better than mod_cache_disk. However, it is newer and relies on the shared object providers, which have the bugs discussed earlier. Thorough testing is recommended before adopting the mod_cache_socache option.
Apache's HTTP cache can be deployed in two different configurations according to your needs.
If CacheQuickHandler is set to "on", the cache is checked very early in the request handling process. If content is found, it is served directly without any further handling. This makes it very fast, but it also means that processes such as authentication are never applied to the content. If content in your cache normally requires authentication or access control, **anyone can access it without authenticating when CacheQuickHandler is set to "on"**.
Basically, this emulates a separate cache sitting in front of your web server. If your web server needs to do any kind of conditional checking, authentication, or authorization, this will not happen. Apache will not even evaluate directives inside <Location> or <Directory> blocks. Note that CacheQuickHandler is set to "on" by default!
If CacheQuickHandler is set to "off", the cache is checked significantly later in the request processing sequence. Think of this configuration as placing the cache between your Apache processing logic and your actual content. This allows the conventional processing directives to run before content is retrieved from the cache. Setting this to "off" trades a bit of speed for the ability to process requests more deeply.
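The practical consequence of the two settings is easiest to see as an ordering problem. The following Python sketch is a deliberately simplified model of request handling, not Apache's actual pipeline; the function name, its parameters, and the single protected_prefix rule are assumptions made for illustration.

```python
def handle_request(path, authenticated, cache, protected_prefix, quick_handler):
    """Return (status, body) for a request in a toy server model."""
    # Quick handler on: the cache is consulted before anything else runs.
    if quick_handler and path in cache:
        return 200, cache[path]

    # Conventional processing, such as access control, runs here.
    if path.startswith(protected_prefix) and not authenticated:
        return 401, "authentication required"

    # Quick handler off: the cache is consulted only after those checks.
    if path in cache:
        return 200, cache[path]
    return 200, "response generated by the backend"
```

With a protected page already in the cache, an unauthenticated request gets the cached page when quick_handler is True and a 401 when it is False.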
To enable caching, you need to enable the mod_cache module as well as one of its caching providers. As stated above, mod_cache_disk is well tested, so we will rely on it.
On Ubuntu systems, you can enable these modules by typing:
sudo a2enmod cache
sudo a2enmod cache_disk
This will enable caching the next time the server is restarted.
You also need to install the apache2-utils package, which contains the htcacheclean utility used to pare down the cache when necessary. You can install it by typing:
sudo apt-get update
sudo apt-get install apache2-utils
Most cache configuration takes place within individual virtual host definitions or location blocks. However, enabling mod_cache_disk also enables a global configuration file that can be used to specify some general attributes. Open that file now to take a look:
sudo nano /etc/apache2/mods-enabled/cache_disk.conf
With the comments removed, the file should look like this:
<IfModule mod_cache_disk.c>
    CacheRoot /var/cache/apache2/mod_cache_disk
    CacheDirLevels 2
    CacheDirLength 1
</IfModule>
The IfModule wrapper tells Apache to consider these directives only if the mod_cache_disk module is enabled. The CacheRoot directive specifies the location on disk where the cache will be kept. CacheDirLevels and CacheDirLength both help define how the cache directory structure is built.
An md5 hash of the URL being served is created as the key used to store the data. The data is organized into directories derived from the leading characters of each hash. CacheDirLevels specifies the number of subdirectories to create, and CacheDirLength specifies how many characters to use for the name of each directory. So, with the default values shown above, a hash of b1946ac92492d2347c6235b4d2611184 would be filed in a directory structure of b/1/946ac92492d2347c6235b4d2611184. Usually, you will not need to modify these values, but it is good to know what they are for.
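The way the two settings carve up a hash can be reproduced with a short Python function. It follows the scheme exactly as described in this guide. The function name cache_path is hypothetical, and the names Apache actually writes to disk may be encoded differently, so treat this as an illustration of the directory layout rather than a tool for locating real cache files.

```python
def cache_path(digest, dir_levels=2, dir_length=1):
    """Split a hash string into nested directory names plus a remainder.

    dir_levels mirrors CacheDirLevels and dir_length mirrors CacheDirLength.
    """
    parts = []
    pos = 0
    for _ in range(dir_levels):
        parts.append(digest[pos:pos + dir_length])
        pos += dir_length
    parts.append(digest[pos:])  # whatever is left names the final entry
    return "/".join(parts)
```

Raising either value spreads entries over more, smaller directories, which can matter on file systems that slow down when a single directory holds many files.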
**Note:** If you choose to modify the CacheRoot value, you must also open the /etc/default/apache2 file and modify its HTCACHECLEAN_PATH value to match your choice. This value is used to clean the cache at regular intervals, so it must point to the correct cache location.
Other values you can set in this file include CacheMaxFileSize and CacheMinFileSize, which set the range of file sizes in bytes that Apache will commit to the cache, as well as CacheReadSize and CacheReadTime, which allow you to wait and buffer content before sending it to the client. These can be useful if the content resides somewhere other than this server.
Most cache configuration will be done at a more granular level, whether in the virtual host definition or in a specific location block.
Open your virtual host file to follow along. We assume that you are using the default file in this guide:
sudo nano /etc/apache2/sites-enabled/000-default.conf
Within the virtual host block, outside of any location block, we can begin configuring some cache properties. In this guide, we assume that we want to turn off CacheQuickHandler so that more processing is done. This allows for more complete caching rules.
We will also take this opportunity to configure cache locking. This is a file-locking system that Apache uses when it checks in with the content origin to see whether content is still valid. While that query is being satisfied, any further requests for the same content would otherwise result in additional requests to the backend resource, which could cause load spikes.
Setting a cache lock on a resource during validation tells Apache that the resource is currently being refreshed. During this time, the stale resource can be served with a warning header indicating its state. We will set this up with a cache lock directory in the /tmp folder. We will allow a lock a maximum of 5 seconds before it is considered stale. These examples are taken directly from Apache's documentation, so they should work well for our purposes.
We will also tell Apache to ignore Set-Cookie headers and not store them in the cache. Doing so prevents Apache from accidentally leaking user-specific cookies to other parties. The Set-Cookie header will be stripped before the headers are cached.
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    CacheQuickHandler off
    CacheLock on
    CacheLockPath /tmp/mod_cache-lock
    CacheLockMaxAge 5
    CacheIgnoreHeaders Set-Cookie
</VirtualHost>
We still need to actually enable caching for this virtual host. We can do this with the CacheEnable directive. If it is set in the virtual host block, we need to provide the caching method (disk or socache) as well as the request URIs that should be cached. For example, to cache all responses, this could be set to CacheEnable disk /, but if you only wanted to cache responses under the /public URI, you could set it to CacheEnable disk /public.
We will take a different route by enabling caching within a specific location block. Doing so means that we do not have to provide a URI path to the CacheEnable directive. Any URI that would be served from that location will be cached. We will also turn on the CacheHeader directive so that our response headers indicate whether the cache was used to serve the request.
Another directive we will set is CacheDefaultExpire, which sets an expiration (in seconds) for content that has neither an Expires nor a Last-Modified header. Similarly, we will set CacheMaxExpire to cap the amount of time items are kept. We will set CacheLastModifiedFactor so that Apache can create an expiration date when content has a Last-Modified date but no expiration. The factor is multiplied by the time since modification to produce a reasonable expiration time.
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    CacheQuickHandler off
    CacheLock on
    CacheLockPath /tmp/mod_cache-lock
    CacheLockMaxAge 5
    CacheIgnoreHeaders Set-Cookie
    <Location />
        CacheEnable disk
        CacheHeader on
        CacheDefaultExpire 600
        CacheMaxExpire 86400
        CacheLastModifiedFactor 0.5
    </Location>
</VirtualHost>
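To see how the three expiry settings interact, the arithmetic can be sketched in Python. This is a simplified model of the rules described above, not the module's code. The function name and parameters are hypothetical, all times are Unix timestamps in seconds, and applying the CacheMaxExpire cap to every case is an assumption of this sketch.

```python
def freshness_lifetime(now, expires=None, last_modified=None,
                       default_expire=600, max_expire=86400, lm_factor=0.5):
    """Return how many seconds a response may be served from the cache."""
    if expires is not None:
        # The origin supplied an explicit expiration time.
        lifetime = expires - now
    elif last_modified is not None:
        # CacheLastModifiedFactor: a fraction of the content's current age.
        lifetime = (now - last_modified) * lm_factor
    else:
        # CacheDefaultExpire: nothing to go on, use the fallback.
        lifetime = default_expire
    # CacheMaxExpire caps the result (assumed here to apply in every case).
    return max(0, min(lifetime, max_expire))
```

With the values from the configuration above, a file last modified 20 minutes ago would be cached for 10 minutes, while a file last modified 10 days ago would hit the one-day cap.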
After configuring everything you need, save and close the file.
Enter the following to check the entire configuration for syntax errors:
sudo apachectl configtest
If no errors are reported, restart the service by typing the following command:
sudo service apache2 restart
In the configuration above, we set up HTTP caching, which relies on HTTP headers. However, none of the content we are serving actually has the Expires or Cache-Control headers needed to make intelligent caching decisions. To set these headers, we need to take advantage of a few more modules.
The mod_expires module can set both the Expires header and the max-age option of the Cache-Control header. The mod_headers module can be used to add more specific Cache-Control options to tune the caching policy further.
We can enable these two modules by typing:
sudo a2enmod expires
sudo a2enmod headers
After enabling these modules, we can directly modify our virtual host file again:
sudo nano /etc/apache2/sites-enabled/000-default.conf
The mod_expires module provides just three directives. ExpiresActive turns on expiration processing in a given context when set to "on". The other two directives are very similar to each other. The ExpiresDefault directive sets the default expiration time, and ExpiresByType sets the expiration time according to the MIME type of the content. Both set the Expires header and the Cache-Control "max-age" to the correct values.
Two different syntaxes can be used for these two settings. The first is simply "A" or "M" followed by a number of seconds. This sets an expiration relative to the time the content was last "accessed" or "modified" respectively. For example, both of these would expire content 30 seconds after it was accessed.
ExpiresDefault A30
ExpiresByType text/html A30
The other syntax allows for more verbose configuration. It lets you use units other than seconds, which are easier for humans to calculate. It also uses the full word "access" or "modification". The entire expiration configuration should be kept in quotes, like this:
ExpiresDefault "modification plus 2 weeks 3 days 1 hour"
ExpiresByType text/html "modification plus 2 weeks 3 days 1 hour"
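To check what a verbose expression works out to in seconds, a small Python helper can do the conversion. This is only a sketch of the syntax described above: the function name parse_expiry is hypothetical, it handles just the "access" and "modification" bases, and it deliberately supports only seconds through weeks.

```python
UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}


def parse_expiry(spec):
    """Convert e.g. 'access plus 5 minutes' to ('A', 300)."""
    words = spec.split()
    if not words or words[0] not in ("access", "modification"):
        raise ValueError("expected 'access' or 'modification' first")
    code = "A" if words[0] == "access" else "M"

    rest = words[1:]
    if rest and rest[0] == "plus":
        rest = rest[1:]
    if not rest or len(rest) % 2 != 0:
        raise ValueError("expected pairs of <number> <unit>")

    total = 0
    for amount, unit in zip(rest[0::2], rest[1::2]):
        # Accept singular and plural unit names ("hour" and "hours").
        total += int(amount) * UNIT_SECONDS[unit.rstrip("s")]
    return code, total
```

For the example above, the result is the pair ("M", 1472400), which corresponds to the short form M1472400.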
For our purposes, we will just set a default expiration. We will start with a setting of 5 minutes so that if we make a mistake while getting familiar with the directives, it will not be stored on our clients' computers for an extremely long time. When we are more confident in our ability to select policies appropriate for our content, we can adjust this to something more aggressive:
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    CacheQuickHandler off
    CacheLock on
    CacheLockPath /tmp/mod_cache-lock
    CacheLockMaxAge 5
    CacheIgnoreHeaders Set-Cookie
    <Location />
        CacheEnable disk
        CacheHeader on
        CacheDefaultExpire 600
        CacheMaxExpire 86400
        CacheLastModifiedFactor 0.5
        ExpiresActive on
        ExpiresDefault "access plus 5 minutes"
    </Location>
</VirtualHost>
This sets our Expires header to five minutes in the future and sets Cache-Control max-age=300. To refine our caching policy further, we can use the Header directive. We can use the merge option to add additional Cache-Control options. You can call this multiple times and add whichever additional policies you like. For our example, we will just set "public" so that other caches can be sure that they are allowed to store copies.
To set ETags (used for validation) for the static content on our site, we can use the FileETag directive. This works for static content. For dynamically generated content, your application is responsible for generating ETags correctly.
We use this directive to set the attributes that Apache will use to calculate the ETag. This can be INode, MTime, Size, or All, depending on whether we want the ETag to change whenever the file's inode changes, its modification time changes, its size changes, or all of the above. You can provide more than one value, and you can modify an inherited setting in a child context by preceding the new setting with + or -. For our purposes, we will just use "All" so that all changes are registered:
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    CacheQuickHandler off
    CacheLock on
    CacheLockPath /tmp/mod_cache-lock
    CacheLockMaxAge 5
    CacheIgnoreHeaders Set-Cookie
    <Location />
        CacheEnable disk
        CacheHeader on
        CacheDefaultExpire 600
        CacheMaxExpire 86400
        CacheLastModifiedFactor 0.5
        ExpiresActive on
        ExpiresDefault "access plus 5 minutes"
        Header merge Cache-Control public
        FileETag All
    </Location>
</VirtualHost>
This adds "public" (separated by a comma) to whatever value Cache-Control already has, and includes an ETag for our static content.
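The effect of the merge option on a header value can be modeled in a few lines of Python. This sketch mirrors the documented behavior of appending a value unless it is already present in the comma-separated list. The function name header_merge is hypothetical, and the code operates on plain strings rather than real responses.

```python
def header_merge(existing, value):
    """Append value to a comma-separated header unless it is already there."""
    if existing is None or not existing.strip():
        return value
    items = [item.strip() for item in existing.split(",")]
    if value in items:
        return existing  # already present, leave the header untouched
    return existing + ", " + value
```

Because a value that is already present is not added again, repeating the same merge does not produce duplicate entries such as "public, public".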
When finished, save and close the file. Type the following command to check the changed syntax:
sudo apachectl configtest
If no errors are found, restart the service to implement the caching strategy:
sudo service apache2 restart
Because there are so many options, configuring caching with Apache can seem like a daunting task. Fortunately, it is easy to start simple and then grow as you need more complexity. Most administrators will not need every type of cache.
When configuring caching, keep in mind the specific problems you are trying to solve to avoid getting lost in the different implementation choices. Most users will benefit from at least setting up headers. If you proxy or generate content, setting up an HTTP cache may be helpful. If you use a backend provider, shared object caching is useful for specific tasks such as storing SSL sessions or authentication details. File caching is probably best limited to servers with slow file systems.
Reference: "How To Configure Apache Content Caching on Ubuntu 14.04"