Web cache
A Web cache (or HTTP cache) is an information technology for the temporary storage (caching) of Web documents, such as Web pages, images, and other types of Web multimedia, to reduce server lag. A Web cache system stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met. A Web cache system can refer either to an appliance or to a computer program.[1]
Systems
Web caches can be used in various systems (as viewed from the direction of delivery of Web content):
Forward position system (recipient or client-side)
A forward cache is a cache outside the Web server's network, e.g. on the client computer, in an ISP or within a corporate network.[2] A network-aware forward cache is just like a forward cache but only caches heavily accessed items.[1] A client, such as a Web browser, can also store Web content for reuse. For example, if the back button is pressed, the locally cached version of a page may be displayed instead of a new request being sent to the Web server. A Web proxy sitting between the client and the server can evaluate HTTP headers and choose whether to store Web content.
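The header evaluation a proxy performs before storing a response can be illustrated with a small sketch. The function names `parse_cache_control` and `may_store` are hypothetical, and the rules are deliberately simplified (only successful GET responses are considered, header names are assumed to use their canonical capitalization, and quoted directive arguments containing commas are not handled); a real cache follows the full HTTP caching rules.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into a directive -> argument map."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(method: str, status: int, headers: Dict[str, str],
              shared: bool = True) -> bool:
    """Simplified storability check (an illustration, not the full HTTP rules).

    shared=True models a proxy serving many users; shared=False models a
    private cache such as a browser's.
    """
    if method.upper() != "GET" or status != 200:
        return False  # simplification: only successful GET responses
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc:
        return False  # the response must not be stored by any cache
    if shared and "private" in cc:
        return False  # the response is intended for a single user only
    # Require explicit freshness or validation information before storing.
    return ("max-age" in cc or "s-maxage" in cc or "Expires" in headers
            or "ETag" in headers or "Last-Modified" in headers)
```

Under these assumptions, a response carrying `Cache-Control: private, max-age=60` would be stored by a browser cache (`shared=False`) but not by a shared proxy.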
Reverse position system (content provider or web-server side)
A reverse cache sits in front of one or more Web servers and Web applications, accelerating requests from the Internet and reducing peak Web server load. A content delivery network (CDN) can retain copies of Web content at various points throughout a network. A search engine may also cache a website, which provides a way to retrieve information from websites that have recently gone down, or to retrieve data more quickly than by following the direct link. Google, for instance, does so; links to cached content may be found in Google search results.
Cache-control
HTTP defines three basic mechanisms for controlling caches: freshness, validation, and invalidation.[3]
- Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client. For example, the Expires response header gives a date when the document becomes stale, and the Cache-Control: max-age directive tells the cache how many seconds the response is fresh for.
- Validation can be used to check whether a cached response is still good after it becomes stale. For example, if the response has a Last-Modified header, a cache can make a conditional request using the If-Modified-Since header to see if it has changed. The ETag (entity tag) mechanism also allows for both strong and weak validation.
- Invalidation is usually a side effect of another request that passes through the cache. For example, if a URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated.
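The three mechanisms can be sketched together in a toy in-memory cache. Everything here is an assumption made for illustration: the `ToyCache` and `Entry` names, the `Origin` callable signature (which stands in for a real HTTP exchange and returns a status, body, ETag and max-age), and the injectable clock. Validation is shown with an ETag; a Last-Modified date with If-Modified-Since would follow the same pattern.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical origin: (url, if_none_match) -> (status, body, etag, max_age)
Origin = Callable[[str, Optional[str]], Tuple[int, bytes, str, int]]


@dataclass
class Entry:
    body: bytes
    etag: str
    max_age: int      # freshness lifetime in seconds (Cache-Control: max-age)
    stored_at: float  # clock reading when stored or last revalidated


class ToyCache:
    def __init__(self, origin: Origin,
                 clock: Callable[[], float] = time.monotonic):
        self.origin = origin
        self.clock = clock
        self.entries: Dict[str, Entry] = {}

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return (body, how); how is 'hit', 'revalidated' or 'miss'.

        'miss' covers both a first fetch and a stale entry that the
        origin reported as changed.
        """
        entry = self.entries.get(url)
        now = self.clock()
        if entry is not None:
            # Freshness: serve from the cache while age < max-age.
            if now - entry.stored_at < entry.max_age:
                return entry.body, "hit"
            # Validation: conditional request carrying the stored ETag.
            status, body, etag, max_age = self.origin(url, entry.etag)
            if status == 304:  # Not Modified: keep the stored body
                entry.stored_at = now
                entry.max_age = max_age
                return entry.body, "revalidated"
        else:
            status, body, etag, max_age = self.origin(url, None)
        # Store the full response received from the origin.
        self.entries[url] = Entry(body, etag, max_age, now)
        return body, "miss"

    def unsafe_request(self, method: str, url: str) -> None:
        """Invalidation: a POST, PUT or DELETE passing through drops the entry."""
        if method.upper() in {"POST", "PUT", "DELETE"}:
            self.entries.pop(url, None)
```

Passing a fake clock (for example `clock=lambda: t[0]` with a mutable list `t`) makes it possible to step an entry past its freshness lifetime and observe the switch from a fresh hit to a conditional request without waiting in real time.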
Many CDNs and manufacturers of network equipment have replaced this standard HTTP cache control with dynamic caching.
Legal issues
In 1998, the DMCA added rules to the United States Code (17 U.S.C. § 512) that relieve system operators of copyright liability for the purposes of caching.
Web caching software
The following is a list of dedicated Web caching server software:
Name | Operating system | Forward mode | Reverse mode | License |
---|---|---|---|---|
Apache HTTP Server | Windows, OS X, Linux, Unix, FreeBSD, Solaris, Novell NetWare, OS/2, TPF, OpenVMS and eComStation | No | | Apache License 2.0 |
aiScaler Dynamic Cache Control | Linux | | | Proprietary |
ApplianSys CACHEbox | Linux | | | Proprietary |
Blue Coat ProxySG | SGOS | Yes | Yes | Proprietary |
Nginx | Linux, BSD variants, OS X, Solaris, AIX, HP-UX, other *nix flavors | No | Yes | 2-clause BSD-like |
Microsoft Forefront Threat Management Gateway | Windows | Yes | Yes | Proprietary |
Polipo | Windows, OS X, Linux, OpenWrt, FreeBSD | Yes | Yes | MIT License |
Squid | Linux, Unix, Windows | Yes | Yes | GNU General Public License |
Traffic Server | Linux, Unix | Yes | Yes | Apache License 2.0 |
Untangle | Linux | Yes | Yes | Proprietary |
Varnish | Linux, Unix | Yes (possible with a VMOD) | Yes | BSD |
WinGate | Windows | Yes | Yes | Proprietary / Free for 3 users |
Nuster | Linux, Unix | No | Yes | GNU General Public License |
McAfee Web Gateway | McAfee Linux Operating System | Yes | Yes | Proprietary |
References
- Erman, Jeffrey; Gerber, Alexandre; Hajiaghayi, Mohammad T.; Pei, Dan; Spatscheck, Oliver (2008). "Network-Aware Forward Caching" (PDF). AT&T Labs: 291–300. CiteSeerX 10.1.1.159.1786. Archived from the original (PDF) on 1 April 2011. Retrieved 11 March 2019.
- Shinder, Thomas (2 September 2008). "Understanding Web Caching Concepts for the ISA Firewall". ISA Server. TechGenix Ltd. Archived from the original on 23 July 2011. Retrieved 27 February 2011.
- Kelly, Mike; Hausenblas, Michael. "Using HTTP Link: Header for Gateway Cache Invalidation" (PDF). WS-REST. p. 20. Retrieved 14 June 2013.