Access.log basics
The access.log file records every incoming request. Chapter 11 covers the fields in the access.log in detail. The most important fields are the URL (field 7) and the hierarchy access type (field 9). Note that a "-" indicates that there is no data for that field.
The following example access.log entries show how the log output changes when Squid connects directly to another server (without using a cache), through a single parent, and through multiple parents.
Though fields are separated by spaces, a field can contain sub-fields, where a "/" indicates the split.
When connecting directly to a destination server, field 9 contains two sub-fields: the keyword "DIRECT", followed by the name of the server that Squid is connecting to. Access to local servers (on your network) should always be DIRECT, even if you have a firewall, as discussed in section 3.1.2. The acl operator always_direct controls this behaviour.
905144366.259 1010 127.0.0.1 TCP_MISS/200 20868 GET
squid-cache.com - DIRECT/www.squid-cache.com text/html
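The field and sub-field layout described above can be sketched in a few lines of Python. This is only an illustration, not part of Squid: the names AccessEntry and parse_entry are made up for this example, the field labels for fields 1, 3, 8 and 10 are assumptions based on the sample entry, and the sample line is the entry printed above joined onto a single line.

```python
from typing import NamedTuple


class AccessEntry(NamedTuple):
    """One access.log entry (hypothetical helper type for this sketch)."""
    timestamp: float    # field 1: time of the request, in seconds
    elapsed_ms: int     # field 2: time taken to service the request
    client: str         # field 3: address of the client
    log_tag: str        # field 4, first sub-field (e.g. TCP_MISS)
    status: str         # field 4, second sub-field (e.g. 200)
    size: int           # field 5: bytes returned to the client
    method: str         # field 6: request method
    url: str            # field 7: the URL
    ident: str          # field 8: "-" when there is no data
    hierarchy: str      # field 9, first sub-field (e.g. DIRECT)
    peer: str           # field 9, second sub-field (server or cache name)
    content_type: str   # field 10: kept whole, although it contains a "/"


def parse_entry(line: str) -> AccessEntry:
    """Split on whitespace, then split fields 4 and 9 on the first "/"."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    log_tag, status = fields[3].split("/", 1)
    hierarchy, peer = fields[8].split("/", 1)
    return AccessEntry(
        float(fields[0]), int(fields[1]), fields[2],
        log_tag, status, int(fields[4]), fields[5],
        fields[6], fields[7], hierarchy, peer, fields[9],
    )


entry = parse_entry(
    "905144366.259 1010 127.0.0.1 TCP_MISS/200 20868 GET "
    "squid-cache.com - DIRECT/www.squid-cache.com text/html"
)
print(entry.hierarchy, entry.peer)
```

Splitting on the first "/" only keeps any further slashes inside the second sub-field intact.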
When you have configured only one parent cache, the hierarchy access type indicates this, and includes the name of that cache.
905144426.435 289 127.0.0.1 TCP_MISS/200 20868 GET
squid-cache.com - SINGLE_PARENT/cache1.squid-cache.com text/html
There are many more types that can appear in the hierarchy access information field, but these are covered in chapter 11.
Another useful field is the 'Log Tag' field (field 4). In the following example its value is "TCP_MISS/200".
905225025.225 609 127.0.0.1 TCP_MISS/200 10089 GET
Internet Service Provider|Domains|Hosting|VoIP|ADSL|Hotspots - DIRECT/www.is.co.za text/html
A MISS indicates that the request was not already stored in the cache (or that the page contained headers indicating that it was not to be cached). A HIT indicates that the page was already stored in the cache. In the latter case the request time for a remote page should be substantially less than for its first occurrence in the logs.
The second field is the time that Squid took to service the request, in milliseconds. This value should approach the time measured at the client for the same request, but operating system buffering is likely to cause a discrepancy.
The fifth field is the size of the page returned to the client. Note that an aborted request can end up downloading more than this from the origin server if the quick_abort feature set is turned on in the Squid config file.
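As a rough sketch of how fields 2, 4 and 5 can be used together, the following Python groups entries by log tag and reports the request count, mean service time and total bytes for each tag. The function name summarise_by_tag and the SAMPLE list are invented for this example; the sample entries are taken from this section, each joined onto one line.

```python
from collections import defaultdict

# Entries from this section, each written on a single line.
SAMPLE = [
    "905144366.259 1010 127.0.0.1 TCP_MISS/200 20868 GET "
    "squid-cache.com - DIRECT/www.squid-cache.com text/html",
    "905144426.435 289 127.0.0.1 TCP_MISS/200 20868 GET "
    "squid-cache.com - SINGLE_PARENT/cache1.squid-cache.com text/html",
    "905230209.899 151 127.0.0.1 TCP_HIT/200 20869 GET "
    "squid-cache.com - NONE/- text/html",
]


def summarise_by_tag(lines):
    """Return {log tag: requests, mean elapsed ms, total bytes}."""
    totals = defaultdict(lambda: {"requests": 0, "elapsed_ms": 0, "bytes": 0})
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        tag = fields[3].split("/", 1)[0]       # field 4, first sub-field
        totals[tag]["requests"] += 1
        totals[tag]["elapsed_ms"] += int(fields[1])  # field 2, milliseconds
        totals[tag]["bytes"] += int(fields[4])       # field 5, bytes to client
    return {
        tag: {
            "requests": t["requests"],
            "mean_elapsed_ms": t["elapsed_ms"] / t["requests"],
            "bytes": t["bytes"],
        }
        for tag, t in totals.items()
    }


for tag, stats in summarise_by_tag(SAMPLE).items():
    print(tag, stats)
```

On a real log the same function can be fed an open file object, since iterating over a file yields one line at a time.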
Here is an example request direct from the origin server:
905230201.136 6642 127.0.0.1 TCP_MISS/200 20847 GET
squid-cache.com - DIRECT/www.squid-cache.com text/html
If we use client to fetch the same page a short time later, a HIT is returned, and the time taken drops sharply.
905230209.899 151 127.0.0.1 TCP_HIT/200 20869 GET
squid-cache.com - NONE/- text/html
Some of you will have noticed that the size of the hit has increased slightly. If you have checked the size of a request from the origin server and compared it to that of the same page through the cache, you will also note that the size of the returned data has increased very slightly. Extra headers are added to pages passing through the cache, indicating which peer the page was returned from (if applicable), age information and other information. Clients never see this information, but it can be useful for debugging.
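The difference between the MISS and the HIT shown above can be worked out directly from fields 2 and 5. The short Python sketch below does this for the two example entries; the function name compare_entries is made up for this illustration, and each entry is written on one line.

```python
MISS = ("905230201.136 6642 127.0.0.1 TCP_MISS/200 20847 GET "
        "squid-cache.com - DIRECT/www.squid-cache.com text/html")
HIT = ("905230209.899 151 127.0.0.1 TCP_HIT/200 20869 GET "
       "squid-cache.com - NONE/- text/html")


def compare_entries(miss_line, hit_line):
    """Return (speed-up factor, extra bytes in the hit) for two entries."""
    miss = miss_line.split()
    hit = hit_line.split()
    speed_up = int(miss[1]) / int(hit[1])      # field 2: elapsed milliseconds
    extra_bytes = int(hit[4]) - int(miss[4])   # field 5: bytes to the client
    return speed_up, extra_bytes


speed_up, extra_bytes = compare_entries(MISS, HIT)
print(f"the hit was served {speed_up:.1f} times faster")   # 44.0
print(f"the hit was {extra_bytes} bytes larger")           # 22
```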
Since Squid 1.2 has support for HTTP/1.1, extra features can be used by clients accessing a copy of a page that Squid already has. Certain extra headers are included in the HTTP headers returned with HITs, indicating support for features which are not available to clients when returning MISSes. In the above example Squid has included a header in the page indicating that range requests are supported.