Category: LINUX
2015-08-31 17:05:52
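Recently I ran into a problem: after purging the cache for our site's home page, IE, Firefox and other browsers got the latest content, but Google Chrome still got the old cached content. After spending quite some time on it, searching through material online and testing, I found that it was related to the HTTP Vary header.
For what Vary actually is, see the following excerpt from the RFC (RFC 2616, section 14.44):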
14.44 Vary
The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation. For uncacheable or stale responses, the Vary field value advises the user agent about the criteria that were used to select the representation. A Vary field value of "*" implies that a cache cannot determine from the request headers of a subsequent request whether this response is the appropriate representation. See section 13.6 for use of the Vary header field by caches.
Vary = "Vary" ":" ( "*" | 1#field-name )
An HTTP/1.1 server SHOULD include a Vary header field with any cacheable response that is subject to server-driven negotiation. Doing so allows a cache to properly interpret future requests on that resource and informs the user agent about the presence of negotiation on that resource. A server MAY include a Vary header field with a non-cacheable response that is subject to server-driven negotiation, since this might provide the user agent with useful information about the dimensions over which the response varies at the time of the response.
A Vary field value consisting of a list of field-names signals that the representation selected for the response is based on a selection algorithm which considers ONLY the listed request-header field values in selecting the most appropriate representation. A cache MAY assume that the same selection will be made for future requests with the same values for the listed field names, for the duration of time for which the response is fresh.
The field-names given are not limited to the set of standard request-header fields defined by this specification. Field names are case-insensitive.
A Vary field value of "*" signals that unspecified parameters not limited to the request-headers (e.g., the network address of the client), play a role in the selection of the response representation. The "*" value MUST NOT be generated by a proxy server; it may only be generated by an origin server.
--------------------------------------------------------------------------------------------------------------------------------------------------------
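Generally speaking, caching software such as squid hashes a key from the requested URL, the request method, and the request headers named by Vary. That key is then used to locate the cached content in memory or on disk. So if browsers send different values for the headers listed in Vary, there will be several keys pointing to several copies of the content.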
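To make the key idea concrete, here is a minimal Python sketch. It is not squid's actual algorithm; the function name `cache_key`, the use of SHA-256 and the example URL are all assumptions made for illustration. It only shows why two different raw Accept-Encoding values end up as two different cache entries when the response carries `Vary: Accept-Encoding`.

```python
import hashlib

def cache_key(method, url, request_headers, vary):
    """Build a hypothetical cache key from method, URL and the request headers named in Vary."""
    # Header names are case-insensitive, so normalise them to lower case.
    headers = {name.lower(): value for name, value in request_headers.items()}
    parts = [method.upper(), url]
    for field in sorted(f.strip().lower() for f in vary.split(",") if f.strip()):
        # The raw header value is used, so "gzip,deflate,sdch" and "gzip, deflate" differ.
        parts.append(field + "=" + headers.get(field, ""))
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()

# Accept-Encoding values taken from the three browsers discussed in this post.
chrome = {"Accept-Encoding": "gzip,deflate,sdch"}
ie = {"Accept-Encoding": "gzip, deflate"}
firefox = {"Accept-Encoding": "gzip, deflate"}

url = "http://www.example.com/"  # placeholder URL, not the real site
keys = {
    name: cache_key("GET", url, hdrs, "Accept-Encoding")
    for name, hdrs in [("chrome", chrome), ("ie", ie), ("firefox", firefox)]
}
print(keys["ie"] == keys["firefox"])  # True: same header value, same cache entry
print(keys["chrome"] == keys["ie"])   # False: Chrome gets its own cache entry
```

In this sketch IE and Firefox share one entry while Chrome gets a separate one, which matches the behaviour described in this post.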
First, let's look at how the browsers differ in the request header that Vary refers to:
Request Headers:
Google Chrome (excerpt): Accept-Encoding:gzip,deflate,sdch
IE (excerpt): Accept-Encoding: gzip, deflate
Firefox (excerpt): Accept-Encoding: gzip, deflate
Response Headers:
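The Vary header is the same for all three: Vary: Accept-Encoding
As shown above, the Vary response header is identical for all three browsers and points at the Accept-Encoding request header. But because Chrome's Accept-Encoding value differs from that of the other two browsers, multiple versions of the cached object were created, so Chrome was served different content from the other two.
As for fixing this, everything I found suggested disabling or changing Vary on the back-end server. Given our site architecture, however, I decided instead to have the front-end nginx set the request's Accept-Encoding header to one fixed value, by adding this line to the nginx configuration (the directive is provided by the headers-more module, not by core nginx):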
more_set_input_headers 'Accept-Encoding:gzip, deflate';
(Note: our site is structured as nginx -> squid -> tomcat)
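The effect of forcing a single Accept-Encoding value in front of the cache can be sketched in Python as follows. This only illustrates the idea and is not how nginx implements the directive; `normalize_request` and `FIXED_ACCEPT_ENCODING` are hypothetical names introduced for this example.

```python
FIXED_ACCEPT_ENCODING = "gzip, deflate"  # the value set in the nginx directive above

def normalize_request(headers):
    """Return a copy of the request headers with Accept-Encoding forced to one value."""
    # Drop any existing Accept-Encoding header, whatever its capitalisation.
    normalized = {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}
    normalized["Accept-Encoding"] = FIXED_ACCEPT_ENCODING
    return normalized

browsers = {
    "chrome": {"Accept-Encoding": "gzip,deflate,sdch"},
    "ie": {"Accept-Encoding": "gzip, deflate"},
    "firefox": {"Accept-Encoding": "gzip, deflate"},
}

# Distinct header values seen by the cache before and after normalisation.
before = {h["Accept-Encoding"] for h in browsers.values()}
after = {normalize_request(h)["Accept-Encoding"] for h in browsers.values()}
print(len(before), len(after))  # 2 1
```

Once every request reaches squid with the same Accept-Encoding value, only one variant of the page is cached and purged. One trade-off to keep in mind: a client that did not advertise gzip support at all would also be presented to the back end as accepting gzip, so this approach assumes that all clients of the site can decode compressed responses.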