Proxies - Summary

Web proxy servers are intermediaries. Proxies sit between clients and servers and act as “middlemen,” shuffling HTTP messages back and forth between the parties. This chapter talks all about HTTP proxy servers, the special support for proxy features, and some of the tricky behaviors you’ll see when you use proxy servers.

In this chapter, we:

• Explain HTTP proxies, contrasting them to web gateways and illustrating how proxies are deployed.
• Show some of the ways proxies are helpful.
• Describe how proxies are deployed in real networks and how traffic is directed to proxy servers.
• Show how to configure your browser to use a proxy.
• Demonstrate HTTP proxy requests, how they differ from server requests, and how proxies can subtly change the behavior of browsers.
• Explain how you can record the path of your messages through chains of proxy servers, using Via headers and the TRACE method.
• Describe proxy-based HTTP access control.
• Explain how proxies can interoperate between clients and servers, each of which may support different features and versions.

6.1 Web Intermediaries

• Web proxy servers are middlemen that fulfill transactions on the client’s behalf.
• With a web proxy, the client talks to the proxy instead of directly to the server, and the proxy communicates with the server on the client’s behalf.
• The client still completes the transaction, but through the good services of the proxy server.
• HTTP proxy servers are both web servers and web clients.
• Because HTTP clients send request messages to proxies, the proxy server must properly handle the requests and the connections and return responses, just like a web server.
• At the same time, the proxy itself sends requests to servers, so it must also behave like a correct HTTP client, sending requests and receiving responses (see Figure 6-1).

Figure 6-1. A proxy must be both a server and a client
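The dual role shown in Figure 6-1 can be sketched in a few lines of Python. This is a minimal illustration rather than a real proxy: requests and responses are plain tuples, and the names `make_proxy`, `send_upstream`, and `fake_origin` are hypothetical helpers invented for this sketch. No network I/O takes place.

```python
from typing import Callable, Dict, Tuple

# Simplified message shapes (assumptions made for this sketch only).
Request = Tuple[str, str, Dict[str, str]]      # (method, url, headers)
Response = Tuple[int, Dict[str, str], bytes]   # (status, headers, body)


def make_proxy(send_upstream: Callable[[Request], Response]) -> Callable[[Request], Response]:
    """Build a handler that acts as a server to clients and a client to servers."""

    def handle(request: Request) -> Response:
        method, url, headers = request
        # Server role: check the incoming request before acting on it.
        if method not in ("GET", "HEAD"):
            return 501, {}, b"method not implemented in this sketch"
        # Client role: send our own request to the origin server.
        status, resp_headers, body = send_upstream((method, url, dict(headers)))
        # Server role again: return the origin's answer to the client.
        return status, dict(resp_headers), body

    return handle


def fake_origin(request: Request) -> Response:
    """Hypothetical origin server, used only to exercise the sketch."""
    _method, url, _headers = request
    return 200, {"Content-Type": "text/plain"}, ("hello from " + url).encode()


proxy = make_proxy(fake_origin)
print(proxy(("GET", "http://www.example.com/index.html", {})))
```

Injecting `send_upstream` keeps the two roles visibly separate: everything before the call is server-side handling, and the call itself is the proxy acting as a client.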

6.1.1 Private and Shared Proxies

• Proxies shared among numerous clients are called public proxies.
  ◦ Most proxies are public, shared proxies. It’s more cost-effective and easier to administer a centralized proxy. And some proxy applications, such as caching proxy servers, become more useful as more users are funneled into the same proxy server, because they can take advantage of common requests between users.
• Proxies dedicated to a single client are called private proxies.
  ◦ Dedicated private proxies are not as common, but they do have a place, especially when run directly on the client computer. Some browser assistant products, as well as some ISP services, run small proxies directly on the user’s PC in order to extend browser features, improve performance, or host advertising for free ISP services.

6.1.2 Proxies Versus Gateways

• Strictly speaking, proxies connect two or more applications that speak the same protocol, while gateways hook up two or more parties that speak different protocols.
• A gateway acts as a “protocol converter.”

Figure 6-2. Proxies speak the same protocol; gateways tie together different protocols

• In practice, the difference between proxies and gateways is blurry. Because browsers and servers implement different versions of HTTP, proxies often do some amount of protocol conversion.
• And commercial proxy servers implement gateway functionality to support SSL security protocols, SOCKS firewalls, FTP access, and web-based applications.

6.2 Why Use Proxies?

• Proxy servers can improve security, enhance performance, and save money.
• And because proxy servers can see and touch all the passing HTTP traffic, proxies can monitor and modify the traffic to implement many useful value-added web services.
• Here are examples of just a few of the ways proxies can be used:

Child filter

Elementary schools use filtering proxies to block access to adult content, while providing unhindered access to educational sites.

Figure 6-3. Proxy application example: child-safe Internet filter
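A filtering proxy of this kind boils down to a check made before a request is forwarded. The sketch below assumes a simple host blocklist; the host names, the `forward` callback, and the choice of a 403 response are illustrative assumptions, not details from the chapter.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real filter would use a maintained category database.
BLOCKED_HOSTS = {"adult.example.com"}


def is_blocked(host, blocked=BLOCKED_HOSTS):
    """Return True if the host, or any parent domain of it, is on the blocklist."""
    host = host.lower().rstrip(".")
    return any(host == entry or host.endswith("." + entry) for entry in blocked)


def filter_request(url, forward):
    """Refuse blocked URLs; pass everything else to the forward callback."""
    host = urlsplit(url).hostname or ""
    if is_blocked(host):
        return 403, b"Blocked by filtering proxy"
    return forward(url)
```

Here `forward` stands in for whatever the proxy does to fetch the page, so educational sites pass through the filter unchanged.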

Document access controller

• Proxy servers can be used to implement a uniform access-control strategy across a large set of web servers and web resources and to create an audit trail.
• This is useful in large corporate settings or other distributed bureaucracies.
• All the access controls can be configured on the centralized proxy server, without requiring the access controls to be updated frequently on numerous web servers, of different makes and models, administered by different organizations.

Figure 6-4. Proxy application example: centralized document access control

Security firewall

• Proxy servers restrict which application-level protocols flow in and out of an organization, at a single secure point in the network.
• They also can provide hooks to scrutinize that traffic (Figure 6-5), as used by virus-eliminating web and email proxies.

Figure 6-5. Proxy application example: security firewall

Web cache

• Proxy caches maintain local copies of popular documents and serve them on demand, reducing slow and costly Internet communication.

Figure 6-6. Proxy application example: web cache
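The core of a proxy cache is a lookup that contacts the origin server only on a miss. The following is a deliberately simplified sketch: the `CachingProxy` class and its `fetch` callback are hypothetical, and it ignores expiration and revalidation, which any real cache has to handle.

```python
class CachingProxy:
    """Toy cache keyed by URL; entries never expire in this sketch."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that retrieves a URL from the origin
        self._cache = {}           # url -> cached response body
        self.origin_fetches = 0    # how many times the origin was contacted

    def get(self, url):
        if url not in self._cache:
            # Cache miss: pay for one trip to the origin, then remember the result.
            self.origin_fetches += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


cache = CachingProxy(lambda url: b"contents of " + url.encode())
cache.get("http://www.example.com/logo.gif")
cache.get("http://www.example.com/logo.gif")
print(cache.origin_fetches)
```

The second request for the same URL is served from the local copy, which is where the saving in "slow and costly Internet communication" comes from.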

Surrogate

• Proxies can masquerade as web servers. These so-called surrogates or reverse proxies receive real web server requests, but, unlike web servers, they may initiate communication with other servers to locate the requested content on demand.
• Surrogates may be used to improve the performance of slow web servers for common content. In this configuration, the surrogates often are called server accelerators (Figure 6-7). Surrogates also can be used in conjunction with content-routing functionality to create distributed networks of on-demand replicated content.

Figure 6-7. Proxy application example: surrogate (in a server accelerator deployment)

Content router

• Proxy servers can act as “content routers,” vectoring requests to particular web servers based on Internet traffic conditions and type of content.

Example.

Figure 6-8. Proxy application example: content routing
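As a rough illustration of content routing, the sketch below picks a server pool by content type (guessed from the file extension) and then the server in that pool with the lowest measured latency. The server names, latency figures, and extension list are all made-up assumptions for this example.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical server pools with illustrative latencies in milliseconds.
SERVERS = {
    "image": {"img1.example.com": 40, "img2.example.com": 15},
    "default": {"www1.example.com": 30, "www2.example.com": 55},
}
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def pick_server(url, servers=SERVERS):
    """Choose a pool by content type, then the fastest server in that pool."""
    extension = posixpath.splitext(urlsplit(url).path)[1].lower()
    pool = servers["image"] if extension in IMAGE_EXTENSIONS else servers["default"]
    return min(pool, key=pool.get)
```

In this sketch the latency table is static; a real content router would refresh it from live measurements of traffic conditions.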

Transcoder

• Proxy servers can modify the body format of content before delivering it to clients. This transparent translation between data representations is called transcoding.

Usage Example.

Figure 6-9. Proxy application example: content transcoder

Anonymizer

• Anonymizer proxies provide heightened privacy and anonymity, by actively removing identifying characteristics from HTTP messages (e.g., client IP address, From header, Referer header, cookies, URI session IDs).

Figure 6-10. Proxy application example: anonymizer

Figure 6-10 Description
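The anonymizer described above can be approximated by scrubbing a request before it is forwarded. This sketch removes the From, Referer, and Cookie headers and drops a session-ID query parameter. The parameter name `sessionid` is a hypothetical choice, since real sites use many different names, and hiding the client IP address happens at the connection level and is not shown.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

IDENTIFYING_HEADERS = {"from", "referer", "cookie"}  # compared case-insensitively
SESSION_PARAMS = {"sessionid"}                       # hypothetical parameter name


def anonymize(url, headers):
    """Return (url, headers) with identifying details stripped."""
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    clean_url = urlunsplit(parts._replace(query=urlencode(kept)))
    return clean_url, clean_headers
```

For example, a request for `http://www.example.com/page?sessionid=abc123&lang=en` carrying Referer and Cookie headers would be forwarded as `http://www.example.com/page?lang=en` with those headers removed.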

6.3 Where Do Proxies Go?

The previous section explained what proxies do. Now let’s talk about where proxies sit when they are deployed into a network architecture. We’ll cover:

• How proxies can be deployed into networks
• How proxies can chain together into hierarchies
• How traffic gets directed to a proxy server in the first place

6.3.1 Proxy Server Deployment

You can place proxies in all kinds of places, depending on their intended uses. Figure 6-11 sketches a few ways proxy servers can be deployed.

Figure 6-11. Proxies can be deployed many ways, depending on their intended use.

Egress proxy (Figure 6-11a)

• You can stick proxies at the exit points of local networks to control the traffic flow between the local network and the greater Internet.
• You might use egress proxies in a corporation to offer firewall protection (1) against malicious hackers outside the enterprise or to reduce bandwidth charges (2) and improve performance of Internet traffic (3). An elementary school might use a filtering egress proxy to prevent precocious students from browsing inappropriate content (4).

Access (ingress) proxy (Figure 6-11b)

• Proxies are often placed at ISP access points, processing the aggregate requests from the customers.
• ISPs use caching proxies to store copies of popular documents, to improve the download speed for their users (especially those with high-speed connections) and reduce Internet bandwidth costs.

Surrogates (Figure 6-11c)

• Proxies frequently are deployed as surrogates (also commonly called reverse proxies) at the edge of the network, in front of web servers, where they can field all of the requests directed at the web server and ask the web server for resources only when necessary.
• Surrogates can add security features (1) to web servers or improve performance by placing fast web server caches in front of slower web servers (2). Surrogates typically assume the name and IP address of the web server directly, so all requests go to the proxy instead of the server.

Network exchange proxy (Figure 6-11d)

• With sufficient horsepower, proxies can be placed in the Internet peering exchange points between networks, to alleviate congestion at Internet junctions through caching and to monitor traffic flows.

6.3.2 Proxy Hierarchies

• Proxies can be cascaded in chains called proxy hierarchies.
• In a proxy hierarchy, messages are passed from proxy to proxy until they eventually reach the origin server (and then are passed back through the proxies to the client), as shown in Figure 6-12.

Figure 6-12. Three-level proxy hierarchy

• Proxy servers in a proxy hierarchy are assigned parent and child relationships. The next inbound proxy (closer to the server) is called the parent, and the next outbound proxy (closer to the client) is called the child.

Proxy hierarchy content routing

• A proxy server can forward messages to a varied and changing set of proxy servers and origin servers, based on many factors.
• For example, in Figure 6-13, the access proxy routes to parent proxies or origin servers in different circumstances:

Figure 6-13. Proxy hierarchies can be dynamic, changing for each request.

• If the requested object belongs to a web server that has paid for content distribution, the proxy could route the request to a nearby cache server that would either return the cached object or fetch it if it wasn’t available.
• If the request was for a particular type of image, the access proxy might route the request to a dedicated compression proxy that would fetch the image and then compress it, so it would download faster across a slow modem to the client.
• Here are a few other examples of dynamic parent selection:

Load balancing

A child proxy might pick a parent proxy based on the current level of workload on the parents, to spread the load around.

Geographic proximity routing

A child proxy might select a parent responsible for the origin server’s geographic region.

Protocol/type routing

A child proxy might route to different parents and origin servers based on the URI. Certain types of URIs might cause the requests to be transported through special proxy servers, for special protocol handling.

Subscription-based routing

If publishers have paid extra money for high-performance service, their URIs might be routed to large caches or compression engines to improve performance.
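Two of the selection strategies above, protocol/type routing and load balancing, can be combined in one small function. This is a sketch under assumed inputs: the parent names, the load figures, and the idea of keying special handling on the URI scheme are illustrative choices, not something the chapter prescribes.

```python
from urllib.parse import urlsplit


def pick_parent(url, parent_load, special_parents=None):
    """Pick the next hop for a request.

    parent_load maps parent proxy names to their current load (lower is better).
    special_parents maps URI schemes to parents that handle them specially.
    """
    special_parents = special_parents or {}
    scheme = urlsplit(url).scheme.lower()
    # Protocol/type routing: some URIs always go through a dedicated parent.
    if scheme in special_parents:
        return special_parents[scheme]
    # Load balancing: otherwise choose the least-loaded parent.
    return min(parent_load, key=parent_load.get)


load = {"parent-a.example.com": 0.8, "parent-b.example.com": 0.3}
print(pick_parent("http://www.example.com/", load))
print(pick_parent("ftp://ftp.example.com/file", load, {"ftp": "ftp-gw.example.com"}))
```

Geographic and subscription-based routing would slot in the same way, as extra checks made before falling back to the least-loaded parent.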

6.3.3 How Proxies Get Traffic

• We need to explain how HTTP traffic finds its way to a proxy in the first place.
• There are four common ways to cause client traffic to get to a proxy:

Figure 6-14. There are many techniques to direct web requests to proxies

Modify the client

• Many web clients, including Netscape and Microsoft browsers, support both manual and automated proxy configuration.
• If a client is configured to use a proxy server, the client sends HTTP requests directly and intentionally to the proxy, instead of to the origin server (Figure 6-14a).

Modify the network

• There are several techniques where the network infrastructure intercepts and steers web traffic into a proxy, without the client’s knowledge or participation.
• This interception typically relies on switching and routing devices that watch for HTTP traffic, intercept it, and shunt the traffic into a proxy, without the client’s knowledge (Figure 6-14b).
• This is called an intercepting proxy.

Modify the DNS namespace

• Surrogates, which are proxy servers placed in front of web servers, assume the name and IP address of the web server directly, so all requests go to them instead of to the server (Figure 6-14c).
• This can be arranged by manually editing the DNS naming tables or by using special dynamic DNS servers that compute the appropriate proxy or server to use on demand.

Modify the web server

• Some web servers also can be configured to redirect client requests to a proxy by sending an HTTP redirection command (response code 305) back to the client.
• Upon receiving the redirect, the client transacts with the proxy (Figure 6-14d).
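Of the four techniques above, modifying the client is the easiest to try from a script. The sketch below uses Python's standard `urllib.request.ProxyHandler` to build an opener that sends plain HTTP requests to a proxy instead of to the origin server. The proxy address is a placeholder assumption, and no request is actually sent here.

```python
import urllib.request

# Hypothetical proxy address; substitute one that exists on your network.
PROXY_URL = "http://proxy.example.com:8080"


def build_proxied_opener(proxy_url=PROXY_URL):
    """Return an opener whose http:// requests are sent to proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxied_opener()
# Calling opener.open("http://www.example.com/") would now contact the proxy,
# which in turn talks to www.example.com on the client's behalf.
```

Passing an explicit mapping overrides whatever proxy settings the environment would otherwise supply for this opener, which mirrors the manual configuration case.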

6.4 Client Proxy Settings

All modern web browsers let you configure the use of proxies. In fact, many browsers provide multiple ways of configuring proxies, including:

Manual configuration

You explicitly set a proxy to use.

Browser preconfiguration

The browser vendor or distributor manually preconfigures the proxy setting of the browser (or any other web client) before delivering it to customers.

Proxy auto-configuration (PAC)

You provide a URI to a JavaScript proxy auto-configuration (PAC) file; the client fetches the JavaScript file and runs it to decide if it should use a proxy and, if so, which proxy server to use.

WPAD proxy discovery

Some browsers support the Web Proxy Autodiscovery Protocol (WPAD), which automatically detects a “configuration server” from which the browser can download an auto-configuration file.
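A PAC file is JavaScript that defines a function `FindProxyForURL(url, host)` and returns a string such as `DIRECT` or `PROXY host:port`. To stay in one language with the other sketches in these notes, the following is a Python model of the kind of decision such a file typically makes, not an actual PAC file. The proxy address and the intranet domain are hypothetical.

```python
from urllib.parse import urlsplit

PROXY = "PROXY proxy.example.com:8080"      # hypothetical proxy
INTRANET_SUFFIX = ".intranet.example.com"   # hypothetical internal domain


def find_proxy_for_url(url, host=None):
    """Mimic a PAC decision: go direct for local hosts, use the proxy otherwise."""
    host = (host or urlsplit(url).hostname or "").lower()
    # Plain host names (no dots) and intranet hosts bypass the proxy.
    if "." not in host or host.endswith(INTRANET_SUFFIX):
        return "DIRECT"
    # Everything else tries the proxy first, then falls back to a direct connection.
    return PROXY + "; DIRECT"
```

The return values follow the PAC string format, so the same rules could be carried over to a real JavaScript PAC file with little change.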

The rest is omitted.