A proxy, or proxy server, in a computer network is a server (a program or a device) that acts as an intermediary for the resource requests a client (A) makes to another server (C). For example, if a hypothetical machine A requests a resource from C, it does so by sending the request to the proxy B, which in turn forwards it to C; in this way C does not know that the request originally came from A. This strategic intermediate position allows the proxy to offer various functions: access control, traffic logging, restriction of certain types of traffic, performance improvement, anonymous communication, web caching, etc. Depending on the context, users, administrators or providers may regard the proxy's intermediation as legitimate or criminal, and its use is frequently debated.
Web proxies can provide a number of useful features in different areas:
Traffic reduction by implementing a cache in the proxy. Requests for web pages are made to the proxy and not directly to the Internet. As a result, network traffic is reduced and the destination servers are offloaded, since fewer requests reach them.
The cache typically uses a configurable algorithm to determine when a document is out of date and should be removed from the cache. Its configuration parameters are the document's age, size and access history. Two basic algorithms of this kind are LRU ("Least Recently Used") and LFU ("Least Frequently Used").
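The difference between the two eviction policies can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular proxy: the class names `LRUCache` and `LFUCache`, the `capacity` parameter and the use of URLs as keys are all assumptions made for the example.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Evicts the entry that was accessed least recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used first

    def get(self, url):
        if url not in self.entries:
            return None
        self.entries.move_to_end(url)  # mark as most recently used
        return self.entries[url]

    def put(self, url, document):
        if url in self.entries:
            self.entries.move_to_end(url)
        self.entries[url] = document
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used


class LFUCache:
    """Evicts the entry with the fewest recorded accesses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}
        self.hits = Counter()  # access count per URL

    def get(self, url):
        if url not in self.entries:
            return None
        self.hits[url] += 1
        return self.entries[url]

    def put(self, url, document):
        if url not in self.entries and len(self.entries) >= self.capacity:
            # Pick the stored URL with the lowest access count.
            victim = min(self.entries, key=lambda u: self.hits[u])
            del self.entries[victim]
            del self.hits[victim]
        self.entries[url] = document
        self.hits[url] += 1
```

With a capacity of two, storing pages "a" and "b", reading "a" and then storing "c" evicts "b" under both policies here, but the two diverge as soon as an old page has been read many times: LFU keeps it, while LRU drops it once newer pages have been touched.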
Improved response time by implementing a cache in the proxy. The proxy server keeps a cache that avoids identical transfers of information between servers for a period of time (configured by the administrator), so that the user receives a faster response. For example, suppose an ISP runs a caching proxy server. If a customer of that ISP sends a request to, say, Google, the request reaches the ISP's proxy server instead of going directly to the IP address of the Google domain. This particular page is requested by a high percentage of users, so the ISP keeps it in its proxy for a certain time and can produce a response in much less time. When the user then runs a search on Google, the proxy's cache is no longer used; the ISP forwards the request and the client receives the answer from Google itself.
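The time-limited caching described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class `TTLCache`, the `ttl_seconds` parameter (standing in for the administrator-configured period) and the `fetch_from_origin` callback are hypothetical names, and a real proxy would also honor the caching headers sent by the origin server.

```python
import time


class TTLCache:
    """Serves a stored copy until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.store = {}     # url -> (expires_at, body)

    def fetch(self, url, fetch_from_origin):
        """Return (body, served_from_cache)."""
        now = self.clock()
        cached = self.store.get(url)
        if cached is not None and cached[0] > now:
            return cached[1], True
        # Missing or expired: contact the origin server and store the result.
        body = fetch_from_origin(url)
        self.store[url] = (now + self.ttl, body)
        return body, False
```

The first request for a URL is forwarded to the origin server; repeated requests within the configured period are answered from the stored copy, and the first request after the period has elapsed triggers a new transfer.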
P2P programs can take advantage of the cache provided by some proxies, the so-called webcache. It is used, for example, in Lphant and some eMule mods.
The proxy can be used to implement content filtering. This requires configuring a set of restrictions that state what is not allowed. Note that this functionality can be used not only to prevent certain users from accessing certain content, but also to filter out files that may be considered dangerous, such as viruses and other hostile content served by remote web servers.
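A filter of this kind boils down to checking each request against a list of restrictions before forwarding it. The sketch below assumes two hypothetical rule sets, `BLOCKED_HOSTS` and `BLOCKED_EXTENSIONS`, whose contents are invented for the example; real filtering proxies usually also inspect the response body.

```python
from urllib.parse import urlsplit

# Hypothetical restrictions, for illustration only.
BLOCKED_HOSTS = {"ads.example.com", "malware.example.net"}
BLOCKED_EXTENSIONS = (".exe", ".scr")


def is_allowed(url):
    """Return False if the URL matches any configured restriction."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in BLOCKED_HOSTS:
        return False  # the whole site is forbidden
    if parts.path.lower().endswith(BLOCKED_EXTENSIONS):
        return False  # potentially dangerous file type
    return True
```

A proxy using this check would forward the request only when `is_allowed(url)` is true and return an error page to the client otherwise.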
A proxy can hide the identity of whoever requests certain content from the web server. The only thing the web server sees is that the proxy's IP address requested the content; it cannot determine the IP address the request originated from. Furthermore, if a cache is used, the content may be accessed many more times than the web server hosting it detects.
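Both effects can be seen in a small simulation. The classes `OriginServer` and `CachingProxy` and the addresses used are hypothetical, and the sketch assumes a proxy that passes on no client information; many real proxies reveal the client address in headers such as `X-Forwarded-For`.

```python
class OriginServer:
    def __init__(self):
        self.access_log = []  # addresses the origin server has seen

    def handle(self, requester_ip, path):
        self.access_log.append(requester_ip)
        return "content of " + path


class CachingProxy:
    def __init__(self, proxy_ip, origin):
        self.proxy_ip = proxy_ip
        self.origin = origin
        self.cache = {}

    def handle(self, client_ip, path):
        # client_ip is known to the proxy but is never passed on.
        if path not in self.cache:
            self.cache[path] = self.origin.handle(self.proxy_ip, path)
        return self.cache[path]


origin = OriginServer()
proxy = CachingProxy("203.0.113.10", origin)
for client_ip in ("198.51.100.1", "198.51.100.2", "198.51.100.3"):
    proxy.handle(client_ip, "/index.html")

print(origin.access_log)  # ['203.0.113.10']
```

Three different clients read the page, yet the origin server's log holds a single entry, and that entry carries the proxy's address.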
Proxies can be used to let a web service cope with higher user demand than would be possible without them.
The proxy server can modify the content served by the original web servers. There may be different reasons for doing this. Some examples:
Some proxies can change the format of web pages for a specific purpose or audience (e.g. displaying a page on a mobile phone or PDA) by converting the content.
Proxies of this type are often run on the users' own machines (local proxies) as an intermediate step, so that requests and responses are not sent to or received from the network without first being cleaned of dangerous or private information. This type of proxy is typical in environments where there is a lot of concern about privacy, and it is usually used as a preliminary step before requesting content through a network that seeks anonymity, such as Tor. The most common programs for this purpose are:
Privoxy: Focuses on web content. It does not provide a cache service. It analyzes traffic based on predefined rules that are associated with addresses specified by regular expressions and are applied to headers, content, etc. It is highly configurable and has extensive documentation.
Polipo: It has features that make it faster than Privoxy (caching, pipelining, intelligent use of range requests). Its disadvantage is that it is not configured by default to provide anonymity at the application layer.
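The rule-based cleaning that such local proxies perform can be illustrated with a short sketch. It does not use the configuration syntax of Privoxy or Polipo: the `RULES` table, its patterns and the `scrub_headers` function are hypothetical and only show the general idea of attaching actions to addresses matched by regular expressions.

```python
import re

# Hypothetical rules: (URL pattern, header names to strip when it matches).
RULES = [
    (re.compile(r"^https?://([^/]+\.)?example\.com/"), {"cookie", "referer"}),
    (re.compile(r".*"), {"user-agent"}),  # applies to every request
]


def scrub_headers(url, headers):
    """Return a copy of headers without those the matching rules forbid."""
    to_strip = set()
    for pattern, names in RULES:
        if pattern.search(url):
            to_strip |= names
    # Header names are compared case-insensitively.
    return {k: v for k, v in headers.items() if k.lower() not in to_strip}
```

For a request to `http://www.example.com/page`, this sketch removes the `Cookie`, `Referer` and `User-Agent` headers and leaves the rest untouched; for any other site only `User-Agent` is removed.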