Proxy

Many network behaviors look like “the client is talking directly to the server”, but there is often already a proxy in the middle. Browsers may go through a corporate proxy to reach the Internet. Mobile requests may hit a CDN before reaching the origin. Services may sit behind Nginx or an API gateway. When troubleshooting, the peer address, TLS certificate, source IP, and connection count you see may already be altered.

The easiest way to underestimate a proxy is to think of it as “something that helps forward a request”. The important change is not that packets detour through another hop. It is that the two endpoints no longer face each other directly: one connection is terminated at the proxy, the next connection is created by the proxy on behalf of one side, and part of the identity, policy, caching, audit, and reachability model moves into that middle layer.

A proxy is not just “sending traffic onward”. It inserts a middle participant that can terminate connections, speak for one side, and reapply policy between the two ends.

Why It Appears

If a network only had direct client-server communication, many useful capabilities would be hard to deploy consistently:

The client wants unified egress, auditing, and access control
The server wants to hide internal topology, absorb connection pressure, and unify ingress policy
The path wants to cache hot content, compress traffic, or speed up access locally
The organization wants authentication, rate limiting, logging, and protocol compatibility handled outside the business service

All of these want something more than “deliver the packet to the peer”. They want a decision-making point in the middle of the path. Routers mainly decide the next hop. NAT mainly rewrites addresses and maintains mappings. A proxy goes further and appears to one side as an actual communication participant.

So when a proxy appears, three things are really being redistributed:

Who can see the real peer
Who creates the next connection
Who has the authority to apply policy in the middle

Grasp the Smallest Model First

Think of the path as two segments instead of one straight line:

Client <-> Proxy <-> Server

The diagram looks simple, but the key point is that the relation on each side has changed.

To the client, the peer may no longer be the final server. It may be the proxy.

To the server, the thing connecting may no longer be the original client. It may be the proxy.

That means a proxy usually does at least two things:

Terminates one side of the connection or session
Starts another connection on behalf of that side

Once those two things are true, the proxy is no longer just “seeing traffic”. It can:

Rewrite requests and responses
Decide whether traffic may continue
Reuse, cache, queue, or rate-limit
Add metadata such as the original client address

The Most Common Main Path

Here is a classic reverse-proxy path. The user accesses https://example.com/api/devices, but the origin is not directly exposed to the public Internet:

Client
  -> Reverse Proxy / CDN: TLS + HTTP Request

Reverse Proxy / CDN
  -> Origin Service: HTTP or HTTPS Request

Origin Service
  -> Reverse Proxy / CDN: Response

Reverse Proxy / CDN
  -> Client: Response

The easiest thing to misread is “the request just took one more hop”. In reality, the proxy may already have done several things:

Terminated TLS and seen plaintext HTTP
Routed the request to different backends by Host, path, or header
Returned a cached response without going back to origin
Added X-Forwarded-For, Forwarded, or X-Forwarded-Proto
Kept working with the backend after the client disconnected, or returned an error earlier than the backend did

From an engineering perspective, the client and origin no longer share one raw connection. They share a logical path assembled by the proxy.
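
Two of the steps above, routing by Host and path and adding forwarding headers, can be sketched in a few lines. This is a minimal illustration, not how any particular proxy implements it: the routing table, the upstream addresses, and the function names are hypothetical, and header names are assumed to arrive in canonical case.

```python
from typing import Dict, List, Tuple

# Hypothetical routing table: (host, path prefix, upstream address).
# Order matters: the first matching entry wins.
ROUTES: List[Tuple[str, str, str]] = [
    ("example.com", "/api/", "10.0.0.10:8080"),
    ("example.com", "/", "10.0.0.20:8080"),
]


def pick_upstream(host: str, path: str) -> str:
    """Return the first upstream whose host and path prefix match."""
    for route_host, prefix, upstream in ROUTES:
        if host.lower() == route_host and path.startswith(prefix):
            return upstream
    raise LookupError(f"no route for {host}{path}")


def upstream_headers(headers: Dict[str, str], peer_ip: str, scheme: str) -> Dict[str, str]:
    """Copy the request headers and record what this proxy saw on the client side."""
    out = dict(headers)
    prior = out.get("X-Forwarded-For")
    # Append the directly connected peer to any chain left by earlier proxies.
    out["X-Forwarded-For"] = f"{prior}, {peer_ip}" if prior else peer_ip
    # The scheme the client used, which the origin can no longer observe itself.
    out["X-Forwarded-Proto"] = scheme
    return out
```

With this table, a request for /api/devices on example.com would go to the first upstream, and the origin would learn the client address only through the header the proxy wrote.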

Why Proxy Is Not the Same as NAT or Routing

These concepts often appear together, but their boundaries are different.

A router mainly decides the next hop from network-layer information and does not normally establish application sessions on behalf of either side. NAT rewrites addresses and ports at the boundary, and its main lever is the mapping table. A tunnel wraps one kind of traffic inside another protocol's payload to carry it across a network, with the emphasis on encapsulation and traversal.

The difference with a proxy is that it usually has to “speak for one side”:

An HTTP proxy receives and parses HTTP requests itself
A SOCKS proxy negotiates with the client first and then connects to the target on its behalf
A reverse proxy listens on the public address and then decides which backend should receive the request

So a proxy is usually not asking “which next hop does this packet go to?” It is asking “is this session allowed to happen, under what identity, to which upstream, and should I do work in the middle?”

That is why, when you troubleshoot a path with a proxy in it, you cannot keep thinking as if the client is talking directly to the server.

The main difference between a forward proxy and a reverse proxy is simply which side treats the proxy as its representative.

A forward proxy sits on the client side. The client knows it is using a proxy. Common cases include:

Corporate egress control
Cross-network access
Scraper pools and public-IP management
Local packet capture, debugging, and request replay

In this model, the target server often sees the proxy rather than the original client, unless the proxy passes additional information.

A reverse proxy sits on the server side. The client usually does not know how many real services are behind it. It only sees one unified entry point. Common cases include:

Nginx or Envoy as a site entry
CDN as an edge cache and acceleration layer
API Gateway for unified authentication, rate limiting, and routing
Load balancers dispatching traffic across multiple instances

In this model, the client sees the identity exposed by the proxy, while the backend instances are hidden behind it.

They look opposite, but the core is the same:

One side no longer faces the real peer directly
The proxy creates the next connection on behalf of that side
Policy and visibility are pulled into the middle layer

Why Connection Termination Matters More Than “Forwarding”

Many proxy problems have to be understood by looking at where the connection is terminated, not just by reading configuration.

If the proxy only works at the TCP layer, it may not understand HTTP semantics, but it can still:

Accept the client TCP connection
Open its own TCP connection to the upstream
Forward bytes between the two connections
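
The three steps above can be sketched with Python's asyncio streams. This is a minimal relay under simplifying assumptions (no timeouts, no connection limits, no logging), and the function names are illustrative rather than taken from any real proxy.

```python
import asyncio


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes one way until EOF, then half-close the destination."""
    while data := await reader.read(65536):
        writer.write(data)
        await writer.drain()
    if writer.can_write_eof():
        writer.write_eof()  # signal EOF but keep the other direction open


async def handle_client(client_reader, client_writer, upstream_host, upstream_port):
    up_writer = None
    try:
        # The proxy opens its own TCP connection to the upstream.
        up_reader, up_writer = await asyncio.open_connection(upstream_host, upstream_port)
        # Bytes are forwarded in both directions without being interpreted.
        await asyncio.gather(
            pipe(client_reader, up_writer),
            pipe(up_reader, client_writer),
        )
    finally:
        client_writer.close()
        if up_writer is not None:
            up_writer.close()


async def serve(listen_port: int, upstream_host: str, upstream_port: int):
    """Accept client TCP connections and relay each one to the upstream."""
    async def on_connect(reader, writer):
        await handle_client(reader, writer, upstream_host, upstream_port)

    return await asyncio.start_server(on_connect, "127.0.0.1", listen_port)
```

Even in this small form there are two separate TCP connections, so the upstream sees the proxy's address as its peer, not the client's.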

If the proxy terminates TLS and parses the HTTP inside, it is no longer just a byte mover. It can see:

HTTP methods, paths, and headers
Cookies, authentication headers, and cache-control
Upper-layer protocol upgrades such as WebSocket

That directly determines what the proxy can do, and where the risk moves:

More visible semantics allow more precise routing, caching, and auditing
The stronger the capability, the more carefully certificates, privacy, header trust, and protocol compatibility must be handled

This is the real difference behind “layer 7 proxy” and “layer 4 proxy”.

Why `CONNECT` Matters

The trickiest part of an HTTP forward proxy is how HTTPS gets through it.

If the client is visiting a plain HTTP URL, it can send the full target address to the proxy and let the proxy fetch it. HTTPS is different. The client wants to establish a TLS security context directly with the target site and does not want its plaintext HTTP exposed to an ordinary forward proxy.

The common solution is to send CONNECT first:

Client -> Proxy: CONNECT example.com:443
Proxy -> Client: 200 Connection Established
Client <-> Proxy <-> Server: later TLS bytes are tunneled

The point of CONNECT is not “another kind of GET”. It tells the proxy: first establish a TCP path to this target address, and do not interpret the later bytes as ordinary HTTP requests.

That way:

The proxy still controls reachability and access policy
The client can still complete the TLS handshake directly with the target site
Whether the proxy can see plaintext later depends on whether it also performs TLS interception

Many enterprise proxies, packet-capture tools, and local debugging proxies differ on exactly this point: whether they only tunnel the bytes or also intercept TLS.
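
The proxy's side of the CONNECT exchange can be sketched as a parse step plus a policy check. This is a simplified illustration: it assumes the full HTTP/1.1 request line (method, target, version), the port allowlist is a hypothetical policy, and actually opening the tunnel is left out.

```python
from typing import Tuple

# Hypothetical access policy: only allow tunnels to the HTTPS port.
ALLOWED_PORTS = {443}


def parse_connect(request_line: str) -> Tuple[str, int]:
    """Parse 'CONNECT host:port HTTP/1.1' into (host, port)."""
    parts = request_line.strip().split()
    if len(parts) != 3 or parts[0] != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("CONNECT target must be host:port")
    return host, int(port)


def connect_reply(request_line: str) -> bytes:
    """Decide how the proxy answers, before any TLS byte is exchanged."""
    try:
        host, port = parse_connect(request_line)
    except ValueError:
        return b"HTTP/1.1 400 Bad Request\r\n\r\n"
    if port not in ALLOWED_PORTS:
        return b"HTTP/1.1 403 Forbidden\r\n\r\n"
    # A real proxy would open a TCP connection to (host, port) here, reply 200
    # only if that succeeds, and then relay the following bytes untouched.
    return b"HTTP/1.1 200 Connection Established\r\n\r\n"
```

The policy decision uses only the host and port from the request line. Everything the client sends after the 200 reply is opaque to this code.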

Why Proxies Naturally Rewrite Identity and Trust Boundaries

Once a proxy exists, “who is who” is no longer as direct as in a point-to-point connection.

The source IP the server sees is often the proxy’s address, not the original client’s. That is why engineering systems use headers such as:

X-Forwarded-For
Forwarded
X-Real-IP
X-Forwarded-Proto
X-Forwarded-Host

These headers help the application behind the proxy recover the original context. But they also create a common trap: they are not inherently trustworthy.

Only within a clearly defined trust boundary should an application believe a forwarding header inserted by a specific proxy layer. Otherwise the client can forge headers with the same names and mislead logs, authentication, risk control, or rate limiting.

So identity decisions around proxies should always separate three layers:

Who is visible in the protocol as the original peer
Which proxy layer is trusted to pass through the original information
Which fields the application ultimately uses for security decisions
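
One common way to apply that separation to X-Forwarded-For is to trust the header only when the directly connected peer is a known proxy, and then walk the chain from right to left. The sketch below assumes a hypothetical trusted range of 10.0.0.0/8. The function names are illustrative, and real deployments differ in how they define the trusted set.

```python
import ipaddress
from typing import Optional

# Hypothetical trust boundary: addresses of proxies we operate ourselves.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(peer_ip: str, x_forwarded_for: Optional[str]) -> str:
    """Pick the address to use for logging or rate limiting."""
    # If the connection did not come from a trusted proxy, the header may be
    # forged by the client, so only the protocol-level peer is believed.
    if not x_forwarded_for or not is_trusted(peer_ip):
        return peer_ip
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Entries on the right were appended by the proxies closest to us.
    # The first untrusted entry from the right is the best available answer.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            return peer_ip  # malformed entry: fall back to the visible peer
    # Every hop was one of our own proxies: take the earliest one.
    return hops[0] if hops else peer_ip
```

A client that sends its own X-Forwarded-For value can only influence entries to the left of what the trusted proxy appended, which is why the walk starts from the right.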

Why Caching, Compression, and Load Balancing Often Live on Proxies

A proxy sits on the request path and can represent one side to the other. That makes it a natural home for cross-cutting capabilities.

Caching tends to live on proxies because they can reuse responses without changing business code. CDNs, reverse-proxy caches, and enterprise gateway caches all use that fact.
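
As a rough sketch of that idea, a proxy-side cache can be as small as a dictionary keyed by URL with an expiry time. The class name, the fixed TTL, and the injected clock are assumptions made for illustration. Real HTTP caches also honor Cache-Control, Vary, and revalidation, which this sketch ignores.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """Cache origin responses by URL for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store: Dict[str, Tuple[float, bytes]] = {}  # url -> (expires_at, body)

    def fetch(self, url: str, origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
        now = self.clock()
        entry = self.store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"  # served without contacting the origin
        body = origin(url)  # the proxy fetches on the client's behalf
        self.store[url] = (now + self.ttl, body)
        return body, "MISS"
```

The business service behind `origin` does not change at all. It simply stops seeing the requests that the proxy answers from its own store.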

Compression tends to live on proxies because the proxy can see request and response headers and know what encoding the client accepts.

Load balancing tends to live on proxies because the proxy controls the next connection itself and can send traffic to different upstreams according to:

Round robin
Least connections
Consistent hashing
Health checks
Session stickiness

These features may look like separate product functions, but they all rely on the same premise: the proxy is not a bystander. It is a decision point in the traffic path.
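
A few of the selection strategies listed above can be sketched as small functions over a list of upstream names. The names and the shape of the health set are hypothetical, and the sticky selector uses plain modulo hashing as a stand-in, which is simpler than true consistent hashing and remaps most keys when the upstream list changes.

```python
import hashlib
import itertools
from typing import Dict, List, Set


class RoundRobin:
    """Cycle through upstreams, skipping any that failed health checks."""

    def __init__(self, upstreams: List[str]):
        self._cycle = itertools.cycle(upstreams)
        self._count = len(upstreams)

    def pick(self, healthy: Set[str]) -> str:
        # Try each upstream at most once per call.
        for _ in range(self._count):
            candidate = next(self._cycle)
            if candidate in healthy:
                return candidate
        raise RuntimeError("no healthy upstream")


def least_connections(active: Dict[str, int], healthy: Set[str]) -> str:
    """Pick the healthy upstream with the fewest in-flight connections."""
    candidates = [u for u in active if u in healthy]
    if not candidates:
        raise RuntimeError("no healthy upstream")
    return min(candidates, key=lambda u: active[u])


def sticky(session_key: str, upstreams: List[str]) -> str:
    """Map a session key to a stable upstream (modulo hashing, not consistent hashing)."""
    digest = hashlib.sha256(session_key.encode()).digest()
    return upstreams[int.from_bytes(digest[:8], "big") % len(upstreams)]
```

All three functions answer the same question, namely which upstream the proxy should open the next connection to, and that decision is only possible because the proxy creates that connection itself.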

Why Proxies Also Create New Failure Paths

Proxies solve many problems, but they also create new boundaries and new failure modes.

The most common costs include:

Another network hop and another connection termination increase latency and resource usage
If the proxy and the upstream or downstream use mismatched timeout policies, you get early closes, half-dead flows, or retry storms
Once headers, paths, or protocol versions are rewritten, what the application sees is no longer the raw client request
Changing the TLS termination point changes certificate deployment, plaintext exposure, and compliance boundaries
Caching improves performance but also introduces stale content and consistency problems

Many incidents where “the service is fine but users still cannot access it” are not in the business code at all. They are in the proxy layer:

A health check removed the instance
The proxy connection pool is exhausted
A reverse proxy timeout is shorter than application processing time
A WebSocket or gRPC upgrade header was not forwarded correctly
Some cache returned stale or wrong content

The stronger the proxy, the less useful it is to look only at application logs.
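
One of these failure modes, a proxy timeout shorter than the layer behind it, is easy to check mechanically once the path is written down. The sketch below assumes a rule of thumb that each outer layer should wait at least as long as the layer it calls. The layer names and values are made up for illustration.

```python
from typing import List, Tuple


def timeout_inversions(layers: List[Tuple[str, float]]) -> List[str]:
    """Report adjacent layers where the outer one times out first.

    `layers` is ordered from the outermost hop to the innermost service,
    each entry being (name, timeout in seconds).
    """
    problems = []
    for (outer, outer_t), (inner, inner_t) in zip(layers, layers[1:]):
        if outer_t < inner_t:
            problems.append(f"{outer} ({outer_t}s) gives up before {inner} ({inner_t}s)")
    return problems


# Hypothetical path: the reverse proxy waits 30s, but the application may take 45s.
path = [("cdn", 60.0), ("reverse-proxy", 30.0), ("application", 45.0)]
for problem in timeout_inversions(path):
    print(problem)
```

In a case like this the application logs may show a request that completed normally, while the user already received an error from the proxy.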

What Usually Shows Up in Real Engineering Is Proxy-on-Proxy

On paper, you can name the types separately: forward, reverse, transparent, explicit, caching proxy, application gateway. In practice, several layers are often stacked:

Client
  -> Local debugging proxy
  -> Corporate egress proxy
  -> CDN / WAF
  -> Site reverse proxy
  -> Service mesh sidecar
  -> Application service

Once a request passes through that many layers, a lot of properties become “facts about one segment of the path” rather than globally true end-to-end facts:

Which certificate the client sees
Which layer owns the source IP you log
Where connection reuse happens
Who triggered the timeout and retry
Whether a response came from origin or from cache

That is why modern troubleshooting increasingly depends on a path view. Without the layered map, many symptoms just look random.

What Engineering Should Actually Use This Understanding For

When you see a proxy, do not start by asking “which config item is wrong?”. First draw the path.

First, confirm who is facing whom: who the client connects to, who connects upstream, where TLS terminates, and whether the application sees the original source and protocol.

Second, separate the proxy’s responsibilities. Is it only forwarding bytes, or has it already taken on TLS termination, HTTP routing, caching, authentication, rate limiting, and protocol-upgrade support? That determines which logs and metrics matter.

Third, any forwarding header or original-client information must be viewed inside a trust boundary. Only information written by a trusted proxy should participate in security decisions.

Fourth, when troubleshooting, do not stop at the final service. Proxy connection pools, retries, caches, health checks, certificates, and timeout policies often decide the user experience before the application code does.

In one sentence, a proxy should be understood like this:

It does not just move traffic onward. It inserts a middle participant that can terminate connections, speak for one side, and apply policy between the two ends.