Firewall

Many network failures look like “the service never started”, “the port is not listening”, or “the peer did not reply”. When you keep tracing them, you often discover a firewall policy boundary in the middle. The application thinks it is only connecting to 443. Operations thinks only one security-group rule was opened. The user thinks “if ping works, access should work too”. In reality, the traffic is often being judged by a device or host in the middle that decides whether this flow is allowed to pass.

The easiest way to underestimate a firewall is to think of it as “a blacklist box”. What a firewall really changes is not that packets can be dropped, but that network reachability is no longer determined by addresses and routing alone. Reachability now passes through an explicit policy and state decision. Once you understand that, cases like “it works inside the LAN but not across subnets”, “the port is open but access still fails”, and “outbound works but inbound does not” all start to fit one model.

A firewall does not merely “block bad traffic”. It pulls the decision about network reachability into an explicit, state-aware policy boundary that can default to deny.

Why It Exists

If a network only had addresses, routing, and application listeners, the default model would be very direct:

If you know the destination address and port, you try to connect
If routing can reach it and the service is listening, traffic should pass
Who can talk to whom is mostly determined by topology, not by explicit policy

That can work after a fashion in a small network, but it quickly becomes unmanageable as the network grows:

Some services on the same host should be publicly reachable, while others should only be reachable from the internal network
Not every host in the same subnet should be able to talk to every other host by default
Outbound access is often more acceptable than inbound exposure
Security, audit, and compliance teams want “who can access whom” written as explicit rules, not left to topology luck

That is where the firewall appears. It does not just ask whether a packet can be forwarded. It tries to turn reachability from a default network result into a policy result the organization can explicitly declare.

Grasp the Main Model First

To understand a firewall, separate these three things first:

Topology reachability: whether routing exists and the next hop can be reached
Service availability: whether the target host is listening and the process is healthy
Policy allowance: whether the middle device or host allows the flow to pass

Many troubleshooting sessions go in circles because these three layers are collapsed into one sentence like “the network is down”.

What a firewall actually does can be compressed like this:

Traffic reaches a policy boundary
  -> rules match by address, port, protocol, direction, and interface
  -> if needed, existing connection state is consulted
  -> decide allow, reject, drop, log, or forward further

The important thing to remember is not a specific command. It is what this boundary means: from this hop onward, reachability is no longer a natural property. It is the result of explicit policy authorization.
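
The matching flow above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular firewall product works: the `Rule` class, the `evaluate` function, and the office range 10.1.0.0/16 are all hypothetical names and values chosen for the example, and first-match-wins ordering is an assumption.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                      # "allow", "reject", or "drop"
    src: str                         # source network in CIDR form
    dst: str                         # destination network in CIDR form
    protocol: str                    # "tcp", "udp", or "any"
    dst_port: Optional[int] = None   # None means any destination port

    def matches(self, src, dst, protocol, dst_port):
        return (
            ip_address(src) in ip_network(self.src)
            and ip_address(dst) in ip_network(self.dst)
            and self.protocol in ("any", protocol)
            and self.dst_port in (None, dst_port)
        )


def evaluate(rules, src, dst, protocol, dst_port, default="drop"):
    """Return the action of the first matching rule, else the default (deny)."""
    for rule in rules:
        if rule.matches(src, dst, protocol, dst_port):
            return rule.action
    return default


# Hypothetical policy: the office network may reach one HTTPS service,
# and everything else it sends to the server zone is rejected.
RULES = [
    Rule("allow", "10.1.0.0/16", "203.0.113.10/32", "tcp", 443),
    Rule("reject", "10.1.0.0/16", "203.0.113.0/24", "any"),
]

print(evaluate(RULES, "10.1.2.3", "203.0.113.10", "tcp", 443))      # allow
print(evaluate(RULES, "10.1.2.3", "203.0.113.10", "tcp", 22))       # reject
print(evaluate(RULES, "198.51.100.7", "203.0.113.10", "tcp", 443))  # drop
```

The third call matches no rule at all, so the outcome comes from the default rather than from any explicit entry. That is the sense in which reachability becomes a policy result.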

The Most Common Main Path

Take a common access scenario: a client reaches an HTTPS service in the server zone through a boundary firewall.

Client
  -> Firewall: TCP SYN dst=203.0.113.10:443

Firewall
  -> match: from office network, to server zone, protocol TCP, destination port 443
  -> action: allow, and create state for this connection
  -> forward to Server

Server
  -> Firewall: TCP SYN-ACK src=203.0.113.10:443

Firewall
  -> match: part of an established connection
  -> action: allow
  -> forward to Client

In this path, the key is not “the firewall allowed 443”. The key is:

Whether the first packet matched an explicit rule
Whether return traffic was recognized as part of an established connection
Which direction, zone, and protocol were actually allowed

If this model is not firmly in place, then state tracking, default deny, east-west isolation, and cloud security groups will all stay fuzzy.
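
The SYN / SYN-ACK path above can be modeled as a toy state table. The sketch below is a simplification under stated assumptions: the `StatefulFirewall` class, the `office_to_https` rule, and the 10.1.x.x office prefix are hypothetical, and a real connection tracker also follows TCP flags, sequence numbers, and timeouts, which are omitted here.

```python
class StatefulFirewall:
    def __init__(self, allow_new):
        # allow_new: callable that decides whether a NEW flow is permitted
        self.allow_new = allow_new
        self.states = set()

    def handle(self, src, sport, dst, dport, proto="tcp"):
        flow = (proto, src, sport, dst, dport)
        reverse = (proto, dst, dport, src, sport)
        # Packets in either direction of a tracked flow skip the rule check.
        if flow in self.states or reverse in self.states:
            return "allow (established)"
        # Otherwise the first packet must match an explicit rule.
        if self.allow_new(src, sport, dst, dport, proto):
            self.states.add(flow)
            return "allow (new)"
        return "drop"


def office_to_https(src, sport, dst, dport, proto):
    # Hypothetical rule: office network -> 203.0.113.10, TCP 443 only.
    return (
        proto == "tcp"
        and src.startswith("10.1.")
        and dst == "203.0.113.10"
        and dport == 443
    )


fw = StatefulFirewall(office_to_https)
syn = fw.handle("10.1.2.3", 51000, "203.0.113.10", 443)
syn_ack = fw.handle("203.0.113.10", 443, "10.1.2.3", 51000)
unsolicited = fw.handle("203.0.113.10", 443, "10.1.2.3", 51001)

print(syn)          # allow (new)
print(syn_ack)      # allow (established)
print(unsolicited)  # drop
```

No rule allows traffic from the server toward the client. The SYN-ACK passes only because it is the reverse of a flow that was already approved, while a packet from the same server to a different client port matches no state and is dropped.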

Why “Default Deny” Is Closer to Reality Than Blacklisting

Many people picture a firewall as a bad-traffic filter: allow everything first, then block a small set of dangerous things. That idea applies somewhat to content security devices, intrusion detection, or malware signature filtering, but it is not the usual model for access control in real engineering environments.

The more common and more stable firewall model is:

First define trust boundaries and network zones
Do not assume all zones should talk to each other by default
Explicitly allow only communication that is actually needed

That is why the real design question is often not “which bad IPs should be blocked?” but:

Which source ranges may access this service
Is the rule inbound, outbound, or both
Is the rule for one port or an entire protocol family
Does this rule apply to the public network, office network, container network, or service mesh

So a firewall is more like a whitelist-style authorization boundary. Blacklists exist, but what usually determines the exposure surface is the minimal path left open after default deny.

Why State Tracking Became the Core of Modern Firewalls

If a firewall only judged packets statically one by one, it could still work, but two problems would show up quickly in practice.

First, many protocols are naturally bidirectional. A client sends TCP SYN first, then the server replies with SYN-ACK. If the firewall only looked at “packets coming from the server toward the client”, that reply would look like an unauthorized new connection.

Second, what many access policies actually want to express is not “allow every packet from port 443”, but “allow the return traffic for this approved connection”.

That is why modern firewalls usually add state tracking, often called stateful inspection or connection tracking:

Once the first packet matches an allow rule, state is created
Later packets in the same session can pass quickly
When the connection ends or times out, the state is removed

This is how the firewall can default to being conservative while still letting legitimate bidirectional communication complete naturally.

The tradeoff is clear too:

The firewall no longer just checks rules. It must maintain a state table
Long-lived connections, idle connections, and high-concurrency connections all consume state resources
If the timeout policy is wrong, you get “the application is still alive, but the firewall already forgot the state”

Many incidents like “it connects once, then fails after a while” or “UDP occasionally stops working” are rooted not in the application, but in state lifetime.
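
The “firewall already forgot the state” failure can be reproduced with a small idle-timeout model. This is a sketch under assumptions: the `ConnTable` class, the 300-second timeout, and the explicit `now` timestamps are invented for illustration, and real firewalls use different timeouts per protocol and per connection phase.

```python
class ConnTable:
    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # flow -> time of the last packet that passed

    def packet(self, flow, now, is_new=False):
        """Return True if the packet passes.

        is_new=True stands for a first packet already approved by a rule.
        """
        seen = self.last_seen.get(flow)
        if seen is not None and now - seen > self.idle_timeout:
            del self.last_seen[flow]  # state expired while the flow was idle
            seen = None
        if seen is None and not is_new:
            return False              # mid-stream packet with no state
        self.last_seen[flow] = now    # every passing packet refreshes the timer
        return True


FLOW = ("tcp", "10.1.2.3", 51000, "203.0.113.10", 443)

# Case 1: the application goes quiet for longer than the timeout.
idle = ConnTable(idle_timeout=300)
idle_results = [
    idle.packet(FLOW, now=0, is_new=True),
    idle.packet(FLOW, now=120),
    idle.packet(FLOW, now=600),   # 480 s of silence, state is gone
]

# Case 2: same duration, but something is sent every 200 s.
keepalive = ConnTable(idle_timeout=300)
keepalive_results = [
    keepalive.packet(FLOW, now=0, is_new=True),
    keepalive.packet(FLOW, now=200),
    keepalive.packet(FLOW, now=400),
    keepalive.packet(FLOW, now=600),
]

print(idle_results)       # [True, True, False]
print(keepalive_results)  # [True, True, True, True]
```

Both endpoints in the first case still believe the connection is alive. Only the device in the middle has dropped its state, which is why the symptom looks intermittent from the application's side.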

Why Packet Filtering, Stateful Firewalls, and Application Gateways Are Not the Same Thing

The term “firewall” often mixes different layers and functions together. That causes bad judgment immediately.

The most basic one is packet filtering. It mainly looks at:

Source address, destination address
Protocol number
Source port, destination port
Ingress interface, egress interface

That solves the most basic access-control problem. It is simple and fast, and its decision boundary is clear. The downside is that it is almost blind to connection semantics and application content.

Above that is the stateful firewall. It still makes decisions mainly from the network and transport layers, but it also uses connection state. Most real-world “open a port”, “allow return traffic”, and “deny new inbound connections” rules live here.

Above that are application gateways or more complex security devices. They may understand HTTP, TLS, DNS, or other application protocols and make finer-grained decisions, such as:

Whether a particular URL path may be accessed
Whether a DNS query class should be blocked
Whether a TLS handshake parameter is suspicious

Those capabilities are stronger, but also more expensive and more complex, and they often overlap with proxies, WAFs, IDS, and IPS. If you do not separate the layers when writing about firewalls, readers will later mix “the L4 rule was not allowed” with “an L7 policy blocked it”.

Why Inbound and Outbound Defaults Often Differ Completely

Many enterprise networks, cloud networks, and home networks treat inbound and outbound traffic asymmetrically. That is not an accident. It is an engineering tradeoff.

The most common assumption is:

Outbound access is usually easier to allow
Inbound exposure usually needs explicit approval and narrower rules

The reason is not mysterious:

Most internal hosts need to access external dependencies
Most internal hosts do not need to be actively reached from outside
Inbound exposure directly expands the attack surface

That is why “the server can dial out, but the outside cannot dial in” is such a common situation. It is not always NAT. It is not always routing. Very often it is simply the firewall policy making different default assumptions by direction.

Once that model is firmly in place, security groups, host firewalls, egress ACLs, and zero-trust access boundaries become much less confusing.

Why “Port 443 Is Open” May Still Fail

This is one of the most common false assumptions in firewall troubleshooting. Saying “443 is open” sounds specific, but it is still nowhere near enough.

You still need to ask:

Which side’s 443 is open, source port or destination port
Which source addresses are allowed
Is only TCP 443 allowed, or do you also need ICMP, UDP, or health-check ports
Did the public entry point allow it, or is the host firewall still denying it
Was the first packet allowed, or was the return path dropped because of state, asymmetric routing, or timeout

The same “443 does not work” symptom can have very different root causes:

The boundary firewall did not allow the first packet
The host’s local iptables / nftables / Windows Firewall did not allow it
The cloud security group allowed it, but the subnet ACL rejected it
The load balancer health-check port was closed, so the backend was never attached
The return traffic took a different path, and the stateful firewall no longer recognized the reply

So “opening a port” is never a complete firewall conclusion. It is only an unfinished question.
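
A quick client-side probe can at least show what kind of failure the client observes before you start reading rules. The sketch below uses only Python's standard `socket` module. The `probe` function name and its result labels are assumptions made for this example, and the interpretation in the comments is a heuristic, not proof of where the decision was made.

```python
import socket


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and classify what the client observes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # Handshake completed: first packet and return path both worked.
            return "open"
    except ConnectionRefusedError:
        # A reset came back: nothing is listening, or some device on the
        # path actively rejected the connection.
        return "refused"
    except socket.timeout:
        # Silence: consistent with a silent drop, a lost return path,
        # or a host that is simply unreachable.
        return "timeout"
    except OSError as exc:
        # For example "network unreachable" or "no route to host".
        return f"error: {exc}"
```

A call such as `probe("203.0.113.10", 443)` tells you only how the attempt looked from this one source address. A “timeout” result narrows the question to the first packet or its reply, but it does not say which layer made the decision, and the same probe from a different source network may give a different answer.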

Why Host Firewalls and Boundary Firewalls Have Different Responsibilities

These two things often appear together, but they should not be written as if they were the same.

The boundary firewall’s focus is traffic policy between network zones. It usually answers:

Can the office network access the server zone
Can the public network access a certain service set
Which subnets are allowed to interconnect

The host firewall is closer to the machine’s own exposure surface. It answers:

Which ports on this machine are really open to the outside
Which interface and source can access this local service
Whether the machine should keep one last restriction even inside the same security domain

They do not replace each other. They are boundaries at different layers. A boundary firewall manages large-scale traffic rules, while a host firewall keeps the single machine’s exposure surface tighter. Many truly robust systems rely on both layers together rather than trusting only one of them never to leak.

What to Check First in Packet Captures and Troubleshooting

Firewall troubleshooting wastes the most time when you start by asking “is it being blocked?” without first separating the path and the direction. A better order is usually this.

First determine whether the failure happens on the first packet or the return packet

If the client’s first packet never reaches the server, focus on:

Where the packet first disappears
Whether the boundary rule allows this source, destination, and direction
Whether multiple firewall layers are active on the path

If the first packet reaches the server but the reply never comes back, then focus on:

Whether the reply matches existing state
Whether the return path is symmetric
Whether the state has already expired

Then determine which boundary layer is making the decision

Many environments have multiple policy layers at once:

Cloud security groups
Subnet ACLs
Boundary firewalls
Host local firewalls
Container network policies

Just saying “the firewall blocked it” is not useful. What matters is: which layer, which rule, and which direction made the deny decision.
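
The idea that a flow must pass every layer, and that the useful answer is the name of the layer that said no, can be sketched as follows. The layer names come from the list above, but the `first_denying_layer` function and the individual checks are hypothetical stand-ins for real policy lookups.

```python
def first_denying_layer(layers, flow):
    """layers: ordered (name, check) pairs; check(flow) returns True to allow.

    Returns the name of the first layer that denies the flow, or None if
    every layer allows it.
    """
    for name, check in layers:
        if not check(flow):
            return name
    return None


# Hypothetical policies, in the order the traffic would meet them.
layers = [
    ("cloud security group", lambda f: f["proto"] == "tcp" and f["dst_port"] == 443),
    ("subnet ACL", lambda f: f["src"].startswith("10.")),
    ("host firewall", lambda f: f["dst_port"] in (22, 443)),
]

external = {"src": "198.51.100.7", "proto": "tcp", "dst_port": 443}
internal = {"src": "10.1.2.3", "proto": "tcp", "dst_port": 443}

print(first_denying_layer(layers, external))  # subnet ACL
print(first_denying_layer(layers, internal))  # None
```

The external flow is allowed by the security group and would be allowed by the host firewall, yet it still fails. Checking only the layer you configured most recently would miss the one that actually denied it.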

Only then look for content-level policy

If transport-layer connectivity is already working but the application still fails, then look at:

Whether TLS handshakes were blocked by an intermediate security device
Whether HTTP requests were denied by WAF or proxy policy
Whether DNS queries were blocked by content policy

Keeping this step last helps avoid mixing basic connectivity issues with higher-layer policy issues.

What Engineers Should Actually Take from This Understanding

Do not think of a firewall as an “extra security plugin”. It directly defines the default boundary of network reachability
Do not mix routing reachability, service listening, and policy allowance into one problem. Any one of them missing can cause “not reachable”
Do not ask only “is the port open”. Ask which source, destination, direction, interface, state, and enforcement layer are involved
Do not forget that modern firewalls rely heavily on state tracking. Many disconnects and intermittent failures are about the state table, timeout, or return path
Do not write host firewalls, boundary firewalls, and application-layer security devices as if they were the same thing. They solve different problems at different layers