Azure Network Design

I've seen too many Azure deployments where the network design was an afterthought, leading to expensive retrofits down the line. Azure Virtual Networks are the foundational networking layer for Azure workloads, so it's crucial to get the design right from the start.

Start with VNet and address space planning. Address ranges are hard to change once resources are deployed, so plan for growth. A /16 VNet provides 65,536 addresses, which is enough room for multiple subnets across all environments plus future expansion. Subnet sizing matters just as much: allocate subnets by role (AKS node pool subnet, AKS pod subnet for Azure CNI, App Services subnet, private endpoint subnet) and size each one for its workload. Azure reserves 5 IPs in every subnet (the first four addresses and the last one), so a /24 gives you 251 usable addresses, not 256.
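To make the sizing arithmetic concrete, here is a minimal sketch using only Python's standard ipaddress module. The subnet names and prefix lengths in the example plan are placeholders I made up for illustration, not a recommendation; the only Azure-specific fact baked in is the 5 reserved IPs per subnet.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves 5 IPs in every subnet


def usable_ips(cidr: str) -> int:
    """Addresses left for workloads after Azure's reserved IPs."""
    return ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET


def carve(vnet_cidr: str, plan: dict[str, int]) -> dict[str, ipaddress.IPv4Network]:
    """Allocate one subnet per role, largest first, so each block stays aligned.

    plan maps a (hypothetical) role name to the prefix length it should get.
    """
    vnet = ipaddress.ip_network(vnet_cidr)
    cursor = int(vnet.network_address)
    allocated = {}
    # Smaller prefix number means a bigger subnet, so sort ascending by prefix.
    for name, prefix in sorted(plan.items(), key=lambda item: item[1]):
        subnet = ipaddress.ip_network((cursor, prefix))
        if not subnet.subnet_of(vnet):
            raise ValueError(f"{name} (/{prefix}) does not fit in {vnet_cidr}")
        allocated[name] = subnet
        cursor += subnet.num_addresses
    return allocated


if __name__ == "__main__":
    example_plan = {  # placeholder roles and sizes
        "aks-nodes": 22,
        "aks-pods": 20,
        "app-services": 24,
        "private-endpoints": 24,
    }
    for role, subnet in carve("10.0.0.0/16", example_plan).items():
        print(f"{role:18} {str(subnet):16} usable={usable_ips(str(subnet))}")
```

Allocating the largest blocks first is a design choice that keeps every subnet on a valid boundary without leaving gaps; a real plan would also leave unallocated space between environments for growth.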

In a production tenant I ran a single /16 VNet for three environments (dev, test, prod) and allocated a /24 subnet to each service tier. After a few months the dev team spun up an AKS cluster with a 10-node pool using Azure CNI, which consumes one VNet IP per pod. The pods quickly exhausted the /24, and we had to carve out a new /20 subnet and re-wire the cluster. The whole exercise took a weekend window, and the root cause was the initial assumption that a /24 would be enough for a dynamic workload. I now start with a /20 per environment and reserve a /22 for any container or VM scale set that uses Azure CNI. Before any new deployment I also run a scripted headroom check built on az network vnet check-ip-address, which reports whether a given private IP in the VNet is still free.
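A rough version of the arithmetic that would have caught this up front is sketched below. It assumes the traditional Azure CNI model, where each node reserves one IP for itself plus one per pod slot, a max-pods setting of 30 per node, and one extra node during upgrades. Those three numbers are my assumptions for illustration; check them against your own cluster configuration.

```python
import ipaddress

AZURE_RESERVED_PER_SUBNET = 5  # Azure reserves 5 IPs in every subnet


def cni_ips_needed(nodes: int, max_pods_per_node: int = 30, surge_nodes: int = 1) -> int:
    """IPs required if every node holds one address for itself and one per pod slot.

    max_pods_per_node and surge_nodes are assumed defaults, not values from the cluster above.
    """
    return (nodes + surge_nodes) * (max_pods_per_node + 1)


def fits(cidr: str, nodes: int, **kwargs) -> bool:
    """True if the subnet has enough usable addresses for the node pool."""
    usable = ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_PER_SUBNET
    return cni_ips_needed(nodes, **kwargs) <= usable


if __name__ == "__main__":
    for cidr in ("10.0.1.0/24", "10.0.4.0/22", "10.0.16.0/20"):
        print(cidr, "fits a 10-node pool:", fits(cidr, 10))
```

Under these assumptions a 10-node pool wants 341 addresses, which is well past the 251 a /24 can offer.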

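Network Security Groups are the primary microsegmentation tool in Azure, filtering traffic by source and destination IP, port, and protocol. Every NSG ships with default rules that deny inbound traffic from the internet but still allow traffic within the VNet and from the Azure load balancer. The security standard is therefore to attach an NSG to every subnet, add an explicit deny-all inbound rule at a high priority number such as 4096 so that intra-VNet traffic is denied too, and open only the required traffic with explicit allow rules. Rule ordering needs care: priorities run from 100 to 4096, lower numbers are evaluated first, and the first matching rule ends evaluation. Application Security Groups let rules reference groups of VMs by logical name rather than by IP address, which makes the rule set easier to manage.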
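The priority model is easy to misread, so here is a toy evaluator that shows first-match-wins behavior. It is a deliberately simplified model, not Azure's implementation: the Rule type, its fields, and the exact-match source comparison are all my own simplifications (real rules also match on protocol, destination, CIDR ranges, and service tags).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int        # lower number is evaluated first
    name: str
    action: str          # "Allow" or "Deny"
    port: Optional[int]  # None matches any destination port
    source: str          # "*" matches any source; otherwise exact match only


def evaluate(rules, source: str, port: int) -> Rule:
    """Return the first rule, in priority order, that matches the packet."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.source in ("*", source) and rule.port in (None, port):
            return rule  # the matched rule terminates evaluation
    raise LookupError("no rule matched")


if __name__ == "__main__":
    rules = [
        Rule(4096, "deny-all-inbound", "Deny", None, "*"),
        Rule(200, "allow-https", "Allow", 443, "*"),
    ]
    for port in (443, 3389):
        hit = evaluate(rules, "192.0.2.10", port)
        print(f"port {port}: {hit.action} via {hit.name}")
```

Note that the order of the list does not matter, only the priority numbers do, which is exactly why a single low-numbered allow rule can silently override a deny further down.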

The other thing that bites you is rule churn. In a multi-team project we ended up with more than 150 NSG rules across ten subnets, and a junior admin accidentally added a permissive rule at priority 100 that opened inbound RDP to the whole internet. Because 100 is the lowest priority number, the rule was evaluated before the intended deny, and we saw a spike in connection attempts in Azure Monitor. To avoid a repeat, we locked down NSG changes behind an Azure Policy that caps the number of rules per NSG at 50 and requires every rule description to carry a tag. We also turned on NSG flow logs and shipped them to Log Analytics; the extra storage cost was about $30 per month, but it gave us the forensic detail to catch the misconfiguration within minutes.
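A pre-merge check can catch this class of mistake before it reaches the NSG. The sketch below audits a list of rule dictionaries; the key names mirror the fields I would expect in an exported NSG rule (direction, access, sourceAddressPrefix, destinationPortRange), but treat that shape, the set of management ports, and the set of "open" sources as assumptions to adapt to your own export format.

```python
MANAGEMENT_PORTS = {22, 3389}                    # SSH and RDP; extend as needed
OPEN_SOURCES = {"*", "0.0.0.0/0", "Internet"}    # sources treated as "anyone"


def port_in_range(port_range: str, port: int) -> bool:
    """Match a single port against '*', 'N', or 'N-M'."""
    if port_range == "*":
        return True
    low, _, high = port_range.partition("-")
    return int(low) <= port <= int(high or low)


def risky_rules(rules: list[dict]) -> list[str]:
    """Names of inbound allow rules that expose a management port to any source."""
    flagged = []
    for rule in rules:
        if rule["direction"] != "Inbound" or rule["access"] != "Allow":
            continue
        if rule["sourceAddressPrefix"] not in OPEN_SOURCES:
            continue
        if any(port_in_range(rule["destinationPortRange"], p) for p in MANAGEMENT_PORTS):
            flagged.append(rule["name"])
    return flagged


if __name__ == "__main__":
    sample = [  # made-up rules for illustration
        {"name": "allow-https", "direction": "Inbound", "access": "Allow",
         "sourceAddressPrefix": "*", "destinationPortRange": "443"},
        {"name": "temp-rdp", "direction": "Inbound", "access": "Allow",
         "sourceAddressPrefix": "*", "destinationPortRange": "3389"},
    ]
    print(risky_rules(sample))
```

Failing a pipeline on a non-empty result is cheaper than finding the same rule in flow logs after the fact.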

Private endpoints for PaaS services are a game-changer for security. Azure Private Endpoint places a network interface with a private IP from your subnet in front of services like Storage, SQL, Key Vault, Service Bus, and ACR. Traffic to the service then stays on the Azure backbone rather than crossing the public internet, which reduces the attack surface. A private endpoint does not switch off the service's public endpoint by itself; you still need to disable public network access on the resource, or restrict it with the service firewall. DNS does the rest: with a Private DNS Zone linked to the VNet, the service FQDN resolves to the private IP.
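DNS is where private endpoint setups most often go wrong, so a quick resolution check from a machine inside the VNet is worth scripting. This is a minimal sketch with the standard library only; the function names are mine, and the storage account hostname in the usage note is a placeholder.

```python
import ipaddress
import socket


def resolve(fqdn: str) -> set[str]:
    """Addresses the local resolver returns for fqdn (run this from inside the VNet)."""
    infos = socket.getaddrinfo(fqdn, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def resolves_inside(addresses, vnet_cidr: str) -> bool:
    """True only if at least one address came back and all of them sit in the VNet range."""
    vnet = ipaddress.ip_network(vnet_cidr)
    addresses = list(addresses)
    return bool(addresses) and all(ipaddress.ip_address(a) in vnet for a in addresses)
```

For example, resolves_inside(resolve("examplestorage.blob.core.windows.net"), "10.0.0.0/16") should come back True once the Private DNS Zone is linked; a False usually means the name is still resolving to the public endpoint.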

There are two main options for connecting VNets: VNet peering and VPN gateways. VNet peering connects two VNets at the Azure backbone level with low latency and high bandwidth, so peered VNets communicate as if they were on the same network. VPN Gateway connects VNets or on-premises networks through encrypted IPsec tunnels, which for on-premises links run over the public internet, and it comes with higher latency and lower bandwidth.

Cost and latency are the real decision points between peering and VPN. VNet peering has no gateway or bandwidth tier to pay for; within a region it is metered at roughly $0.01 per GB on each side of the peering (egress from one VNet, ingress to the other), and latency is typically under 2 ms because the traffic stays on the Azure backbone. By contrast, a Site-to-Site VPN on a VpnGw1 SKU costs about $0.10 per hour plus $0.03 per GB, and you add encryption overhead that pushed latency into the 30-50 ms range for us. These are ballpark figures, so check the current pricing pages before you budget. In a recent migration we moved a data-lake workload from a hub VNet to a spoke VNet; peering cost on the order of $10 per month per direction for 1 TB of traffic, while the VPN would have come to roughly $100 at the rates above. The only time we kept the VPN was when we needed to connect to an on-premises network that still relied on IPsec tunnels.
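The comparison is simple enough to put in a few lines. The rates below are the ballpark figures quoted in this post, not current list prices, and the 730-hour month and the billed_sides parameter are my own assumptions; swap in numbers from the pricing page for your region.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def peering_cost(gb: float, per_gb: float = 0.01, billed_sides: int = 1) -> float:
    """Monthly peering cost; set billed_sides=2 if both egress and ingress are metered."""
    return gb * per_gb * billed_sides


def vpn_cost(gb: float, gateway_per_hour: float = 0.10, per_gb: float = 0.03,
             hours: float = HOURS_PER_MONTH) -> float:
    """Monthly VPN cost: gateway hours are billed whether or not traffic flows."""
    return gateway_per_hour * hours + gb * per_gb


if __name__ == "__main__":
    traffic_gb = 1000  # about 1 TB per month
    print(f"peering: ${peering_cost(traffic_gb):.2f}")
    print(f"vpn:     ${vpn_cost(traffic_gb):.2f}")
```

The fixed gateway-hour charge dominates at low volumes, which is why the VPN looks worst for small, steady workloads.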

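For Azure-to-Azure connectivity, VNet peering is usually the right choice. It works across regions (global VNet peering) and across subscriptions. What it does not do is route transitively: in a hub-and-spoke topology, spoke-to-spoke traffic needs a routing point in the hub, such as a firewall or network virtual appliance, optionally paired with Azure Route Server, or a managed Azure Virtual WAN hub. VPN Gateway earns its place mainly when one end of the connection is on-premises; between VNets, prefer peering whenever you can.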

In my experience, careful planning of VNets, subnets, and security groups makes a real difference to the security and performance of Azure workloads. Size the address space for growth, use Network Security Groups and Application Security Groups for fine-grained control over traffic flow (which also makes security and compliance requirements easier to meet), and you avoid the expensive retrofits that come from treating the network as an afterthought.