CDN & Networking Notes
Practical notes for content delivery engineering: caching, HTTP, DNS, edge routing, security, observability, and troubleshooting.
CDN Fundamentals
Core Ideas
- CDN (Content Delivery Network): A distributed platform that serves web content from locations closer to users. What it does: Lowers latency, reduces origin load, absorbs traffic spikes, and improves reliability. Use case: Serving images, JavaScript, CSS, APIs, video, and downloads globally.
- Edge location / POP (Point of Presence): A regional site where CDN servers run. What it does: Terminates client connections, applies edge rules, checks cache, and forwards misses to origin. Use case: A user in New York gets a response from a nearby edge instead of a distant origin.
- Origin server: The source of truth for content. What it does: Handles requests the CDN cannot serve from cache. Use case: Your application server, object bucket, API gateway, or media server behind the CDN.
- Request flow: Browser -> DNS -> CDN edge -> cache lookup -> origin if needed -> response back to user. What it does: Explains where latency, cache misses, redirects, TLS failures, and 5xx errors can happen. Use case: Debugging why one region is slow while another is fast.
- Reverse proxy: A server that sits in front of origin and forwards requests. What it does: Hides origin, applies policy, and can cache responses. Use case: CDN edge acting as the public entry point for a web app.
- Edge delivery mindset: Think in terms of request path, cacheability, headers, routing, and failure mode. What it does: Keeps troubleshooting structured. Use case: Separating a browser issue from an edge issue from an origin issue.
Caching Behavior
Cache Concepts
- Cache hit: The edge already has a fresh response. What it does: Returns quickly without contacting origin. Use case: Static assets with long TTLs.
- Cache miss: The edge does not have a usable response. What it does: Fetches from origin and may store the response. Use case: First request after deploy or purge.
- TTL (Time To Live): How long a cached object is considered fresh. What it does: Balances freshness and performance. Use case: Long TTL for hashed JS files, short TTL for HTML pages.
- Cache key: The values used to identify a cached object. What it does: Decides whether two requests share the same cached response. Use case: Include path and query string, but avoid unnecessary variation.
- Vary header: Tells caches which request headers affect the response. What it does: Prevents serving the wrong variant. Use case: Different compression or language responses.
- Stale content: Cached response served after freshness expires under controlled rules. What it does: Can preserve availability during origin trouble. Use case: Serve stale on origin timeout while refreshing in background.
- Purge / invalidation: Manual removal of cached content. What it does: Forces the edge to fetch a new version. Use case: Emergency rollback, content update, bad asset deployed.
- Cache poisoning risk: Incorrect cache key or header trust can store attacker-controlled content. What it does: Turns caching into a security issue. Use case: Always review user-controlled headers and query parameters.
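A minimal sketch of cache key normalization, assuming the key is host + path + a sorted query string with tracking parameters dropped. The function name build_cache_key and the IGNORED_PARAMS set are hypothetical illustrations, not any CDN's actual API.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical list of query parameters assumed not to change the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}

def build_cache_key(url: str) -> str:
    """Return a normalized key: lowercase host, path, sorted query string."""
    parts = urlsplit(url)
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    ]
    params.sort()  # so ?a=1&b=2 and ?b=2&a=1 share one cached object
    query = urlencode(params)
    host = (parts.hostname or "").lower()
    key = f"{host}{parts.path or '/'}"
    return f"{key}?{query}" if query else key
```

Two URLs that differ only in parameter order or tracking parameters map to the same key, which avoids unnecessary variation. Anything left in the key that a user controls should still be reviewed for poisoning risk.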
HTTP & Web Delivery
Protocol Basics
- HTTP (Hypertext Transfer Protocol): The request/response protocol for web traffic. What it does: Defines how clients and servers exchange data. Use case: Loading a webpage or API response.
- HTTPS (HTTP Secure): HTTP over TLS encryption. What it does: Protects data in transit. Use case: Login pages and any private data transfer.
- Headers: Metadata sent with requests/responses. What it does: Controls caching, auth, and behavior. Use case: Cache-Control, Authorization, User-Agent.
- Methods: GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS. What it does: Communicates intent. Use case: GET and HEAD are commonly cacheable; mutating methods usually are not.
- Redirects: 301, 302, 307, and 308 move clients to another URL. What it does: Changes request destination. Use case: HTTP to HTTPS, old path to new path, apex to www.
- HTTP versions: HTTP/1.1, HTTP/2, and HTTP/3. What it does: Changes connection behavior and performance. Use case: HTTP/2 multiplexing reduces connection overhead; HTTP/3 uses QUIC over UDP.
Status & Caching
- Status codes: 2xx success, 3xx redirect, 4xx client error, 5xx server error. What it does: Tells you the outcome of a request. Use case: Debug cache or origin errors.
- Cache-Control: Cache policy header. What it does: Sets whether, where, and for how long a response can be cached. Use case: max-age=3600 for static assets.
- ETag (Entity Tag): Resource version identifier. What it does: Lets clients revalidate efficiently. Use case: 304 responses for unchanged content.
- 304 Not Modified: Reuse cached content without redownloading. What it does: Saves bandwidth. Use case: Browser checks if asset changed.
- Surrogate headers: Headers meant for the CDN rather than the browser. What it does: Lets edge cache policy differ from browser cache policy. Use case: Browser caches for 5 minutes, edge caches for 1 hour.
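One way to picture the browser-versus-edge split is with the standard s-maxage directive, which applies to shared caches only. The sketch below is simplified: it assumes well-formed numeric values, does not handle quoted directive arguments containing commas, and treats a missing lifetime as "do not cache" rather than applying heuristic freshness. The function names are hypothetical.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control value into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives

def ttl_seconds(header: str, shared_cache: bool) -> int:
    """Freshness lifetime in seconds; 0 means do not serve without revalidating."""
    d = parse_cache_control(header)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared_cache and "private" in d:
        return 0  # private responses must not be stored by a shared cache
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # s-maxage overrides max-age for shared caches
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0
```

With the header "public, max-age=300, s-maxage=3600", a browser would treat the response as fresh for 5 minutes and an edge cache for 1 hour.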
Performance Terms
- Latency: Time for a request to travel to the server and back. What it does: Affects how fast pages start loading. Use case: CDN reduces latency by serving closer.
- Bandwidth: Maximum data capacity of a connection. What it does: Limits how fast large files download. Use case: Video delivery and large assets.
- Throughput: Actual delivered data rate. What it does: Real-world speed after overhead. Use case: Compare CDN performance across regions.
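A rough back-of-the-envelope sketch of how latency and throughput combine. It assumes one round trip to issue the request plus steady-state streaming, and ignores TCP slow start, TLS handshakes, and packet loss. The function name is hypothetical.

```python
def estimate_download_seconds(size_bytes: int, throughput_bps: float, rtt_seconds: float) -> float:
    """Approximate transfer time: one round trip plus bytes over throughput."""
    return rtt_seconds + (size_bytes * 8) / throughput_bps

# 20 KB asset at 50 Mbit/s: latency dominates, so a closer edge helps most.
small_far = estimate_download_seconds(20_000, 50_000_000, 0.080)
small_near = estimate_download_seconds(20_000, 50_000_000, 0.020)

# 5 MB asset at 50 Mbit/s: throughput dominates, latency matters less.
large_far = estimate_download_seconds(5_000_000, 50_000_000, 0.080)
```

Under these assumptions the small asset takes about 83 ms from the far server and about 23 ms from the near one, while the large asset takes about 0.88 s either way give or take the round trip.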
DNS, Anycast & BGP
Routing Foundations
- IP (Internet Protocol, IPv4/IPv6) address: A numeric address for a network interface or service. What it does: Identifies where traffic should go. Use case: DNS returns IPs for domains.
- DNS (Domain Name System): Maps domain names to IPs. What it does: Lets users type names instead of numbers. Use case: example.com → 93.184.216.34.
- DNS propagation: Delay before resolvers worldwide pick up a DNS change, as cached answers expire according to record TTLs. What it does: Delays changes being seen. Use case: New CDN CNAME may take hours to fully apply.
- CNAME: Alias from one hostname to another. What it does: Points a customer hostname toward a CDN hostname. Use case: www.example.com aliases to a CDN-managed hostname.
- A / AAAA records: DNS records for IPv4 and IPv6. What it does: Returns direct IP addresses. Use case: Apex domains or direct service endpoints.
- Anycast: One IP announced from many locations. What it does: Lets BGP route each user to a topologically nearby POP; withdrawing the announcement from an unhealthy POP shifts traffic elsewhere. Use case: Global CDN edge IPs.
- BGP (Border Gateway Protocol): The internet routing protocol between networks. What it does: Chooses paths between ASNs. Use case: CDN traffic steering.
- ASN (Autonomous System Number): ID for a network on the internet. What it does: Identifies routing domains in BGP. Use case: ISPs and CDNs have ASNs.
- Peering: Direct exchange of traffic between networks. What it does: Lowers latency and transit cost. Use case: CDN peering with large ISPs.
- Transit: Paid connectivity to the broader internet. What it does: Provides reach when peering is unavailable. Use case: Backup path or smaller network connectivity.
TLS & Certificates
Secure Delivery
- TLS (Transport Layer Security): Encryption protocol for HTTPS. What it does: Encrypts traffic and validates server identity. Use case: Protecting login sessions and private data.
- Certificate: A CA-signed document binding a public key to one or more hostnames. What it does: Lets browsers trust the HTTPS connection. Use case: Serving https://www.example.com without browser warnings.
- SNI (Server Name Indication): Hostname sent during TLS negotiation. What it does: Allows one IP to serve many certificates. Use case: Multi-tenant CDN edge hosts thousands of domains.
- HSTS (HTTP Strict Transport Security): Browser policy forcing HTTPS. What it does: Prevents downgrade to HTTP. Use case: Security hardening for production domains.
- OCSP stapling: Certificate revocation status sent during TLS handshake. What it does: Improves trust checks without extra browser lookup. Use case: Faster TLS validation.
- Certificate chain: Leaf, intermediate, and root certificates. What it does: Establishes trust path. Use case: Debugging "certificate not trusted" errors.
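Expiration is one of the easier certificate checks to automate. The sketch below assumes you already have the certificate's notAfter string in the format Python's ssl module reports from getpeercert(); the function name days_until_expiry is hypothetical, and the now parameter exists only so the calculation can be checked without a live connection.

```python
import ssl
import time
from typing import Optional

def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate's notAfter timestamp."""
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since epoch, UTC
    current = time.time() if now is None else now
    return (expires_at - current) / 86400

# Example with a fixed notAfter string and a fixed "now" ten days earlier.
NOT_AFTER = "Jan  5 09:34:43 2018 GMT"
ten_days_before = ssl.cert_time_to_seconds(NOT_AFTER) - 10 * 86400
remaining = days_until_expiry(NOT_AFTER, now=ten_days_before)
```

A negative result means the certificate has already expired. This covers expiration only; name, chain, and SNI still need separate checks.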
Origin & Shielding
Origin Behavior
- Origin selection: Choosing which backend receives a miss. What it does: Routes traffic to the correct service. Use case: API origin vs static asset origin.
- Load balancing: Distributes traffic across origins. What it does: Avoids overloading one backend. Use case: Multiple origin servers behind the same site.
- Health checks: Probes that verify origin availability. What it does: Keeps traffic away from failed backends. Use case: Automatic failover.
- Timeouts: Limits for connecting, reading, or writing to origin. What it does: Prevents stuck requests. Use case: Return fast failure instead of waiting forever.
- Shielding: A designated parent cache between edge and origin. What it does: Collapses origin misses and reduces origin load. Use case: Big traffic spikes or popular uncached content.
- Origin protection: Restricting direct access to backend servers. What it does: Reduces attack surface and ensures traffic goes through the edge. Use case: Firewall allow only CDN IP ranges.
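Load balancing and health checks can be sketched together as round-robin selection that skips backends marked unhealthy. The Origin and OriginPool names are hypothetical, and the healthy flag stands in for whatever a real health probe would report.

```python
from dataclasses import dataclass

@dataclass
class Origin:
    name: str
    healthy: bool = True  # assumed to be updated by an external health check

class OriginPool:
    """Round-robin over origins, skipping any marked unhealthy."""

    def __init__(self, origins):
        self.origins = list(origins)
        self._next = 0  # index where the next search starts

    def pick(self) -> Origin:
        n = len(self.origins)
        for offset in range(n):
            candidate = self.origins[(self._next + offset) % n]
            if candidate.healthy:
                self._next = (self._next + offset + 1) % n
                return candidate
        raise RuntimeError("no healthy origin available")
```

When every origin is unhealthy the pool raises instead of waiting, which matches the idea of returning a fast failure. A real edge might serve stale content at that point.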
Edge Configuration
Request Control
- Edge configuration: Rules that control request and response behavior. What it does: Shapes caching, routing, redirects, headers, and security. Use case: Cache static assets but bypass logged-in pages.
- Edge compute: Lightweight logic executed at the edge. What it does: Makes decisions before origin. Use case: Geo routing, A/B testing, bot checks, header normalization.
- Header manipulation: Add, remove, or rewrite headers. What it does: Controls cache policy, origin routing, and security behavior. Use case: Add security headers or normalize Host headers.
- URL rewriting: Change the path or query sent upstream. What it does: Decouples public URLs from backend structure. Use case: Serve /assets from an object storage origin.
- Edge dictionaries: Key-value configuration available at the edge. What it does: Enables fast runtime decisions. Use case: Allow lists, feature flags, redirect maps.
- Config versioning: Tracking and activating config changes safely. What it does: Reduces risk during deployment. Use case: Roll back a bad cache rule quickly.
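As an illustration of how edge rules might be ordered, the sketch below applies a redirect map, caches static assets, and bypasses cache for logged-in pages. The rule order, the "session" cookie name, the TTL values, and every identifier here are assumptions for the example, not a vendor's configuration language.

```python
# Hypothetical edge dictionary: old path -> new path.
REDIRECT_MAP = {"/old-pricing": "/pricing"}

# Assumed file suffixes treated as static assets.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".svg", ".woff2")

def decide(path: str, cookies: dict) -> tuple:
    """Return (action, detail) for a request, checking rules in order."""
    if path in REDIRECT_MAP:
        return ("redirect", REDIRECT_MAP[path])
    if path.startswith("/assets/") or path.endswith(STATIC_SUFFIXES):
        return ("cache", 86400)  # static assets: long TTL in seconds
    if "session" in cookies:
        return ("bypass_cache", None)  # logged-in pages go to origin
    return ("cache", 60)  # anonymous pages: short TTL in seconds
```

Static assets are checked before the session cookie so that logged-in users still get cached JavaScript and CSS.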
Security Controls
Edge Security
- WAF (Web Application Firewall): Filters malicious HTTP traffic. What it does: Blocks attack patterns before origin. Use case: SQL injection, XSS, path traversal attempts.
- DDoS (Distributed Denial of Service): High-volume attack from many sources. What it does: Attempts to exhaust bandwidth, compute, or application resources. Use case: CDN absorbs and filters attack traffic at the edge.
- Rate limiting: Restricts request volume by client, token, path, or behavior. What it does: Protects APIs and login endpoints. Use case: Limit password attempts or expensive API calls.
- Bot mitigation: Detects automated traffic. What it does: Separates likely bots from real users. Use case: Scraping, credential stuffing, fake signups.
- Access control: Rules that allow or deny traffic. What it does: Restricts sensitive paths. Use case: Admin routes only available from trusted networks.
- Security headers: Browser-enforced protections. What it does: Reduces client-side risk. Use case: HSTS, CSP, X-Frame-Options, Referrer-Policy.
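Rate limiting is often described with a token bucket, sketched below as one possible approach. The class name and parameters are hypothetical; the clock argument is injectable so the behavior can be checked without waiting on real time.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens/second."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

To limit by client, token, or path, one bucket per key can be kept in a dictionary, for example one bucket per client IP on a login endpoint.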
Logs & Monitoring
Operational Signals
- Logs: Detailed records of requests and events. What it does: Shows what happened. Use case: Debug a spike in 503 responses.
- Metrics: Numeric time-series data. What it does: Shows trends and health. Use case: Cache hit ratio, origin latency, error rate, request volume.
- Tracing: Request path visibility across systems. What it does: Shows where time is spent. Use case: Identify whether slowness is edge, network, or origin.
- Streaming logs: Near real-time log delivery. What it does: Speeds investigation. Use case: Watching an incident while traffic is still affected.
- SLO / SLA: Reliability targets and contractual guarantees. What it does: Defines acceptable availability and performance. Use case: Measure uptime and response latency.
- Incident response: Structured handling of service impact. What it does: Coordinates mitigation, communication, and follow-up. Use case: Edge errors or origin outage affecting customers.
- Postmortem: Written review after an incident. What it does: Captures cause, timeline, fixes, and prevention. Use case: Improve systems after repeated cache purge failures.
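Metrics such as cache hit ratio and error rate are typically derived from request logs. A minimal sketch, assuming each log record is a dict with hypothetical "cache" and "status" fields:

```python
def summarize(records: list) -> dict:
    """Compute request count, cache hit ratio, and 5xx error rate."""
    total = len(records)
    if total == 0:
        return {"requests": 0, "hit_ratio": 0.0, "error_rate": 0.0}
    hits = sum(1 for r in records if r["cache"] == "HIT")
    errors = sum(1 for r in records if 500 <= r["status"] <= 599)
    return {
        "requests": total,
        "hit_ratio": hits / total,
        "error_rate": errors / total,
    }

# Made-up sample records for illustration only.
sample = [
    {"cache": "HIT", "status": 200},
    {"cache": "HIT", "status": 200},
    {"cache": "MISS", "status": 200},
    {"cache": "MISS", "status": 503},
]
summary = summarize(sample)
```

For the sample above the hit ratio is 0.5 and the error rate is 0.25. Real log field names and cache status values vary by provider.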
Troubleshooting Workflow
How to Debug CDN Issues
- Start with DNS: Confirm the hostname resolves to the expected CDN path. What it does: Rules out bad records or stale propagation. Useful tools: dig, nslookup.
- Check headers: Inspect cache status, age, server, location, and cache-control. What it does: Shows whether the edge cached, missed, redirected, or bypassed. Useful tool: curl -I https://example.com.
- Compare regions: Test from different networks or locations. What it does: Finds regional routing or POP-specific behavior. Useful tools: external probes, VPN, monitoring locations.
- Separate edge from origin: Determine whether the CDN or backend generated the response. What it does: Prevents chasing the wrong layer. Use case: 503 from origin vs 503 from edge timeout.
- Validate TLS: Check certificate name, expiration, chain, and SNI. What it does: Explains browser TLS warnings. Useful tool: openssl s_client -connect host:443 -servername host.
- Read timing: Break down DNS, connect, TLS, TTFB, and download. What it does: Identifies where latency lives. Useful tool: curl -w timing output.
- Document the fix: Record symptom, scope, root cause, change made, and verification. What it does: Builds repeatable incident muscle. Use case: Customer-facing support and escalation notes.
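The timers curl reports through -w (time_namelookup, time_connect, time_appconnect, time_starttransfer, time_total) are cumulative seconds from the start of the request, so per-phase durations come from subtracting neighbors. A sketch of that arithmetic, with a hypothetical function name and made-up sample values:

```python
def phase_durations(t: dict) -> dict:
    """Convert curl's cumulative timers into per-phase durations (seconds)."""
    # time_appconnect is 0 for plain HTTP, so fall back to time_connect.
    tls_done = t["time_appconnect"] or t["time_connect"]
    return {
        "dns": t["time_namelookup"],
        "connect": t["time_connect"] - t["time_namelookup"],
        "tls": tls_done - t["time_connect"],
        "wait": t["time_starttransfer"] - tls_done,  # request sent to first byte
        "download": t["time_total"] - t["time_starttransfer"],
    }

# Made-up sample values for illustration only.
sample = {
    "time_namelookup": 0.010,
    "time_connect": 0.030,
    "time_appconnect": 0.080,
    "time_starttransfer": 0.200,
    "time_total": 0.250,
}
phases = phase_durations(sample)
```

In this sample the longest phase is the wait for the first byte, which would point at edge processing or origin response time rather than DNS or the network handshake.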
