When the network is your problem (Developer Edition).

I’ve seen many junior and even senior developers struggle to figure out where in their network topology an issue is located. Developers need to know how to troubleshoot networking issues: not necessarily how to fix them, but definitely how to help pin down where they are.

This post covers the basic tools and how to use them to figure out what a problem is and where it is.

The topology

This post assumes some knowledge of how the internet connects; a good primer can be found here

I am going to base all of the examples on the following very generic topology, which should resemble most setups, to some degree at least.

A very generic network topology.

The problem

You try to go to the website you are working on, and you are hit with

Site can’t be reached (chrome edition)

S#!%… Okay, relax… breathe…

The first thing to note is that we are not getting a response of any kind back from the server.

A response of any kind means that the network is working, and that the problem lies with the web server (IIS, Apache etc.), the code or the filesystem.

Or

You try to connect to an RPC service on a server, but it does not respond.
Either way, the problem could be the same.


Ping

Ping is a utility that sends ICMP echo requests to a given IP address and reports the round-trip time for each reply. ICMP is its own protocol and has no concept of ports, so you are only testing whether the server responds to ICMP, not whether a particular service is reachable.

Usage

ping {IP}
ping 8.8.8.8
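
If you want to script this check, something like the Python sketch below could work. It is only an illustration: the function names are my own, it shells out to the system ping binary, and it assumes the count flag is -n on Windows and -c everywhere else. Treat the exit code as a hint rather than proof.

```python
import platform
import subprocess
from typing import List, Optional


def build_ping_command(host: str, count: int = 4, system: Optional[str] = None) -> List[str]:
    """Build the ping command line for the given (or current) operating system."""
    system = system or platform.system()
    # Windows uses -n for the number of echo requests, Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def responds_to_ping(host: str, count: int = 4) -> bool:
    """Return True if the system ping command exits with status 0."""
    result = subprocess.run(
        build_ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling responds_to_ping("8.8.8.8") then gives you a simple yes/no that you can poll while waiting for a server to come back up.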

Pros

  • Tests the entire chain, from client to server.
  • Can be set to ping until stopped (ping -t on Windows), which is nice for checking when a server comes back up after a reboot.

Cons

  • Can be blocked in production firewalls.
  • If it fails, we generally have no idea why or where. The failure could be anywhere from the client outward.
  • Does not test DNS.

Test Chain — Success

On a successful ping, we know that the client can reach the server over ICMP.
The problem could be ports or DNS.

Test Chain — Failure

If it fails, we really have no idea where or why. This is where you either start pinging hop by hop, from the client outward or from the server inward, or move on to tracert.


Tracert (traceroute)

The natural next step after ping. On Windows, tracert also uses ICMP echo requests (traceroute on Linux sends UDP probes by default), but it reports response times per hop, so it shows you the route your packets take to reach the destination.

This helps you spot bottlenecks in routing and see at which hop the communication stops.

Example: the server’s local firewall is blocking all connections. With tracert we can see that everything but the server responds. Troubleshooting can now continue on the server, as we know the network is fine.

Usage

tracert {IP}
tracert 8.8.8.8

Pros

  • Tests the full chain, from client to server.
  • Shows each individual hop on the way to the destination.

Cons

  • Still uses ICMP, which may be blocked by production firewalls/servers.
  • Does not test DNS.
  • We as developers can’t do much about routing issues, other than report them where appropriate.

Test Chain — Success

If the output looks fine and shows the “expected” number of hops, we know that the client can reach the server over ICMP.
The problem could be ports or DNS.

Test Chain — Failure

If it times out after a certain point, we know that we can at least reach that point. Investigation should continue from the point of failure; everything before it can generally be ruled out (unless other problems also occur).
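
To make "the point of failure" concrete, here is a small Python sketch. It assumes you have already turned the trace into a list with one entry per hop: the responding address, or None for a hop that timed out. The function name and the sample addresses (taken from the documentation IP ranges) are made up for illustration.

```python
from typing import Optional, Sequence, Tuple


def last_responding_hop(hops: Sequence[Optional[str]]) -> Optional[Tuple[int, str]]:
    """Return (hop number, address) of the last hop that answered, or None."""
    # Walk backwards so that silent hops in the middle of the route are skipped.
    for index in range(len(hops) - 1, -1, -1):
        address = hops[index]
        if address is not None:
            return index + 1, address  # hop numbers start at 1
    return None


# Hypothetical trace: two hops answer, then everything times out.
trace = ["192.0.2.1", "198.51.100.7", None, None]
print(last_responding_hop(trace))
```

Only the trailing timeouts matter here. A single silent hop in the middle of an otherwise complete trace usually just means that router does not answer probes.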


Telnet

Telnet is used to establish TCP connections. In troubleshooting scenarios we can use it to test whether a given port is open and accessible on a remote server.

When you connect to a website, your browser establishes a TCP connection to port 80 (HTTP) or 443 (HTTPS) on that server.

There is a similar PowerShell command, Test-NetConnection, which does the same thing and falls back to a ping if the connection fails.

Usage Telnet

telnet {IP/FQDN} {PORT}

If you get a blank screen or any output, the connection has been established.

Usage Test-NetConnection

Test-NetConnection {IP/FQDN} -Port {PORT}
Test-NetConnection google.com -Port 443

Pros

  • Tells us directly whether a port is open or not, for any port.

Cons

  • Not installed by default in later versions of Windows, for security reasons. It can be installed as a feature, or you can just use Test-NetConnection.
  • Does not test DNS

Test Chain — Success

If a connection is established, we know that the network is in good shape and that the server responds on the correct port. The error is then on the actual server in question, or possibly in a load balancer (again, something to report to the appropriate team).

Test Chain — Failure

If we already know that we can ping the server and that DNS is correct, but we cannot connect on the port in question, we now know that the port is either closed on the server or blocked by a firewall.
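
The same port test can be sketched in a few lines of Python using only the standard library. The function name and the result labels are my own, and the interpretation is a rule of thumb rather than a guarantee: an actively refused connection usually means nothing is listening on that port, while a timeout usually points at a firewall silently dropping packets.

```python
import socket


def check_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connection and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no service accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, often a dropped packet.
        return "timeout"
    except OSError as error:
        # Anything else, for example a name that does not resolve.
        return f"error: {error}"
```

For example, check_port("google.com", 443) mirrors the Test-NetConnection call above.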


Nslookup (dig)

Nslookup (or dig on Linux) is used to check DNS records. Generally, the trouble with DNS comes from propagation times and the TTL on records.

The problem is usually one of:

  1. Wrong IP/Hostname in DNS entry
  2. Stale records on client

Usage

nslookup {FQDN}
nslookup google.com
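
For comparison, here is a rough Python sketch of what the client itself resolves a name to. The function name is my own. Note that socket.getaddrinfo goes through the operating system's resolver, so the hosts file and any local cache are included, whereas nslookup queries the DNS server directly. If the two disagree, a stale record on the client is a likely suspect.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the sorted, de-duplicated addresses the OS resolver gives for a name."""
    # proto=IPPROTO_TCP limits the result to one entry per address.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("google.com") returns a list of addresses that you can compare against what you expect the record to be.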

Pros

  • Checks the DNS side of things. If the name resolves to the address you expect, then everything is in order (perhaps).
    DNS changes take a while to propagate, so a record might already be correct in some parts of the internet and not yet in others. This can take up to 24 hours, but is usually done within an hour or so.

Cons

  • Only checks DNS
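
Putting the four tools together, the test chains in this post can be summarised as a small decision function. This is just my reading of the reasoning above written down as a Python sketch; the function name and the wording of the results are invented for illustration.

```python
def diagnose(dns_ok: bool, ping_ok: bool, port_ok: bool) -> str:
    """Map the outcome of the DNS, ping and port checks to a likely problem area."""
    if not dns_ok:
        # The name does not resolve to the expected address.
        return "dns: wrong record or stale cache on the client"
    if port_ok:
        # A TCP connection works, so the network path is fine.
        return "server: check the application, web server or load balancer"
    if ping_ok:
        # The host is reachable over ICMP but the port is not.
        return "port: closed on the server or blocked by a firewall"
    # Neither ping nor the port works. ICMP may simply be blocked, so trace the route.
    return "network: run tracert to see where the route stops"
```

The port check is tested before ping on purpose: ping is often blocked in production, so a working TCP connection is the stronger signal.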

Getting committed lines of code count, through git

TL;DR

git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904

Explanation

Git has a “secret” hardcoded SHA1 for the empty tree, referenced in the source code as EMPTY_TREE_SHA1, which is technically the tree of /dev/null.
You can find the hash by running git hash-object -t tree /dev/null
So with git diff --stat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 you get the difference between an empty tree and your current working tree, which gives you the number of lines per file and in total. Append HEAD to the command if you want to count only what is committed.
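
If you are curious where that hash comes from, the Python sketch below reproduces it. It assumes a repository using the default SHA-1 object format, where an object ID is the SHA-1 of a header ("type size" followed by a NUL byte) plus the content. The function name is my own.

```python
import hashlib


def git_object_id(object_type: str, content: bytes) -> str:
    """Compute a git object ID (SHA-1 object format) for the given type and content."""
    # Header is "<type> <size in bytes>" followed by a single NUL byte.
    header = f"{object_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# An empty tree has no entries, so its content is zero bytes.
print(git_object_id("tree", b""))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

So the hash is not really magic: it is the same for every repository because an empty tree always hashes to the same value.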

Note that package.json and package-lock.json can become pretty big and will inflate the count 🙂

Node-Alpine images with Git

This will be short and sweet.

I was building an image for a swagger diff API (link) and quickly realised that the Node-Jessie image is larger than 600 MB, which is just too bloody much for my 12 lines of Express.js.


This image is based on the popular Alpine Linux project, available in the alpine official image. Alpine Linux is much smaller than most distribution base images (~5MB), and thus leads to much slimmer images in general.
This variant is highly recommended when final image size being as small as possible is desired. The main caveat to note is that it does use musl libc instead of glibc and friends, so certain software might run into issues depending on the depth of their libc requirements. However, most software doesn’t have an issue with this, so this variant is usually a very safe choice. See this Hacker News comment thread for more discussion of the issues that might arise and some pro/con comparisons of using Alpine-based images.


To minimize image size, it’s uncommon for additional related tools (such as git or bash) to be included in Alpine-based images. Using this image as a base, add the things you need in your own Dockerfile (see the alpine image description for examples of how to install packages if you are unfamiliar).

https://hub.docker.com/_/node/

The problem with the Alpine image is that it does not contain Git (as you can see from the quote above), which npm needs whenever it has to install dependencies that are fetched from Git repositories.

So, the fast and easy way is to just add Git in the Dockerfile.

You simply add RUN apk --no-cache add git to your Dockerfile and you are good to go.

Doing this, I ended up with an image size of 109 MB, down from 650 MB. Not bad.
Anyway, my Dockerfile went from

To


Note the change from node:8.15-jessie to node:8.15-alpine as well.

If you want to know more about packaging simple node services inside a docker container, I highly recommend this article