Deployments
This is a rehash of an example I found (where I've lost my reference to the source document, apologies!)
Overview
We're going to Deploy a simple container. That said, there are a few options, and we can poke about under the hood to see if we can figure out what's going on.
In this instance we're going to use a stock echoserver, of which there appear to be many; we'll modify one ourselves a bit later. It's not a proper echo service in the TCP sense, but more an HTTP service that will report on the client making the connection. More like the reporting parts of a STUN server than anything else.
Deployment
This kind of stuff feels really easy:
$ kubectl create deployment source-ip-app --image=k8s.gcr.io/echoserver:1.4
deployment.apps/source-ip-app created
but, I confess, the production of k8s.gcr.io/echoserver:1.4 from nowhere has me scratching my head. How did I know about k8s.gcr.io (OK, it turns out to be a big player) or their echoserver image, let alone that I should be using version 1.4?
Here, version 1.4, at least, seems to be a common refrain in that it will get you exactly the same behaviour as whoever is demonstrating in their documentation. A good thing, except for the poor sap who has to keep all those versions online. And a bad thing, as everyone who now follows this example gets the same security risk exposure as existed at the time.
In the meanwhile, without any signatures/checksums/digests to verify, none of us is any the wiser as to whether what we're getting is what we should be getting. Still, it's the future.
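If that bothers you, one option is to pin the image by digest rather than by tag; a minimal sketch, where <digest> is a placeholder you'd substitute with the real sha256 digest reported by your registry or by pulling the image:

$ kubectl create deployment source-ip-app --image=k8s.gcr.io/echoserver@sha256:<digest>

It doesn't tell you the digest is trustworthy, of course, only that you'll keep getting the same bits.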
Anyway, we now have a deployment called source-ip-app (far better than echoserver).
What do we get overall?
$ kubectl get all
NAME                                READY   STATUS    RESTARTS   AGE
pod/source-ip-app-5978b6457-88c8l   1/1     Running   0          61s

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/source-ip-app   1/1     1            1           61s

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/source-ip-app-5978b6457   1         1         1       61s
Hmm, stuff!
Notice the deployment uses our given name, source-ip-app, but everything else is using a derived template.
Maybe we expected a Pod from a Deployment but a ReplicaSet has appeared and the Pod's name appears to be derived from the ReplicaSet's name. That fits in with the idea that Kubernetes is going to try to ensure that the implementation of the Deployment keeps running: we (implicitly) asked for one Pod and so the ReplicaSet's job is to account for the number of them running, and Kubernetes can start another if we fall short.
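If you want to see that ownership chain for yourself, the Pod's metadata records who created it. A hedged one-liner (adjust the Pod name to yours):

$ kubectl get pod source-ip-app-5978b6457-88c8l -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}{"\n"}'

which should come back with something like ReplicaSet/source-ip-app-5978b6457.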
Presumably, we can update the ReplicaSet to have more than one replica and the right thing will happen. Actually, we tweak the Deployment but, yes, the right thing will happen. We'll do that later.
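When we do get round to it, the scaling itself is a one-liner against the Deployment (not the ReplicaSet), something like:

$ kubectl scale deployment source-ip-app --replicas=3
deployment.apps/source-ip-app scaled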
We know that containers are marshalled by Pods so let's look a bit closer:
$ kubectl get pod/source-ip-app-5978b6457-88c8l -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
source-ip-app-5978b6457-88c8l   1/1     Running   0          3m54s   10.254.46.8   k8s-w2   <none>           <none>
We could have used -o yaml but -o wide gets us a couple of interesting parts: that this Pod is running on k8s-w2 (our cleverly named second worker node) and has an IP address in the Pod CIDR range (phew!).
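If you only want those two fields, a hedged alternative is to ask for them by name with custom columns:

$ kubectl get pod source-ip-app-5978b6457-88c8l -o custom-columns=IP:.status.podIP,NODE:.spec.nodeName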
Let's have a nose around on k8s-w2:
k8s-w2# ip ro | grep 10.254.46.8
10.254.46.8 dev cali4e6b13667e4 scope link

k8s-w2# ip li show dev cali4e6b13667e4
83: cali4e6b13667e4@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP mode DEFAULT group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-4b4f9c64-ed9a-393e-1fa3-d5db0c83e36c
Note the small MTU; there's a lot of IP-in-IP tunnelling going on! What's happening in the network namespace?
k8s-w2# ip netns exec cni-4b4f9c64-ed9a-393e-1fa3-d5db0c83e36c ip ad
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    ...
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if83: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1430 qdisc noqueue state UP group default
    ...
    inet 10.254.46.8/32 scope global eth0
       valid_lft forever preferred_lft forever
    ...
Interestingly, the routing table in the network namespace is using link-local addresses:
default via 169.254.1.1 dev eth0
169.254.1.1 dev eth0 scope link
and we didn't have a link-local address ourselves.
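If you're curious who answers for that non-existent 169.254.1.1, you can ask the namespace's neighbour table; on a Calico cluster like this one you'd expect the host-side cali interface (with that ee:ee:ee:ee:ee:ee MAC we saw above) to be responding by proxy ARP. A hedged check, output left as an exercise on your own node:

k8s-w2# ip netns exec cni-4b4f9c64-ed9a-393e-1fa3-d5db0c83e36c ip neigh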
Testing
We can test the Deployment by running an instance of busybox and wget'ing some stuff:
$ kubectl run busybox -it --image=busybox --restart=Never --rm
If you don't see a command prompt, try pressing enter.
/ # wget -qO - 10.254.46.8
wget: can't connect to remote host (10.254.46.8): Connection refused
Well, that turns out to be a function of the echoserver, which is listening on port 8080. Who knew?
/ # wget -qO - 10.254.46.8:8080
CLIENT VALUES:
client_address=10.254.46.10
command=GET
real path=/
query=nil
request_version=1.1
request_uri=http://10.254.46.8:8080/

SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001

HEADERS RECEIVED:
connection=close
host=10.254.46.8:8080
user-agent=Wget
BODY:
-no body in request-
/ #
You can confirm that the client_address is your busybox Pod's own IP address -- it will be different to mine.
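A quick way to check, from inside the busybox container (kubelet writes the Pod's IP into /etc/hosts, so the hostname resolves to it):

/ # hostname -i
10.254.46.10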
Services
We can now start to expose this Deployment through the three kinds of Service variant.
ClusterIP
ClusterIP is the default but we can specify it explicitly:
$ kubectl expose deployment source-ip-app --name=clusterip --port=80 --target-port=8080
service/clusterip exposed
Here we map requests to {service-IP}:80 through to {pod-IP}:8080.
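kubectl expose is really just writing a Service object for us. A hand-rolled equivalent would look something like this sketch, where the selector matches the app label that kubectl create deployment put on our Pods:

apiVersion: v1
kind: Service
metadata:
  name: clusterip
spec:
  type: ClusterIP
  selector:
    app: source-ip-app
  ports:
  - port: 80
    targetPort: 8080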
We can see our new Service:
$ kubectl get svc
NAME        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
clusterip   ClusterIP   10.103.157.57   <none>        80/TCP    25s
Not quite as useful as we'd like as there's no tie-in to our Deployment. Let's get more information:
$ kubectl get svc -o wide
NAME        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE   SELECTOR
clusterip   ClusterIP   10.103.157.57   <none>        80/TCP    61s   app=source-ip-app
Testing
Again, with busybox we can do some testing:
/ # wget -qO - 10.103.157.57 | grep ^client_address
client_address=10.254.46.10
Top!
NodePort
A NodePort is a bit more interesting. Here, Kubernetes endeavours to find a port that is free on all nodes (including masters) and sets up a Service such that all nodes can respond:
$ kubectl expose deployment source-ip-app --name=nodeport --port=80 --target-port=8080 --type=NodePort
service/nodeport exposed
This looks like:
$ kubectl get svc
NAME        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
clusterip   ClusterIP   10.103.157.57   <none>        80/TCP         15m
nodeport    NodePort    10.99.156.196   <none>        80:32647/TCP   91s
Here, we can take a look on all nodes and find:
$ ss -ntl sport = :32647
State    Recv-Q   Send-Q     Local Address:Port      Peer Address:Port   Process
LISTEN   0        4096             0.0.0.0:32647          0.0.0.0:*
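If you'd rather choose the port yourself than have Kubernetes pick one, the YAML form lets you name a nodePort explicitly, provided it falls within the cluster's NodePort range (30000-32767 by default). A sketch:

apiVersion: v1
kind: Service
metadata:
  name: nodeport
spec:
  type: NodePort
  selector:
    app: source-ip-app
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080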
Testing
Hmm, two types of testing, here. busybox is in Kubernetes (in The Matrix?) but these are genuine ports on the nodes themselves.
busybox
The normal Service access:
/ # wget -qO - 10.99.156.196 | grep ^client_address
client_address=10.254.46.10
Routing
Out of interest, though, can we see the nodes? My master is on 172.18.0.243:
/ # wget -qO - 172.18.0.243:32647
client_address=10.254.42.128
Ooh er. What's that client address, though? A quick rummage and it turns out to be the node's tunnel address:
$ ip ad show dev tunl0
5: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1430 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.254.42.128/32 scope global tunl0
       valid_lft forever preferred_lft forever
We'll get a similar answer for our first worker node (where the Pod is not running) but the answer for the second worker node is more interesting:
/ # wget -qO - 172.18.0.244:32647
client_address=172.18.0.244
No IP address rewriting into tunnels this time: the Pod is running on this node, so the request never crosses the tunnel and the source appears as the node's own address.
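Incidentally, if preserving the real client address matters (this is a source-ip app, after all), Services support an externalTrafficPolicy of Local, which avoids that rewriting at the cost of only answering usefully on nodes that actually host a Pod. A hedged one-liner:

$ kubectl patch svc nodeport -p '{"spec":{"externalTrafficPolicy":"Local"}}'
service/nodeport patched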
LoadBalancer
The final variant is something we can't, nominally, support as it requires a real-world load balancer external to Kubernetes.
However, there is a bodge which might be acceptable.
$ kubectl expose deployment source-ip-app --name=loadbalancer --port=80 --target-port=8080 --type=LoadBalancer
service/loadbalancer exposed

$ kubectl get svc loadbalancer
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
loadbalancer   LoadBalancer   10.107.159.89   <pending>     80:30881/TCP   19s
Of interest here: that port, 30881, is listening on all nodes, just like the NodePort above.
In the absence of our actual external load balancer we can bodge in an ExternalIP.
We could construct some YAML which might include:
spec:
  type: LoadBalancer
  externalIPs:
  - 172.18.0.243
where 172.18.0.243 is my master node's IP address, or we could patch the value right in:
$ kubectl patch svc loadbalancer -p '{"spec": {"type": "LoadBalancer", "externalIPs":["172.18.0.243"]}}'
service/loadbalancer patched
which now gives us:
$ kubectl get svc loadbalancer
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
loadbalancer   LoadBalancer   10.107.159.89   172.18.0.243   80:30881/TCP   4m5s
and, from anywhere on our network that can reach that address directly (or indirectly through, say, Floating IP addresses):
$ curl -s 172.18.0.243 | grep ^client_address
client_address=10.254.42.128
reports our tunl0 address as the request is routed across the Control Plane.
Clearly, bodging in a specific node's (or nodes') IP address utterly fails the very idea of Load Balancing and places all the onus on that node to be around and working.
But in a tight spot, you know...
Deleting
I guess you should delete things in an "unwinding the stack" sort of way.
If you created several different entities from the one YAML file then kubectl delete -f YAML does the right thing. Here, I suppose, deleting everything so quickly prevents any attempt to re-create, say, a Pod, as the parent Deployment will have gone too.
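For what we've built here, that unwinding would look something like:

$ kubectl delete svc clusterip nodeport loadbalancer
$ kubectl delete deployment source-ip-app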