Deployment
Update of deployment related items
Sometimes there is a need to update the persistent volume and persistent volume claim, e.g. because of a wrongly configured mount point. Since the PVC claims the PV, and the PV is used by the deployment, to completely remove the PV the PVC needs to be deleted first with "kubectl delete pvc <pvc_name>", and then the PV can be removed with "kubectl delete pv <pv_name>", followed by "kubectl patch pv <pv_name> -p '{"metadata": {"finalizers": null}}'" if the PV gets stuck in a terminating state.
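A consolidated sketch of that clean-up sequence, assuming the PV/PVC names used in the redmine example below (nfs-redmine / redmine-data-claim):
kubectl delete pvc redmine-data-claim
kubectl delete pv nfs-redmine
# if the PV hangs in "Terminating", clear its finalizers
kubectl patch pv nfs-redmine -p '{"metadata": {"finalizers": null}}'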
After deleting the PVC and re-creating the PV, you will find that the PV status becomes "Released": its claimRef is still attached to the old PVC, so we need to patch the PV with "kubectl patch pv <volume_name> --type json -p '[{"op": "remove", "path": "/spec/claimRef/uid"}]'". The PV can then re-bind with the new PVC.
Deployment example (redmine)
Refer to the following deployment manifest (deployment-redmine.yaml); a quick apply-and-verify sketch follows the manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redmine
  labels:
    app: redmine
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redmine
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: redmine
    spec:
      containers:
      - name: redmine
        image: bitnami/redmine:4.1.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          protocol: TCP
        env:
        - name: REDMINE_USERNAME
          value: admin
        - name: REDMINE_PASSWORD
          value: admin
        - name: REDMINE_DB_USERNAME
          value: root
        - name: REDMINE_DB_PASSWORD
          value: P2mobile8!886
        - name: REDMINE_DB_NAME
          value: bitnami_redmine
        - name: REDMINE_DB_MYSQL
          value: mariadb.redmine.svc.cluster.local
        lifecycle:
          postStart:
            exec:
              command: ["sh","-c","/usr/bin/curl http://storage.dev.hkg.internal.p2mt.com/storage/redmine/purple.tar --output purple.tar; tar -xvf purple.tar -C /opt/bitnami/redmine/public/themes/;\n"]
        volumeMounts:
        - mountPath: /bitnami
          name: nfs-redmine
      restartPolicy: Always
      volumes:
      - name: nfs-redmine
        persistentVolumeClaim:
          claimName: redmine-data-claim
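A minimal sketch of applying the manifest and checking the rollout (namespace flags omitted; adjust to your setup):
kubectl apply -f deployment-redmine.yaml
kubectl rollout status deployment/redmine
kubectl get pods -l app=redmine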
Here are some points worth noting
Different types of "labels"
A. metadata.labels (with value app: redmine) - This label is used to manage the deployment itself. It may be used to filter deployments based on this label. E.g. when you run "kubectl get deploy --show-labels", the following is output:
NAME READY UP-TO-DATE AVAILABLE AGE LABELS
redmine 1/1 1 1 20h app=redmine
We can see the LABELS field with "app=redmine", which is what was defined above, so when we want to filter a specific deployment we can also use this label with "kubectl get deployments -l app=redmine -o wide", which outputs the following:
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
redmine 1/1 1 1 20h redmine bitnami/redmine:4.1.1 app=redmine
B. spec.selector.matchLabels (with value app: redmine) - This label tells the ReplicaSet, an object managed by the Deployment, which pods it should identify and manage so that the desired number of instances is maintained. This label must equal the one specified in spec.template.metadata.labels.
C. spec.template.metadata.labels (with value app: redmine) - The label of the pod, which must be the same as spec.selector.matchLabels (a quick selection example follows).
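For example, the same label can be used to select the pods managed by this deployment (a quick sketch):
kubectl get pods -l app=redmine -o wide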
Spec and templates
The spec and template define the desired state of the deployment. Here are some configs worth noting.
1. deployment.spec: Defines the desired state of the pods within the deployment, e.g. how many copies of pods are required for the service containers and how each pod is formed (by defining 1 or more containers within a pod).
2. deployment.spec.selector.matchLabels: A required field which must match deployment.spec.template.metadata.labels; when specified, the deployment knows which pods under the hood are being targeted.
3. deployment.spec.template.metadata.labels: The labels attached to pods created from the template, which must match deployment.spec.selector.matchLabels.
4. deployment.spec.template.spec.containers: The formation of the pod in terms of containers. More than 1 container can be specified and, much like docker compose, the container name, image, etc. can be designated. deployment.spec.template.spec.containers.[n].ports.containerPort specifies the port used (internally) by the container; it can then be exposed with an external port / IP if the service container needs to be accessed externally.
deployment.spec.template.spec.containers.[n].lifecycle can be used to run commands at different points of the container lifecycle, e.g. one may want to execute a command right after the container is up successfully using .lifecycle.postStart.exec.command, as sketched below.
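For illustration, a minimal lifecycle fragment with both hooks (the echo commands here are hypothetical placeholders; the real postStart usage appears in the redmine manifest above):
lifecycle:
  postStart:
    exec:
      command: ["sh", "-c", "echo container started >> /tmp/lifecycle.log"]
  preStop:
    exec:
      command: ["sh", "-c", "echo container stopping >> /tmp/lifecycle.log"]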
Volumes
In this redmine deployment example, we use nfs (Network File Storage) situated in our company's synology NAS as the permanent storage source, here is how to create a volume
1. Create PV (Persistent Volume) and PVC (Persistent Volume Claim)
In K8S, we need to defined PV and PVC first before we can actually "frame" them into the deployment yaml. Here are the yamls of PV and PVC.
Persistent Volume (PV), pv-redmine.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-redmine
spec:
  storageClassName: redmine-class
  capacity:
    storage: 200Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /volume1/k8s-nfs/redmine
    server: 10.240.0.3
The PV defines the specifications of the volume. In this case we use NFS as the volume source, so we must ensure connectivity between the node and the NAS (10.240.0.3) before applying the manifest, and the NFS share needs to be created and initialized on the NAS first (a quick connectivity check is sketched below).
nfs.path is the physical location of the export on the NAS. In our case, refer to the NAS's NFS server mount point settings and fill in the mount point value; on a Synology NAS this information can be retrieved through "Shared folder > NFS rights", and the mount path can be found at the bottom.
spec.storageClassName and spec.accessModes are worth noting: storageClassName is used to bind the PV with the PVC, so the corresponding storage class name needs to be the same in the PVC, and the ReadWriteOnce access mode means the volume can only be mounted read/write by a single node.
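A quick connectivity check from a node before applying the manifest (a sketch, assuming the NFS client utilities are installed on the node):
showmount -e 10.240.0.3
sudo mount -t nfs 10.240.0.3:/volume1/k8s-nfs/redmine /mnt
ls /mnt && sudo umount /mnt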
Persistent Volume Claim (PVC), pvc-redmine.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redmine-data-claim
spec:
  storageClassName: redmine-class
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
The PVC is created to claim the volume (PV); spec.storageClassName should match the one in the PV or the claim cannot bind to the PV. After applying both manifests, the binding can be verified as sketched below.
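A minimal verification sketch (the PVC should report STATUS "Bound" against the PV):
kubectl apply -f pv-redmine.yaml -f pvc-redmine.yaml
kubectl get pv nfs-redmine
kubectl get pvc redmine-data-claim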
Linking PVC to deployment
After creating the PV and PVC, deployment.spec.template.spec.volumes should be defined. It is part of the pod specification, and deployment.spec.template.spec.volumes.[n].persistentVolumeClaim.claimName should match pvc.metadata.name so as to link the PVC up with the deployment.
Furthermore, deployment.spec.template.spec.containers.[n].volumeMounts should also be configured; volumeMounts.[n].mountPath should be filled with the pod's mount path, e.g. /etc/config, /bitnami etc.
volumeMounts.[n].name should match the corresponding volumes.[n].name (nfs-redmine in this example) so that the mountPath knows which volume it refers to. A quick in-pod check is sketched below.
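A way to confirm the mount once the deployment is running (a sketch; kubectl exec against deploy/redmine picks one pod of the deployment):
kubectl describe pod -l app=redmine
kubectl exec deploy/redmine -- df -h /bitnami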
Service
In k8s, a service is needed to expose a network application (usually containers running in pods), e.g. a redmine server.
Manifests
The following fields of the service manifest (redmine-service.yaml) are worth noting.
apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/address-pool: metallb-address-pool
    metallb.universe.tf/loadBalancerIPs: 10.230.0.12
  labels:
    app: redmine
  name: redmine
spec:
  ports:
  - name: "redmine"
    port: 80
    targetPort: 3000
  selector:
    app: redmine
  type: LoadBalancer
status:
  loadBalancer: {}
The spec.selector field instructs the service to target pods (usually defined in a deployment yaml) that carry the same label. In this case, pods with the label app: redmine will be selected and exposed by this service (named "redmine"). A quick verification is sketched below.
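Once applied, the MetalLB-assigned address (10.230.0.12 above) should show up as the EXTERNAL-IP, and the application can be reached on port 80 if the address is routable from where you run the check (a sketch):
kubectl apply -f redmine-service.yaml
kubectl get svc redmine
curl -I http://10.230.0.12/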
Identification
Labels and annotations are used to identify pods when the cluster becomes sophisticated. Labels typically carry values such as release, stable, etc., which are identifiable and can be matched by selectors.
Multiple labels can be attached to a single object.
Annotations can carry informational data such as the author, versioning, release contact, etc. These are arbitrary and intended for easy identification. An illustrative metadata fragment is sketched below.
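For instance, an object's metadata could combine several labels with informational annotations like this (a hypothetical sketch; the values are placeholders):
metadata:
  labels:
    app: redmine
    release: stable
    tier: backend
  annotations:
    author: devops-team
    version: "4.1.1"
    release-contact: ops@example.com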
Troubleshooting
1. Reading logs
System wide
Sometimes logs are useful for tracking where the problem is; use "kubectl get events --sort-by=.metadata.creationTimestamp" or "kubectl describe pods" to do so.
Deployments
kubectl logs deployment/redmine dumps the logs of the redmine container (assuming only 1 container in the deployment); use kubectl logs deployment/redmine -c <containerName> when there are multiple containers.
Pods
For pod logs, use kubectl logs -l app=redmine to dump the logs of pods with the designated label. A few commonly used variations are sketched below.
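These are standard kubectl flags; adjust the resource names to your own workloads:
kubectl logs deployment/redmine --tail=100 # last 100 lines only
kubectl logs deployment/redmine -f # follow (stream) the logs
kubectl logs -l app=redmine --all-containers=true # all containers of the labelled pods
kubectl logs <pod_name> --previous # logs of the previous (crashed) container instance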
2. Throttling error when trying to retrieve information
When receiving "Throttling request took..." while running "kubectl get all -n <ns>", run "sudo chown -R <user>:<group> .kube/" to solve the issue.
3. "The connection to the server <master>:6443 was refused - did you specify the right host or port?" when running any kubectl command
Step 1: Check the status of the docker service.
systemctl status docker.service
Step 2: Check the status of the kubelet service.
systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Sun 2023-05-28 13:19:36 HKT; 1s ago
Docs: https://kubernetes.io/docs/home/
Process: 4048306 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 4048306 (code=exited, status=1/FAILURE)
Step 3: The kubelet is unable to initialize, so check the system log further.
journalctl -f -u kubelet
where -f means follow and -u specifies the particular service.
-- Logs begin at Tue 2023-04-25 19:25:06 HKT. --
May 28 13:18:45 jacky-MS-7817-master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
May 28 13:18:45 jacky-MS-7817-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: I0528 13:18:45.187814 4048052 server.go:440] "Kubelet version" kubeletVersion="v1.22.1"
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: I0528 13:18:45.188249 4048052 server.go:868] "Client rotation is on, will bootstrap in background"
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: E0528 13:18:45.189222 4048052 bootstrap.go:265] part of the existing bootstrap client certificate in /etc/kubernetes/kubelet.conf is expired: 2022-06-12 17:11:44 +0000 UTC
May 28 13:18:45 jacky-MS-7817-master kubelet[4048052]: E0528 13:18:45.189256 4048052 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
May 28 13:18:45 jacky-MS-7817-master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
May 28 13:18:45 jacky-MS-7817-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
May 28 13:18:55 jacky-MS-7817-master systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 1647394.
Step 4: The certificate is expired; renew the certificates and check their expiration.
sudo kubeadm certs renew all
sudo kubeadm certs check-expiration
which outputs
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
MISSING! certificate the apiserver uses to access etcd
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
and
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf May 27, 2024 05:04 UTC 364d no
apiserver May 27, 2024 05:04 UTC 364d ca no
apiserver-etcd-client May 27, 2024 05:06 UTC 364d etcd-ca no
apiserver-kubelet-client May 27, 2024 05:04 UTC 364d ca no
controller-manager.conf May 27, 2024 05:04 UTC 364d no
etcd-healthcheck-client May 27, 2024 05:04 UTC 364d etcd-ca no
etcd-peer May 27, 2024 05:04 UTC 364d etcd-ca no
etcd-server May 27, 2024 05:04 UTC 364d etcd-ca no
front-proxy-client May 27, 2024 05:04 UTC 364d front-proxy-ca no
scheduler.conf May 27, 2024 05:04 UTC 364d no
indicating the certificate renewal was successful.
Step 5: Check the system log again; the kubelet still fails to initialize.
May 28 13:07:32 jacky-MS-7817-master kubelet[4044890]: E0528 13:07:32.789268 4044890 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
May 28 13:07:32 jacky-MS-7817-master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
May 28 13:07:32 jacky-MS-7817-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
Step 6: Re-initialize the k8s control plane node's kubeconfig files using the renewed certs.
sudo cp -r /etc/kubernetes /etc/kubernetes-bak # backup the k8s folders
sudo rm -rf $HOME/.kube # remove all configurations in the .kube folder
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config # copy the admin config back to the .kube directory
sudo rm -rf /etc/kubernetes/*.conf
sudo kubeadm init phase kubeconfig all # redo the kubeconfig initialization phase, which re-generates all the kubeconfig files
systemctl restart kubelet
The kubelet should be back to its running state and kubectl should run successfully. Verify by running
systemctl status kubelet.service
kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2023-05-28 13:38:05 HKT; 11h ago
Docs: https://kubernetes.io/docs/home/
Main PID: 4052761 (kubelet)
Tasks: 19 (limit: 18434)
Memory: 149.8M
CGroup: /system.slice/kubelet.service
└─4052761 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.4.1
4. "The connection to the server localhost:8080 was refused - did you specify the right host or port?" when running any kubectl command
This is due to the absence of the admin configuration in the .kube directory; simply follow the official instructions to copy admin.conf into the .kube config directory.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
5. "Cannot connect to Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: no such file or directory" when running a pipeline in the k8s gitlab runner
6. How do I allow a private repository in minikube?
Step 1: Install the minikube addons for private registries and configure ECR
https://minikube.sigs.k8s.io/docs/handbook/registry/
Step 2: Configure the deployment yaml
https://minikube.sigs.k8s.io/docs/tutorials/configuring_creds_for_aws_ecr/
7. My minikube load balancer service's external IP is stuck in the "PENDING" state
Step 1: Assign an external IP with "minikube service <service_name>"; a browser window will pop up with the accessible IP and external port.
8. I made some mistakes in a ConfigMap / Secret; how can I update it in an already running deployment?
Basically, a more k8s way to update the manifests is to create a new yaml file with a -v* suffix to indicate the version; this allows easy rollback to the previous deployment.
E.g. create configmap-runner-script-gitlab-runner-v3: copy all the contents of configmap-runner-script-gitlab-runner-v2 and start the modification.
Run
kubectl apply -f ./configmap-runner-script-gitlab-runner-v3.yaml
to apply the configMap once again. By default the changes are propagated to the pods automatically, but only when the configMap itself is mounted as a volume; in our case, we need to roll out the statefulset to reflect the configMap changes with kubectl rollout restart statefulset/gitlab-ci-runner -n gitlab.
The same applies to secrets and other related deployment resources. The update itself wouldn't cause any downtime to the service but may affect performance if insufficient resources occur during the rolling update process. The full flow is sketched below.
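A consolidated sketch of the versioned-ConfigMap flow described above (file and resource names are taken from this example; adapt them to your own):
cp configmap-runner-script-gitlab-runner-v2.yaml configmap-runner-script-gitlab-runner-v3.yaml
# edit the new file and bump metadata.name to the -v3 variant, then:
kubectl apply -f ./configmap-runner-script-gitlab-runner-v3.yaml
kubectl rollout restart statefulset/gitlab-ci-runner -n gitlab
kubectl rollout status statefulset/gitlab-ci-runner -n gitlab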
For detailed steps of how this cluster was made, please refer to the relevant blog post.