Question
We would like to set up a highly available Elasticsearch cluster in Kubernetes. We would like to deploy the objects below and scale them independently:
- Master pods
- Data pods
- Client pods
Please share your suggestions if you have implemented this kind of setup, preferably using open-source tools.
Answer 1:
Below are some points for a proposed architecture:
- Elasticsearch master nodes do not need persistent storage, so use a Deployment to manage them. Use a headless Service (clusterIP: None) so the nodes can discover each other, and a ConfigMap to manage their settings. Something like this:
apiVersion: v1
kind: Service
metadata:
name: elasticsearch-discovery
labels:
component: elasticsearch
role: master
    version: v6.5.0 # or whatever version you require
spec:
selector:
component: elasticsearch
role: master
version: v6.5.0
ports:
- name: transport
      port: 9300 # no need to expose port 9200, as master nodes don't need it
protocol: TCP
clusterIP: None
---
apiVersion: v1
kind: ConfigMap
metadata:
name: elasticsearch-master-configmap
data:
elasticsearch.yml: |
# these should get you going
# if you want more fine-grained control, feel free to add other ES settings
cluster.name: "${CLUSTER_NAME}"
node.name: "${NODE_NAME}"
network.host: 0.0.0.0
# (no_master_eligible_nodes / 2) + 1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
node.master: true
node.data: false
node.ingest: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: elasticsearch-master
labels:
component: elasticsearch
role: master
version: v6.5.0
spec:
  replicas: 3 # 3 is the recommended minimum for a master quorum
  selector: # required by apps/v1
    matchLabels:
      component: elasticsearch
      role: master
      version: v6.5.0
template:
metadata:
labels:
component: elasticsearch
role: master
version: v6.5.0
spec:
affinity:
        # you can also add node affinity in case you have a specific node pool
podAntiAffinity:
          # make sure two ES processes don't end up on the same machine
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: component
operator: In
values:
- elasticsearch
- key: role
operator: In
values:
- master
topologyKey: kubernetes.io/hostname
initContainers:
        # basic host configuration: Elasticsearch needs vm.max_map_count >= 262144 for its mmapped files
- name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: elasticsearch-master
          image: docker.elastic.co/elasticsearch/elasticsearch:6.5.0 # or your preferred image
imagePullPolicy: Always
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: elasticsearch-cluster
- name: DISCOVERY_SERVICE
value: elasticsearch-discovery
- name: ES_JAVA_OPTS
              value: "-Xms256m -Xmx256m" # or more, if you want
ports:
- name: tcp-transport
containerPort: 9300
volumeMounts:
- name: configmap
              mountPath: /etc/elasticsearch/elasticsearch.yml # adjust if your image reads its config elsewhere (the official image uses /usr/share/elasticsearch/config)
subPath: elasticsearch.yml
- name: storage
mountPath: /usr/share/elasticsearch/data
volumes:
- name: configmap
configMap:
name: elasticsearch-master-configmap
        - name: storage
          emptyDir:
            medium: ""
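Because each role lives in its own workload, the roles scale independently. One caveat when scaling the master Deployment: keep discovery.zen.minimum_master_nodes in line with the quorum formula from the ConfigMap above. In 6.x it is a dynamic cluster setting, so a rough sketch (run from inside the cluster, against any node that serves HTTP, e.g. the elasticsearch Service defined further down):

kubectl scale deployment elasticsearch-master --replicas=5
# (5 master-eligible nodes / 2) + 1 = 3
curl -X PUT "http://elasticsearch:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"discovery.zen.minimum_master_nodes": 3}}'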
Client nodes can be deployed in a very similar fashion; essentially only the node roles in elasticsearch.yml change (see the sketch below), so I won't repeat the full manifests.
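As a minimal sketch of the difference, a client (coordinating-only) node's elasticsearch.yml simply disables every role; everything else (Service, ConfigMap, Deployment without persistent storage) mirrors the master setup:

cluster.name: "${CLUSTER_NAME}"
node.name: "${NODE_NAME}"
network.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
# a coordinating-only node: not master-eligible, holds no data, does no ingest
node.master: false
node.data: false
node.ingest: false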
- Data nodes are a bit more special: they need persistent storage, so you'll have to use a StatefulSet, with PersistentVolumeClaims to create a disk for each pod. I'd do something like this:
apiVersion: v1
kind: Service
metadata:
name: elasticsearch
labels:
component: elasticsearch
role: data
version: v6.5.0
spec:
ports:
- name: http
port: 9200 # in this example, data nodes are being used as client nodes
- port: 9300
name: transport
selector:
component: elasticsearch
role: data
version: v6.5.0
type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
name: elasticsearch-data-configmap
data:
elasticsearch.yml: |
cluster.name: "${CLUSTER_NAME}"
node.name: "${NODE_NAME}"
network.host: 0.0.0.0
# (no_master_eligible_nodes / 2) + 1
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
node.master: false
node.data: true
node.ingest: false
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: elasticsearch-data
labels:
component: elasticsearch
role: data
version: v6.5.0
spec:
serviceName: elasticsearch
replicas: 1 # choose the appropriate number
selector:
matchLabels:
component: elasticsearch
role: data
version: v6.5.0
template:
metadata:
labels:
component: elasticsearch
role: data
version: v6.5.0
spec:
affinity:
# again, I recommend using nodeAffinity
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: component
operator: In
values:
- elasticsearch
- key: role
operator: In
values:
- data
topologyKey: kubernetes.io/hostname
terminationGracePeriodSeconds: 180
initContainers:
        # same host prerequisite as on the masters: vm.max_map_count >= 262144
        - name: init-sysctl
image: busybox:1.27.2
command:
- sysctl
- -w
- vm.max_map_count=262144
securityContext:
privileged: true
containers:
- name: elasticsearch-production-container
          image: docker.elastic.co/elasticsearch/elasticsearch:6.5.0 # use the same image as for the master nodes
imagePullPolicy: Always
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: CLUSTER_NAME
value: elasticsearch-cluster
- name: DISCOVERY_SERVICE
value: elasticsearch-discovery
- name: ES_JAVA_OPTS
              value: "-Xms31g -Xmx31g" # do not exceed ~32g, or the JVM loses compressed object pointers
ports:
- name: http
containerPort: 9200
- name: tcp-transport
containerPort: 9300
volumeMounts:
- name: configmap
              mountPath: /etc/elasticsearch/elasticsearch.yml # adjust if your image reads its config elsewhere (the official image uses /usr/share/elasticsearch/config)
subPath: elasticsearch.yml
- name: elasticsearch-node-pvc
mountPath: /usr/share/elasticsearch/data
readinessProbe:
httpGet:
path: /_cluster/health?local=true
port: 9200
initialDelaySeconds: 15
livenessProbe:
exec:
command:
- /usr/bin/pgrep
- -x
- "java"
initialDelaySeconds: 15
resources:
requests:
              # adjust these as per your needs; leave headroom above the JVM heap for off-heap memory and the OS page cache
memory: "32Gi"
cpu: "11"
volumes:
- name: configmap
configMap:
name: elasticsearch-data-configmap
volumeClaimTemplates:
- metadata:
name: elasticsearch-node-pvc
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: # this is dependent on your K8s environment
resources:
requests:
storage: 350Gi # choose the desired storage size for each ES data node
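Once everything is applied, you can sanity-check the cluster through the elasticsearch Service; a rough sketch (file names are hypothetical):

kubectl apply -f elasticsearch-master.yaml -f elasticsearch-data.yaml
kubectl port-forward svc/elasticsearch 9200:9200
# the node.role column shows m (master), d (data), i (ingest) for every node
curl "localhost:9200/_cat/nodes?v"
# status should be green once all shards are allocated
curl "localhost:9200/_cluster/health?pretty"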
Hope this helps!
Source: https://stackoverflow.com/questions/55216342/elasticsearch-highly-available-setup-in-kubernetes