Apache Spark
Kubernetes: Apache Spark — configuration and practical examples.
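The following command installs Apache Spark on Kubernetes using the Bitnami Helm chart (Helm 3, invoked here as helm3). Setting master.webPort to 8081 moves the master web UI off Spark's default port, 8080, which other services commonly use. The chart's Service still exposes the UI on port 80, as the port-forward command in the output shows. The notes printed after installation explain how to reach the Spark master UI and how to submit jobs to the cluster.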
helm3 install spark \
  --set master.webPort=8081 bitnami/spark
NAME: spark
LAST DEPLOYED: Mon Sep 7 15:25:26 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the Spark master WebUI URL by running these commands:
kubectl port-forward --namespace default svc/spark-master-svc 80:80
echo "Visit http://127.0.0.1:80 to use your application"
2. Submit an application to the cluster:
To submit an application to the cluster the spark-submit script must be used. That script can be
obtained at https://github.com/apache/spark/tree/master/bin. Also you can use kubectl run.
export EXAMPLE_JAR=$(kubectl exec -ti --namespace default spark-worker-0 -- find examples/jars/ -name 'spark-example*\.jar' | tr -d '\r')
kubectl exec -ti --namespace default spark-worker-0 -- spark-submit --master spark://spark-master-svc:7077 \
  --class org.apache.spark.examples.SparkPi \
  $EXAMPLE_JAR 5
** IMPORTANT: When submit an application from outside the cluster service type should be set to the NodePort or LoadBalancer. **
** IMPORTANT: When submit an application the --master parameter should be set to the service IP, if not, the application will not resolve the master. **
** Please be patient while the chart is being deployed **
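The --set override used in the install command can also be kept in a values file, which is easier to review and reuse once more settings accumulate. The sketch below is one way to do that, not part of the original walkthrough. The file name spark-values.yaml is an arbitrary choice, and the install is guarded so that the script does nothing beyond writing the file on a machine without a helm3 binary.

```shell
#!/bin/sh
# Write the same override as --set master.webPort=8081 to a values file.
# The file name spark-values.yaml is hypothetical; any name works.
cat > spark-values.yaml <<'EOF'
master:
  webPort: 8081
EOF

# Install the chart with the override read from the file (-f / --values).
# Assumes the bitnami chart repository is already registered with Helm
# and that no release named "spark" exists yet.
if command -v helm3 >/dev/null 2>&1; then
  helm3 install spark -f spark-values.yaml bitnami/spark
fi
```

Nested keys in the YAML file correspond to the dotted path given to --set, so master.webPort becomes webPort under master.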
This post is licensed under CC BY 4.0 by the author.