Set up a Mesos cluster for a Spark workload in a nutshell

Set up a Mesos cluster with Ansible

Dependencies

ansible-galaxy install JasonGiedymin.mesos
ansible-galaxy install AnsibleShipyard.ansible-zookeeper
ansible-galaxy install AnsibleShipyard.ansible-java
ansible-galaxy install geerlingguy.java
ansible-galaxy install JasonGiedymin.marathon
ansible-galaxy install JasonGiedymin.chronos
ansible-galaxy install JasonGiedymin.nodejs
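
The role list above can equivalently be kept in a requirements file and installed in one go (a sketch; the file name and layout are ansible-galaxy conventions, not part of the original commands):

```yaml
# requirements.yml -- install with: ansible-galaxy install -r requirements.yml
- src: JasonGiedymin.mesos
- src: AnsibleShipyard.ansible-zookeeper
- src: AnsibleShipyard.ansible-java
- src: geerlingguy.java
- src: JasonGiedymin.marathon
- src: JasonGiedymin.chronos
- src: JasonGiedymin.nodejs
```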

Prerequisites

playbook mesos_install.yml

- name: Mesos
  hosts: mesos
  gather_facts: yes
  vars:
    - zookeeper_hostnames: "{{ groups.zookeeper_hosts | join(':' + zookeeper_client_port + ',') }}:{{ zookeeper_client_port }}"
  tasks:
    - debug: msg="zookeeper_hostnames {{ zookeeper_hostnames }}"
- name: Zookeeper
  hosts: zookeeper_hosts
  sudo: yes
  roles:
    - role: AnsibleShipyard.ansible-zookeeper
  tasks:
    - debug: msg="{{ zookeeper_hostnames }}"
- name: Java
  hosts: all
  sudo: yes
  roles:
    - role: geerlingguy.java
- name: mesos_masters
  hosts: mesos-masters
  strategy: debug
  sudo: yes
  gather_facts: yes
  vars:
    chronos_bin_load_options_override_enabled: yes
    chronos_conf_options:
      hostname: "{{ chronos_hostname }}"
      http_port: "{{ chronos_port }}"
      mesos_framework_name: "chronos"
  tasks:
    - debug: msg="{{ chronos_conf_options }}"
  roles:
    - role: JasonGiedymin.mesos
      mesos_install_mode: master-slave
    - role: JasonGiedymin.nodejs
      nodejs_version: 0.10.25
      nodejs_global_packages:
        - express
      nodejs_path: "/usr/"
    - role: JasonGiedymin.chronos
      chronos_version: "2.4.0"
- name: mesos_slaves
  hosts: mesos-slaves
  sudo: yes
  gather_facts: yes
  vars:
    - zookeeper_hostnames: "{{ groups.zookeeper_hosts | join(':' + zookeeper_client_port + ',') }}:{{ zookeeper_client_port }}"
  tasks:
    - debug: msg="{{ zookeeper_hostnames }} "
  roles:
    - role: JasonGiedymin.mesos
      mesos_install_mode: slave
    - role: JasonGiedymin.marathon
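
The `zookeeper_hostnames` expression used in the plays joins every host in the `zookeeper_hosts` group with the client port, producing the usual `host:port,host:port` connection string. Its effect can be sketched in plain shell (hypothetical host names):

```shell
# emulate {{ groups.zookeeper_hosts | join(':' + zookeeper_client_port + ',') }}:{{ zookeeper_client_port }}
hosts="zk1 zk2 zk3"
port=2181
# append the port to each host, then join the lines with commas
zookeeper_hostnames=$(printf '%s\n' $hosts | sed "s/\$/:$port/" | paste -sd, -)
echo "$zookeeper_hostnames"   # zk1:2181,zk2:2181,zk3:2181
```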

group_vars/mesos.yml

# playbook group_vars file "inventories/development/group_vars/mesos.yml"
# Mesos
mesos_version: "1.1.0"
mesos_hostname: "{{ ansible_fqdn }}"
mesos_cluster_name: "Development Cluster"
#binary
mesos_containerizers: "mesos"
#mesos_containerizers: "docker,mesos"
mesos_quorum: '2'
mesos_log_location: '/opt/logs/mesos'
mesos_work_dir: '/opt/mesos'
#share the same zookeeper in cluster
zookeeper_hostnames: "{{ groups.zookeeper_hosts | join(':' + zookeeper_client_port + ',') }}:{{ zookeeper_client_port }}"
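
`mesos_quorum` must be a strict majority of the number of masters: 1 for a single master, 2 for three masters, and so on. A quick sanity check:

```shell
# quorum = floor(masters / 2) + 1
masters=3
quorum=$(( masters / 2 + 1 ))
echo "$quorum"   # 2
```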

Inventory

[all]
mesos-spark-kube2 ansible_ssh_host=172.16.170.70
mesos-spark-kube3 ansible_ssh_host=172.16.170.146
mesos-spark-kube4 ansible_ssh_host=172.16.170.147
mesos-spark-kube1  ansible_ssh_host=172.16.169.210
[k8s-cluster:children]
kube-node
kube-master
[kube-node]
mesos-spark-kube2
mesos-spark-kube3
mesos-spark-kube4
[etcd]
mesos-spark-kube1
[zookeeper_hosts]
mesos-spark-kube1
[mesos:children]
mesos-masters
mesos-slaves
[mesos-slaves]
mesos-spark-kube2
mesos-spark-kube3
mesos-spark-kube4
[mesos-masters]
mesos-spark-kube1
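
To double-check which hosts a group resolves to, you can run `ansible <group> --list-hosts`, or pull the group straight out of the INI file. A minimal awk sketch, run here against a throwaway copy of the inventory under /tmp:

```shell
# list the hosts of one group from an INI-style inventory
cat > /tmp/inventory.cfg <<'EOF'
[mesos-slaves]
mesos-spark-kube2
mesos-spark-kube3
mesos-spark-kube4
[mesos-masters]
mesos-spark-kube1
EOF
# print lines between the [mesos-slaves] header and the next [section]
awk '/^\[mesos-slaves\]/{f=1;next} /^\[/{f=0} f&&NF{print $1}' /tmp/inventory.cfg
```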

Let's kick up a cluster

Set up the cluster with the playbook that installs Mesos

ansible-playbook -i ~/.kargo/inventory/inventory.cfg mesos_install.yml

Verify the cluster by checking slave status

curl http://172.16.169.210:5050/master/state | jq .slaves

Verify the cluster by running Spark on Mesos

./bin/spark-submit --class org.apache.spark.examples.SparkPi  --master mesos://172.16.169.210:5050  --num-executors 20  --driver-memory 1g --executor-memory 2g --executor-cores 1 --queue thequeue file:///root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 10000
./bin/spark-shell --master mesos://172.16.169.210:5050  --num-executors 20
./bin/spark-submit --class org.apache.spark.examples.SparkPi  --master mesos://172.16.169.210:5050  --num-executors 16  --driver-memory 1g --executor-memory 1g --executor-cores 1 --queue thequeue file:///root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 10000
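
Rather than repeating --master on every invocation, the defaults can live in conf/spark-defaults.conf (a sketch; spark.executor.uri must point at a Spark tarball that every agent can download, such as the same archive fetched for the client):

```
spark.master         mesos://172.16.169.210:5050
spark.executor.uri   http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
```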

Going further: client setup

Set up an Ubuntu 16.04 client

apt-get install wget curl unzip python-setuptools python-dev
wget http://repos.mesosphere.com/debian/pool/main/m/mesos/mesos_1.2.0-2.0.1.debian8_amd64.deb
dpkg -i mesos_1.2.0-2.0.1.debian8_amd64.deb

Set up a Spark client

wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
tar -xvf spark-2.1.0-bin-hadoop2.7.tgz
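
Spark's Mesos scheduler loads the Mesos native library through JNI, so the client shell needs to know where the mesos package installed it (the path below is where the Mesosphere packages usually place it; adjust if yours differs):

```shell
# tell Spark where to find libmesos before running spark-submit / spark-shell
export MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so
```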

Submit a Spark job from the client

./bin/spark-submit --class org.apache.spark.examples.SparkPi  --master mesos://172.16.169.210:5050  --num-executors 20  --driver-memory 1g --executor-memory 2g --executor-cores 1 --queue thequeue file:///root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 10000

Submit to multiple masters via ZooKeeper

./bin/spark-submit --class org.apache.spark.examples.SparkPi  --master mesos://zk://172.16.169.210:2181/mesos  --num-executors 2  --driver-memory 1g --executor-memory 2g --executor-cores 1 --queue thequeue file:///root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 10000

Monitoring

Mesos monitoring with the REST API

master: http://172.16.169.210:5050/metrics/snapshot

slave: http://172.16.169.210:5051/metrics/snapshot
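
The snapshot endpoint returns flat JSON whose keys embed a slash, so they need bracket quoting in jq. A sketch against a saved sample (hypothetical values):

```shell
# keys like "master/slaves_active" must be quoted with .["..."] in jq
cat > /tmp/snapshot.json <<'EOF'
{"master/slaves_active": 3, "master/tasks_running": 2, "master/elected": 1}
EOF
jq '.["master/slaves_active"]' /tmp/snapshot.json   # 3
```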

Integrate with the Spark dispatcher for backend jobs

Mesos dispatcher

sbin/start-mesos-dispatcher.sh -h 172.16.169.210 --name dispatcher -m mesos://zk://172.16.169.210:2181/mesos
sbin/start-mesos-dispatcher.sh -h 9.30.101.101 --name dispatcher -m mesos://zk://mesos-medium1:2181/mesos -z zk://172.16.169.210:2181
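
Once the dispatcher is up (it listens on port 7077 by default), jobs can be submitted in cluster mode against it instead of against the Mesos master directly. A sketch of the command, echoed here rather than executed:

```shell
# cluster mode submits to the dispatcher (default port 7077), not the Mesos master
cmd="./bin/spark-submit --deploy-mode cluster --master mesos://172.16.169.210:7077 --class org.apache.spark.examples.SparkPi file:///root/spark-2.1.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.1.0.jar 100"
echo "$cmd"
```

Note that in cluster mode the driver runs on an agent, so a file:// application jar must exist at that path on every agent; otherwise host it over http:// so the cluster can fetch it.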