4. Deploying Streams on Mesos and Marathon

In this getting started guide, the Data Flow Server is run as a standalone application outside the Mesos cluster. This also requires running a local instance of Redis to store available modules. A future version will provide support for the Data Flow Server itself to run on Mesos.

  1. Deploy a Mesos and Marathon cluster.

    The Mesosphere getting started guide provides a number of options for deploying a cluster. Many of the options listed there need some additional work to get going. For example, many Vagrant-provisioned VMs use deprecated versions of the Docker client. We have included brief instructions for setting up a single-node cluster with Vagrant in Appendix A, Test Cluster. We have also used the Playa Mesos Vagrant setup. For those who want to set up a distributed cluster quickly, there is also an option to spin up a cluster on AWS using Mesosphere’s Datacenter Operating System on Amazon Web Services.

    The rest of this getting started guide assumes that you have a working Mesos and Marathon cluster and know the Marathon endpoint URL.

  2. Create a Rabbit MQ service on the Mesos cluster.

    The rabbitmq service is used for messaging between modules in the stream. There is a sample application JSON file for Rabbit MQ in the spring-cloud-dataflow-server-mesos repository that you can use as a starting point. The service discovery mechanism is currently disabled, so you need to look up the host and port to use for the connection. Depending on how large your cluster is, you may want to tweak the CPU and/or memory values.

    With the above JSON file and a running Mesos and Marathon cluster, you can deploy a Rabbit MQ application instance by issuing the following command:

    curl -X POST http://192.168.33.10:8080/v2/apps -d @rabbitmq.json -H "Content-type: application/json"

    Note the @ symbol, which references a file, and that we are using the Marathon endpoint URL of 192.168.33.10:8080. Your endpoint might be different, depending on how your installation of Mesos and Marathon is configured. Using the Marathon and Mesos UIs, you can verify that the rabbitmq service is running on the cluster.
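
    Because the endpoint differs between installations, it can help to keep it in one place. The following is a minimal sketch, not part of the project: MARATHON_URL and the three function names are chosen here for illustration, and the application id rabbitmq is an assumption that depends on the id field in your JSON file. It uses the same /v2/apps resource as the curl command above.

    ```shell
    # Assumption: MARATHON_URL is a variable name invented for this sketch.
    MARATHON_URL="${MARATHON_URL:-http://192.168.33.10:8080}"

    # Build the URL of the Marathon apps resource, dropping any trailing slash.
    marathon_apps_url() {
      printf '%s/v2/apps' "${MARATHON_URL%/}"
    }

    # Submit an application definition file (for example rabbitmq.json).
    marathon_deploy() {
      curl -X POST "$(marathon_apps_url)" -d @"$1" -H "Content-type: application/json"
    }

    # Fetch the JSON description of one application by its id.
    marathon_app_status() {
      curl -s "$(marathon_apps_url)/$1"
    }

    # Usage (requires a reachable Marathon instance):
    #   marathon_deploy rabbitmq.json
    #   marathon_app_status rabbitmq
    ```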

  3. Run a local redis-server.

    $ redis-server

    This is used by the locally running Data Flow Server to store the state of available module versions for stream definitions.

  4. Download the Spring Cloud Data Flow Server for Mesos and Marathon.

    $ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-server-mesos/1.0.0.M2/spring-cloud-dataflow-server-mesos-1.0.0.M2.jar

  5. Using the Marathon GUI, look up the host and port for the rabbitmq application. In our case it was 192.168.33.10:31916. For the deployed apps to be able to connect to Rabbit MQ, we need to provide the following property when we start the server:

    --spring.cloud.deployer.mesos.marathon.environmentVariables='SPRING_RABBITMQ_HOST=192.168.33.10,SPRING_RABBITMQ_PORT=31916'

  6. Now, run the Spring Cloud Data Flow Server for Mesos and Marathon, passing in this host/port configuration.

    $ java -jar spring-cloud-dataflow-server-mesos-1.0.0.M2.jar --spring.cloud.deployer.mesos.marathon.apiEndpoint=http://192.168.33.10:8080 --spring.cloud.deployer.mesos.marathon.memory=768 --spring.cloud.deployer.mesos.marathon.environmentVariables='SPRING_RABBITMQ_HOST=192.168.33.10,SPRING_RABBITMQ_PORT=31916'

    You can pass in properties to set default values for the memory and CPU resource requests. For example, --spring.cloud.deployer.mesos.marathon.memory=768 sets the default memory request for each deployed application to 768 instead of the built-in default of 512. You can see all the available options in the MarathonAppDeployerProperties.java file.
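
    Since the host and port have to be copied from the Marathon GUI into a long command line, a small sketch like the following can reduce typing mistakes. The variable names RABBIT_HOST, RABBIT_PORT, MARATHON_URL and APP_ENV and the function server_command are hypothetical names chosen for this example. The values are the ones used in this guide, and the script only prints the command rather than running it.

    ```shell
    # Values looked up in the Marathon GUI; these are the ones used in this guide.
    RABBIT_HOST=192.168.33.10
    RABBIT_PORT=31916
    MARATHON_URL=http://192.168.33.10:8080

    # Environment variables handed to deployed apps, as a comma-separated list.
    APP_ENV="SPRING_RABBITMQ_HOST=${RABBIT_HOST},SPRING_RABBITMQ_PORT=${RABBIT_PORT}"

    # Print the full launch command so it can be reviewed before running it.
    server_command() {
      printf 'java -jar %s --spring.cloud.deployer.mesos.marathon.apiEndpoint=%s --spring.cloud.deployer.mesos.marathon.memory=%s --spring.cloud.deployer.mesos.marathon.environmentVariables=%s\n' \
        "spring-cloud-dataflow-server-mesos-1.0.0.M2.jar" "$MARATHON_URL" "768" "'$APP_ENV'"
    }

    server_command
    ```

    Once the printed command looks right, it can be pasted into the terminal as is.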

  7. Download and run the Spring Cloud Data Flow shell.

    $ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-shell/1.0.0.M3/spring-cloud-dataflow-shell-1.0.0.M3.jar

    $ java -jar spring-cloud-dataflow-shell-1.0.0.M3.jar

  8. Register the Rabbit MQ version of the time and log app modules using the shell:

    dataflow:>module register --type source --name time --uri docker:springcloudstream/time-source-rabbit
    dataflow:>module register --type sink --name log --uri docker:springcloudstream/log-sink-rabbit

  9. Deploy a simple stream in the shell:

    dataflow:>stream create --name ticktock --definition "time | log" --deploy

    In the Mesos UI you can then look at the logs for the log sink.

    2016-04-26 18:13:03.001  INFO 1 --- [           main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8080 (http)
    2016-04-26 18:13:03.004  INFO 1 --- [           main] o.s.c.s.a.l.s.r.LogSinkRabbitApplication : Started LogSinkRabbitApplication in 7.766 seconds (JVM running for 8.24)
    2016-04-26 18:13:54.443  INFO 1 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring FrameworkServlet 'dispatcherServlet'
    2016-04-26 18:13:54.445  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : FrameworkServlet 'dispatcherServlet': initialization started
    2016-04-26 18:13:54.459  INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : FrameworkServlet 'dispatcherServlet': initialization completed in 14 ms
    2016-04-26 18:14:09.088  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:09
    2016-04-26 18:14:10.077  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:10
    2016-04-26 18:14:11.080  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:11
    2016-04-26 18:14:12.083  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:12
    2016-04-26 18:14:13.090  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:13
    2016-04-26 18:14:14.091  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:14
    2016-04-26 18:14:15.093  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:15
    2016-04-26 18:14:16.095  INFO 1 --- [time.ticktock-1] log.sink                                 : 04/26/16 18:14:16

  10. Destroy the stream:

    dataflow:>stream destroy --name ticktock
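
The log output in step 9 mixes application startup messages with the timestamps emitted by the stream. As a rough sketch, assuming the sink's stdout has been saved from the Mesos UI to a local file (the file name sink-stdout.log and the function name sink_payloads are hypothetical), the stream payloads can be picked out with grep and sed:

```shell
# Keep only lines written by the log.sink logger and strip everything
# up to and including the "log.sink   : " prefix, leaving the payload.
# Reads the files given as arguments, or standard input if none are given.
sink_payloads() {
  grep -F ' log.sink ' "$@" | sed 's/^.*log\.sink *: //'
}

# Usage:
#   sink_payloads sink-stdout.log
```

This relies on the log line layout shown in step 9. A different logging pattern in the sink application would need a different expression.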