In this getting started guide, the Data Flow Server is run as a standalone application outside the Mesos cluster. This also requires running a local instance of Redis to store available modules. A future version will provide support for the Data Flow Server itself to run on Mesos.
Deploy a Mesos and Marathon cluster.
The Mesosphere getting started guide provides a number of options for you to deploy a cluster. Many of the options listed there need some additional work to get going. For example, many Vagrant provisioned VMs are using deprecated versions of the Docker client. We have included some brief instructions for setting up a single-node cluster with Vagrant in Appendix A, Test Cluster. In addition to this we have also used the Playa Mesos Vagrant setup. For those that want to setup a distributed cluster quickly, there is also an option to spin up a cluster on AWS using Mesosphere’s Datacenter Operation System on Amazon Web Services.
The rest of this getting started guide assumes that you have a working Mesos and Marathon cluster and know the Marathon endpoint URL.
Create a Rabbit MQ service on the Mesos cluster.
The rabbitmq
service will be used for messaging between modules in the stream. There is a sample application JSON file for Rabbit MQ in the spring-cloud-dataflow-server-mesos
repository that you can use as a starting point. The service discovery mechanism is currently disabled so you need to look up the host and port to use for the connection. Depending on how large your cluster is, you way want to tweek the CPU and/or memory values.
Using the above JSON file and an Mesos and Marathon cluster installed you can deploy a Rabbit MQ application instance by issuing the following command
curl -X POST http://192.168.33.10:8080/v2/apps -d @rabbitmq.json -H "Content-type: application/json"
Note the @
symbol to reference a file and that we are using the Marathon endpoint URL of 192.168.33.10:8080
. Your endpoint might be different based on the configuration used for your installation of Mesos and Marathon. Using the Marathon and Mesos UIs you can verify that rabbitmq
service is running on the cluster.
Run a local redis-server.
$ redis-sever
This is used by the locally running Data Flow Server to store the state of available module versions for stream definitions.
Download the Spring Cloud Data Flow Server for Mesos and Marathon.
$ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-server-mesos/1.0.0.M2/spring-cloud-dataflow-server-mesos-1.0.0.M2.jar
Using the Marathon GUI, look up the host and port for the rabbitmq
application. In our case it was 192.168.33.10:31916
. For the deployed apps to be able to connect to Rabbit MQ we need to provide the following property when we start the server:
--spring.cloud.deployer.mesos.marathon.environmentVariables='SPRING_RABBITMQ_HOST=192.168.33.10,SPRING_RABBITMQ_PORT=31916'
Now, run the Spring Cloud Data Flow Server for Mesos and Marathon passing in this host/port configuration.
$ java -jar spring-cloud-dataflow-server-mesos-1.0.0.M2.jar --spring.cloud.deployer.mesos.marathon.apiEndpoint=http://192.168.33.10:8080 --spring.cloud.deployer.mesos.marathon.memory=768 --spring.cloud.deployer.mesos.marathon.environmentVariables='SPRING_RABBITMQ_HOST=192.168.33.10,SPRING_RABBITMQ_PORT=31916'
You can pass in properties to set default values for memory and cpu resource request. For example --spring.cloud.deployer.mesos.marathon.memory=768
will by default allocate additional memory for the application vs. the default value of 512. You can see all the available options in the MarathonAppDeployerProperties.java file.
Download and run the Spring Cloud Data Flow shell.
$ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-shell/1.0.0.M3/spring-cloud-dataflow-shell-1.0.0.M3.jar $ java -jar spring-cloud-dataflow-shell-1.0.0.M3.jar
Register the Rabbit MQ version of the time
and log
app modules using the shell
dataflow:>module register --type source --name time --uri docker:springcloudstream/time-source-rabbit dataflow:>module register --type sink --name log --uri docker:springcloudstream/log-sink-rabbit
Deploy a simple stream in the shell
dataflow:>stream create --name ticktock --definition "time | log" --deploy
In the Mesos UI you can then look at the logs for the log sink.
2016-04-26 18:13:03.001 INFO 1 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8080 (http) 2016-04-26 18:13:03.004 INFO 1 --- [ main] o.s.c.s.a.l.s.r.LogSinkRabbitApplication : Started LogSinkRabbitApplication in 7.766 seconds (JVM running for 8.24) 2016-04-26 18:13:54.443 INFO 1 --- [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring FrameworkServlet 'dispatcherServlet' 2016-04-26 18:13:54.445 INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization started 2016-04-26 18:13:54.459 INFO 1 --- [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization completed in 14 ms 2016-04-26 18:14:09.088 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:09 2016-04-26 18:14:10.077 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:10 2016-04-26 18:14:11.080 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:11 2016-04-26 18:14:12.083 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:12 2016-04-26 18:14:13.090 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:13 2016-04-26 18:14:14.091 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:14 2016-04-26 18:14:15.093 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:15 2016-04-26 18:14:16.095 INFO 1 --- [time.ticktock-1] log.sink : 04/26/16 18:14:16
dataflow:>stream destroy --name ticktock