4. Deploying Streams on Kubernetes

In this getting started guide, the admin server runs as a standalone application outside the Kubernetes cluster. This also requires running a local instance of Redis to store the available modules. A future version will allow the admin server itself to run on Kubernetes.

  1. Deploy a Kubernetes cluster.

    The Kubernetes getting started guide lets you choose among many deployment options, so you can pick the one you are most comfortable using. Of note, the docker-compose-kubernetes project is not among those options, but it was used by the developers of this project to run a local Kubernetes cluster using Docker Compose.

    The rest of this getting started guide assumes that you have a working Kubernetes cluster and a kubectl command line.
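    As a quick sanity check, assuming kubectl is already configured to point at your cluster, you can verify connectivity before continuing:

    $ kubectl version
    $ kubectl get nodes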

  2. Create a redis service on the Kubernetes cluster.

    The redis service will be used for messaging between the modules in the stream. There are sample replication controller and service YAML files in the spring-cloud-dataflow-admin-kubernetes repository that you can use as a starting point, since they have the required metadata set for service discovery by the modules.

    If you are deploying a Kubernetes cluster outside your local machine, setting the service to be a LoadBalancer type will give you an external IP address to connect to the redis instance.

    $ git clone https://github.com/spring-cloud/spring-cloud-dataflow-admin-kubernetes
    $ cd spring-cloud-dataflow-admin-kubernetes
    $ kubectl create -f src/etc/kubernetes/redis-controller.yml
    $ kubectl create -f src/etc/kubernetes/redis-service.yml

    You can use the command kubectl get pods to verify that the controller is running. Note that it can take a minute or so until there is an external IP address for the redis server. Use the command kubectl get services to check on the state of the service and look for a value in the EXTERNAL_IP column. Use the command kubectl delete service redis to clean up afterwards.
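    For reference, the sketch below shows roughly what a Redis service definition of type LoadBalancer could look like. The exact labels and metadata that the modules rely on for service discovery may differ, so treat this as illustrative and use the YAML files in the repository as the authoritative version:

    apiVersion: v1
    kind: Service
    metadata:
      name: redis
      labels:
        name: redis
    spec:
      type: LoadBalancer
      ports:
      - port: 6379
        targetPort: 6379
      selector:
        name: redis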

  3. Determine the location of your Kubernetes Master URL, for example:

    $ kubectl cluster-info
    
    Kubernetes master is running at https://104.154.90.44
    
       ...other output omitted...
  4. Export environment variables to connect to Kubernetes.

    The admin server uses the fabric8 Java client library to connect to the Kubernetes cluster. It can be configured using system properties, environment variables, and the Kube config file. In testing with Google Container Engine, only setting the KUBERNETES_MASTER and KUBERNETES_NAMESPACE environment variables was required; other configuration values were read from the Kube config file.

    $ export KUBERNETES_MASTER=https://104.154.90.44/
    $ export KUBERNETES_NAMESPACE=default

    This approach supports using one admin server instance per Kubernetes namespace.
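    As an alternative, a sketch assuming the fabric8 client's standard system property names kubernetes.master and kubernetes.namespace, the same settings could be passed as system properties when you start the admin server later in this guide:

    $ java -Dkubernetes.master=https://104.154.90.44/ -Dkubernetes.namespace=default -jar spring-cloud-dataflow-admin-kubernetes-1.0.0.M1.jar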

  5. Run a local redis-server.

    $ redis-server

    This is used by the locally running admin server to store the state of available module versions for stream definitions.

    Note

    If you are switching between milestone and snapshot versions of the admin server, flush the Redis keys that contain the module version information. Calling flushdb is indiscriminate, but it is a handy way to start from a clean Redis state.
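    For example, using redis-cli against the local instance:

    $ redis-cli flushdb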

  6. Download and run the Spring Cloud Dataflow Admin server for Kubernetes.

    $ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-admin-kubernetes/1.0.0.M1/spring-cloud-dataflow-admin-kubernetes-1.0.0.M1.jar
    
    $ java -jar spring-cloud-dataflow-admin-kubernetes-1.0.0.M1.jar

    Ensure that you run the admin server in the same terminal session where the Kubernetes environment variables were exported.
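    Once it starts, you can check that the admin server is listening by querying its REST API root (a quick check, assuming the default port of 9393):

    $ curl http://localhost:9393/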

  7. Download and run the Spring Cloud Dataflow shell.

    $ wget http://repo.spring.io/milestone/org/springframework/cloud/spring-cloud-dataflow-shell/1.0.0.M2/spring-cloud-dataflow-shell-1.0.0.M2.jar
    
    $ java -jar spring-cloud-dataflow-shell-1.0.0.M2.jar
  8. Deploy a simple stream in the shell.

    dataflow:>stream create --name ticktock --definition "time | log" --deploy

    You can use the command kubectl get pods to check on the state of the pods corresponding to this stream.

    $ kubectl get pods
    NAME                  READY     STATUS    RESTARTS   AGE
    redis-d207a           1/1       Running   0          50m
    ticktock-log-qnk72    0/1       Running   2          1m
    ticktock-time-r65cn   0/1       Running   3          1m

    Look at the logs for the pod deployed for the log sink.

    $ kubectl logs -f ticktock-log-qnk72
    ...
    2015-12-28 18:50:02.897  INFO 1 --- [           main] o.s.c.s.module.log.LogSinkApplication    : Started LogSinkApplication in 10.973 seconds (JVM running for 50.055)
    2015-12-28 18:50:08.561  INFO 1 --- [hannel-adapter1] log.sink                                 : 2015-12-28 18:50:08
    2015-12-28 18:50:09.556  INFO 1 --- [hannel-adapter1] log.sink                                 : 2015-12-28 18:50:09
    2015-12-28 18:50:10.557  INFO 1 --- [hannel-adapter1] log.sink                                 : 2015-12-28 18:50:10
    2015-12-28 18:50:11.558  INFO 1 --- [hannel-adapter1] log.sink                                 : 2015-12-28 18:50:11

    A useful option when troubleshooting issues, such as a container that has a fatal error starting up, is to add `--previous` to the kubectl logs command to view the log of the last terminated container.
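    For example, reusing the log sink pod name from the output above:

    $ kubectl logs --previous ticktock-log-qnk72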

  9. Destroy the stream.

    dataflow:>stream destroy --name ticktock

    Note

    If you stop and restart the admin server after streams are deployed, you will not be able to destroy them. This is a bug that will be addressed in a future release.
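    If you do end up in that state, one way to clean up is to delete the stream's Kubernetes resources by hand. A rough sketch, assuming the deployer created one replication controller per module named <stream>-<module>:

    $ kubectl get rc
    $ kubectl delete rc ticktock-time ticktock-log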