What are StreamSets environments and deployments?

An environment serves two purposes. First, it separates development from production in the traditional sense. Second, it encapsulates the resources that are typically provisioned only once in a user’s cloud environment (e.g. VPCs, networks).

A deployment is a set of engine replicas that are deployed and managed as a unit from Control Hub to run data pipelines. A deployment encapsulates the stage libraries, external resources (e.g. JDBC drivers, custom stages), security elements such as certificate bundles, and backend properties in one central location. Each deployment runs a single engine type: either Data Collector or Transformer.

Users provision deployments into a cloud environment as a set of autoscaling replicas, and the cloud environment determines and guarantees the availability of the engines at any point in time.

The same concept applies to “self-managed” deployments, where the backend properties are managed through Control Hub but the engines can be deployed on-premises via an automation tool such as Ansible.
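
The relationship between the two concepts can be sketched as a small data model. This is only an illustration of the description above, not the Control Hub API or the StreamSets SDK: every class, field, and value name here (Environment, Deployment, EngineType, shared_resources, and so on) is hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EngineType(Enum):
    """A deployment runs exactly one engine type."""
    DATA_COLLECTOR = "Data Collector"
    TRANSFORMER = "Transformer"


@dataclass
class Environment:
    """Hypothetical model: separates dev from prod and holds shared resources."""
    name: str
    purpose: str  # e.g. "development" or "production"
    shared_resources: List[str] = field(default_factory=list)  # e.g. VPCs, networks


@dataclass
class Deployment:
    """Hypothetical model: engine replicas managed as one unit in an environment."""
    name: str
    environment: Environment
    engine_type: EngineType
    replicas: int = 1
    stage_libraries: List[str] = field(default_factory=list)
    external_resources: List[str] = field(default_factory=list)  # e.g. JDBC drivers
    self_managed: bool = False  # True when engines are installed by the user's own tooling


# Illustrative values only.
prod = Environment(name="prod", purpose="production", shared_resources=["vpc", "subnet"])
ingest = Deployment(
    name="ingest",
    environment=prod,
    engine_type=EngineType.DATA_COLLECTOR,
    replicas=3,
    stage_libraries=["jdbc"],
)
```

In this sketch a deployment always points at exactly one environment, while an environment can be shared by many deployments, which mirrors the idea that the environment's resources are set up once and then reused.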

