Data Export

Setup & Authentication

Authentication is configured per destination in the Rover dashboard. Rover never holds long-lived credentials to your cloud; access is delegated through AWS IAM role assumption, which issues short-lived credentials.

You'll need an S3 bucket in your own AWS account, with an optional prefix for Rover-written objects. Once authentication is configured, Rover will write into that bucket continuously.


Amazon S3 — IAM role with external ID

When you create an S3 destination, the dashboard mints a destination-scoped external ID and shows you the AWS principal that Rover uses to assume your role. You then create an IAM role in your account that:

  1. Trusts the Rover AWS principal as the trust-policy Principal.
  2. Requires a matching sts:ExternalId condition equal to the external ID the dashboard generated for this destination. The external ID protects you against the confused-deputy problem: another Rover customer cannot assume your role even if they know its ARN.
  3. Grants the minimum permissions Rover needs to write objects under the configured bucket and prefix:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET/YOUR_PREFIX/*"
    }
  ]
}
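
Steps 1 and 2 describe the role's trust policy, which is a separate document from the permissions policy above. A minimal sketch of generating it follows. The principal ARN and external ID are hypothetical placeholders for the values the dashboard shows, and build_trust_policy is an illustrative helper, not part of any Rover tooling.

```python
import json


def build_trust_policy(rover_principal_arn: str, external_id: str) -> dict:
    """Return an IAM trust policy that lets a single principal assume the
    role, and only when it presents the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Step 1: trust the principal shown in the dashboard.
                "Principal": {"AWS": rover_principal_arn},
                "Action": "sts:AssumeRole",
                # Step 2: require the destination-scoped external ID.
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Both values are placeholders; copy the real ones from the dashboard.
    policy = build_trust_policy(
        "arn:aws:iam::111122223333:root",
        "EXAMPLE-EXTERNAL-ID",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the role's trust relationship in the IAM console or passed to whatever provisioning tool you already use.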

Rover uploads larger objects via S3 multipart, so s3:AbortMultipartUpload is required to clean up incomplete uploads on transient failures. We also recommend enabling a lifecycle rule to abort incomplete multipart uploads on the bucket so they never accrue storage costs.
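
The recommended lifecycle rule might look like the sketch below. The rule ID, the 7-day window, and the helper name are assumptions chosen for illustration; the dictionary follows the shape that boto3's put_bucket_lifecycle_configuration accepts.

```python
def build_abort_mpu_lifecycle(prefix: str = "", days: int = 7) -> dict:
    """Build a lifecycle configuration that aborts multipart uploads left
    incomplete for `days` days under `prefix` ("" means the whole bucket)."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",  # assumed name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": days
                },
            }
        ]
    }


# Applying it requires boto3 and credentials for your own account:
#
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="YOUR_BUCKET",
#       LifecycleConfiguration=build_abort_mpu_lifecycle("YOUR_PREFIX/"),
#   )
#
# That call replaces the bucket's existing lifecycle configuration, so merge
# this rule into any rules you already have before applying it.
```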

Paste the role ARN back into the dashboard. Rover will call sts:AssumeRole with the external ID for every delivery. The temporary credentials that STS returns expire on their own, so there are no static keys to rotate or manage.
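
To make the shape of that call concrete, here is a sketch of the request an sts:AssumeRole caller builds. It illustrates the AWS API only and is not Rover's code; the session name, the ARN check, and the helper name are assumptions.

```python
import re

# Loose shape check for an IAM role ARN (aws, aws-cn, aws-us-gov partitions).
ROLE_ARN_RE = re.compile(r"^arn:aws(-[a-z-]+)?:iam::\d{12}:role/.+$")


def build_assume_role_request(
    role_arn: str,
    external_id: str,
    session_name: str = "export-delivery",  # assumed session name
) -> dict:
    """Return keyword arguments for the STS AssumeRole API."""
    if not ROLE_ARN_RE.match(role_arn):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        # Must equal the sts:ExternalId condition in the trust policy.
        "ExternalId": external_id,
    }


# With boto3, the caller would then do something like:
#
#   import boto3
#   resp = boto3.client("sts").assume_role(**build_assume_role_request(...))
#   creds = resp["Credentials"]  # temporary keys plus an Expiration time
```

The ARN check is also a quick way to catch a mistyped value before pasting it into the dashboard.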

Region must match the bucket

Rover assumes the role and writes to the bucket from a region you configure in the dashboard. Set this to the same region as the bucket. S3 does not transparently route cross-region writes against a regional endpoint, so a mismatch will surface as PermanentRedirect errors.
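
If you are unsure which region a bucket lives in, S3's GetBucketLocation reports it, with two legacy quirks: buckets in us-east-1 return an empty location constraint, and some older eu-west-1 buckets return "EU". A small sketch of normalizing that value before comparing it to the region set in the dashboard (the helper names are illustrative):

```python
from typing import Optional


def bucket_region(location_constraint: Optional[str]) -> str:
    """Map a GetBucketLocation LocationConstraint to a region name."""
    if not location_constraint:
        return "us-east-1"  # empty or None means us-east-1
    if location_constraint == "EU":
        return "eu-west-1"  # legacy alias
    return location_constraint


def region_matches(configured_region: str,
                   location_constraint: Optional[str]) -> bool:
    """True when the dashboard region equals the bucket's actual region."""
    return configured_region == bucket_region(location_constraint)


# With boto3, the raw value comes from:
#
#   import boto3
#   resp = boto3.client("s3").get_bucket_location(Bucket="YOUR_BUCKET")
#   region_matches("eu-central-1", resp["LocationConstraint"])
```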

KMS-encrypted buckets

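If your bucket's default encryption uses SSE-KMS with a customer managed key, you'll also need to grant the role kms:GenerateDataKey and kms:Decrypt on that key. S3 requires both for multipart uploads, because it has to decrypt the uploaded parts when it completes the upload. Any downstream consumers that read the objects need kms:Decrypt as well. SSE-S3 and SSE-KMS with an AWS-managed key require no additional permissions.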
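
As a sketch, the extra KMS statement can be appended to the role's permissions policy like this. The key ARN is a hypothetical placeholder and the helper is illustrative. kms:Decrypt is included for the writing role because AWS documents it as required, alongside kms:GenerateDataKey, for multipart uploads to SSE-KMS buckets.

```python
import copy


def with_kms_statement(policy: dict, kms_key_arn: str) -> dict:
    """Return a copy of `policy` with a statement granting the KMS actions
    needed to write SSE-KMS objects, including multipart uploads."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    updated["Statement"].append(
        {
            "Effect": "Allow",
            "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
            "Resource": kms_key_arn,
        }
    )
    return updated


if __name__ == "__main__":
    base = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
                "Resource": "arn:aws:s3:::YOUR_BUCKET/YOUR_PREFIX/*",
            }
        ],
    }
    # Placeholder key ARN; use the ARN of your bucket's default key.
    full = with_kms_statement(
        base, "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    )
    print(len(full["Statement"]))
```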


Pausing and disabling

Disabling a destination from the dashboard stops new deliveries. In-flight deliveries finish; nothing new is queued.

Re-enabling does not replay individual changes that occurred during the disabled window. Instead, it starts a fresh baseline snapshot from current state for the three state datasets (fan_profiles, fan_identity_graph, fan_profile_merges), then resumes the change stream from re-enable time forward. fan_track_events resumes flowing as new events occur; events that fired while the destination was disabled are not delivered.

Because re-enabling produces a new snapshot, the same record-id and loader caveats as a snapshot rerun apply: rows are re-emitted with new deterministic id values, and an upsert-by-id load will see them as new records unless you deduplicate on (roverID, updatedAt) (or the dataset's event-time field).

A destination cannot be paused while its initial snapshot is still in progress; wait until the snapshot phase has completed before pausing.

Snapshot reruns

If a snapshot delivery fails (for example, because bucket permissions were removed mid-snapshot or the role's trust policy was changed), the dashboard surfaces the failure and offers a retry that re-runs the snapshot. This is the supported recovery path.

If you need a fresh full re-export for any other reason (recovering from a loader incident on your side, rebuilding the warehouse from scratch), contact your Rover account manager to coordinate the re-snapshot.

Snapshot reruns produce new record ids

A snapshot rerun re-emits every existing row with a new deterministic id, not the same id it had originally. Plan loader logic so that rerun rows do not produce duplicates: deduplicate on (roverID, updatedAt) (or the dataset's event-time field), or truncate and reload the affected partitions before letting the rerun land.
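
A minimal in-memory sketch of the deduplication approach, assuming each row is a dict carrying the roverID and updatedAt fields named above. The helper name and the sample values are hypothetical; in a warehouse the same idea would typically be a MERGE or a window function over the same key.

```python
from typing import Dict, Iterable, Iterator, Tuple


def dedupe_rows(
    rows: Iterable[Dict],
    key_fields: Tuple[str, ...] = ("roverID", "updatedAt"),
) -> Iterator[Dict]:
    """Yield the first row seen for each key, ignoring the `id` column,
    which changes between the original snapshot and a rerun."""
    seen = set()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            continue  # same logical record re-emitted by a rerun
        seen.add(key)
        yield row


if __name__ == "__main__":
    # Hypothetical rows: the second is a rerun copy of the first.
    rows = [
        {"id": "a1", "roverID": "fan-1", "updatedAt": "2024-01-01T00:00:00Z"},
        {"id": "b7", "roverID": "fan-1", "updatedAt": "2024-01-01T00:00:00Z"},
        {"id": "c3", "roverID": "fan-2", "updatedAt": "2024-01-02T00:00:00Z"},
    ]
    for row in dedupe_rows(rows):
        print(row["id"])
```

For datasets keyed on an event-time field instead of updatedAt, pass that field name in key_fields.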

Object overwrite behavior

Rover writes objects with deterministic, content-derived keys, so the same logical file always lands at the same key. Re-deliveries after transient failures will overwrite an existing object at the same key. If you have S3 versioning enabled on the destination bucket, expect occasional non-current versions to appear; the latest version is always the canonical one.
