CI Integration

Marina is designed to work headlessly in CI pipelines. Every registry type that requires credentials supports an --auth-env flag that names an environment variable holding the secret. No interactive prompts, no stored session files needed on the runner.

General pattern

When adding a registry, pass --auth-env with the name of the environment variable that will hold the credentials at runtime. Only the variable name is saved in the registry config; the secret value is read from the environment when Marina runs.

marina registry add <name> <uri> --auth-env MY_SECRET_VAR

Then in your CI environment, set MY_SECRET_VAR to the appropriate value for the registry type (see below) and run Marina commands normally.
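A CI script can fail fast when the secret is missing, before any Marina command runs. The following is a minimal sketch of such a preflight check; the helper name require_env is hypothetical and not part of Marina.

```shell
#!/bin/sh
# require_env NAME: succeed only if the environment variable NAME is
# exported and non-empty. Hypothetical CI helper, not part of Marina.
require_env() {
  value=$(printenv "$1") || value=""
  if [ -z "$value" ]; then
    echo "error: environment variable $1 is not set" >&2
    return 1
  fi
}

# Usage in a CI step (MY_SECRET_VAR is the name passed to --auth-env):
#   require_env MY_SECRET_VAR && marina pull namespace/dataset:tag
```

The check only looks at whether the variable is present; it does not validate the credential itself.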

Config and cache directories

By default Marina reads its config from ~/.config/marina/marina.rl and writes cached data to ~/.cache/marina/. Both paths can be overridden with environment variables:

Variable            Default             Purpose
MARINA_CONFIG_DIR   ~/.config/marina    Directory containing marina.rl
MARINA_CACHE_DIR    ~/.cache/marina     Downloaded dataset cache

This makes it easy to ship the registry configuration as part of the repo and point CI runners at it directly — no marina registry add step needed on the runner:

repo/
└── ci/
    └── marina/
        └── marina.rl   ← registry config with auth_env set

env:
  MARINA_CONFIG_DIR: ${{ github.workspace }}/ci/marina
  MARINA_SA_JSON: ${{ secrets.MARINA_SA_JSON }}
  MARINA_CACHE_DIR: /tmp/marina-cache

steps:
  - run: marina pull namespace/dataset:tag

A minimal marina.rl for a Google Drive registry looks like:

registries {
  prod {
    uri = "gdrive://<folder_id>"
    kind = "gdrive"
    auth_env = "MARINA_SA_JSON"
  }
}
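When debugging a runner, it can help to print which directories are in effect. The sketch below mirrors the documented defaults for MARINA_CONFIG_DIR and MARINA_CACHE_DIR using shell parameter expansion. It is an illustration only, not Marina's actual lookup code: the function name marina_dirs is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```shell
#!/bin/sh
# marina_dirs: print the config file and cache directory that the
# documented defaults and overrides would resolve to.
# Assumption: an empty variable falls back to the default, like an unset one.
marina_dirs() {
  config_dir="${MARINA_CONFIG_DIR:-$HOME/.config/marina}"
  cache_dir="${MARINA_CACHE_DIR:-$HOME/.cache/marina}"
  echo "config file: $config_dir/marina.rl"
  echo "cache dir:   $cache_dir"
}

# Usage: call marina_dirs in a CI step before the first Marina command.
```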

Google Drive

Marina uses a Google Service Account for unattended Drive access.

One-time setup

  1. Create a Service Account in Google Cloud Console.
  2. Grant it access to the target Drive folder by sharing the folder with the service account's email address.
  3. Download the key as JSON.
  4. Store the JSON content as a CI secret (e.g. MARINA_SA_JSON).

Add the registry (done once; the resulting marina.rl can be committed to the repo):

marina registry add prod gdrive://<folder_id> --auth-env MARINA_SA_JSON

CI pipeline

env:
  MARINA_SA_JSON: ${{ secrets.MARINA_SA_JSON }}

steps:
  - run: marina pull namespace/dataset:tag
  - run: marina push namespace/dataset:tag ./data/

The env var can hold either the raw JSON content ({"type":"service_account",...}) or a path to the JSON file on disk; Marina accepts both.
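To catch a misconfigured secret early, a CI script can check which of the two forms a value takes. This is a rough sketch under stated assumptions: the function name sa_credential_kind is hypothetical, and the heuristic (raw JSON starts with "{", anything else must be a readable file) is a guess at a reasonable check, not a description of how Marina itself distinguishes the two.

```shell
#!/bin/sh
# sa_credential_kind VALUE: print "json", "file", or "unknown".
# Assumption: raw JSON starts with "{"; any other value is expected
# to be the path of a readable regular file.
sa_credential_kind() {
  case "$1" in
    "{"*) echo json ;;
    *)
      if [ -f "$1" ] && [ -r "$1" ]; then
        echo file
      else
        echo unknown
      fi
      ;;
  esac
}

# Usage in a CI step:
#   [ "$(sa_credential_kind "$MARINA_SA_JSON")" != unknown ] || exit 1
```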

SSH

Marina authenticates with either a private key or a password, depending on the value of the auth_env variable at runtime.

Key-based auth (recommended)

The env var must hold the path to a private key file, not the key content itself. If the key file already exists on the runner, set the variable to that path:

env:
  MARINA_SSH_KEY: ${{ secrets.MARINA_SSH_KEY }}       # path to key file
  MARINA_SSH_KEY_PASSPHRASE: ${{ secrets.SSH_PASSPHRASE }}  # omit if unencrypted

Or store the private key content as a CI secret and write it to a temp file in the pipeline:

steps:
  - run: |
      umask 077   # new files are created without group/other access
      printf '%s\n' "$SSH_PRIVATE_KEY" > /tmp/marina_key
      chmod 600 /tmp/marina_key
      echo "MARINA_SSH_KEY=/tmp/marina_key" >> "$GITHUB_ENV"
    env:
      SSH_PRIVATE_KEY: ${{ secrets.SSH_PRIVATE_KEY }}

  - run: marina pull namespace/dataset:tag

Password auth

If the env var value is not a file path, Marina treats it as a password directly:

env:
  MARINA_SSH_KEY: ${{ secrets.SSH_PASSWORD }}

Register the registry (the command is the same for key-based and password auth):

marina registry add prod ssh://user@host:22/path/to/registry --auth-env MARINA_SSH_KEY

For encrypted keys, Marina automatically reads {VAR}_PASSPHRASE: if --auth-env MARINA_SSH_KEY is set, it looks for MARINA_SSH_KEY_PASSPHRASE.
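The rules above (a file path means key auth, anything else is a password, and the passphrase variable is the auth_env name plus _PASSPHRASE) can be sketched as a small check for CI scripts. The function name ssh_auth_summary is hypothetical, and the logic only restates the rules described here; it is not Marina's implementation.

```shell
#!/bin/sh
# ssh_auth_summary AUTH_ENV: describe how the value of the variable named
# AUTH_ENV would be interpreted under the rules described in this section.
# Never prints the secret itself. Hypothetical helper, not part of Marina.
ssh_auth_summary() {
  value=$(printenv "$1") || { echo "$1 is not set"; return 1; }
  if [ -f "$value" ]; then
    echo "key file; passphrase variable: ${1}_PASSPHRASE"
  else
    echo "password"
  fi
}

# Usage in a CI step:
#   ssh_auth_summary MARINA_SSH_KEY
```

Running it before marina pull makes it obvious when key content was placed in the variable directly, since that case would be reported as a password.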

SSH host key verification

Marina currently accepts any host key automatically. This is intentional for ease of use in CI, but it means connections are not protected against man-in-the-middle attacks. Make sure the hostname or IP in your registry URI is correct and that the network between the runner and the host is trusted.