Troubleshooting

Use this page as a quick playbook when something goes sideways. Start with the symptom, work through the checks, and capture diagnostics before escalating.

How to use this page

Find the symptom that best matches what you see.
Run the quick checks in order; most issues are solved within the first two steps.
Collect the diagnostics listed under Capture before escalating (where a section has one) and under Diagnostics and logs, and attach them to support requests.

Connectivity issues

Cannot access the web UI

Symptom: Browser cannot reach the UI from another machine or the connection is refused.
Quick checks:
- The default bind address is 127.0.0.1:3000, which accepts local connections only.
- To accept remote connections, start the server with a public bind address: lyftdata run server --bind-address 0.0.0.0:3000.
- Alternatively, set LYFTDATA_BIND_ADDRESS=0.0.0.0:3000 in the environment.
- Confirm firewalls or security groups allow inbound TCP 3000.
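
To tell a bind-address or firewall problem apart from a browser problem, it can help to probe the port from the machine that cannot connect. The following is a minimal sketch that assumes bash and the coreutils timeout command are available on that machine; port_open is a hypothetical helper name, not part of lyftdata.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so it needs bash but no extra tools such as nc.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (replace "server" with the host that runs lyftdata):
# port_open server 3000 && echo "reachable" || echo "blocked, refused, or not listening"
```

If the probe succeeds locally on the server host but fails from the remote machine, the bind address or a firewall rule is the likely cause.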

"Login failed" or CLI cannot reach the server

Symptom: lyftdata login admin fails or CLI commands time out.
Quick checks:
- Ensure the server is running and reachable at LYFTDATA_URL (default https://localhost:3000/).
- If using a non-default port or address, export LYFTDATA_URL with the correct scheme, for example https://server:3000/ (or http://server:3000/ only when the server runs with --disable-tls).
- If the server is using the default self-signed certificate, re-run the CLI command with --tls-insecure (or set LYFTDATA_TLS_INSECURE=true) until you install a trusted certificate.
- If the password is lost and no admin session remains, follow Reset an admin password.
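
The checks above separate into name resolution, connectivity, and TLS trust, and a generic HTTP client reports each of these differently. The sketch below uses curl (an assumption; it is not part of lyftdata) with certificate verification left on, and maps curl's documented exit codes to the matching quick check. Both function names are hypothetical.

```shell
# Hypothetical helper: translate a curl exit code into the quick check it points to.
explain_curl_exit() {
  case "$1" in
    0)  echo "server reachable" ;;
    6)  echo "host name did not resolve: check the host in LYFTDATA_URL" ;;
    7)  echo "connection failed: check that the server is running, the port, and firewalls" ;;
    28) echo "timed out: check the network path and firewalls" ;;
    35) echo "TLS handshake failed: the server may run with --disable-tls, check the URL scheme" ;;
    60) echo "certificate not trusted: use --tls-insecure or install a trusted certificate" ;;
    *)  echo "curl exit code $1: see the curl manual" ;;
  esac
}

# Hypothetical helper: probe LYFTDATA_URL, falling back to the default from this page.
probe_server() (
  url="${LYFTDATA_URL:-https://localhost:3000/}"
  curl --silent --output /dev/null --max-time 5 "$url"
  explain_curl_exit "$?"
)
```

Running probe_server in the same shell as the failing CLI command shows whether the CLI and the probe see the same LYFTDATA_URL.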

Worker fails to register or stays offline

Symptom: Worker shows offline in the dashboard and does not receive jobs.
Quick checks:
- Ensure LYFTDATA_URL points to the server API (include scheme and port).
- Verify API key and worker ID (LYFTDATA_WORKER_API_KEY, LYFTDATA_WORKER_ID).
- Confirm network reachability from the worker host to the server on TCP 3000.
Capture before escalating:
- Worker logs (journalctl -u lyftdata-worker or console output).
- Recent server logs showing registration attempts.
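
A frequent cause of the first check failing is a LYFTDATA_URL without a scheme or port. The following sketch validates the shape of the value before the worker starts. It is an illustration only: valid_server_url is a hypothetical helper, and the pattern deliberately rejects forms it does not understand, such as bracketed IPv6 literals.

```shell
# Hypothetical helper: succeeds only if the URL has an http or https scheme,
# a host, and an explicit port, with an optional trailing slash.
valid_server_url() {
  printf '%s\n' "$1" | grep -Eq '^https?://[^/:[:space:]]+:[0-9]+/?$'
}

# Example:
# valid_server_url "$LYFTDATA_URL" || echo "LYFTDATA_URL needs scheme and port" >&2
#
# To capture recent worker logs on a systemd host:
# journalctl -u lyftdata-worker --since "1 hour ago" --no-pager > worker.log
```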

Licensing and onboarding

EULA prompt blocks automation

Set LYFTDATA_LICENSE_EULA_ACCEPT=yes in the environment for the first run to bypass the interactive prompt.
Apply the same variable when automating worker or CLI runs that would otherwise prompt.

Evaluation license expired or missing

Check the License page in the UI for status and expiry.
Reapply the license through the UI or restart the server with an updated LYFTDATA_LICENSE environment variable if you manage licenses non-interactively.

Installation and permissions

Server will not start because of staging directory permissions

Verify that the service account owns the staging directory and can write to it.
Avoid running the server as root; create a dedicated user and fix ownership (chown -R lyftdata:lyftdata /var/lib/lyftdata).
If running on Windows, run the service as an account with Modify permissions on the staging directory.
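
On Linux, the ownership and write checks can be scripted. The sketch below is meant to be run in a shell started as the service account, with the staging directory as its argument. check_staging_dir is a hypothetical helper, and /var/lib/lyftdata is only the example path used on this page.

```shell
# Hypothetical helper: verify the current user owns the directory and can write to it.
check_staging_dir() (
  dir="$1"
  [ -d "$dir" ] || { echo "missing directory: $dir" >&2; exit 1; }
  # find prints the path only when the owner matches the current user ID.
  [ -n "$(find "$dir" -prune -user "$(id -u)" 2>/dev/null)" ] \
    || { echo "not owned by uid $(id -u): $dir" >&2; exit 1; }
  [ -w "$dir" ] || { echo "not writable: $dir" >&2; exit 1; }
  echo "ok: $dir"
)

# Example:
# check_staging_dir /var/lib/lyftdata
```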

Server startup fails in Docker with keyring/DBus errors

Symptom: startup aborts with errors like "failed to access variables master key" and "Unable to autolaunch a dbus-daemon without a $DISPLAY for X11".
Cause: headless containers usually do not have a desktop keyring/DBus session, so keyring-backed master key loading fails.
Quick checks:
- Configure env-backed master keys for server and workers.
- Use --variables-master-key-source env (or LYFTDATA_VARIABLES_MASTER_KEY_SOURCE=env) for the server.
- Do not use legacy settings like LYFTDATA_KEYRING_BACKEND, LYFTDATA_MASTER_KEY, or --master-key-backend; current binaries use scoped master-key variables.
- External workers require a licensed deployment; Community Edition supports only the built-in worker.
- For a full container setup guide, see Docker and Docker Compose.

Example Compose file with env-backed master keys, TLS disabled, and two external workers:

services:
  lyft-server:
    build: .
    restart: unless-stopped
    command:
      - run
      - server
      - --disable-tls
      - --bind-address
      - 0.0.0.0:3000
      - --variables-master-key-source
      - env
    ports:
      - "3000:3000"
    environment:
      LYFTDATA_LICENSE_EULA_ACCEPT: "yes"
      # Recommended for published container ports: create the first admin explicitly
      # rather than depending on the local setup-link bootstrap flow.
      LYFTDATA_ADMIN_INIT_PASSWORD: "ChangeMeVerySoon1"
      # Optional but recommended: bootstrap the license non-interactively (required for external workers on first run)
      LYFTDATA_LICENSE: "<paste-your-license-jwt>"
      LYFTDATA_STAGING_DIR: "/data"
      LYFTDATA_AUTO_ENROLLMENT_KEY: "ChangeThisEnrollmentKey!"
      LYFTDATA_VARIABLES_MASTER_KEY_SOURCE: "env"
      LYFTDATA_VARIABLES_MASTER_KEY: "<64-hex-chars>"
      LYFTDATA_CREDENTIAL_MANAGER_MASTER_KEY_SOURCE: "env"
      LYFTDATA_CREDENTIAL_MANAGER_MASTER_KEY: "<64-hex-chars>"
      # Required in headless containers because the built-in worker runs inside the server
      LYFTDATA_SETTINGS_MASTER_KEY_SOURCE: "env"
      LYFTDATA_SETTINGS_MASTER_KEY: "<64-hex-chars>"
    volumes:
      - ./lyft_data/server:/data

  worker-alpha:
    build: .
    restart: unless-stopped
    command:
      - run
      - worker
      - --url
      - http://lyft-server:3000
      - --worker-name
      - worker-alpha
      - --worker-jobs-dir
      - /data
    depends_on:
      - lyft-server
    environment:
      LYFTDATA_LICENSE_EULA_ACCEPT: "yes"
      LYFTDATA_AUTO_ENROLLMENT_KEY: "ChangeThisEnrollmentKey!"
      LYFTDATA_SETTINGS_MASTER_KEY_SOURCE: "env"
      LYFTDATA_SETTINGS_MASTER_KEY: "<64-hex-chars>"
    volumes:
      - ./lyft_data/worker-alpha:/data

  worker-beta:
    build: .
    restart: unless-stopped
    command:
      - run
      - worker
      - --url
      - http://lyft-server:3000
      - --worker-name
      - worker-beta
      - --worker-jobs-dir
      - /data
    depends_on:
      - lyft-server
    environment:
      LYFTDATA_LICENSE_EULA_ACCEPT: "yes"
      LYFTDATA_AUTO_ENROLLMENT_KEY: "ChangeThisEnrollmentKey!"
      LYFTDATA_SETTINGS_MASTER_KEY_SOURCE: "env"
      LYFTDATA_SETTINGS_MASTER_KEY: "<64-hex-chars>"
    volumes:
      - ./lyft_data/worker-beta:/data

Note: if your server is running with TLS enabled, switch worker URL values to https://lyft-server:3000 and set LYFTDATA_TLS_INSECURE=true for local/self-signed evaluation.
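
The <64-hex-chars> placeholders need real values before the stack will start. One way to produce a 64-character hex string is to hex-encode 32 random bytes, as sketched below. This assumes any random value of that length is acceptable, which this page does not state explicitly; gen_hex_key is a hypothetical helper.

```shell
# Hypothetical helper: print 32 random bytes as 64 lowercase hex characters.
gen_hex_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \t\n'
  echo
}

# Example: generate a value and paste it into the Compose file or an env file.
# gen_hex_key
```

Keep the generated keys in a secrets store or an env file excluded from version control rather than committing them with the Compose file.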

Worker exits immediately on startup

Confirm the binary matches the host architecture.
Check that required environment variables are set (LYFTDATA_URL, LYFTDATA_WORKER_API_KEY).
Review stdout/stderr for errors; if possible, run the worker in the foreground once to capture the full startup output (add -v for more logs).
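
The environment check can be automated before the worker starts. The sketch below reports every listed variable that is unset or empty in the exported environment. require_env is a hypothetical helper, and the variable list comes from this page.

```shell
# Hypothetical helper: fail if any listed variable is unset or empty
# in the exported environment, naming each missing one on stderr.
require_env() (
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Example:
# require_env LYFTDATA_URL LYFTDATA_WORKER_API_KEY || exit 1
```

For the architecture check, comparing the output of uname -m with the output of file on the worker binary is usually enough.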

Worker local state is corrupted or auth lease is stuck

Symptom: startup errors mention malformed sqlite databases, auth lease validation loops, or the worker never recovers after credential-related failures.
Quick checks:
- Stop the worker service first (systemctl stop lyftdata-worker, sc stop LyftDataWorker, or equivalent).
- Confirm the worker state directory (LYFTDATA_JOBS_DIR or default worker cache path).
- Start once with one of the reset modes:
  - lyftdata-worker --reset-state auth
  - lyftdata-worker --reset-state repair
  - lyftdata-worker --reset-state full
- Restart normally without --reset-state after the one-time repair/reset.
Reset mode guidance:
- auth: clears cached worker auth lease material so the worker re-fetches fresh auth state from the server.
- repair: runs sqlite integrity checks and rebuilds unhealthy worker sqlite files from a logical dump.
- full: wipes local worker runtime state and recreates a minimal baseline, retaining worker identity/API key where possible.
Notes:
- repair requires sqlite3 to be available on the worker host.
- full may still require --worker-api-key or auto-enrollment if no reusable local credentials exist.
- You can use lyftdata run worker --reset-state <mode> instead of lyftdata-worker if you run the combined binary.
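
The stop, reset once, restart sequence above can be written down as a small dry-run helper. The sketch below only prints the commands and does not run them. It assumes a systemd host with the lyftdata-worker unit and the standalone lyftdata-worker binary; reset_plan is a hypothetical name. This page does not say whether the worker keeps running after a reset, so if it does, stop the one-time run before starting the service again.

```shell
# Hypothetical helper: print (not run) the command sequence for a one-time worker reset.
reset_plan() {
  case "$1" in
    auth|repair|full) ;;
    *) echo "usage: reset_plan auth|repair|full" >&2; return 2 ;;
  esac
  # repair rebuilds sqlite files, so it needs the sqlite3 tool on the host.
  if [ "$1" = "repair" ] && ! command -v sqlite3 >/dev/null 2>&1; then
    echo "warning: repair needs sqlite3 on this host" >&2
  fi
  echo "systemctl stop lyftdata-worker"
  echo "lyftdata-worker --reset-state $1"
  echo "systemctl start lyftdata-worker"
}

# Example:
# reset_plan auth
```

Since full wipes local runtime state, starting with the least destructive mode that matches the symptom is a reasonable default.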

Jobs and pipeline execution

No data flowing through a job

Confirm the job is staged and deployed to an online worker.
Review Run Output and Logs for validation errors or connector failures.
Verify connector credentials and destination permissions.
Use the Issues pane to resolve schema or validation problems before redeploying.

UI actions do not take effect after staging or deploying

Check that the job version shows as Staged and the intended worker is connected.
For external workers, validate worker enrollment, API key, and LYFTDATA_URL.
Ensure the browser session is using the latest admin password; re-login if prompted.

Validation errors referencing missing fields

Ensure upstream jobs emit the fields referenced by Convert/Filter actions.
Use the Preview tab on each action to inspect incoming sample events.
When splitting pipelines, document channel schemas so downstream jobs expect the correct shape.

Performance and scaling

Backpressure or slow throughput

Monitor worker CPU and memory utilisation; scale horizontally by adding workers.
Use worker channels to fan out workloads and avoid single-job hotspots.
Apply rate limits on inputs where bursts cause congestion.
Inspect the worker backlog charts (UI) for long-running actions.

Disk usage high or logs purged unexpectedly

The server automatically cleans up stored data, including logs, when disk usage exceeds the configured threshold.
Adjust retention: --db-retention-days or LYFTDATA_LOG_RETENTION_DAYS (default 30 days).
Adjust disk usage threshold: --disk-usage-max-percentage or LYFTDATA_DB_DISK_USE_MAX_PERCENT (default 80%).
Provision more disk for the staging directory volume.
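
To see how close a volume is to the cleanup threshold, compare its current usage with the configured percentage. The sketch below reads the portable df -P output. The helper names are hypothetical, the default of 80 comes from this page, and how the server itself measures usage is not documented here.

```shell
# Hypothetical helper: extract the capacity column from `df -P` output on stdin.
parse_df_percent() {
  # Line 2 holds the file system row; column 5 is the capacity, for example "85%".
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: usage percentage of the volume that holds the given path.
disk_use_percent() {
  df -P "$1" | parse_df_percent
}

# Succeeds when usage of the volume holding $1 is above $2 percent (default 80).
over_threshold() {
  [ "$(disk_use_percent "$1")" -gt "${2:-80}" ]
}

# Example (use your staging directory):
# over_threshold /var/lib/lyftdata && echo "above threshold, cleanup is likely" >&2
```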

Diagnostics and logs

Server staging directory: defaults to a per-user data path, configurable via --staging-dir or LYFTDATA_STAGING_DIR.
Worker job/data directory: LYFTDATA_JOBS_DIR or the worker cache directory if unset.
Collect server and worker logs, job run result JSON, and Issues list from the UI when escalating.
Review the Logs & Issues guide if warnings are not appearing where you expect them.
Record exact versions, host platform details, and recent configuration changes.
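
Host details and configuration can be captured in one file for a support request. The sketch below records the time, platform, and LYFTDATA_* variables, masking values whose names suggest a secret. Both helper names are hypothetical, and the masking rule is a heuristic based on the variable names on this page, so review the output before sharing it.

```shell
# Hypothetical helper: list LYFTDATA_* variables with secret-looking values masked.
# Any name containing KEY, PASSWORD, or LICENSE is masked, including harmless ones.
redacted_env() {
  env | grep '^LYFTDATA_' | sort \
    | sed -E 's/^([^=]*(KEY|PASSWORD|LICENSE)[^=]*)=.*/\1=<redacted>/'
}

# Hypothetical helper: write the time, platform details, and redacted settings to a file.
write_host_report() {
  {
    date -u
    uname -a
    redacted_env
  } > "$1"
}

# Example:
# write_host_report lyftdata-host-report.txt
```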

Additional resources

Configuration reference: reference/configuration
Getting started guide: Getting started
Worker installation (Linux): Install on Linux
Multi-job orchestration concepts: Build overview
