# Exporting Neptune Runs

> Export existing Neptune experiments to Pluto using **neptune-exporter**

If you have existing experiments tracked in Neptune (v2.x or v3.x), you can export and migrate them to Pluto using the [neptune-exporter](https://github.com/Trainy-ai/neptune-exporter) CLI tool. This tool streams your Neptune runs to parquet files and loads them into Pluto while preserving run structure, metrics, parameters, and artifacts.

<Note>
  This tool is separate from the [Neptune compatibility layer](/pluto/neptune-migration), which enables dual-logging. Use it to migrate **existing historical runs** from Neptune to Pluto.
</Note>

## Overview

The neptune-exporter tool works in three stages:

1. **Export** - Download Neptune runs to local parquet files and artifacts
2. **Inspect** - View a summary of exported data
3. **Load** - Upload the exported data to Pluto

## Installation

Clone the neptune-exporter repository and install it with uv:

```bash theme={null}
# Clone the repository
git clone https://github.com/Trainy-ai/neptune-exporter
cd neptune-exporter

# Install all dependencies (including Pluto loader)
uv sync --extra pluto
```

## Quick Start

### 1. Export Neptune Data

First, authenticate with Neptune by setting your API token:

```bash theme={null}
export NEPTUNE_API_TOKEN="your-neptune-api-token"
```

Then, export your Neptune runs to local storage using this basic command:

```bash theme={null}
uv run neptune-exporter export \
  -p "my-workspace/my-project" \
  --exporter neptune3 \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
```

<Note>
  Use `--exporter neptune2` if you're using Neptune 2.x, or `--exporter neptune3` for Neptune 3.x.
</Note>

#### Export Options

The export command supports various filters to control what data gets exported:

| Option                      | Description                                                                                                           |
| --------------------------- | --------------------------------------------------------------------------------------------------------------------- |
| `-p`/`--project-ids`        | **Required.** Neptune project path (e.g., `"workspace/project"`). Can specify multiple projects.                      |
| `--exporter`                | **Required.** Neptune version: `neptune2` or `neptune3`.                                                              |
| `--data-path`               | Directory for parquet files (default: `./exports/data`).                                                              |
| `--files-path`              | Directory for artifacts (default: `./exports/files`).                                                                 |
| `-r`/`--runs`               | Filter runs by ID (regex supported). Neptune 3.x uses `sys/custom_run_id`, Neptune 2.x uses `sys/id` (e.g., `SAN-1`). |
| `-a`/`--attributes`         | Filter specific attributes (regex or exact names).                                                                    |
| `-c`/`--classes`            | Include specific data types: `parameters`, `metrics`, `series`, or `files`.                                           |
| `--exclude`                 | Exclude specific data types (same options as `--classes`).                                                            |
| `--include-archived-runs`   | Include archived/trashed runs.                                                                                        |
| `--include-metric-previews` | Neptune 3.x only. Include Metric Previews in the export (preview completion info is discarded).                       |
| `--api-token`               | Neptune API token (can also use `NEPTUNE_API_TOKEN` env var).                                                         |
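
The filters above can be combined on one command line. As a minimal sketch, the function below exports only the runs whose ID matches a regex and skips file downloads. `export_without_files` is a hypothetical helper name, and the example assumes that `--exclude` accepts the single value `files`, as the table suggests.

```bash
# Hypothetical helper: export runs matching a regex, without downloading files.
# Assumes Neptune 3.x and that NEPTUNE_API_TOKEN is already set.
export_without_files() {
  local project="$1"      # e.g. "my-workspace/my-project"
  local run_pattern="$2"  # regex matched against the run ID

  uv run neptune-exporter export \
    -p "$project" \
    --exporter neptune3 \
    -r "$run_pattern" \
    --exclude files \
    --data-path ./exports/data \
    --files-path ./exports/files \
    -v
}

# Usage:
#   export_without_files "my-workspace/my-project" "baseline-.*"
```

Skipping files first can be a quick way to check that metrics and parameters export correctly before committing to a full artifact download.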

### 2. Inspect Exported Data

Review what was exported before loading to Pluto:

```bash theme={null}
uv run neptune-exporter summary --data-path ./exports/data
```

This displays:

* Number of projects and runs
* Breakdown of attribute types
* Step statistics (min/max/count)
* Data volume information

### 3. Load to Pluto

Upload the exported data to Pluto. You have two authentication options:

**Option A: Use stored credentials (recommended for repeated loads)**

```bash theme={null}
# First, authenticate once (stores credentials locally)
pluto login <your-api-key>

# Then load without specifying credentials
uv run neptune-exporter load \
  --loader pluto \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
```

**Option B: Provide API key directly**

```bash theme={null}
# First, authenticate by setting the API key
export PLUTO_API_KEY="your-api-key"

# Then load with PLUTO_API_KEY
uv run neptune-exporter load \
  --loader pluto \
  --pluto-api-key "$PLUTO_API_KEY" \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
```

The loader will:

* Create Ops in Pluto for each Neptune run
* Upload metrics, parameters, and histograms
* Upload artifacts and file series
* Preserve experiment structure and metadata

## Configuration

### Optional Configuration

Configure the Pluto loader's behavior with the following environment variables; all are optional:

| Variable                                   | Default                   | Description                                                                                              |
| ------------------------------------------ | ------------------------- | -------------------------------------------------------------------------------------------------------- |
| `NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME`      | Neptune's `project_id`    | Override destination project name (e.g., `"workspace/project"`).                                         |
| `NEPTUNE_EXPORTER_PLUTO_BASE_DIR`          | `.` (current dir)         | Base directory for cache files and working data.                                                         |
| `NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE`      | `.pluto_upload_cache.txt` | Explicit path to the loaded runs cache file.                                                             |
| `NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS`        | `10000`                   | Arrow-to-pandas batch size. Higher = more RAM, faster. Min: 1000.                                        |
| `NEPTUNE_EXPORTER_PLUTO_LOG_EVERY`         | `50`                      | Downsample metrics by logging every N-th point. `1` is lossless but slower; `50` or more is faster.      |
| `NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY`       | `1000`                    | Buffered metric step flush threshold. Higher = more RAM, fewer API calls. Min: 100.                      |
| `NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE`   | `100`                     | Number of files per upload batch. Higher = faster, more risk of 502 errors.                              |
| `NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP`  | `0.5`                     | Seconds to sleep between file batches. Lower = faster, more risk of rate limits.                         |
| `NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN` | `0`                       | Hard cap on uploaded files per run. `0` = disabled.                                                      |
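
Two of the variables above have documented minimums. The sketch below is a local pre-flight check that catches values under those minimums before a long load starts. `check_min` and `preflight` are hypothetical helper names, and how the loader itself reacts to out-of-range values is not covered here.

```bash
# Hypothetical pre-flight check against the minimums documented above.
# check_min NAME VALUE MINIMUM -> returns non-zero if VALUE is invalid.
check_min() {
  local name="$1" value="$2" min="$3"
  case "$value" in
    '' | *[!0-9]*)
      echo "$name must be a whole number, got '$value'" >&2
      return 1
      ;;
  esac
  if [ "$value" -lt "$min" ]; then
    echo "$name=$value is below the documented minimum of $min" >&2
    return 1
  fi
}

preflight() {
  # Unset variables fall back to the defaults from the table.
  check_min NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS \
    "${NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS:-10000}" 1000 &&
  check_min NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY \
    "${NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY:-1000}" 100
}

# Usage:
#   preflight && uv run neptune-exporter load --loader pluto \
#     --data-path ./exports/data --files-path ./exports/files
```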

#### Loading Example

Choose your authentication method, then apply performance tuning:

**With stored credentials:**

```bash theme={null}
# Authenticate once
pluto login <your-api-key>

# Set the destination project and performance tuning variables
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1   # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1

# Load without specifying API key
uv run neptune-exporter load \
  --loader pluto \
  --data-path ./exports/data \
  --files-path ./exports/files
```

**With direct API key:**

```bash theme={null}
# Set all variables at once
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export PLUTO_API_KEY="your-api-key"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1   # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1

# Load with API key flag
uv run neptune-exporter load \
  --loader pluto \
  --pluto-api-key "$PLUTO_API_KEY" \
  --data-path ./exports/data \
  --files-path ./exports/files
```

## Data Mapping

### Attribute Types

Neptune attributes are mapped to Pluto as follows:

| Neptune Type                     | Pluto Mapping                 | Details                                                                                                            |
| -------------------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| `float`, `int`, `string`, `bool` | Config parameters             | Logged via `op.update_config()`                                                                                    |
| `datetime`                       | Config parameter (ISO string) | Converted to ISO 8601 format                                                                                       |
| `string_set`                     | Config parameter (list)       | Converted to list of strings                                                                                       |
| `float_series`                   | Metrics                       | Logged via `op.log()`, preserves decimal steps                                                                     |
| `string_series`                  | Text artifacts (Logs)         | Printed to console; consolidated into `logs/stdout` (non-error) and `logs/stderr` (error paths) as Text artifacts. |
| `histogram_series`               | Histograms                    | Logged as `pluto.Histogram` by step                                                                                |
| `file`, `file_series`            | Artifacts                     | Uploaded via `pluto.Artifact()`                                                                                    |

### Run Structure

* **Project:** Target project is set via `NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME` or uses Neptune's `project_id`
* **Op Name:** Neptune `sys/name` (experiment name) becomes the Pluto Op name. If missing, falls back to `custom_run_id`/`run_id`.
* **Tags:** Neptune tags are preserved as Pluto tags, and `import:neptune` and `import_project:<project_id>` are added for traceability
* **Fork Relationships:** Not natively supported (stored as metadata only)

## Data Schema

Exported data uses the following parquet schema:

| Column                                                                                              | Type                        | Description                                                                                                                                    |
| --------------------------------------------------------------------------------------------------- | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| `project_id`                                                                                        | `string`                    | Neptune project path (e.g., `workspace/project`)                                                                                               |
| `run_id`                                                                                            | `string`                    | Neptune run identifier                                                                                                                         |
| `attribute_path`                                                                                    | `string`                    | Full attribute path (e.g., `metrics/accuracy`)                                                                                                 |
| `attribute_type`                                                                                    | `string`                    | One of: `float`, `int`, `string`, `bool`, `datetime`, `string_set`, `float_series`, `string_series`, `histogram_series`, `file`, `file_series` |
| `step`                                                                                              | `decimal(18,6)`             | Decimal step value for series data                                                                                                             |
| `timestamp`                                                                                         | `timestamp(ms, UTC)`        | Timestamp for time-based records                                                                                                               |
| `int_value` / `float_value` / `string_value` / `bool_value` / `datetime_value` / `string_set_value` | typed                       | Value based on `attribute_type`                                                                                                                |
| `file_value`                                                                                        | `struct{path}`              | Relative path to downloaded artifact                                                                                                           |
| `histogram_value`                                                                                   | `struct{type,edges,values}` | Histogram payload                                                                                                                              |

## Storage Layout

The exporter creates the following directory structure:

```
exports/
├── data/                          # Parquet files
│   └── workspace_project_abc123/  # Sanitized project dir
│       ├── run_1_part_0.parquet
│       ├── run_1_part_1.parquet
│       └── run_2_part_0.parquet
└── files/                         # Artifacts
    └── workspace_project_abc123/
        ├── run_1/
        │   └── artifacts/
        └── run_2/
            └── artifacts/
```

* Project names are sanitized for filesystem safety and given a digest suffix
* Each run is split into compressed parquet parts of roughly 50 MB each
* Files and artifacts mirror the project structure
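
To see how an export was split, the sketch below counts parquet parts per run. `parts_per_run` is a hypothetical helper, and it assumes the `<run>_part_<n>.parquet` file naming shown in the layout above.

```bash
# List how many parquet parts each run was split into.
# Assumes the <run>_part_<n>.parquet naming shown in the layout above.
parts_per_run() {
  local data_dir="${1:-./exports/data}"
  find "$data_dir" -type f -name '*_part_*.parquet' |
    sed -E 's/_part_[0-9]+\.parquet$//' |
    sort |
    uniq -c |
    awk '{ count = $1; sub(/^ *[0-9]+ /, ""); print count "\t" $0 }'
}

# Usage:
#   parts_per_run ./exports/data
```

Each output line is a part count followed by the run's path prefix. Because parts are roughly 50 MB each, the count gives a rough idea of which runs are largest.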

## Duplicate Prevention

The Pluto loader tracks loaded runs in a local cache file to prevent duplicates:

* Cache file: `.pluto_upload_cache.txt` (or custom path via `NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE`)
* Located in `NEPTUNE_EXPORTER_PLUTO_BASE_DIR` (default: current directory)
* Stores project ID and run name to identify already-uploaded runs
* The loader does **not** check the Pluto backend; it only uses the local cache

To re-upload the same runs:

* Delete the run's entry from the cache file, or
* Delete the entire cache file, or
* Run from a different directory, or
* Set `NEPTUNE_EXPORTER_PLUTO_BASE_DIR` to a new location
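
Removing a single run from the cache can be scripted. The sketch below is a hypothetical helper, `forget_run`, that keeps a backup and drops every line containing a given string. It assumes the cache is plain text with one entry per line, which this page does not specify, so inspect the file before relying on it. Matching is by substring, so `run-1` would also remove `run-10`.

```bash
# Hypothetical helper: drop cache lines that mention a run so it loads again.
# Assumes a plain-text cache with one entry per line; inspect the file first.
forget_run() {
  local cache="$1" needle="$2"
  cp "$cache" "$cache.bak" || return 1
  # grep -v exits with 1 when no lines remain, which is fine here.
  grep -vF -- "$needle" "$cache.bak" > "$cache" || [ "$?" -eq 1 ]
}

# Usage:
#   forget_run .pluto_upload_cache.txt "my-run-name"
```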

## Troubleshooting

### Large Datasets

For runs with hundreds of thousands of steps:

1. **Increase batch size** - Process more rows at once (uses more RAM):
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
   ```

2. **Downsample metrics** - Reduce points uploaded (lossy but faster):
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=100  # Keep every 100th point
   ```

3. **Increase flush buffer** - Fewer API calls (uses more RAM):
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=5000
   ```
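
To get a feel for the downsampling trade-off, the sketch below estimates how many points per metric remain for a given `NEPTUNE_EXPORTER_PLUTO_LOG_EVERY`. It assumes one point is kept per block of N steps, rounded up; the exact points the loader keeps may differ slightly.

```bash
# Rough estimate of points uploaded per metric after downsampling.
# Assumes one point is kept per block of LOG_EVERY steps (rounded up).
points_after_downsampling() {
  local steps="$1" log_every="$2"
  echo $(( (steps + log_every - 1) / log_every ))
}

points_after_downsampling 500000 50    # prints 10000
points_after_downsampling 500000 100   # prints 5000
points_after_downsampling 500000 1     # prints 500000 (lossless)
```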

### File Upload Errors

If you encounter 502 errors or rate limits during file uploads:

1. **Reduce chunk size** - Upload fewer files per batch:
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=50
   ```

2. **Increase sleep time** - Wait longer between batches:
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=1.0
   ```

3. **Cap total files** - Limit files per run:
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN=1000
   ```
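
Smaller chunks and longer sleeps trade speed for reliability. The sketch below estimates the time one run spends sleeping between file batches for a given chunk size and sleep interval. It assumes the loader sleeps once after every batch, which is an approximation, and it ignores upload time itself.

```bash
# Rough estimate of time spent sleeping between file batches for one run.
# Assumes the loader sleeps once after every batch.
sleep_overhead() {
  local files="$1" chunk_size="$2" chunk_sleep="$3"
  awk -v files="$files" -v size="$chunk_size" -v pause="$chunk_sleep" 'BEGIN {
    batches = int((files + size - 1) / size)
    printf "%d batches, about %.1f s of sleep\n", batches, batches * pause
  }'
}

sleep_overhead 10000 100 0.5   # prints: 100 batches, about 50.0 s of sleep
sleep_overhead 10000 50 1.0    # prints: 200 batches, about 200.0 s of sleep
```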

### Memory Issues

If the loader runs out of memory:

1. **Decrease batch size**:
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=5000
   ```

2. **Decrease flush buffer**:
   ```bash theme={null}
   export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=500
   ```

3. **Process runs individually** - Export and load one run at a time using the `-r` filter
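
Processing runs one at a time can be sketched as a small helper that exports and loads a single run into its own directory. `migrate_one_run` and the `./exports/runs/<run_id>` layout are hypothetical choices, and the `^...$` anchors assume the `-r` regex could otherwise match longer IDs that share a prefix.

```bash
# Hypothetical helper: export and load a single run in its own directory.
# Assumes Neptune 3.x, NEPTUNE_API_TOKEN set, and a prior `pluto login`.
migrate_one_run() {
  local project="$1" run_id="$2"
  local out="./exports/runs/$run_id"

  uv run neptune-exporter export \
    -p "$project" \
    --exporter neptune3 \
    -r "^${run_id}\$" \
    --data-path "$out/data" \
    --files-path "$out/files" &&
  uv run neptune-exporter load \
    --loader pluto \
    --data-path "$out/data" \
    --files-path "$out/files"
}

# Usage: migrate a few runs one after another, stopping at the first failure.
#   for run in run-001 run-002 run-003; do
#     migrate_one_run "my-workspace/my-project" "$run" || break
#   done
```

Keeping each run in a separate directory means every load only reads that run's parquet parts, and a failed run can be retried without touching the others.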

### Still Having Issues?

For additional help and the latest information, see the **[GitHub repository](https://github.com/Trainy-ai/neptune-exporter)**.
