If you have existing experiments tracked in Neptune (v2.x or v3.x), you can export and migrate them to Pluto using the neptune-exporter CLI tool. This tool streams your Neptune runs to parquet files and loads them into Pluto while preserving run structure, metrics, parameters, and artifacts.
This is separate from the Neptune compatibility layer, which enables dual-logging. Use this tool to migrate existing historical runs from Neptune to Pluto.
Overview
The neptune-exporter tool works in three stages:
- Export - Download Neptune runs to local parquet files and artifacts
- Inspect - View a summary of exported data
- Load - Upload the exported data to Pluto
Installation
Clone the neptune-exporter repository and install it with uv:
# Clone the repository
git clone https://github.com/Trainy-ai/neptune-exporter
cd neptune-exporter
# Install all dependencies (including Pluto loader)
uv sync --extra pluto
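To confirm the CLI is available in the project environment, you can print its help text (this assumes the tool exposes a standard --help flag; adjust if your version differs):
# Verify the install and list the available subcommands
uv run neptune-exporter --help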
Quick Start
1. Export Neptune Data
First, authenticate with Neptune by setting your API token:
export NEPTUNE_API_TOKEN="your-neptune-api-token"
Then, export your Neptune runs to local storage using this basic command:
uv run neptune-exporter export \
-p "my-workspace/my-project" \
--exporter neptune3 \
--data-path ./exports/data \
--files-path ./exports/files \
-v
Use --exporter neptune2 if you’re using Neptune 2.x, or --exporter neptune3 for Neptune 3.x.
Export Options
The export command supports several filters that control what data gets exported (a combined example follows the table):
| Option | Description |
|---|---|
| -p/--project-ids | Required. Neptune project path (e.g., "workspace/project"). Can specify multiple projects. |
| --exporter | Required. Neptune version: neptune2 or neptune3. |
| --data-path | Directory for parquet files (default: ./exports/data). |
| --files-path | Directory for artifacts (default: ./exports/files). |
| -r/--runs | Filter runs by ID (regex supported). Neptune 3.x uses sys/custom_run_id; Neptune 2.x uses sys/id (e.g., SAN-1). |
| -a/--attributes | Filter specific attributes (regex or exact names). |
| -c/--classes | Include specific data types: parameters, metrics, series, or files. |
| --exclude | Exclude specific data types (same options as --classes). |
| --include-archived-runs | Include archived/trashed runs. |
| --include-metric-previews | Neptune 3.x only. Include Metric Previews in the export (preview completion info is discarded). |
| --api-token | Neptune API token (can also use the NEPTUNE_API_TOKEN env var). |
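For example, filters can be combined in a single export. The project path and run-ID pattern below are placeholders, and the exact multi-value syntax may vary between versions, so check the CLI help before copying this sketch:
# Export only runs whose custom run ID matches a pattern, skipping file artifacts
uv run neptune-exporter export \
-p "my-workspace/my-project" \
--exporter neptune3 \
-r "exp-.*" \
--exclude files \
--data-path ./exports/data \
--files-path ./exports/files \
-v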
2. Inspect Exported Data
Review what was exported before loading to Pluto:
uv run neptune-exporter summary --data-path ./exports/data
This displays:
- Number of projects and runs
- Breakdown of attribute types
- Step statistics (min/max/count)
- Data volume information
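Before loading, it can also be worth sanity-checking the export directory with ordinary shell tools:
# List exported project directories and their parquet parts
ls -R ./exports/data
# Check how much data and artifact content was downloaded
du -sh ./exports/data ./exports/files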
3. Load to Pluto
Upload the exported data to Pluto. You have two authentication options:
Option A: Use stored credentials (recommended for repeated loads)
# First, authenticate once (stores credentials locally)
pluto login <your-api-key>
# Then load without specifying credentials
uv run neptune-exporter load \
--loader pluto \
--data-path ./exports/data \
--files-path ./exports/files \
-v
Option B: Provide API key directly
# First, authenticate by setting the API key
export PLUTO_API_KEY="your-api-key"
# Then load with PLUTO_API_KEY
uv run neptune-exporter load \
--loader pluto \
--pluto-api-key "$PLUTO_API_KEY" \
--data-path ./exports/data \
--files-path ./exports/files \
-v
The loader will:
- Create Ops in Pluto for each Neptune run
- Upload metrics, parameters, and histograms
- Upload artifacts and file series
- Preserve experiment structure and metadata
Configuration
Optional Configuration
Configure the Pluto loader behavior using environment variables:
| Variable | Default | Description |
|---|---|---|
| NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME | Neptune’s project_id | Override destination project name (e.g., "workspace/project"). |
| NEPTUNE_EXPORTER_PLUTO_BASE_DIR | . (current dir) | Base directory for cache files and working data. |
| NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE | .pluto_upload_cache.txt | Explicit path to the loaded-runs cache file. |
| NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS | 10000 | Arrow-to-pandas batch size. Higher = more RAM, faster. Min: 1000. |
| NEPTUNE_EXPORTER_PLUTO_LOG_EVERY | 50 | Downsample metric steps by logging every N-th point. Set to 1 for lossless (slower), 50+ for faster. |
| NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY | 1000 | Buffered metric-step flush threshold. Higher = more RAM, fewer API calls. Min: 100. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE | 100 | Number of files per upload batch. Higher = faster, more risk of 502 errors. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP | 0.5 | Seconds to sleep between file batches. Lower = faster, more risk of rate limits. |
| NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN | 0 | Hard cap on uploaded files per run. 0 = disabled. |
Loading Example
Choose your authentication method, then apply performance tuning:
With stored credentials:
# Authenticate once
pluto login <your-api-key>
# Set performance tuning variables
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1 # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1
# Load without specifying API key
uv run neptune-exporter load \
--loader pluto \
--data-path ./exports/data \
--files-path ./exports/files
With direct API key:
# Set all variables at once
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export PLUTO_API_KEY="your-api-key"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1 # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1
# Load with API key flag
uv run neptune-exporter load \
--loader pluto \
--pluto-api-key "$PLUTO_API_KEY" \
--data-path ./exports/data \
--files-path ./exports/files
Data Mapping
Attribute Types
Neptune attributes are mapped to Pluto as follows:
| Neptune Type | Pluto Mapping | Details |
|---|---|---|
| float, int, string, bool | Config parameters | Logged via op.update_config() |
| datetime | Config parameter (ISO string) | Converted to ISO 8601 format |
| string_set | Config parameter (list) | Converted to a list of strings |
| float_series | Metrics | Logged via op.log(); preserves decimal steps |
| string_series | Text artifacts (Logs) | Printed to console; consolidated into logs/stdout (non-error) and logs/stderr (error paths) as Text artifacts |
| histogram_series | Histograms | Logged as pluto.Histogram by step |
| file, file_series | Artifacts | Uploaded via pluto.Artifact() |
Run Structure
- Project: Target project is set via NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME, or defaults to Neptune’s project_id
- Op Name: Neptune sys/name (the experiment name) becomes the Pluto Op name; if missing, falls back to custom_run_id/run_id
- Tags: Neptune tags are preserved as Pluto tags, and import:neptune and import_project:<project_id> tags are added for traceability
- Fork Relationships: Not natively supported (stored as metadata only)
Data Schema
Exported data uses the following parquet schema (a quick way to verify it against your own files is shown after the table):
| Column | Type | Description |
|---|---|---|
| project_id | string | Neptune project path (e.g., workspace/project) |
| run_id | string | Neptune run identifier |
| attribute_path | string | Full attribute path (e.g., metrics/accuracy) |
| attribute_type | string | One of: float, int, string, bool, datetime, string_set, float_series, string_series, histogram_series, file, file_series |
| step | decimal(18,6) | Decimal step value for series data |
| timestamp | timestamp(ms, UTC) | Timestamp for time-based records |
| int_value / float_value / string_value / bool_value / datetime_value / string_set_value | typed | Value based on attribute_type |
| file_value | struct{path} | Relative path to the downloaded artifact |
| histogram_value | struct{type, edges, values} | Histogram payload |
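One way to confirm this schema is to read a single exported parquet part with pyarrow; this is a sketch that assumes pyarrow is present in the project environment (the exporter's Arrow-based batching suggests it is), and the path is a placeholder for one of your exported files:
# Print the schema of one exported parquet part
uv run python -c "import pyarrow.parquet as pq; print(pq.read_schema('exports/data/<project_dir>/run_1_part_0.parquet'))"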
Storage Layout
The exporter creates the following directory structure:
exports/
├── data/ # Parquet files
│ └── workspace_project_abc123/ # Sanitized project dir
│ ├── run_1_part_0.parquet
│ ├── run_1_part_1.parquet
│ └── run_2_part_0.parquet
└── files/ # Artifacts
└── workspace_project_abc123/
├── run_1/
│ └── artifacts/
└── run_2/
└── artifacts/
- Projects are sanitized for filesystem safety (with digest suffix)
- Each run is split into ~50 MB compressed parquet parts
- Files and artifacts mirror the project structure
Duplicate Prevention
The Pluto loader tracks loaded runs in a local cache file to prevent duplicates:
- Cache file: .pluto_upload_cache.txt (or a custom path via NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE)
- Located in NEPTUNE_EXPORTER_PLUTO_BASE_DIR (default: current directory)
- Stores the project ID and run name to identify already-uploaded runs
- The loader does not check the Pluto backend; it only uses the local cache
To re-upload the same runs, do one of the following (a short sketch follows the list):
- Delete the run from the cache file, or
- Delete the entire cache file, or
- Run from a different directory, or
- Set NEPTUNE_EXPORTER_PLUTO_BASE_DIR to a new location
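For example, assuming the default cache location and that you want every exported run uploaded again on the next load:
# See which runs the loader considers already uploaded
cat .pluto_upload_cache.txt
# Remove the cache so all exported runs are uploaded again
rm .pluto_upload_cache.txt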
Troubleshooting
Large Datasets
For runs with hundreds of thousands of steps:
- Increase batch size - Process more rows at once (uses more RAM):
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
- Downsample metrics - Reduce points uploaded (lossy but faster):
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=100 # Keep every 100th point
- Increase flush buffer - Fewer API calls (uses more RAM):
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=5000
File Upload Errors
If you encounter 502 errors or rate limits during file uploads:
- Reduce chunk size - Upload fewer files per batch:
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=50
- Increase sleep time - Wait longer between batches:
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=1.0
- Cap total files - Limit files per run:
export NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN=1000
Memory Issues
If the loader runs out of memory:
- Decrease batch size:
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=5000
- Decrease flush buffer:
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=500
- Process runs individually - Export and load one run at a time using the -r filter (see the sketch after this list)
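A sketch of the one-run-at-a-time approach, assuming you have already authenticated with pluto login; the run ID and per-run export directories are placeholders:
# Export a single run into its own directory
uv run neptune-exporter export \
-p "my-workspace/my-project" \
--exporter neptune3 \
-r "exp-001" \
--data-path ./exports/exp-001/data \
--files-path ./exports/exp-001/files
# Load just that run
uv run neptune-exporter load \
--loader pluto \
--data-path ./exports/exp-001/data \
--files-path ./exports/exp-001/files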
Still Having Issues?
For additional help and the latest information, see the GitHub repository: https://github.com/Trainy-ai/neptune-exporter