If you have existing experiments tracked in Neptune (v2.x or v3.x), you can export and migrate them to Pluto using the neptune-exporter CLI tool. This tool streams your Neptune runs to parquet files and loads them into Pluto while preserving run structure, metrics, parameters, and artifacts.
This is separate from the Neptune compatibility layer, which enables dual-logging; use this tool to migrate historical runs from Neptune to Pluto.

Overview

The neptune-exporter tool works in three stages:
  1. Export - Download Neptune runs to local parquet files and artifacts
  2. Inspect - View a summary of exported data
  3. Load - Upload the exported data to Pluto

Installation

Clone the neptune-exporter repository and install it with uv:
# Clone the repository
git clone https://github.com/Trainy-ai/neptune-exporter
cd neptune-exporter

# Install all dependencies (including Pluto loader)
uv sync --extra pluto
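To confirm the installation, ask the CLI for its help text (this assumes the tool exposes the conventional --help flag, as Python CLIs typically do):

# Sanity check: should list the export, summary, and load subcommands
uv run neptune-exporter --help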

Quick Start

1. Export Neptune Data

First, authenticate with Neptune by setting your API token:
export NEPTUNE_API_TOKEN="your-neptune-api-token"
Then, export your Neptune runs to local storage using this basic command:

uv run neptune-exporter export \
  -p "my-workspace/my-project" \
  --exporter neptune3 \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
Use --exporter neptune2 if you’re using Neptune 2.x, or --exporter neptune3 for Neptune 3.x.

Export Options

The export command supports various filters to control what data gets exported:
| Option | Description |
| --- | --- |
| -p / --project-ids | Required. Neptune project path (e.g., "workspace/project"). Can specify multiple projects. |
| --exporter | Required. Neptune version: neptune2 or neptune3. |
| --data-path | Directory for parquet files (default: ./exports/data). |
| --files-path | Directory for artifacts (default: ./exports/files). |
| -r / --runs | Filter runs by ID (regex supported). Matches sys/custom_run_id on Neptune 3.x and sys/id on Neptune 2.x (e.g., SAN-1). |
| -a / --attributes | Filter specific attributes (regex or exact names). |
| -c / --classes | Include only specific data types: parameters, metrics, series, or files. |
| --exclude | Exclude specific data types (same options as --classes). |
| --include-archived-runs | Include archived/trashed runs. |
| --include-metric-previews | Neptune 3.x only. Include Metric Previews in the export (preview completion info is discarded). |
| --api-token | Neptune API token (can also be set via the NEPTUNE_API_TOKEN env var). |
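For instance, a filtered export that pulls only metrics and parameters for runs whose IDs match a regex might look like the sketch below (the project names and run-ID pattern are placeholders, and passing -p and -c multiple times for multiple values is an assumption about the CLI's flag handling):

# Hypothetical filtered export: two projects, run-ID regex, metrics and parameters only
uv run neptune-exporter export \
  -p "my-workspace/project-a" \
  -p "my-workspace/project-b" \
  --exporter neptune3 \
  -r "ablation-.*" \
  -c metrics \
  -c parameters \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v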

2. Inspect Exported Data

Review what was exported before loading to Pluto:
uv run neptune-exporter summary --data-path ./exports/data
This displays:
  • Number of projects and runs
  • Breakdown of attribute types
  • Step statistics (min/max/count)
  • Data volume information

3. Load to Pluto

Upload the exported data to Pluto. You have two authentication options.

Option A: Use stored credentials (recommended for repeated loads)
# First, authenticate once (stores credentials locally)
pluto login <your-api-key>

# Then load without specifying credentials
uv run neptune-exporter load \
  --loader pluto \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
Option B: Provide API key directly
# First, authenticate by setting the API key
export PLUTO_API_KEY="your-api-key"

# Then load with PLUTO_API_KEY
uv run neptune-exporter load \
  --loader pluto \
  --pluto-api-key "$PLUTO_API_KEY" \
  --data-path ./exports/data \
  --files-path ./exports/files \
  -v
The loader will:
  • Create Ops in Pluto for each Neptune run
  • Upload metrics, parameters, and histograms
  • Upload artifacts and file series
  • Preserve experiment structure and metadata

Configuration

Optional Configuration

Configure the Pluto loader behavior using environment variables:
| Variable | Default | Description |
| --- | --- | --- |
| NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME | Neptune’s project_id | Override the destination project name (e.g., "workspace/project"). |
| NEPTUNE_EXPORTER_PLUTO_BASE_DIR | . (current directory) | Base directory for cache files and working data. |
| NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE | .pluto_upload_cache.txt | Explicit path to the loaded-runs cache file. |
| NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS | 10000 | Arrow-to-pandas batch size. Higher = more RAM, faster. Min: 1000. |
| NEPTUNE_EXPORTER_PLUTO_LOG_EVERY | 50 | Downsample metric steps by logging every N-th point. Set to 1 for lossless (slower), 50+ for faster. |
| NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY | 1000 | Buffered metric-step flush threshold. Higher = more RAM, fewer API calls. Min: 100. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE | 100 | Number of files per upload batch. Higher = faster, but more risk of 502 errors. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP | 0.5 | Seconds to sleep between file batches. Lower = faster, but more risk of rate limits. |
| NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN | 0 | Hard cap on uploaded files per run. 0 = disabled. |

Loading Example

Choose your authentication method, then apply performance tuning.

With stored credentials:
# Authenticate once
pluto login <your-api-key>

# Set performance tuning variables
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1   # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1

# Load without specifying API key
uv run neptune-exporter load \
  --loader pluto \
  --data-path ./exports/data \
  --files-path ./exports/files
With direct API key:
# Set all variables at once
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/migrated-runs"
export PLUTO_API_KEY="your-api-key"
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1   # Lossless
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=3000
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=100
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=0.1

# Load with API key flag
uv run neptune-exporter load \
  --loader pluto \
  --pluto-api-key "$PLUTO_API_KEY" \
  --data-path ./exports/data \
  --files-path ./exports/files

Data Mapping

Attribute Types

Neptune attributes are mapped to Pluto as follows:
| Neptune Type | Pluto Mapping | Details |
| --- | --- | --- |
| float, int, string, bool | Config parameters | Logged via op.update_config() |
| datetime | Config parameter (ISO string) | Converted to ISO 8601 format |
| string_set | Config parameter (list) | Converted to a list of strings |
| float_series | Metrics | Logged via op.log(); preserves decimal steps |
| string_series | Text artifacts (logs) | Printed to the console and consolidated into logs/stdout (non-error) and logs/stderr (error paths) as Text artifacts |
| histogram_series | Histograms | Logged as pluto.Histogram by step |
| file, file_series | Artifacts | Uploaded via pluto.Artifact() |

Run Structure

  • Project: The target project is set via NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME, or defaults to Neptune’s project_id
  • Op Name: Neptune’s sys/name (experiment name) becomes the Pluto Op name; if missing, it falls back to custom_run_id/run_id
  • Tags: Neptune tags are preserved as Pluto tags, and the loader adds import:neptune and import_project:<project_id> for traceability
  • Fork Relationships: Not natively supported (stored as metadata only)

Data Schema

Exported data uses the following parquet schema:
| Column | Type | Description |
| --- | --- | --- |
| project_id | string | Neptune project path (e.g., workspace/project) |
| run_id | string | Neptune run identifier |
| attribute_path | string | Full attribute path (e.g., metrics/accuracy) |
| attribute_type | string | One of: float, int, string, bool, datetime, string_set, float_series, string_series, histogram_series, file, file_series |
| step | decimal(18,6) | Decimal step value for series data |
| timestamp | timestamp(ms, UTC) | Timestamp for time-based records |
| int_value / float_value / string_value / bool_value / datetime_value / string_set_value | typed | Value column matching attribute_type |
| file_value | struct{path} | Relative path to the downloaded artifact |
| histogram_value | struct{type, edges, values} | Histogram payload |
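Because the export is plain parquet, you can sanity-check it with any parquet-aware tool before loading. A minimal sketch using the DuckDB CLI (duckdb is not part of neptune-exporter and must be installed separately):

# Count exported rows per attribute type across all runs
duckdb -c "SELECT attribute_type, COUNT(*) AS n FROM 'exports/data/*/*.parquet' GROUP BY attribute_type ORDER BY n DESC;"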

Storage Layout

The exporter creates the following directory structure:
exports/
├── data/                          # Parquet files
│   └── workspace_project_abc123/  # Sanitized project dir
│       ├── run_1_part_0.parquet
│       ├── run_1_part_1.parquet
│       └── run_2_part_0.parquet
└── files/                         # Artifacts
    └── workspace_project_abc123/
        ├── run_1/
        │   └── artifacts/
        └── run_2/
            └── artifacts/
  • Project directory names are sanitized for filesystem safety (with a digest suffix)
  • Each run is split into ~50 MB compressed parquet parts
  • Files and artifacts mirror the project structure
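Standard shell tools are enough to gauge how much data you are about to load; for example:

# Size of parquet data and artifacts per project
du -sh exports/data/* exports/files/*

# Number of parquet parts (roughly proportional to export volume)
find exports/data -name '*.parquet' | wc -l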

Duplicate Prevention

The Pluto loader tracks loaded runs in a local cache file to prevent duplicates:
  • Cache file: .pluto_upload_cache.txt (or custom path via NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE)
  • Located in NEPTUNE_EXPORTER_PLUTO_BASE_DIR (default: current directory)
  • Stores project ID and run name to identify already-uploaded runs
  • The loader does not check the Pluto backend; it only uses the local cache
To re-upload the same runs:
  • Delete the run’s entry from the cache file, or
  • Delete the entire cache file, or
  • Run from a different directory, or
  • Set NEPTUNE_EXPORTER_PLUTO_BASE_DIR to a new location
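For example, to re-upload a single run you could remove its line from the cache file. The entry format is an internal detail of the loader, so inspect the file before editing; this is a sketch, not a guaranteed recipe:

# Find the cached entry for the run (format is loader-internal; check it first)
grep -n "my-run-name" .pluto_upload_cache.txt

# Drop that entry so the loader treats the run as not yet uploaded
grep -v "my-run-name" .pluto_upload_cache.txt > .pluto_upload_cache.tmp
mv .pluto_upload_cache.tmp .pluto_upload_cache.txt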

Troubleshooting

Large Datasets

For runs with hundreds of thousands of steps:
  1. Increase batch size - Process more rows at once (uses more RAM):
    export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000
    
  2. Downsample metrics - Reduce points uploaded (lossy but faster):
    export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=100  # Keep every 100th point
    
  3. Increase flush buffer - Fewer API calls (uses more RAM):
    export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=5000
    

File Upload Errors

If you encounter 502 errors or rate limits during file uploads:
  1. Reduce chunk size - Upload fewer files per batch:
    export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=50
    
  2. Increase sleep time - Wait longer between batches:
    export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=1.0
    
  3. Cap total files - Limit files per run:
    export NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN=1000
    

Memory Issues

If the loader runs out of memory:
  1. Decrease batch size:
    export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=5000
    
  2. Decrease flush buffer:
    export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=500
    
  3. Process runs individually - Export and load one run at a time using the -r filter, as shown below:
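A minimal per-run workflow using the -r filter from the export options (the run ID and paths are placeholders):

# Export a single run into its own directory
uv run neptune-exporter export \
  -p "my-workspace/my-project" \
  --exporter neptune3 \
  -r "my-run-id" \
  --data-path ./exports/one-run/data \
  --files-path ./exports/one-run/files

# Load just that run
uv run neptune-exporter load \
  --loader pluto \
  --data-path ./exports/one-run/data \
  --files-path ./exports/one-run/files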

Still Having Issues?

For additional help and the latest information, see the neptune-exporter GitHub repository: https://github.com/Trainy-ai/neptune-exporter