If you have existing experiments tracked in Neptune (v2.x or v3.x), you can export and migrate them to Pluto using the neptune-exporter CLI tool. This tool streams your Neptune runs to parquet files and loads them into Pluto while preserving run structure, metrics, parameters, and artifacts.
This is separate from the Neptune compatibility layer, which enables dual-logging. Use this tool to migrate existing historical runs from Neptune to Pluto.
Overview
The neptune-exporter tool works in three stages:
- Export - Download Neptune runs to local parquet files and artifacts
- Inspect - View a summary of exported data
- Load - Upload the exported data to Pluto
Installation
Clone the neptune-exporter repository and install it with uv.
Quick Start
1. Export Neptune Data
First, authenticate with Neptune by setting your API token (the NEPTUNE_API_TOKEN environment variable), then run the export. Use --exporter neptune2 if you're using Neptune 2.x, or --exporter neptune3 for Neptune 3.x.
Export Options
The export command supports various filters to control what data gets exported:

| Option | Description |
|---|---|
| -p/--project-ids | Required. Neptune project path (e.g., "workspace/project"). Can specify multiple projects. |
| --exporter | Required. Neptune version: neptune2 or neptune3. |
| --data-path | Directory for parquet files (default: ./exports/data). |
| --files-path | Directory for artifacts (default: ./exports/files). |
| -r/--runs | Filter runs by ID (regex supported). Neptune 3.x uses sys/custom_run_id, Neptune 2.x uses sys/id (e.g., SAN-1). |
| -a/--attributes | Filter specific attributes (regex or exact names). |
| -c/--classes | Include specific data types: parameters, metrics, series, or files. |
| --exclude | Exclude specific data types (same options as --classes). |
| --include-archived-runs | Include archived/trashed runs. |
| --include-metric-previews | Neptune 3.x only. Include Metric Previews in the export (preview completion info is discarded). |
| --api-token | Neptune API token (can also use NEPTUNE_API_TOKEN env var). |
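Putting the options together, an export invocation might look like the following. The project path and token are placeholders, and the exact command-line form should be confirmed against the tool's --help output:

```shell
# Authenticate with Neptune
export NEPTUNE_API_TOKEN="<your-neptune-api-token>"

# Export all runs from one Neptune 3.x project, skipping file artifacts
neptune-exporter export \
  --exporter neptune3 \
  -p "workspace/project" \
  --exclude files \
  --data-path ./exports/data \
  --files-path ./exports/files
```

Dropping --exclude files will also download every artifact, which can dominate export time for large projects.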
2. Inspect Exported Data
Review what was exported before loading to Pluto. The inspection summary includes:
- Number of projects and runs
- Breakdown of attribute types
- Step statistics (min/max/count)
- Data volume information
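Assuming the subcommands mirror the stage names (verify with neptune-exporter --help), the inspection step might look like:

```shell
# Summarize the exported parquet data before loading
neptune-exporter inspect --data-path ./exports/data
```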
3. Load to Pluto
Upload the exported data to Pluto. You have two authentication options: Option A: Use stored credentials (recommended for repeated loads)- Create Ops in Pluto for each Neptune run
- Upload metrics, parameters, and histograms
- Upload artifacts and file series
- Preserve experiment structure and metadata
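A load invocation might look like the following; the load subcommand name is assumed from the stage names, and the loader's behavior is tuned with the environment variables described under Configuration below:

```shell
# Optionally redirect runs to a specific Pluto project
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="workspace/project"

# Load the exported parquet data and artifacts into Pluto
neptune-exporter load --data-path ./exports/data --files-path ./exports/files
```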
Configuration
Optional Configuration
Configure the Pluto loader behavior using environment variables:

| Variable | Default | Description |
|---|---|---|
| NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME | Neptune's project_id | Override destination project name (e.g., "workspace/project"). |
| NEPTUNE_EXPORTER_PLUTO_BASE_DIR | . (current dir) | Base directory for cache files and working data. |
| NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE | .pluto_upload_cache.txt | Explicit path to the loaded runs cache file. |
| NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS | 10000 | Arrow-to-pandas batch size. Higher = more RAM, faster. Min: 1000. |
| NEPTUNE_EXPORTER_PLUTO_LOG_EVERY | 50 | Downsample metric steps by logging every N-th point. Set to 1 for lossless (slower), 50+ for faster. |
| NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY | 1000 | Buffered metric step flush threshold. Higher = more RAM, fewer API calls. Min: 100. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE | 100 | Number of files per upload batch. Higher = faster, more risk of 502 errors. |
| NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP | 0.5 | Seconds to sleep between file batches. Lower = faster, more risk of rate limits. |
| NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN | 0 | Hard cap on uploaded files per run. 0 = disabled. |
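For example, to load into a differently named Pluto project and keep every metric point (the project name here is illustrative):

```shell
export NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME="my-workspace/neptune-archive"
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=1   # keep every metric point (lossless, slower)
```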
Loading Example
Choose your authentication method, then apply performance tuning. With stored credentials, set the configuration variables above and run the load.
Data Mapping
Attribute Types
Neptune attributes are mapped to Pluto as follows:

| Neptune Type | Pluto Mapping | Details |
|---|---|---|
| float, int, string, bool | Config parameters | Logged via op.update_config() |
| datetime | Config parameter (ISO string) | Converted to ISO 8601 format |
| string_set | Config parameter (list) | Converted to list of strings |
| float_series | Metrics | Logged via op.log(), preserves decimal steps |
| string_series | Text artifacts (Logs) | Printed to console; consolidated into logs/stdout (non-error) and logs/stderr (error paths) as Text artifacts. |
| histogram_series | Histograms | Logged as pluto.Histogram by step |
| file, file_series | Artifacts | Uploaded via pluto.Artifact() |
Run Structure
- Project: Target project is set via NEPTUNE_EXPORTER_PLUTO_PROJECT_NAME or uses Neptune's project_id
- Op Name: Neptune sys/name (experiment name) becomes the Pluto Op name. If missing, falls back to custom_run_id/run_id.
- Tags: Includes import:neptune and import_project:<project_id> for traceability
- Tags: Neptune tags are preserved as Pluto tags
- Fork Relationships: Not natively supported (stored as metadata only)
Data Schema
Exported data uses the following parquet schema:

| Column | Type | Description |
|---|---|---|
| project_id | string | Neptune project path (e.g., workspace/project) |
| run_id | string | Neptune run identifier |
| attribute_path | string | Full attribute path (e.g., metrics/accuracy) |
| attribute_type | string | One of: float, int, string, bool, datetime, string_set, float_series, string_series, histogram_series, file, file_series |
| step | decimal(18,6) | Decimal step value for series data |
| timestamp | timestamp(ms, UTC) | Timestamp for time-based records |
| int_value / float_value / string_value / bool_value / datetime_value / string_set_value | typed | Value based on attribute_type |
| file_value | struct{path} | Relative path to downloaded artifact |
| histogram_value | struct{type,edges,values} | Histogram payload |
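Because the schema is long-format (one row per value), each metric series is recovered by filtering on attribute_path and sorting by step. In practice you would call pd.read_parquet on an exported part file; the toy DataFrame below uses the same columns so the slicing pattern is visible (values are illustrative):

```python
import pandas as pd

# Toy rows following the exporter's long parquet schema
df = pd.DataFrame([
    {"project_id": "ws/proj", "run_id": "SAN-1",
     "attribute_path": "metrics/accuracy", "attribute_type": "float_series",
     "step": 1.0, "float_value": 0.81},
    {"project_id": "ws/proj", "run_id": "SAN-1",
     "attribute_path": "metrics/accuracy", "attribute_type": "float_series",
     "step": 2.0, "float_value": 0.87},
    {"project_id": "ws/proj", "run_id": "SAN-1",
     "attribute_path": "params/lr", "attribute_type": "float",
     "step": None, "float_value": 0.001},
])

# Pull out one metric series, ordered by step
acc = (df[(df["attribute_type"] == "float_series")
          & (df["attribute_path"] == "metrics/accuracy")]
       .sort_values("step"))
print(acc["float_value"].tolist())  # [0.81, 0.87]
```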
Storage Layout
The exporter organizes exported data on disk as follows:
- Projects are sanitized for filesystem safety (with digest suffix)
- Each run is split into ~50 MB compressed parquet parts
- Files and artifacts mirror the project structure
Duplicate Prevention
The Pluto loader tracks loaded runs in a local cache file to prevent duplicates:
- Cache file: .pluto_upload_cache.txt (or custom path via NEPTUNE_EXPORTER_PLUTO_LOADED_CACHE)
- Located in NEPTUNE_EXPORTER_PLUTO_BASE_DIR (default: current directory)
- Stores project ID and run name to identify already-uploaded runs
- The loader does not check the Pluto backend; it only uses the local cache
To re-upload a run:
- Delete the run from the cache file, or
- Delete the entire cache file, or
- Run from a different directory, or
- Set NEPTUNE_EXPORTER_PLUTO_BASE_DIR to a new location
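A sketch of the first option, removing a single run's entry so only that run re-uploads. The cache line format below is an assumption for illustration; inspect your own cache file before editing it:

```shell
# Create a sample cache with two entries (stand-in for a real cache file)
printf 'ws/proj SAN-1\nws/proj SAN-2\n' > .pluto_upload_cache.txt

# Drop the entry for run SAN-1 so the loader uploads it again
grep -v 'SAN-1' .pluto_upload_cache.txt > .cache.tmp
mv .cache.tmp .pluto_upload_cache.txt
cat .pluto_upload_cache.txt
```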
Troubleshooting
Large Datasets
For runs with hundreds of thousands of steps:
- Increase batch size - Process more rows at once (uses more RAM): raise NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS
- Downsample metrics - Reduce points uploaded (lossy but faster): raise NEPTUNE_EXPORTER_PLUTO_LOG_EVERY
- Increase flush buffer - Fewer API calls (uses more RAM): raise NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY
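For example, all three adjustments together (the specific values are illustrative, not defaults):

```shell
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=50000   # more rows per batch, more RAM
export NEPTUNE_EXPORTER_PLUTO_LOG_EVERY=100      # keep every 100th metric point (lossy)
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=5000   # fewer, larger API calls
```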
File Upload Errors
If you encounter 502 errors or rate limits during file uploads:
- Reduce chunk size - Upload fewer files per batch: lower NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE
- Increase sleep time - Wait longer between batches: raise NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP
- Cap total files - Limit files per run: set NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN
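A more conservative upload profile might look like this (values are illustrative):

```shell
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SIZE=25     # smaller batches, fewer 502s
export NEPTUNE_EXPORTER_PLUTO_FILE_CHUNK_SLEEP=2     # longer pause between batches
export NEPTUNE_EXPORTER_PLUTO_MAX_FILES_PER_RUN=500  # hard cap per run
```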
Memory Issues
If the loader runs out of memory:
- Decrease batch size: lower NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS
- Decrease flush buffer: lower NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY
- Process runs individually - Export and load one run at a time using the -r filter
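The configuration table above gives the documented minimums, so the most memory-conservative settings are:

```shell
export NEPTUNE_EXPORTER_PLUTO_BATCH_ROWS=1000   # minimum allowed batch size
export NEPTUNE_EXPORTER_PLUTO_FLUSH_EVERY=100   # minimum allowed flush threshold
```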