
Bulk Import CSV & SQL Server to Cosmos DB Graph

GremlinStudio's new Bulk Import wizard loads thousands of vertices and edges into Azure Cosmos DB from CSV files or live SQL Server queries, with guided steps, real-time progress, and high-throughput Cosmos SDK ingestion.

GremlinStudio Team · February 21, 2026

The Data-In Problem

Graph databases have an adoption bottleneck: getting data in. Most developers have data sitting in CSVs, SQL exports, or Excel sheets. Until now, the only option in GremlinStudio was writing individual g.addV() queries by hand — completely impractical for anything beyond a handful of nodes.

Today we’re shipping Bulk Import — a wizard-guided feature that lets you import thousands of vertices and edges from CSV files or live SQL Server queries directly into your Azure Cosmos DB graph database.

No scripts. No custom code. No ETL pipelines.

What’s New

Bulk Import is a five-step wizard built directly into GremlinStudio:

Connect — Select your Cosmos DB connection. Partition key and throughput are auto-detected.
Choose Source — Upload CSV/delimited files, or connect to MS SQL Server and write a query.
Map Columns — Map source columns to graph properties with auto-detection for convention headers.
Preview — Review validation, choose Create or Upsert mode, and see an RU cost estimate.
Import — Watch real-time progress with live RU tracking. Minimize to a floating PIP widget and keep working.

Bulk Import - Connection Step

CSV and Delimited File Import

Drop a CSV, TSV, pipe-delimited, or semicolon-delimited file into the wizard and GremlinStudio parses it instantly. Convention-based headers make column mapping automatic:

~id,~label,~partitionKey,name,age:number,email,active:boolean
v1,person,dept-eng,Alice Chen,30,alice@acme.com,true
v2,person,dept-eng,Bob Smith,25,bob@acme.com,true
v3,person,dept-mkt,Carol Wu,33,,false

The ~-prefixed columns (~id, ~label, ~partitionKey) map to graph structure automatically. Type annotations like :number and :boolean on property headers control how values are stored in Cosmos DB; columns without an annotation (name and email above) are left unannotated.
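
To make the header convention concrete, here is a minimal Python sketch of how annotated headers could be split and applied. It is an illustration only, not GremlinStudio's parser: the function names are hypothetical, and treating unannotated columns as strings and empty cells as None is an assumption.

```python
import csv
import io

def parse_header(header):
    """Split a header such as 'age:number' into ('age', 'number')."""
    name, sep, type_name = header.partition(":")
    # Assumption: columns without an annotation are treated as strings.
    return name, (type_name if sep else "string")

def convert(text, type_name):
    """Convert one CSV cell to its annotated type."""
    if text == "":
        return None  # Assumption: an empty cell means "no value".
    if type_name == "number":
        try:
            return int(text)
        except ValueError:
            return float(text)
    if type_name == "boolean":
        lowered = text.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"not a boolean: {text!r}")
        return lowered == "true"
    return text

def read_typed_rows(csv_text):
    """Parse CSV text into a list of dicts with typed values."""
    reader = csv.reader(io.StringIO(csv_text))
    columns = [parse_header(h) for h in next(reader)]
    return [
        {name: convert(cell, type_name)
         for (name, type_name), cell in zip(columns, row)}
        for row in reader
    ]

SAMPLE = """~id,~label,~partitionKey,name,age:number,email,active:boolean
v1,person,dept-eng,Alice Chen,30,alice@acme.com,true
v3,person,dept-mkt,Carol Wu,33,,false
"""

rows = read_typed_rows(SAMPLE)
print(rows[0]["age"], rows[1]["email"], rows[1]["active"])  # 30 None False
```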

Edges work the same way:

~id,~label,~from,~to,~fromLabel,~toLabel,weight:number
e1,knows,v1,v2,person,person,0.8
e2,works_at,v1,v3,person,company,1.0

Upload both files, and the wizard imports vertices first, then edges — automatically enforcing the correct order.

Bulk Import - Data Source Step

The mapping step lets you wire each CSV column to a graph property. Auto-map handles the common cases, and you can manually override individual columns with a dropdown for full control.

Bulk Import - Column Mapping

Live SQL Server Import

This is where it gets powerful. Instead of exporting data to an intermediate file, you can connect directly to a Microsoft SQL Server instance, write a SQL query, and stream the result set into your graph.

The SQL Server panel provides:

Connection form with Windows Auth and SQL Login support
Query editor with syntax highlighting and a “Run Preview” button that shows the first 50 rows
Import mode selector — choose whether each row becomes a vertex or an edge
Auto-type inference from SQL column types (no annotation suffixes needed)

Bulk Import - SQL Server Connection

Alias your SQL columns to ~id, ~label, and ~partitionKey for instant auto-mapping:

SELECT
    e.EmployeeId     AS [~id],
    'employee'       AS [~label],
    d.DepartmentName AS [~partitionKey],
    e.FullName       AS [name],
    e.Age,
    e.Email,
    d.DepartmentName AS [department]
FROM Employees e
INNER JOIN Departments d ON e.DeptId = d.Id

The query is executed via SqlDataReader in streaming mode, so the full result set is never buffered in memory. Rows are converted to graph elements on the fly and handed to the bulk executor in batches.
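
The stream-and-batch pattern described above can be sketched in a few lines of Python. This is not GremlinStudio's code: fake_reader is a hypothetical stand-in for a streaming reader, and the vertex record shape is a simplified assumption rather than the actual Cosmos DB document format.

```python
from itertools import islice

def batched(rows, batch_size):
    """Yield lists of up to batch_size rows, pulling from the source lazily."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def to_vertex(row):
    """Turn one result row (a dict) into a simplified vertex record."""
    return {
        "id": str(row["~id"]),
        "label": row["~label"],
        "partitionKey": row["~partitionKey"],
        # Every column without the ~ prefix becomes a property.
        "properties": {k: v for k, v in row.items() if not k.startswith("~")},
    }

def fake_reader(count):
    """Hypothetical stand-in for a streaming reader: one row at a time."""
    for i in range(count):
        yield {
            "~id": i,
            "~label": "employee",
            "~partitionKey": "Engineering",
            "name": f"Employee {i}",
        }

# Only one batch is held in memory at any moment.
batch_sizes = [len(b) for b in batched(map(to_vertex, fake_reader(5)), 2)]
print(batch_sizes)  # [2, 2, 1]
```

Because both map and the generator are lazy, a row is read and converted only when the next batch asks for it.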

Bulk Import - SQL Query Editor and Preview

Why It’s Fast: Cosmos SDK Bulk Executor

Under the hood, Bulk Import uses the Cosmos DB SDK Bulk Executor — not individual Gremlin queries. The difference is dramatic:

Approach	Throughput
Individual `g.addV()` queries	~50–200 items/sec
Cosmos SDK Bulk Executor	~5,000–50,000+ items/sec

The SDK handles parallelism automatically with AllowBulkExecution, auto-retries on 429 (throttled) and 409 (conflict) responses, and optimizes network utilization without any manual batch sizing.

For a typical import of 10,000 vertices + 20,000 edges at 4,000 RU/s provisioning, expect completion in 1–2 minutes. At the 50–200 items/sec listed above, the same 30,000 items would take roughly 2.5–10 minutes with individual Gremlin queries.
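
The 1–2 minute figure can be sanity-checked with simple arithmetic, assuming provisioned throughput is the bottleneck. The 10 RU per write used below is an assumed round number for illustration, not a measured value; real costs depend on document size and indexing policy.

```python
def estimate_seconds(item_count, ru_per_item, provisioned_ru_per_sec):
    """Rough duration when provisioned RU/s is the limiting factor."""
    total_ru = item_count * ru_per_item
    return total_ru / provisioned_ru_per_sec

items = 10_000 + 20_000  # vertices + edges from the example above
# Assumption: each write costs about 10 RU (hypothetical figure).
seconds = estimate_seconds(items, ru_per_item=10, provisioned_ru_per_sec=4_000)
print(f"{seconds:.0f} s")  # 75 s
```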

Real-Time Progress and RU Tracking

During import, the wizard streams live metrics via SignalR:

Progress bar with percentage complete
Vertex and edge counters updating in real time
RU consumed — so you can monitor cost as it happens
Elapsed time and estimated remaining time
Activity log with timestamped events
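
As a rough illustration of how the percentage and remaining-time figures could be derived, here is a small Python sketch. It assumes a simple linear extrapolation from the rate observed so far; the wizard's actual estimator is not documented here and may differ.

```python
def progress_snapshot(done, total, elapsed_seconds):
    """Compute percent complete and a linear estimate of time remaining."""
    percent = 100.0 * done / total if total else 100.0
    if done == 0:
        remaining = None  # No rate observed yet, so no estimate.
    else:
        remaining = elapsed_seconds * (total - done) / done
    return {"percent": percent, "remaining_seconds": remaining}

snapshot = progress_snapshot(done=7_500, total=30_000, elapsed_seconds=20.0)
print(snapshot)  # {'percent': 25.0, 'remaining_seconds': 60.0}
```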

Need to keep querying while a big import runs? Click Minimize and the wizard collapses to a floating picture-in-picture widget in the bottom-right corner. It shows phase, progress, and stats — click to expand back to the full wizard at any time.

Bulk Import - Preview Step

Bulk Import - Import Complete

Validation Before Writes

Before a single document is written to Cosmos DB, the wizard runs a full validation pass:

Required columns — checks for ~id, ~label, ~partitionKey (vertices) or ~from, ~to, ~fromLabel, ~toLabel (edges)
Partition key values — ensures every vertex has a non-empty partition key
Type parsing — validates that :number columns contain numbers, :boolean columns contain booleans, etc.
Referential integrity — ensures edge ~from and ~to IDs exist in the vertex set
Duplicate ID detection — warns about duplicate vertex IDs within the import

Blocking errors prevent the import from starting. Warnings (like empty optional fields) are surfaced but don’t block.
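
A simplified Python sketch of such a pre-write pass might look like the following. It covers required columns, partition key values, referential integrity, and duplicate IDs, and leaves out type parsing. Treating a dangling edge reference as a blocking error is an assumption, and all names are hypothetical.

```python
VERTEX_REQUIRED = ("~id", "~label", "~partitionKey")
EDGE_REQUIRED = ("~from", "~to", "~fromLabel", "~toLabel")

def validate(vertices, edges):
    """Return (errors, warnings) for lists of row dicts; nothing is written."""
    errors, warnings = [], []
    seen_ids = set()
    for n, row in enumerate(vertices, start=1):
        missing = [c for c in VERTEX_REQUIRED if c not in row]
        if missing:
            errors.append(f"vertex row {n}: missing columns {missing}")
            continue
        if not row["~partitionKey"]:
            errors.append(f"vertex row {n}: empty partition key")
        if row["~id"] in seen_ids:
            warnings.append(f"vertex row {n}: duplicate id {row['~id']}")
        seen_ids.add(row["~id"])
    for n, row in enumerate(edges, start=1):
        missing = [c for c in EDGE_REQUIRED if c not in row]
        if missing:
            errors.append(f"edge row {n}: missing columns {missing}")
            continue
        for end in ("~from", "~to"):
            # Assumption: a reference to an unknown vertex blocks the import.
            if row[end] not in seen_ids:
                errors.append(f"edge row {n}: {end} {row[end]} not in vertex set")
    return errors, warnings

vertices = [
    {"~id": "v1", "~label": "person", "~partitionKey": "dept-eng"},
    {"~id": "v2", "~label": "person", "~partitionKey": ""},
    {"~id": "v1", "~label": "person", "~partitionKey": "dept-eng"},
]
edges = [{"~from": "v1", "~to": "v9", "~fromLabel": "person", "~toLabel": "person"}]

errors, warnings = validate(vertices, edges)
for message in errors + warnings:
    print(message)
```

An import would start only when the errors list comes back empty.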

Create vs. Upsert

Before importing, choose your duplicate handling strategy:

Create (fail on duplicate) — fastest option. Fails if a vertex/edge with the same ID and partition key already exists.
Upsert (overwrite duplicates) — safe for re-imports. Creates new items or updates existing ones. Costs roughly 1.5x the RU of a create.

If you’re loading data for the first time, use Create. If you’re refreshing or re-importing, use Upsert.
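
The difference between the two modes can be shown with a toy in-memory store keyed on ID and partition key. This Python sketch only illustrates the semantics described above; it does not call Cosmos DB, and the names are hypothetical.

```python
class DuplicateItemError(Exception):
    """Raised in create mode when the (id, partition key) pair already exists."""

def write_item(store, item, mode):
    """Write one item into a dict-backed store using the chosen mode."""
    key = (item["id"], item["partitionKey"])
    if mode == "create":
        if key in store:
            raise DuplicateItemError(f"item already exists: {key}")
        store[key] = item
    elif mode == "upsert":
        store[key] = item  # Insert, or overwrite the existing item.
    else:
        raise ValueError(f"unknown mode: {mode!r}")

store = {}
write_item(store, {"id": "v1", "partitionKey": "dept-eng", "name": "Alice"}, "create")
write_item(store, {"id": "v1", "partitionKey": "dept-eng", "name": "Alice Chen"}, "upsert")

duplicate_rejected = False
try:
    write_item(store, {"id": "v1", "partitionKey": "dept-eng", "name": "A."}, "create")
except DuplicateItemError:
    duplicate_rejected = True

print(store[("v1", "dept-eng")]["name"], duplicate_rejected)  # Alice Chen True
```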

Error Recovery

Import doesn’t stop on individual failures. If a row fails (type mismatch, conflict, oversized document), GremlinStudio logs the error and continues with the remaining items. At the end, you get:

A completion summary with success/failure counts and total RU consumed
A downloadable error report (CSV) listing every failed row with the error description
The option to fix the source data and re-import just the failed rows
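
The continue-on-failure behavior and the error report can be sketched as follows. This is a minimal Python illustration under stated assumptions: demo_write is a hypothetical writer that rejects non-numeric ages, and the report columns are invented for the example rather than taken from GremlinStudio's actual report format.

```python
import csv
import io

def import_rows(rows, write):
    """Write every row, recording failures instead of stopping."""
    succeeded, failures = 0, []
    for row_number, row in enumerate(rows, start=1):
        try:
            write(row)
            succeeded += 1
        except Exception as exc:  # Keep going whatever the row-level error is.
            failures.append((row_number, row, str(exc)))
    return succeeded, failures

def error_report_csv(failures):
    """Render failed rows as CSV text with one line per failure."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["row", "id", "error"])
    for row_number, row, message in failures:
        writer.writerow([row_number, row.get("~id", ""), message])
    return out.getvalue()

def demo_write(row):
    """Hypothetical writer that fails on a type mismatch."""
    if not isinstance(row["age"], int):
        raise ValueError("age is not a number")

rows = [
    {"~id": "v1", "age": 30},
    {"~id": "v2", "age": "n/a"},
    {"~id": "v3", "age": 33},
]
succeeded, failures = import_rows(rows, demo_write)
report = error_report_csv(failures)
print(succeeded, len(failures))  # 2 1
print(report)
```

Re-importing just the failed rows then amounts to fixing the source data and feeding the rows listed in the report back through the same function.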

Pro Feature

Bulk Import is available on Pro ($9.99/mo) and Team plans. Trial users can preview the wizard but cannot execute imports.

Try It Now

If you’re already on a Pro plan, the Bulk Import wizard is available in the activity bar — look for the upload icon. If you’re new to GremlinStudio, download the app and start your 7-day free trial.

For the full feature walkthrough, see the Bulk Import feature page.
