Welcome to BlazingSQL’s documentation!

BlazingSQL is a collection of configurable primitives for operating on distributed dataframes. It includes an I/O layer for interacting with filesystems like S3 or HDFS. It contains communication primitives that can leverage different backends like UCX and TCP. It separates the.


BlazingSQL provides a high-performance distributed SQL engine in Python. Built on the RAPIDS GPU data science ecosystem, it lets you ETL massive datasets on GPUs.

from blazingsql import BlazingContext
# Start up BlazingSQL
bc = BlazingContext()

# Create table from CSV
bc.create_table('taxi', '/blazingdb/data/taxi.csv')

# Query table (Results return as cuDF DataFrame)
gdf = bc.sql('SELECT count(*) FROM taxi GROUP BY year(key)')

# Display query results
print(gdf)
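
The query above returns one count per year but does not say which year each count belongs to. A minimal sketch of a labeled variant follows, assuming the engine accepts column aliases and ORDER BY on an alias. The helper name trips_per_year, the table_name parameter, and the aliases yr and trips are hypothetical and are not part of BlazingSQL. Only create_table() and sql(), both used in the example above, are called on the context.

```python
def trips_per_year(bc, csv_path, table_name="taxi"):
    """Register csv_path as a table and count rows per year of the `key` column.

    `bc` is expected to be a BlazingContext; only its create_table() and
    sql() methods are used. The function name, table_name, and the column
    aliases yr and trips are illustrative assumptions.
    """
    # Register the CSV file under the given table name.
    bc.create_table(table_name, csv_path)

    # Select the year next to its count so every output row is labeled.
    query = (
        "SELECT year(key) AS yr, count(*) AS trips "
        f"FROM {table_name} "
        "GROUP BY year(key) "
        "ORDER BY yr"
    )

    # Return whatever bc.sql() returns (a cuDF DataFrame per the example above).
    return bc.sql(query)
```

With a live context, calling trips_per_year(bc, '/blazingdb/data/taxi.csv') should give a result with one row per year. Taking the context as an argument keeps the helper free of GPU-specific imports, so it can be exercised with a stand-in object.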

Indices and tables