blazingsql.BlazingContext.sql

BlazingContext.sql(query, algebra=None, config_options={}, return_token: bool = False)

Query a BlazingSQL table.

Returns results as cudf.DataFrame on single-GPU or dask_cudf.DataFrame when distributed (multi-GPU).
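Because the return type depends on whether the context is distributed, code that needs an in-memory frame may want to normalize the result first. The following is a minimal sketch, assuming the distributed result exposes the usual Dask `compute()` method (as dask_cudf.DataFrame does). The helper name `to_local_frame` is hypothetical and not part of BlazingSQL.

```python
def to_local_frame(result):
    """Return an in-memory frame from a BlazingContext.sql result.

    Hypothetical helper, not part of BlazingSQL. Lazy Dask-style results
    (anything exposing a callable ``compute``) are materialized; any other
    object is returned unchanged.
    """
    compute = getattr(result, "compute", None)
    if callable(compute):
        # Distributed case: gather all partitions into one local frame.
        return compute()
    # Single-GPU case: the result is already in memory.
    return result
```

Note that materializing a distributed result gathers every partition into a single process, so this is only appropriate when the result fits in one GPU's memory.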

query : string of SQL query.

algebra (optional) : string of SQL algebra plan. Use this to run on a relational algebra plan instead of the query string.

config_options (optional) : dictionary, defaults to empty. Use this to set specific config_options for this query instead of the ones set in BlazingContext. See BlazingContext for more info on this parameter.
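A per-query override can be sketched as a thin wrapper around the `sql` call. This is an illustration only: the wrapper name `sql_with_overrides` is hypothetical, and the option name in the usage comment is an assumption that should be checked against the BlazingContext documentation.

```python
def sql_with_overrides(bc, query, **overrides):
    """Run ``query`` on ``bc`` with per-query config overrides.

    Hypothetical wrapper, not part of BlazingSQL. ``bc`` is expected to be
    a BlazingContext; only ``bc.sql(query, config_options=...)`` is used.
    """
    # Build a fresh dict on every call so no state is shared between queries.
    config_options = dict(overrides)
    return bc.sql(query, config_options=config_options)


# Usage sketch (the option name is an assumption, see BlazingContext):
# df = sql_with_overrides(bc, 'SELECT * FROM taxi',
#                         JOIN_PARTITION_SIZE_THRESHOLD=400000000)
```

Options passed this way apply to the one query only; the settings given when the BlazingContext was created are left unchanged.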

Register a public S3 bucket, then create and query a table from it:

>>> from blazingsql import BlazingContext
>>> bc = BlazingContext()
>>> bc.s3('blazingsql-colab', bucket_name='blazingsql-colab')
>>> bc.create_table('taxi',
...     's3://blazingsql-colab/yellow_taxi/1_0_0.parquet')
<pyblazing.apiv2.context.BlazingTable at 0x7f186006a310>
>>> result = bc.sql('SELECT vendor_id, tpep_pickup_datetime, '
...                 'passenger_count, Total_amount FROM taxi')
>>> print(result)
          vendor_id tpep_pickup_datetime  passenger_count  Total_amount
0                 1  2017-01-09 11:13:28                1     15.300000
1                 1  2017-01-09 11:32:27                1      7.250000
2                 1  2017-01-09 11:38:20                1      7.300000
3                 1  2017-01-09 11:52:13                1      8.500000
4                 2  2017-01-01 00:00:00                1     52.799999
...             ...                  ...              ...           ...
>>> query = '''
...         SELECT
...             tpep_pickup_datetime, trip_distance, Tip_amount,
...             MTA_tax + Improvement_surcharge + Tolls_amount AS extra
...         FROM taxi
...         WHERE passenger_count = 1 AND Fare_amount > 100
...         '''
>>> df = bc.sql(query)
>>> print(df)
     tpep_pickup_datetime  trip_distance  Tip_amount      extra
0     2017-01-01 06:56:01       0.000000    0.000000   1.000000
1     2017-01-01 07:11:52       0.000000    0.000000  24.619999
2     2017-01-01 07:27:10      37.740002   37.580002  31.179998
3     2017-01-01 07:35:13      42.730000    5.540000  26.869999
4     2017-01-01 07:42:09      17.540001    0.000000  24.900000
...                   ...            ...         ...        ...
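
The algebra parameter described above can be exercised by generating a plan first and then passing it back in. This is a rough sketch, assuming that `bc.explain(query)` returns the relational algebra plan as a string. The helper name `sql_from_plan` is hypothetical.

```python
def sql_from_plan(bc, query):
    """Generate the relational algebra plan for ``query``, then execute it.

    Hypothetical helper, not part of BlazingSQL. Assumes ``bc.explain``
    returns the plan as a string accepted by the ``algebra`` argument.
    Returns a ``(plan, result)`` tuple so the plan can be inspected or reused.
    """
    plan = bc.explain(query)
    result = bc.sql(query, algebra=plan)
    return plan, result
```

Returning the plan alongside the result makes it possible to print or store the plan and pass the same string to later `sql` calls.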

Docs: https://docs.blazingdb.com/docs/single-gpu