BlazingContext.create_table(table_name, input, **kwargs)

Create a BlazingSQL table.

table_name : string of table name.
input : data source for table: a cudf.DataFrame, dask_cudf.DataFrame,
    pandas.DataFrame, or a filepath for csv, orc, parquet, etc.

file_format (optional) : string describing the file format
    (e.g. "csv", "orc", "parquet"). This field must only be set if the
    files do not have an extension.
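For instance, extensionless files can still be registered by stating the format explicitly. A minimal sketch (the path and table name here are hypothetical, assuming a parquet file saved without an extension):

```python
>>> # 'data/table_0' has no extension, so tell BlazingSQL it is parquet
>>> bc.create_table('no_ext_table', 'data/table_0', file_format='parquet')
```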

local_files (optional) : boolean, must be set to True if workers
    only have access to a subset of the files belonging to the same
    table. In that case, each worker will load its corresponding partitions.
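As a sketch of that distributed case (the scheduler address and file glob below are hypothetical; this assumes a BlazingContext created with a dask.distributed client):

```python
>>> from dask.distributed import Client
>>> from blazingsql import BlazingContext
>>> client = Client('127.0.0.1:8786')  # hypothetical scheduler address
>>> bc = BlazingContext(dask_client=client)
>>> # each worker loads only the partitions visible on its own filesystem
>>> bc.create_table('events', '/local/data/events_*.parquet', local_files=True)
```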

get_metadata (optional) : boolean, whether to use parquet and orc
    metadata; defaults to True. When set to False it will skip the
    process of getting metadata.
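Skipping metadata can speed up table registration when the metadata would not help (the file path here is hypothetical):

```python
>>> # register the table without reading parquet metadata
>>> bc.create_table('trips', 'data/trips.parquet', get_metadata=False)
```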

Create table from cudf.DataFrame:

>>> import cudf
>>> df = cudf.DataFrame()
>>> df['a'] = [6, 9, 1, 6, 2]
>>> df['b'] = [7, 2, 7, 1, 2]
>>> from blazingsql import BlazingContext
>>> bc = BlazingContext()
BlazingContext ready
>>> bc.create_table('sample_df', df)
<pyblazing.apiv2.context.BlazingTable at 0x7f22f58371d0>

Create table from local file in 'data' directory:

>>> bc.create_table('taxi', 'data/nyc_taxi.csv', header=0)
<pyblazing.apiv2.context.BlazingTable at 0x7f73893c0310>

Register and create table from a public AWS S3 bucket:

>>> bc.s3('blazingsql-colab', bucket_name='blazingsql-colab')
>>> bc.create_table('taxi',
...     's3://blazingsql-colab/yellow_taxi/1_0_0.parquet')
<pyblazing.apiv2.context.BlazingTable at 0x7f09264c0310>