CREATE
CREATE SCHEMA
Creates a schema (database) with the specified name.
Overview
CREATE SCHEMA [ IF NOT EXISTS ] schema_name
[ WITH ( location = 'object_storage_location' ) ]
Parameters
[ IF NOT EXISTS ]
- Suppresses the error if a schema with the same name already exists.
[ WITH (location = 'object_storage_location') ]
- For data sources using Object Storage as schema storage, specifies the path where the schema files will reside.
object_storage_location
must be in the formats3a://bucket-name/my/path
.
CREATE TABLE
Creates a table with the specified name.
Overview
CREATE TABLE [ IF NOT EXISTS ] table_name
(
{ column_name data_type [ NOT NULL ]
[ COMMENT comment ]
[ WITH ( property_name = expression [, ...] ) ]
| LIKE existing_table_name
[ { INCLUDING | EXCLUDING } PROPERTIES ]
}
[, ...]
)
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]
Parameters
[ IF NOT EXISTS ]
- Suppresses the error if a table with the same name already exists.
column_name data_type [ NOT NULL ]
-
Defines the columns of the table.
Data Query types mapped from Hive types:Hive data type Data Query data type TINYINT TINYINT SMALLINT SMALLINT INT INTEGER BIGINT BIGINT FLOAT FLOAT DOUBLE DOUBLE DECIMAL(p, s) DECIMAL(p, s) BOOLEAN BOOLEAN STRING VARCHAR CHAR(n) CHAR(n) VARCHAR(n) VARCHAR(n) DATE DATE TIMESTAMP TIMESTAMP INTERVAL INTERVAL ARRAY<T> ARRAY<T> MAP<K, V> MAP<K, V> STRUCT<...> ROW(...)
[ COMMENT comment ]
- Adds a comment to the table or column.
[ LIKE existing_table_name ]
- Copies all columns from an existing table.
Multiple LIKE clauses can be used to copy from multiple tables.
[ WITH (property_name = expression , ...]) ]
- Sets table-level properties as key-value pairs.
Table attributes
property_name | Description | Default |
---|---|---|
bucket_count | Number of buckets - Must be used with bucketed_by | 0 |
bucketed_by | Columns to use for bucketing - Must be used with bucket_count | [] |
csv_escape | Escape character for CSV - Only used when format = 'CSV' | |
csv_quote | Quote character for CSV - Only used when format = 'CSV' | |
csv_separator | Separator for CSV format - Can use custom chars like ` | or Unicode like U&'\0009'` for tabs |
external_location | URI used for Hive external tables - If omitted, creates a managed table - Must be in format s3a://bucket-name/my/path ⚠️ If the path does not exist in Object Storage, table creation fails | |
format | File format (ORC , PARQUET , AVRO , JSON , TEXTFILE , CSV ) | ORC |
partitioned_by | Columns used for partitioning - Must be last in column definition | [] |
Examples
CREATE TABLE Example
-- Create a managed ORC table with partitioning and bucketing
CREATE TABLE orders (
totalprice double,
orderdate date,
orderkey bigint,
orderstatus varchar
)
WITH (
format = 'ORC',
partitioned_by = ARRAY['orderkey', 'orderstatus'],
bucketed_by = ARRAY['orderdate'],
bucket_count = 50
)
-- Create a CSV external table with comments
CREATE TABLE IF NOT EXISTS orders (
orderkey bigint,
orderstatus varchar,
totalprice double COMMENT 'Price in cents.',
orderdate date
)
COMMENT 'A table to keep track of orders.'
WITH (
external_location = 's3a://test-bucket/tpch/orders',
format = 'CSV',
csv_quote = '"',
csv_separator = ','
)
-- Create using LIKE clause
CREATE TABLE bigger_orders (
another_orderkey bigint,
LIKE orders,
another_orderdate date
)
CREATE TABLE AS
Creates a table from the results of a query.
Overview
CREATE TABLE [ IF NOT EXISTS ] table_name
[ ( column_alias, ... ) ]
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]
AS query
[ WITH [ NO ] DATA ]
Parameters
[ IF NOT EXISTS ]
- Suppresses the error if a table with the same name already exists.
[ ( column_alias, ... ) ]
- Specifies column names. If omitted, columns from the query are used.
[ COMMENT comment ]
- Adds a comment to the table.
[ WITH (property_name = expression , ...]) ]
- Sets table-level properties.
Same attributes as Table attributes.
[ WITH [ NO ] DATA ]
- Creates an empty table with the same schema as the query.
Examples
CREATE TABLE AS Example
-- Create a table using column aliases
CREATE TABLE orders_column_aliased (order_date, total_price)
AS
SELECT orderdate, totalprice
FROM orders
-- Create a summary table from query
CREATE TABLE orders_by_date
COMMENT 'Summary of orders by date'
WITH (format = 'ORC')
AS
SELECT orderdate, sum(totalprice) AS price
FROM orders
GROUP BY orderdate
-- Create an empty table by copying schema
CREATE TABLE empty_nation AS
SELECT *
FROM nation
WITH NO DATA