2 years ago
#72114
ardaar
Error when creating external table in Redshift Spectrum with dbt: cross-database reference not supported
I want to create an external table in Redshift Spectrum from CSV files. When I try doing so with dbt, I get a strange error. But when I manually remove some double quotes from the SQL generated by dbt and run it directly, I get no such error.
First I run this in Redshift Query Editor v2 on default database dev
in my cluster:
CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
Database dev
now has an external schema named example_schema
(and Glue catalog registers example_db
).
I then upload example_file.csv
to the S3 bucket s3://example_bucket
. The file looks like this:
col1,col2
1,a,
2,b,
3,c
Then I run dbt run-operation stage_external_sources
in my local dbt project and get this output with an error:
21:03:03 Running with dbt=1.0.1
21:03:03 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.example_project.example_models
21:03:03 1 of 1 START external source example_schema.example_table
21:03:03 1 of 1 (1) drop table if exists "example_db"."example_schema"."example_table" cascade
21:03:04 Encountered an error while running operation: Database Error
cross-database reference to database "example_db" is not supported
I try running the generated SQL in Query Editor:
DROP TABLE IF EXISTS "example_db"."example_schema"."example_table" CASCADE
and get the same error message:
ERROR: cross-database reference to database "example_db" is not supported
But when I run this SQL in Query Editor, it works:
DROP TABLE IF EXISTS "example_db.example_schema.example_table" CASCADE
Note that I just removed some quotes.
What's going on here? Is this a bug in dbt-core
, dbt-redshift
, or dbt_external_tables
--or just a mistake on my part?
To confirm, I can successfully create the external table by running this in Query Editor:
DROP SCHEMA IF EXISTS example_schema
DROP EXTERNAL DATABASE
CASCADE
;
CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
CREATE EXTERNAL TABLE example_schema.example_table (
col1 SMALLINT,
col2 CHAR(1)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3://example_bucket'
TABLE PROPERTIES ('skip.header.line.count'='1')
;
dbt config files
models/example/schema.yml
(modeled after this example:
version: 2
sources:
- name: example_source
database: dev
schema: example_schema
loader: S3
tables:
- name: example_table
external:
location: 's3://example_bucket'
row_format: >
serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
'strip.outer.array'='false'
)
columns:
- name: col1
data_type: smallint
- name: col2
data_type: char(1)
dbt_project.yml
:
name: 'example_project'
version: '1.0.0'
config-version: 2
profile: 'example_profile'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: "target"
clean-targets:
- "target"
- "dbt_packages"
models:
example_project:
example:
+materialized: view
packages.yml
:
packages:
- package: dbt-labs/dbt_external_tables
version: 0.8.0
amazon-redshift
dbt
amazon-redshift-spectrum
0 Answers
Your Answer