2 years ago

#72114

test-img

ardaar

Error when creating external table in Redshift Spectrum with dbt: cross-database reference not supported

I want to create an external table in Redshift Spectrum from CSV files. When I try doing so with dbt, I get a strange error. But when I manually remove some double quotes from the SQL generated by dbt and run it directly, I get no such error.

First I run this in Redshift Query Editor v2 on default database dev in my cluster:

CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;

Database dev now has an external schema named example_schema (and Glue catalog registers example_db).

I then upload example_file.csv to the S3 bucket s3://example_bucket. The file looks like this:

col1,col2
1,a,
2,b,
3,c

Then I run dbt run-operation stage_external_sources in my local dbt project and get this output with an error:

21:03:03  Running with dbt=1.0.1
21:03:03  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.example_project.example_models

21:03:03  1 of 1 START external source example_schema.example_table
21:03:03  1 of 1 (1) drop table if exists "example_db"."example_schema"."example_table" cascade
21:03:04  Encountered an error while running operation: Database Error
  cross-database reference to database "example_db" is not supported

I try running the generated SQL in Query Editor:

DROP TABLE IF EXISTS "example_db"."example_schema"."example_table" CASCADE

and get the same error message:

ERROR: cross-database reference to database "example_db" is not supported

But when I run this SQL in Query Editor, it works:

DROP TABLE IF EXISTS "example_db.example_schema.example_table" CASCADE

Note that I just removed some quotes.

What's going on here? Is this a bug in dbt-core, dbt-redshift, or dbt_external_tables--or just a mistake on my part?

To confirm, I can successfully create the external table by running this in Query Editor:

DROP SCHEMA IF EXISTS example_schema
DROP EXTERNAL DATABASE
CASCADE
;
CREATE EXTERNAL SCHEMA example_schema
FROM DATA CATALOG
DATABASE 'example_db'
REGION 'us-east-1'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
CREATE EXTERNAL TABLE example_schema.example_table (
  col1 SMALLINT,
  col2 CHAR(1)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3://example_bucket'
TABLE PROPERTIES ('skip.header.line.count'='1')
;

dbt config files

models/example/schema.yml (modeled after this example:

version: 2

sources:
  - name: example_source
    database: dev
    schema: example_schema
    loader: S3
    tables:
      - name: example_table
        external:
          location: 's3://example_bucket'
          row_format: >
            serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
            with serdeproperties (
              'strip.outer.array'='false'
            )
          columns:
            - name: col1
              data_type: smallint
            - name: col2
              data_type: char(1)

dbt_project.yml:

name: 'example_project'
version: '1.0.0'
config-version: 2

profile: 'example_profile'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"

models:
  example_project:
    example:
      +materialized: view

packages.yml:

packages:
  - package: dbt-labs/dbt_external_tables
    version: 0.8.0

amazon-redshift

dbt

amazon-redshift-spectrum

0 Answers

Your Answer

Accepted video resources