
ThaiPy Meetup - DuckDB Talk

Author: Kostiantyn Lysenko

Last Thursday I attended ThaiPy - a monthly Python meetup in Bangkok. There was an interesting talk about DuckDB usage in Python.

I knew that DuckDB is the superglue of data wrangling, but I didn't know it was that powerful.

DuckDB can attach to various databases directly - for example Postgres:

ATTACH '' AS postgres_db (TYPE postgres);
SELECT * FROM postgres_db.tbl;

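It can also export data to Parquet from any source, not just Postgres, and load a Parquet file back into a table:

-- export the table to a Parquet file
COPY postgres_db.tbl TO 'data.parquet';
-- load the Parquet file into the table
COPY postgres_db.tbl FROM 'data.parquet';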

In your code you can transparently switch between Pandas DataFrames and DuckDB, thanks to Apache Arrow:

import duckdb
import pandas as pd

pandas_df = pd.DataFrame({"a": [42]})
duckdb.sql("SELECT * FROM pandas_df")  # the DataFrame is found by its variable name

In fact, you can get DuckDB results in the format of almost any popular data-crunching library in Python - Pandas, Polars, Arrow, NumPy:

duckdb.sql("SELECT 42").fetchall()     # Python objects
duckdb.sql("SELECT 42").df()           # Pandas DataFrame
duckdb.sql("SELECT 42").pl()           # Polars DataFrame
duckdb.sql("SELECT 42").arrow()        # Arrow Table
duckdb.sql("SELECT 42").fetchnumpy()   # NumPy Arrays


