Skip to content

V2 Query Engines

With Dune V2 we’re moving away from a PostgreSQL query engine to:

  1. Dune SQL - A self hosted instance of Trino.
  2. Spark SQL - Apache Spark hosted on Databricks. Recommended to use only for Spellbook.

Warning

Dune SQL exited alpha on March 2nd, 2023. We have made some changes to datatypes and column names that introduced breaking changes See DuneSQL changes for more.
For help with migrating queries to compatible data types and general Dune SQL functions, the best place to ask questions is on our#dune-sql Discord channel.

Syntax and operator differences

The syntax and keyword operator differences between Postgres, Spark, and Dune SQL are quite minimal, here are a few key ones:

Syntax Comparison

Description
V1 - PostgreSQL V2 - Spark SQL V2 - Dune SQL
bytea2numeric, or casting hex/bytea to a number bytea2numeric (bytea) bytea2numeric_v3 (string) bytearray_to_integer (hex)
bytearray_to_bigint (hex)
bytearray_to_decimal (hex)
bytearray_to_uint256 (hex)
bytearray_to_int256 (hex)
More details on Byte Array to Numeric Functions
Doing math or numeric operations on a column, like value in ethereum.transactions sum(value) sum(value) sum(cast(value as double)) soon this won't be needed as UINT and INT columns are added automatically.
0 vs 1 array based indexing 1 indexed 0 indexed 1 indexed
Implicit type conversions between character and numeric types Available Available Not available
Addresses \x2A7D...(bytea)

Works in Postgres
0x2a7d... (string)

Has to be lowercase in Spark.

Can be done via lower('0x2A7D...')
0x2a7d... (Byte array)

No escape quotes should be used, and the literal does not need to be lowercased.
Selecting keyword columns is different "from" `from` "from"
Alias naming is different as "daily active users" as `daily active user` as "daily active users"
Exponentiation notation x/10^y or x * 1e123 x*power(10,y) or x*1e123 x*power(10,y) or x * 1e123
Interval argument has different syntax Interval '1day' Interval '1 day' Interval '1' day
Generate_series () is now sequence () generate_series('2022-05-15', CURRENT_DATE, '1 day') explode(sequence(to_date('2022-01-01'), to_date('2022-02-01'), interval 1 day)) unnest(sequence(date('2022-01-01'), date('2022-02-01'), interval '7' day))

Has a 10000 values limit, and must go in the FROM statement not the SELECT.
Handling decimals for prices.usd Don’t use prices.usd decimals Replaced by prices.tokens decimals Replaced by tokens_[blockchain].erc20.decimals
Define NULL array NULL::integer[] CAST(NULL AS ARRAY<int>)) CAST(NULL AS ARRAY<int>))
encoding strings to hex encode(string, 'hex') hex(string) hex(string)

*available soon
Get json object differences (takerOutputUpdate->'deltaWei'->'value') decode(substring((addressSet->'baseAsset')::TEXT, 4,40), 'hex') get_json_object(get_json_object(takerOutputUpdate,'\(.deltaWei'),'\).value')'0x' json_query(json_query(takerOutputUpdate, 'lax $.deltaWei' omit quotes), 'lax $.value')
Group by an alias SELECT date_trunc('hour',evt_block_time) as col1, COUNT(*) FROM erc721_ethereum evt_Transfer GROUP BY col1 Same as PostgreSQL GROUP BY date_trunc('hour',evt_block_time)Or: GROUP BY 1, 2
Explicit date/time casting '2021-08-08 17:00'::timestamp cast('2021-08-08 17:00' as timestamp) cast('2021-08-08 17:00' as timestamp)

Or, timestamp '2021-08-08 17:00'

There are many helper functions for casting to date/time types, such as date(‘2022-01-01’)
Checking if an item exists in an array value = ANY (array) array_contains(array, value) contains(array, value) or contains_sequence(array, array[values])
Explode SELECT unnest(array) FROM table SELECT explode(array) FROM table SELECT vals.val FROM table1, unnest(arrayFromTable1) as vals(val)

you have to use unnest with a cross join, as described in this blog post.
Median PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY x) PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY x) approx_percentile(x, 0.5)
Using “is True/False” X is true X is true X = true
String Data Type varchar string varchar
Casting as Strings cast([xxx] as string) cast([xxx] as string) cast([xxx] as varchar)
left() is no longer a method available for returning substrings left([string],[length]) left([string],[length]) substr([string], [start], [length])

Returns varchar; Positions start with 1, so use 1 for length if you want to replicate left() functionality left(somestring, somenumber) -> substr(somestring, 0, somenumber)
Aggregate Functions array_agg(col), array_agg(distinct(col)) array_agg(col) or collect_list(col), collect_set(col) or array_agg(distinct(col)) array_agg(col), array_agg(distinct(col))
user generated views create view dune_user_generated.table none each query is a view, like query_1747157
event logs topic indexing topic 1,2,3,4 topic 1,2,3,4 topic 0,1,2,3

Using double quotes is not recommended in DuneV2, even when the engine runs your query without returning an error. This is because the parser sometimes treats words in double quotes as a string and sometimes it treats them as an object like a column name.

For example, referencing a column name in the WHERE clause using double quotes works as expected. However, the same query inside a CTE treats the column name as a string, as can be seen here.

Dune SQL Data Types and Functions

Numerical types in Dune SQL

We support the numerical types INTEGER, BIGINT, DOUBLE, and fixed precision DECIMAL with precision up to 38 digits (i..e, DECIMAL(38, 0)). Additionally, we support UINT256 for representing unsigned 256 bit integers and INT256 for signed 256 bit integers, using two's complement.

Byte Array Functions in Dune SQL

Warning

Operators such as ||, concat(), trim() being performed on addresses and transaction hashes will stop working when Dune SQL exits Alpha. We are changing the data type of these bytearrays from varchar to varbinary in order to reduce data stored and improve querying speed by up to 50%. Please use the functions below when interacting with bytearrays to remain compatible with future changes.

Dune SQL currently represents byte arrays using the varbinary type. Byte arrays can also be represented using 0x-prefixed strings.

To make it simpler to work with byte arrays we have the following helper functions, which work with these two kinds of representation. They simplify interactions with byte arrays, as they automatically account for the 0x-prefix and use byte index instead of characther index. For instance, the bytearray_substring methods take indexes by byte, not by characther (twice the byte array length).

If there is an operation you need to do on byte arrays which is not covered by a function in the list below you should reach out to the Dune team.

Function Return Type Argument Types Description
bytearray_concat varbinary or varchar varbinary, varbinary or varchar, varchar Concatenates two byte arrays
bytearray_length bigint varbinary or varchar Returns the length of a byte array
bytearray_ltrim varbinary or varchar varbinary or varchar Removes zero bytes from the beginning of a byte array
bytearray_position bigint varbinary, varbinary or varchar, varchar Returns the index of a given bytearray (or 0 if not found)
bytearray_replace varbinary or varchar varbinary, varbinary, varbinary or varchar, varchar, varchar Greedily replaces occurrences of a pattern within a byte array
bytearray_reverse varbinary or varchar varbinary or varchar Reverse a given byte array
bytearray_rtrim varbinary or varchar varbinary or varchar Removes zero bytes from the end of a byte array
bytearray_starts_with boolean varbinary, varbinary or varchar, varchar Determines whether a byte array starts with a prefix
bytearray_substring varbinary or varchar varbinary, integer or varchar, integer Suffix byte array starting at a given index
bytearray_substring varbinary or varchar varbinary, integer, integer or varchar, integer, integer Sub byte array of given length starting at an index

Byte Array to Numeric Functions

Function Description
bytearray_to_integer Returns the INTEGER value of a big-endian byte array of length <= 4 representing the integer in two's complement. If the byte array has length < 4 it is padded with zero bytes.
bytearray_to_bigint Returns the BIGINT value of a big-endian byte array of length <= 8 representing the bigint in two's complement. If the byte array has length < 8 it is padded with zero bytes.
bytearray_to_decimal Returns the DECIMAL(38,0) value of a big-endian byte array of length <= 16 representing the decimal(38,0) in two's complement. If the byte array has length < 16 it is padded with zero bytes.
bytearray_to_uint256 Returns the UINT256 of a big-endian byte array of length <= 32 representing the unsigned integer. If the byte array has length < 32 it is padded with zero bytes.
bytearray_to_int256 Returns the INT256 of a big-endian byte array of length <= 32 representing the signed integer. If the byte array has length < 32 it is padded with zero bytes.
bytea2numeric This function has been deprecated. It is an alias for bytearray_to_bigint

The byte array conversion functions throw an overflow exception if the byte array is larger than the number of bytes supported of the type, even if the most significant bytes are all zero. It is possible to use bytearray_ltrim in order to trim the zero bytes from the left.

Here is a dashboard with examples covering all of the above functions.

Query queries as views in Dune SQL

All non-paramterized queries written using Dune SQL can be queried as views in other queries using the identifier query_<queryId>. For instance:

select * from query_1234
Here is an example query which queries this query. The queryId is part of the URL of a query.

Note:

  • All output columns of the query being queried must be named. That is, you cannot query select 1 or select count(*) from ethereum.transactions, but you can query select 1 as v and select count(*) as total from ethereum.transactions.
  • Parametrized queries are not supported.
  • Only public queries can be queried. Support for querying private queries will be added in the future.
  • Only saved queries can be queried.
  • Only queries written in Dune SQL can be queried.

Other questions and feedback

As ever, Google is a great friend in answering your SQL questions.

With Dune V2, instead of googling “PGSQL median”, you should now google “Spark SQL median” (for Spark SQL) or “Trino SQL median” (for Dune SQL).

Both also have well documented index of built in functions on their website:

Our #dune-sql Discord channel is the best place to get help from our team and Wizard community when Google fails you.

As you come across issues or identify areas of improvement, please send us an email at dunesql-feedback@dune.com and we’ll work with you to update and optimize!