#How to set up and use Presto with Superset + QueryPie

With QueryPie's recent implementation of data warehouse sources such as Snowflake, BigQuery, and Redshift, feedback from developers as well as data analysts dealing with large data volumes has been overwhelming.  With this demand, we are also preparing to add new data sources, such as Presto and Qubole, to meet the growing needs of our users.

Today we're going to talk about how to connect to Presto on QueryPie, set server settings, connect using Supersets, and how much more convenient it is to run queries on QueryPie compared to other tools.

1. What is Presto?

Presto is a big open-source data analysis tool developed by Facebook. It's easy to confuse Presto as an existing database. However, instead of replacing the current concept of MySQL and PostgreSQL, it is more like a SQL query engine, which effectively handles large amounts of data using distributed queries. The Presto server consists of a coordinator and several staff members, where the coordinator plays the role of the interface.

source
Reference: https://labs.gree.jp/blog/2014/12/12838/

Some of the advantages of Presto are its various repositories and natural plug-in expansions. It's also faster than Hive, developed on Facebook for the same initial purpose. Because Presto supports standard SQL, it can process an extensive number of data resources, including Athena and Qubole.  Presto is showing a significant increase in its number of users.

2. How to set up a Presto Server

To set up Presto, prepare the server for installation, and then download the required files from prestosql.io.

wget https://repo1.maven.org/maven2/io/prestosql/presto-server/332/presto-server-332.tar.gz

tar -xvzf presto-server-332.tar.gz

cd presto-server-332
Unzip after downloading tar files

You also need to create an etc directory inside the folder.

And create properties in the etc directory.

## etc/node.properties
node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/var/presto/data

## etc/jvm.properties
-server
-Xmx16G
-XX:-UseBiasedLocking
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+UseGCOverheadLimit
-XX:+HeapDumpOnOutOfMemoryError
-XX:ReservedCodeCacheSize=512M
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000

## etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080

## etc/log.properties
io.prestosql=INFO

## etc/catalog/jmx.propterties
connetor.name = jmx
The following is a minimal

After creating all the config files, type bin/launcher start and launch.

Once launched, go to localhost:8080, and you'll see the default UI running automatically. You can also see that it is connected to Presto by creating and re-executing the properties of the database you want in the etc/catalog.

3. Use case (1): BI Tool

📌Presto on Superset

Presto is similar to RDBMSs such as MYSQL, SQLServer, PostgreSQL, and various other data sources such as Hive, Cassandra, Kafka, and Elastic Search, so it is often used by organizations that process large amounts of data in real-time. When analyzing using and analyzing SQL, we usually share visualized data in the form of a dashboard.

Many amazing BI tools support Presto. Check out this helpful list of BI Tools to find the right one for you.

Today we're going to use Superset, an Open Source tool, to connect to Presto.

Superset - https://superset.incubator.apache.org/

You can connect to your desired database using Superset by going to Sources > Databases. To add your database, click the plus button next to the 'Filter List' button. You can connect to Presto on the page that pops up and enter the corresponding values in the Database, SQLalchemy URI.

SQL Alchemy URI - presto://user@host:port/catalog

Now you can use the linked data sources to generate charts and dashboards. You can also use the SQL Lab, which acts as an editor when you need to run a query and check a value. Within the SQL Lab, there is an editor, search functions, and an option to save queries so you can conveniently run SQL inside Superset. However, there are limited options for formatting, viewing tables, and no way to see the simultaneous execution of queries. For that, you need to work within a SQL Tool.

SQL Editor on Superset

4. Use case (2): SQL Tool

📌Presto on QueryPie

Presto supports basic jdbc, so you can create queries by connecting to an SQL IDE like DataGrip that can import custom jdbc. However, it's a bit bothersome to import after downloading the jar file. There is also the inconvenience of writing a query every time you want to specify a catalog.

QueryPie supports Presto without the need to import jar files and standard query statements without specifying a catalog.

It also provides essential SQL IDE tools, such as the simultaneous execution of SQL and easy table browsing. Compared to other tools, QueryPie can collaborate with your teams easily through functions such as sharing SQL through shared directories and comments.

If you were previously analyzing your data with Presto, try out new Presto SQL IDE!

📌 Start QueryPie for free
📌 See QueryPie Tutorial