select*from(select relations.table_name as table_name, -- schema name and table namecount(relations.table_name) as relationships, -- number of table relationshipscount(relations.referenced_tables) as foreign_keys, -- number of foreign keys in a tablecount(relations.referencing_tables) asreferences, -- number of foreign keys that are referring to this tablecount(distinct related_table) as related_tables, -- number of related tablescount(distinct relations.referenced_tables) as referenced_tables, -- number of different tables referenced with FKs (multiple FKs can refer to one table, so number of FKs might be different than number of referenced tables)count(distinct relations.referencing_tables) as referencing_tables -- number of different tables that are referring to this table (similar to referenced_tables)from(select pk_tco.table_schema ||'.'|| pk_tco.table_name as table_name, fk_tco.table_schema ||'.'|| fk_tco.table_name as related_table, fk_tco.table_name as referencing_tables,null::varchar(100) as referenced_tablesfrom information_schema.referential_constraints rcojoin information_schema.table_constraints fk_tcoon rco.constraint_name = fk_tco.constraint_nameand rco.constraint_schema = fk_tco.table_schemajoin information_schema.table_constraints pk_tcoon rco.unique_constraint_name = pk_tco.constraint_nameand rco.unique_constraint_schema = pk_tco.table_schemaunionallselect fk_tco.table_schema ||'.'|| fk_tco.table_name as table_name, pk_tco.table_schema ||'.'|| pk_tco.table_name as related_table,nullas referencing_tables, pk_tco.table_name as referenced_tablesfrom information_schema.referential_constraints rcojoin information_schema.table_constraints fk_tco on rco.constraint_name = fk_tco.constraint_nameand rco.constraint_schema = fk_tco.table_schemajoin information_schema.table_constraints pk_tcoon rco.unique_constraint_name = pk_tco.constraint_nameand rco.unique_constraint_schema = pk_tco.table_schema) relationsgroupby table_nameorderby relationships asc) resultswhere substring(table_name, 5, 2) ='d_'; -- substring(string, start_position, length)
List column definitions and order between a set of tables
with recursivemain_tables ( table_schema, table_name) as (values ('gsidb', 'd_document'), ('gsidb', 'd_docfile'), ('gsidb', 'd_study'), ('gsidb', 'd_location'), ('gsidb', 'd_studylocation'), ('gsidb', 'd_sampcoll'), ('gsidb', 'd_sampmain'), ('gsidb', 'd_sampsplit'), ('gsidb', 'd_labsample'), ('gsidb', 'd_labpkg'), ('gsidb', 'd_labresult')),exclude_columns (column_name) as (values ('gid'), ('studyloc_alias'), ('sampcoll_alias'), ('sample_alias'), ('analresult_alias'), ('rev_user'), ('rev_time')),dependencies as (select tc.table_name aschild, tu.table_name asparentfrom information_schema.table_constraints as tcinnerjoin information_schema.constraint_table_usage as tuon tu.constraint_name = tc.constraint_namewhere tc.constraint_type ='FOREIGN KEY'and tc.table_name <> tu.table_name),dep_depth as (select dep.child, dep.parent, 1as lvlfrom dependencies as depunionallselect dep.child, dep.parent, dd.lvl +1as lvlfrom dep_depth as ddinnerjoin dependencies as dep on dep.parent= dd.child ),table_order as (select all_levels.table_name, all_levels.table_orderfrom (select dd.parentas table_name, max(lvl) as table_orderfrom dep_depth as ddgroupby table_nameunionselect dd.childas table_name, max(lvl) +1aslevelfrom dep_depth as ddleftjoin dependencies as dp on dp.parent= dd.childwhere dp.parentisnullgroupby dd.child ) as all_levelsorderby table_order),cols as (select cc.table_name,-- Some column names need changing because they exist in multiple tablescasewhen cc.column_name in ('comments', 'description', 'fraction')thenreplace(cc.table_name, 'd_', '') ||'_'|| cc.column_nameelse cc.column_name endas column_name, data_type, character_maximum_length, is_nullable, table_order, ordinal_position,row_number() over () as col_orderfrom information_schema.columnsas ccinnerjoin table_order on table_order.table_name=cc.table_nameinnerjoin main_tables on main_tables.table_schema=cc.table_schema and main_tables.table_name=cc.table_nameleftjoin exclude_columns on cc.column_name=exclude_columns.column_namewhere exclude_columns.column_name isnullorderby table_order.table_order, table_order.table_name, cc.ordinal_position)selectdistinct cols.column_name, cols.column_name||' '|| data_type||casewhen character_maximum_length isnullthen''else'('|| character_maximum_length ||')'end-- || case when not is_nullable::boolean then ' NOT NULL' else '' end as col_def,as col_def, cols.table_order, cols.table_name, cols.ordinal_position, row_number() over () as col_orderfrom colsorderby cols.table_order, cols.table_name, cols.ordinal_position
Common Functions
LENGTH(string): Returns the length of the provided string
POSITION(string IN substring): Returns the position of the substring within the specified string.
CAST(expression AS datatype): Converts an expression into the specified data type.
`NOW: Returns the current date, including time.
CEIL(input_val): Returns the smallest integer greater than the provided number.
FLOOR(input_val): Returns the largest integer less than the provided number.
ROUND(input_val, [round_to]): Rounds a number to a specified number of decimal places.
TRUNC(input_value, num_decimals): Truncates a number to a number of decimals.
REPLACE(whole_string, string_to_replace, replacement_string): Replaces one string inside the whole string with another string.
SUBSTRING(string, [start_pos], [length]): Returns part of a value, based on a position and length.
Allow user to insert rows to any schema and table.
4
Allow user to update rows in any schema and table.
5
Allow user to delete rows in any schema and table.
Create read only user - shorthand
Once already creating a specific user role, you can user the `pg_read_all_data` to grant read only access to all tables.
GRANT pg_read_all_data TO username;
Finding Temporary Objects
SELECT n.nspname as SchemaName, c.relname as RelationName,CASE c.relkindWHEN'r'THEN'table'WHEN'v'THEN'view'WHEN'i'THEN'index'WHEN'S'THEN'sequence'WHEN's'THEN'special'ENDas RelationType, pg_catalog.pg_get_userbyid(c.relowner) as RelationOwner, pg_size_pretty(pg_relation_size(n.nspname ||'.'|| c.relname)) as RelationSizeFROM pg_catalog.pg_class cLEFTJOIN pg_catalog.pg_namespace AS n ON n.oid= c.relnamespaceWHERE c.relkind IN ('r','s') AND (n.nspname !~ '^pg_toast'and nspname like'pg_temp%')ORDERBY pg_relation_size(n.nspname ||'.'|| c.relname) DESC;
SQL
Terminate backends of a particular user (“test”)
This query will kill every backend user “test” is connected to.
WITH pids AS (SELECT pidFROM pg_stat_activityWHERE usename='test')SELECT pg_terminate_backend(pid)FROM pids;
Cancel every running SQL commands from a particular user (“test”)
This query will cancel every running query issued by the particular user “test”.
WITH pids AS ( SELECT pid FROM pg_stat_activity WHERE username='test') SELECT pg_cancel_backend(pid) FROM pids;
List tables in database
SELECT table_schema, table_name FROM information_schema.tablesORDERBY table_schema,table_name;
createdatabase<new_db_name> owner <user_or_group> template <name_of_db_to_use_as_template>;-- show search_path;set search_path to<default_schema>,public;create extension ifnotexists postgis;create extension ifnotexists dblink;-- Database Creation-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~createdatabase<new_db_name> owner <user_or_group> template <name_of_db_to_use_as_template>;-- show search_path;set search_path to gsidb, public;grantconnect, temporaryondatabase<new_db_name>topublic;grantallondatabase<new_db_name>to<user>;grantallondatabase<new_db_name>to<group>;create extension ifnotexists postgis;create extension ifnotexists dblink;createschema staging;-- Add a unique constraint to e_analyte.full_name and e_analyte.cas_rn so that-- no full name or cas_rn can be used for more than one analyte-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~altertable e_analyteaddconstraint uc_fullname unique(full_name),addconstraint uc_casrn unique(cas_rn);
DBLink
select a.*, b.*from table1 as aleftjoin (select*from dblink('dbname=<database>','select col1, col2, col3 from <table>' ) as d ( col1 text, col2 text, col3 text ) ) as bon a.col1 = b.col2
select pg_get_userbyid(p.proowner) as owner, n.nspname as function_schema, p.proname as function_name, l.lanname as function_language,casewhen l.lanname ='internal'then p.prosrcelse pg_get_functiondef(p.oid)endas definition, pg_get_function_arguments(p.oid) as function_arguments, t.typname as return_typefrom pg_proc pleftjoin pg_namespace n on p.pronamespace = n.oidleftjoin pg_language l on p.prolang = l.oidleftjoin pg_type t on t.oid= p.prorettype where n.nspname notin ('pg_catalog', 'information_schema')and n.nspname ='gsidb'orderby function_schema, function_name;
-- pg_proc contains data for aggregate functions as well as plain functionsselect*from pg_proc-- pg_aggregate is an extension of pg_proc.select*from pg_aggregate
When you refresh data for a materialized view, PostgreSQL locks the entire table therefore you cannot query data against it. To avoid this, you can use the CONCURRENTLY option.
With CONCURRENTLY option, PostgreSQL creates a temporary updated version of the materialized view, compares two versions, and performs INSERT and UPDATE only the differences.
REFRESHMATERIALIZEDVIEW CONCURRENTLY view_name;
Constants
WITH myconstants (analyte_search) as (values ('%Hexachlorocyclopentadiene%'))SELECT*FROM e_analyte, myconstantsWHERE analyte ilike analyte_searchOR full_name ilike analyte_searchOR aliases ilike analyte_search;
These could be refined further by creating a function.
withconst (param) as (values ('%solid%')),dbrows as (select analyte, full_name, chem_class, aliases, cas_rn, current_database() as db from e_analyteunionselect*from dblink('dbname=database1','select analyte, full_name, chem_class, aliases, cas_rn, current_database() as db from e_analyte' ) as d (analyte text, full_name text, chem_class text, aliases text, cas_rn text, db text)unionselect*from dblink('dbname=database2','select analyte, full_name, chem_class, aliases, cas_rn, current_database() as db from e_analyte' ) as d (analyte text, full_name text, chem_class text, aliases text, cas_rn text, db text)unionselect*from dblink('dbname=database3','select analyte, full_name, chem_class, aliases, cas_rn, current_database() as db from e_analyte' ) as d (analyte text, full_name text, chem_class text, aliases text, cas_rn text, db text))select analyte, full_name, chem_class, cas_rn, aliases, count(*) as num_instances, string_agg(db, '; ') as dbfrom dbrows, constwhere analyte ilike paramor full_name ilike paramor aliases ilike paramgroupby analyte, full_name, chem_class, cas_rn, aliasesorderby chem_class, analyte;
Logging
Log slow queries by setting log_min_duration_statement
ALTERdatabase postgres SET log_min_duration_statement ='250ms';
Control which statement types get logged
Control the types of statements that are logged for your database.
ALTERDATABASE postgres SET log_statement ='all';
Valid values include all, ddl, none, mod
Log when waiting on a lock
Log when database is waiting on a lock.
ALTERDATABASE postgres SET log_lock_waits ='on';
Performance
Use statement timeouts to control runaway queries
Setting a statement timeout prevents queries from running longer than the specified time. You can set a statement timeout on the database, user, or session level. We recommend you set a global timeout on Postgres and then override that one specific users or sessions that need a longer allowed time to run.
ALTERDATABASE mydatabase SET statement_timeout ='60s';
Use pg_stat_statements to find the queries and processes that use the most resources
SELECT total_exec_time, mean_exec_time as avg_ms, calls,queryFROM pg_stat_statementsORDERBY mean_exec_time DESCLIMIT10;
Monitor connections in Postgres
This query will provide the number of connection based on type.
Will return the approximate count for a table based on PostgreSQL internal statistics. Useful for large tables where performing a `SELECT count(*)` is costly on performance.
Adding `CONCURRENTLY` during index creation, while not permitted in a transaction, will not hold a lock on the table while creating your index.
CREATEINDEX CONCURRENTLY foobar ON foo (bar);
Replace nulls with other value
Coalesce will use the value and if the value is null display your specified string.
SELECTid, coalesce(ip, 'no IP') FROM logs;
You can supply two columns as well prior to your replacement value and the function will use first not null value.
Import Schema with Mapping a Foreign Data Wrapper (FDW)
Import foreign schema creates foreign tables representing those from the foreign server.
IMPORT FOREIGNSCHEMA"public";
You can IMPORT FOREIGN SCHEMA when mapping a foreign data wrapper to save you from building a new one
Generate data with generate_series
Generates values from the start to the end values supplied based on the interval. Values can be numbers or timestamps. Can be used in a FROM or JOIN clause or CTE. Commonly used when building charts and reports that require all dates to be filled.
Will truncate the date to the specified level of precision. Some example precision levels include: month, week, day, hour, minute.
SELECT date_trunc('day', now());
Perform time math with intervals
You can add or subtract specific amounts of time of a timestamp by casting the value you want as an interval.
SELECT now() -'1 month'::interval;
Make your session rest a bit
This function will make your session sleep for 2.5 seconds. Useful in any testing tool executing a script in a given loop where you want to pause a bit between iterations, as an example.
select pg_sleep(2.5);
PL/pgSQL
Wipe Schema Tables
-- DROP FUNCTION IF EXISTS gsidb.wipe_staging();CREATEORREPLACEFUNCTION gsidb.wipe_staging()RETURNS TABLE(staging_schema text, deleted_tables integer) LANGUAGE 'plpgsql'COST100VOLATILE PARALLEL UNSAFEROWS1000AS $BODY$#variable_conflict use_columnDECLARE staging_schema TEXT; table_name TEXT; deleted_tables INTEGER:=0;BEGINstaging_schema = (select'stg_'||user)::text;FOR table_name IN (SELECT table_name FROM information_schema.tablesWHERE table_schema = staging_schema)LOOPEXECUTE format('DROP TABLE %I.%I CASCADE', staging_schema, table_name );deleted_tables := deleted_tables +1;ENDLOOP;RETURNqueryselect staging_schema, deleted_tables;END;$BODY$;ALTERFUNCTION gsidb.wipe_staging() OWNER TO envdb_dm;
-F Format (c: custom, d: directory, t: tar, p: plain text sql(default))
Use pg_restore -? to get the full list of options
Automatically log query time in psql
Will automatically print the time it took to run a query from within psql. *Of note this is the round trip time not simply query execution time.*
\timing
You can save this in your .psqlrc to be a default setting
Autoformat query results in psql
Will automatically reorganize the query output based on your terminal window for better readability.
\x auto
You can save this in your .psqlrc to be a default setting
Edit your psql queries in editor of your choice
Will automatically open your last run query in your default $EDITOR. When you save and close it will execute that query.
\e
Set a value for nulls
Will render the nulls as whatever character you specify. Handy for easier parsing of nulls vs. blank text.
\pset null 👻
You can save this in your .psqlrc to be a default setting
Save your query history per database locally
Will automatically save a history file for each **DBNAME**.
\set HISTFILE ~/.psql_history- :DBNAME
You can save this in your .psqlrc to be a default setting
Show queries issued by internal psql commands
Add “-E” (or –echo-hidden) option to psql in the command line. This option will display queries that internal psql commands generate (like “\dt mytable”). This is a cool way to learn more about system catalogs, or reuse queries issued by psql in your own tool.
psql -E
Get data back, and only the data
Add “-qtA” options to psql in the command line. Those options will have psql run in quiet mode (“-q”), return tuples only (“-t”) in an unaligned fashion (“-A”). Combined with “-c” option to send a single query, it can be useful for your scripts if you want the data and only that back from Postgres. Returns one line per row.
psql -qtA
Get results as an HTML table
Add “-qtH” options to psql in the command line. Those options will have psql run in quiet mode (“-q”), return tuples only (“-t”) in an HTML table (“-H”). Combined with “-c” option to send a single query, can be a fast way to embed the result of a query in an HTML page.
psql -qtH
Clear your psql screen
Will clear your screen in current psql session
\! clear
Continually run a query with watch
Will automatically run the last query every 2 seconds and display the output. You can also specify the query that will run after watch as well.
\watch
Rollback to previous statement on error when in interactive mode
When you encounter an error when in interactive mode this will automatically rollback to just before the previous command, allowing you to continue working as you would expect.
\set ON_ERROR_ROLLBACK interactive
Export a CSV from directly in psql
When providing the --csv value with a query, this command will run the specific query and return CSV to STDOUT.
psql <connection-string>--csv -c 'select * from test;'
Run a query from a file in psql
Will execute the specified file when inside psql.
\i filename
Provide clean border within psql
Will give you a border around the output of your queries when in psql
\pset border 2
You can save this in your .psqlrc to be a default setting
Set linestyle to unicode
Changes the linestyle to unicode, which when combined with above tip leads to much cleaner formatting
\pset linestyle unicode
You can save this in your .psqlrc to be a default setting