Skip to main content

TPCH-Klustron vs Greenplum

KlustronAbout 10 min

TPCH-Klustron vs Greenplum

Test Overview

Using the TPC-H benchmarking tool, performance evaluations were conducted for Klustron v1.3.1 and Greenplum Database v7.0.0. The results indicate that Klustron delivers an 8x performance increase over Greenplum at a data volume of 100GB, and a 3.5x increase at 200GB、 a 1.5x increase at 800GB.

Test Environment

Server Specifications

The test involved three servers, with IP addresses 192.168.0.17, 192.168.0.18, and 192.168.0.19, configured as follows:

  • Operating System: CentOS Linux release 8.5.2111
  • CPU: AMD Ryzen 9 7950X 16-Core Processor, with 2 threads per core, totaling 32 cores
  • Memory: 128GB
  • Storage: 1.8TB nvme SSD
  • Network: 10Gbps Ethernet

Software Versions

Software NameSoftware Version
Klustronv1.3.1
Greenplumv7.0.0
TPC-Hv3.0.0

Cluster Environment

Klustron

Node TypeIPPort
Compute Node192.168.0.1923001
Storage-shard1-Primary192.168.0.1733501
Storage-shard2-Primary192.168.0.1833503
Storage-shard3-Primary192.168.0.1933505
Cluster_mgr192.168.0.17/18/1923501
Metedata_cluster192.168.0.17/18/1923301
XPanel192.168.0.1810024

Cluster Description:

  • Compute Node: One single compute node is deployed.
  • Storage Nodes: There are three shards, each with a single primary, distributed across the three machines.
  • Management Node and Metadata Cluster: The management cluster consists of three machines, set up with one primary and two replicas.

Greenplum

Node TypeIPPort
Primary192.168.0.1754321
Segment1192.168.0.186000
Segment2192.168.0.196000

Configuration Parameters

Klustron

Compute Node:

alter system set statement_timeout=6000000;
alter system set mysql_read_timeout=1200;
alter system set mysql_write_timeout=1200;
alter system set lock_timeout=1200000;
alter system set log_min_duration_statement=1200000;
alter system set effective_cache_size = '48GB';
alter system set work_mem  = '5GB';
alter system set wal_buffers='64MB';
alter system set extension.aggregate_ignore_work_mem=false;
alter system set enable_nestloop=false;
alter system set extension.max_custom_indexscan_parallelism=16;
alter system set extension.custom_hashjoin_parallelism=32;

Note: Restart the compute node after adjustments.

Storage Nodes:

mysql -h 192.168.0.17 -P 33501 -upgx -ppgx_pwd
mysql -h 192.168.0.18 -P 33503 -upgx -ppgx_pwd
mysql -h 192.168.0.19 -P 33505 -upgx -ppgx_pwd

Set the following parameters:

set persist innodb_buffer_pool_size=64*1024*1024*1024;
set persist lock_wait_timeout=1200;
set persist innodb_lock_wait_timeout=1200;    
set persist fullsync_timeout=1200000; 
set persist enable_fullsync=false;
set persist innodb_flush_log_at_trx_commit=2;
set persist sync_binlog=0;
set persist max_binlog_size=1*1024*1024*1024;

Greenplum

Execute the following on the Primary node:

gpconfig -c shared_buffers -v 48GB
gpconfig -c statement_timeout -v 6000000
gpconfig -c lock_timeout -v 1200000
gpconfig -c log_min_duration_statement -v 1200000
gpconfig -c effective_cache_size -v 48GB
gpconfig -c work_mem -v 5GB
gpconfig -c wal_buffers -v 64MB
gpconfig -c autovacuum -v false
gpstop -M fast
gpstart

Test Results

TPC-H-100G

For a data volume of 100GB, the test results shown below represent the best of two runs, in terms of shorter duration. All times are in seconds.

QUERY IDKlustron v1.3.1Greenplum v7.0.0
Q14.48248.16
Q28.0523.57
Q310.09100.23
Q45.06156.42
Q522.33100.03
Q62.9536.49
Q712.3662.98
Q831.3663.17
Q929.43129.88
Q1010.6173.09
Q112.66.84
Q126.9971.68
Q1315.7373.93
Q144.0439.68
Q157.5796.99
Q163.7620.25
Q1715.41429.69
Q1839.05320.23
Q198.3744.28
Q2013.5775.38
Q2147.8203.55
Q226.2963.47
总耗时307.92439.98

TPC-H-200G

For a data volume of 200GB, the test results shown below represent the best of two runs, in terms of shorter duration. All times are in seconds.

QUERY IDKlustron v1.3.1Greenplum v7.0.0
Q144.39515.72
Q226.9555.25
Q355.93205.8
Q456.83339.53
Q5114.12220.61
Q645.4783.78
Q770.03156.95
Q8112.17201.17
Q9111.07370.88
Q1063.72177.11
Q1114.9919.48
Q1264.27159.33
Q1333.62144.22
Q1453.7998.42
Q15102.97216.4
Q1617.4947.68
Q1796.64911.3
Q18177.66675.87
Q1952.4999.26
Q2078.37163.19
Q21218.36494.77
Q2220.24133.21
总耗时1631.575489.93

TPC-H-500G

For a data volume of 200GB, the test results shown below represent the best of two runs, in terms of shorter duration. All times are in seconds.

QUERY IDKlustron v1.3.1Greenplum v7.0.0
Q188.79515.72
Q2167.7155.25
Q3424.11205.8
Q4162.17339.53
Q5465.63220.61
Q6132.4183.78
Q7209.25156.95
Q8467.98201.17
Q9717.72370.88
Q10266.68177.11
Q11135.1419.48
Q12128.81159.33
Q13178.35144.22
Q14174.2398.42
Q15330.5216.4
Q16641.4247.68
Q17445.27911.3
Q184191.07675.87
Q19163.699.26
Q20610.63163.19
Q21494.77
Q2295.79133.21
总耗时10197.2616281.12

Table Creation Statements

Klustron

create TABLEGROUP tpch10 partitions 3 (
        partition tpch10_auto_p0 with(hash=0,shard=1),
        partition tpch10_auto_p1 with(hash=1,shard=2),
        partition tpch10_auto_p2 with(hash=2,shard=3)
);

CREATE TABLE NATION  ( N_NATIONKEY  INTEGER NOT NULL,
                            N_NAME       CHAR(25) NOT NULL,
                            N_REGIONKEY  INTEGER NOT NULL,
                            N_COMMENT    VARCHAR(152),
			    PRIMARY KEY (n_nationkey)) WITH(SHARD=all);


CREATE TABLE REGION  ( R_REGIONKEY  INTEGER NOT NULL,
                            R_NAME       CHAR(25) NOT NULL,
                            R_COMMENT    VARCHAR(152),
			    PRIMARY KEY (r_regionkey)) WITH (SHARD=all);

CREATE TABLE PART  ( P_PARTKEY     INTEGER NOT NULL,
                          P_NAME        VARCHAR(55) NOT NULL,
                          P_MFGR        CHAR(25) NOT NULL,
                          P_BRAND       CHAR(10) NOT NULL,
                          P_TYPE        VARCHAR(25) NOT NULL,
                          P_SIZE        INTEGER NOT NULL,
                          P_CONTAINER   CHAR(10) NOT NULL,
                          P_RETAILPRICE double NOT NULL,
                          P_COMMENT     VARCHAR(23) NOT NULL,
			  PRIMARY KEY (p_partkey)) PARTITION BY HASH (p_partkey) partitions 3 TABLEGROUP tpch10;



CREATE TABLE SUPPLIER ( S_SUPPKEY     INTEGER NOT NULL,
                             S_NAME        CHAR(25) NOT NULL,
                             S_ADDRESS     VARCHAR(40) NOT NULL,
                             S_NATIONKEY   INTEGER NOT NULL,
                             S_PHONE       CHAR(15) NOT NULL,
                             S_ACCTBAL     double NOT NULL,
                             S_COMMENT     VARCHAR(101) NOT NULL,
			     PRIMARY KEY (s_suppkey)) PARTITION BY HASH (s_suppkey) partitions 3 TABLEGROUP tpch10;




CREATE TABLE PARTSUPP ( PS_PARTKEY     INTEGER NOT NULL,
                             PS_SUPPKEY     INTEGER NOT NULL,
                             PS_AVAILQTY    INTEGER NOT NULL,
                             PS_SUPPLYCOST  double  NOT NULL,
                             PS_COMMENT     VARCHAR(199) NOT NULL,
			     PRIMARY KEY (ps_partkey, ps_suppkey)) PARTITION BY HASH (ps_partkey) partitions 3 TABLEGROUP tpch10;




CREATE TABLE CUSTOMER ( C_CUSTKEY     INTEGER NOT NULL,
                             C_NAME        VARCHAR(25) NOT NULL,
                             C_ADDRESS     VARCHAR(40) NOT NULL,
                             C_NATIONKEY   INTEGER NOT NULL,
                             C_PHONE       CHAR(15) NOT NULL,
                             C_ACCTBAL     double NOT NULL,
                             C_MKTSEGMENT  CHAR(10) NOT NULL,
                             C_COMMENT     VARCHAR(117) NOT NULL,
			     PRIMARY KEY(c_custkey)) PARTITION BY HASH (C_CUSTKEY) partitions 3 TABLEGROUP tpch10;


CREATE TABLE ORDERS  ( O_ORDERKEY       INTEGER NOT NULL,
                           O_CUSTKEY        INTEGER NOT NULL,
                           O_ORDERSTATUS    CHAR(1) NOT NULL,
                           O_TOTALPRICE     double NOT NULL,
                           O_ORDERDATE      DATE NOT NULL,
                           O_ORDERPRIORITY  CHAR(15) NOT NULL,  
                           O_CLERK          CHAR(15) NOT NULL, 
                           O_SHIPPRIORITY   INTEGER NOT NULL,
                           O_COMMENT        VARCHAR(79) NOT NULL,
			   PRIMARY KEY(o_orderkey)) PARTITION BY HASH (o_orderkey) partitions 3 TABLEGROUP tpch10;



CREATE TABLE LINEITEM ( L_ORDERKEY    INTEGER NOT NULL,
                             L_PARTKEY     INTEGER NOT NULL,
                             L_SUPPKEY     INTEGER NOT NULL,
                             L_LINENUMBER  INTEGER NOT NULL,
                             L_QUANTITY    double NOT NULL,
                             L_EXTENDEDPRICE  double NOT NULL,
                             L_DISCOUNT    double NOT NULL,
                             L_TAX         double NOT NULL,
                             L_RETURNFLAG  CHAR(1) NOT NULL,
                             L_LINESTATUS  CHAR(1) NOT NULL,
                             L_SHIPDATE    DATE NOT NULL,
                             L_COMMITDATE  DATE NOT NULL,
                             L_RECEIPTDATE DATE NOT NULL,
                             L_SHIPINSTRUCT CHAR(25) NOT NULL,
                             L_SHIPMODE     CHAR(10) NOT NULL,
                             L_COMMENT      VARCHAR(44) NOT NULL,
			     PRIMARY KEY(l_orderkey, l_linenumber)) PARTITION BY HASH (l_orderkey) partitions 3 TABLEGROUP tpch10;

Grennplum

create table nation  ( n_nationkey  integer not null,
                            n_name       char(25) not null,
                            n_regionkey  integer not null,
                            n_comment    varchar(152),
			    primary key (n_nationkey)) ;

create table region  ( r_regionkey  integer not null,
                            r_name       char(25) not null,
                            r_comment    varchar(152),
			    primary key (r_regionkey)) ;

create table part  ( p_partkey     integer not null,
                          p_name        varchar(55) not null,
                          p_mfgr        char(25) not null,
                          p_brand       char(10) not null,
                          p_type        varchar(25) not null,
                          p_size        integer not null,
                          p_container   char(10) not null,
                          p_retailprice decimal(15,2) not null,
                          p_comment     varchar(23) not null,
			  primary key (p_partkey));

create table supplier ( s_suppkey     integer not null,
                             s_name        char(25) not null,
                             s_address     varchar(40) not null,
                             s_nationkey   integer not null,
                             s_phone       char(15) not null,
                             s_acctbal     decimal(15,2) not null,
                             s_comment     varchar(101) not null,
			     primary key (s_suppkey));

create table partsupp ( ps_partkey     integer not null,
                             ps_suppkey     integer not null,
                             ps_availqty    integer not null,
                             ps_supplycost  decimal(15,2)  not null,
                             ps_comment     varchar(199) not null,
			     primary key (ps_partkey, ps_suppkey));

create table customer ( c_custkey     integer not null,
                             c_name        varchar(25) not null,
                             c_address     varchar(40) not null,
                             c_nationkey   integer not null,
                             c_phone       char(15) not null,
                             c_acctbal     decimal(15,2) not null,
                             c_mktsegment  char(10) not null,
                             c_comment     varchar(117) not null,
			     primary key(c_custkey));

create table orders  ( o_orderkey       integer not null,
                           o_custkey        integer not null,
                           o_orderstatus    char(1) not null,
                           o_totalprice     decimal(15,2) not null,
                           o_orderdate      date not null,
                           o_orderpriority  char(15) not null,  
                           o_clerk          char(15) not null, 
                           o_shippriority   integer not null,
                           o_comment        varchar(79) not null,
			   primary key(o_orderkey));

create table lineitem ( l_orderkey    integer not null,
                             l_partkey     integer not null,
                             l_suppkey     integer not null,
                             l_linenumber  integer not null,
                             l_quantity    decimal(15,2) not null,
                             l_extendedprice  decimal(15,2) not null,
                             l_discount    decimal(15,2) not null,
                             l_tax         decimal(15,2) not null,
                             l_returnflag  char(1) not null,
                             l_linestatus  char(1) not null,
                             l_shipdate    date not null,
                             l_commitdate  date not null,
                             l_receiptdate date not null,
                             l_shipinstruct char(25) not null,
                             l_shipmode     char(10) not null,
                             l_comment      varchar(44) not null,
			     primary key(l_orderkey, l_linenumber));

Collecting Statistical Data

Use the following statements to collect statistical data:

Both Klustron and Greenplum execute the identical commands:

analyze nation;
analyze region;
analyze part;
analyze supplier;
analyze partsupp;
analyze customer;
analyze orders;
analyze lineitem;
analyze lineitem(l_shipmode,l_receiptdate,l_returnflag,l_linestatus,l_orderkey,l_suppkey,l_discount,l_quantity,l_partkey);
analyze supplier(s_suppkey,s_nationkey);
analyze part(p_size,p_type,p_partkey);
analyze partsupp(ps_suppkey,ps_supplycost);
analyze nation(n_regionkey,n_nationkey,n_name);
analyze region(r_name,r_regionkey);
analyze customer(c_mktsegment,c_custkey,c_nationkey,c_name);
analyze orders(o_custkey,o_orderkey,o_orderdate,o_orderpriority,o_totalprice);

Query Statements

#Q1:
select
	l_returnflag,
	l_linestatus,
	sum(l_quantity) as sum_qty,
	sum(l_extendedprice) as sum_base_price,
	sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
	sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
	avg(l_quantity) as avg_qty,
	avg(l_extendedprice) as avg_price,
	avg(l_discount) as avg_disc,
	count(*) as count_order
from
	lineitem
where
	l_shipdate <= date '1998-12-01' - interval '91' day
group by
	l_returnflag,
	l_linestatus
order by
	l_returnflag,
	l_linestatus
LIMIT 1;
 
 
 
#Q2:
select
	s_acctbal,
	s_name,
	n_name,
	p_partkey,
	p_mfgr,
	s_address,
	s_phone,
	s_comment
from
	part,
	supplier,
	partsupp,
	nation,
	region
where
	p_partkey = ps_partkey
	and s_suppkey = ps_suppkey
	and p_size = 6
	and p_type like '%BRASS'
	and s_nationkey = n_nationkey
	and n_regionkey = r_regionkey
	and r_name = 'ASIA'
	and ps_supplycost = (
		select
			min(ps_supplycost)
		from
			partsupp,
			supplier,
			nation,
			region
		where
			p_partkey = ps_partkey
			and s_suppkey = ps_suppkey
			and s_nationkey = n_nationkey
			and n_regionkey = r_regionkey
			and r_name = 'ASIA'
	)
order by
	s_acctbal desc,
	n_name,
	s_name,
	p_partkey
LIMIT 100;
 
 
 
#Q3:
select
	l_orderkey,
	sum(l_extendedprice * (1 - l_discount)) as revenue,
	o_orderdate,
	o_shippriority
from
	customer,
	orders,
	lineitem
where
	c_mktsegment = 'BUILDING'
	and c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and o_orderdate < date '1995-03-17'
	and l_shipdate > date '1995-03-17'
group by
	l_orderkey,
	o_orderdate,
	o_shippriority
order by
	revenue desc,
	o_orderdate
LIMIT 10; 
 
 
#Q4:
select
	o_orderpriority,
	count(*) as order_count
from
	orders
where
	o_orderdate >= date '1995-07-01'
	and o_orderdate < date '1995-07-01' + interval '3' month
	and exists (
		select
			*
		from
			lineitem
		where
			l_orderkey = o_orderkey
			and l_commitdate < l_receiptdate
	)
group by
	o_orderpriority
order by
	o_orderpriority
LIMIT 1;
 
 
 
#Q5:
select
	n_name,
	sum(l_extendedprice * (1 - l_discount)) as revenue
from
	customer,
	orders,
	lineitem,
	supplier,
	nation,
	region
where
	c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and l_suppkey = s_suppkey
	and c_nationkey = s_nationkey
	and s_nationkey = n_nationkey
	and n_regionkey = r_regionkey
	and r_name = 'AFRICA'
	and o_orderdate >= date '1997-01-01'
	and o_orderdate < date '1997-01-01' + interval '1' year
group by
	n_name
order by
	revenue desc
LIMIT 1;
 
 
 
#Q6:
select
	sum(l_extendedprice * l_discount) as revenue
from
	lineitem
where
	l_shipdate >= date '1997-01-01'
	and l_shipdate < date '1997-01-01' + interval '1' year
	and l_discount between 0.05 - 0.01 and 0.05 + 0.01
	and l_quantity < 25
LIMIT 1;
 
 
 
#Q7:
select
	supp_nation,
	cust_nation,
	l_year,
	sum(volume) as revenue
from
	(
		select
			n1.n_name as supp_nation,
			n2.n_name as cust_nation,
			extract(year from l_shipdate) as l_year,
			l_extendedprice * (1 - l_discount) as volume
		from
			supplier,
			lineitem,
			orders,
			customer,
			nation n1,
			nation n2
		where
			s_suppkey = l_suppkey
			and o_orderkey = l_orderkey
			and c_custkey = o_custkey
			and s_nationkey = n1.n_nationkey
			and c_nationkey = n2.n_nationkey
			and (
				(n1.n_name = 'IRAN' and n2.n_name = 'PERU')
				or (n1.n_name = 'PERU' and n2.n_name = 'IRAN')
			)
			and l_shipdate between date '1995-01-01' and date '1996-12-31'
	) as shipping
group by
	supp_nation,
	cust_nation,
	l_year
order by
	supp_nation,
	cust_nation,
	l_year
LIMIT 1; 
 
 
#Q8:
select
	o_year,
	sum(case
		when nation = 'PERU' then volume
		else 0
	end) / sum(volume) as mkt_share
from
	(
		select
			extract(year from o_orderdate) as o_year,
			l_extendedprice * (1 - l_discount) as volume,
			n2.n_name as nation
		from
			part,
			supplier,
			lineitem,
			orders,
			customer,
			nation n1,
			nation n2,
			region
		where
			p_partkey = l_partkey
			and s_suppkey = l_suppkey
			and l_orderkey = o_orderkey
			and o_custkey = c_custkey
			and c_nationkey = n1.n_nationkey
			and n1.n_regionkey = r_regionkey
			and r_name = 'AMERICA'
			and s_nationkey = n2.n_nationkey
			and o_orderdate between date '1995-01-01' and date '1996-12-31'
			and p_type = 'ECONOMY ANODIZED COPPER'
	) as all_nations
group by
	o_year
order by
	o_year
LIMIT 1; 
 
 
#Q9:
select
	nation,
	o_year,
	sum(amount) as sum_profit
from
	(
		select
			n_name as nation,
			extract(year from o_orderdate) as o_year,
			l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
		from
			part,
			supplier,
			lineitem,
			partsupp,
			orders,
			nation
		where
			s_suppkey = l_suppkey
			and ps_suppkey = l_suppkey
			and ps_partkey = l_partkey
			and p_partkey = l_partkey
			and o_orderkey = l_orderkey
			and s_nationkey = n_nationkey
			and p_name like '%pink%'
	) as profit
group by
	nation,
	o_year
order by
	nation,
	o_year desc
LIMIT 1;
 
 
 
#Q10:
select
	c_custkey,
	c_name,
	sum(l_extendedprice * (1 - l_discount)) as revenue,
	c_acctbal,
	n_name,
	c_address,
	c_phone,
	c_comment
from
	customer,
	orders,
	lineitem,
	nation
where
	c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and o_orderdate >= date '1993-05-01'
	and o_orderdate < date '1993-05-01' + interval '3' month
	and l_returnflag = 'R'
	and c_nationkey = n_nationkey
group by
	c_custkey,
	c_name,
	c_acctbal,
	c_phone,
	n_name,
	c_address,
	c_comment
order by
	revenue desc
LIMIT 20; 
 
 
#Q11:
select
	ps_partkey,
	sum(ps_supplycost * ps_availqty) as value
from
	partsupp,
	supplier,
	nation
where
	ps_suppkey = s_suppkey
	and s_nationkey = n_nationkey
	and n_name = 'MOZAMBIQUE'
group by
	ps_partkey having
		sum(ps_supplycost * ps_availqty) > (
			select
				sum(ps_supplycost * ps_availqty) * 0.0001000000
			from
				partsupp,
				supplier,
				nation
			where
				ps_suppkey = s_suppkey
				and s_nationkey = n_nationkey
				and n_name = 'MOZAMBIQUE'
		)
order by
	value desc
LIMIT 1; 
 
 
#Q12:
select
	l_shipmode,
	sum(case
		when o_orderpriority = '1-URGENT'
			or o_orderpriority = '2-HIGH'
			then 1
		else 0
	end) as high_line_count,
	sum(case
		when o_orderpriority <> '1-URGENT'
			and o_orderpriority <> '2-HIGH'
			then 1
		else 0
	end) as low_line_count
from
	orders,
	lineitem
where
	o_orderkey = l_orderkey
	and (l_shipmode = 'MAIL' or l_shipmode = 'FOB')
	and l_commitdate < l_receiptdate
	and l_shipdate < l_commitdate
	and l_receiptdate >= date '1994-01-01'
	and l_receiptdate < date '1994-01-01' + interval '1' year
group by
	l_shipmode
order by
	l_shipmode
LIMIT 1;
 
 
 
#Q13:
select
	c_count,
	count(*) as custdist
from
	(
		select
			c_custkey,
			count(o_orderkey)
		from
			customer left outer join orders on
				c_custkey = o_custkey
				and o_comment not like '%special%requests%'
		group by
			c_custkey
	) as c_orders (c_custkey, c_count)
group by
	c_count
order by
	custdist desc,
	c_count desc
LIMIT 1; 
 
 
#Q14:
select
	100.00 * sum(case
		when p_type like 'PROMO%'
			then l_extendedprice * (1 - l_discount)
		else 0
	end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
	lineitem,
	part
where
	l_partkey = p_partkey
	and l_shipdate >= date '1994-10-01'
	and l_shipdate < date '1994-10-01' + interval '1' month
LIMIT 1; 
 
 
#Q15:
create view revenue0 (supplier_no, total_revenue) as
	select
		l_suppkey,
		sum(l_extendedprice * (1 - l_discount))
	from
		lineitem
	where
		l_shipdate >= date '1994-04-01'
		and l_shipdate < date '1994-04-01' + interval '3' month
	group by
		l_suppkey;


select
	s_suppkey,
	s_name,
	s_address,
	s_phone,
	total_revenue
from
	supplier,
	revenue0
where
	s_suppkey = supplier_no
	and total_revenue = (
		select
			max(total_revenue)
		from
			revenue0
	)
order by
	s_suppkey
LIMIT 1;

drop view revenue0; 
 
 
#Q16:
select
	p_brand,
	p_type,
	p_size,
	count(distinct ps_suppkey) as supplier_cnt
from
	partsupp,
	part
where
	p_partkey = ps_partkey
	and p_brand <> 'Brand#11'
	and p_type not like 'STANDARD BRUSHED%'
	and p_size in (9, 21, 24, 30, 36, 50, 27, 39)
	and ps_suppkey not in (
		select
			s_suppkey
		from
			supplier
		where
			s_comment like '%Customer%Complaints%'
	)
group by
	p_brand,
	p_type,
	p_size
order by
	supplier_cnt desc,
	p_brand,
	p_type,
	p_size
LIMIT 1; 
 
 
#Q17:
select
	sum(l_extendedprice) / 7.0 as avg_yearly
from
	lineitem,
	part,
	(SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_agg
where
	p_partkey = l_partkey
	and agg_partkey = l_partkey
	and p_brand = 'Brand#11'
	and p_container = 'SM CAN'
	and l_quantity < avg_quantity
LIMIT 1;
 
 
 
#Q18:
select
	c_name,
	c_custkey,
	o_orderkey,
	o_orderdate,
	o_totalprice,
	sum(l_quantity)
from
	customer,
	orders,
	lineitem
where
	o_orderkey in (
		select
			l_orderkey
		from
			lineitem
		group by
			l_orderkey having
				sum(l_quantity) > 312
	)
	and c_custkey = o_custkey
	and o_orderkey = l_orderkey
group by
	c_name,
	c_custkey,
	o_orderkey,
	o_orderdate,
	o_totalprice
order by
	o_totalprice desc,
	o_orderdate
LIMIT 100; 
 
 
#Q19:
select
	sum(l_extendedprice* (1 - l_discount)) as revenue
from
	lineitem,
	part
where
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#12'
		and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
		and l_quantity >= 5 and l_quantity <= 5 + 10
		and p_size between 1 and 5
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#33'
		and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
		and l_quantity >= 18 and l_quantity <= 18 + 10
		and p_size between 1 and 10
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#45'
		and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
		and l_quantity >= 20 and l_quantity <= 20 + 10
		and p_size between 1 and 15
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)
LIMIT 1; 
 
 
#Q20:
select
	s_name,
	s_address
from
	supplier,
	nation
where
	s_suppkey in (
		select
			ps_suppkey
		from
			partsupp,
			(
				select
					l_partkey agg_partkey,
					l_suppkey agg_suppkey,
					0.5 * sum(l_quantity) AS agg_quantity
				from
					lineitem
				where
					l_shipdate >= date '1995-01-01'
					and l_shipdate < date '1995-01-01' + interval '1' year
				group by
					l_partkey,
					l_suppkey
			) agg_lineitem
		where
			agg_partkey = ps_partkey
			and agg_suppkey = ps_suppkey
			and ps_partkey in (
				select
					p_partkey
				from
					part
				where
					p_name like 'hot%'
			)
			and ps_availqty > agg_quantity
	)
	and s_nationkey = n_nationkey
	and n_name = 'KENYA'
order by
	s_name
LIMIT 1;
 
 
 
#Q21:
select
	s_name,
	count(*) as numwait
from
	supplier,
	lineitem l1,
	orders,
	nation
where
	s_suppkey = l1.l_suppkey
	and o_orderkey = l1.l_orderkey
	and o_orderstatus = 'F'
	and l1.l_receiptdate > l1.l_commitdate
	and exists (
		select
			*
		from
			lineitem l2
		where
			l2.l_orderkey = l1.l_orderkey
			and l2.l_suppkey <> l1.l_suppkey
	)
	and not exists (
		select
			*
		from
			lineitem l3
		where
			l3.l_orderkey = l1.l_orderkey
			and l3.l_suppkey <> l1.l_suppkey
			and l3.l_receiptdate > l3.l_commitdate
	)
	and s_nationkey = n_nationkey
	and n_name = 'IRAQ'
group by
	s_name
order by
	numwait desc,
	s_name
LIMIT 100; 
 
 
#Q22:
select
	cntrycode,
	count(*) as numcust,
	sum(c_acctbal) as totacctbal
from
	(
		select
			substring(c_phone from 1 for 2) as cntrycode,
			c_acctbal
		from
			customer
		where
			substring(c_phone from 1 for 2) in
				('24', '27', '34', '23', '29', '15', '13')
			and c_acctbal > (
				select
					avg(c_acctbal)
				from
					customer
				where
					c_acctbal > 0.00
					and substring(c_phone from 1 for 2) in
						('24', '27', '34', '23', '29', '15', '13')
			)
			and not exists (
				select
					*
				from
					orders
				where
					o_custkey = c_custkey
			)
	) as custsale
group by
	cntrycode
order by
	cntrycode
LIMIT 1;