How to Emulate PERCENTILE_DISC in MySQL and Other RDBMS

Spread the love
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  

In my previous article, I showed what the very useful percentile functions (also known as inverse distribution functions) can be used for.
Unfortunately, these functions are not ubiquitously available in SQL dialects. As of jOOQ 3.11, they are known to work in these dialects:
Dialect
As aggregate function
As window function
MariaDB 10.3.3
No
Yes
Oracle 18c
Yes
Yes
PostgreSQL 11
Yes
No
SQL Server 2017
No
Yes
Teradata 16
Yes
No
Oracle has the most sophisticated implementation, which supports both the ordered set aggregate function, and the window function version:
Aggregate function: PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x)
Window function: PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x) OVER (PARTITION BY y)
Workarounds if the feature is unavailable
Luckily, as soon as an RDBMS supports window functions, we can easily emulate PERCENTILE_DISC using PERCENT_RANK and FIRST_VALUE as follows. We’re using the Sakila database in this example.
Emulating window functions
Let’s emulate these first, as it requires a bit less SQL transformations. This query works out of the box in Oracle:

SELECT DISTINCT
rating,
percentile_disc(0.5)
WITHIN GROUP (ORDER BY length)
OVER() x1,
percentile_disc(0.5)
WITHIN GROUP (ORDER BY length)
OVER (PARTITION BY rating) x2
FROM film
ORDER BY rating;

Yielding

RATING X1 X2
——————-
G 114 107
NC-17 114 112
PG 114 113
PG-13 114 125
R 114 115

What we can read from this is that the median length of all films is 114 minutes, and the median lengths of films per rating range from 107 minutes to 125 minutes. I’ve used DISTINCT because we don’t care about visualising these values on a per-row basis in this case. This also works in SQL Server.
Now, let’s assume we’re using PostgreSQL, which doesn’t support inverse distribution window functions, or MySQL, which doesn’t support inverse distribution functions at all, but both support PERCENT_RANK and FIRST_VALUE. Here’s the complete query:

SELECT DISTINCT
rating,
first_value(length) OVER (
ORDER BY CASE WHEN p1

X ITM Cloud News

Emily

Next Post

How to Calculate a Cumulative Percentage in SQL

Sun Nov 24 , 2019
Spread the love          A fun report to write is to calculate a cumulative percentage. For example, when querying the Sakila database, we might want to calculate the percentage of our total revenue at any given date. The result might look like this: Notice the beautifully generated data. Or as raw data: […]
X- ITM

Cloud Computing – Consultancy – Development – Hosting – APIs – Legacy Systems

X-ITM Technology helps our customers across the entire enterprise technology stack with differentiated industry solutions. We modernize IT, optimize data architectures, and make everything secure, scalable and orchestrated across public, private and hybrid clouds.

This image has an empty alt attribute; its file name is x-itmdc.jpg

The enterprise technology stack includes ITO; Cloud and Security Services; Applications and Industry IP; Data, Analytics and Engineering Services; and Advisory.

Watch an animation of  X-ITM‘s Enterprise Technology Stack

We combine years of experience running mission-critical systems with the latest digital innovations to deliver better business outcomes and new levels of performance, competitiveness and experiences for our customers and their stakeholders.

X-ITM invests in three key drivers of growth: People, Customers and Operational Execution.

The company’s global scale, talent and innovation platforms serve 6,000 private and public-sector clients in 70 countries.

X-ITM’s extensive partner network helps drive collaboration and leverage technology independence. The company has established more than 200 industry-leading global Partner Network relationships, including 15 strategic partners: Amazon Web Services, AT&T, Dell Technologies, Google Cloud, HCL, HP, HPE, IBM, Micro Focus, Microsoft, Oracle, PwC, SAP, ServiceNow and VMware

.

X ITM