Merge remote-tracking branch 'origin/master' into new_cartodbfy

Conflicts:
	test/CDB_QuotaTest.sql
Rafa de la Torre
2015-08-17 15:27:33 +02:00
45 changed files with 869 additions and 57 deletions


@@ -1,7 +1,7 @@
# cartodb/Makefile
EXTENSION = cartodb
-EXTVERSION = 0.7.3
+EXTVERSION = 0.8.3
SED = sed
@@ -37,6 +37,11 @@ UPGRADABLE = \
0.7.0 \
0.7.1 \
0.7.2 \
0.7.3 \
0.7.4 \
0.8.0 \
0.8.1 \
0.8.2 \
$(EXTVERSION)dev \
$(EXTVERSION)next \
$(END)

26
NEWS.md

@@ -1,3 +1,29 @@
0.8.3 (2015-mm-dd)
------------------
* Fixes CDB_UserDataSize failing due to `ERROR: relation "*" does not exist.` [#108](https://github.com/CartoDB/cartodb-postgresql/issues/108)
0.8.2 (2015-07-27)
------------------
* Fix for CDB_UserTables returning wrong listings when publicuser is used
0.8.1 (2015-06-30)
------------------
* Fix for [#95](https://github.com/CartoDB/cartodb-postgresql/issues/95) *cdb_usertables should return public tables when the user is publicuser*
0.8.0 (2015-06-30)
------------------
* Adds new function CDB_QueryTablesText that can deal with "schema.table_name"
longer than 63 chars.
* Adds a set of statistical functions:
- CDB_DistType
- CDB_DistinctMeasure
- CDB_EqualIntervalBins
* Fix for CDB_UserTables returns 0 entries for multiuser accounts [#64](https://github.com/CartoDB/cartodb-postgresql/issues/64)
0.7.4 (2015-06-29)
------------------
Dummy transitional version.
0.7.3 (2015-03-03)
------------------
* Fix upgrade of CDB_StringToDate function


@@ -11,7 +11,7 @@ See https://github.com/CartoDB/cartodb/wiki/CartoDB-PostgreSQL-extension
Dependencies
------------
-* PostgreSQL 9.3+ (with plpythonu extension)
+* PostgreSQL 9.3+ (with plpythonu extension and xml support)
* [PostGIS extension](http://postgis.net)
* [Schema triggers extension]
(https://bitbucket.org/malloclabs/pg_schema_triggers)
@@ -37,6 +37,8 @@ NOTE: if ``test_ddl_triggers`` fails it's likely due to an incomplete
NOTE: you need to run the installcheck as a superuser, use PGUSER
env variable if needed, like: PGUSER=postgres make installcheck
NOTE: the tests need to run against a **clean postgres instance**; if some roles already exist, the tests will likely fail due to `publicuser` not being dropped.
Enable database
---------------

14
doc/CDB_ColumnNames.md Normal file

@@ -0,0 +1,14 @@
Retrieve all column names in a particular table
#### Using the function
```sql
SELECT CDB_ColumnNames('table_name')
--- Returns a set of rows with column names
```
#### Arguments
CDB_ColumnNames(table_name)
* **table_name** text

15
doc/CDB_ColumnType.md Normal file

@@ -0,0 +1,15 @@
Returns a column type for any column in a table
#### Using the function
```sql
SELECT CDB_ColumnType('column_name','table_name')
--- Returns a set of rows with column types
```
#### Arguments
CDB_ColumnType(column_name, table_name)
* **column_name** text
* **table_name** text

21
doc/CDB_HeadsTailsBins.md Normal file

@@ -0,0 +1,21 @@
Find the breaks for N categories in a numerical column based on the [Heads/Tails optimization](http://arxiv.org/pdf/1209.2801v1.pdf). Below, Heads/Tails is used to color polygons based on their area.
![headtails](https://f.cloud.github.com/assets/370259/140655/6eebb918-7228-11e2-89fa-149745f25d34.png)
#### Using the function
We can determine 7 optimal breaks in a column of numerical data as follows:
```sql
SELECT CDB_HeadsTailsBins(array_agg(numeric_column), 7) FROM table_name
-- Results in an ordered array like {7824,23492,52696,233857,666089,1001709,1638094}
-- Each break is an inclusive upper bound:
-- (bin1 is less than or equal to 7824, bin2 is less than or equal to 23492, etc.)
```
#### Arguments
CDB_HeadsTailsBins(in_array, breaks)
* **in_array** numeric[]. A NUMERIC array of values.
* **breaks** int. The number of categories you want to create.
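The general heads/tails idea can be sketched outside the database. The Python below is a minimal illustration, not the extension's PL/pgSQL: the function name `heads_tails_breaks` is hypothetical, and it assumes each break is the mean of the values left after discarding everything at or below the previous break.

```python
from statistics import mean


def heads_tails_breaks(values, breaks):
    """Return up to `breaks` break values using the heads/tails rule.

    Hypothetical helper for illustration; NULL handling and the exact
    stopping rule of CDB_HeadsTailsBins are assumptions.
    """
    result = []
    head = [v for v in values if v is not None]
    while head and len(result) < breaks:
        m = mean(head)                      # the break is the mean of the current head
        result.append(m)
        head = [v for v in head if v > m]   # keep only the values above the mean
    return result


print(heads_tails_breaks([1, 1, 1, 1, 2, 2, 3, 10], 3))
```

Because the head shrinks on every pass, the loop may return fewer breaks than requested when the data runs out of values above the mean.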

43
doc/CDB_HexagonGrid.md Normal file

@@ -0,0 +1,43 @@
Fill given extent with a hexagonal coverage
#### Using the function
Create a hexagonal grid from a polygon geometry. For example, take the geometry
```sql
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(10000000,-10000000),
ST_MakePoint(-10000000,10000000)
)
),
3857)
```
We can create a grid as follows,
```sql
SELECT CDB_HexagonGrid(
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(10000000,-10000000),
ST_MakePoint(-10000000,10000000)
)
),
3857),
1000000) the_geom_webmercator
```
Which will look something like this,
![grid tile](http://i.imgur.com/4rZXGMb.png)
#### Arguments
CDB_HexagonGrid(ext, side, origin)
* **ext** geometry. Extent to fill. Only hexagons with center point falling inside the extent (or at the lower or leftmost edge) will be emitted. The returned hexagons will have the same SRID as this extent.
* **side** float. Side measure for the hexagon. Maximum diameter will be 2 * side. Measure is in the same projection as **ext**
* **origin** OPTIONAL geometry. Optional origin to allow for exact tiling. If omitted the origin will be 0,0. The parameter is checked for having the same SRID as the extent.

23
doc/CDB_JenksBins.md Normal file

@@ -0,0 +1,23 @@
Find the breaks for N categories in a numerical column based on the [Jenks optimization](http://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization). Below, Jenks is used to color polygons based on their area.
![Jenks](https://f.cloud.github.com/assets/370259/140093/b64a9382-7210-11e2-81a4-c65cce3c885e.png)
#### Using the function
We can determine 7 optimal breaks in a column of numerical data as follows:
```sql
SELECT CDB_JenksBins(array_agg(numeric_column), 7) FROM table_name
-- Results in an ordered array like {0,73,2568,9408,29411,768230,1638094}
-- Each break is an inclusive upper bound:
-- (bin1 is less than or equal to 0, bin2 is less than or equal to 73, etc.)
```
#### Arguments
CDB_JenksBins(in_array, breaks, iterations, invert)
* **in_array** numeric[]. A NUMERIC array of values.
* **breaks** int. The number of categories you want to create.
* **iterations** OPTIONAL int. The number of iterations used for calculating breaks.
* **invert** OPTIONAL boolean. Flips whether you receive top down breaks or bottom up breaks. Default is top down (so, <=). Bottom up would give you values that define the lower-end start of a bin (so >=).

21
doc/CDB_MakeHexagon.md Normal file

@@ -0,0 +1,21 @@
Return a hexagon with given center and side (or maximal radius)
#### Using the function
Running the following SQL
```sql
SELECT CDB_MakeHexagon(ST_MakePoint(0,0),10000000)
```
Would give you back a single hexagon geometry,
![hexagon](http://i.imgur.com/6jeGStb.png)
#### Arguments
CDB_MakeHexagon(center, radius)
* **center** geometry
* **radius** float. Radius of hexagon measured in same projection as **center**
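For readers who want to see the geometry behind this, here is a minimal Python sketch of a regular hexagon built from a center and a radius. It is not the extension's code: `hexagon_vertices` is a hypothetical name, and the orientation (first vertex on the positive x axis) is an assumption.

```python
import math


def hexagon_vertices(cx, cy, radius):
    """Return the six vertices of a regular hexagon centred on (cx, cy).

    Assumption: vertices are placed every 60 degrees starting at angle 0;
    the orientation used by CDB_MakeHexagon may differ.
    """
    return [
        (cx + radius * math.cos(math.radians(60 * k)),
         cy + radius * math.sin(math.radians(60 * k)))
        for k in range(6)
    ]


for vertex in hexagon_vertices(0.0, 0.0, 10000000.0):
    print(vertex)
```

In a regular hexagon the side length equals the radius, so the maximum diameter is twice the side.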

21
doc/CDB_QuantileBins.md Normal file

@@ -0,0 +1,21 @@
Find the breaks for N categories in a numerical column based on quantile bins. Below, the quantile method is used to determine color based on the area of the polygons.
![quantile](https://f.cloud.github.com/assets/370259/140714/932ed0e6-722b-11e2-9807-ffbd0fddb9ac.png)
#### Using the function
We can determine 7 optimal breaks in a column of numerical data as follows:
```sql
SELECT CDB_QuantileBins(array_agg(numeric_column), 7) FROM table_name
-- Results in an ordered array like {80,2281,7162,17652,39730,91077,1638094}
-- Each break is an inclusive upper bound:
-- (bin1 is less than or equal to 80, bin2 is less than or equal to 2281, etc.)
```
#### Arguments
CDB_QuantileBins(in_array, breaks)
* **in_array** numeric[]. A NUMERIC array of values.
* **breaks** int. The number of categories you want to create.
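The break selection in the updated `CDB_QuantileBins` script from this commit can be modelled roughly in Python. This is a sketch under assumptions rather than a port: `quantile_bins` is a hypothetical name, the input is assumed to be non-empty and free of NULLs, and exact fractions are used where PostgreSQL uses `numeric`, so results may differ when `numeric` rounding makes a whole-number position look fractional.

```python
import math
from fractions import Fraction


def quantile_bins(values, breaks):
    """Return `breaks` upper bin edges for a non-empty list of numbers."""
    data = sorted(values)
    step = Fraction(len(data), breaks)       # elements per bin, possibly fractional
    reply = []
    for i in range(1, breaks):
        pos = step * i
        offset = math.ceil(pos) - 1          # zero-based index of the candidate value
        if pos % 1 > 0:
            reply.append(data[offset])       # fractional position: take one value
        else:
            pair = data[offset:offset + 2]   # whole position: average two neighbours
            reply.append(sum(pair) / len(pair))
    reply.append(max(data))                  # the last break is always the maximum
    return reply


print(quantile_bins(range(1, 11), 4))
```

Averaging the two neighbouring values at a whole-number position is the same convention commonly used for a median of an even-sized sample.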

46
doc/CDB_RectangleGrid.md Normal file

@@ -0,0 +1,46 @@
Fill given extent with a rectangular coverage
#### Using the function
Create a rectangular grid from a polygon geometry. For example, take the geometry
```sql
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(10000000,-10000000),
ST_MakePoint(-10000000,10000000)
)
),
3857)
```
We can create a grid as follows,
```sql
SELECT CDB_RectangleGrid(
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(10000000,-10000000),
ST_MakePoint(-10000000,10000000)
)
),
3857),
1000000,
1000000
) the_geom_webmercator
```
Which will look something like this,
![rect grid](http://i.imgur.com/HuhOJRs.png)
#### Arguments
CDB_RectangleGrid(ext, width, height, origin)
* **ext** geometry. Extent to fill. Only rectangles with center point falling inside the extent (or at the lower or leftmost edge) will be emitted. The returned rectangles will have the same SRID as this extent.
* **width** float. Width of each rectangle. Measure is in the same projection as **ext**
* **height** float. Height of each rectangle. Measure is in the same projection as **ext**
* **origin** OPTIONAL geometry. Optional origin to allow for exact tiling. If omitted the origin will be 0,0. The parameter is checked for having the same SRID as the extent.
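The center-point rule described for **ext** can be illustrated with a small Python sketch. It is not the extension's implementation: `rectangle_grid` is a hypothetical name, the extent is passed as plain bounds instead of a geometry, and cells are assumed to be aligned to whole multiples of `width` and `height` from the origin.

```python
import math


def rectangle_grid(xmin, ymin, xmax, ymax, width, height, origin=(0.0, 0.0)):
    """Return (x0, y0, x1, y1) cells whose center lies inside the extent.

    Centers on the lower or leftmost edge are kept; centers on the upper
    or rightmost edge are dropped.
    """
    ox, oy = origin
    # First column/row index whose center is >= the extent minimum.
    first_col = math.ceil((xmin - ox) / width - 0.5)
    first_row = math.ceil((ymin - oy) / height - 0.5)
    cells = []
    row = first_row
    while oy + (row + 0.5) * height < ymax:
        col = first_col
        while ox + (col + 0.5) * width < xmax:
            x0 = ox + col * width
            y0 = oy + row * height
            cells.append((x0, y0, x0 + width, y0 + height))
            col += 1
        row += 1
    return cells


# Same extent and cell size as the SQL example above.
print(len(rectangle_grid(-1e7, -1e7, 1e7, 1e7, 1e6, 1e6)))
```

Anchoring the cells to the origin rather than to the extent is what lets grids generated for neighbouring extents line up.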


@@ -0,0 +1,11 @@
Sets user quota in bytes (superuser only)
#### Using the function
```sql
SELECT CDB_SetUserQuotaInBytes(10485760);
--- Returns the previously set quota.
--- Use 0 to disable quota.
```
REF: https://github.com/CartoDB/cartodb-postgresql/blob/master/scripts-available/CDB_Quota.sql


@@ -0,0 +1,44 @@
Function to "safely" transform to webmercator. This function is most useful for rendering custom geometries using the CartoDB tiler. Transforming from a projection such as WGS84 can produce coordinates beyond the extent that is valid in webmercator; this function attempts to fix those issues.
#### Using the function
Using a box that is nearly the full globe,
```sql
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(-180,60),
ST_MakePoint(180,-60)
)
),
4326
)
```
We can then convert it to a renderable webmercator geometry.
```sql
SELECT CDB_TransformToWebmercator(
ST_SetSRID(
ST_Envelope(
ST_Collect(
ST_MakePoint(-10,60),
ST_MakePoint(300,-60)
)
),
4326
)
)
```
This returns a single valid rectangle in webmercator. Since a longitude of 300 would convert to a disallowed webmercator coordinate, the geometry is clipped first. The valid extent in WGS84 is (-180, -89, 180, 89).
![valid geom](http://i.imgur.com/EFdXiqt.png)
#### Arguments
CDB_TransformToWebmercator(geom)
* **geom** geometry
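The clip-then-project idea can be shown for a single coordinate in Python. This is only a sketch: `to_webmercator_safe` is a hypothetical name, it clamps one point where the SQL function clips whole geometries, and it assumes the standard spherical Web Mercator formulas with a radius of 6378137 m.

```python
import math

EARTH_RADIUS = 6378137.0  # sphere radius assumed for Web Mercator (EPSG:3857)


def to_webmercator_safe(lon, lat):
    """Clamp a WGS84 coordinate to (-180, -89, 180, 89), then project it."""
    lon = min(max(lon, -180.0), 180.0)
    lat = min(max(lat, -89.0), 89.0)
    x = EARTH_RADIUS * math.radians(lon)
    y = EARTH_RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# A longitude of 300 is clamped to 180 before projecting.
print(to_webmercator_safe(300, 0))
```

Clamping the latitude matters because the Mercator `y` grows without bound as the latitude approaches 90 degrees.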

11
doc/CDB_UserTables.md Normal file

@@ -0,0 +1,11 @@
List the name of available tables (only the usable ones)
#### Using the function
```sql
--- Returns a row for each table having given permission with the table name
--- Currently accepted permissions are: 'public', 'private' or 'all'
SELECT CDB_UserTables(perms)
```
REF: https://github.com/CartoDB/cartodb-postgresql/blob/master/scripts-available/CDB_UserTables.sql
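The three permission values can be read as a simple predicate over two facts about each table. The Python below is an illustrative model of the `CASE` expression in the updated `CDB_UserTables.sql` from this commit; the function and parameter names are hypothetical.

```python
def table_is_listed(perm, current_user_can_select, publicuser_can_select):
    """Decide whether a table is listed for a given permission filter.

    Models the filter only; the catalog lookup and schema exclusions
    done in SQL are out of scope for this sketch.
    """
    if perm == 'public':
        return publicuser_can_select
    if perm == 'private':
        return current_user_can_select and not publicuser_can_select
    if perm == 'all':
        return current_user_can_select or publicuser_can_select
    return False  # unsupported values list nothing


# A table the owner can read but publicuser cannot is 'private'.
print(table_is_listed('private', True, False))
```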

22
doc/CDB_XYZ_Extent.md Normal file

@@ -0,0 +1,22 @@
Determine the spatial extent of a tile based on the tile's XYZ coordinate.
#### Using the function
Take a common tile with coordinates x=3, y=2, z=2,
![2/3/2](https://viz2.cartodb.com/tiles/quantile_breaks/2/3/2.png)
To determine its extent you would run,
```sql
SELECT CDB_XYZ_Extent(3,2,2)
--- Returns a WKB polygon in Webmercator (SRID 3857)
```
#### Arguments
CDB_XYZ_Extent(x,y,z)
* **x** integer
* **y** integer
* **z** integer
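The arithmetic behind a tile extent is short enough to sketch in Python. This is an illustration, not the extension's code: `xyz_extent` is a hypothetical name, and the half-width of 20037508.5 is an assumption inferred from the expected output of the XYZ extent test in this commit.

```python
WEBMERCATOR_MAX = 20037508.5  # assumed half-width of the Web Mercator square


def xyz_extent(x, y, z):
    """Return (xmin, ymin, xmax, ymax) for tile x, y at zoom z.

    Tile rows are counted from the top, so y=0 is the northernmost row.
    """
    tile_size = 2 * WEBMERCATOR_MAX / 2 ** z
    xmin = -WEBMERCATOR_MAX + x * tile_size
    ymax = WEBMERCATOR_MAX - y * tile_size
    return (xmin, ymax - tile_size, xmin + tile_size, ymax)


print(xyz_extent(3, 2, 2))
```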

20
doc/CDB_XYZ_Resolution.md Normal file

@@ -0,0 +1,20 @@
Return pixel resolution of tiles at a given zoom level
#### Using the function
Take a common tile with zoom, z=2,
![2/3/2](https://viz2.cartodb.com/tiles/quantile_breaks/2/3/2.png)
To determine the resolution of these pixels,
```sql
SELECT CDB_XYZ_Resolution(2)
--- Returns a float, 39135.7587890625
```
#### Arguments
CDB_XYZ_Resolution(z)
* **z** integer
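The value above can be reproduced with a one-line formula. The Python sketch below assumes 256-pixel tiles and a Web Mercator world width of 40075017 units; both constants are assumptions, chosen because they reproduce the 39135.7587890625 shown in the example, and `xyz_resolution` is a hypothetical name.

```python
WORLD_WIDTH = 40075017.0  # assumed full width of the Web Mercator square
TILE_PIXELS = 256         # assumed tile size in pixels


def xyz_resolution(z):
    """Return the size of one pixel, in Web Mercator units, at zoom z."""
    return WORLD_WIDTH / (TILE_PIXELS * 2 ** z)


print(xyz_resolution(2))
```

Each zoom level halves the pixel size, since the number of tiles per side doubles.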

38
doc/CartoDB-PLpgSQL.md Normal file

@@ -0,0 +1,38 @@
INTRODUCTION
============
CartoDB uses a number of custom [PLpgSQL](http://www.postgresql.org/docs/8.3/static/plpgsql.html) functions to perform a few magical things. Those functions are accessible to users on CartoDB as well, so we would like to document what they are and what they do here.
## Spatial functions
[CDB_HexagonGrid](CDB_HexagonGrid) - create hexagonal grid from extent and size
[CDB_MakeHexagon](CDB_MakeHexagon) - make a hexagon with given center and side
[CDB_RectangleGrid](CDB_RectangleGrid) - fill given extent with a rectangular coverage
##### Tile based
[CDB_XYZ_Extent](CDB_XYZ_Extent) - Find the extent of a tile by XYZ
[CDB_XYZ_Resolution](CDB_XYZ_Resolution) - Find the pixel resolution of tiles
[CDB_TransformToWebmercator](CDB_TransformToWebmercator) - Convert a geometry to valid webmercator
## Statistical functions
[CDB_JenksBins](CDB_JenksBins) - Find breaks in an array of numbers using Jenks method
[CDB_HeadsTailsBins](CDB_HeadsTailsBins) - Find breaks in an array of numbers using Heads/Tails method
[CDB_QuantileBins](CDB_QuantileBins) - Find quantile breaks in an array of numbers
## System functions
[CDB_UserTables](CDB_UserTables) - Get a list of all tables in your account
[CDB_SetUserQuotaInBytes](CDB_SetUserQuotaInBytes) - Set maximum user quota in bytes
column names - now returned in JSON response
column types - now returned in JSON response

29
doc/CartoDB-user-table.md Normal file

@@ -0,0 +1,29 @@
A "cartodb" user table is a table with a well-known set of fields and a well-known set of triggers attached.
The fields are:
- `cartodb_id`, a numerical primary key of serial type
- `created_at`, timestamp with timezone not null default now()
- `updated_at`, timestamp with timezone not null default now()
- `the_geom`, geometry, GiST indexed, constrained (see below)
- `the_geom_webmercator`, geometry, GiST indexed, constrained (see below)
The values of "the_geom" and "the_geom_webmercator" must match these constraints:
- Only POINT, MULTILINESTRING, MULTIPOLYGON types ? Maybe UNCONSTRAINED
- Only 2 dimensions ? Maybe UNCONSTRAINED
- SRID=4326 for the_geom and SRID=3857 for the_geom_webmercator
The triggers are:
- `track_updates` after modifying statement updates cdb_tablemetadata
- `test_quota` before changing statement to forbid if overquota
- `test_quota_per_row` before changing row to forbid if overquota (checked on a probabilistic basis)
- `update_the_geom_webmercator` before insert or update row to maintain the_geom_webmercator
- `update_updated_at_trigger` before update row to maintain updated_at
Upon cartodbfication, some conversions are attempted when certain fields are present:
- `cartodb_id`: if found with type TEXT, a cast is attempted
- `created_at`: if found with type TEXT, a cast is attempted
- `updated_at`: if found with type TEXT, a cast is attempted

48
doc/README.md Normal file

@@ -0,0 +1,48 @@
# Contents
* [CartoDB-user-table](CartoDB-user-table.md)
* [CartoDB-PLpgSQL](CartoDB-PLpgSQL.md)
* [CDB_ColumnNames](CDB_ColumnNames.md)
* [CDB_ColumnType](CDB_ColumnType.md)
* [CDB_HeadsTailsBins](CDB_HeadsTailsBins.md)
* [CDB_HexagonGrid](CDB_HexagonGrid.md)
* [CDB_JenksBins](CDB_JenksBins.md)
* [CDB_MakeHexagon](CDB_MakeHexagon.md)
* [CDB_QuantileBins](CDB_QuantileBins.md)
* [CDB_RectangleGrid](CDB_RectangleGrid.md)
* [CDB_SetUserQuotaInBytes](CDB_SetUserQuotaInBytes.md)
* [CDB_TransformToWebmercator](CDB_TransformToWebmercator.md)
* [CDB_UserTables](CDB_UserTables.md)
* [CDB_XYZ_Extent](CDB_XYZ_Extent.md)
* [CDB_XYZ_Resolution](CDB_XYZ_Resolution.md)
The CartoDB PostgreSQL extension is a module to load into each CartoDB user database to perform cartodb-specific security and functionality checks.
# Checks
No user other than the superuser should be allowed to change `statement_timeout` for the session (SET statement_timeout disallowed).
User tables need to match certain structure criteria (see [CartoDB-user-table](CartoDB-user-table.md)), so the extension should provide a means to enforce such structure every time an attempt to change it is encountered.
# Events
The events we want some function to be called upon are:
| event | arguments to handler function | function duty | OK* |
|------------------------|--------------------------------------|----------------------------------|-----|
| SET variable | name of variable | forbid changing some vars | |
| RENAME table | old and new name + oid of the table | flush cache | |
| ADD/DROP/ALTER column | oid of the table | flush cache, cartodbfy, upd meta | Y |
| DISABLE/DROP trigger | oid of table, name of trigger | cartodbfy or forbid | |
| DROP table | oid of the table | flush cache and metadata | Y |
| CREATE table | oid of the table | cartodbfy, upd metadata | Y |
| GRANT | oid of table, privilege, role | flush cache, upd metadata | |
| REVOKE | oid of table, privilege, role | flush cache, upd metadata | |
* event available by installing https://github.com/CartoDB/pg_schema_triggers
At this stage we don't need more than this, but the number of events and the number of arguments passed to the handler function may expand in the future, so the extension should be written in a way to easily allow that.
Some of the handlers will need to act _after_ the statement is completed (CREATE TABLE, for example).


@@ -0,0 +1,122 @@
--
-- CDB_DistType classifies the histograms of a column into
-- one of the basic types listed by Galtung: http://druedin.com/2012/12/08/galtungs-ajus-system/
--
-- Future improvements:
-- variable number of bins (7 is baked in right now)
-- catch the number of items to ensure that the sample is large enough
--
-- Refs:
-- 1. width_bucket/histograms: http://tapoueh.org/blog/2014/02/21-PostgreSQL-histogram
-- 2. R implementation: https://github.com/cran/agrmt
CREATE OR REPLACE FUNCTION CDB_DistType ( in_array NUMERIC[] ) RETURNS text as $$
DECLARE
element_count INT4;
minv numeric;
maxv numeric;
bins numeric[];
freqs numeric[];
ajus INT[];
freq INT4;
signature text;
i INT := 1;
BEGIN
SELECT min(e), max(e), count(e) INTO minv, maxv, element_count FROM ( SELECT unnest(in_array) e ) x;
IF abs(maxv - minv) < 1e-7 THEN -- if max and min are nearly equal, call it 'F' (make relative to maxv?)
signature = 'F';
ELSE
-- Calculate bins and count in bins
EXECUTE 'WITH stats as (
SELECT min(e) as minv,
max(e) as maxv,
count(e) as total
FROM (SELECT unnest($1) e) x
WHERE e is not null
),
hist as (
SELECT width_bucket(e, s.minv, s.maxv, 7) bucket,
count(*) freq
FROM (SELECT unnest($1) e) x, stats s
WHERE e is not null
GROUP BY 1
ORDER BY 1
)
SELECT array_agg(round(100.0 * hist.freq::numeric / stats.total::numeric,1)) freqs,
array_agg(hist.bucket) buckets
FROM hist, stats'
INTO freqs, bins
USING in_array;
LOOP
IF i < 7 THEN
ajus[i] = CASE WHEN freqs[i] > freqs[i+1] THEN -1
WHEN abs(freqs[i] - freqs[i+1]) <= 0.05 THEN 0
ELSE 1 END;
ELSE
EXIT;
END IF;
i := i + 1;
END LOOP;
signature = _CDB_DistTypeClassify(ajus);
END IF;
RETURN signature;
END;
$$ language plpgsql IMMUTABLE;
-- Classify data into AJUSFL
CREATE OR REPLACE FUNCTION _CDB_DistTypeClassify ( in_array INT[] ) RETURNS text as $$
DECLARE
element_count INT4;
maxv numeric;
minv numeric;
uniques INT[];
type text;
BEGIN
SELECT max(e), min(e) INTO maxv, minv FROM ( SELECT unnest(in_array) e ) x;
IF (maxv = 0 AND minv = 0) THEN
type = 'F';
ELSIF maxv < 1 THEN
type = 'L';
ELSIF minv > -1 THEN
type = 'J';
ELSE
-- Get distinct elements ordered by original position
EXECUTE 'WITH b AS (
SELECT a
FROM (SELECT unnest($1) a) x
),
c AS (
SELECT a, row_number() OVER () r
FROM b
),
d AS (
SELECT DISTINCT a
FROM c
),
e AS (
SELECT a FROM d ORDER BY (
SELECT r FROM c WHERE d.a = c.a ORDER BY r ASC LIMIT 1
) ASC)
SELECT array_agg(a) FROM e'
INTO uniques
USING in_array;
-- Decide if it's an A, U, or other
IF (uniques = ARRAY[1,-1] OR uniques = ARRAY[1,0,-1] OR uniques = ARRAY[1,-1,0] OR uniques = ARRAY[0,1,-1]) THEN
type = 'A';
ELSIF (uniques = ARRAY[-1,1] OR uniques = ARRAY[-1,0,1] OR uniques = ARRAY[-1,1,0] OR uniques = ARRAY[0,-1,1]) THEN
type = 'U';
ELSE
type = 'S';
END IF;
END IF;
RETURN type;
END;
$$ language plpgsql IMMUTABLE;


@@ -0,0 +1,46 @@
--
-- CDB_DistinctMeasure
-- calculates the fraction of rows in the 10 most common distinct categories
-- returns true if the number of rows in these 10 categories is >= 0.9 * total number of rows
--
--
CREATE OR REPLACE FUNCTION CDB_DistinctMeasure ( in_array text[], threshold numeric DEFAULT null ) RETURNS numeric as $$
DECLARE
element_count INT4;
maxval numeric;
passes numeric;
BEGIN
SELECT count(e) INTO element_count FROM ( SELECT unnest(in_array) e ) x;
-- count number of occurrences per bin
-- calculate the normalized cumulative sum
-- return the max value, which corresponds to the nth entry
-- for n <= 10 depending on # of distinct values
EXECUTE 'WITH a As (
SELECT
count(*) cnt
FROM
(SELECT * FROM unnest($2) e ) x
WHERE e is not null
GROUP BY e
ORDER BY cnt DESC
),
b As (
SELECT
sum(cnt) OVER (ORDER BY cnt DESC) / $1 As cumsum
FROM a
LIMIT 10
)
SELECT max(cumsum) maxval FROM b'
INTO maxval
USING element_count, in_array;
IF threshold is null THEN
passes = maxval;
ELSE
passes = CASE WHEN (maxval >= threshold) THEN 1 ELSE 0 END;
END IF;
RETURN passes;
END;
$$ language plpgsql IMMUTABLE;


@@ -0,0 +1,37 @@
--
-- Calculate the equal interval bins for a given column
--
-- @param in_array A numeric array of numbers used to determine
-- the bin boundaries
--
-- @param breaks The number of bins you want to find.
--
--
-- Returns: upper edges of bins
--
--
CREATE OR REPLACE FUNCTION CDB_EqualIntervalBins ( in_array NUMERIC[], breaks INT ) RETURNS NUMERIC[] as $$
DECLARE
diff numeric;
min_val numeric;
max_val numeric;
tmp_val numeric;
i INT := 1;
reply numeric[];
BEGIN
SELECT min(e), max(e) INTO min_val, max_val FROM ( SELECT unnest(in_array) e ) x WHERE e IS NOT NULL;
diff = (max_val - min_val) / breaks::numeric;
LOOP
IF i < breaks THEN
tmp_val = min_val + i::numeric * diff;
reply = array_append(reply, tmp_val);
i := i+1;
ELSE
reply = array_append(reply, max_val);
EXIT;
END IF;
END LOOP;
RETURN reply;
END;
$$ language plpgsql IMMUTABLE;


@@ -15,18 +15,29 @@ DECLARE
i INT := 1;
reply numeric[];
BEGIN
- -- get our unique values
-SELECT array_agg(e) INTO in_array FROM (SELECT unnest(in_array) e GROUP BY e ORDER BY e ASC) x;
- -- get the total size of our row
-element_count := array_upper(in_array, 1) - array_lower(in_array, 1);
+ -- sort our values
+SELECT array_agg(e) INTO in_array FROM (SELECT unnest(in_array) e ORDER BY e ASC) x;
+ -- get the total size of our data
+element_count := array_length(in_array, 1);
break_size := element_count::numeric / breaks;
-- slice our bread
LOOP
-IF i > breaks THEN EXIT; END IF;
-SELECT e INTO tmp_val FROM ( SELECT unnest(in_array) e LIMIT 1 OFFSET round(break_size * i)) x;
+IF i < breaks THEN
+IF break_size * i % 1 > 0 THEN
+SELECT e INTO tmp_val FROM ( SELECT unnest(in_array) e LIMIT 1 OFFSET ceil(break_size * i) - 1) x;
+ELSE
+SELECT avg(e) INTO tmp_val FROM ( SELECT unnest(in_array) e LIMIT 2 OFFSET ceil(break_size * i) - 1 ) x;
+END IF;
+ELSIF i = breaks THEN
+ -- select the last value
+SELECT max(e) INTO tmp_val FROM ( SELECT unnest(in_array) e ) x;
+ELSE
+EXIT;
+END IF;
reply = array_append(reply, tmp_val);
-i := i+1;
-END LOOP;
-RETURN reply;
+i := i+1;
+END LOOP;
+RETURN reply;
END;
-$$ language plpgsql IMMUTABLE;
+$$ language plpgsql IMMUTABLE;


@@ -2,12 +2,12 @@
--
-- Requires PostgreSQL 9.x+
--
-CREATE OR REPLACE FUNCTION CDB_QueryTables(query text)
-RETURNS name[]
+CREATE OR REPLACE FUNCTION CDB_QueryTablesText(query text)
+RETURNS text[]
AS $$
DECLARE
exp XML;
-tables NAME[];
+tables text[];
rec RECORD;
rec2 RECORD;
BEGIN
@@ -41,11 +41,11 @@ BEGIN
xpath('//x:Relation-Name/text()', exp, ARRAY[ARRAY['x', 'http://www.postgresql.org/2009/explain']]) as x,
xpath('//x:Relation-Name/../x:Schema/text()', exp, ARRAY[ARRAY['x', 'http://www.postgresql.org/2009/explain']]) as s
)
-SELECT unnest(x)::name as p, unnest(s)::name as sc from inp
+SELECT unnest(x) as p, unnest(s) as sc from inp
LOOP
-- RAISE DEBUG 'tab: %', rec2.p;
-- RAISE DEBUG 'sc: %', rec2.sc;
-tables := array_append(tables, (rec2.sc || '.' || rec2.p)::name);
+tables := array_append(tables, (rec2.sc || '.' || rec2.p));
END LOOP;
-- RAISE DEBUG 'Tables: %', tables;
@@ -65,3 +65,14 @@ BEGIN
return tables;
END
$$ LANGUAGE 'plpgsql' VOLATILE STRICT;
-- Keep CDB_QueryTables with same signature for backwards compatibility.
-- It should probably be removed in the future.
CREATE OR REPLACE FUNCTION CDB_QueryTables(query text)
RETURNS name[]
AS $$
BEGIN
RETURN CDB_QueryTablesText(query)::name[];
END
$$ LANGUAGE 'plpgsql' VOLATILE STRICT;


@@ -1,3 +1,22 @@
CREATE OR REPLACE FUNCTION cartodb._CDB_total_relation_size(_schema_name TEXT, _table_name TEXT)
RETURNS bigint AS
$$
BEGIN
IF EXISTS (
SELECT 1 FROM information_schema.tables
WHERE table_catalog = current_database()
AND table_schema = _schema_name
AND table_name = _table_name
)
THEN
RETURN pg_total_relation_size(format('"%s"."%s"', _schema_name, _table_name));
ELSE
RETURN 0;
END IF;
END;
$$
LANGUAGE 'plpgsql' VOLATILE;
-- Return the estimated size of user data. Used for quota checking.
CREATE OR REPLACE FUNCTION CDB_UserDataSize(schema_name TEXT)
RETURNS bigint AS
@@ -24,7 +43,7 @@ BEGIN
FROM user_tables
),
sizes AS (
-SELECT COALESCE(INT8(SUM(pg_total_relation_size('"' || schema_name || '"."' || table_name || '"')))) table_size,
+SELECT COALESCE(INT8(SUM(cartodb._CDB_total_relation_size(schema_name, table_name)))) table_size,
CASE
WHEN is_overview THEN 0
WHEN is_raster THEN 1


@@ -5,35 +5,22 @@
--
-- Currently accepted permissions are: 'public', 'private' or 'all'
--
DROP FUNCTION IF EXISTS cdb_usertables(text);
CREATE OR REPLACE FUNCTION CDB_UserTables(perm text DEFAULT 'all')
-RETURNS SETOF information_schema.sql_identifier
+RETURNS SETOF name
AS $$
-WITH usertables AS (
- -- TODO: query CDB_TableMetadata for this ?
- -- See http://github.com/CartoDB/cartodb/issues/254#issuecomment-26044777
-SELECT table_name as t
-FROM information_schema.tables
-WHERE
-table_type='BASE TABLE'
-AND table_schema='public'
-AND table_name NOT IN (
-'cdb_tablemetadata',
-'spatial_ref_sys'
-)
-), perms AS (
-SELECT t, has_table_privilege('public', 'public'||'.'||t, 'SELECT') as p
-FROM usertables
-)
-SELECT t FROM perms
-WHERE (
-p = CASE WHEN $1 = 'private' THEN false
-WHEN $1 = 'public' THEN true
-ELSE not p -- none
-END
-OR $1 = 'all'
-)
-AND has_table_privilege('public'||'.'||t, 'SELECT')
-;
+SELECT c.relname
+FROM pg_class c
+JOIN pg_namespace n ON n.oid = c.relnamespace
+WHERE c.relkind = 'r'
+AND c.relname NOT IN ('cdb_tablemetadata', 'spatial_ref_sys')
+AND n.nspname NOT IN ('pg_catalog', 'information_schema', 'topology')
+AND CASE WHEN perm = 'public' THEN has_table_privilege('publicuser', c.oid, 'SELECT')
+WHEN perm = 'private' THEN has_table_privilege(current_user, c.oid, 'SELECT') AND NOT has_table_privilege('publicuser', c.oid, 'SELECT')
+WHEN perm = 'all' THEN has_table_privilege(current_user, c.oid, 'SELECT') OR has_table_privilege('publicuser', c.oid, 'SELECT')
+ELSE false END;
$$ LANGUAGE 'sql';
-- This is to migrate from pre-0.2.0 version


@@ -0,0 +1 @@
../scripts-available/CDB_EqualIntervalBins.sql


@@ -0,0 +1 @@
../scripts-available/CDB_DistType.sql


@@ -0,0 +1 @@
../scripts-available/CDB_DistinctMeasure.sql


@@ -0,0 +1,4 @@
WITH data AS (
SELECT pow(x,3)::numeric x FROM generate_series(-100,100) x
)
SELECT CDB_DistType(array_agg(x)) FROM data


@@ -0,0 +1 @@
A


@@ -0,0 +1,20 @@
-- a - j add up to 89%, k-m add up to 11%
WITH a As (
SELECT (
repeat('a',12) ||
repeat('b',11) ||
repeat('c',11) ||
repeat('d',10) ||
repeat('e',10) ||
repeat('f',9) ||
repeat('g',8) ||
repeat('h',7) ||
repeat('i',6) ||
repeat('j',5) ||
repeat('k',4) ||
repeat('l',4) ||
repeat('m',3)
)::text AS x
)
SELECT CDB_DistinctMeasure(string_to_array(x,null),0.90) from a


@@ -0,0 +1 @@
0


@@ -0,0 +1,5 @@
WITH data AS (
SELECT array_agg(x::numeric) s FROM generate_series(1,300) x
WHERE x % 5 != 0 AND x % 7 != 0
)
SELECT round(unnest(CDB_EqualIntervalBins(s, 7)),7) FROM data


@@ -0,0 +1,7 @@
43.5714286
86.1428571
128.7142857
171.2857143
213.8571429
256.4285714
299.0000000


@@ -1,7 +1,7 @@
-16
+13
29
43
57
71
-83
+86
99


@@ -31,6 +31,8 @@ create table sc.test (a int);
insert into sc.test values (1);
WITH inp AS ( select 'select * from sc.test'::text as q )
SELECT q, CDB_QueryTables(q) from inp;
DROP TABLE sc.test;
DROP SCHEMA sc;
WITH inp AS ( select 'SELECT
* FROM geometry_columns'::text as q )


@@ -4,14 +4,20 @@ CREATE table "my'tab;le" as select 1|{}
SELECT a.oid, b.oid FROM pg_class a, pg_class b|{pg_catalog.pg_class}
SELECT 1 as col1; select 2 as col2|{}
WARNING: CDB_QueryTables cannot explain query: select 1 from nonexistant (42P01: relation "nonexistant" does not exist)
CONTEXT: PL/pgSQL function cdb_querytables(text) line 3 at RETURN
ERROR: relation "nonexistant" does not exist
CONTEXT: PL/pgSQL function cdb_querytables(text) line 3 at RETURN
begin; select * from pg_class; commit;|{pg_catalog.pg_class}
WARNING: CDB_QueryTables cannot explain query: select * from test (42P01: relation "test" does not exist)
CONTEXT: PL/pgSQL function cdb_querytables(text) line 3 at RETURN
ERROR: relation "test" does not exist
CONTEXT: PL/pgSQL function cdb_querytables(text) line 3 at RETURN
WITH a AS (select * from pg_class) select * from a|{pg_catalog.pg_class}
CREATE SCHEMA
CREATE TABLE
INSERT 0 1
select * from sc.test|{sc.test}
DROP TABLE
DROP SCHEMA
SELECT
* FROM geometry_columns|{pg_catalog.pg_attribute,pg_catalog.pg_class,pg_catalog.pg_namespace,pg_catalog.pg_type}


@@ -13,6 +13,11 @@ SELECT CDB_CartodbfyTable('big');
INSERT INTO big SELECT generate_series(2049,4096);
INSERT INTO big SELECT generate_series(4097,6144);
INSERT INTO big SELECT generate_series(6145,8192);
-- Test for #108: https://github.com/CartoDB/cartodb-postgresql/issues/108
SELECT CDB_UserDataSize();
SELECT cartodb._CDB_total_relation_size('public', 'big');
SELECT cartodb._CDB_total_relation_size('public', 'nonexistent_table_name');
-- END Test for #108
SELECT CDB_SetUserQuotaInBytes(2);
INSERT INTO big VALUES (8193);
SELECT CDB_SetUserQuotaInBytes(0);


@@ -8,6 +8,9 @@ big
INSERT 0 2048
INSERT 0 2048
INSERT 0 2048
581632
1163264
0
2
ERROR: Quota exceeded by 443.998046875KB
0


@@ -1,11 +1,19 @@
-create table pub(a int);
-create table prv(a int);
-GRANT SELECT ON TABLE pub TO public;
-REVOKE SELECT ON TABLE prv FROM public;
+CREATE ROLE publicuser LOGIN;
+CREATE TABLE pub(a int);
+CREATE TABLE prv(a int);
+GRANT SELECT ON TABLE pub TO publicuser;
+REVOKE SELECT ON TABLE prv FROM publicuser;
SELECT CDB_UserTables() ORDER BY 1;
SELECT 'all',CDB_UserTables('all') ORDER BY 2;
SELECT 'public',CDB_UserTables('public') ORDER BY 2;
SELECT 'private',CDB_UserTables('private') ORDER BY 2;
SELECT '--unsupported--',CDB_UserTables('--unsupported--') ORDER BY 2;
-drop table pub;
-drop table prv;
+ -- now tests with public user
+\c contrib_regression publicuser
+SELECT 'all_publicuser',CDB_UserTables('all') ORDER BY 2;
+SELECT 'public_publicuser',CDB_UserTables('public') ORDER BY 2;
+SELECT 'private_publicuser',CDB_UserTables('private') ORDER BY 2;
+\c contrib_regression postgres
+DROP TABLE pub;
+DROP TABLE prv;
+DROP ROLE publicuser;


@@ -1,3 +1,4 @@
CREATE ROLE
CREATE TABLE
CREATE TABLE
GRANT
@@ -8,5 +9,10 @@ all|prv
all|pub
public|pub
private|prv
You are now connected to database "contrib_regression" as user "publicuser".
all_publicuser|pub
public_publicuser|pub
You are now connected to database "contrib_regression" as user "postgres".
DROP TABLE
DROP TABLE
DROP ROLE


@@ -5,5 +5,5 @@ inp AS ( select z0.z, r1.r as x, r2.r as y FROM zoom z0, range r1, range r2 WHER
ext AS ( select x,y,z,CDB_XYZ_Extent(x,y,z) as g from inp )
select X::text || ',' || Y::text || ',' || Z::text as xyz,
st_xmin(g), st_xmax(g), st_ymin(g), st_ymax(g)
-from ext;
+from ext order by xyz;


@@ -1,13 +1,13 @@
0,0,0|-20037508.5|20037508.5|-20037508.5|20037508.5
0,0,1|-20037508.5|0|0|20037508.5
-0,1,1|-20037508.5|0|-20037508.5|0
-1,0,1|0|20037508.5|0|20037508.5
-1,1,1|0|20037508.5|-20037508.5|0
0,0,2|-20037508.5|-10018754.25|10018754.25|20037508.5
+0,1,1|-20037508.5|0|-20037508.5|0
0,1,2|-20037508.5|-10018754.25|0|10018754.25
0,2,2|-20037508.5|-10018754.25|-10018754.25|0
0,3,2|-20037508.5|-10018754.25|-20037508.5|-10018754.25
+1,0,1|0|20037508.5|0|20037508.5
1,0,2|-10018754.25|0|10018754.25|20037508.5
+1,1,1|0|20037508.5|-20037508.5|0
1,1,2|-10018754.25|0|0|10018754.25
1,2,2|-10018754.25|0|-10018754.25|0
1,3,2|-10018754.25|0|-20037508.5|-10018754.25


@@ -142,6 +142,8 @@ function setup() {
log_info "############################# SETUP #############################"
create_role_and_schema cdb_testmember_1
create_role_and_schema cdb_testmember_2
sql "CREATE ROLE publicuser LOGIN;"
sql "GRANT CONNECT ON DATABASE \"${DATABASE}\" TO publicuser;"
create_table cdb_testmember_1 foo
sql cdb_testmember_1 'INSERT INTO cdb_testmember_1.foo VALUES (1), (2), (3), (4), (5);'
@@ -168,9 +170,11 @@ function tear_down() {
sql "REVOKE CONNECT ON DATABASE \"${DATABASE}\" FROM cdb_testmember_1;"
sql "REVOKE CONNECT ON DATABASE \"${DATABASE}\" FROM cdb_testmember_2;"
sql "REVOKE CONNECT ON DATABASE \"${DATABASE}\" FROM publicuser;"
sql 'DROP ROLE cdb_testmember_1;'
sql 'DROP ROLE cdb_testmember_2;'
sql 'DROP ROLE publicuser;'
${CMD} -c "DROP DATABASE ${DATABASE}"
}
@@ -346,6 +350,50 @@ function test_cdb_querytables_does_not_return_functions_as_part_of_the_resultset
sql postgres "select * from CDB_QueryTables('select * from cdb_testmember_1.foo, cdb_testmember_2.bar, plainto_tsquery(''foo'')');" should "{cdb_testmember_1.foo,cdb_testmember_2.bar}"
}
function test_cdb_usertables_should_work_with_orgusers() {
# This test validates the changes proposed in https://github.com/CartoDB/cartodb/pull/5021
# create tables
sql cdb_testmember_1 "CREATE TABLE test_perms_pub (a int)"
sql cdb_testmember_1 "INSERT INTO test_perms_pub (a) values (1);"
sql cdb_testmember_1 "GRANT SELECT ON TABLE test_perms_pub TO publicuser"
sql cdb_testmember_1 "CREATE TABLE test_perms_priv (a int)"
# this is what we need to make public tables available in CDB_UserTables
sql postgres "grant publicuser to cdb_testmember_1;"
sql postgres "grant publicuser to cdb_testmember_2;"
# this is required to enable select from other schema
sql postgres "GRANT USAGE ON SCHEMA cdb_testmember_1 TO publicuser";
# test CDB_UserTables with publicuser
${CMD} -d ${DATABASE} -f scripts-available/CDB_UserTables.sql
sql publicuser "SELECT count(*) FROM CDB_UserTables('all')" should 1
sql publicuser "SELECT count(*) FROM CDB_UserTables('public')" should 1
sql publicuser "SELECT count(*) FROM CDB_UserTables('private')" should 0
sql publicuser "SELECT * FROM CDB_UserTables('all')" should "test_perms_pub"
sql publicuser "SELECT * FROM CDB_UserTables('public')" should "test_perms_pub"
sql publicuser "SELECT * FROM CDB_UserTables('private')" should ""
# the following tests are for https://github.com/CartoDB/cartodb-postgresql/issues/98
# cdb_testmember_2 is already owner of `bar` table
sql cdb_testmember_2 "select string_agg(t,',') from (select cdb_usertables('all') t order by t) as s" should "bar,test_perms_pub"
sql cdb_testmember_2 "SELECT * FROM CDB_UserTables('public')" should "test_perms_pub"
sql cdb_testmember_2 "SELECT * FROM CDB_UserTables('private')" should "bar"
# test cdb_testmember_2 can select from cdb_testmember_1's public table
sql cdb_testmember_2 "SELECT * FROM cdb_testmember_1.test_perms_pub" should 1
sql cdb_testmember_1 "DROP TABLE test_perms_pub"
sql cdb_testmember_1 "DROP TABLE test_perms_priv"
}
#################################################### TESTS END HERE ####################################################