Compare commits

...

58 Commits

Author SHA1 Message Date
Javier Goizueta
f5f59be5b0 Release version 0.16.3
Fixes overviews creation problem
2016-05-09 13:08:50 +02:00
Javier Goizueta
d99dc394c2 Merge pull request #253 from CartoDB/252-estimateextent-quoting
Do not quote arguments to ST_EstimatedExtent
2016-05-09 13:04:44 +02:00
Javier Goizueta
8d7860dc7a Fixes #252 2016-05-09 11:54:56 +02:00
Javier Goizueta
b5427c65c8 Drop aggregate to be defined
Otherwise future versions will fail to recreate the aggregate
2016-04-29 08:46:01 +02:00
Javier Goizueta
8f1435c049 Release 0.16.2 2016-04-27 18:30:26 +02:00
Javier Goizueta
8302f89413 Merge pull request #246 from CartoDB/245-categories-mode
Use the mode to aggregate category columns in overviews
2016-04-27 18:16:05 +02:00
Javier Goizueta
e9050178a8 Merge branch 'master' of github.com:CartoDB/cartodb-postgresql 2016-04-27 16:23:46 +02:00
Javier Goizueta
3e34ca4654 Overviews documentation fixes 2016-04-27 16:23:25 +02:00
Javier Goizueta
a067cc7da1 Generate stats used to identify category columns in overviews if needed
This only generates the stats if no stats are available for a table.
This doesn't guarantee that the stats are up to date or accurate.
2016-04-27 15:06:09 +02:00
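A minimal sketch of the category test that these stats feed, assuming the `n_distinct > 0 AND n_distinct <= 20` condition from `_cdb_categorical_column` further down in this diff. In `pg_stats`, a positive `n_distinct` is an estimated number of distinct values, while a negative one is a ratio relative to the row count, so only positive values are compared against the limit. The function name and the `max_categories` parameter are hypothetical, introduced only for illustration.

```python
def is_categorical(n_distinct, max_categories=20):
    """Mirror the n_distinct > 0 AND n_distinct <= 20 test from the diff."""
    if n_distinct is None:
        # No statistics row for the column: the SQL expression yields NULL
        return None
    # Negative values are ratios of distinct values to rows, not counts
    return 0 < n_distinct <= max_categories


print(is_categorical(5))      # few distinct values
print(is_categorical(-0.5))   # ratio-style statistic
print(is_categorical(None))   # stats not available
```

Since the decision depends entirely on the planner statistics, it is only as accurate as the last `ANALYZE` of the table.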
Javier Goizueta
2c43943df6 Fix syntax 2016-04-26 18:27:52 +02:00
Javier Goizueta
417cbe7902 Fix category columns aggregation in overviews
Overviews are created in cascade, each one from the immediate
lower level, but the stats to decide if a column is a category
should be taken always from the base table.
2016-04-26 18:02:25 +02:00
Javier Goizueta
9a73703954 Use mode to aggregate categorical columns in overviews
Fixes #245
2016-04-26 15:15:24 +02:00
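The mode selection can be sketched as follows, assuming the ordering used by `_cdb_mode_of_array` later in this diff (`ORDER BY COUNT(1) DESC, 1 LIMIT 1`): the most frequent value wins, and ties go to the smallest value. This is an illustration in Python, not the extension's code, and `mode_of` is a hypothetical name.

```python
from collections import Counter


def mode_of(values):
    """Most frequent value; ties are broken by the smallest value."""
    if not values:
        return None  # the SQL version returns no row (NULL) for an empty array
    counts = Counter(values)
    # Sort key: highest count first, then the value itself in ascending order
    return min(counts, key=lambda v: (-counts[v], v))


print(mode_of(["park", "school", "park"]))
print(mode_of(["b", "a"]))
```

A deterministic tie-break keeps the aggregated category stable across runs, which matters when overviews are regenerated.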
Rafa de la Torre
36ac831bd1 Update cartodbfy-requirements.rst
Fix broken link to doc
2016-04-26 14:43:24 +02:00
Javier Goizueta
1358964628 Release 0.16.1 2016-04-25 18:47:42 +02:00
Javier Goizueta
efe381ad94 Merge pull request #243 from CartoDB/241-webmercator
Compute webmercator resolution with full accuracy
2016-04-25 17:30:40 +02:00
Javier Goizueta
f7cce21eb7 Merge pull request #242 from CartoDB/240-overviews-pixels
Adjust overview points to pixel centers
2016-04-25 17:30:25 +02:00
Javier Goizueta
18267477da Merge pull request #238 from CartoDB/235-column-names
Optimize column information functions
2016-04-25 17:30:07 +02:00
Javier Goizueta
11ad45306f Remove unneeded pg_catalog schema name 2016-04-25 16:30:58 +02:00
Javier Goizueta
75c7ae98e4 Compute webmercator resolution with full accuracy
Fixes #241
2016-04-25 14:02:26 +02:00
Javier Goizueta
3c12cf629f Optimize overview pixel adjustment for integer-pixel cells 2016-04-25 13:53:59 +02:00
Javier Goizueta
7b2100b51e Adjust overview coordinates to pixel centers
This makes the adjustment for all grid sizes, not only
for integral number of pixels.
2016-04-25 13:33:43 +02:00
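The adjustment can be sketched in one dimension, assuming the expressions built in the diff below (`cell_x = gx*grid_m + grid_m/2` and `offset = pixel_m/2 - MOD(cell_x, pixel_m)`). The pixel size of 10 m is a made-up value, and the function names are hypothetical. `math.fmod` is used because, like PostgreSQL's `MOD`, it keeps the sign of the dividend.

```python
import math


def cell_center(g, grid_m):
    """Center of grid cell number g along one axis."""
    return g * grid_m + grid_m / 2


def snap_to_pixel_center(coord, pixel_m):
    """Shift coord by the displacement to a pixel center."""
    offset = pixel_m / 2 - math.fmod(coord, pixel_m)
    return coord + offset


pixel_m = 10.0           # hypothetical pixel size in meters
grid_m = 1.5 * pixel_m   # grid cell of 1.5 pixels, not an integral number
for g in range(3):
    center = cell_center(g, grid_m)
    print(center, snap_to_pixel_center(center, pixel_m))
```

With these values the cell centers 7.5, 22.5 and 37.5 move to 5.0, 25.0 and 35.0, which are centers of 10 m pixels. When the grid is an integral number of pixels the offset is the same for every cell, which is what the later optimization commit exploits.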
Javier Goizueta
580ec38ab8 Adjust overview clustered point to pixel centers
Fixes #240
2016-04-23 15:07:06 +02:00
Raul Ochoa
897689dd43 Release 0.16.0 2016-04-19 15:44:37 +02:00
Raul Ochoa
808fc9fc25 Merge pull request #237 from CartoDB/analysis-catalog
Adds table for storing camshaft analysis nodes
2016-04-19 15:32:46 +02:00
Javier Goizueta
65415bb335 Optimize function CDB_ColumnType 2016-04-18 19:07:33 +02:00
Javier Goizueta
06ebb27160 Optimize internal function _cdb_unlimited_text_column 2016-04-18 18:50:37 +02:00
Javier Goizueta
bd5ae84e90 Optimize CDB_ColumnNames
This implementation is about 1000 times faster
2016-04-18 18:49:58 +02:00
Raul Ochoa
de5a702510 Adds table for storing camshaft analysis nodes 2016-04-18 17:41:39 +02:00
Javier Goizueta
6908fb4672 Release 0.15.1
Overviews bugfixes & enhancements
2016-04-15 18:15:35 +02:00
Javier Goizueta
a528a250d4 Merge pull request #234 from CartoDB/231-overviews-text-aggr
Aggregate small number of text items in overviews
2016-04-15 18:04:07 +02:00
Javier Goizueta
ef43623f77 Remove unneeded variable 2016-04-15 17:58:03 +02:00
Javier Goizueta
09ad550de3 Fix tests 2016-04-15 17:50:47 +02:00
Javier Goizueta
1b0f77aa96 Always retain single-valued aggregated texts
Columns that have the same value throughout a group being aggregated
now keep that value (rather than having it replaced by the multiple-value
indicator *) whatever the group size is. (Previously this happened
only for small groups.)
2016-04-15 17:49:00 +02:00
Javier Goizueta
45f063d469 Aggregate small number of text items in overviews
Instead of nulling text fields for non-singleton aggregated records
concatenate distinct text values when they're few (5 or less).
Fixes #231
2016-04-15 12:37:16 +02:00
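A rough sketch of the resulting text aggregation rule, following the `CASE` expression that appears later in this diff: a single distinct value is always kept, a small group gets its distinct values joined with ` / `, and anything else becomes `*`. The sketch uses the `< 5` comparison from that expression and counts one feature per value (the SQL can also sum `_feature_count`). The function and parameter names are hypothetical, and sorting the distinct values is an assumption about the output order.

```python
def aggregate_text(values, separator=" / ", limit=5):
    """Aggregate the text values of one group of features."""
    distinct = sorted(set(values))
    if len(distinct) == 1:
        return distinct[0]              # single-valued: always retained
    if len(values) < limit:
        return separator.join(distinct)  # few items: list the distinct values
    return "*"                          # multiple-value indicator


print(aggregate_text(["Madrid"] * 8))
print(aggregate_text(["b", "a", "b"]))
print(aggregate_text(["a", "b", "c", "d", "e"]))
```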
Carla
20989e2f28 Merge pull request #233 from CartoDB/232-overviews-avg
Fix AVG computation in overview tables
2016-04-15 11:10:26 +02:00
Javier Goizueta
176d69d09e Fix AVG computation in overview tables
Fixes #232
Averages of averages are not equal to overall averages.
2016-04-15 10:48:08 +02:00
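A small numeric illustration of why the fix is needed, assuming each overview record carries its average together with the number of features it represents (the `_feature_count` column in this diff). The helper name is hypothetical. The weighted form corresponds to the `SUM(value * count) / SUM(count)` expression generated further down.

```python
def merge_averages(groups):
    """Combine (average, feature_count) pairs into one overall average."""
    total = sum(count for _, count in groups)
    return sum(avg * count for avg, count in groups) / total


# One record representing 1 feature and another representing 3 features
groups = [(10.0, 1), (20.0, 3)]
naive = sum(avg for avg, _ in groups) / len(groups)
weighted = merge_averages(groups)
print(naive, weighted)
```

Here the plain average of the two records is 15.0, while the weighted result, 17.5, is the true average of the four underlying features.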
Javier Goizueta
9fdbfda60a Merge pull request #228 from CartoDB/225-no-centroid-master
Use cell centers, not cluster centroids when grouping points
2016-04-15 10:06:44 +02:00
Javier Goizueta
9a3d93976c Merge pull request #227 from CartoDB/226-add_count_aggregated_features
Include and aggregate _vovw_count column to count aggregated features
2016-04-15 10:06:05 +02:00
Javier Goizueta
46b45f6dd4 Merge pull request #224 from CartoDB/223-fix-dropoverviews
Fix CDB_DropOverviews and CDB_Overviews problems
2016-04-15 10:05:28 +02:00
Carla Iriberri
fd14750ce5 Rename _vovw_count to _feature_count 2016-04-14 18:23:09 +02:00
Javier Goizueta
c595e45c11 Add _vovw_count column to tables for which overviews are created
Initially we planned to add this column to the queries sent to the
tiler only, but that makes the column hard to access from the editor.
2016-04-14 17:32:18 +02:00
Carla
1cf7074fb1 Merge pull request #230 from CartoDB/229-set_tolerance_px_to_1_overviews
Set tolerance to 1 pixel in overviews by default
2016-04-14 17:26:37 +02:00
Javier Goizueta
f785e71d3b Fix: numeric is a valid numeric column type
Actually this is the type of aggregated _vovw_count columns
2016-04-14 15:46:03 +02:00
Carla
14b8cd7d99 Set default value to 1 and remove typo line 2016-04-14 12:12:17 +02:00
Carla
213adcca16 Fixes tests for tolerance_px = 1.0, with no zoom 3 2016-04-14 11:24:10 +02:00
Carla
1a571c8a9c Set tolerance to 1 pixel 2016-04-14 11:13:59 +02:00
Carla Iriberri
8f44f5347a Fix indent for code clarity 2016-04-13 17:51:30 +02:00
Carla Iriberri
f96163265b Fix bug for tables without geom or with no potential overviews
If the table doesn't have geometries but the createoverviews function is
called, the current geometry type checks won't work because "null" will
not give a boolean value in the type comparisons.

Also, if the createoverviews function is called over a simple table that
would not require overviews according to the strategies, it is handled
correctly.
2016-04-13 17:49:38 +02:00
Javier Goizueta
1c67214b09 Use cell centers, not cluster centroids when grouping points
Fixes #225
2016-04-13 11:14:09 +02:00
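The difference between the two choices can be sketched like this, assuming points are grouped by `Floor(x/grid)` and `Floor(y/grid)` as in the clustering query later in this diff. The coordinates and grid size are made up, and `cluster_points` is a hypothetical helper.

```python
import math
from collections import defaultdict


def cluster_points(points, grid_m):
    """Group points into square grid cells and describe each cell."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x / grid_m), math.floor(y / grid_m))].append((x, y))

    result = {}
    for (gx, gy), members in cells.items():
        n = len(members)
        result[(gx, gy)] = {
            "n": n,
            # Old behaviour: average position of the points in the cell
            "centroid": (sum(x for x, _ in members) / n,
                         sum(y for _, y in members) / n),
            # New behaviour: geometric center of the cell itself
            "center": (gx * grid_m + grid_m / 2, gy * grid_m + grid_m / 2),
        }
    return result


clusters = cluster_points([(1.0, 1.0), (3.0, 1.0), (12.0, 4.0)], grid_m=10.0)
for cell, info in sorted(clusters.items()):
    print(cell, info)
```

Cell centers depend only on the grid, so the output points form a regular lattice regardless of how the input points are distributed inside each cell.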
Carla Iriberri
16d08ef52b Include and aggregate _vovw_count column to count aggregated features 2016-04-12 11:10:55 +02:00
Javier Goizueta
15ac9a2cd9 Remove unneeded code 2016-04-07 10:30:10 +02:00
Javier Goizueta
ee61d46100 💄 rename variable for clarity 2016-04-07 10:24:02 +02:00
Javier Goizueta
49e7094c8a Make CDB_CreateOverview usable by superuser
CDB_CreateOverview had to be executed with the user role
corresponding to the owner of the table; now it can be executed
by the postgres user.
2016-04-07 07:52:58 +02:00
Javier Goizueta
fb910be12f Fix conversion of regclass to identifier 2016-04-07 07:07:20 +02:00
Javier Goizueta
34c39662ec Replace use of CDB_UserTables in CDB_Overviews
Use a function that returns regclasses and schema names properly instead.
2016-04-07 00:07:45 +02:00
Javier Goizueta
84cac16d1c Temporary fix 2016-04-06 22:05:00 +02:00
Javier Goizueta
c1fc07d2ac Fix typo
This function isn't being actively used; should consider removing it
or testing it properly
2016-04-06 18:58:37 +02:00
Javier Goizueta
5c3c0f5fc9 Fix bug in CDB_DropOverviews
Fixes #223
2016-04-06 18:57:52 +02:00
13 changed files with 356 additions and 91 deletions

@@ -1,7 +1,7 @@
# cartodb/Makefile
EXTENSION = cartodb
EXTVERSION = 0.15.0
EXTVERSION = 0.16.3
SED = sed
@@ -65,6 +65,11 @@ UPGRADABLE = \
0.14.3 \
0.14.4 \
0.15.0 \
0.15.1 \
0.16.0 \
0.16.1 \
0.16.2 \
0.16.3 \
$(EXTVERSION)dev \
$(EXTVERSION)next \
$(END)

50
NEWS.md

@@ -1,5 +1,51 @@
0.15.0 (2016-04-05)
0.16.3 (2016-05-09)
-------------------
* Fix overview creation problem for organization users
with names that require quoting:
[#253](https://github.com/CartoDB/cartodb-postgresql/pull/253)
0.16.2 (2016-04-27)
-------------------
* Use the mode to aggregate category columns in overviews
[#246](https://github.com/CartoDB/cartodb-postgresql/pull/246)
0.16.1 (2016-04-25)
-------------------
* Optimize column information functions performance
[#238](https://github.com/CartoDB/cartodb-postgresql/pull/238)
* Adjust overview points to pixel centers
[#242](https://github.com/CartoDB/cartodb-postgresql/pull/242)
* Compute webmercator resolution using full numeric precision
[#243](https://github.com/CartoDB/cartodb-postgresql/pull/243)
0.16.0 (2016-04-19)
-------------------
* Adds table for storing camshaft analysis nodes
[#237](https://github.com/CartoDB/cartodb-postgresql/pull/237)
0.15.1 (2016-04-15)
-------------------
* Fix problems with org users in overviews functions
[#224](https://github.com/CartoDB/cartodb-postgresql/pull/224)
* Add `_feature_count` to overviews
[#227](https://github.com/CartoDB/cartodb-postgresql/pull/227)
* Change point clustering behaviour of overviews
[#228](https://github.com/CartoDB/cartodb-postgresql/pull/228)
* Change default tolerance of overviews
[#230](https://github.com/CartoDB/cartodb-postgresql/pull/230)
* Fix problem with aggregated numerical fields in overviews
[#233](https://github.com/CartoDB/cartodb-postgresql/pull/233)
* Enhance aggregation of text fields in overviews
[#234](https://github.com/CartoDB/cartodb-postgresql/pull/234)
0.15.0 (2016-04-05)
-------------------
* New function CDB_CreateOverviewsWithToleranceInPixels that adds tolerance parameter for overview creation
[#221](https://github.com/CartoDB/cartodb-postgresql/pull/221)
* New default value for the overviews tolerance in pixels is 2 (used to be 7.5) (also in #221)
@@ -8,7 +54,7 @@
[#220](https://github.com/CartoDB/cartodb-postgresql/pull/220)
0.14.4 (2016-03-29)
-------------------
* Fix creating overviews for tables with boolean columns
[#214](https://github.com/CartoDB/cartodb-postgresql/pull/214)
* Fix tests for some systems [#215](https://github.com/CartoDB/cartodb-postgresql/pull/215)

@@ -2,18 +2,25 @@ Overviews are tables that represent a *reduced* version of a dataset intended
for efficient rendering at certain zoom levels while preserving the
general visual appearance of the complete dataset.
The *reduction* consists in a fewer number of records
The *reduction* consists in having fewer records
(while each overview record may represent an aggregation of multiple records)
and/or simplified record geometries.
Overviews are created through the `CDB_CreateOverviews`.
Overviews are created through the `CDB_CreateOverviews` function.
The statement timeout may need to be adjusted before using this function,
as overview creation for large tables is a time-consuming operation.
The `CDB_Overviews` function can be used to determine what overview tables
exist for a given dataset table and which zoom levels correspond to them.
The `CDB_DropOverviews` remove a dataset's existing overviews.
The `CDB_DropOverviews` function removes a dataset's existing overviews.
To know if overview tables exist for some base table, and to obtain
a list of which overview tables are appropriate for which zoom levels,
the `CDB_Overviews` function can be used.
The zoom levels referred to here are those used
by the tiler: http://wiki.openstreetmap.org/wiki/Zoom_levels
### CDB_CreateOverviews
@@ -51,10 +58,14 @@ CDB_CreateOverviews(table_name, ref_z_strategy, reduction_strategy)
#### Tolerance / level of detail
The level of detail to be representable by each overview layer can
be specified as a tolerance in pixels (if different from the default of 2 pixels)
be specified as a tolerance in pixels (if different from the default of 1 pixel)
with the function `CDB_CreateOverviewsWithToleranceInPixels`
which has as a second additional argument the desired tolerance.
This tolerance defines the maximum deviation in pixels of the overview
geometries with respect to the original geometries when overview tables
are used for their intended zoom level.
### CDB_Overviews
Obtain overview metadata for a given table (existing overviews).
@@ -79,7 +90,7 @@ SELECT CDB_Overviews(CDB_QueryTablesText('SELECT * FROM table1, table2'));
The result of `CDB_Overviews` has three columns:
| base_table | z | overview_table |
|------------+---+----------------|
| ---------- | - | -------------- |
| table1 | 1 | table1_ov1 |
| table1 | 2 | table1_ov2 |
| table1 | 4 | table1_ov4 |

@@ -33,7 +33,7 @@ Additionally, a CartoDB table can contain other columns.
See the `CartoDB User Table documentation`_
.. _CartoDB User Table documentation: https://github.com/CartoDB/cartodb-postgresql/blob/master/doc/CartoDB-user-table.md
.. _CartoDB User Table documentation: https://github.com/CartoDB/cartodb-postgresql/blob/master/doc/CartoDB-user-table.rst
for further information.
High level requirements

@@ -0,0 +1,24 @@
-- Table to register analysis nodes from https://github.com/cartodb/camshaft
CREATE TABLE IF NOT EXISTS
cartodb.cdb_analysis_catalog (
-- md5 hex hash
node_id char(40) CONSTRAINT cdb_analysis_catalog_pkey PRIMARY KEY,
-- being json allows to do queries like analysis_def->>'type' = 'buffer'
analysis_def json NOT NULL,
-- can reference other nodes in this very same table, allowing recursive queries
input_nodes char(40) ARRAY NOT NULL DEFAULT '{}',
status TEXT NOT NULL DEFAULT 'pending',
CONSTRAINT valid_status CHECK (
status IN ( 'pending', 'waiting', 'running', 'canceled', 'failed', 'ready' )
),
created_at timestamp with time zone NOT NULL DEFAULT now(),
-- should be updated when some operation was performed in the node
-- and anything associated to it might have changed
updated_at timestamp with time zone DEFAULT NULL,
-- should register last time the node was used
used_at timestamp with time zone NOT NULL DEFAULT now(),
-- should register the number of times the node was used
hits NUMERIC DEFAULT 0,
-- should register what was the last node using current node
last_used_from char(40)
);

@@ -2,15 +2,13 @@
CREATE OR REPLACE FUNCTION CDB_ColumnNames(REGCLASS)
RETURNS SETOF information_schema.sql_identifier
AS $$
SELECT c.column_name
FROM information_schema.columns c, pg_class _tn, pg_namespace _sn
WHERE table_name = _tn.relname
AND table_schema = _sn.nspname
AND _tn.oid = $1::oid
AND _sn.oid = _tn.relnamespace
ORDER BY ordinal_position;
SELECT
a.attname::information_schema.sql_identifier column_name
FROM pg_class c
LEFT JOIN pg_attribute a ON a.attrelid = c.oid
WHERE c.oid = $1::oid
AND a.attstattarget < 0 -- exclude system columns
ORDER BY a.attnum;
$$ LANGUAGE SQL;
-- This is to migrate from pre-0.2.0 version

@@ -2,15 +2,13 @@
CREATE OR REPLACE FUNCTION CDB_ColumnType(REGCLASS, TEXT)
RETURNS information_schema.character_data
AS $$
SELECT c.data_type
FROM information_schema.columns c, pg_class _tn, pg_namespace _sn
WHERE table_name = _tn.relname
AND table_schema = _sn.nspname
AND column_name = $2
AND _tn.oid = $1::oid
AND _sn.oid = _tn.relnamespace;
SELECT
format_type(a.atttypid, NULL)::information_schema.character_data data_type
FROM pg_class c
LEFT JOIN pg_attribute a ON a.attrelid = c.oid
WHERE c.oid = $1::oid
AND a.attname = $2
AND a.attstattarget < 0; -- exclude system columns
$$ LANGUAGE SQL;
-- This is to migrate from pre-0.2.0 version

@@ -1,4 +1,24 @@
-- security definer
-- Information about tables in a schema.
-- If the schema name parameter is NULL, then tables from all schemas
-- that may contain user tables are returned.
-- For each table, the regclass, schema name and table name are returned.
-- Scope: private.
CREATE OR REPLACE FUNCTION _CDB_UserTablesInSchema(schema_name text DEFAULT NULL)
RETURNS TABLE(table_regclass REGCLASS, schema_name TEXT, table_name TEXT)
AS $$
SELECT
c.oid::regclass AS table_regclass,
n.nspname::text AS schema_name,
c.relname::text AS table_relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
AND c.relname NOT IN ('cdb_tablemetadata', 'spatial_ref_sys')
AND CASE WHEN schema_name IS NULL
THEN n.nspname NOT IN ('pg_catalog', 'information_schema', 'topology', 'cartodb')
ELSE n.nspname = schema_name
END;
$$ LANGUAGE 'sql';
-- Pattern that can be used to detect overview tables and Extract
-- the intended zoom level from the table name.
@@ -68,6 +88,26 @@ AS $$
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;
CREATE OR REPLACE FUNCTION _CDB_OverviewBaseTable(overview_table REGCLASS)
RETURNS REGCLASS
AS $$
DECLARE
table_name TEXT;
schema_name TEXT;
base_name TEXT;
base_table REGCLASS;
BEGIN
SELECT * FROM _cdb_split_table_name(overview_table) INTO schema_name, table_name;
base_name := _CDB_OverviewBaseTableName(table_name);
IF base_name != table_name THEN
base_table := Format('%I.%I', schema_name, base_name)::regclass;
ELSE
base_table := overview_table;
END IF;
RETURN base_table;
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;
-- Schema and relation names of a table given its reloid
-- Scope: private.
-- Parameters
@@ -120,7 +160,7 @@ BEGIN
FOR row IN
SELECT * FROM CDB_Overviews(reloid)
LOOP
EXECUTE Format('DROP TABLE %I.%I;', schema_name, row.overview_table);
EXECUTE Format('DROP TABLE %s;', row.overview_table);
RAISE NOTICE 'Dropped overview for level %: %', row.z, row.overview_table;
END LOOP;
END;
@@ -140,16 +180,15 @@ RETURNS TABLE(base_table REGCLASS, z integer, overview_table REGCLASS)
AS $$
DECLARE
schema_name TEXT;
table_name TEXT;
base_table_name TEXT;
BEGIN
-- TODO: review implementation of CDB_UserTables and suitability for this
SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, base_table_name;
RETURN QUERY SELECT
reloid AS base_table,
_CDB_OverviewTableZ(cdb_usertables) AS z,
('"' || schema_name|| '"."' ||cdb_usertables || '"')::regclass AS overview_table
FROM CDB_UserTables()
WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=reloid), cdb_usertables)
_CDB_OverviewTableZ(table_name) AS z,
table_regclass AS overview_table
FROM _CDB_UserTablesInSchema(schema_name)
WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=reloid), table_name)
ORDER BY z;
END
$$ LANGUAGE PLPGSQL;
@@ -168,11 +207,13 @@ RETURNS TABLE(base_table REGCLASS, z integer, overview_table REGCLASS)
AS $$
SELECT
base_table::regclass AS base_table,
_CDB_OverviewTableZ(cdb_usertables) AS z,
('"' || _cdb_schema_name(base_table::regclass) || '"."' || cdb_usertables || '"')::regclass AS overview_table
_CDB_OverviewTableZ(table_name) AS z,
table_regclass AS overview_table
FROM
CDB_UserTables(), unnest(tables) base_table
WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=base_table), cdb_usertables)
_CDB_UserTablesInSchema(), unnest(tables) base_table
WHERE
schema_name = _cdb_schema_name(base_table)
AND _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=base_table), table_name)
ORDER BY base_table, z;
$$ LANGUAGE SQL;
@@ -194,17 +235,23 @@ AS $$
FROM pg_class c JOIN pg_namespace n on n.oid = c.relnamespace WHERE c.oid = reloid::oid;
ext_query = format(
'SELECT ST_EstimatedExtent(''%1$I'', ''%2$I'', ''%3$I'');',
'SELECT ST_EstimatedExtent(''%1$s'', ''%2$s'', ''%3$s'');',
table_id.schema_name, table_id.table_name, 'the_geom_webmercator'
);
BEGIN
EXECUTE ext_query INTO ext;
EXCEPTION
EXCEPTION
-- This is the typical ERROR: stats for "mytable" do not exist
WHEN internal_error THEN
-- Get stats and execute again
EXECUTE format('ANALYZE %1$I', reloid);
EXECUTE format('ANALYZE %1$s', reloid);
-- We check the geometry type in case the error is due to empty geometries
IF _CDB_GeometryTypes(reloid) IS NULL THEN
RETURN NULL;
END IF;
EXECUTE ext_query INTO ext;
END;
@@ -370,7 +417,7 @@ AS $$
SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name.overview_rel);
EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name, overview_rel);
-- Estimate number of rows
SELECT reltuples, relpages FROM pg_class INTO STRICT class_info
@@ -384,16 +431,16 @@ AS $$
ELSE
num_samples := ceil(class_info.reltuples*fraction);
EXECUTE Format('
CREATE TABLE %1$I AS SELECT * FROM %2$s
CREATE TABLE %4$I.%1$I AS SELECT * FROM %2$s
WHERE ctid = ANY (
ARRAY[
(SELECT CDB_RandomTids(''%2$s'', %3$s))
]
);
', overview_rel, reloid, num_samples);
', overview_rel, reloid, num_samples, schema_name);
END IF;
RETURN overview_rel;
RETURN Format('%I.%I', schema_name, overview_rel)::regclass;
END;
$$ LANGUAGE PLPGSQL;
@@ -429,9 +476,12 @@ AS $$
-- preserve the owner of the base table
SELECT u.usename
FROM pg_catalog.pg_class c JOIN pg_catalog.pg_user u ON (c.relowner=u.usesysid)
WHERE c.relname = dataset::text
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_user u ON (c.relowner=u.usesysid)
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = dataset_name::text AND n.nspname = dataset_scheme
INTO table_owner;
EXECUTE Format('ALTER TABLE IF EXISTS %s OWNER TO %I;', overview_table::text, table_owner);
-- preserve the table privileges
@@ -485,6 +535,70 @@ BEGIN
END
$$ LANGUAGE PLPGSQL STABLE;
-- Check if a column of a table is of an unlimited-length text type
CREATE OR REPLACE FUNCTION _cdb_unlimited_text_column(reloid REGCLASS, col_name TEXT)
RETURNS BOOLEAN
AS $$
SELECT EXISTS (
SELECT a.attname
FROM pg_class c
LEFT JOIN pg_attribute a ON a.attrelid = c.oid
LEFT JOIN pg_type t ON t.oid = a.atttypid
WHERE c.oid = reloid
AND a.attname = col_name
AND format_type(a.atttypid, NULL) IN ('text', 'character varying', 'character')
AND format_type(a.atttypid, NULL) = format_type(a.atttypid, a.atttypmod)
);
$$ LANGUAGE SQL STABLE;
CREATE OR REPLACE FUNCTION _cdb_categorical_column(reloid REGCLASS, col_name TEXT)
RETURNS BOOLEAN
AS $$
DECLARE
schema_name TEXT;
table_name TEXT;
available BOOLEAN;
categorical BOOLEAN;
BEGIN
SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
SELECT n_distinct IS NOT NULL
FROM pg_stats
WHERE pg_stats.schemaname = schema_name
AND pg_stats.tablename = table_name
AND pg_stats.attname = col_name
INTO available;
IF available IS NULL OR NOT available THEN
EXECUTE Format('ANALYZE %s;', reloid);
END IF;
SELECT n_distinct > 0 AND n_distinct <= 20
FROM pg_stats
WHERE pg_stats.schemaname = schema_name
AND pg_stats.tablename = table_name
AND pg_stats.attname = col_name
INTO categorical;
RETURN categorical;
END;
$$ LANGUAGE PLPGSQL VOLATILE;
CREATE OR REPLACE FUNCTION _cdb_mode_of_array(anyarray)
RETURNS anyelement AS
$$
SELECT a
FROM unnest($1) a
GROUP BY 1
ORDER BY COUNT(1) DESC, 1
LIMIT 1;
$$
LANGUAGE SQL IMMUTABLE;
DROP AGGREGATE IF EXISTS _cdb_mode(anyelement);
CREATE AGGREGATE _cdb_mode(anyelement) (
SFUNC=array_append,
STYPE=anyarray,
FINALFUNC=_cdb_mode_of_array,
INITCOND='{}'
);
-- SQL Aggregation expression for a dataset attribute
-- Scope: private.
-- Parameters
@@ -499,6 +613,10 @@ AS $$
DECLARE
column_type TEXT;
qualified_column TEXT;
has_counter_column BOOLEAN;
feature_count TEXT;
total_feature_count TEXT;
base_table REGCLASS;
BEGIN
IF table_alias <> '' THEN
qualified_column := Format('%I.%I', table_alias, column_name);
@@ -508,19 +626,42 @@ BEGIN
column_type := CDB_ColumnType(reloid, column_name);
SELECT EXISTS (
SELECT * FROM CDB_ColumnNames(reloid) as colname WHERE colname = '_feature_count'
) INTO has_counter_column;
IF has_counter_column THEN
feature_count := '_feature_count';
total_feature_count := 'SUM(_feature_count)';
ELSE
feature_count := '1';
total_feature_count := 'count(*)';
END IF;
base_table := _CDB_OverviewBaseTable(reloid);
CASE column_type
WHEN 'double precision', 'real', 'integer', 'bigint' THEN
RETURN Format('AVG(%s)::' || column_type, qualified_column);
WHEN 'text' THEN
-- TODO: we could define a new aggregate function that returns distinct
-- separated values with a limit, adding ellipsis if more values existed
-- e.g. with '/' as separator and a limit of three:
-- 'A', 'B', 'A', 'C', 'D' => 'A/B/C/...'
-- Other ideas: if value is unique then use it, otherwise use something
-- like '*' or '(varies)' or '(multiple values)', or NULL
-- Using 'string_agg(' || qualified_column || ',''/'')'
-- here causes
RETURN 'CASE count(*) WHEN 1 THEN MIN(' || qualified_column || ') ELSE NULL END::' || column_type;
WHEN 'double precision', 'real', 'integer', 'bigint', 'numeric' THEN
IF column_name = '_feature_count' THEN
RETURN 'SUM(_feature_count)';
ELSE
IF column_type = 'integer' AND _cdb_categorical_column(base_table, column_name) THEN
RETURN Format('CDB_Math_Mode(%s)::', qualified_column) || column_type;
ELSE
RETURN Format('SUM(%s*%s)/%s::' || column_type, qualified_column, feature_count, total_feature_count);
END IF;
END IF;
WHEN 'text', 'character varying', 'character' THEN
IF _cdb_categorical_column(base_table, column_name) THEN
RETURN Format('_cdb_mode(%s)::', qualified_column) || column_type;
ELSE
IF _cdb_unlimited_text_column(base_table, column_name) THEN
-- TODO: this should not be applied to columns containing largish text;
-- it is intended only for short names/identifiers
RETURN 'CASE WHEN count(distinct ' || qualified_column || ') = 1 THEN MIN(' || qualified_column || ') WHEN ' || total_feature_count || ' < 5 THEN string_agg(distinct ' || qualified_column || ','' / '') ELSE ''*'' END::' || column_type;
ELSE
RETURN 'CASE count(*) WHEN 1 THEN MIN(' || qualified_column || ') ELSE NULL END::' || column_type;
END IF;
END IF;
WHEN 'boolean' THEN
RETURN 'CASE count(*) WHEN 1 THEN BOOL_AND(' || qualified_column || ') ELSE NULL END::' || column_type;
ELSE
@@ -589,19 +730,25 @@ AS $$
overview_rel TEXT;
reduction FLOAT8;
base_name TEXT;
pixel_m FLOAT8;
grid_m FLOAT8;
offset_m FLOAT8;
offset_x TEXT;
offset_y TEXT;
cell_x TEXT;
cell_y TEXT;
aggr_attributes TEXT;
attributes TEXT;
columns TEXT;
gtypes TEXT[];
schema_name TEXT;
table_name TEXT;
point_geom TEXT;
BEGIN
SELECT _CDB_GeometryTypes(reloid) INTO gtypes;
IF array_upper(gtypes, 1) <> 1 OR gtypes[1] <> 'ST_Point' THEN
IF gtypes IS NULL OR array_upper(gtypes, 1) <> 1 OR gtypes[1] <> 'ST_Point' THEN
-- This strategy only supports datasets with point geometry
RETURN NULL;
RETURN 'x';
END IF;
--TODO: check applicability: geometry type, minimum number of points...
@@ -610,13 +757,15 @@ AS $$
-- Grid size in pixels at Z level overview_z
IF grid_px IS NULL THEN
grid_px := 7.5;
grid_px := 1.0;
END IF;
SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
-- compute grid cell size using the overview_z dimension...
SELECT CDB_XYZ_Resolution(overview_z)*grid_px INTO grid_m;
-- pixel_m: size of a pixel in webmercator units (meters)
SELECT CDB_XYZ_Resolution(overview_z) INTO pixel_m;
-- grid size in meters
grid_m = grid_px * pixel_m;
attributes := _CDB_Aggregable_Attributes_Expression(reloid);
aggr_attributes := _CDB_Aggregated_Attributes_Expression(reloid);
@@ -627,19 +776,31 @@ AS $$
aggr_attributes := aggr_attributes || ', ';
END IF;
-- Center of each cell:
cell_x := Format('gx*%1$s + %2$s', grid_m, grid_m/2);
cell_y := Format('gy*%1$s + %2$s', grid_m, grid_m/2);
-- Displacement to the nearest pixel center:
IF MOD(grid_px::numeric, 1.0::numeric) = 0 THEN
offset_m := pixel_m/2 - MOD((grid_m/2)::numeric, pixel_m::numeric)::float8;
offset_x := Format('%s', offset_m);
offset_y := Format('%s', offset_m);
ELSE
offset_x := Format('%2$s/2 - MOD((%1$s)::numeric, (%2$s)::numeric)::float8', cell_x, pixel_m);
offset_y := Format('%2$s/2 - MOD((%1$s)::numeric, (%2$s)::numeric)::float8', cell_y, pixel_m);
END IF;
point_geom := Format('ST_SetSRID(ST_MakePoint(%1$s + %3$s, %2$s + %4$s), 3857)', cell_x, cell_y, offset_x, offset_y);
-- compute the resulting columns in the same order as in the base table
-- cartodb_id,
-- ST_Transform(ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857), 4326) AS the_geom,
-- ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857) AS the_geom_webmercator
-- %4$s
WITH cols AS (
SELECT
CASE c
WHEN 'cartodb_id' THEN 'cartodb_id'
WHEN 'the_geom' THEN
'ST_Transform(ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857), 4326) AS the_geom'
Format('ST_Transform(%s, 4326) AS the_geom', point_geom)
WHEN 'the_geom_webmercator' THEN
'ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857) AS the_geom_webmercator'
Format('%s AS the_geom_webmercator', point_geom)
ELSE c
END AS column
FROM CDB_ColumnNames(reloid) c
@@ -648,6 +809,10 @@ AS $$
SELECT * FROM cols
) AS s INTO columns;
IF NOT columns LIKE '%_feature_count%' THEN
columns := columns || ', n AS _feature_count';
END IF;
EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name, overview_rel);
-- Now we cluster the data using a grid of size grid_m
@@ -655,13 +820,11 @@ AS $$
-- If we had a selected numeric attribute of interest we could use it
-- as a weight for the average coordinates.
EXECUTE Format('
CREATE TABLE %3$I AS
CREATE TABLE %7$I.%3$I AS
WITH clusters AS (
SELECT
%5$s
count(*) AS n,
SUM(ST_X(f.the_geom_webmercator)) AS sx,
SUM(ST_Y(f.the_geom_webmercator)) AS sy,
Floor(ST_X(f.the_geom_webmercator)/%2$s)::int AS gx,
Floor(ST_Y(f.the_geom_webmercator)/%2$s)::int AS gy,
MIN(cartodb_id) AS cartodb_id
@@ -669,9 +832,9 @@ AS $$
GROUP BY gx, gy
)
SELECT %6$s FROM clusters
', reloid::text, grid_m, overview_rel, attributes, aggr_attributes, columns);
', reloid::text, grid_m, overview_rel, attributes, aggr_attributes, columns, schema_name);
RETURN overview_rel;
RETURN Format('%I.%I', schema_name, overview_rel)::regclass;
END;
$$ LANGUAGE PLPGSQL;
@@ -693,7 +856,7 @@ DECLARE
tolerance_px FLOAT8;
BEGIN
-- Use the default tolerance
tolerance_px := 2.0;
tolerance_px := 1.0;
RETURN CDB_CreateOverviewsWithToleranceInPixels(reloid, tolerance_px, refscale_strategy, reduce_strategy);
END;
$$ LANGUAGE PLPGSQL;
@@ -710,10 +873,15 @@ DECLARE
overview_z integer;
overview_tables REGCLASS[];
overviews_step integer := 1;
has_counter_column boolean;
BEGIN
-- Determine the reference zoom level
EXECUTE 'SELECT ' || quote_ident(refscale_strategy::text) || Format('(''%s'', %s);', reloid, tolerance_px) INTO ref_z;
IF ref_z < 0 OR ref_z IS NULL THEN
RETURN NULL;
END IF;
-- Determine overlay zoom levels
-- TODO: should be handled by the refscale_strategy?
overview_z := ref_z - 1;
@@ -735,6 +903,17 @@ BEGIN
SELECT array_append(overview_tables, base_rel) INTO overview_tables;
END LOOP;
IF overview_tables IS NOT NULL AND array_length(overview_tables, 1) > 0 THEN
SELECT EXISTS (
SELECT * FROM CDB_ColumnNames(reloid) as colname WHERE colname = '_feature_count'
) INTO has_counter_column;
IF NOT has_counter_column THEN
EXECUTE Format('
ALTER TABLE %s ADD COLUMN _feature_count integer DEFAULT 1;
', reloid);
END IF;
END IF;
RETURN overview_tables;
END;
$$ LANGUAGE PLPGSQL;

@@ -6,7 +6,7 @@ CREATE OR REPLACE FUNCTION CDB_XYZ_Resolution(z INTEGER)
RETURNS FLOAT8
AS $$
-- circumference divided by 256 is z0 resolution, then divide by 2^z
SELECT 40075017.0 / 256 / power(2, z);
SELECT 6378137.0*2.0*pi() / 256.0 / power(2.0, z);
$$ LANGUAGE SQL IMMUTABLE STRICT;
-- }
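For a sense of scale, the two constants in this hunk can be compared directly. The sketch below assumes 256-pixel tiles and the radius 6378137 m used in the new expression. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0   # radius used in the new expression
TILE_SIZE_PX = 256


def xyz_resolution(z):
    """Meters per pixel at zoom level z, from the full circumference."""
    return 2.0 * math.pi * EARTH_RADIUS_M / TILE_SIZE_PX / 2.0 ** z


def xyz_resolution_rounded(z):
    """Same computation with the circumference rounded to whole meters."""
    return 40075017.0 / TILE_SIZE_PX / 2.0 ** z


for z in (0, 10, 18):
    exact = xyz_resolution(z)
    rounded = xyz_resolution_rounded(z)
    print(z, exact, rounded - exact)
```

The per-pixel difference is tiny (on the order of a millimeter at zoom 0), but grid cells are computed by multiplying this resolution by large cell indices, so a plausible motivation (not stated in the source) is keeping the grid aligned with the tiler's pixels.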

@@ -0,0 +1 @@
../scripts-available/CDB_AnalysisCatalog.sql

@@ -9,35 +9,30 @@ SELECT 1114
{_vovw_3_base_bare_t,_vovw_2_base_bare_t,_vovw_1_base_bare_t,_vovw_0_base_bare_t}
113
{_vovw_2_base_bare_t,_vovw_1_base_bare_t,_vovw_0_base_bare_t}
126
number,int_number,name,start
AVG(number)::double precision AS number,AVG(int_number)::integer AS int_number,CASE count(*) WHEN 1 THEN MIN(name) ELSE NULL END::text AS name,CASE count(*) WHEN 1 THEN MIN(start) ELSE NULL END::date AS start
AVG(tab.number)::double precision AS number,AVG(tab.int_number)::integer AS int_number,CASE count(*) WHEN 1 THEN MIN(tab.name) ELSE NULL END::text AS name,CASE count(*) WHEN 1 THEN MIN(tab.start) ELSE NULL END::date AS start
{_vovw_3_base_t,_vovw_2_base_t,_vovw_1_base_t,_vovw_0_base_t}
113
SUM(number*1)/count(*)::double precision AS number,SUM(int_number*1)/count(*)::integer AS int_number,CASE WHEN count(distinct name) = 1 THEN MIN(name) WHEN count(*) < 5 THEN string_agg(distinct name,' / ') ELSE '*' END::text AS name,CASE count(*) WHEN 1 THEN MIN(start) ELSE NULL END::date AS start
SUM(tab.number*1)/count(*)::double precision AS number,SUM(tab.int_number*1)/count(*)::integer AS int_number,CASE WHEN count(distinct tab.name) = 1 THEN MIN(tab.name) WHEN count(*) < 5 THEN string_agg(distinct tab.name,' / ') ELSE '*' END::text AS name,CASE count(*) WHEN 1 THEN MIN(tab.start) ELSE NULL END::date AS start
{_vovw_2_base_t,_vovw_1_base_t,_vovw_0_base_t}
126
{_vovw_3_column_types_t,_vovw_2_column_types_t,_vovw_1_column_types_t,_vovw_0_column_types_t}
{_vovw_2_column_types_t,_vovw_1_column_types_t,_vovw_0_column_types_t}
(base_t,0,_vovw_0_base_t)
(base_t,1,_vovw_1_base_t)
(base_t,2,_vovw_2_base_t)
(base_t,3,_vovw_3_base_t)
(base_t,0,_vovw_0_base_t)
(base_t,1,_vovw_1_base_t)
(base_t,2,_vovw_2_base_t)
(base_t,3,_vovw_3_base_t)
(base_bare_t,0,_vovw_0_base_bare_t)
(base_bare_t,1,_vovw_1_base_bare_t)
(base_bare_t,2,_vovw_2_base_bare_t)
(base_bare_t,3,_vovw_3_base_bare_t)
(base_t,0,_vovw_0_base_t)
(base_t,1,_vovw_1_base_t)
(base_t,2,_vovw_2_base_t)
(base_t,3,_vovw_3_base_t)
(column_types_t,0,_vovw_0_column_types_t)
(column_types_t,1,_vovw_1_column_types_t)
(column_types_t,2,_vovw_2_column_types_t)
(column_types_t,3,_vovw_3_column_types_t)

@@ -3,4 +3,5 @@ SET SCHEMA 'cartodb';
\i scripts-available/CDB_TableMetadata.sql
\i scripts-available/CDB_ColumnNames.sql
\i scripts-available/CDB_ColumnType.sql
\i scripts-available/CDB_AnalysisCatalog.sql
SET SCHEMA 'public';

@@ -563,6 +563,13 @@ test_extension|public|"local-table-with-dashes"'
DATABASE=fdw_target tear_down_database
}
function test_cdb_catalog_basic_node() {
DEF="'{\"type\":\"buffer\",\"source\":\"b2db66bc7ac02e135fd20bbfef0fdd81b2d15fad\",\"radio\":10000}'"
sql postgres "INSERT INTO cartodb.cdb_analysis_catalog (node_id, analysis_def) VALUES ('1bbc4c41ea7c9d3a7dc1509727f698b7', ${DEF}::json)"
sql postgres "SELECT status from cartodb.cdb_analysis_catalog where node_id = '1bbc4c41ea7c9d3a7dc1509727f698b7'" should 'pending'
sql postgres "DELETE FROM cartodb.cdb_analysis_catalog"
}
#################################################### TESTS END HERE ####################################################
run_tests $@