Release version 0.16.3

Fixes overviews creation problem
Merge pull request #253 from CartoDB/252-estimateextent-quoting
13 changed files with 356 additions and 91 deletions
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 # cartodb/Makefile

 EXTENSION = cartodb
-EXTVERSION = 0.15.0
+EXTVERSION = 0.16.3

 SED = sed

@@ -65,6 +65,11 @@ UPGRADABLE = \
  0.14.3 \
  0.14.4 \
  0.15.0 \
+  0.15.1 \
+  0.16.0 \
+  0.16.1 \
+  0.16.2 \
+  0.16.3 \
  $(EXTVERSION)dev \
  $(EXTVERSION)next \
  $(END)
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,51 @@
-0.15.0 (2016-04-05)
+0.16.3 (2016-05-09)
+-------------------

+* Fix overview creation problem for organization users
+  with names that require quoting:
+  [#253](https://github.com/CartoDB/cartodb-postgresql/pull/253)
+
+0.16.2 (2016-04-27)
+-------------------
+
+* Use the mode to aggregate category columns in overviews
+  [#246](https://github.com/CartoDB/cartodb-postgresql/pull/246)
+
+0.16.1 (2016-04-25)
+-------------------
+
+* Optimize column information functions performance
+  [#238](https://github.com/CartoDB/cartodb-postgresql/pull/238)
+
+* Adjust overview points to pixel centers
+  [#242](https://github.com/CartoDB/cartodb-postgresql/pull/242)
+
+* Compute webmercator resolution using full numeric precision
+  [#243](https://github.com/CartoDB/cartodb-postgresql/pull/243)
+
+
+0.16.0 (2016-04-15)
+-------------------
+* Adds table for storing camshaft analysis nodes
+  [#237](https://github.com/CartoDB/cartodb-postgresql/pull/237)
+
+0.15.1 (2016-04-15)
+-------------------
+* Fix problems with org users in overviews functions
+  [#224](https://github.com/CartoDB/cartodb-postgresql/pull/224)
+* Add `_feature_count` to overviews
+  [#227](https://github.com/CartoDB/cartodb-postgresql/pull/227)
+* Change point clustering behaviour of overviews
+  [#228](https://github.com/CartoDB/cartodb-postgresql/pull/228)
+* Change default tolerance of overviews
+  [#230](https://github.com/CartoDB/cartodb-postgresql/pull/230)
+* Fix problem with aggregated numerical fields in overviews
+  [#233](https://github.com/CartoDB/cartodb-postgresql/pull/233)
+* Enhance aggregation of text fields in overviews
+  [#234](https://github.com/CartoDB/cartodb-postgresql/pull/234)
+
+0.15.0 (2016-04-05)
+-------------------
 * New function CDB_CreateOverviewsWithToleranceInPixels that adds tolerance parameter for overview creation
  [#221](https://github.com/CartoDB/cartodb-postgresql/pull/221)
 * New default value for the overviews tolerance in pixels is 2 (used to be 7.5) (also in #221)
@@ -8,7 +54,7 @@
  [#220](https://github.com/CartoDB/cartodb-postgresql/pull/220)

 0.14.4 (2016-03-29)
-
+-------------------
 * Fix creating overviews for tables with boolean columns
  [#214](https://github.com/CartoDB/cartodb-postgresql/pull/214)
 * Fix tests for some systems [#215](https://github.com/CartoDB/cartodb-postgresql/pull/215)
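The 0.16.2 entry refers to aggregating category columns by their mode. As a rough illustration of what the `_cdb_mode_of_array` function in this change appears to compute (most frequent value, ties resolved by value order), here is a minimal Python sketch. The name `mode_of_values` is hypothetical, and dropping `None` values is a simplification that may differ from how the SQL aggregate treats NULLs. Python also compares strings by code point, whereas PostgreSQL uses the column collation.

```python
from collections import Counter


def mode_of_values(values):
    """Return the most frequent value; ties go to the smallest value.

    Hypothetical counterpart of: GROUP BY 1 ORDER BY COUNT(1) DESC, 1 LIMIT 1.
    Simplification: None entries are dropped before counting.
    """
    present = [v for v in values if v is not None]
    if not present:
        return None
    counts = Counter(present)
    # Highest count first, then smallest value among equally frequent ones.
    return min(counts, key=lambda v: (-counts[v], v))


print(mode_of_values(["park", "school", "park", "shop"]))
print(mode_of_values(["b", "a"]))
```

The deterministic tie-break matters because the same clustered cell should yield the same category each time overviews are rebuilt.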
--- a/doc/CDB_Overviews.md
+++ b/doc/CDB_Overviews.md
@@ -2,18 +2,25 @@ Overviews are tables that represent a *reduced* version of a dataset intended
 for efficient rendering at certain zoom levels while preserving the
 general visual appearance of the complete dataset.

-The *reduction* consists in a fewer number of records
+The *reduction* consists of having fewer records
 (while each overview record may represent an aggregation of multiple records)
 and/or simplified record geometries.

-Overviews are created through the `CDB_CreateOverviews`.
+Overviews are created through the `CDB_CreateOverviews` function.
 The statement timeout may need to be adjusted before using this function,
 as overview creation for large tables is a time-consuming operation.

 The `CDB_Overviews` function can be used to determine what overview tables
 exist for a given dataset table and which zoom levels correspond to them.

-The `CDB_DropOverviews` remove a dataset's existing overviews.
+The `CDB_DropOverviews` function removes a dataset's existing overviews.
+
+To know if overview tables exist for some base table, and to obtain
+a list of which overview tables are appropriate for which zoom levels,
+the `CDB_Overviews` function can be used.
+
+The zoom levels referred to here are those used
+by the tiler: http://wiki.openstreetmap.org/wiki/Zoom_levels

 ### CDB_CreateOverviews

@@ -51,10 +58,14 @@ CDB_CreateOverviews(table_name, ref_z_strategy, reduction_strategy)
 #### Tolerance / level of detail

 The level of detail to be representable by each overview layer can
-be specified as a tolerance in pixels (if different from the default of 2 pixels)
+be specified as a tolerance in pixels (if different from the default of 1 pixel)
 with the function `CDB_CreateOverviewsWithToleranceInPixels`
 which has as a second additional argument the desired tolerance.

+This tolerance defines the maximum deviation in pixels of the overview
+geometries with respect to the original geometries when overview tables
+are used for their intended zoom level.
+
 ### CDB_Overviews

 Obtain overview metadata for a given table (existing overviews).
@@ -79,7 +90,7 @@ SELECT CDB_Overviews(CDB_QueryTablesText('SELECT * FROM table1, table2'));
 The result of `CDB_Overviews` has three columns:

 | base_table | z | overview_table |
-|------------+---+----------------|
+| ---------- | - | -------------- |
 | table1     | 1 | table1_ov1     |
 | table1     | 2 | table1_ov2     |
 | table1     | 4 | table1_ov4     |
--- a/doc/cartodbfy-requirements.rst
+++ b/doc/cartodbfy-requirements.rst
@@ -33,7 +33,7 @@ Additionally, a CartoDB table can contain other columns.

 See the `CartoDB User Table documentation`_

-.. _CartoDB User Table documentation: https://github.com/CartoDB/cartodb-postgresql/blob/master/doc/CartoDB-user-table.md 
+.. _CartoDB User Table documentation: https://github.com/CartoDB/cartodb-postgresql/blob/master/doc/CartoDB-user-table.rst 
 for further information.

 High level requirements
--- a/scripts-available/CDB_AnalysisCatalog.sql
+++ b/scripts-available/CDB_AnalysisCatalog.sql
@@ -0,0 +1,24 @@
+-- Table to register analysis nodes from https://github.com/cartodb/camshaft
+CREATE TABLE IF NOT EXISTS
+cartodb.cdb_analysis_catalog (
+    -- md5 hex hash
+    node_id char(40) CONSTRAINT cdb_analysis_catalog_pkey PRIMARY KEY,
+    -- storing it as json allows queries like analysis_def->>'type' = 'buffer'
+    analysis_def json NOT NULL,
+    -- can reference other nodes in this very same table, allowing recursive queries
+    input_nodes char(40) ARRAY NOT NULL DEFAULT '{}',
+    status TEXT NOT NULL DEFAULT 'pending',
+    CONSTRAINT valid_status CHECK (
+        status IN ( 'pending', 'waiting', 'running', 'canceled', 'failed', 'ready' )
+    ),
+    created_at timestamp with time zone NOT NULL DEFAULT now(),
+    -- should be updated when some operation was performed in the node
+    -- and anything associated to it might have changed
+    updated_at timestamp with time zone DEFAULT NULL,
+    -- should register last time the node was used
+    used_at timestamp with time zone NOT NULL DEFAULT now(),
+    -- should register the number of times the node was used
+    hits NUMERIC DEFAULT 0,
+    -- should register what was the last node using current node
+    last_used_from char(40)
+);
--- a/scripts-available/CDB_ColumnNames.sql
+++ b/scripts-available/CDB_ColumnNames.sql
@@ -2,15 +2,13 @@
 CREATE OR REPLACE FUNCTION CDB_ColumnNames(REGCLASS)
 RETURNS SETOF information_schema.sql_identifier
 AS $$
-
-    SELECT c.column_name
-      FROM information_schema.columns c, pg_class _tn, pg_namespace _sn
-      WHERE table_name = _tn.relname
-        AND table_schema = _sn.nspname
-        AND _tn.oid = $1::oid
-        AND _sn.oid = _tn.relnamespace
-      ORDER BY ordinal_position;
-
+  SELECT
+    a.attname::information_schema.sql_identifier column_name
+    FROM pg_class c
+         LEFT JOIN pg_attribute a ON a.attrelid = c.oid
+    WHERE c.oid = $1::oid
+    AND a.attstattarget < 0 -- exclude system columns
+   ORDER BY a.attnum;
 $$ LANGUAGE SQL;

 -- This is to migrate from pre-0.2.0 version
--- a/scripts-available/CDB_ColumnType.sql
+++ b/scripts-available/CDB_ColumnType.sql
@@ -2,15 +2,13 @@
 CREATE OR REPLACE FUNCTION CDB_ColumnType(REGCLASS, TEXT)
 RETURNS information_schema.character_data
 AS $$
-
-    SELECT c.data_type
-      FROM information_schema.columns c, pg_class _tn, pg_namespace _sn
-      WHERE table_name = _tn.relname
-        AND table_schema = _sn.nspname
-        AND column_name = $2
-        AND _tn.oid = $1::oid
-        AND _sn.oid = _tn.relnamespace;
-         
+  SELECT
+    format_type(a.atttypid, NULL)::information_schema.character_data data_type
+  FROM pg_class c
+       LEFT JOIN pg_attribute a ON a.attrelid = c.oid
+  WHERE c.oid = $1::oid
+  AND a.attname = $2
+  AND a.attstattarget < 0; -- exclude system columns
 $$ LANGUAGE SQL;

 -- This is to migrate from pre-0.2.0 version
--- a/scripts-available/CDB_Overviews.sql
+++ b/scripts-available/CDB_Overviews.sql
@@ -1,4 +1,24 @@
-- security definer
+-- Information about tables in a schema.
+-- If the schema name parameter is NULL, then tables from all schemas
+-- that may contain user tables are returned.
+-- For each table, the regclass, schema name and table name are returned.
+-- Scope: private.
+CREATE OR REPLACE FUNCTION _CDB_UserTablesInSchema(schema_name text DEFAULT NULL)
+RETURNS TABLE(table_regclass REGCLASS, schema_name TEXT, table_name TEXT)
+AS $$
+  SELECT
+    c.oid::regclass AS table_regclass,
+    n.nspname::text AS schema_name,
+    c.relname::text AS table_relname
+  FROM pg_class c
+  JOIN pg_namespace n ON n.oid = c.relnamespace
+  WHERE c.relkind = 'r'
+  AND c.relname NOT IN ('cdb_tablemetadata', 'spatial_ref_sys')
+  AND CASE WHEN schema_name IS NULL
+             THEN n.nspname NOT IN ('pg_catalog', 'information_schema', 'topology', 'cartodb')
+           ELSE n.nspname = schema_name
+           END;
+$$ LANGUAGE 'sql';

 -- Pattern that can be used to detect overview tables and Extract
 -- the intended zoom level from the table name.
@@ -68,6 +88,26 @@ AS $$
  END;
 $$ LANGUAGE PLPGSQL IMMUTABLE;

+CREATE OR REPLACE FUNCTION _CDB_OverviewBaseTable(overview_table REGCLASS)
+RETURNS REGCLASS
+AS $$
+  DECLARE
+    table_name TEXT;
+    schema_name TEXT;
+    base_name TEXT;
+    base_table REGCLASS;
+  BEGIN
+    SELECT * FROM _cdb_split_table_name(overview_table) INTO schema_name, table_name;
+    base_name := _CDB_OverviewBaseTableName(table_name);
+    IF base_name != table_name THEN
+      base_table := Format('%I.%I', schema_name, base_name)::regclass;
+    ELSE
+      base_table := overview_table;
+    END IF;
+    RETURN base_table;
+  END;
+$$ LANGUAGE PLPGSQL IMMUTABLE;
+
 -- Schema and relation names of a table given its reloid
 -- Scope: private.
 -- Parameters
@@ -120,7 +160,7 @@ BEGIN
    FOR row IN
        SELECT * FROM CDB_Overviews(reloid)
    LOOP
-        EXECUTE Format('DROP TABLE %I.%I;', schema_name, row.overview_table);
+        EXECUTE Format('DROP TABLE %s;', row.overview_table);
        RAISE NOTICE 'Dropped overview for level %: %', row.z, row.overview_table;
    END LOOP;
 END;
@@ -140,16 +180,15 @@ RETURNS TABLE(base_table REGCLASS, z integer, overview_table REGCLASS)
 AS $$
  DECLARE
    schema_name TEXT;
-    table_name TEXT;
+    base_table_name TEXT;
  BEGIN
-    -- TODO: review implementation of CDB_UserTables an suitability for this
-    SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
+    SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, base_table_name;
    RETURN QUERY SELECT
      reloid AS base_table,
-      _CDB_OverviewTableZ(cdb_usertables) AS z,
-      ('"' || schema_name|| '"."' ||cdb_usertables || '"')::regclass AS overview_table
-      FROM CDB_UserTables()
-      WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=reloid), cdb_usertables)
+      _CDB_OverviewTableZ(table_name) AS z,
+      table_regclass AS overview_table
+      FROM _CDB_UserTablesInSchema(schema_name)
+      WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=reloid), table_name)
      ORDER BY z;
  END
 $$ LANGUAGE PLPGSQL;
@@ -168,11 +207,13 @@ RETURNS TABLE(base_table REGCLASS, z integer, overview_table REGCLASS)
 AS $$
  SELECT
    base_table::regclass AS base_table,
-    _CDB_OverviewTableZ(cdb_usertables) AS z,
-    ('"' || _cdb_schema_name(base_table::regclass) || '"."' || cdb_usertables || '"')::regclass AS overview_table
+    _CDB_OverviewTableZ(table_name) AS z,
+    table_regclass AS overview_table
    FROM
-      CDB_UserTables(), unnest(tables) base_table
-    WHERE _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=base_table), cdb_usertables)
+      _CDB_UserTablesInSchema(), unnest(tables) base_table
+    WHERE
+      schema_name = _cdb_schema_name(base_table)
+      AND _CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=base_table), table_name)
    ORDER BY base_table, z;
 $$ LANGUAGE SQL;

@@ -194,17 +235,23 @@ AS $$
      FROM pg_class c JOIN pg_namespace n on n.oid = c.relnamespace WHERE c.oid = reloid::oid;

    ext_query = format(
-      'SELECT ST_EstimatedExtent(''%1$I'', ''%2$I'', ''%3$I'');',
+      'SELECT ST_EstimatedExtent(''%1$s'', ''%2$s'', ''%3$s'');',
      table_id.schema_name, table_id.table_name, 'the_geom_webmercator'
    );

    BEGIN
      EXECUTE ext_query INTO ext;
-      EXCEPTION
+    EXCEPTION
        -- This is the typical ERROR: stats for "mytable" do not exist
        WHEN internal_error THEN
          -- Get stats and execute again
-          EXECUTE format('ANALYZE %1$I', reloid);
+          EXECUTE format('ANALYZE %1$s', reloid);
+
+          -- We check the geometry type in case the error is due to empty geometries
+          IF _CDB_GeometryTypes(reloid) IS NULL THEN
+            RETURN NULL;
+          END IF;
+
          EXECUTE ext_query INTO ext;
    END;

@@ -370,7 +417,7 @@ AS $$

    SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;

-    EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name.overview_rel);
+    EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name, overview_rel);

    -- Estimate number of rows
    SELECT reltuples, relpages FROM pg_class INTO STRICT class_info
@@ -384,16 +431,16 @@ AS $$
    ELSE
      num_samples := ceil(class_info.reltuples*fraction);
      EXECUTE Format('
-        CREATE TABLE %1$I AS SELECT * FROM %2$s
+        CREATE TABLE %4$I.%1$I AS SELECT * FROM %2$s
          WHERE ctid = ANY (
            ARRAY[
              (SELECT CDB_RandomTids(''%2$s'', %3$s))
            ]
          );
-      ', overview_rel, reloid, num_samples);
+      ', overview_rel, reloid, num_samples, schema_name);
    END IF;

-    RETURN overview_rel;
+    RETURN Format('%I.%I', schema_name, overview_rel)::regclass;
  END;
 $$ LANGUAGE PLPGSQL;

@@ -429,9 +476,12 @@ AS $$

      -- preserve the owner of the base table
      SELECT u.usename
-        FROM pg_catalog.pg_class c JOIN pg_catalog.pg_user u ON (c.relowner=u.usesysid)
-        WHERE c.relname = dataset::text
+        FROM pg_catalog.pg_class c
+          JOIN pg_catalog.pg_user u ON (c.relowner=u.usesysid)
+          JOIN pg_namespace n ON n.oid = c.relnamespace
+        WHERE c.relname = dataset_name::text AND n.nspname = dataset_scheme
        INTO table_owner;
+
      EXECUTE Format('ALTER TABLE IF EXISTS %s OWNER TO %I;', overview_table::text, table_owner);

      -- preserve the table privileges
@@ -485,6 +535,70 @@ BEGIN
 END
 $$ LANGUAGE PLPGSQL STABLE;

+-- Check if a column of a table is of an unlimited-length text type
+CREATE OR REPLACE FUNCTION _cdb_unlimited_text_column(reloid REGCLASS, col_name TEXT)
+RETURNS BOOLEAN
+AS $$
+  SELECT EXISTS (
+    SELECT a.attname
+    FROM pg_class c
+         LEFT JOIN pg_attribute a ON a.attrelid = c.oid
+         LEFT JOIN pg_type t ON t.oid = a.atttypid
+    WHERE c.oid = reloid
+      AND a.attname = col_name
+      AND format_type(a.atttypid, NULL) IN ('text', 'character varying', 'character')
+      AND format_type(a.atttypid, NULL) = format_type(a.atttypid, a.atttypmod)
+  );
+$$ LANGUAGE SQL STABLE;
+
+CREATE OR REPLACE FUNCTION _cdb_categorical_column(reloid REGCLASS, col_name TEXT)
+RETURNS BOOLEAN
+AS $$
+DECLARE
+    schema_name TEXT;
+    table_name TEXT;
+    available BOOLEAN;
+    categorical BOOLEAN;
+BEGIN
+    SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;
+    SELECT n_distinct IS NOT NULL
+    FROM pg_stats
+    WHERE pg_stats.schemaname = schema_name
+      AND pg_stats.tablename = table_name
+      AND pg_stats.attname = col_name
+    INTO available;
+    IF available IS NULL OR NOT available THEN
+      EXECUTE Format('ANALYZE %s;', reloid);
+    END IF;
+    SELECT n_distinct > 0 AND n_distinct <= 20
+    FROM pg_stats
+    WHERE pg_stats.schemaname = schema_name
+      AND pg_stats.tablename = table_name
+      AND pg_stats.attname = col_name
+    INTO categorical;
+    RETURN categorical;
+END;
+$$ LANGUAGE PLPGSQL VOLATILE;
+
+CREATE OR REPLACE FUNCTION _cdb_mode_of_array(anyarray)
+  RETURNS anyelement AS
+$$
+    SELECT a
+    FROM unnest($1) a
+    GROUP BY 1
+    ORDER BY COUNT(1) DESC, 1
+    LIMIT 1;
+$$
+LANGUAGE SQL IMMUTABLE;
+
+DROP AGGREGATE IF EXISTS _cdb_mode(anyelement);
+CREATE AGGREGATE _cdb_mode(anyelement) (
+  SFUNC=array_append,
+  STYPE=anyarray,
+  FINALFUNC=_cdb_mode_of_array,
+  INITCOND='{}'
+);
+
 -- SQL Aggregation expression for a dataset attribute
 -- Scope: private.
 -- Parameters
@@ -499,6 +613,10 @@ AS $$
 DECLARE
  column_type TEXT;
  qualified_column TEXT;
+  has_counter_column BOOLEAN;
+  feature_count TEXT;
+  total_feature_count TEXT;
+  base_table REGCLASS;
 BEGIN
  IF table_alias <> '' THEN
    qualified_column := Format('%I.%I', table_alias, column_name);
@@ -508,19 +626,42 @@ BEGIN

  column_type := CDB_ColumnType(reloid, column_name);

+  SELECT EXISTS (
+    SELECT * FROM CDB_ColumnNames(reloid)  as colname WHERE colname = '_feature_count'
+  ) INTO has_counter_column;
+  IF has_counter_column THEN
+    feature_count := '_feature_count';
+    total_feature_count := 'SUM(_feature_count)';
+  ELSE
+    feature_count := '1';
+    total_feature_count := 'count(*)';
+  END IF;
+
+  base_table := _CDB_OverviewBaseTable(reloid);
+
  CASE column_type
-  WHEN 'double precision', 'real', 'integer', 'bigint' THEN
-    RETURN Format('AVG(%s)::' || column_type, qualified_column);
-  WHEN 'text' THEN
-    -- TODO: we could define a new aggregate function that returns distinct
-    -- separated values with a limit, adding ellipsis if more values existed
-    -- e.g. with '/' as separator and a limit of three:
-    --     'A', 'B', 'A', 'C', 'D' => 'A/B/C/...'
-    -- Other ideas: if value is unique then use it, otherwise use something
-    -- like '*' or '(varies)' or '(multiple values)', or NULL
-    -- Using 'string_agg(' || qualified_column || ',''/'')'
-    -- here causes
-    RETURN 'CASE count(*) WHEN 1 THEN MIN(' || qualified_column || ') ELSE NULL END::' || column_type;
+  WHEN 'double precision', 'real', 'integer', 'bigint', 'numeric' THEN
+    IF column_name = '_feature_count' THEN
+      RETURN 'SUM(_feature_count)';
+    ELSE
+      IF column_type = 'integer' AND _cdb_categorical_column(base_table, column_name) THEN
+        RETURN Format('CDB_Math_Mode(%s)::', qualified_column) || column_type;
+      ELSE
+        RETURN Format('SUM(%s*%s)/%s::' || column_type, qualified_column, feature_count, total_feature_count);
+      END IF;
+    END IF;
+  WHEN 'text', 'character varying', 'character' THEN
+    IF _cdb_categorical_column(base_table, column_name) THEN
+      RETURN Format('_cdb_mode(%s)::', qualified_column) || column_type;
+    ELSE
+      IF _cdb_unlimited_text_column(base_table, column_name) THEN
+        -- TODO: this should not be applied to columns containing largish text;
+        -- it is intended only for short names/identifiers
+        RETURN  'CASE WHEN count(distinct ' || qualified_column || ') = 1 THEN MIN(' || qualified_column || ') WHEN ' || total_feature_count || ' < 5 THEN string_agg(distinct ' || qualified_column || ','' / '') ELSE ''*'' END::' || column_type;
+      ELSE
+        RETURN 'CASE count(*) WHEN 1 THEN MIN(' || qualified_column || ') ELSE NULL END::' || column_type;
+      END IF;
+    END IF;
  WHEN 'boolean' THEN
    RETURN 'CASE count(*) WHEN 1 THEN BOOL_AND(' || qualified_column || ') ELSE NULL END::' || column_type;
  ELSE
@@ -589,19 +730,25 @@ AS $$
    overview_rel TEXT;
    reduction FLOAT8;
    base_name TEXT;
+    pixel_m FLOAT8;
    grid_m FLOAT8;
+    offset_m FLOAT8;
+    offset_x TEXT;
+    offset_y TEXT;
+    cell_x TEXT;
+    cell_y TEXT;
    aggr_attributes TEXT;
    attributes TEXT;
    columns TEXT;
    gtypes TEXT[];
    schema_name TEXT;
    table_name TEXT;
+    point_geom TEXT;
  BEGIN
    SELECT _CDB_GeometryTypes(reloid) INTO gtypes;
-    IF array_upper(gtypes, 1) <> 1 OR gtypes[1] <> 'ST_Point' THEN
+    IF gtypes IS NULL OR array_upper(gtypes, 1) <> 1 OR gtypes[1] <> 'ST_Point' THEN
      -- This strategy only supports datasets with point geometry
      RETURN NULL;
-      RETURN 'x';
    END IF;

    --TODO: check applicability: geometry type, minimum number of points...
@@ -610,13 +757,15 @@ AS $$

    -- Grid size in pixels at Z level overview_z
    IF grid_px IS NULL THEN
-      grid_px := 7.5;
+      grid_px := 1.0;
    END IF;

    SELECT * FROM _cdb_split_table_name(reloid) INTO schema_name, table_name;

-    -- compute grid cell size using the overview_z dimension...
-    SELECT CDB_XYZ_Resolution(overview_z)*grid_px INTO grid_m;
+    -- pixel_m: size of a pixel in webmercator units (meters)
+    SELECT CDB_XYZ_Resolution(overview_z) INTO pixel_m;
+    -- grid size in meters
+    grid_m = grid_px * pixel_m;

    attributes := _CDB_Aggregable_Attributes_Expression(reloid);
    aggr_attributes := _CDB_Aggregated_Attributes_Expression(reloid);
@@ -627,19 +776,31 @@ AS $$
      aggr_attributes := aggr_attributes || ', ';
    END IF;

+    -- Center of each cell:
+    cell_x := Format('gx*%1$s + %2$s', grid_m, grid_m/2);
+    cell_y := Format('gy*%1$s + %2$s', grid_m, grid_m/2);
+
+    -- Displacement to the nearest pixel center:
+    IF MOD(grid_px::numeric, 1.0::numeric) = 0 THEN
+      offset_m := pixel_m/2 - MOD((grid_m/2)::numeric, pixel_m::numeric)::float8;
+      offset_x := Format('%s', offset_m);
+      offset_y := Format('%s', offset_m);
+    ELSE
+      offset_x := Format('%2$s/2 - MOD((%1$s)::numeric, (%2$s)::numeric)::float8', cell_x, pixel_m);
+      offset_y := Format('%2$s/2 - MOD((%1$s)::numeric, (%2$s)::numeric)::float8', cell_y, pixel_m);
+    END IF;
+
+    point_geom := Format('ST_SetSRID(ST_MakePoint(%1$s + %3$s, %2$s + %4$s), 3857)', cell_x, cell_y, offset_x, offset_y);
+
    -- compute the resulting columns in the same order as in the base table
-    -- cartodb_id,
-    -- ST_Transform(ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857), 4326) AS the_geom,
-    -- ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857) AS the_geom_webmercator
-    -- %4$s
    WITH cols AS (
      SELECT
        CASE c
        WHEN 'cartodb_id' THEN 'cartodb_id'
        WHEN 'the_geom' THEN
-          'ST_Transform(ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857), 4326) AS the_geom'
+          Format('ST_Transform(%s, 4326) AS the_geom', point_geom)
        WHEN 'the_geom_webmercator' THEN
-           'ST_SetSRID(ST_MakePoint(sx/n, sy/n), 3857) AS the_geom_webmercator'
+           Format('%s AS the_geom_webmercator', point_geom)
        ELSE c
        END AS column
        FROM CDB_ColumnNames(reloid) c
@@ -648,6 +809,10 @@ AS $$
      SELECT * FROM cols
    ) AS s INTO columns;

+    IF NOT columns LIKE '%_feature_count%' THEN
+      columns := columns || ', n AS _feature_count';
+    END IF;
+
    EXECUTE Format('DROP TABLE IF EXISTS %I.%I CASCADE;', schema_name, overview_rel);

    -- Now we cluster the data using a grid of size grid_m
@@ -655,13 +820,11 @@ AS $$
    -- If we had a selected numeric attribute of interest we could use it
    -- as a weight for the average coordinates.
    EXECUTE Format('
-      CREATE TABLE %3$I AS
+      CREATE TABLE %7$I.%3$I AS
         WITH clusters AS (
           SELECT
             %5$s
             count(*) AS n,
-             SUM(ST_X(f.the_geom_webmercator)) AS sx,
-             SUM(ST_Y(f.the_geom_webmercator)) AS sy,
             Floor(ST_X(f.the_geom_webmercator)/%2$s)::int AS gx,
             Floor(ST_Y(f.the_geom_webmercator)/%2$s)::int AS gy,
             MIN(cartodb_id) AS cartodb_id
@@ -669,9 +832,9 @@ AS $$
          GROUP BY gx, gy
         )
         SELECT %6$s FROM clusters
-    ', reloid::text, grid_m, overview_rel, attributes, aggr_attributes, columns);
+    ', reloid::text, grid_m, overview_rel, attributes, aggr_attributes, columns, schema_name);

-    RETURN overview_rel;
+    RETURN Format('%I.%I', schema_name, overview_rel)::regclass;
  END;
 $$ LANGUAGE PLPGSQL;

@@ -693,7 +856,7 @@ DECLARE
  tolerance_px FLOAT8;
 BEGIN
  -- Use the default tolerance
-  tolerance_px := 2.0;
+  tolerance_px := 1.0;
  RETURN CDB_CreateOverviewsWithToleranceInPixels(reloid, tolerance_px, refscale_strategy, reduce_strategy);
 END;
 $$ LANGUAGE PLPGSQL;
@@ -710,10 +873,15 @@ DECLARE
  overview_z integer;
  overview_tables REGCLASS[];
  overviews_step integer := 1;
+  has_counter_column boolean;
 BEGIN
  -- Determine the reference zoom level
  EXECUTE 'SELECT ' || quote_ident(refscale_strategy::text) || Format('(''%s'', %s);', reloid, tolerance_px) INTO ref_z;

+  IF ref_z < 0 OR ref_z IS NULL THEN
+    RETURN NULL;
+  END IF;
+
  -- Determine overlay zoom levels
  -- TODO: should be handled by the refscale_strategy?
  overview_z := ref_z - 1;
@@ -735,6 +903,17 @@ BEGIN
    SELECT array_append(overview_tables, base_rel) INTO overview_tables;
  END LOOP;

+  IF overview_tables IS NOT NULL AND array_length(overview_tables, 1) > 0 THEN
+    SELECT EXISTS (
+      SELECT * FROM CDB_ColumnNames(reloid)  as colname WHERE colname = '_feature_count'
+    ) INTO has_counter_column;
+    IF NOT has_counter_column THEN
+      EXECUTE Format('
+        ALTER TABLE %s ADD COLUMN _feature_count integer DEFAULT 1;
+      ', reloid);
+    END IF;
+  END IF;
+
  RETURN overview_tables;
 END;
 $$ LANGUAGE PLPGSQL;
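The point-clustering hunk above replaces the averaged coordinates (`sx/n`, `sy/n`) with the center of each grid cell, displaced to a pixel center. The arithmetic can be sketched in Python as follows. This is an illustrative transliteration, not the extension's code: the function names and the sample pixel size are assumptions. `math.fmod` is used because, like PostgreSQL's `MOD`, its result takes the sign of the dividend.

```python
import math


def cell_center(g: int, grid_m: float) -> float:
    """Center of grid cell index g, as in gx*grid_m + grid_m/2."""
    return g * grid_m + grid_m / 2


def snap_to_pixel_center(coord: float, pixel_m: float) -> float:
    """Apply the offset pixel_m/2 - MOD(coord, pixel_m) to a coordinate."""
    offset = pixel_m / 2 - math.fmod(coord, pixel_m)
    return coord + offset


pixel_m = 2.0          # illustrative pixel size, not a real zoom level
grid_px = 1.5          # fractional grid size, so the offset varies per cell
grid_m = grid_px * pixel_m

for g in (0, 1, 2):
    c = cell_center(g, grid_m)
    print(g, c, snap_to_pixel_center(c, pixel_m))
```

Every snapped coordinate has the form `(k + 0.5) * pixel_m` for some integer `k`. When `grid_px` is a whole number the offset is the same for every cell, which is presumably why the SQL computes it once as `offset_m` in that branch.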
--- a/scripts-available/CDB_XYZ.sql
+++ b/scripts-available/CDB_XYZ.sql
@@ -6,7 +6,7 @@ CREATE OR REPLACE FUNCTION CDB_XYZ_Resolution(z INTEGER)
 RETURNS FLOAT8
 AS $$
  -- circumference divided by 256 is z0 resolution, then divide by 2^z
-  SELECT 40075017.0 / 256 / power(2, z);
+  SELECT 6378137.0*2.0*pi() / 256.0 / power(2.0, z);
 $$ LANGUAGE SQL IMMUTABLE STRICT;
 -- }

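To get a feel for how much the `CDB_XYZ_Resolution` change matters, the two expressions can be compared numerically. The Python sketch below mirrors both formulas; the function names are hypothetical and the 256-pixel tile size is taken from the SQL comment.

```python
import math

# WGS84 equatorial radius in meters, the constant used in the new SQL expression.
EARTH_RADIUS_M = 6378137.0


def xyz_resolution(z: int) -> float:
    """Meters per pixel at zoom level z, assuming 256-pixel tiles."""
    return EARTH_RADIUS_M * 2.0 * math.pi / 256.0 / 2.0 ** z


def xyz_resolution_rounded(z: int) -> float:
    """Previous expression, with the circumference rounded to 40075017 m."""
    return 40075017.0 / 256 / 2 ** z


for z in (0, 10, 18):
    new, old = xyz_resolution(z), xyz_resolution_rounded(z)
    print(f"z={z}: new={new:.9f} old={old:.9f} relative diff={(old - new) / new:.2e}")
```

The relative difference is tiny and independent of `z`, but the pixel-center adjustment in `CDB_Overviews.sql` takes coordinates modulo the pixel size, so a small error in the resolution can accumulate over large coordinates.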
--- a/scripts-enabled/260-CDB_AnalysisCatalog.sql
+++ b/scripts-enabled/260-CDB_AnalysisCatalog.sql
@@ -0,0 +1 @@
+../scripts-available/CDB_AnalysisCatalog.sql
--- a/test/CDB_OverviewsTest_expect
+++ b/test/CDB_OverviewsTest_expect
@@ -9,35 +9,30 @@ SELECT 1114



-{_vovw_3_base_bare_t,_vovw_2_base_bare_t,_vovw_1_base_bare_t,_vovw_0_base_bare_t}
-113
+{_vovw_2_base_bare_t,_vovw_1_base_bare_t,_vovw_0_base_bare_t}
+126
 number,int_number,name,start
-AVG(number)::double precision AS number,AVG(int_number)::integer AS int_number,CASE count(*) WHEN 1 THEN MIN(name) ELSE NULL END::text AS name,CASE count(*) WHEN 1 THEN MIN(start) ELSE NULL END::date AS start
-AVG(tab.number)::double precision AS number,AVG(tab.int_number)::integer AS int_number,CASE count(*) WHEN 1 THEN MIN(tab.name) ELSE NULL END::text AS name,CASE count(*) WHEN 1 THEN MIN(tab.start) ELSE NULL END::date AS start
-{_vovw_3_base_t,_vovw_2_base_t,_vovw_1_base_t,_vovw_0_base_t}
-113
+SUM(number*1)/count(*)::double precision AS number,SUM(int_number*1)/count(*)::integer AS int_number,CASE WHEN count(distinct name) = 1 THEN MIN(name) WHEN count(*) < 5 THEN string_agg(distinct name,' / ') ELSE '*' END::text AS name,CASE count(*) WHEN 1 THEN MIN(start) ELSE NULL END::date AS start
+SUM(tab.number*1)/count(*)::double precision AS number,SUM(tab.int_number*1)/count(*)::integer AS int_number,CASE WHEN count(distinct tab.name) = 1 THEN MIN(tab.name) WHEN count(*) < 5 THEN string_agg(distinct tab.name,' / ') ELSE '*' END::text AS name,CASE count(*) WHEN 1 THEN MIN(tab.start) ELSE NULL END::date AS start
+{_vovw_2_base_t,_vovw_1_base_t,_vovw_0_base_t}
+126

-{_vovw_3_column_types_t,_vovw_2_column_types_t,_vovw_1_column_types_t,_vovw_0_column_types_t}
+{_vovw_2_column_types_t,_vovw_1_column_types_t,_vovw_0_column_types_t}
 (base_t,0,_vovw_0_base_t)
 (base_t,1,_vovw_1_base_t)
 (base_t,2,_vovw_2_base_t)
-(base_t,3,_vovw_3_base_t)
 (base_t,0,_vovw_0_base_t)
 (base_t,1,_vovw_1_base_t)
 (base_t,2,_vovw_2_base_t)
-(base_t,3,_vovw_3_base_t)
 (base_bare_t,0,_vovw_0_base_bare_t)
 (base_bare_t,1,_vovw_1_base_bare_t)
 (base_bare_t,2,_vovw_2_base_bare_t)
-(base_bare_t,3,_vovw_3_base_bare_t)
 (base_t,0,_vovw_0_base_t)
 (base_t,1,_vovw_1_base_t)
 (base_t,2,_vovw_2_base_t)
-(base_t,3,_vovw_3_base_t)
 (column_types_t,0,_vovw_0_column_types_t)
 (column_types_t,1,_vovw_1_column_types_t)
 (column_types_t,2,_vovw_2_column_types_t)
-(column_types_t,3,_vovw_3_column_types_t)



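The expected expressions above change from `AVG(x)` to `SUM(x*count)/total`. Since overviews are created in cascade, each level from the one below it, a plain average of already-averaged rows would give every cluster the same weight regardless of how many features it represents. A small Python sketch of the difference, with made-up values and a hypothetical `weighted_mean` helper:

```python
def weighted_mean(rows):
    """Mean of (value, feature_count) pairs, weighted by feature_count.

    Mirrors SUM(value * _feature_count) / SUM(_feature_count).
    """
    rows = list(rows)
    total = sum(count for _, count in rows)
    return sum(value * count for value, count in rows) / total


# Base table: every row stands for exactly one feature.
base = [10.0, 20.0, 30.0, 100.0]

# First overview level: the first three rows fall in one cell, the last one alone.
level1 = [
    (weighted_mean((v, 1) for v in base[:3]), 3),
    (100.0, 1),
]

# Second overview level: both clusters fall in the same cell.
weighted = weighted_mean(level1)
plain = sum(value for value, _ in level1) / len(level1)

print("true mean of base rows:", sum(base) / len(base))
print("weighted by _feature_count:", weighted)
print("plain average of averages:", plain)
```

Only the weighted version reproduces the mean of the base rows at the second level.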
--- a/test/extension/run_at_cartodb_schema.sql
+++ b/test/extension/run_at_cartodb_schema.sql
@@ -3,4 +3,5 @@ SET SCHEMA 'cartodb';
 \i scripts-available/CDB_TableMetadata.sql
 \i scripts-available/CDB_ColumnNames.sql
 \i scripts-available/CDB_ColumnType.sql
+\i scripts-available/CDB_AnalysisCatalog.sql
 SET SCHEMA 'public';
--- a/test/extension/test.sh
+++ b/test/extension/test.sh
@@ -563,6 +563,13 @@ test_extension|public|"local-table-with-dashes"'
    DATABASE=fdw_target tear_down_database
 }

+function test_cdb_catalog_basic_node() {
+    DEF="'{\"type\":\"buffer\",\"source\":\"b2db66bc7ac02e135fd20bbfef0fdd81b2d15fad\",\"radio\":10000}'"
+    sql postgres "INSERT INTO cartodb.cdb_analysis_catalog (node_id, analysis_def) VALUES ('1bbc4c41ea7c9d3a7dc1509727f698b7', ${DEF}::json)"
+    sql postgres "SELECT status from cartodb.cdb_analysis_catalog where node_id = '1bbc4c41ea7c9d3a7dc1509727f698b7'" should 'pending'
+    sql postgres "DELETE FROM cartodb.cdb_analysis_catalog"
+}
+
 #################################################### TESTS END HERE ####################################################

 run_tests $@
Author	SHA1	Message	Date
Javier Goizueta	f5f59be5b0	Release version 0.16.3 Fixes overviews creation problem	2016-05-09 13:08:50 +02:00
Javier Goizueta	d99dc394c2	Merge pull request #253 from CartoDB/252-estimateextent-quoting Do not quote arguments to ST_EstimatedExtent	2016-05-09 13:04:44 +02:00
Javier Goizueta	8d7860dc7a	Fixes #252	2016-05-09 11:54:56 +02:00
Javier Goizueta	b5427c65c8	Drop aggregate to be defined Otherwise future versions will fail to recreate the aggregate	2016-04-29 08:46:01 +02:00
Javier Goizueta	8f1435c049	Release 0.16.2	2016-04-27 18:30:26 +02:00
Javier Goizueta	8302f89413	Merge pull request #246 from CartoDB/245-categories-mode Use the mode to aggregate category columns in overviews	2016-04-27 18:16:05 +02:00
Javier Goizueta	e9050178a8	Merge branch 'master' of github.com:CartoDB/cartodb-postgresql	2016-04-27 16:23:46 +02:00
Javier Goizueta	3e34ca4654	Overviews documentation fixes	2016-04-27 16:23:25 +02:00
Javier Goizueta	a067cc7da1	Generate stats used to identify category columns in overviews if needed This only generates the stats if no stats are available for a table. This doesn't warrant that the stats are up to date or accurate.	2016-04-27 15:06:09 +02:00
Javier Goizueta	2c43943df6	Fix syntax	2016-04-26 18:27:52 +02:00
Javier Goizueta	417cbe7902	Fix category columns aggregation in overviews Overviews are created in cascade, each one from the inmediate lower level, but the stats to decide if a column is a category should be taken always from the base table.	2016-04-26 18:02:25 +02:00
Javier Goizueta	9a73703954	Use mode to aggregate categorical columns in overviews Fixes #245	2016-04-26 15:15:24 +02:00
Rafa de la Torre	36ac831bd1	Update cartodbfy-requirements.rst Fix broken link to doc	2016-04-26 14:43:24 +02:00
Javier Goizueta	1358964628	Release 0.16.1	2016-04-25 18:47:42 +02:00
Javier Goizueta	efe381ad94	Merge pull request #243 from CartoDB/241-webmercator Compute webmercator resolution with full accuracy	2016-04-25 17:30:40 +02:00
Javier Goizueta	f7cce21eb7	Merge pull request #242 from CartoDB/240-overviews-pixels Adjust overview points to pixel centers	2016-04-25 17:30:25 +02:00
Javier Goizueta	18267477da	Merge pull request #238 from CartoDB/235-column-names Optimize column information functions	2016-04-25 17:30:07 +02:00
Javier Goizueta	11ad45306f	Remove unneeded pg_catalog schema name	2016-04-25 16:30:58 +02:00
Javier Goizueta	75c7ae98e4	Compute webmercator resolution with full accuracy Fixes #241	2016-04-25 14:02:26 +02:00
Javier Goizueta	3c12cf629f	Optimize overview pixel adjustment for integer-pixel cells	2016-04-25 13:53:59 +02:00
Javier Goizueta	7b2100b51e	Adjust overview coordinates to pixel centers This makes the adjustment for all grid sizes, not only for integral number of pixels.	2016-04-25 13:33:43 +02:00
Javier Goizueta	580ec38ab8	Adjust overview clustered point to pixel centers Fixes #240	2016-04-23 15:07:06 +02:00
Raul Ochoa	897689dd43	Release 0.16.0	2016-04-19 15:44:37 +02:00
Raul Ochoa	808fc9fc25	Merge pull request #237 from CartoDB/analysis-catalog Adds table for storing camshaft analysis nodes	2016-04-19 15:32:46 +02:00
Javier Goizueta	65415bb335	Optimize function CDB_ColumnType	2016-04-18 19:07:33 +02:00
Javier Goizueta	06ebb27160	Optimize internal function _cdb_unlimited_text_column	2016-04-18 18:50:37 +02:00
Javier Goizueta	bd5ae84e90	Optimize CDB_ColumnNames This implementation is about 1000 times faster	2016-04-18 18:49:58 +02:00
Raul Ochoa	de5a702510	Adds table for storing camshaft analysis nodes	2016-04-18 17:41:39 +02:00
Javier Goizueta	6908fb4672	Release 0.15.1 Overviews bugfixes & enhancements	2016-04-15 18:15:35 +02:00
Javier Goizueta	a528a250d4	Merge pull request #234 from CartoDB/231-overviews-text-aggr Aggregate small number of text items in overviews	2016-04-15 18:04:07 +02:00
Javier Goizueta	ef43623f77	Remove unneeded variable	2016-04-15 17:58:03 +02:00
Javier Goizueta	09ad550de3	Fix tests	2016-04-15 17:50:47 +02:00
Javier Goizueta	1b0f77aa96	Always retain single-valued aggregated texts This makes columns which have the same value in a group to be aggregated maintain that value (rather than replacing it by the multiple-value indicator *) whatever the group value is. (Previously this happened only for small groups)	2016-04-15 17:49:00 +02:00
Javier Goizueta	45f063d469	Aggregate small number of text items in overviews Instead of nulling text fields for non-singleton aggregated records concatenate distinct text values when they're few (5 or less). Fixes #231	2016-04-15 12:37:16 +02:00
Carla	20989e2f28	Merge pull request #233 from CartoDB/232-overviews-avg Fix AVG computation in overview tables	2016-04-15 11:10:26 +02:00
Javier Goizueta	176d69d09e	Fix AVG computation in overview tables Fixes #232 Averages of averages are not equal to overall averages.	2016-04-15 10:48:08 +02:00
Javier Goizueta	9fdbfda60a	Merge pull request #228 from CartoDB/225-no-centroid-master Use cell centers, not cluster centroids when grouping points	2016-04-15 10:06:44 +02:00
Javier Goizueta	9a3d93976c	Merge pull request #227 from CartoDB/226-add_count_aggregated_features Include and aggregate _vovw_count column to count aggregated features	2016-04-15 10:06:05 +02:00
Javier Goizueta	46b45f6dd4	Merge pull request #224 from CartoDB/223-fix-dropoverviews Fix CDB_DropOverviews and CDB_Overviews problems	2016-04-15 10:05:28 +02:00
Carla Iriberri	fd14750ce5	Rename _vovw_count to _feature_count	2016-04-14 18:23:09 +02:00
Javier Goizueta	c595e45c11	Add _vovw_count column to tables for which overviews are created Initially we planned to add this column to the queries sent to the tiler only, but that makes the column hard to access from the editor.	2016-04-14 17:32:18 +02:00
Carla	1cf7074fb1	Merge pull request #230 from CartoDB/229-set_tolerance_px_to_1_overviews Set tolerance to 1 pixel in overviews by default	2016-04-14 17:26:37 +02:00
Javier Goizueta	f785e71d3b	Fix: numeric is a valid numeric column type Actually this is the type of aggregated _vovw_count columns	2016-04-14 15:46:03 +02:00
Carla	14b8cd7d99	Set default value to 1 and remove typo line	2016-04-14 12:12:17 +02:00
Carla	213adcca16	Fixes tests for tolerance_px = 1.0, with no zoom 3	2016-04-14 11:24:10 +02:00
Carla	1a571c8a9c	Set tolerance to 1 pixel	2016-04-14 11:13:59 +02:00
Carla Iriberri	8f44f5347a	Fix indent for code clarity	2016-04-13 17:51:30 +02:00
Carla Iriberri	f96163265b	Fix bug for tables without geom or with no potential overviews If the table doesn't have geometries but the createoverviews function is called, the current geometry type checks won't work because "null" will not give a boolean value in the type comparisons. Also, if the createoverviews function is called over a simple table which would not require overviews according to the strategies, it is handled correctly.	2016-04-13 17:49:38 +02:00
Javier Goizueta	1c67214b09	Use cell centers, not cluster centroids when grouping points Fixes #225	2016-04-13 11:14:09 +02:00
Carla Iriberri	16d08ef52b	Include and aggregate _vovw_count column to count aggregated features	2016-04-12 11:10:55 +02:00
Javier Goizueta	15ac9a2cd9	Remove unneeded code	2016-04-07 10:30:10 +02:00
Javier Goizueta	ee61d46100	💄 rename variable for clarity	2016-04-07 10:24:02 +02:00
Javier Goizueta	49e7094c8a	Make CDB_CreateOverview usable by superuser CDB_CreateOverview had to be executed with the user role corresponding to the owner of the table; now it can be executed by the postgres user.	2016-04-07 07:52:58 +02:00
Javier Goizueta	fb910be12f	Fix conversion of regclass to identifier	2016-04-07 07:07:20 +02:00
Javier Goizueta	34c39662ec	Replace use of CDB_UserTables in CDB_Overviews Use a function that returns regclasses and schema names properly instead.	2016-04-07 00:07:45 +02:00
Javier Goizueta	84cac16d1c	Temporary fix	2016-04-06 22:05:00 +02:00
Javier Goizueta	c1fc07d2ac	Fix typo This function isn't being actively used; should consider removing it or testing it properly	2016-04-06 18:58:37 +02:00
Javier Goizueta	5c3c0f5fc9	Fix bug in CDB_DropOverviews Fixes #223	2016-04-06 18:57:52 +02:00
@@ -0,0 +1 @@
+../scripts-available/CDB_AnalysisCatalog.sql