update NEWS for 1.5.1

1.5.1 release artifact
new code to handle mixed geometries more quickly
2017-05-16 14:33:02 +00:00 · 2017-05-16 14:27:49 +00:00 · 2017-05-10 20:24:21 +00:00 · 2017-04-26 20:08:44 +00:00 · 2017-04-25 18:57:12 +00:00 · 2017-04-25 18:53:12 +00:00
17 changed files with 8533 additions and 1223 deletions
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,45 @@
+1.5.1 (2017-05-16)
+
+__Improvements__
+
+* Much improved performance for `OBS_GetData` when augmenting with several
+  different geometries simultaneously ([#285](https://github.com/CartoDB/observatory-extension/pull/285))
+* Return the automatically assigned normalization type from `OBS_GetMeta`
+  ([#285](https://github.com/CartoDB/observatory-extension/pull/285))
+
+1.5.0 (2017-04-24)
+
+__API Changes__
+
+* Add `suggested_name` to `OBS_GetMeta` responses
+  ([#281](https://github.com/CartoDB/observatory-extension/pull/281))
+* Add `geom_type`, `geom_extra`, and `geom_tags` to
+  `OBS_GetAvailableGeometries`.  This brings it up to spec with existing docs.
+  ([#282](https://github.com/CartoDB/observatory-extension/pull/282))
+* Add `timespan_type`, `timespan_extra`, and `timespan_tags` to
+  `OBS_GetAvailableTimespans` for consistency.
+  ([#282](https://github.com/CartoDB/observatory-extension/pull/282))
+
+1.4.0 (2017-03-21)
+
+__API Changes__
+
+* Allow for override of `target_area` and `target_geoms` in `OBS_GetMeta`
+  ([#276](https://github.com/CartoDB/observatory-extension/pull/276)).  This
+  allows the interface to work with points and sparse areas much better.
+* Allow for override of `max_timespan_rank` and `max_score_rank` on an
+  item-by-item basis for metadata.
+* `numer_description`, `geom_description`, `denom_description`,
+  `numer_t_description`, `denom_t_description` and `geom_t_description` now
+  returned as part of `OBS_GetMeta`.
+
+__Improvements__
+
+* Reduced amount of simplification done on input geometries (from 0.0001 above
+  500 points to 0.00001 above 1000 points).
+* Added tests to confirm that accurate results are returned from automatic
+  boundary selection
+
 1.3.5 (2017-03-15)

 No changes.  Artifact to allow for data update.
--- a/doc/boundary_functions.md
+++ b/doc/boundary_functions.md
@@ -4,7 +4,7 @@ Use the following functions to retrieve [Boundary](https://carto.com/docs/carto-

 You can [access](https://carto.com/docs/carto-engine/data/accessing) boundaries through CARTO Builder. The same methods will work if you are using the CARTO Engine to develop your application. We [encourage you](http://docs/carto-engine/data/accessing/#best-practices) to use table modifying methods (UPDATE and INSERT) over dynamic methods (SELECT).

-## OBS_GetBoundariesByGeometry(polygon geometry, geometry_id text)
+## OBS_GetBoundariesByGeometry(geom geometry, geometry_id text)

 The ```OBS_GetBoundariesByGeometry(geometry, geometry_id)``` method returns a set of boundary geometries that intersect a supplied geometry. This can be used to find all boundaries that are within or overlap a bounding box. You have the ability to choose whether to retrieve all boundaries that intersect your supplied bounding box or only those that fall entirely inside of your bounding box.

@@ -12,7 +12,7 @@ The ```OBS_GetBoundariesByGeometry(geometry, geometry_id)``` method returns a se

 Name |Description
 --- | ---
-polygon | a bounding box or other WGS84 geometry
+geom | a WGS84 geometry
 geometry_id | a string identifier for a boundary geometry
 timespan (optional) | year(s) to request from ('NULL' (default) gives most recent)
 overlap_type (optional) | one of '[intersects](http://postgis.net/docs/manual-2.2/ST_Intersects.html)' (default), '[contains](http://postgis.net/docs/manual-2.2/ST_Contains.html)', or '[within](http://postgis.net/docs/manual-2.2/ST_Within.html)'.
@@ -26,7 +26,7 @@ Column Name | Description
 the_geom | a boundary geometry (e.g., US Census tract boundaries)
 geom_refs | a string identifier for the geometry (e.g., geoids of US Census tracts)

-If geometries are not found for the requested `polygon`, `geometry_id`, `timespan`, or `overlap_type`, then null values are returned.
+If geometries are not found for the requested `geom`, `geometry_id`, `timespan`, or `overlap_type`, then null values are returned.

 #### Example

@@ -44,7 +44,6 @@ FROM OBS_GetBoundariesByGeometry(

 #### Errors

-* If a geometry other than a point is passed as the first argument, an error is thrown: `Invalid geometry type (ST_Polygon), expecting 'ST_Point'`
 * If an `overlap_type` other than the valid ones listed above is entered, then an error is thrown

 ## OBS_GetPointsByGeometry(polygon geometry, geometry_id text)
--- a/doc/discovery_functions.md
+++ b/doc/discovery_functions.md
@@ -327,9 +327,12 @@ timespan_id | Text | The ID of the timespan
 timespan_name | Text | A human readable name for the timespan
 timespan_description | Text | Ignored
 timespan_weight | Numeric | Ignored
+timespan_aggregate | Text | Ignored
 timespan_license | Text | Ignored
 timespan_source | Text | Ignored
-timespan_aggregate | Text | Ignored
+timespan_type | Text | Ignored
+timespan_extra | JSONB | Ignored
+timespan_tags | JSONB | Ignored
 valid_numer | Boolean | True if the `numer_id` argument is a valid numerator for this timespan, False otherwise
 valid_denom | Boolean | True if the `denom_id` argument is a valid denominator for this timespan, False otherwise
 valid_geom | Boolean | True if the `geom_id` argument is a valid geometry for this timespan, False otherwise
--- a/doc/measures_functions.md
+++ b/doc/measures_functions.md
@@ -196,7 +196,7 @@ UPDATE tablename
 SET segmentation = OBS_GetCategory(the_geom, 'us.census.spielman_singleton_segments.X55')
 ```

-## OBS_GetMeta(extent geometry, metadata json, max_timespan_rank, max_boundary_score_rank, num_target_geoms)
+## OBS_GetMeta(extent geometry, metadata json, num_timespan_options, num_score_options, target_geoms)

 The ```OBS_GetMeta(extent, metadata)``` function returns a completed Data
 Observatory metadata JSON Object for use in ```OBS_GetData(geomvals,
@@ -213,9 +213,9 @@ Name | Description
 ---- | -----------
 extent | A geometry of the extent of the input geometries
 metadata | A JSON array composed of metadata input objects.  Each indicates one desired measure for an output column, and optionally additional parameters about that column
-max_timespan_rank | How many historical time periods to include.  Defaults to 1
-max_boundary_score_rank | How many alternative boundary levels to include.  Defaults to 1
-num_target_geoms | Target number of geometries.  Boundaries with close to this many objects within `extent` will be ranked highest. 
+num_timespan_options | How many historical time periods to include.  Defaults to 1
+num_score_options | How many alternative boundary levels to include.  Defaults to 1
+target_geoms | Target number of geometries.  Boundaries with close to this many objects within `extent` will be ranked highest. 

 The schema of the metadata input objects are as follows:

@@ -227,6 +227,10 @@ normalization | The desired normalization.  One of 'area', 'prenormalized', or '
 denom_id | Identifier for a desired normalization column in case `normalization` is 'denominated'.  Will be automatically assigned if necessary.  Ignored if this metadata object specifies a geometry.
 numer_timespan | The desired timespan for the measurement.  Defaults to most recent timespan available if left unspecified.
 geom_timespan | The desired timespan for the geometry.  Defaults to timespan matching numer_timespan if left unspecified.
+target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area.  Unit is square degrees WGS84.  Set this to `0` if you want to use the smallest source geometry for this element of metadata, for example if you're passing in points.
+target_geoms | Override global `target_geoms` for this element of metadata
+max_timespan_rank | Only include timespans of this recency (for example, `1` is only the most recent timespan).  No limit by default.
+max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary).  Defaults to `1`.

 #### Returns

@@ -242,9 +246,12 @@ fail.

 Metadata Output Key | Description
 --- | -----------
+suggested_name | A suggested column name for adding this to an existing table
 numer_id | Identifier for desired measurement
 numer_timespan | Timespan that will be used of the desired measurement
 numer_name | Human-readable name of desired measure
+numer_description | Long human-readable description of the desired measure
+numer_t_description | Further information about the source table
 numer_type | PostgreSQL/PostGIS type of desired measure
 numer_colname | Internal identifier for column name
 numer_tablename | Internal identifier for table
@@ -252,6 +259,8 @@ numer_geomref_colname | Internal identifier for geomref column name
 denom_id | Identifier for desired normalization
 denom_timespan | Timespan that will be used of the desired normalization
 denom_name | Human-readable name of desired measure's normalization
+denom_description | Long human-readable description of the desired measure's normalization
+denom_t_description | Further information about the source table
 denom_type | PostgreSQL/PostGIS type of desired measure's normalization
 denom_colname | Internal identifier for normalization column name
 denom_tablename | Internal identifier for normalization table
@@ -259,12 +268,14 @@ denom_geomref_colname | Internal identifier for normalization geomref column nam
 geom_id | Identifier for desired boundary geometry
 geom_timespan | Timespan that will be used of the desired boundary geometry
 geom_name | Human-readable name of desired boundary geometry
+geom_description | Long human-readable description of the desired boundary geometry
+geom_t_description | Further information about the source table
 geom_type | PostgreSQL/PostGIS type of desired boundary geometry
 geom_colname | Internal identifier for boundary geometry column name
 geom_tablename | Internal identifier for boundary geometry table
 geom_geomref_colname | Internal identifier for boundary geometry ref column name
 timespan_rank | Ranking of this measurement by time, most recent is 1, second most recent 2, etc.
-score | The score of this measurement's boundary compared to the `extent` and `num_target_geoms` passed in.  Between 0 and 100.
+score | The score of this measurement's boundary compared to the `extent` and `target_geoms` passed in.  Between 0 and 100.
 score_rank | The ranking of this measurement's boundary, highest ranked is 1, second is 2, etc.
 numer_aggregate | The aggregate type of the numerator, either `sum`, `average`, `median`, or blank
 denom_aggregate | The aggregate type of the denominator, either `sum`, `average`, `median`, or blank
--- a/release/observatory--1.4.0.sql
+++ b/release/observatory--1.4.0.sql
--- a/release/observatory--1.5.0.sql
+++ b/release/observatory--1.5.0.sql
--- a/release/observatory--1.5.1.sql
+++ b/release/observatory--1.5.1.sql
--- a/release/observatory.control
+++ b/release/observatory.control
@@ -1,5 +1,5 @@
 comment = 'CartoDB Observatory backend extension'
-default_version = '1.3.5'
+default_version = '1.5.1'
 requires = 'postgis'
 superuser = true
 schema = cdb_observatory
--- a/src/pg/observatory.control
+++ b/src/pg/observatory.control
@@ -1,5 +1,5 @@
 comment = 'CartoDB Observatory backend extension'
-default_version = '1.3.5'
+default_version = '1.5.1'
 requires = 'postgis'
 superuser = true
 schema = cdb_observatory
--- a/src/pg/sql/41_observatory_augmentation.sql
+++ b/src/pg/sql/41_observatory_augmentation.sql
@@ -102,8 +102,8 @@ $$ LANGUAGE plpgsql STABLE;
 CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetMeta(
  geom geometry(Geometry, 4326),
  params JSON,
-  max_timespan_rank INTEGER DEFAULT NULL, -- cutoff for timespan ranks when there's ambiguity
-  max_score_rank INTEGER DEFAULT NULL, -- cutoff for geom ranks when there's ambiguity
+  num_timespan_options INTEGER DEFAULT NULL, -- how many timespan options to show
+  num_score_options INTEGER DEFAULT NULL, -- how many score options to show
  target_geoms INTEGER DEFAULT NULL
 )
 RETURNS JSON
@@ -115,20 +115,34 @@ DECLARE
  scores_clause TEXT;
  result JSON;
 BEGIN
-  IF max_timespan_rank IS NULL THEN
-    max_timespan_rank := 1;
+  IF num_timespan_options IS NULL THEN
+    num_timespan_options := 1;
  END IF;
-  IF max_score_rank IS NULL THEN
-    max_score_rank := 1;
+  IF num_score_options IS NULL THEN
+    num_score_options := 1;
  END IF;

  numer_filters := (SELECT Array_Agg(val) FILTER (WHERE val IS NOT NULL) FROM (SELECT (JSON_Array_Elements(params))->>'numer_id' val) foo);
  geom_filters := (SELECT Array_Agg(val) FILTER (WHERE val IS NOT NULL) FROM (SELECT (JSON_Array_Elements(params))->>'geom_id' val) bar);
  meta_filter_clause := '(m.numer_id = ANY ($6) OR m.geom_id = ANY ($7))';

-  scores_clause := 'SELECT *
-                    FROM cdb_observatory._OBS_GetGeometryScores($1,
-                    (SELECT Array_Agg(geom_id) FROM meta), $2) scores ';
+  scores_clause := ' agg_geoms AS (
+    SELECT target_geoms, target_area, ARRAY_AGG(geom_id) geom_ids
+    FROM meta
+    GROUP BY target_geoms, target_area
+  ), scores AS (
+    SELECT target_geoms, target_area,
+      CASE target_area
+      -- point-specific, just order by numgeoms instead of score
+      WHEN 0 THEN scores.numgeoms
+      -- has some area, use proper scoring
+      ELSE scores.score
+      END AS score,
+           scores.numgeoms, scores.table_id, scores.column_id
+    FROM agg_geoms,
+         LATERAL cdb_observatory._OBS_GetGeometryScores($1,
+            geom_ids, COALESCE(target_geoms, $2), target_area) scores
+  ) ';

  IF JSON_Array_Length(params) = 1 THEN
    IF numer_filters IS NULL AND geom_filters IS NOT NULL THEN
@@ -142,21 +156,22 @@ BEGIN
    END IF;

    IF geom_filters IS NOT NULL AND numer_filters IS NOT NULL THEN
-      scores_clause := 'SELECT 1 score, null, geom_tid table_id, geom_id column_id,
-                               null, null, null, null, null, null
-                        FROM meta ';
+      scores_clause := 'scores AS (
+        SELECT NULL::INTEGER target_geoms, NULL::Numeric target_area,
+        1 score, null, geom_tid table_id, geom_id column_id,
+        NULL::Integer numgeoms
+        FROM meta) ';
    END IF;
  END IF;

  EXECUTE format($string$
    WITH _filters AS (SELECT
-        generate_series(1, array_length($3, 1)) id,
-        (unnest($3))->>'numer_id' numer_id,
-        (unnest($3))->>'denom_id' denom_id,
-        (unnest($3))->>'geom_id' geom_id,
-        (unnest($3))->>'numer_timespan' numer_timespan,
-        (unnest($3))->>'geom_timespan' geom_timespan,
-        (unnest($3))->>'normalization' normalization
+        row_number() over () id, *
+        FROM json_to_recordset($3)
+        AS x(numer_id TEXT, denom_id TEXT, geom_id TEXT, numer_timespan TEXT,
+          geom_timespan TEXT, normalization TEXT, max_timespan_rank TEXT,
+          max_score_rank TEXT, target_geoms INTEGER, target_area Numeric
+        )
    ), meta AS (SELECT
        id,
        f.numer_id,
@@ -166,6 +181,8 @@ BEGIN
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_tablename END numer_tablename,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_type END numer_type,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_name END numer_name,
+        CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_description END numer_description,
+        CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_t_description END numer_t_description,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE m.numer_timespan END numer_timespan,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE m.denom_id END denom_id,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_aggregate END denom_aggregate,
@@ -173,6 +190,8 @@ BEGIN
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_geomref_colname END denom_geomref_colname,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_tablename END denom_tablename,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_name END denom_name,
+        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_description END denom_description,
+        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_t_description END denom_t_description,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_type END denom_type,
        CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_reltype END denom_reltype,
        m.geom_id,
@@ -182,8 +201,24 @@ BEGIN
        geom_geomref_colname,
        geom_tablename,
        geom_name,
+        geom_description,
+        geom_t_description,
        geom_type,
-        normalization
+        Coalesce(normalization,
+          -- automatically assign normalization to numeric numerators
+          CASE WHEN cdb_observatory.isnumeric(numer_type) THEN
+            CASE WHEN denom_reltype ILIKE 'denominator' THEN 'denominated'
+                 WHEN numer_aggregate ILIKE 'sum' THEN 'area'
+                 WHEN numer_aggregate IN ('median', 'average') AND denom_reltype ILIKE 'universe'
+                  THEN 'prenormalized'
+                ELSE 'prenormalized'
+            END ELSE NULL
+          END
+        ) normalization,
+        max_timespan_rank,
+        max_score_rank,
+        target_geoms,
+        target_area
      FROM observatory.obs_meta m JOIN _filters f
      ON CASE WHEN f.numer_id IS NULL THEN m.geom_id ELSE m.numer_id END =
         CASE WHEN f.numer_id IS NULL THEN f.geom_id ELSE f.numer_id END
@@ -194,9 +229,8 @@ BEGIN
        AND (m.geom_id = f.geom_id OR COALESCE(f.geom_id, '') = '')
        AND (m.geom_timespan = f.geom_timespan OR COALESCE(f.geom_timespan, '') = '')
        AND (m.numer_timespan = f.numer_timespan OR COALESCE(f.numer_timespan, '') = '')
-    ), scores AS (
-        %s
-    ), groups AS (SELECT
+    ), %s
+    , groups AS (SELECT
        id,
        scores.score,
        numer_timespan,
@@ -207,45 +241,68 @@ BEGIN
          'numer_id', numer_id,
          'timespan_rank', dense_rank() OVER (PARTITION BY id ORDER BY numer_timespan DESC),
          'score_rank', dense_rank() OVER (PARTITION BY id ORDER BY score DESC),
+          'timespan_rownum', row_number() over
+            (PARTITION BY id, score ORDER BY numer_timespan DESC, Coalesce(denom_id, '')),
+          'score_rownum', row_number() over
+            (PARTITION BY id, numer_timespan ORDER BY score DESC, Coalesce(denom_id, '')),
          'score', scores.score,
+          'suggested_name', cdb_observatory.FIRST(
+            LOWER(TRIM(BOTH '_' FROM regexp_replace(CASE WHEN numer_id IS NOT NULL
+              THEN CASE
+                WHEN normalization ILIKE 'area%%' THEN numer_colname || ' per sq km'
+                WHEN normalization ILIKE 'denom%%' THEN numer_colname || ' rate'
+                ELSE numer_colname
+              END || ' ' || numer_timespan
+              ELSE geom_name || ' ' || geom_timespan
+            END, '[^a-zA-Z0-9]+', '_', 'g')))
+          ),
          'numer_aggregate', cdb_observatory.FIRST(meta.numer_aggregate),
          'numer_colname', cdb_observatory.FIRST(meta.numer_colname),
          'numer_geomref_colname', cdb_observatory.FIRST(meta.numer_geomref_colname),
          'numer_tablename', cdb_observatory.FIRST(meta.numer_tablename),
          'numer_type', cdb_observatory.FIRST(meta.numer_type),
-          --'numer_description', cdb_observatory.FIRST(meta.numer_description),
-          --'numer_t_description', cdb_observatory.FIRST(meta.numer_t_description),
+          'numer_description', cdb_observatory.FIRST(meta.numer_description),
+          'numer_t_description', cdb_observatory.FIRST(meta.numer_t_description),
          'denom_aggregate', cdb_observatory.FIRST(meta.denom_aggregate),
          'denom_colname', cdb_observatory.FIRST(denom_colname),
          'denom_geomref_colname', cdb_observatory.FIRST(denom_geomref_colname),
          'denom_tablename', cdb_observatory.FIRST(denom_tablename),
          'denom_type', cdb_observatory.FIRST(meta.denom_type),
          'denom_reltype', cdb_observatory.FIRST(meta.denom_reltype),
-          --'denom_description', cdb_observatory.FIRST(meta.denom_description),
-          --'denom_t_description', cdb_observatory.FIRST(meta.denom_t_description),
+          'denom_description', cdb_observatory.FIRST(meta.denom_description),
+          'denom_t_description', cdb_observatory.FIRST(meta.denom_t_description),
          'geom_colname', cdb_observatory.FIRST(geom_colname),
          'geom_geomref_colname', cdb_observatory.FIRST(geom_geomref_colname),
          'geom_tablename', cdb_observatory.FIRST(geom_tablename),
          'geom_type', cdb_observatory.FIRST(meta.geom_type),
          'geom_timespan', cdb_observatory.FIRST(meta.geom_timespan),
-          --'geom_description', cdb_observatory.FIRST(meta.geom_description),
-          --'geom_t_description', cdb_observatory.FIRST(meta.geom_t_description),
+          'geom_description', cdb_observatory.FIRST(meta.geom_description),
+          'geom_t_description', cdb_observatory.FIRST(meta.geom_t_description),
          'numer_timespan', cdb_observatory.FIRST(numer_timespan),
          'numer_name', cdb_observatory.FIRST(numer_name),
          'denom_name', cdb_observatory.FIRST(denom_name),
          'geom_name', cdb_observatory.FIRST(geom_name),
          'normalization', cdb_observatory.FIRST(normalization),
+          'max_timespan_rank', cdb_observatory.FIRST(max_timespan_rank),
+          'max_score_rank', cdb_observatory.FIRST(max_score_rank),
+          'target_geoms', cdb_observatory.FIRST(scores.target_geoms),
+          'target_area', cdb_observatory.FIRST(scores.target_area),
+          'num_geoms', cdb_observatory.FIRST(scores.numgeoms),
          'denom_id', denom_id,
          'geom_id', meta.geom_id
        ) metadata
      FROM meta, scores
      WHERE meta.geom_id = scores.column_id
        AND meta.geom_tid = scores.table_id
+        AND COALESCE(meta.target_geoms, 0) = COALESCE(scores.target_geoms, 0)
+        AND COALESCE(meta.target_area, 0) = COALESCE(scores.target_area, 0)
      GROUP BY id, score, numer_id, denom_id, geom_id, numer_timespan
    ) SELECT JSON_AGG(metadata ORDER BY id)
      FROM groups
-      WHERE timespan_rank <= $4
-        AND score_rank <= $5
+      WHERE timespan_rank <= Coalesce((metadata->>'max_timespan_rank')::INTEGER, 'infinity'::FLOAT)
+        AND score_rank <= Coalesce((metadata->>'max_score_rank')::INTEGER, 1)
+        AND (metadata->>'timespan_rownum')::INTEGER <= $4
+        AND (metadata->>'score_rownum')::INTEGER <= $5
  $string$, meta_filter_clause, scores_clause)
  INTO result
  USING
@@ -254,9 +311,9 @@ BEGIN
         ELSE geom
    END,
    target_geoms,
-    (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]),
-    max_timespan_rank,
-    max_score_rank, numer_filters, geom_filters
+    params,
+    num_timespan_options,
+    num_score_options, numer_filters, geom_filters
    ;
  RETURN result;
 END;
@@ -536,14 +593,9 @@ RETURNS TABLE (
 )
 AS $$
 DECLARE
-  geom_colspecs TEXT;
-  geom_tables TEXT;
-  geomrefs_alias TEXT;
-  geomrefs_noalias TEXT;
-  data_colspecs TEXT;
-  data_tables TEXT;
-  obs_wheres TEXT;
-  user_wheres TEXT;
+  procgeom_clauses TEXT;
+  val_clauses TEXT;
+  json_clause TEXT;
  geomtype TEXT;
 BEGIN
    IF params IS NULL OR JSON_ARRAY_LENGTH(params) = 0 OR ARRAY_LENGTH(geomvals, 1) IS NULL THEN
@@ -553,250 +605,233 @@ BEGIN

    geomtype := ST_GeometryType(geomvals[1].geom);

-    EXECUTE
-      $query$
-        WITH _meta AS (SELECT
-          row_number() over () colid,
-          meta->>'id' id,
-          meta->>'numer_id' numer_id,
-          meta->>'numer_aggregate' numer_aggregate,
-          meta->>'numer_colname' numer_colname,
-          meta->>'numer_geomref_colname' numer_geomref_colname,
-          meta->>'numer_tablename' numer_tablename,
-          meta->>'numer_type' numer_type,
-          meta->>'denom_id' denom_id,
-          meta->>'denom_aggregate' denom_aggregate,
-          meta->>'denom_colname' denom_colname,
-          meta->>'denom_geomref_colname' denom_geomref_colname,
-          meta->>'denom_tablename' denom_tablename,
-          meta->>'denom_type' denom_type,
-          meta->>'denom_reltype' denom_reltype,
-          meta->>'geom_id' geom_id,
-          meta->>'geom_colname' geom_colname,
-          meta->>'geom_geomref_colname' geom_geomref_colname,
-          meta->>'geom_tablename' geom_tablename,
-          meta->>'geom_type' geom_type,
-          meta->>'numer_timespan' numer_timespan,
-          meta->>'geom_timespan' geom_timespan,
-          meta->>'normalization' normalization,
-          meta->>'api_method' api_method,
-          meta->'api_args' api_args
-          FROM UNNEST($1) AS meta
-        )
+    /* Read metadata to generate clauses for query */
+    EXECUTE $query$
+      WITH _meta AS (SELECT
+        row_number() over () colid, *
+        FROM json_to_recordset($1)
+        AS x(id TEXT, numer_id TEXT, numer_aggregate TEXT, numer_colname TEXT,
+             numer_geomref_colname TEXT, numer_tablename TEXT, numer_type TEXT,
+             denom_id TEXT, denom_aggregate TEXT, denom_colname TEXT,
+             denom_geomref_colname TEXT, denom_tablename TEXT, denom_type TEXT,
+             denom_reltype TEXT, geom_id TEXT, geom_colname TEXT,
+             geom_geomref_colname TEXT, geom_tablename TEXT, geom_type TEXT,
+             numer_timespan TEXT, geom_timespan TEXT, normalization TEXT,
+             api_method TEXT, api_args JSON)
+      ),
+
+      -- Generate procgeom clauses.
+      -- These join the users' geoms to the relevant geometries for the
+      -- asked-for measures in the Observatory.
+      _procgeom_clauses AS (
        SELECT
-        String_Agg(DISTINCT
-          CASE
-          -- pass-through geom if user is requesting it only
-          WHEN numer_id IS NULL AND api_method IS NULL THEN
-            geom_tablename || '.' || geom_colname || ' AS geom_' || geom_tablename
-          WHEN cdb_observatory.isnumeric(numer_type) AND api_method IS NULL THEN
-            -- for numeric points with area normalization, include areas of underlying geoms
-            CASE
-            WHEN $2 = 'ST_Point' AND (LOWER(normalization) LIKE 'area%' OR
-               (normalization IS NULL AND numer_aggregate ILIKE 'sum')) THEN
-              ' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
-              ' AS area_' || geom_tablename
-            -- for numeric areas, include more complex calcs
-            WHEN $2 != 'ST_Point' THEN
-            'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
-            '     THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)' ||
-            '     WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
-            '     THEN 1 ' ||
-            '     ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' ||
-                               geom_tablename || '.' || geom_colname || ')) / ' ||
-                           'Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0) ' ||
-            'END pct_' || geom_tablename
-            ELSE NULL
-            END
-          ELSE NULL END
-        , ', ') AS geom_colspecs,
-        String_Agg(DISTINCT 'observatory.' || geom_tablename, ', ') AS geom_tables,
-        String_Agg(
-        'JSON_Build_Object(' || CASE
-          -- api-delivered values
-          WHEN api_method IS NOT NULL THEN
-          '''value'', ' ||
-            'ARRAY_AGG( ' ||
-              api_method || '.' || numer_colname || ')::' || numer_type || '[]'
-          -- numeric internal values
-          WHEN cdb_observatory.isnumeric(numer_type) THEN
-          '''value'', ' || CASE
-          -- denominated
-          WHEN LOWER(normalization) LIKE 'denom%' OR
-               (normalization IS NULL AND LOWER(denom_reltype) LIKE 'denominator')
-            THEN CASE
-            -- denominated point-in-poly
-            WHEN $2 = 'ST_Point' THEN
-            ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
-            '      / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
-            -- denominated polygon interpolation
-            -- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
-            ELSE
-            ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
-            ' * pct_' || geom_tablename ||
-            ' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
-            '            * pct_' || geom_tablename || '), 0) ' ||
-            ' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
-            END
-          -- areaNormalized
-          WHEN LOWER(normalization) LIKE 'area%' OR
-              (normalization IS NULL AND numer_aggregate ILIKE 'sum')
-            THEN CASE
-            -- areaNormalized point-in-poly
-            WHEN $2 = 'ST_Point' THEN
-            ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
-            '      / area_' || geom_tablename || ')'
-            -- areaNormalized polygon interpolation
-            -- SUM (numer * (% OBS geom in user geom)) / area of big geom
-            ELSE
-            --' NULL END '
-            ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
-            ' * pct_' || geom_tablename ||
-            ' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) ' ||
-            ' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
-            END
-          -- median/average measures with universe
-          WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
-              denom_reltype ILIKE 'universe' AND
-              (normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
-            THEN CASE
-            -- predenominated point-in-poly
-            WHEN $2 = 'ST_Point' THEN
-            ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
-            ELSE
-            -- predenominated polygon interpolation weighted by universe
-            -- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
-            --     (10 * 1000 * 1) / (1000 * 1) = 10
-            --     (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 10000 = 10.5
-            ' SUM(' || numer_tablename || '.' || numer_colname ||
-            ' * ' || denom_tablename || '.' || denom_colname ||
-            ' * pct_' || geom_tablename ||
-            ' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
-            ' * pct_' || geom_tablename || '), 0) ' ||
-            ' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
-            END
-          -- prenormalized for summable measures. point or summable only!
-          WHEN numer_aggregate ILIKE 'sum' AND
-              (normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
-            THEN CASE
-            -- predenominated point-in-poly
-            WHEN $2 = 'ST_Point' THEN
-            ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
-            ELSE
-            -- predenominated polygon interpolation
-            -- SUM (numer * (% user geom in OBS geom))
-            ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
-            ' * pct_' || geom_tablename ||
-            ' ) / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
-            END
-          -- Everything else. Point only!
-          ELSE CASE
-            WHEN $2 = 'ST_Point' THEN
-            ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
-            ELSE
-            ' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
-                numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
-            END
-          END || '::' || numer_type
-
-          -- categorical/text
-        WHEN LOWER(numer_type) LIKE 'text' THEN
-          '''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
-
-          -- geometry
-        WHEN numer_id IS NULL THEN
-          '''geomref'', geomref_' || geom_tablename || ', ' ||
-          '''value'', ' || 'cdb_observatory.FIRST(geom_' || geom_tablename ||
-              ')::TEXT'
-          -- code below will return the intersection of the user's geom and the
-          -- OBS geom
-          --'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
-          --    '.' || geom_colname || '))::TEXT'
-        ELSE ''
-        END || ')', ', ')
-        AS colspecs,
-
-        -- geomrefs, used to separate out rows in case we don't want to merge
-        -- results by user input IDs
-        --
-        -- api_method and geom_tablename are interchangeable since when an
-        -- api_method is passed, geom_tablename is ignored
-        String_Agg(DISTINCT COALESCE(geom_tablename, api_method) || '.' || geom_geomref_colname ||
-          ' AS geomref_' || COALESCE(geom_tablename, api_method), ', ') AS geomrefs_alias,
-
-        String_Agg(DISTINCT 'geomref_' || COALESCE(geom_tablename, api_method)
-          , ', ') AS geomrefs_noalias,
-
-          (SELECT String_Agg(DISTINCT CASE
-              -- External API
-              WHEN tablename LIKE 'cdb_observatory.%' THEN
-                'LATERAL (SELECT * FROM ' || tablename || ') ' ||
-                  REPLACE(split_part(tablename, '(', 1), 'cdb_observatory.', '')
-              -- Internal obs_ table
-              ELSE 'observatory.' || tablename
-            END, ', ') FROM (
-            SELECT DISTINCT UNNEST(tablenames_ary) tablename FROM (
-            SELECT ARRAY_AGG(numer_tablename) ||
-                ARRAY_AGG(denom_tablename) ||
-                ARRAY_AGG('cdb_observatory.' || api_method || '(_procgeoms.geom' || COALESCE(', ' ||
-                      (SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
-                       FROM (SELECT json_array_elements(api_args) as val) as vals),
-                    '') || ')')
-              tablenames_ary
-            ) tablenames_inner
-          ) tablenames_outer) data_tables,
-
-          String_Agg(DISTINCT array_to_string(ARRAY[
-            CASE WHEN numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL
-                 THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
-                      '_procgeoms.geomref_' || geom_tablename
-                 ELSE NULL END,
-            CASE WHEN numer_tablename != denom_tablename
-                 THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
-                      denom_tablename || '.' || denom_geomref_colname
-                 ELSE NULL END
-            ], ' AND '),
-           ' AND ') FILTER (WHERE numer_tablename != denom_tablename OR
-                            (numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL)) AS obs_wheres,
-
-          String_Agg(DISTINCT 'ST_Intersects(' || geom_tablename || '.' ||  geom_colname
-             || ', _geoms.geom)', ' AND ')
-             AS user_wheres
+          '_procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (' ||
+            CASE WHEN api_method IS NULL THEN
+              'SELECT _geoms.id, ' ||
+                CASE $3 WHEN True THEN '_geoms.geom'
+                        ELSE geom_tablename || '.' || geom_colname
+                END || ' AS geom, ' ||
+                geom_tablename || '.' || geom_geomref_colname || ' AS geomref, ' ||
+                CASE
+                  WHEN $2 = 'ST_Point' THEN
+                    ' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
+                    ' AS area'
+                  -- for numeric areas, include more complex calcs
+                  ELSE
+                  'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')
+                       THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
+                       WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom)
+                       THEN 1
+                       ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) /
+                         Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
+                  END pct_obs'
+                END || '
+              FROM _geoms, observatory.' || geom_tablename || '
+              WHERE ST_Intersects(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')'
+              -- pass through input geometries for api_method
+              ELSE 'SELECT _geoms.id, _geoms.geom FROM _geoms'
+            END ||
+          ') '
+          AS procgeom_clause
        FROM _meta
-        ;
-      $query$
-    INTO geom_colspecs, geom_tables, data_colspecs, geomrefs_alias,
-         geomrefs_noalias, data_tables, obs_wheres, user_wheres
-    USING (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]), geomtype;
+        GROUP BY api_method, geom_tablename, geom_geomref_colname, geom_colname
+      ),

+      -- Generate val clauses.
+      -- These perform interpolations or other necessary calculations to
+      -- provide values according to the user's geometries.
+      _val_clauses AS (
+        SELECT
+          '_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (
+            SELECT _procgeoms.id, ' ||
+              String_Agg('json_build_object(' || CASE
+                -- api-delivered values
+                WHEN api_method IS NOT NULL THEN
+                '''value'', ' ||
+                  'ARRAY_AGG( ' ||
+                    api_method || '.' || numer_colname || ')::' || numer_type || '[]'
+                -- numeric internal values
+                WHEN cdb_observatory.isnumeric(numer_type) THEN
+                '''value'', ' || CASE
+                  -- denominated
+                  WHEN LOWER(normalization) LIKE 'denom%'
+                    THEN CASE
+                    WHEN denom_tablename IS NULL THEN ' NULL '
+                    -- denominated point-in-poly
+                    WHEN $2 = 'ST_Point' THEN
+                    ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
+                    '      / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
+                    -- denominated polygon interpolation
+                    -- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
+                    ELSE
+                    ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
+                    ' * _procgeoms.pct_obs ' ||
+                    ' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
+                    '            * _procgeoms.pct_obs), 0) '
+                    END
+                  -- areaNormalized
+                  WHEN LOWER(normalization) LIKE 'area%'
+                    THEN CASE
+                    -- areaNormalized point-in-poly
+                    WHEN $2 = 'ST_Point' THEN
+                    ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
+                    '      / _procgeoms.area)'
+                    -- areaNormalized polygon interpolation
+                    -- SUM (numer * (% OBS geom in user geom)) / area of big geom
+                    ELSE
+                    --' NULL END '
+                    ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
+                    ' * _procgeoms.pct_obs' ||
+                    ' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) '
+                    END
+                  -- median/average measures with universe
+                  WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
+                      denom_reltype ILIKE 'universe' AND LOWER(normalization) LIKE 'pre%'
+                    THEN CASE
+                    -- predenominated point-in-poly
+                    WHEN $2 = 'ST_Point' THEN
+                    ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
+                    ELSE
+                    -- predenominated polygon interpolation weighted by universe
+                    -- SUM (numer * denom * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
+                    --     (10 * 1000 * 1) / (1000 * 1) = 10
+                    --     (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 1010 = ~10.4
+                    ' SUM(' || numer_tablename || '.' || numer_colname ||
+                    ' * ' || denom_tablename || '.' || denom_colname ||
+                    ' * _procgeoms.pct_obs ' ||
+                    ' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
+                    ' * _procgeoms.pct_obs ' || '), 0) '
+                    END
+                  -- prenormalized for summable measures. point or summable only!
+                  WHEN numer_aggregate ILIKE 'sum' AND LOWER(normalization) LIKE 'pre%'
+                    THEN CASE
+                    -- predenominated point-in-poly
+                    WHEN $2 = 'ST_Point' THEN
+                    ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
+                    ELSE
+                    -- predenominated polygon interpolation
+                    -- SUM (numer * (% OBS geom in user geom))
+                    ' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
+                    ' * _procgeoms.pct_obs) '
+                    END
+                  -- Everything else. Point only!
+                  ELSE CASE
+                    WHEN $2 = 'ST_Point' THEN
+                    ' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
+                    ELSE
+                    ' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
+                        numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
+                    END
+                  END || '::' || numer_type
+
+                -- categorical/text
+                WHEN LOWER(numer_type) LIKE 'text' THEN
+                  '''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
+                  -- geometry
+                WHEN numer_id IS NULL THEN
+                  '''geomref'', _procgeoms.geomref, ' ||
+                  '''value'', ' || 'cdb_observatory.FIRST(_procgeoms.geom)::TEXT'
+                  -- code below will return the intersection of the user's geom and the
+                  -- OBS geom
+                  --'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
+                  --    '.' || geom_colname || '))::TEXT'
+                ELSE ''
+                END
+              || ') val_' || colid, ', ')
+            || '
+            FROM _procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' _procgeoms ' ||
+              Coalesce(', ' || String_Agg(DISTINCT
+                  Coalesce('observatory.' || numer_tablename,
+                    'LATERAL (SELECT * FROM cdb_observatory.' || api_method || '(_procgeoms.geom' || Coalesce(', ' ||
+                        (SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
+                          FROM (SELECT JSON_Array_Elements(api_args) as val) as vals),
+                        '') || ')) AS ' || api_method)
+              , ', '), '') ||
+            Coalesce(' WHERE ' || String_Agg(DISTINCT
+              '_procgeoms.geomref = ' || numer_tablename || '.' || numer_geomref_colname, ' AND '
+            ), '') ||
+            CASE $3 WHEN True THEN E'\n GROUP BY _procgeoms.id ORDER BY _procgeoms.id '
+                    ELSE           E'\n GROUP BY _procgeoms.id, _procgeoms.geomref
+                                        ORDER BY _procgeoms.id, _procgeoms.geomref' END
+          || ')'
+          AS val_clause,
+          '_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) AS cte_name
+        FROM _meta
+        GROUP BY geom_tablename, geom_geomref_colname, geom_colname, api_method
+      ),
+
+      -- Generate clauses necessary to join together val_clauses
+      _val_joins AS (
+        SELECT String_Agg(a.cte_name || '.id = ' || b.cte_name || '.id ', ' AND ') val_joins
+        FROM _val_clauses a, _val_clauses b
+        WHERE a.cte_name != b.cte_name
+          AND a.cte_name < b.cte_name
+      ),
+
+      -- Generate JSON clause.  This puts together vals from val_clauses
+      _json_clause AS (SELECT
+        'SELECT ' || cdb_observatory.FIRST(cte_name) || '.id::INT,
+           Array_to_JSON(ARRAY[' || (SELECT String_Agg('val_' || colid, ', ') FROM _meta) || '])
+         FROM ' || String_Agg(cte_name, ', ') ||
+        Coalesce(' WHERE ' || val_joins, '')
+        AS json_clause
+        FROM _val_clauses, _val_joins
+        GROUP BY val_joins
+      )
+
+      SELECT (SELECT String_Agg(procgeom_clause, E',\n     ') FROM _procgeom_clauses),
+             (SELECT String_Agg(val_clause, E',\n     ') FROM _val_clauses),
+             json_clause
+      FROM _json_clause
+    $query$ INTO
+      procgeom_clauses,
+      val_clauses,
+      json_clause
+    USING params, geomtype, merge;
+
+    /* Execute query */
    RETURN QUERY EXECUTE format($query$
      WITH _raw_geoms AS (%s),
      _geoms AS (SELECT id,
-        CASE WHEN (ST_NPoints(geom) > 500)
-               THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.0001)), 3)
+        CASE WHEN (ST_NPoints(geom) > 1000)
+               THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.00001)), 3)
             ELSE geom END geom
        FROM _raw_geoms),
-      _procgeoms AS (SELECT _geoms.id, _geoms.geom %s %s
-        FROM _geoms %s
-        %s
-      )
-      SELECT _procgeoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
-      FROM _procgeoms %s
-           %s
-      GROUP BY _procgeoms.id %s
-      ORDER BY _procgeoms.id
-    $query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1 THEN
-               ' SELECT $1[1].val as id, $1[1].geom as geom '
-             ELSE
-               ' SELECT val as id, geom FROM UNNEST($1) '
+      -- procgeom_clauses
+      %s,
+
+      -- val_clauses
+      %s
+
+      -- json_clause
+      %s
+    $query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1
+               THEN ' SELECT $1[1].val as id, $1[1].geom as geom '
+               ELSE ' SELECT val as id, geom FROM UNNEST($1) '
             END,
-             ', ' || NullIf(geomrefs_alias, ''),
-             ', ' || NullIf(geom_colspecs, ''),
-             ', ' || NullIf(geom_tables, ''),
-             'WHERE ' || NullIf( user_wheres, ''),
-              data_colspecs, ', ' || NullIf(data_tables, ''),
-             'WHERE ' || NULLIF(obs_wheres, ''),
-             CASE WHEN merge IS False THEN ', ' || geomrefs_noalias ELSE '' END)
+             String_Agg(procgeom_clauses, E',\n       '),
+             String_Agg(val_clauses, E',\n       '),
+             json_clause)
    USING geomvals;
    RETURN;
 END;
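
The universe-weighted branch for median and average measures can be checked by hand. Below is a minimal sketch in plain SQL, assuming two hypothetical OBS geometries that lie wholly inside one user polygon, so `pct_obs` is 1 for both. The `overlaps` CTE and its values are illustrative only and are not part of the extension.

```sql
-- Hypothetical rows: one per OBS geometry intersecting a single user polygon.
--   numer   = the median/average measure reported for that OBS geometry
--   denom   = the universe (for example, population) the measure applies to
--   pct_obs = share of the OBS geometry covered by the user polygon
WITH overlaps (numer, denom, pct_obs) AS (
  VALUES (10.0, 1000.0, 1.0),
         (50.0,   10.0, 1.0)
)
SELECT SUM(numer * denom * pct_obs)
       / NULLIF(SUM(denom * pct_obs), 0) AS weighted_avg
FROM overlaps;
```

The numerator is 10 * 1000 + 50 * 10 = 10500 and the denominator is 1000 + 10 = 1010, so the result is roughly 10.4. The geometry with the larger universe dominates, which is the intent of weighting by `denom`.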
--- a/src/pg/sql/42_observatory_exploration.sql
+++ b/src/pg/sql/42_observatory_exploration.sql
@@ -252,6 +252,9 @@ CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableGeometries(
  geom_aggregate TEXT,
  geom_license TEXT,
  geom_source TEXT,
+  geom_type TEXT,
+  geom_extra JSONB,
+  geom_tags JSONB,
  valid_numer BOOLEAN,
  valid_denom BOOLEAN,
  valid_timespan BOOLEAN,
@@ -286,6 +289,9 @@ BEGIN
             NULL::TEXT geom_aggregate,
             NULL::TEXT license,
             NULL::TEXT source,
+             geom_type::TEXT,
+             geom_extra::JSONB,
+             geom_tags::JSONB,
      $1 = ANY(numers) valid_numer,
      $2 = ANY(denoms) valid_denom,
      $3 = ANY(timespans) valid_timespan
@@ -319,6 +325,9 @@ CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableTimespans(
  timespan_aggregate TEXT,
  timespan_license TEXT,
  timespan_source TEXT,
+  timespan_type TEXT,
+  timespan_extra JSONB,
+  timespan_tags JSONB,
  valid_numer BOOLEAN,
  valid_denom BOOLEAN,
  valid_geom BOOLEAN
@@ -343,8 +352,11 @@ BEGIN
           timespan_description::TEXT,
           timespan_weight::NUMERIC,
           NULL::TEXT timespan_aggregate,
-           NULL::TEXT license,
-           NULL::TEXT source,
+           NULL::TEXT timespan_license,
+           NULL::TEXT timespan_source,
+           NULL::TEXT timespan_type,
+           NULL::JSONB timespan_extra,
+           NULL::JSONB timespan_tags,
    $1 = ANY(numers) valid_numer,
    $2 = ANY(denoms) valid_denom,
    $3 = ANY(geoms) valid_geom_id
@@ -418,7 +430,8 @@ $$ LANGUAGE plpgsql;
 CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetGeometryScores(
  bounds Geometry(Geometry, 4326) DEFAULT NULL,
  filter_geom_ids TEXT[] DEFAULT NULL,
-  desired_num_geoms INTEGER DEFAULT NULL
+  desired_num_geoms INTEGER DEFAULT NULL,
+  desired_area NUMERIC DEFAULT NULL
 ) RETURNS TABLE (
  score NUMERIC,
  numtiles BIGINT,
@@ -430,6 +443,8 @@ CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetGeometryScores(
  estnumgeoms NUMERIC,
  meanmediansize NUMERIC
 ) AS $$
+DECLARE
+  num_geoms_multiplier Numeric;
 BEGIN
  IF desired_num_geoms IS NULL THEN
    desired_num_geoms := 3000;
@@ -440,6 +455,18 @@ BEGIN
  IF ST_Npoints(bounds) > 10000 THEN
    bounds := ST_Envelope(bounds);
  END IF;
+  IF desired_area IS NULL THEN
+    desired_area := ST_Area(bounds);
+  END IF;
+
+  -- In case of points, desired_area will be 0.  We still want an accurate
+  -- estimate of numgeoms in that case.
+  IF desired_area = 0 THEN
+    num_geoms_multiplier := 1;
+  ELSE
+    num_geoms_multiplier := Coalesce(desired_area / Nullif(ST_Area(bounds), 0), 1);
+  END IF;
+
  RETURN QUERY
  EXECUTE $string$
    WITH clipped_geom AS (
@@ -453,13 +480,11 @@ BEGIN
    ), clipped_geom_countagg AS (
      SELECT column_id, table_id
        , BOOL_AND(ST_BandIsNoData(clipped_tile, 1)) nodata
-        , ST_CountAgg(clipped_tile, 1, False)::Numeric pixels -- -10
      FROM clipped_geom
      GROUP BY column_id, table_id
    ), clipped_geom_reagg AS (
      SELECT COUNT(*)::BIGINT cnt, a.column_id, a.table_id,
             cdb_observatory.FIRST(nodata) first_nodata,
-             cdb_observatory.FIRST(pixels) first_pixel,
             cdb_observatory.FIRST(tile) first_tile,
             (ST_SummaryStatsAgg(clipped_tile, 1, False)).sum::Numeric sum_geoms, -- ND
             (ST_SummaryStatsAgg(clipped_tile, 2, False)).mean::Numeric / 255 mean_fill --ND
@@ -474,9 +499,8 @@ BEGIN
        , (CASE WHEN first_nodata IS FALSE
                THEN sum_geoms
                ELSE COALESCE(ST_Value(first_tile, 1, ST_PointOnSurface($1)), 0)
-                  * (ST_Area($1) / ST_Area(ST_PixelAsPolygon(first_tile, 0, 0))
-                    * first_pixel) -- -20
-          END)::Numeric
+                  * (ST_Area($1) / ST_Area(ST_PixelAsPolygon(first_tile, 0, 0)))
+          END)::Numeric * $4
        AS numgeoms
        , (CASE WHEN first_nodata IS FALSE
                THEN mean_fill
@@ -490,7 +514,7 @@ BEGIN
      ((100.0 / (1+abs(log(0.0001 + $3) - log(0.0001 + numgeoms::Numeric)))) * percentfill)::Numeric
      AS score, *
      FROM final
-  $string$ USING bounds, filter_geom_ids, desired_num_geoms;
+  $string$ USING bounds, filter_geom_ids, desired_num_geoms, num_geoms_multiplier;
  RETURN;
 END
 $$ LANGUAGE plpgsql IMMUTABLE;
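
The `num_geoms_multiplier` logic added to `_OBS_GetGeometryScores` can be exercised without PostGIS by substituting plain numbers for the `ST_Area` calls. The sketch below assumes three hypothetical inputs; the `cases` CTE, its labels, and its areas are made up for illustration.

```sql
-- bounds_area stands in for ST_Area(bounds); all values are hypothetical.
WITH cases (label, desired_area, bounds_area) AS (
  VALUES ('points',           0.0, 0.0),
         ('sparse polygons',  0.5, 2.0),
         ('zero-area bounds', 0.5, 0.0)
)
SELECT label,
       CASE WHEN desired_area = 0 THEN 1
            ELSE Coalesce(desired_area / Nullif(bounds_area, 0), 1)
       END AS num_geoms_multiplier
FROM cases;
```

Points keep a multiplier of 1, so the estimate of `numgeoms` is left unscaled. Polygons that cover a quarter of their bounding area scale it by 0.25. The `Nullif`/`Coalesce` pair falls back to 1 instead of dividing by zero when the bounds have no area.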
--- a/src/pg/test/expected/41_observatory_augmentation_test.out
+++ b/src/pg/test/expected/41_observatory_augmentation_test.out
@@ -150,6 +150,18 @@ t|t|t|t|t|t|t|t|t|t|t|t|t|t|t
 obs_getmeta_conflicting_metadata
 t
 (1 row)
+obs_getmeta_suggested_name
+t
+(1 row)
+obs_getmeta_suggested_name_implicit_area
+t
+(1 row)
+obs_getmeta_suggested_name_area
+t
+(1 row)
+obs_getmeta_suggested_name_denom
+t
+(1 row)
 obs_getdata_geomval_empty_null
 t
 (1 row)
@@ -261,3 +273,31 @@ t|t
 ary_type|obs_getdata_api_geomrefs_args_string_return
 t|t
 (1 row)
+setseed
+
+(1 row)
+bg_sample|bg_max_error|bg_avg_error|bg_min_error
+1|t|t|t
+2|t|t|t
+3|t|t|t
+5|t|t|t
+10|t|t|t
+25|t|t|t
+50|t|t|t
+100|t|t|t
+2085|t|t|t
+(9 rows)
+tract_sample|tract_max_error|tract_avg_error|tract_min_error
+1|t|t|t
+2|t|t|t
+3|t|t|t
+5|t|t|t
+10|t|t|t
+25|t|t|t
+50|t|t|t
+100|t|t|t
+761|t|t|t
+(9 rows)
+no_bg_point_error
+t
+(1 row)
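
The `bg_sample` and `tract_sample` labels above come from a modulo filter in the corresponding test query, `cartodb_id % (cnt / sample) = 0`. Because `cnt / sample` is integer division, the number of rows kept only approximates the label. A minimal sketch, assuming 2085 sequential ids generated with `generate_series` in place of the real block groups:

```sql
-- Stand-in for the randomly ordered block groups: ids 1..2085.
WITH geoms AS (
  SELECT g::BIGINT AS cartodb_id FROM generate_series(1, 2085) AS g
), samples AS (
  SELECT COUNT(*) cnt,
         unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample
  FROM geoms
)
SELECT sample, COUNT(*) AS rows_kept
FROM geoms, samples
WHERE cartodb_id % (cnt / sample) = 0
GROUP BY sample
ORDER BY sample;
```

For the label 100, `2085 / 100` truncates to 20, so every 20th id is kept and the sample holds 104 rows rather than exactly 100. The final label equals the full count, so its divisor is 1 and every row is kept.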
--- a/src/pg/test/expected/42_observatory_exploration_test.out
+++ b/src/pg/test/expected/42_observatory_exploration_test.out
@@ -120,6 +120,9 @@ t
 _obs_getavailablegeometries_bg_not_1996
 t
 (1 row)
+_obs_getavailablegeometries_has_boundary_tag
+t
+(1 row)
 _obs_getavailabletimespans_2010_2014_in_all
 t
 (1 row)
@@ -159,21 +162,36 @@ t
 _obs_geometryscores_2500km_buffer
 t
 (1 row)
-_obs_geometryscores_numgeoms_500m_buffer
-t
-(1 row)
-_obs_geometryscores_numgeoms_5km_buffer
-t
-(1 row)
-_obs_geometryscores_numgeoms_50km_buffer
-t
-(1 row)
-_obs_geometryscores_numgeoms_500km_buffer
-t
-(1 row)
-_obs_geometryscores_numgeoms_2500km_buffer
-t
-(1 row)
+column_id|_obs_geometryscores_numgeoms_500m_buffer
+us.census.tiger.block_group|2
+us.census.tiger.census_tract|1
+us.census.tiger.zcta5|0
+us.census.tiger.county|0
+(4 rows)
+column_id|_obs_geometryscores_numgeoms_5km_buffer
+us.census.tiger.block_group|244
+us.census.tiger.census_tract|78
+us.census.tiger.zcta5|9
+us.census.tiger.county|0
+(4 rows)
+column_id|_obs_geometryscores_numgeoms_50km_buffer
+us.census.tiger.block_group|10817
+us.census.tiger.census_tract|3396
+us.census.tiger.zcta5|484
+us.census.tiger.county|11
+(4 rows)
+column_id|_obs_geometryscores_numgeoms_500km_buffer
+us.census.tiger.block_group|48567
+us.census.tiger.census_tract|15823
+us.census.tiger.zcta5|6466
+us.census.tiger.county|295
+(4 rows)
+column_id|_obs_geometryscores_numgeoms_2500km_buffer
+us.census.tiger.block_group|165852
+us.census.tiger.census_tract|55283
+us.census.tiger.zcta5|27046
+us.census.tiger.county|2551
+(4 rows)
 _obs_geometryscores_500km_buffer_50_geoms
 t
 (1 row)
@@ -186,6 +204,12 @@ t
 _obs_geometryscores_500km_buffer_25000_geoms
 t
 (1 row)
+testarea_uses_tract
+t
+(1 row)
+points_use_bg
+t
+(1 row)
 _total_pop_in_legacy_builder_metadata
 t
 (1 row)
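
To see how the `numgeoms` estimates above feed into geometry selection, the score expression from `_OBS_GetGeometryScores` can be applied to the 50km-buffer counts. This is a sketch under two assumptions: `desired_num_geoms` is left at its default of 3000, and `percentfill` is taken to be 1.0 for every candidate, which the real function computes from raster tiles.

```sql
-- numgeoms values are copied from the 50km-buffer expected output;
-- percentfill = 1.0 is a hypothetical simplification.
WITH candidates (column_id, numgeoms, percentfill) AS (
  VALUES ('us.census.tiger.block_group',  10817, 1.0),
         ('us.census.tiger.census_tract',  3396, 1.0),
         ('us.census.tiger.zcta5',          484, 1.0),
         ('us.census.tiger.county',          11, 1.0)
)
SELECT column_id,
       ((100.0 / (1 + abs(log(0.0001 + 3000) - log(0.0001 + numgeoms))))
         * percentfill)::Numeric AS score
FROM candidates
ORDER BY score DESC;
```

Under these assumptions census tracts rank first, since 3396 is closest to 3000 on a log scale, followed by block groups, ZCTA5, and counties.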
--- a/src/pg/test/fixtures/load_fixtures.sql
+++ b/src/pg/test/fixtures/load_fixtures.sql
--- a/src/pg/test/sql/41_observatory_augmentation_test.sql
+++ b/src/pg/test/sql/41_observatory_augmentation_test.sql
@@ -268,7 +268,7 @@ SELECT
 (meta->0->>'numer_name') = 'Total Population' numer_name,
 (meta->0->>'denom_id') IS NULL denom_id,
 (meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
-(meta->0->>'normalization') IS NULL normalization
+(meta->0->>'normalization') = 'area' normalization
 FROM meta;

 -- OBS_GetMeta for point completes one partial measure with "best" metadata
@@ -290,7 +290,7 @@ SELECT
 (meta->0->>'denom_type') = 'Numeric' denom_type,
 (meta->0->>'denom_name') = 'Total Population' denom_name,
 (meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
-(meta->0->>'normalization') IS NULL normalization
+(meta->0->>'normalization') = 'denominated' normalization
 FROM meta;

 -- OBS_GetMeta for polygon completes one partial measure with "best" metadata
@@ -308,7 +308,7 @@ SELECT
 (meta->0->>'numer_name') = 'Total Population' numer_name,
 (meta->0->>'denom_id') IS NULL denom_id,
 (meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
-(meta->0->>'normalization') IS NULL normalization
+(meta->0->>'normalization') = 'area' normalization
 FROM meta;

 -- OBS_GetMeta for polygon completes one partial measure with "best" metadata
@@ -330,13 +330,13 @@ SELECT
 (meta->0->>'denom_type') = 'Numeric' denom_type,
 (meta->0->>'denom_name') = 'Total Population' denom_name,
 (meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
-(meta->0->>'normalization') IS NULL normalization
+(meta->0->>'normalization') = 'denominated' normalization
 FROM meta;

 -- OBS_GetMeta for point completes several partial measures with "best"
 -- metadata, includes geom alternatives if asked
 WITH meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
-  '[{"numer_id": "us.census.acs.B01001002"}]', null, 2) meta)
+  '[{"numer_id": "us.census.acs.B01001002", "max_score_rank": 2}]', null, 2) meta)
 SELECT
 (meta->0->>'id')::integer = 1 id,
 (meta->0->>'numer_id') = 'us.census.acs.B01001002' numer_id,
@@ -352,7 +352,7 @@ SELECT
 (meta->0->>'denom_type') = 'Numeric' denom_type,
 (meta->0->>'denom_name') = 'Total Population' denom_name,
 (meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
-(meta->0->>'normalization') IS NULL normalization,
+(meta->0->>'normalization') = 'denominated' normalization,
 (meta->1->>'id')::integer = 1 id,
 (meta->1->>'numer_id') = 'us.census.acs.B01001002' numer_id,
 (meta->1->>'timespan_rank')::integer = 1 timespan_rank,
@@ -367,7 +367,7 @@ SELECT
 (meta->1->>'denom_type') = 'Numeric' denom_type,
 (meta->1->>'denom_name') = 'Total Population' denom_name,
 (meta->1->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
-(meta->1->>'normalization') IS NULL normalization
+(meta->1->>'normalization') = 'denominated' normalization
 FROM meta;

 -- OBS_GetMeta for point completes several partial measures with "best" metadata
@@ -389,7 +389,7 @@ SELECT
 (meta->0->>'denom_type') = 'Numeric' denom_type,
 (meta->0->>'denom_name') = 'Total Population' denom_name,
 (meta->0->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
-(meta->0->>'normalization') IS NULL normalization
+(meta->0->>'normalization') = 'denominated' normalization
 FROM meta;

 -- OBS_GetMeta for point completes several partial measures with conflicting
@@ -398,6 +398,26 @@ SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
  '[{"numer_id": "us.census.acs.B01001002", "denom_id": "us.census.acs.B01001002", "geom_id": "us.census.tiger.census_tract"}]') IS NULL
 AS obs_getmeta_conflicting_metadata;

+-- OBS_GetMeta provides suggested name for simple meta request
+SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
+  '[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom"}]'
+)->0->>'suggested_name' = 'total_pop_2010_2014' obs_getmeta_suggested_name;
+
+-- OBS_GetMeta provides suggested name for simple meta request with implicit area norm
+SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
+  '[{"numer_id": "us.census.acs.B01003001"}]'
+)->0->>'suggested_name' = 'total_pop_per_sq_km_2010_2014' obs_getmeta_suggested_name_implicit_area;
+
+-- OBS_GetMeta provides suggested name for simple meta request with area norm
+SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
+  '[{"numer_id": "us.census.acs.B01003001", "normalization": "area"}]'
+)->0->>'suggested_name' = 'total_pop_per_sq_km_2010_2014' obs_getmeta_suggested_name_area;
+
+-- OBS_GetMeta provides suggested name for simple meta request with denom
+SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
+  '[{"numer_id": "us.census.acs.B01001002", "normalization": "denom"}]'
+)->0->>'suggested_name' = 'male_pop_rate_2010_2014' obs_getmeta_suggested_name_denom;
+
 -- OBS_GetData/OBS_GetMeta by id with empty list/null
 WITH data AS (SELECT * FROM cdb_observatory.OBS_GetData(ARRAY[]::TEXT[], null))
 SELECT ARRAY_AGG(data) IS NULL AS obs_getdata_geomval_empty_null FROM data;
@@ -662,25 +682,25 @@ FROM data;
 -- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure
 WITH
 meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
-  '[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}]') meta),
+  '[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}]') meta),
 data AS (SELECT * FROM cdb_observatory.OBS_GetData(
    ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
  (SELECT meta FROM meta), false))
 SELECT every(id = 1) is TRUE id,
       count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
-       abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop
+       abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop
 FROM data;

 -- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure + one text
 WITH
 meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
-  '[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
+  '[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
 data AS (SELECT * FROM cdb_observatory.OBS_GetData(
    ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
  (SELECT meta FROM meta), false))
 SELECT every(id = 1) is TRUE id,
       count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
-       abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop,
+       abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop,
       array_agg(distinct data->2->>'value') = '{"Block Group 1","Block Group 2","Block Group 3","Block Group 4","Block Group 5"}' correct_bg_names
 FROM data;

@@ -798,3 +818,146 @@ SELECT json_typeof(data->0->'value') = 'array' ary_type,
 AS OBS_GetData_API_geomrefs_args_string_return
 FROM cdb_observatory.obs_getdata(array['36047'],
      '[{"numer_type": "text", "numer_colname": "obs_getboundarybyid", "api_method": "obs_getboundarybyid", "api_args": ["us.census.tiger.county"]}]');
+
+-- Ensure consistent results below.
+select setseed(0);
+
+-- Check that random assortment of block groups in Brooklyn return accurate data
+WITH _geoms AS (
+  SELECT
+    (data->0->>'value')::geometry the_geom,
+    data->0->>'geomref' geom_ref,
+    (data->1->>'value')::numeric total_pop
+  FROM cdb_observatory.OBS_GetData(
+    array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
+    (SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
+      '[{"geom_id": "us.census.tiger.block_group"},
+        {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group", "normalization": "predenom"}]')),
+    FALSE
+  )
+  WHERE data->0->>'geomref' LIKE '36047%'
+  ORDER BY RANDOM()
+), geoms AS (
+  SELECT *, row_number() OVER () cartodb_id FROM _geoms
+), samples AS (
+  SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
+), filtered AS (
+  SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
+), summary AS (
+  SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
+    COUNT(*)::INT cnt,
+    ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
+    SUM(ST_Area(the_geom))::Numeric sumarea
+  FROM filtered
+  GROUP BY sample
+), meta AS (
+  SELECT sample, cdb_observatory.OBS_GetMeta(extent,
+    ('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
+    1, 1, cnt) meta
+  FROM summary
+  GROUP BY sample, extent, cnt, sumarea
+), results AS (
+  SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
+  FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
+  WHERE summary.sample = meta.sample
+) SELECT sample bg_sample
+ , MAX(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_max_error
+ , AVG(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_avg_error
+ , MIN(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_min_error
+FROM geoms, results
+WHERE cartodb_id = id
+GROUP BY sample
+ORDER BY sample
+;
+
+-- Check that a random assortment of tracts in Brooklyn returns accurate data
+WITH _geoms AS (
+  SELECT
+    (data->0->>'value')::geometry the_geom,
+    data->0->>'geomref' geom_ref,
+    (data->1->>'value')::numeric total_pop
+  FROM cdb_observatory.OBS_GetData(
+    array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
+    (SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
+      '[{"geom_id": "us.census.tiger.census_tract"},
+        {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.census_tract", "normalization": "predenom"}]')),
+    FALSE
+  )
+  WHERE data->0->>'geomref' LIKE '36047%'
+  ORDER BY RANDOM()
+), geoms AS (
+  SELECT *, row_number() OVER () cartodb_id FROM _geoms
+), samples AS (
+  SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
+), filtered AS (
+  SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
+), summary AS (
+  SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
+    COUNT(*)::INT cnt,
+    ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
+    SUM(ST_Area(the_geom))::Numeric sumarea
+  FROM filtered
+  GROUP BY sample
+), meta AS (
+  SELECT sample, cdb_observatory.OBS_GetMeta(extent,
+    ('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
+    1, 1, cnt) meta
+  FROM summary
+  GROUP BY sample, extent, cnt, sumarea
+), results AS (
+  SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
+  FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
+  WHERE summary.sample = meta.sample
+) SELECT sample tract_sample
+ , MAX(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_max_error
+ , AVG(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_avg_error
+ , MIN(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_min_error
+FROM geoms, results
+WHERE cartodb_id = id
+GROUP BY sample
+ORDER BY sample
+;
+
+-- Check that a random assortment of block group points in Brooklyn returns accurate data
+WITH _geoms AS (
+  SELECT
+    ST_PointOnSurface((data->0->>'value')::geometry) the_geom,
+    data->0->>'geomref' geom_ref,
+    (data->1->>'value')::numeric total_pop
+  FROM cdb_observatory.OBS_GetData(
+    array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
+    (SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
+      '[{"geom_id": "us.census.tiger.block_group"},
+        {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group", "normalization": "predenom"}]')),
+    FALSE
+  )
+  WHERE data->0->>'geomref' LIKE '36047%'
+), geoms AS (
+  SELECT *, row_number() OVER () cartodb_id FROM _geoms
+), samples AS (
+  SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
+), filtered AS (
+  SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
+), summary AS (
+  SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
+    COUNT(*)::INT cnt,
+    ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
+    SUM(ST_Area(the_geom))::Numeric sumarea
+  FROM filtered
+  GROUP BY sample
+), meta AS (
+  SELECT sample, cdb_observatory.OBS_GetMeta(extent,
+    ('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
+    1, 1, cnt) meta
+  FROM summary
+  GROUP BY sample, extent, cnt, sumarea
+), results AS (
+  SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
+  FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
+  WHERE summary.sample = meta.sample
+) SELECT
+ BOOL_AND(abs((geoms.total_pop - val) /
+      Coalesce(NullIf(total_pop, 0), 1)) = 0) is True no_bg_point_error
+FROM geoms, results
+WHERE cartodb_id = id
+;
--- a/src/pg/test/sql/42_observatory_exploration_test.sql
+++ b/src/pg/test/sql/42_observatory_exploration_test.sql
@@ -289,6 +289,11 @@ FROM cdb_observatory.OBS_GetAvailableGeometries(
 ) WHERE valid_timespan = True)
 AS _obs_getavailablegeometries_bg_not_1996;

+SELECT 'subsection/tags.boundary' IN (SELECT (Jsonb_Each(geom_tags)).key
+FROM cdb_observatory.OBS_GetAvailableGeometries(
+  ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)
+)) AS _obs_getavailablegeometries_has_boundary_tag;
+
 --
 -- OBS_GetAvailableTimespans tests
 --
@@ -360,9 +365,9 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
        'us.census.tiger.county', 'us.census.tiger.zcta5'])
      WHERE table_id LIKE '%2015%';

-SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
-       ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
-             'us.census.tiger.zcta5', 'us.census.tiger.county']
+SELECT ARRAY_AGG(column_id ORDER BY score DESC)
+       = ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
+             'us.census.tiger.county', 'us.census.tiger.zcta5']
       AS _obs_geometryscores_5km_buffer
       FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 5000)::Geometry(Geometry, 4326),
@@ -390,60 +395,55 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
      WHERE table_id LIKE '%2015%';

-SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
-       ARRAY['us.census.tiger.county', 'us.census.tiger.zcta5',
-             'us.census.tiger.census_tract', 'us.census.tiger.block_group']
+SELECT ARRAY_AGG(column_id ORDER BY score DESC)
+       = ARRAY['us.census.tiger.county', 'us.census.tiger.census_tract',
+             'us.census.tiger.zcta5', 'us.census.tiger.block_group']
      AS _obs_geometryscores_2500km_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 2500000)::Geometry(Geometry, 4326),
-  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
-        'us.census.tiger.zcta5', 'us.census.tiger.county'])
+  ARRAY['us.census.tiger.county', 'us.census.tiger.census_tract',
+        'us.census.tiger.zcta5', 'us.census.tiger.block_group'])
      WHERE table_id LIKE '%2015%';

-SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text
-      = '{ "us.census.tiger.block_group" : 9, "us.census.tiger.census_tract" : 3, "us.census.tiger.zcta5" : 0, "us.census.tiger.county" : 0 }'
-      AS _obs_geometryscores_numgeoms_500m_buffer
+SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_500m_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 500)::Geometry(Geometry, 4326),
  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
-      WHERE table_id LIKE '%2015%';
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC;

-SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
-      '{ "us.census.tiger.block_group" : 880, "us.census.tiger.census_tract" : 310, "us.census.tiger.zcta5" : 45, "us.census.tiger.county" : 1 }'
-      AS _obs_geometryscores_numgeoms_5km_buffer
+SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_5km_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 5000)::Geometry(Geometry, 4326),
  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
-      WHERE table_id LIKE '%2015%';
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC;

-SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
-      '{ "us.census.tiger.block_group" : 11531, "us.census.tiger.census_tract" : 3601, "us.census.tiger.zcta5" : 550, "us.census.tiger.county" : 14 }'
-      AS _obs_geometryscores_numgeoms_50km_buffer
+SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_50km_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 50000)::Geometry(Geometry, 4326),
  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
-      WHERE table_id LIKE '%2015%';
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC;

-SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
-      '{ "us.census.tiger.block_group" : 48917, "us.census.tiger.census_tract" : 15969, "us.census.tiger.zcta5" : 6534, "us.census.tiger.county" : 314 }'
-      AS _obs_geometryscores_numgeoms_500km_buffer
+SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_500km_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 500000)::Geometry(Geometry, 4326),
  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
-      WHERE table_id LIKE '%2015%';
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC;

-SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
-      '{ "us.census.tiger.block_group" : 169191, "us.census.tiger.census_tract" : 56469, "us.census.tiger.zcta5" : 26525, "us.census.tiger.county" : 2753 }'
-      AS _obs_geometryscores_numgeoms_2500km_buffer
+SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_2500km_buffer
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 2500000)::Geometry(Geometry, 4326),
  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
        'us.census.tiger.zcta5', 'us.census.tiger.county'])
-      WHERE table_id LIKE '%2015%';
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC;

 SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
       ARRAY['us.census.tiger.county', 'us.census.tiger.zcta5',
@@ -475,9 +475,9 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
        'us.census.tiger.zcta5', 'us.census.tiger.county'], 2500)
      WHERE table_id LIKE '%2015%';

-SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
-       ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
-             'us.census.tiger.zcta5', 'us.census.tiger.county']
+SELECT ARRAY_AGG(column_id ORDER BY score DESC)
+       = ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
+               'us.census.tiger.county', 'us.census.tiger.zcta5']
      AS _obs_geometryscores_500km_buffer_25000_geoms
      FROM cdb_observatory._OBS_GetGeometryScores(
  ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 50000)::Geometry(Geometry, 4326),
@@ -485,6 +485,44 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
        'us.census.tiger.zcta5', 'us.census.tiger.county'], 25000)
      WHERE table_id LIKE '%2015%';

+-- Check that one small geom approximates tract data
+WITH geoms AS (SELECT cdb_observatory._testarea() the_geom),
+summary AS (SELECT ST_SetSRID(ST_Extent(the_geom), 4326) extent,
+                   COUNT(*)::INT cnt,
+                   SUM(ST_Area(the_geom))::Numeric sumarea
+            FROM geoms)
+SELECT column_id = 'us.census.tiger.census_tract' testarea_uses_tract
+FROM summary, LATERAL (
+  SELECT *
+  FROM cdb_observatory._OBS_GetGeometryScores(extent,
+  ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
+        'us.census.tiger.zcta5', 'us.census.tiger.county'],
+        cnt, sumarea)) foo
+ORDER BY score DESC LIMIT 1;
+
+-- Check that randomly distributed points always use the smallest geometry if
+-- we order by numgeoms desc
+WITH geoms as (SELECT UNNEST(ARRAY[
+    cdb_observatory._testpoint(),
+    st_translate(cdb_observatory._testpoint(), -0.003, 0),
+    st_translate(cdb_observatory._testpoint(), -0.006, 0)
+]) the_geom),
+summary as (SELECT
+  ST_SetSRID(ST_Extent(the_geom), 4326) extent,
+  SUM(ST_Area(the_geom))::Numeric area,
+  COUNT(*)::INTEGER cnt
+  FROM geoms
+)
+SELECT column_id = 'us.census.tiger.block_group' points_use_bg
+      FROM summary, LATERAL (
+        SELECT * FROM cdb_observatory._OBS_GetGeometryScores(
+          extent,
+          ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
+                'us.census.tiger.zcta5', 'us.census.tiger.county'],
+        cnt, area)) foo
+      WHERE table_id LIKE '%2015%'
+      ORDER BY numgeoms DESC LIMIT 1;
+
 --
 -- OBS_LegacyBuilderMetadata tests
 --
--- a/src/python/test/autotest.py
+++ b/src/python/test/autotest.py
@@ -73,6 +73,7 @@ SKIP_COLUMNS = set([
    , 'uk.ons.LC3204WA0004'
    , 'uk.ons.LC3204WA0007'
    , 'uk.ons.LC3204WA0010'
+    , 'br.geo.subdistritos_name'
 ])

 MEASURE_COLUMNS = query('''
Author	SHA1	Message	Date
John Krauss	415a4ccc05	update NEWS for 1.5.1	2017-05-16 14:33:02 +00:00
John Krauss	ccb8092506	1.5.1 release artifact	2017-05-16 14:27:49 +00:00
John Krauss	6266262427	new code to handle mixed geometries more quickly	2017-05-10 20:24:21 +00:00
John Krauss	183c046289	release artifact	2017-04-26 20:08:44 +00:00
John Krauss	8df89f4a91	remove br subdistritos from testing	2017-04-25 18:57:12 +00:00
John Krauss	28694163a2	prefer geographic precision over most recent timespan, handles issues emerging from inclusion of 1-year acs	2017-04-25 18:53:12 +00:00
John Krauss	60c7f54315	update NEWS for 1.5.0, fix error in link in 1.4.0	2017-04-24 18:22:31 +00:00
John Krauss	3ebb0b8662	Merge branch 'release-v-1.5.0' into obs-getavailablegeometries-return-tags	2017-04-24 18:10:43 +00:00
John Krauss	a2e84696dc	fix tests to match fixture data	2017-04-24 18:01:38 +00:00
John Krauss	cd5cb38e8d	Merge branch 'release-v-1.5.0' into obs-getavailablegeometries-return-tags	2017-04-24 17:50:57 +00:00
John Krauss	26e1a2f461	Add tags to obs_getavailablegeometries Fixes #260 * Adds `geom_type`, `geom_extra`, and `geom_tags` to `OBS_GetAvailableGeometries`. This brings it up to spec with existing docs. * Adds `timespan_type`, `timespan_extra`, and `timespan_tags` to `OBS_GetAvailableTimespans` for consistency.	2017-04-03 21:51:32 +00:00
John Krauss	090a1add43	add suggested_name output to OBS_GetMeta. fixes #279	2017-04-03 19:44:00 +00:00
John Krauss	536af5e4a2	release artifact	2017-03-22 15:17:19 +00:00
John Krauss	ebf23d2a23	Merge branch 'develop' into release-v-1.4.0	2017-03-22 15:16:35 +00:00
John Krauss	f1afcf0d8e	update NEWS.md	2017-03-22 15:14:35 +00:00
John Krauss	3c0b40cf3f	more consistent arguments in docs	2017-03-22 15:12:50 +00:00
John Krauss	8a87dc7e9a	update NEWS.md	2017-03-21 21:24:50 +00:00
John Krauss	61552adba4	Allow for target_geoms and target_area override on column-by-column basis	2017-03-21 17:26:02 +00:00
csobier	36abbee64f	Merge pull request #274 from CartoDB/273-docs-edit clarification of docs for obs_getboundariesbygeometry function	2017-03-17 12:07:48 -04:00
csobier	5a76a7381e	clarification of docs for obs_getboundariesbygeometry function	2017-03-17 11:45:49 -04:00