Compare commits
48 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 3353ad0a32 | | |
| | b4ef3c77a9 | | |
| | 90a2421b6e | | |
| | fd21709ca1 | | |
| | 3791511d7d | | |
| | fad541c3fc | | |
| | 48ed086fec | | |
| | 7e550cf909 | | |
| | 6ab17bf8be | | |
| | 1f7f8015ad | | |
| | 6066ef028d | | |
| | 3c2e997a85 | | |
| | cef99c6343 | | |
| | 50d975ce9b | | |
| | c56633dd2a | | |
| | 2b26c5ad64 | | |
| | 3ed18ca1f0 | | |
| | 028c93170c | | |
| | 8d52857f01 | | |
| | 9e36e11bb3 | | |
| | adae37631e | | |
| | 8b98b6b64a | | |
| | aedc45f2a8 | | |
| | 8612da57f7 | | |
| | 24a736c72e | | |
| | cde6d5bfba | | |
| | d1f4e570ad | | |
| | 415a4ccc05 | | |
| | ccb8092506 | | |
| | 6266262427 | | |
| | 183c046289 | | |
| | 8df89f4a91 | | |
| | 28694163a2 | | |
| | 60c7f54315 | | |
| | 3ebb0b8662 | | |
| | a2e84696dc | | |
| | cd5cb38e8d | | |
| | 26e1a2f461 | | |
| | 090a1add43 | | |
| | 536af5e4a2 | | |
| | ebf23d2a23 | | |
| | f1afcf0d8e | | |
| | 3c0b40cf3f | | |
| | 8a87dc7e9a | | |
| | 61552adba4 | | |
| | 36abbee64f | | |
| | 5a76a7381e | | |
| | 217ca2d84d | | |
42
.travis.yml
Normal file
@@ -0,0 +1,42 @@
language: c

env:
  global:
    - PAGER=cat

before_install:
  - sudo add-apt-repository -y ppa:cartodb/postgresql-9.5
  - sudo add-apt-repository -y ppa:cartodb/gis
  - sudo add-apt-repository -y ppa:cartodb/gis-testing
  - sudo apt-get update

  # Install postgres db and build deps
  - sudo /etc/init.d/postgresql stop # stop travis default instance
  - sudo apt-get -y remove --purge postgresql-9.1
  - sudo apt-get -y remove --purge postgresql-9.2
  - sudo apt-get -y remove --purge postgresql-9.3
  - sudo apt-get -y remove --purge postgresql-9.4
  - sudo apt-get -y remove --purge postgresql-9.5
  - sudo rm -rf /var/lib/postgresql/
  - sudo rm -rf /var/log/postgresql/
  - sudo rm -rf /etc/postgresql/
  - sudo apt-get -y remove --purge postgis-2.2
  - sudo apt-get -y autoremove

  - sudo apt-get -y install postgresql-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-server-dev-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-plpython-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-9.5-postgis-scripts=2.2.2.0-cdb2
  - sudo apt-get -y install postgresql-9.5-postgis-2.2=2.2.2.0-cdb2

  # configure it to accept local connections from postgres
  - echo -e "# TYPE DATABASE USER ADDRESS METHOD \nlocal all postgres trust\nlocal all all trust\nhost all all 127.0.0.1/32 trust" \
    | sudo tee /etc/postgresql/9.5/main/pg_hba.conf
  - sudo /etc/init.d/postgresql restart 9.5

install:
  - sudo make install

script:
  - cd src/pg
  - make test || { cat src/pg/test/regression.diffs; false; }
107
NEWS.md
@@ -1,10 +1,97 @@
1.7.0 (2017-08-18)
------------------

__Improvements__

* Add Travis support to execute the extension tests ([#183](https://github.com/CartoDB/observatory-extension/issues/183))

__API Changes__

* Add new function `OBS_MetadataValidation` ([#303](https://github.com/CartoDB/observatory-extension/pull/303))

__Bugfixes__

* Fixed parentheses for `obs_getdata` with ids
* Fixed failing tests due to changes in the data dump for some TIGER geometries

1.6.0 (2017-07-20)
------------------

__Improvements__

* The current `OBS_GetAvailableNumerators` was not designed with our UI in
  mind, so it causes a lot of trouble and forces many hacks to reconcile our
  UI needs with the function's interface. This function is a better fit for
  our purposes. ([#300](https://github.com/CartoDB/observatory-extension/pull/300))
* Now use the new meta table `obs_meta_geom_numer_timespan` to filter
  the geometries by geometry timespan and/or numerator timespan (which
  is what we get when we use `obs_getavailabletimespans`) ([#302](https://github.com/CartoDB/observatory-extension/pull/302))

__Bugfixes__

* Right now we use INNER JOINs when joining `_procgeoms` with the data, so we
  end up with a NULL value instead of an (id, NULL) value. ([#298](https://github.com/CartoDB/observatory-extension/pull/298))


1.5.1 (2017-05-16)
------------------

__Improvements__

* Much improved performance for `OBS_GetData` when augmenting with several
  different geometries simultaneously ([#285](https://github.com/CartoDB/observatory-extension/pull/285))
* Return the automatically assigned normalization type from `OBS_GetMeta`
  ([#285](https://github.com/CartoDB/observatory-extension/pull/285))

1.5.0 (2017-04-24)
------------------

__API Changes__

* Add `suggested_name` to `OBS_GetMeta` responses
  ([#281](https://github.com/CartoDB/observatory-extension/pull/281))
* Add `geom_type`, `geom_extra`, and `geom_tags` to
  `OBS_GetAvailableGeometries`. This brings it up to spec with existing docs.
  ([#282](https://github.com/CartoDB/observatory-extension/pull/282))
* Add `timespan_type`, `timespan_extra`, and `timespan_tags` to
  `OBS_GetAvailableTimespans` for consistency.
  ([#282](https://github.com/CartoDB/observatory-extension/pull/282))

1.4.0 (2017-03-21)
------------------

__API Changes__

* Allow for override of `target_area` and `target_geoms` in `OBS_GetMeta`
  ([#276](https://github.com/CartoDB/observatory-extension/pull/276)). This
  allows the interface to work with points and sparse areas much better.
* Allow for override of `max_timespan_rank` and `max_score_rank` on an
  item-by-item basis for metadata.
* `numer_description`, `geom_description`, `denom_description`,
  `numer_t_description`, `denom_t_description` and `geom_t_description` are now
  returned as part of `OBS_GetMeta`.

__Improvements__

* Reduced amount of simplification done on input geometries (from 0.0001 above
  500 points to 0.00001 above 1000 points).
* Added tests to confirm that accurate results are returned from automatic
  boundary selection

1.3.5 (2017-03-15)
------------------

No changes. Artifact to allow for data update.

1.3.4 (2017-03-10)
------------------

__Bugfixes__

* Remove erroneously committed `RAISE NOTICE` in `OBS_GetData`

1.3.3 (2017-03-10)
------------------

__Bugfixes__
|
||||
|
||||
@@ -27,6 +114,7 @@ __Improvements__
|
||||
([#267](https://github.com/CartoDB/observatory-extension/pull/267))
|
||||
|
||||
1.3.2 (2017-03-02)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -34,6 +122,7 @@ __Bugfixes__
|
||||
This fixes issues with Camshaft.
|
||||
|
||||
1.3.1 (2017-02-16)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -44,6 +133,7 @@ __Improvements__
|
||||
called for measures for polygons
|
||||
|
||||
1.3.0 (2017-01-17)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -68,9 +158,8 @@ __Bugfixes__
|
||||
* Remove unnecessary dependency on `postgres_fdw`
|
||||
* `OBS_GetData()` now aggregates measures with mixed geoms correctly
|
||||
|
||||
__API Changes__
|
||||
|
||||
1.2.1 (2017-01-17)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -78,6 +167,7 @@ __Improvements__
|
||||
([#243](https://github.com/CartoDB/observatory-extension/pull/233))
|
||||
|
||||
1.2.0 (2016-12-28)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -98,6 +188,7 @@ __Improvements__
|
||||
* Return both `table_id` and `column_id` from `_OBS_GetGeometryScores`
|
||||
|
||||
1.1.7 (2016-12-15)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -110,6 +201,7 @@ __Improvements__
|
||||
* Yields a ~50% improvement in performance for `_OBS_GetGeometryScores`.
|
||||
|
||||
1.1.6 (2016-12-08)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -136,6 +228,7 @@ __Improvements__
|
||||
- Add ability to persist results to JSON for graph visualization later
|
||||
|
||||
1.1.5 (2016-11-29)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -143,6 +236,7 @@ __Bugfixes__
|
||||
a geometry where it does not exist ([#220](https://github.com/CartoDB/observatory-extension/issues/220)).
|
||||
|
||||
1.1.4 (2016-11-21)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -150,10 +244,12 @@ __Bugfixes__
|
||||
`OBS_GetLegacyMetadata` ([#216](https://github.com/CartoDB/observatory-extension/issues/216)).
|
||||
|
||||
1.1.3 (2016-11-15)
|
||||
------------------
|
||||
|
||||
* Temporarily ignore EU data for the sake of testing
|
||||
|
||||
1.1.2 (2016-11-09)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -169,12 +265,14 @@ __API Changes (Internal)__
|
||||
* Add internal `_OBS_GetGeometryScores`
|
||||
|
||||
1.1.1 (2016-10-14)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* Test points for Canada and France ([#204](https://github.com/CartoDB/observatory-extension/issues/120))
|
||||
|
||||
1.1.0 (2016-10-04)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -197,6 +295,7 @@ __API Changes__
|
||||
is also referred to here ([CartoDB/design#68](https://github.com/CartoDB/design/issues/68)).
|
||||
|
||||
1.0.7 (2016-09-20)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -208,6 +307,7 @@ __Improvements__
|
||||
* Automatic tests work for Canada and Thailand
|
||||
|
||||
1.0.6 (2016-09-08)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -215,6 +315,7 @@ __Improvements__
|
||||
framework logic from the observatory measure functions.
|
||||
|
||||
1.0.5 (2016-08-12)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -222,6 +323,7 @@ __Improvements__
|
||||
any HTTP SQL API.
|
||||
|
||||
1.0.4 (2016-07-26)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -230,6 +332,7 @@ __Bugfixes__
|
||||
([#173](https://github.com/CartoDB/observatory-extension/issues/173))
|
||||
|
||||
1.0.3 (2016-07-25)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
|
||||
@@ -4,7 +4,7 @@ Use the following functions to retrieve [Boundary](https://carto.com/docs/carto-
|
||||
|
||||
You can [access](https://carto.com/docs/carto-engine/data/accessing) boundaries through CARTO Builder. The same methods will work if you are using the CARTO Engine to develop your application. We [encourage you](http://docs/carto-engine/data/accessing/#best-practices) to use table modifying methods (UPDATE and INSERT) over dynamic methods (SELECT).
|
||||
|
||||
## OBS_GetBoundariesByGeometry(polygon geometry, geometry_id text)
|
||||
## OBS_GetBoundariesByGeometry(geom geometry, geometry_id text)
|
||||
|
||||
The ```OBS_GetBoundariesByGeometry(geometry, geometry_id)``` method returns a set of boundary geometries that intersect a supplied geometry. This can be used to find all boundaries that are within or overlap a bounding box. You have the ability to choose whether to retrieve all boundaries that intersect your supplied bounding box or only those that fall entirely inside of your bounding box.
|
||||
|
||||
@@ -12,7 +12,7 @@ The ```OBS_GetBoundariesByGeometry(geometry, geometry_id)``` method returns a se
|
||||
|
||||
Name |Description
|
||||
--- | ---
|
||||
polygon | a bounding box or other WGS84 geometry
|
||||
geom | a WGS84 geometry
|
||||
geometry_id | a string identifier for a boundary geometry
|
||||
timespan (optional) | year(s) to request from ('NULL' (default) gives most recent)
|
||||
overlap_type (optional) | one of '[intersects](http://postgis.net/docs/manual-2.2/ST_Intersects.html)' (default), '[contains](http://postgis.net/docs/manual-2.2/ST_Contains.html)', or '[within](http://postgis.net/docs/manual-2.2/ST_Within.html)'.
|
||||
@@ -26,7 +26,7 @@ Column Name | Description
|
||||
the_geom | a boundary geometry (e.g., US Census tract boundaries)
|
||||
geom_refs | a string identifier for the geometry (e.g., geoids of US Census tracts)
|
||||
|
||||
If geometries are not found for the requested `polygon`, `geometry_id`, `timespan`, or `overlap_type`, then null values are returned.
|
||||
If geometries are not found for the requested `geom`, `geometry_id`, `timespan`, or `overlap_type`, then null values are returned.
|
||||
|
||||
#### Example
|
||||
|
||||
@@ -44,7 +44,6 @@ FROM OBS_GetBoundariesByGeometry(
|
||||
|
||||
#### Errors
|
||||
|
||||
* If a geometry other than a point is passed as the first argument, an error is thrown: `Invalid geometry type (ST_Polygon), expecting 'ST_Point'`
|
||||
* If an `overlap_type` other than the valid ones listed above is entered, then an error is thrown
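
The `overlap_type` rule above can be mirrored on the client before a query is sent. The following is a minimal sketch, assuming a hypothetical helper named `check_overlap_type`; the accepted values are the three listed in the arguments table, exact lowercase matching is an assumption, and the error text is illustrative rather than the extension's own message.

```python
# Hypothetical client-side check; not part of the extension itself.
VALID_OVERLAP_TYPES = ("intersects", "contains", "within")


def check_overlap_type(overlap_type=None):
    """Return a usable overlap type, falling back to the documented default."""
    if overlap_type is None:
        return "intersects"  # documented default
    if overlap_type not in VALID_OVERLAP_TYPES:
        # Illustrative message only; the extension's error text may differ.
        raise ValueError(
            "overlap_type must be one of %s, got %r"
            % (", ".join(VALID_OVERLAP_TYPES), overlap_type)
        )
    return overlap_type
```

Rejecting a bad value early avoids a round trip to the database only to receive the error described above.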
|
||||
|
||||
## OBS_GetPointsByGeometry(polygon geometry, geometry_id text)
|
||||
|
||||
@@ -327,9 +327,12 @@ timespan_id | Text | The ID of the timespan
|
||||
timespan_name | Text | A human readable name for the timespan
|
||||
timespan_description | Text | Ignored
|
||||
timespan_weight | Numeric | Ignored
|
||||
timespan_aggregate | Text | Ignored
|
||||
timespan_license | Text | Ignored
|
||||
timespan_source | Text | Ignored
|
||||
timespan_aggregate | Text | Ignored
|
||||
timespan_type | Text | Ignored
|
||||
timespan_extra | JSONB | Ignored
|
||||
timespan_tags | JSONB | Ignored
|
||||
valid_numer | Boolean | True if the `numer_id` argument is a valid numerator for this timespan, False otherwise
|
||||
valid_denom | Boolean | True if the `denom_id` argument is a valid denominator for this timespan, False otherwise
|
||||
valid_geom | Boolean | True if the `geom_id` argument is a valid geometry for this timespan, False otherwise
|
||||
|
||||
@@ -108,7 +108,7 @@ The ```OBS_GetMeasure(polygon, measure_id)``` function returns any Data Observat
|
||||
Name |Description
|
||||
--- | ---
|
||||
polygon_geometry | a WGS84 polygon geometry (the_geom)
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
normalize | for measures that are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
|
||||
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
|
||||
time_span | time span of interest (e.g., 2010 - 2014)
|
||||
@@ -143,7 +143,7 @@ The ```OBS_GetMeasureById(geom_ref, measure_id, boundary_id)``` function returns
|
||||
Name |Description
|
||||
--- | ---
|
||||
geom_ref | a geometry reference (e.g., a US Census geoid)
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
|
||||
time_span (optional) | time span of interest (e.g., 2010 - 2014). If `NULL` is passed, the measure from the most recent data will be used.
|
||||
|
||||
@@ -196,7 +196,7 @@ UPDATE tablename
|
||||
SET segmentation = OBS_GetCategory(the_geom, 'us.census.spielman_singleton_segments.X55')
|
||||
```
|
||||
|
||||
## OBS_GetMeta(extent geometry, metadata json, max_timespan_rank, max_boundary_score_rank, num_target_geoms)
|
||||
## OBS_GetMeta(extent geometry, metadata json, max_timespan_rank, max_score_rank, target_geoms)
|
||||
|
||||
The ```OBS_GetMeta(extent, metadata)``` function returns a completed Data
|
||||
Observatory metadata JSON Object for use in ```OBS_GetData(geomvals,
|
||||
@@ -213,9 +213,9 @@ Name | Description
|
||||
---- | -----------
|
||||
extent | A geometry of the extent of the input geometries
|
||||
metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
|
||||
max_timespan_rank | How many historical time periods to include. Defaults to 1
|
||||
max_boundary_score_rank | How many alternative boundary levels to include. Defaults to 1
|
||||
num_target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
|
||||
num_timespan_options | How many historical time periods to include. Defaults to 1
|
||||
num_score_options | How many alternative boundary levels to include. Defaults to 1
|
||||
target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
|
||||
|
||||
The schema of the metadata input objects is as follows:
|
||||
|
||||
@@ -227,6 +227,10 @@ normalization | The desired normalization. One of 'area', 'prenormalized', or '
|
||||
denom_id | Identifier for a desired normalization column in case `normalization` is 'denominated'. Will be automatically assigned if necessary. Ignored if this metadata object specifies a geometry.
|
||||
numer_timespan | The desired timespan for the measurement. Defaults to most recent timespan available if left unspecified.
|
||||
geom_timespan | The desired timespan for the geometry. Defaults to timespan matching numer_timespan if left unspecified.
|
||||
target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area. Unit is square degrees WGS84. Set this to `0` if you want to use the smallest source geometry for this element of metadata, for example if you're passing in points.
|
||||
target_geoms | Override global `target_geoms` for this element of metadata
|
||||
max_timespan_rank | Only include timespans of this recency (for example, `1` is only the most recent timespan). No limit by default
|
||||
max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary). Is `1` by default
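
The per-item overrides above can be assembled programmatically before being passed as the `metadata` argument. This is a rough sketch, assuming a hypothetical helper `build_metadata`; the allowed keys are the ones documented for metadata input objects, and rejecting unknown keys is a client-side convenience, not behavior of the extension.

```python
import json

# Keys documented for metadata input objects.
ALLOWED_KEYS = {
    "numer_id", "geom_id", "normalization", "denom_id",
    "numer_timespan", "geom_timespan", "target_area",
    "target_geoms", "max_timespan_rank", "max_score_rank",
}


def build_metadata(items):
    """Serialize metadata input objects, rejecting undocumented keys."""
    for item in items:
        unknown = set(item) - ALLOWED_KEYS
        if unknown:
            raise KeyError(
                "unknown metadata keys: %s" % ", ".join(sorted(unknown))
            )
    return json.dumps(items)


# First item: point input, so ask for the smallest source geometry.
# Second item: override the global target_geoms and keep only the
# most recent timespan. The value 500 is an arbitrary example.
params = build_metadata([
    {"numer_id": "us.census.acs.B01003001", "target_area": 0},
    {"numer_id": "us.census.acs.B01001002",
     "target_geoms": 500, "max_timespan_rank": 1},
])
```

The resulting string can then be bound as the JSON parameter of an `OBS_GetMeta` call.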
|
||||
|
||||
#### Returns
|
||||
|
||||
@@ -242,9 +246,12 @@ fail.
|
||||
|
||||
Metadata Output Key | Description
|
||||
--- | -----------
|
||||
suggested_name | A suggested column name for adding this to an existing table
|
||||
numer_id | Identifier for desired measurement
|
||||
numer_timespan | Timespan that will be used for the desired measurement
|
||||
numer_name | Human-readable name of desired measure
|
||||
numer_description | Long human-readable description of the desired measure
|
||||
numer_t_description | Further information about the source table
|
||||
numer_type | PostgreSQL/PostGIS type of desired measure
|
||||
numer_colname | Internal identifier for column name
|
||||
numer_tablename | Internal identifier for table
|
||||
@@ -252,6 +259,8 @@ numer_geomref_colname | Internal identifier for geomref column name
|
||||
denom_id | Identifier for desired normalization
|
||||
denom_timespan | Timespan that will be used for the desired normalization
|
||||
denom_name | Human-readable name of desired measure's normalization
|
||||
denom_description | Long human-readable description of the desired measure's normalization
|
||||
denom_t_description | Further information about the source table
|
||||
denom_type | PostgreSQL/PostGIS type of desired measure's normalization
|
||||
denom_colname | Internal identifier for normalization column name
|
||||
denom_tablename | Internal identifier for normalization table
|
||||
@@ -259,12 +268,14 @@ denom_geomref_colname | Internal identifier for normalization geomref column nam
|
||||
geom_id | Identifier for desired boundary geometry
|
||||
geom_timespan | Timespan that will be used for the desired boundary geometry
|
||||
geom_name | Human-readable name of desired boundary geometry
|
||||
geom_description | Long human-readable description of the desired boundary geometry
|
||||
geom_t_description | Further information about the source table
|
||||
geom_type | PostgreSQL/PostGIS type of desired boundary geometry
|
||||
geom_colname | Internal identifier for boundary geometry column name
|
||||
geom_tablename | Internal identifier for boundary geometry table
|
||||
geom_geomref_colname | Internal identifier for boundary geometry ref column name
|
||||
timespan_rank | Ranking of this measurement by time, most recent is 1, second most recent 2, etc.
|
||||
score | The score of this measurement's boundary compared to the `extent` and `num_target_geoms` passed in. Between 0 and 100.
|
||||
score | The score of this measurement's boundary compared to the `extent` and `target_geoms` passed in. Between 0 and 100.
|
||||
score_rank | The ranking of this measurement's boundary, highest ranked is 1, second is 2, etc.
|
||||
numer_aggregate | The aggregate type of the numerator, either `sum`, `average`, `median`, or blank
|
||||
denom_aggregate | The aggregate type of the denominator, either `sum`, `average`, `median`, or blank
|
||||
@@ -310,6 +321,55 @@ SELECT OBS_GetMeta(
|
||||
) FROM tablename
|
||||
```
|
||||
## OBS_MetadataValidation(extent geometry, geometry_type text, metadata json, target_geoms)

The ```OBS_MetadataValidation``` function checks the extent, geometry type, and metadata intended for the ```OBS_GetMeta``` function against a set of known issues.

#### Arguments

Name | Description
---- | -----------
extent | A geometry of the extent of the input geometries
geometry_type | The geometry type of the source data
metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optional additional parameters about that column
target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest

The schema of the metadata input objects is as follows:

Metadata Input Key | Description
--- | -----------
numer_id | The identifier for the desired measurement. If left blank while a `geom_id` is specified, the column returns a geometry instead of a measurement
geom_id | Identifier for a desired geographic boundary level used to calculate measures. If undefined, this is automatically assigned. If defined while `numer_id` is blank, the column returns a geometry instead of a measurement
normalization | The desired normalization. One of 'area', 'prenormalized', or 'denominated'. 'area' normalizes the measure per square kilometer, 'prenormalized' returns the original value, and 'denominated' normalizes by a denominator. If the metadata object specifies a geometry, this is ignored
denom_id | When `normalization` is 'denominated', this is the identifier for a desired normalization column. It is automatically assigned if necessary. If the metadata object specifies a geometry, this is ignored
numer_timespan | The desired timespan for the measurement. If left unspecified, it defaults to the most recent timespan available
geom_timespan | The desired timespan for the geometry. If left unspecified, it defaults to the timespan matching `numer_timespan`
target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area. Unit is square degrees WGS84. Set this to `0` if you want to use the smallest source geometry for this element of metadata, for example if you are passing in points
target_geoms | Override global `target_geoms` for this element of metadata
max_timespan_rank | Only include timespans of this recency (for example, `1` is only the most recent timespan). There is no limit by default
max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary). The default is `1`

#### Returns

Key | Description
--- | -----------
valid | A boolean field that indicates whether the validation was successful
errors | A text array listing the validation errors found

#### Examples

Validate metadata with two additional columns of US census data, using a boundary relevant for the geometry provided and the latest timespan. Limited to the most recent column, and the most relevant, based on the extent and density of input geometries in `tablename`.

```SQL
SELECT OBS_MetadataValidation(
  ST_SetSRID(ST_Extent(the_geom), 4326),
  ST_GeometryType(the_geom),
  '[{"numer_id": "us.census.acs.B01003001"}, {"numer_id": "us.census.acs.B01001002"}]',
  COUNT(*)::INTEGER
) FROM tablename
GROUP BY ST_GeometryType(the_geom)
```
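
On the client side, the two documented return keys can be turned into a short status message. This is a minimal sketch, assuming the result row has already been fetched into a Python mapping with `valid` and `errors` keys; the helper name `summarize_validation` and the sample error string are hypothetical.

```python
def summarize_validation(result):
    """Turn a {'valid': ..., 'errors': [...]} mapping into a one-line message."""
    errors = list(result.get("errors") or [])
    if result.get("valid") and not errors:
        return "metadata is valid"
    if not errors:
        return "metadata is invalid (no error details returned)"
    return "metadata is invalid: " + "; ".join(errors)


# Placeholder error text for illustration; real messages come from the extension.
print(summarize_validation({"valid": False, "errors": ["example error"]}))
```

Treating a missing or empty `errors` value as "no details" keeps the helper usable even if the array comes back as NULL.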
|
||||
|
||||
## OBS_GetData(geomvals array[geomval], metadata json)
|
||||
|
||||
The ```OBS_GetData(geomvals, metadata)``` function returns a measure and/or
|
||||
@@ -454,7 +514,7 @@ WITH meta AS (
|
||||
'[{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.county"}]'
|
||||
) meta FROM tablename)
|
||||
SELECT id AS fips, (data->0->>'value')::Numeric AS pop_density
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG(fips) FROM tablename),
|
||||
(SELECT meta FROM meta))
|
||||
```
|
||||
|
||||
@@ -470,7 +530,7 @@ WITH meta AS (
|
||||
) meta FROM tablename),
|
||||
data as (
|
||||
SELECT id AS fips, (data->0->>'value') AS pop_density
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG(fips) FROM tablename),
|
||||
(SELECT meta FROM meta)))
|
||||
UPDATE tablename
|
||||
SET pop_density = data.pop_density
|
||||
|
||||
2252
release/observatory--1.3.5.sql
Normal file
File diff suppressed because one or more lines are too long
2300
release/observatory--1.4.0.sql
Normal file
File diff suppressed because one or more lines are too long
2327
release/observatory--1.5.0.sql
Normal file
File diff suppressed because one or more lines are too long
2311
release/observatory--1.5.1.sql
Normal file
File diff suppressed because one or more lines are too long
2400
release/observatory--1.6.0.sql
Normal file
File diff suppressed because one or more lines are too long
2443
release/observatory--1.7.0.sql
Normal file
File diff suppressed because one or more lines are too long
@@ -1,5 +1,5 @@
|
||||
comment = 'CartoDB Observatory backend extension'
|
||||
default_version = '1.3.4'
|
||||
default_version = '1.7.0'
|
||||
requires = 'postgis'
|
||||
superuser = true
|
||||
schema = cdb_observatory
|
||||
|
||||
@@ -52,8 +52,8 @@ def get_tablename_query(column_id, boundary_id, timespan):
|
||||
METADATA_TABLES = ['obs_table', 'obs_column_table', 'obs_column', 'obs_column_tag',
|
||||
'obs_tag', 'obs_column_to_column', 'obs_dump_version', 'obs_meta',
|
||||
'obs_meta_numer', 'obs_meta_denom', 'obs_meta_geom',
|
||||
'obs_meta_timespan', 'obs_column_table_tile',
|
||||
'obs_column_table_tile_simple']
|
||||
'obs_meta_timespan', 'obs_meta_geom_numer_timespan',
|
||||
'obs_column_table_tile', 'obs_column_table_tile_simple']
|
||||
|
||||
FIXTURES = [
|
||||
('us.census.acs.B01003001_quantile', 'us.census.tiger.census_tract', '2010 - 2014'),
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
requests
|
||||
nose
|
||||
nose_parameterized
|
||||
psycopg2
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
comment = 'CartoDB Observatory backend extension'
|
||||
default_version = '1.3.4'
|
||||
default_version = '1.7.0'
|
||||
requires = 'postgis'
|
||||
superuser = true
|
||||
schema = cdb_observatory
|
||||
|
||||
@@ -102,8 +102,8 @@ $$ LANGUAGE plpgsql STABLE;
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetMeta(
|
||||
geom geometry(Geometry, 4326),
|
||||
params JSON,
|
||||
max_timespan_rank INTEGER DEFAULT NULL, -- cutoff for timespan ranks when there's ambiguity
|
||||
max_score_rank INTEGER DEFAULT NULL, -- cutoff for geom ranks when there's ambiguity
|
||||
num_timespan_options INTEGER DEFAULT NULL, -- how many timespan options to show
|
||||
num_score_options INTEGER DEFAULT NULL, -- how many score options to show
|
||||
target_geoms INTEGER DEFAULT NULL
|
||||
)
|
||||
RETURNS JSON
|
||||
@@ -115,20 +115,34 @@ DECLARE
|
||||
scores_clause TEXT;
|
||||
result JSON;
|
||||
BEGIN
|
||||
IF max_timespan_rank IS NULL THEN
|
||||
max_timespan_rank := 1;
|
||||
IF num_timespan_options IS NULL THEN
|
||||
num_timespan_options := 1;
|
||||
END IF;
|
||||
IF max_score_rank IS NULL THEN
|
||||
max_score_rank := 1;
|
||||
IF num_score_options IS NULL THEN
|
||||
num_score_options := 1;
|
||||
END IF;
|
||||
|
||||
numer_filters := (SELECT Array_Agg(val) FILTER (WHERE val IS NOT NULL) FROM (SELECT (JSON_Array_Elements(params))->>'numer_id' val) foo);
|
||||
geom_filters := (SELECT Array_Agg(val) FILTER (WHERE val IS NOT NULL) FROM (SELECT (JSON_Array_Elements(params))->>'geom_id' val) bar);
|
||||
meta_filter_clause := '(m.numer_id = ANY ($6) OR m.geom_id = ANY ($7))';
|
||||
|
||||
scores_clause := 'SELECT *
|
||||
FROM cdb_observatory._OBS_GetGeometryScores($1,
|
||||
(SELECT Array_Agg(geom_id) FROM meta), $2) scores ';
|
||||
scores_clause := ' agg_geoms AS (
|
||||
SELECT target_geoms, target_area, ARRAY_AGG(geom_id) geom_ids
|
||||
FROM meta
|
||||
GROUP BY target_geoms, target_area
|
||||
), scores AS (
|
||||
SELECT target_geoms, target_area,
|
||||
CASE target_area
|
||||
-- point-specific, just order by numgeoms instead of score
|
||||
WHEN 0 THEN scores.numgeoms
|
||||
-- has some area, use proper scoring
|
||||
ELSE scores.score
|
||||
END AS score,
|
||||
scores.numgeoms, scores.table_id, scores.column_id
|
||||
FROM agg_geoms,
|
||||
LATERAL cdb_observatory._OBS_GetGeometryScores($1,
|
||||
geom_ids, COALESCE(target_geoms, $2), target_area) scores
|
||||
) ';
|
||||
|
||||
IF JSON_Array_Length(params) = 1 THEN
|
||||
IF numer_filters IS NULL AND geom_filters IS NOT NULL THEN
|
||||
@@ -142,21 +156,22 @@ BEGIN
|
||||
END IF;
|
||||
|
||||
IF geom_filters IS NOT NULL AND numer_filters IS NOT NULL THEN
|
||||
scores_clause := 'SELECT 1 score, null, geom_tid table_id, geom_id column_id,
|
||||
null, null, null, null, null, null
|
||||
FROM meta ';
|
||||
scores_clause := 'scores AS (
|
||||
SELECT NULL::INTEGER target_geoms, NULL::Numeric target_area,
|
||||
1 score, null, geom_tid table_id, geom_id column_id,
|
||||
NULL::Integer numgeoms
|
||||
FROM meta) ';
|
||||
END IF;
|
||||
END IF;
|
||||
|
||||
EXECUTE format($string$
|
||||
WITH _filters AS (SELECT
|
||||
generate_series(1, array_length($3, 1)) id,
|
||||
(unnest($3))->>'numer_id' numer_id,
|
||||
(unnest($3))->>'denom_id' denom_id,
|
||||
(unnest($3))->>'geom_id' geom_id,
|
||||
(unnest($3))->>'numer_timespan' numer_timespan,
|
||||
(unnest($3))->>'geom_timespan' geom_timespan,
|
||||
(unnest($3))->>'normalization' normalization
|
||||
row_number() over () id, *
|
||||
FROM json_to_recordset($3)
|
||||
AS x(numer_id TEXT, denom_id TEXT, geom_id TEXT, numer_timespan TEXT,
|
||||
geom_timespan TEXT, normalization TEXT, max_timespan_rank TEXT,
|
||||
max_score_rank TEXT, target_geoms INTEGER, target_area Numeric
|
||||
)
|
||||
), meta AS (SELECT
|
||||
id,
|
||||
f.numer_id,
|
||||
@@ -166,6 +181,8 @@ BEGIN
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_tablename END numer_tablename,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_type END numer_type,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_name END numer_name,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_description END numer_description,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_t_description END numer_t_description,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE m.numer_timespan END numer_timespan,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE m.denom_id END denom_id,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_aggregate END denom_aggregate,
|
||||
@@ -173,6 +190,8 @@ BEGIN
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_geomref_colname END denom_geomref_colname,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_tablename END denom_tablename,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_name END denom_name,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_description END denom_description,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_t_description END denom_t_description,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_type END denom_type,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_reltype END denom_reltype,
|
||||
m.geom_id,
|
||||
@@ -182,8 +201,24 @@ BEGIN
|
||||
geom_geomref_colname,
|
||||
geom_tablename,
|
||||
geom_name,
|
||||
geom_description,
|
||||
geom_t_description,
|
||||
geom_type,
|
||||
normalization
|
||||
Coalesce(normalization,
|
||||
-- automatically assign normalization to numeric numerators
|
||||
CASE WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
CASE WHEN denom_reltype ILIKE 'denominator' THEN 'denominated'
|
||||
WHEN numer_aggregate ILIKE 'sum' THEN 'area'
|
||||
WHEN numer_aggregate IN ('median', 'average') AND denom_reltype ILIKE 'universe'
|
||||
THEN 'prenormalized'
|
||||
ELSE 'prenormalized'
|
||||
END ELSE NULL
|
||||
END
|
||||
) normalization,
|
||||
max_timespan_rank,
|
||||
max_score_rank,
|
||||
target_geoms,
|
||||
target_area
|
||||
FROM observatory.obs_meta m JOIN _filters f
|
||||
ON CASE WHEN f.numer_id IS NULL THEN m.geom_id ELSE m.numer_id END =
|
||||
CASE WHEN f.numer_id IS NULL THEN f.geom_id ELSE f.numer_id END
|
||||
@@ -194,9 +229,8 @@ BEGIN
|
||||
AND (m.geom_id = f.geom_id OR COALESCE(f.geom_id, '') = '')
|
||||
AND (m.geom_timespan = f.geom_timespan OR COALESCE(f.geom_timespan, '') = '')
|
||||
AND (m.numer_timespan = f.numer_timespan OR COALESCE(f.numer_timespan, '') = '')
|
||||
), scores AS (
|
||||
%s
|
||||
), groups AS (SELECT
|
||||
), %s
|
||||
, groups AS (SELECT
|
||||
id,
|
||||
scores.score,
|
||||
numer_timespan,
|
||||
@@ -207,45 +241,68 @@ BEGIN
|
||||
'numer_id', numer_id,
|
||||
'timespan_rank', dense_rank() OVER (PARTITION BY id ORDER BY numer_timespan DESC),
|
||||
'score_rank', dense_rank() OVER (PARTITION BY id ORDER BY score DESC),
|
||||
'timespan_rownum', row_number() over
|
||||
(PARTITION BY id, score ORDER BY numer_timespan DESC, Coalesce(denom_id, '')),
|
||||
'score_rownum', row_number() over
|
||||
(PARTITION BY id, numer_timespan ORDER BY score DESC, Coalesce(denom_id, '')),
|
||||
'score', scores.score,
|
||||
'suggested_name', cdb_observatory.FIRST(
|
||||
LOWER(TRIM(BOTH '_' FROM regexp_replace(CASE WHEN numer_id IS NOT NULL
|
||||
THEN CASE
|
||||
WHEN normalization ILIKE 'area%%' THEN numer_colname || ' per sq km'
|
||||
WHEN normalization ILIKE 'denom%%' THEN numer_colname || ' rate'
|
||||
ELSE numer_colname
|
||||
END || ' ' || numer_timespan
|
||||
ELSE geom_name || ' ' || geom_timespan
|
||||
END, '[^a-zA-Z0-9]+', '_', 'g')))
|
||||
),
|
||||
'numer_aggregate', cdb_observatory.FIRST(meta.numer_aggregate),
|
||||
'numer_colname', cdb_observatory.FIRST(meta.numer_colname),
|
||||
'numer_geomref_colname', cdb_observatory.FIRST(meta.numer_geomref_colname),
|
||||
'numer_tablename', cdb_observatory.FIRST(meta.numer_tablename),
|
||||
'numer_type', cdb_observatory.FIRST(meta.numer_type),
|
||||
--'numer_description', cdb_observatory.FIRST(meta.numer_description),
|
||||
--'numer_t_description', cdb_observatory.FIRST(meta.numer_t_description),
|
||||
'numer_description', cdb_observatory.FIRST(meta.numer_description),
|
||||
'numer_t_description', cdb_observatory.FIRST(meta.numer_t_description),
|
||||
'denom_aggregate', cdb_observatory.FIRST(meta.denom_aggregate),
|
||||
'denom_colname', cdb_observatory.FIRST(denom_colname),
|
||||
'denom_geomref_colname', cdb_observatory.FIRST(denom_geomref_colname),
|
||||
'denom_tablename', cdb_observatory.FIRST(denom_tablename),
|
||||
'denom_type', cdb_observatory.FIRST(meta.denom_type),
|
||||
'denom_reltype', cdb_observatory.FIRST(meta.denom_reltype),
|
||||
--'denom_description', cdb_observatory.FIRST(meta.denom_description),
|
||||
--'denom_t_description', cdb_observatory.FIRST(meta.denom_t_description),
|
||||
'denom_description', cdb_observatory.FIRST(meta.denom_description),
|
||||
'denom_t_description', cdb_observatory.FIRST(meta.denom_t_description),
|
||||
'geom_colname', cdb_observatory.FIRST(geom_colname),
|
||||
'geom_geomref_colname', cdb_observatory.FIRST(geom_geomref_colname),
|
||||
'geom_tablename', cdb_observatory.FIRST(geom_tablename),
|
||||
'geom_type', cdb_observatory.FIRST(meta.geom_type),
|
||||
'geom_timespan', cdb_observatory.FIRST(meta.geom_timespan),
|
||||
--'geom_description', cdb_observatory.FIRST(meta.geom_description),
|
||||
--'geom_t_description', cdb_observatory.FIRST(meta.geom_t_description),
|
||||
'geom_description', cdb_observatory.FIRST(meta.geom_description),
|
||||
'geom_t_description', cdb_observatory.FIRST(meta.geom_t_description),
|
||||
'numer_timespan', cdb_observatory.FIRST(numer_timespan),
|
||||
'numer_name', cdb_observatory.FIRST(numer_name),
|
||||
'denom_name', cdb_observatory.FIRST(denom_name),
|
||||
'geom_name', cdb_observatory.FIRST(geom_name),
|
||||
'normalization', cdb_observatory.FIRST(normalization),
|
||||
'max_timespan_rank', cdb_observatory.FIRST(max_timespan_rank),
|
||||
'max_score_rank', cdb_observatory.FIRST(max_score_rank),
|
||||
'target_geoms', cdb_observatory.FIRST(scores.target_geoms),
|
||||
'target_area', cdb_observatory.FIRST(scores.target_area),
|
||||
'num_geoms', cdb_observatory.FIRST(scores.numgeoms),
|
||||
'denom_id', denom_id,
|
||||
'geom_id', meta.geom_id
|
||||
) metadata
|
||||
FROM meta, scores
|
||||
WHERE meta.geom_id = scores.column_id
|
||||
AND meta.geom_tid = scores.table_id
|
||||
AND COALESCE(meta.target_geoms, 0) = COALESCE(scores.target_geoms, 0)
|
||||
AND COALESCE(meta.target_area, 0) = COALESCE(scores.target_area, 0)
|
||||
GROUP BY id, score, numer_id, denom_id, geom_id, numer_timespan
|
||||
) SELECT JSON_AGG(metadata ORDER BY id)
|
||||
FROM groups
|
||||
WHERE timespan_rank <= $4
|
||||
AND score_rank <= $5
|
||||
WHERE timespan_rank <= Coalesce((metadata->>'max_timespan_rank')::INTEGER, 'infinity'::FLOAT)
|
||||
AND score_rank <= Coalesce((metadata->>'max_score_rank')::INTEGER, 1)
|
||||
AND (metadata->>'timespan_rownum')::INTEGER <= $4
|
||||
AND (metadata->>'score_rownum')::INTEGER <= $5
|
||||
$string$, meta_filter_clause, scores_clause)
|
||||
INTO result
|
||||
USING
|
||||
@@ -254,9 +311,9 @@ BEGIN
|
||||
ELSE geom
|
||||
END,
|
||||
target_geoms,
|
||||
(SELECT ARRAY(SELECT json_array_elements_text(params))::json[]),
|
||||
max_timespan_rank,
|
||||
max_score_rank, numer_filters, geom_filters
|
||||
params,
|
||||
num_timespan_options,
|
||||
num_score_options, numer_filters, geom_filters
|
||||
;
|
||||
RETURN result;
|
||||
END;
|
||||
@@ -536,14 +593,9 @@ RETURNS TABLE (
|
||||
)
|
||||
AS $$
|
||||
DECLARE
|
||||
geom_colspecs TEXT;
|
||||
geom_tables TEXT;
|
||||
geomrefs_alias TEXT;
|
||||
geomrefs_noalias TEXT;
|
||||
data_colspecs TEXT;
|
||||
data_tables TEXT;
|
||||
obs_wheres TEXT;
|
||||
user_wheres TEXT;
|
||||
procgeom_clauses TEXT;
|
||||
val_clauses TEXT;
|
||||
json_clause TEXT;
|
||||
geomtype TEXT;
|
||||
BEGIN
|
||||
IF params IS NULL OR JSON_ARRAY_LENGTH(params) = 0 OR ARRAY_LENGTH(geomvals, 1) IS NULL THEN
|
||||
@@ -553,250 +605,230 @@ BEGIN
|
||||
|
||||
geomtype := ST_GeometryType(geomvals[1].geom);
|
||||
|
||||
EXECUTE
|
||||
$query$
|
||||
WITH _meta AS (SELECT
|
||||
row_number() over () colid,
|
||||
meta->>'id' id,
|
||||
meta->>'numer_id' numer_id,
|
||||
meta->>'numer_aggregate' numer_aggregate,
|
||||
meta->>'numer_colname' numer_colname,
|
||||
meta->>'numer_geomref_colname' numer_geomref_colname,
|
||||
meta->>'numer_tablename' numer_tablename,
|
||||
meta->>'numer_type' numer_type,
|
||||
meta->>'denom_id' denom_id,
|
||||
meta->>'denom_aggregate' denom_aggregate,
|
||||
meta->>'denom_colname' denom_colname,
|
||||
meta->>'denom_geomref_colname' denom_geomref_colname,
|
||||
meta->>'denom_tablename' denom_tablename,
|
||||
meta->>'denom_type' denom_type,
|
||||
meta->>'denom_reltype' denom_reltype,
|
||||
meta->>'geom_id' geom_id,
|
||||
meta->>'geom_colname' geom_colname,
|
||||
meta->>'geom_geomref_colname' geom_geomref_colname,
|
||||
meta->>'geom_tablename' geom_tablename,
|
||||
meta->>'geom_type' geom_type,
|
||||
meta->>'numer_timespan' numer_timespan,
|
||||
meta->>'geom_timespan' geom_timespan,
|
||||
meta->>'normalization' normalization,
|
||||
meta->>'api_method' api_method,
|
||||
meta->'api_args' api_args
|
||||
FROM UNNEST($1) AS meta
|
||||
)
|
||||
/* Read metadata to generate clauses for query */
|
||||
EXECUTE $query$
|
||||
WITH _meta AS (SELECT
|
||||
row_number() over () colid, *
|
||||
FROM json_to_recordset($1)
|
||||
AS x(id TEXT, numer_id TEXT, numer_aggregate TEXT, numer_colname TEXT,
|
||||
numer_geomref_colname TEXT, numer_tablename TEXT, numer_type TEXT,
|
||||
denom_id TEXT, denom_aggregate TEXT, denom_colname TEXT,
|
||||
denom_geomref_colname TEXT, denom_tablename TEXT, denom_type TEXT,
|
||||
denom_reltype TEXT, geom_id TEXT, geom_colname TEXT,
|
||||
geom_geomref_colname TEXT, geom_tablename TEXT, geom_type TEXT,
|
||||
numer_timespan TEXT, geom_timespan TEXT, normalization TEXT,
|
||||
api_method TEXT, api_args JSON)
|
||||
),
|
||||
|
||||
-- Generate procgeom clauses.
|
||||
-- These join the users' geoms to the relevant geometries for the
|
||||
-- asked-for measures in the Observatory.
|
||||
_procgeom_clauses AS (
|
||||
SELECT
|
||||
String_Agg(DISTINCT
|
||||
CASE
|
||||
-- pass-through geom if user is requesting it only
|
||||
WHEN numer_id IS NULL AND api_method IS NULL THEN
|
||||
geom_tablename || '.' || geom_colname || ' AS geom_' || geom_tablename
|
||||
WHEN cdb_observatory.isnumeric(numer_type) AND api_method IS NULL THEN
|
||||
-- for numeric points with area normalization, include areas of underlying geoms
|
||||
CASE
|
||||
WHEN $2 = 'ST_Point' AND (LOWER(normalization) LIKE 'area%' OR
|
||||
(normalization IS NULL AND numer_aggregate ILIKE 'sum')) THEN
|
||||
' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
|
||||
' AS area_' || geom_tablename
|
||||
-- for numeric areas, include more complex calcs
|
||||
WHEN $2 != 'ST_Point' THEN
|
||||
'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
|
||||
' THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)' ||
|
||||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
|
||||
' THEN 1 ' ||
|
||||
' ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' ||
|
||||
geom_tablename || '.' || geom_colname || ')) / ' ||
|
||||
'Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0) ' ||
|
||||
'END pct_' || geom_tablename
|
||||
ELSE NULL
|
||||
END
|
||||
ELSE NULL END
|
||||
, ', ') AS geom_colspecs,
|
||||
String_Agg(DISTINCT 'observatory.' || geom_tablename, ', ') AS geom_tables,
|
||||
String_Agg(
|
||||
'JSON_Build_Object(' || CASE
|
||||
-- api-delivered values
|
||||
WHEN api_method IS NOT NULL THEN
|
||||
'''value'', ' ||
|
||||
'ARRAY_AGG( ' ||
|
||||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
|
||||
-- numeric internal values
|
||||
WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
'''value'', ' || CASE
|
||||
-- denominated
|
||||
WHEN LOWER(normalization) LIKE 'denom%' OR
|
||||
(normalization IS NULL AND LOWER(denom_reltype) LIKE 'denominator')
|
||||
THEN CASE
|
||||
-- denominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
|
||||
-- denominated polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
|
||||
ELSE
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
|
||||
' * pct_' || geom_tablename || '), 0) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- areaNormalized
|
||||
WHEN LOWER(normalization) LIKE 'area%' OR
|
||||
(normalization IS NULL AND numer_aggregate ILIKE 'sum')
|
||||
THEN CASE
|
||||
-- areaNormalized point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / area_' || geom_tablename || ')'
|
||||
-- areaNormalized polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / area of big geom
|
||||
ELSE
|
||||
--' NULL END '
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- median/average measures with universe
|
||||
WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
|
||||
denom_reltype ILIKE 'universe' AND
|
||||
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation weighted by universe
|
||||
-- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
|
||||
-- (10 * 1000 * 1) / (1000 * 1) = 10
|
||||
-- (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 1010 = ~10.4
|
||||
' SUM(' || numer_tablename || '.' || numer_colname ||
|
||||
' * ' || denom_tablename || '.' || denom_colname ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
|
||||
' * pct_' || geom_tablename || '), 0) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- prenormalized for summable measures. point or summable only!
|
||||
WHEN numer_aggregate ILIKE 'sum' AND
|
||||
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation
|
||||
-- SUM (numer * (% user geom in OBS geom))
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- Everything else. Point only!
|
||||
ELSE CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
|
||||
numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
|
||||
END
|
||||
END || '::' || numer_type
|
||||
|
||||
-- categorical/text
|
||||
WHEN LOWER(numer_type) LIKE 'text' THEN
|
||||
'''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
|
||||
|
||||
-- geometry
|
||||
WHEN numer_id IS NULL THEN
|
||||
'''geomref'', geomref_' || geom_tablename || ', ' ||
|
||||
'''value'', ' || 'cdb_observatory.FIRST(geom_' || geom_tablename ||
|
||||
')::TEXT'
|
||||
-- code below will return the intersection of the user's geom and the
|
||||
-- OBS geom
|
||||
--'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
|
||||
-- '.' || geom_colname || '))::TEXT'
|
||||
ELSE ''
|
||||
END || ')', ', ')
|
||||
AS colspecs,
|
||||
|
||||
-- geomrefs, used to separate out rows in case we don't want to merge
|
||||
-- results by user input IDs
|
||||
--
|
||||
-- api_method and geom_tablename are interchangeable since when an
|
||||
-- api_method is passed, geom_tablename is ignored
|
||||
String_Agg(DISTINCT COALESCE(geom_tablename, api_method) || '.' || geom_geomref_colname ||
|
||||
' AS geomref_' || COALESCE(geom_tablename, api_method), ', ') AS geomrefs_alias,
|
||||
|
||||
String_Agg(DISTINCT 'geomref_' || COALESCE(geom_tablename, api_method)
|
||||
, ', ') AS geomrefs_noalias,
|
||||
|
||||
(SELECT String_Agg(DISTINCT CASE
|
||||
-- External API
|
||||
WHEN tablename LIKE 'cdb_observatory.%' THEN
|
||||
'LATERAL (SELECT * FROM ' || tablename || ') ' ||
|
||||
REPLACE(split_part(tablename, '(', 1), 'cdb_observatory.', '')
|
||||
-- Internal obs_ table
|
||||
ELSE 'observatory.' || tablename
|
||||
END, ', ') FROM (
|
||||
SELECT DISTINCT UNNEST(tablenames_ary) tablename FROM (
|
||||
SELECT ARRAY_AGG(numer_tablename) ||
|
||||
ARRAY_AGG(denom_tablename) ||
|
||||
ARRAY_AGG('cdb_observatory.' || api_method || '(_procgeoms.geom' || COALESCE(', ' ||
|
||||
(SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
|
||||
FROM (SELECT json_array_elements(api_args) as val) as vals),
|
||||
'') || ')')
|
||||
tablenames_ary
|
||||
) tablenames_inner
|
||||
) tablenames_outer) data_tables,
|
||||
|
||||
String_Agg(DISTINCT array_to_string(ARRAY[
|
||||
CASE WHEN numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL
|
||||
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
|
||||
'_procgeoms.geomref_' || geom_tablename
|
||||
ELSE NULL END,
|
||||
CASE WHEN numer_tablename != denom_tablename
|
||||
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
|
||||
denom_tablename || '.' || denom_geomref_colname
|
||||
ELSE NULL END
|
||||
], ' AND '),
|
||||
' AND ') FILTER (WHERE numer_tablename != denom_tablename OR
|
||||
(numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL)) AS obs_wheres,
|
||||
|
||||
String_Agg(DISTINCT 'ST_Intersects(' || geom_tablename || '.' || geom_colname
|
||||
|| ', _geoms.geom)', ' AND ')
|
||||
AS user_wheres
|
||||
'_procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (' ||
|
||||
CASE WHEN api_method IS NULL THEN
|
||||
'SELECT _geoms.id, ' ||
|
||||
CASE $3 WHEN True THEN '_geoms.geom'
|
||||
ELSE geom_tablename || '.' || geom_colname
|
||||
END || ' AS geom, ' ||
|
||||
geom_tablename || '.' || geom_geomref_colname || ' AS geomref, ' ||
|
||||
CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
|
||||
' AS area'
|
||||
-- for numeric areas, include more complex calcs
|
||||
ELSE
|
||||
'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')
|
||||
THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
|
||||
WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom)
|
||||
THEN 1
|
||||
ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) /
|
||||
Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
|
||||
END pct_obs'
|
||||
END || '
|
||||
FROM _geoms, observatory.' || geom_tablename || '
|
||||
WHERE ST_Intersects(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')'
|
||||
-- pass through input geometries for api_method
|
||||
ELSE 'SELECT _geoms.id, _geoms.geom FROM _geoms'
|
||||
END ||
|
||||
') '
|
||||
AS procgeom_clause
|
||||
FROM _meta
|
||||
;
|
||||
$query$
|
||||
INTO geom_colspecs, geom_tables, data_colspecs, geomrefs_alias,
|
||||
geomrefs_noalias, data_tables, obs_wheres, user_wheres
|
||||
USING (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]), geomtype;
|
||||
GROUP BY api_method, geom_tablename, geom_geomref_colname, geom_colname
|
||||
),
|
||||
|
||||
-- Generate val clauses.
|
||||
-- These perform interpolations or other necessary calculations to
|
||||
-- provide values according to users' geometries.
|
||||
_val_clauses AS (
|
||||
SELECT
|
||||
'_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (
|
||||
SELECT _procgeoms.id, ' ||
|
||||
String_Agg('json_build_object(' || CASE
|
||||
-- api-delivered values
|
||||
WHEN api_method IS NOT NULL THEN
|
||||
'''value'', ' ||
|
||||
'ARRAY_AGG( ' ||
|
||||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
|
||||
-- numeric internal values
|
||||
WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
'''value'', ' || CASE
|
||||
-- denominated
|
||||
WHEN LOWER(normalization) LIKE 'denom%'
|
||||
THEN CASE
|
||||
WHEN denom_tablename IS NULL THEN ' NULL '
|
||||
-- denominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
|
||||
-- denominated polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
|
||||
ELSE
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs ' ||
|
||||
' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs), 0) '
|
||||
END
|
||||
-- areaNormalized
|
||||
WHEN LOWER(normalization) LIKE 'area%'
|
||||
THEN CASE
|
||||
-- areaNormalized point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / _procgeoms.area)'
|
||||
-- areaNormalized polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / area of big geom
|
||||
ELSE
|
||||
--' NULL END '
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs' ||
|
||||
' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) '
|
||||
END
|
||||
-- median/average measures with universe
|
||||
WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
|
||||
denom_reltype ILIKE 'universe' AND LOWER(normalization) LIKE 'pre%'
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation weighted by universe
|
||||
-- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
|
||||
-- (10 * 1000 * 1) / (1000 * 1) = 10
|
||||
-- (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 1010 = ~10.4
|
||||
' SUM(' || numer_tablename || '.' || numer_colname ||
|
||||
' * ' || denom_tablename || '.' || denom_colname ||
|
||||
' * _procgeoms.pct_obs ' ||
|
||||
' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
|
||||
' * _procgeoms.pct_obs ' || '), 0) '
|
||||
END
|
||||
-- prenormalized for summable measures. point or summable only!
|
||||
WHEN numer_aggregate ILIKE 'sum' AND LOWER(normalization) LIKE 'pre%'
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation
|
||||
-- SUM (numer * (% user geom in OBS geom))
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs) '
|
||||
END
|
||||
-- Everything else. Point only!
|
||||
ELSE CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
|
||||
numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
|
||||
END
|
||||
END || '::' || numer_type
|
||||
|
||||
-- categorical/text
|
||||
WHEN LOWER(numer_type) LIKE 'text' THEN
|
||||
'''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
|
||||
-- geometry
|
||||
WHEN numer_id IS NULL THEN
|
||||
'''geomref'', _procgeoms.geomref, ' ||
|
||||
'''value'', ' || 'cdb_observatory.FIRST(_procgeoms.geom)::TEXT'
|
||||
-- code below will return the intersection of the user's geom and the
|
||||
-- OBS geom
|
||||
--'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
|
||||
-- '.' || geom_colname || '))::TEXT'
|
||||
ELSE ''
|
||||
END
|
||||
|| ') val_' || colid, ', ')
|
||||
|| '
|
||||
FROM _procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' _procgeoms ' ||
|
||||
Coalesce(String_Agg(DISTINCT
|
||||
Coalesce('LEFT JOIN observatory.' || numer_tablename || ' ON _procgeoms.geomref = observatory.' || numer_tablename || '.' || numer_geomref_colname,
|
||||
', LATERAL (SELECT * FROM cdb_observatory.' || api_method || '(_procgeoms.geom' || Coalesce(', ' ||
|
||||
(SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
|
||||
FROM (SELECT JSON_Array_Elements(api_args) as val) as vals),
|
||||
'') || ')) AS ' || api_method)
|
||||
, ' '), '') ||
|
||||
CASE $3 WHEN True THEN E'\n GROUP BY _procgeoms.id ORDER BY _procgeoms.id '
|
||||
ELSE E'\n GROUP BY _procgeoms.id, _procgeoms.geomref
|
||||
ORDER BY _procgeoms.id, _procgeoms.geomref' END
|
||||
|| ')'
|
||||
AS val_clause,
|
||||
'_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) AS cte_name
|
||||
FROM _meta
|
||||
GROUP BY geom_tablename, geom_geomref_colname, geom_colname, api_method
|
||||
),
|
||||
|
||||
-- Generate clauses necessary to join together val_clauses
|
||||
_val_joins AS (
|
||||
SELECT String_Agg(a.cte_name || '.id = ' || b.cte_name || '.id ', ' AND ') val_joins
|
||||
FROM _val_clauses a, _val_clauses b
|
||||
WHERE a.cte_name != b.cte_name
|
||||
AND a.cte_name < b.cte_name
|
||||
),
|
||||
|
||||
-- Generate JSON clause. This puts together vals from val_clauses
|
||||
_json_clause AS (SELECT
|
||||
'SELECT ' || cdb_observatory.FIRST(cte_name) || '.id::INT,
|
||||
Array_to_JSON(ARRAY[' || (SELECT String_Agg('val_' || colid, ', ') FROM _meta) || '])
|
||||
FROM ' || String_Agg(cte_name, ', ') ||
|
||||
Coalesce(' WHERE ' || val_joins, '')
|
||||
AS json_clause
|
||||
FROM _val_clauses, _val_joins
|
||||
GROUP BY val_joins
|
||||
)
|
||||
|
||||
SELECT (SELECT String_Agg(procgeom_clause, E',\n ') FROM _procgeom_clauses),
|
||||
(SELECT String_Agg(val_clause, E',\n ') FROM _val_clauses),
|
||||
json_clause
|
||||
FROM _json_clause
|
||||
$query$ INTO
|
||||
procgeom_clauses,
|
||||
val_clauses,
|
||||
json_clause
|
||||
USING params, geomtype, merge;
|
||||
|
||||
/* Execute query */
|
||||
RETURN QUERY EXECUTE format($query$
|
||||
WITH _raw_geoms AS (%s),
|
||||
_geoms AS (SELECT id,
|
||||
CASE WHEN (ST_NPoints(geom) > 500)
|
||||
THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.0001)), 3)
|
||||
CASE WHEN (ST_NPoints(geom) > 1000)
|
||||
THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.00001)), 3)
|
||||
ELSE geom END geom
|
||||
FROM _raw_geoms),
|
||||
_procgeoms AS (SELECT _geoms.id, _geoms.geom %s %s
|
||||
FROM _geoms %s
|
||||
%s
|
||||
)
|
||||
SELECT _procgeoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
|
||||
FROM _procgeoms %s
|
||||
%s
|
||||
GROUP BY _procgeoms.id %s
|
||||
ORDER BY _procgeoms.id
|
||||
$query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1 THEN
|
||||
' SELECT $1[1].val as id, $1[1].geom as geom '
|
||||
ELSE
|
||||
' SELECT val as id, geom FROM UNNEST($1) '
|
||||
-- procgeom_clauses
|
||||
%s,
|
||||
|
||||
-- val_clauses
|
||||
%s
|
||||
|
||||
-- json_clause
|
||||
%s
|
||||
$query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1
|
||||
THEN ' SELECT $1[1].val as id, $1[1].geom as geom '
|
||||
ELSE ' SELECT val as id, geom FROM UNNEST($1) '
|
||||
END,
|
||||
', ' || NullIf(geomrefs_alias, ''),
|
||||
', ' || NullIf(geom_colspecs, ''),
|
||||
', ' || NullIf(geom_tables, ''),
|
||||
'WHERE ' || NullIf( user_wheres, ''),
|
||||
data_colspecs, ', ' || NullIf(data_tables, ''),
|
||||
'WHERE ' || NULLIF(obs_wheres, ''),
|
||||
CASE WHEN merge IS False THEN ', ' || geomrefs_noalias ELSE '' END)
|
||||
String_Agg(procgeom_clauses, E',\n '),
|
||||
String_Agg(val_clauses, E',\n '),
|
||||
json_clause)
|
||||
USING geomvals;
|
||||
RETURN;
|
||||
END;
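The universe-weighted interpolation used above for median/average measures can be checked by hand. The snippet below is a minimal sketch in plain PostgreSQL, not part of the diff: the `obs` rows, their values, and the `weighted_value` alias are hypothetical stand-ins for two Observatory geometries that lie fully inside the user polygon (`pct_obs = 1`).

```sql
-- Hypothetical sample rows: (numerator value, universe/denominator, share of OBS geom covered)
WITH obs(numer, denom, pct_obs) AS (
  VALUES (10::NUMERIC, 1000::NUMERIC, 1::NUMERIC),
         (50, 10, 1)
)
-- Same shape as the generated expression:
-- SUM(numer * denom * pct_obs) / NULLIF(SUM(denom * pct_obs), 0)
SELECT SUM(numer * denom * pct_obs) / NULLIF(SUM(denom * pct_obs), 0) AS weighted_value
FROM obs;
-- (10 * 1000 + 50 * 10) / (1000 + 10) = 10500 / 1010, i.e. a little under 10.4
```

The small geometry (universe of 10) barely moves the result away from the large geometry's value of 10, which is the point of weighting by the universe rather than averaging the two values directly.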
|
||||
@@ -1044,3 +1076,46 @@ BEGIN
|
||||
RETURN result;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql STABLE;
|
||||
|
||||
-- obs_metadatavalidation checks the metadata parameters against the geometry
-- type of the data in order to catch invalid combinations
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory.obs_metadatavalidation(
|
||||
geometry_extent geometry(Geometry, 4326),
|
||||
geometry_type text,
|
||||
params JSON,
|
||||
target_geoms INTEGER DEFAULT NULL
|
||||
)
|
||||
RETURNS TABLE(valid boolean, errors text[]) AS $$
|
||||
DECLARE
|
||||
meta json;
|
||||
errors text[];
|
||||
BEGIN
|
||||
errors := (ARRAY[])::TEXT[];
|
||||
IF geometry_type IN ('ST_Polygon', 'ST_MultiPolygon') THEN
|
||||
FOR meta IN EXECUTE 'SELECT json_array_elements(cdb_observatory.OBS_GetMeta($1, $2, 1, 1, $3))' USING geometry_extent, params, target_geoms
|
||||
LOOP
|
||||
IF (meta->>'normalization' = 'denominated' AND meta->>'denom_id' is NULL) THEN
|
||||
errors := array_append(errors, 'Normalized measure should have a numerator and a denominator. Please review the provided options.');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IS NULL) THEN
|
||||
errors := array_append(errors, 'For polygon geometries, aggregation is mandatory. Please review the provided options');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IN ('median', 'average') AND meta->>'denom_id' IS NULL) THEN
|
||||
errors := array_append(errors, 'Median or average aggregation for polygons requires a denominator to provide weights. Please review the provided options');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IN ('median', 'average') AND meta->>'normalization' NOT LIKE 'pre%') THEN
|
||||
errors := array_append(errors, format('Median or average aggregation only supports prenormalized normalization, %s passed. Please review the provided options', meta->>'normalization'));
|
||||
END IF;
|
||||
END LOOP;
|
||||
|
||||
IF CARDINALITY(errors) > 0 THEN
|
||||
RETURN QUERY EXECUTE 'SELECT FALSE, $1' USING errors;
|
||||
ELSE
|
||||
RETURN QUERY SELECT TRUE, ARRAY[]::TEXT[];
|
||||
END IF;
|
||||
ELSE
|
||||
RETURN QUERY SELECT TRUE, ARRAY[]::TEXT[];
|
||||
END IF;
|
||||
RETURN;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql STABLE;
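A call to the validation function defined above might look like the following. This is a hedged usage sketch rather than something taken from the diff: the envelope coordinates, the `numer_id` value, and the target of 500 geometries are illustrative assumptions, and the call needs a database with PostGIS, the `cdb_observatory` extension, and Observatory metadata loaded.

```sql
-- Validate one requested measure against a polygon extent.
-- The extent, numer_id and target_geoms values below are hypothetical.
SELECT valid, errors
FROM cdb_observatory.obs_metadatavalidation(
  ST_MakeEnvelope(-74.05, 40.68, -73.90, 40.88, 4326),  -- geometry_extent
  'ST_Polygon',                                         -- geometry_type
  '[{"numer_id": "us.census.acs.B19013001",
     "normalization": "denominated"}]'::JSON,           -- params
  500                                                   -- target_geoms
);
```

According to the function body, any geometry type other than `ST_Polygon` or `ST_MultiPolygon` returns `TRUE` with an empty `errors` array without inspecting the metadata, so the checks only matter for polygon inputs.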
|
||||
|
||||
@@ -181,6 +181,86 @@ BEGIN
|
||||
END
|
||||
$$ LANGUAGE plpgsql;
|
||||
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetNumerators(
|
||||
bounds GEOMETRY DEFAULT NULL,
|
||||
section_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
subsection_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
other_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
ids TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
name TEXT DEFAULT NULL,
|
||||
denom_id TEXT DEFAULT '',
|
||||
geom_id TEXT DEFAULT '',
|
||||
timespan TEXT DEFAULT ''
|
||||
) RETURNS TABLE (
|
||||
numer_id TEXT,
|
||||
numer_name TEXT,
|
||||
numer_description TEXT,
|
||||
numer_weight NUMERIC,
|
||||
numer_license TEXT,
|
||||
numer_source TEXT,
|
||||
numer_type TEXT,
|
||||
numer_aggregate TEXT,
|
||||
numer_extra JSONB,
|
||||
numer_tags JSONB,
|
||||
valid_denom BOOLEAN,
|
||||
valid_geom BOOLEAN,
|
||||
valid_timespan BOOLEAN
|
||||
) AS $$
|
||||
DECLARE
|
||||
where_clause_elements TEXT[];
|
||||
geom_clause TEXT;
|
||||
where_clause TEXT;
|
||||
BEGIN
|
||||
where_clause_elements := (ARRAY[])::TEXT[];
|
||||
where_clause := '';
|
||||
|
||||
IF bounds IS NOT NULL THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$ST_Intersects(the_geom, '%s'::geometry)$data$, bounds));
|
||||
END IF;
|
||||
IF cardinality(section_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, section_tags));
|
||||
END IF;
|
||||
IF cardinality(subsection_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, subsection_tags));
|
||||
END IF;
|
||||
IF cardinality(other_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, other_tags));
|
||||
END IF;
|
||||
IF cardinality(ids) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_id = ANY('%s'::text[])$data$, ids));
|
||||
END IF;
|
||||
IF name IS NOT NULL AND name != '' THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_name ilike '%%%s%%'$data$, name));
|
||||
END IF;
|
||||
IF cardinality(where_clause_elements) > 0 THEN
|
||||
where_clause := format($clause$WHERE %s$clause$, array_to_string(where_clause_elements, ' AND '));
|
||||
END IF;
|
||||
RAISE DEBUG '%', array_to_string(where_clause_elements, ' AND ');
|
||||
|
||||
RETURN QUERY
|
||||
EXECUTE
|
||||
format($string$
|
||||
SELECT numer_id::TEXT,
|
||||
numer_name::TEXT,
|
||||
numer_description::TEXT,
|
||||
numer_weight::NUMERIC,
|
||||
NULL::TEXT license,
|
||||
NULL::TEXT source,
|
||||
numer_type numer_type,
|
||||
numer_aggregate numer_aggregate,
|
||||
numer_extra::JSONB numer_extra,
|
||||
numer_tags numer_tags,
|
||||
$1 = ANY(denoms) valid_denom,
|
||||
$2 = ANY(geoms) valid_geom,
|
||||
$3 = ANY(timespans) valid_timespan
|
||||
FROM observatory.obs_meta_numer
|
||||
%s
|
||||
$string$, where_clause)
|
||||
USING denom_id, geom_id, timespan;
|
||||
RETURN;
|
||||
END
|
||||
$$ LANGUAGE plpgsql;
|
||||
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableDenominators(
|
||||
bounds GEOMETRY DEFAULT NULL,
|
||||
filter_tags TEXT[] DEFAULT NULL,
|
||||
@@ -252,6 +332,9 @@ CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableGeometries(
|
||||
geom_aggregate TEXT,
|
||||
geom_license TEXT,
|
||||
geom_source TEXT,
|
||||
geom_type TEXT,
|
||||
geom_extra JSONB,
|
||||
geom_tags JSONB,
|
||||
valid_numer BOOLEAN,
|
||||
valid_denom BOOLEAN,
|
||||
valid_timespan BOOLEAN,
|
||||
@@ -286,16 +369,31 @@ BEGIN
|
||||
NULL::TEXT geom_aggregate,
|
||||
NULL::TEXT license,
|
||||
NULL::TEXT source,
|
||||
$1 = ANY(numers) valid_numer,
|
||||
$2 = ANY(denoms) valid_denom,
|
||||
$3 = ANY(timespans) valid_timespan
|
||||
FROM observatory.obs_meta_geom
|
||||
geom_type::TEXT,
|
||||
geom_extra::JSONB,
|
||||
geom_tags::JSONB,
|
||||
$1 = ANY(numers) valid_numer,
|
||||
$2 = ANY(denoms) valid_denom,
|
||||
CASE WHEN $3 IS NOT NULL AND $3 != '' THEN
|
||||
-- Here we look for geometries that either a) have the requested timespan themselves or b) are linked to numerators
|
||||
-- whose valid timespans include it. For example, for '2015 - 2015' it matches geometries with that timespan, or
|
||||
-- geometries linked to a numerator that has '2015 - 2015' as one of its valid timespans.
|
||||
-- If a numerator_id is passed, only that numerator is considered.
|
||||
CASE WHEN $1 IS NOT NULL AND $1 != '' THEN
|
||||
EXISTS (SELECT 1 FROM observatory.obs_meta_geom_numer_timespan onu WHERE o.geom_id = onu.geom_id AND onu.numer_id = $1 AND ($3 = ANY(onu.timespans) OR $3 IN (select(unnest(o.timespans)))))
|
||||
ELSE
|
||||
EXISTS (SELECT 1 FROM observatory.obs_meta_geom_numer_timespan onu WHERE o.geom_id = onu.geom_id AND ($3 = ANY(onu.timespans) OR $3 IN (select(unnest(o.timespans)))))
|
||||
END
|
||||
ELSE
|
||||
false
|
||||
END as valid_timespan
|
||||
FROM observatory.obs_meta_geom o
|
||||
WHERE %s (geom_tags ?& $4 OR CARDINALITY($4) = 0)
|
||||
), scores AS (
|
||||
SELECT * FROM cdb_observatory._OBS_GetGeometryScores($5,
|
||||
(SELECT ARRAY_AGG(geom_id) FROM available_geoms)
|
||||
)
|
||||
) SELECT available_geoms.*, score, numtiles, notnull_percent, numgeoms,
|
||||
) SELECT DISTINCT ON (geom_id) available_geoms.*, score, numtiles, notnull_percent, numgeoms,
|
||||
percentfill, estnumgeoms, meanmediansize
|
||||
FROM available_geoms, scores
|
||||
WHERE available_geoms.geom_id = scores.column_id
|
||||
@@ -319,6 +417,9 @@ CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableTimespans(
|
||||
timespan_aggregate TEXT,
|
||||
timespan_license TEXT,
|
||||
timespan_source TEXT,
|
||||
timespan_type TEXT,
|
||||
timespan_extra JSONB,
|
||||
timespan_tags JSONB,
|
||||
valid_numer BOOLEAN,
|
||||
valid_denom BOOLEAN,
|
||||
valid_geom BOOLEAN
|
||||
@@ -343,8 +444,11 @@ BEGIN
|
||||
timespan_description::TEXT,
|
||||
timespan_weight::NUMERIC,
|
||||
NULL::TEXT timespan_aggregate,
|
||||
NULL::TEXT license,
|
||||
NULL::TEXT source,
|
||||
NULL::TEXT timespan_license,
|
||||
NULL::TEXT timespan_source,
|
||||
NULL::TEXT timespan_type,
|
||||
NULL::JSONB timespan_extra,
|
||||
NULL::JSONB timespan_tags,
|
||||
$1 = ANY(numers) valid_numer,
|
||||
$2 = ANY(denoms) valid_denom,
|
||||
$3 = ANY(geoms) valid_geom_id
|
||||
@@ -418,7 +522,8 @@ $$ LANGUAGE plpgsql;
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetGeometryScores(
|
||||
bounds Geometry(Geometry, 4326) DEFAULT NULL,
|
||||
filter_geom_ids TEXT[] DEFAULT NULL,
|
||||
desired_num_geoms INTEGER DEFAULT NULL
|
||||
desired_num_geoms INTEGER DEFAULT NULL,
|
||||
desired_area NUMERIC DEFAULT NULL
|
||||
) RETURNS TABLE (
|
||||
score NUMERIC,
|
||||
numtiles BIGINT,
|
||||
@@ -430,6 +535,8 @@ CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetGeometryScores(
|
||||
estnumgeoms NUMERIC,
|
||||
meanmediansize NUMERIC
|
||||
) AS $$
|
||||
DECLARE
|
||||
num_geoms_multiplier Numeric;
|
||||
BEGIN
|
||||
IF desired_num_geoms IS NULL THEN
|
||||
desired_num_geoms := 3000;
|
||||
@@ -440,6 +547,18 @@ BEGIN
|
||||
IF ST_Npoints(bounds) > 10000 THEN
|
||||
bounds := ST_Envelope(bounds);
|
||||
END IF;
|
||||
IF desired_area IS NULL THEN
|
||||
desired_area := ST_Area(bounds);
|
||||
END IF;
|
||||
|
||||
-- In case of points, desired_area will be 0. We still want an accurate
|
||||
-- estimate of numgeoms in that case.
|
||||
IF desired_area = 0 THEN
|
||||
num_geoms_multiplier := 1;
|
||||
ELSE
|
||||
num_geoms_multiplier := Coalesce(desired_area / Nullif(ST_Area(bounds), 0), 1);
|
||||
END IF;
|
||||
|
||||
RETURN QUERY
|
||||
EXECUTE $string$
|
||||
WITH clipped_geom AS (
|
||||
@@ -453,13 +572,11 @@ BEGIN
|
||||
), clipped_geom_countagg AS (
|
||||
SELECT column_id, table_id
|
||||
, BOOL_AND(ST_BandIsNoData(clipped_tile, 1)) nodata
|
||||
, ST_CountAgg(clipped_tile, 1, False)::Numeric pixels -- -10
|
||||
FROM clipped_geom
|
||||
GROUP BY column_id, table_id
|
||||
), clipped_geom_reagg AS (
|
||||
SELECT COUNT(*)::BIGINT cnt, a.column_id, a.table_id,
|
||||
cdb_observatory.FIRST(nodata) first_nodata,
|
||||
cdb_observatory.FIRST(pixels) first_pixel,
|
||||
cdb_observatory.FIRST(tile) first_tile,
|
||||
(ST_SummaryStatsAgg(clipped_tile, 1, False)).sum::Numeric sum_geoms, -- ND
|
||||
(ST_SummaryStatsAgg(clipped_tile, 2, False)).mean::Numeric / 255 mean_fill --ND
|
||||
@@ -474,9 +591,8 @@ BEGIN
|
||||
, (CASE WHEN first_nodata IS FALSE
|
||||
THEN sum_geoms
|
||||
ELSE COALESCE(ST_Value(first_tile, 1, ST_PointOnSurface($1)), 0)
|
||||
* (ST_Area($1) / ST_Area(ST_PixelAsPolygon(first_tile, 0, 0))
|
||||
* first_pixel) -- -20
|
||||
END)::Numeric
|
||||
* (ST_Area($1) / ST_Area(ST_PixelAsPolygon(first_tile, 0, 0)))
|
||||
END)::Numeric * $4
|
||||
AS numgeoms
|
||||
, (CASE WHEN first_nodata IS FALSE
|
||||
THEN mean_fill
|
||||
@@ -490,7 +606,7 @@ BEGIN
|
||||
((100.0 / (1+abs(log(0.0001 + $3) - log(0.0001 + numgeoms::Numeric)))) * percentfill)::Numeric
|
||||
AS score, *
|
||||
FROM final
|
||||
$string$ USING bounds, filter_geom_ids, desired_num_geoms;
|
||||
$string$ USING bounds, filter_geom_ids, desired_num_geoms, num_geoms_multiplier;
|
||||
RETURN;
|
||||
END
|
||||
$$ LANGUAGE plpgsql IMMUTABLE;
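The arithmetic this function adds (the area multiplier) and the score expression it feeds can be checked by hand. Below is a rough Python sketch under stated assumptions: the helper names are hypothetical, areas are plain numbers rather than PostGIS geometries, and `percentfill` is treated as an already-computed fraction. PostgreSQL's one-argument `log` is base 10, hence `math.log10`.

```python
import math

# Hypothetical helpers mirroring the expressions in _OBS_GetGeometryScores;
# names and signatures are assumptions for illustration only.


def num_geoms_multiplier(desired_area, bounds_area):
    """Scale factor applied to the estimated number of geometries.

    Mirrors: 1 when desired_area is 0 (points), otherwise
    Coalesce(desired_area / Nullif(bounds_area, 0), 1).
    """
    if not desired_area or not bounds_area:
        # Covers 0 and None for either argument, matching the
        # NULL/zero fallbacks in the SQL.
        return 1.0
    return desired_area / bounds_area


def geometry_score(desired_num_geoms, numgeoms, percentfill):
    """100 / (1 + |log10(desired) - log10(estimated)|) * percentfill."""
    distance = abs(
        math.log10(0.0001 + desired_num_geoms) - math.log10(0.0001 + numgeoms)
    )
    return 100.0 / (1 + distance) * percentfill
```

With these definitions, an estimate equal to the desired count scores `100 * percentfill`, and each order of magnitude of mismatch roughly halves, then thirds, the score. The small `0.0001` offset keeps the logarithm defined when the estimate is zero.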
|
||||
|
||||
@@ -150,6 +150,18 @@ t|t|t|t|t|t|t|t|t|t|t|t|t|t|t
|
||||
obs_getmeta_conflicting_metadata
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name_implicit_area
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name_area
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name_denom
|
||||
t
|
||||
(1 row)
|
||||
obs_getdata_geomval_empty_null
|
||||
t
|
||||
(1 row)
|
||||
@@ -198,6 +210,9 @@ t|t|t
|
||||
id|data_polygon_measure_one_null|data_polygon_measure_two_null
|
||||
t|t|t
|
||||
(1 row)
|
||||
id|data_polygon_measure_one_null|data_polygon_measure_two_null
|
||||
t|t|t
|
||||
(1 row)
|
||||
id|data_polygon_measure_one_predenom|data_polygon_measure_two_predenom
|
||||
t|t|t
|
||||
(1 row)
|
||||
@@ -261,3 +276,40 @@ t|t
|
||||
ary_type|obs_getdata_api_geomrefs_args_string_return
|
||||
t|t
|
||||
(1 row)
|
||||
setseed
|
||||
|
||||
(1 row)
|
||||
bg_sample|bg_max_error|bg_avg_error|bg_min_error
|
||||
1|t|t|t
|
||||
2|t|t|t
|
||||
3|t|t|t
|
||||
5|t|t|t
|
||||
10|t|t|t
|
||||
25|t|t|t
|
||||
50|t|t|t
|
||||
100|t|t|t
|
||||
2085|t|t|t
|
||||
(9 rows)
|
||||
tract_sample|tract_max_error|tract_avg_error|tract_min_error
|
||||
1|t|t|t
|
||||
2|t|t|t
|
||||
3|t|t|t
|
||||
5|t|t|t
|
||||
10|t|t|t
|
||||
25|t|t|t
|
||||
50|t|t|t
|
||||
100|t|t|t
|
||||
761|t|t|t
|
||||
(9 rows)
|
||||
no_bg_point_error
|
||||
t
|
||||
(1 row)
|
||||
valid|errors
|
||||
t|{}
|
||||
(1 row)
|
||||
valid|errors
|
||||
f|{"Median or average aggregation only supports prenormalized normalization, denominated passed. Please review the provided options"}
|
||||
(1 row)
|
||||
valid|errors
|
||||
f|{"Normalizated measure should have a numerator and a denominator. Please review the provided options."}
|
||||
(1 row)
|
||||
|
||||
@@ -48,6 +48,63 @@ t
|
||||
_obs_getavailablenumerators_no_total_pop_1996
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_all
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_nyc_point
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_usa_extents
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_usa_pop_not_in_zero_point
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_age_gender_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_pop_in_income_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_male_pop_denom_by_total_pop
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_income_denom_by_total_pop
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_zillow_at_zcta5
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_zillow_at_block_group
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_2010_2014
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_total_pop_1996
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_name
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_section
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_in_canada
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_in_employment_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_id
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_with_other_id
|
||||
t
|
||||
(1 row)
|
||||
_obs_getavailabledenominators_usa_pop_in_all
|
||||
t
|
||||
(1 row)
|
||||
@@ -120,6 +177,9 @@ t
|
||||
_obs_getavailablegeometries_bg_not_1996
|
||||
t
|
||||
(1 row)
|
||||
_obs_getavailablegeometries_has_boundary_tag
|
||||
t
|
||||
(1 row)
|
||||
_obs_getavailabletimespans_2010_2014_in_all
|
||||
t
|
||||
(1 row)
|
||||
@@ -159,21 +219,36 @@ t
|
||||
_obs_geometryscores_2500km_buffer
|
||||
t
|
||||
(1 row)
|
||||
_obs_geometryscores_numgeoms_500m_buffer
|
||||
t
|
||||
(1 row)
|
||||
_obs_geometryscores_numgeoms_5km_buffer
|
||||
t
|
||||
(1 row)
|
||||
_obs_geometryscores_numgeoms_50km_buffer
|
||||
t
|
||||
(1 row)
|
||||
_obs_geometryscores_numgeoms_500km_buffer
|
||||
t
|
||||
(1 row)
|
||||
_obs_geometryscores_numgeoms_2500km_buffer
|
||||
t
|
||||
(1 row)
|
||||
column_id|_obs_geometryscores_numgeoms_500m_buffer
|
||||
us.census.tiger.block_group|2
|
||||
us.census.tiger.census_tract|1
|
||||
us.census.tiger.zcta5|0
|
||||
us.census.tiger.county|0
|
||||
(4 rows)
|
||||
column_id|_obs_geometryscores_numgeoms_5km_buffer
|
||||
us.census.tiger.block_group|244
|
||||
us.census.tiger.census_tract|78
|
||||
us.census.tiger.zcta5|9
|
||||
us.census.tiger.county|0
|
||||
(4 rows)
|
||||
column_id|_obs_geometryscores_numgeoms_50km_buffer
|
||||
us.census.tiger.block_group|10817
|
||||
us.census.tiger.census_tract|3396
|
||||
us.census.tiger.zcta5|484
|
||||
us.census.tiger.county|11
|
||||
(4 rows)
|
||||
column_id|_obs_geometryscores_numgeoms_500km_buffer
|
||||
us.census.tiger.block_group|48567
|
||||
us.census.tiger.census_tract|15823
|
||||
us.census.tiger.zcta5|6466
|
||||
us.census.tiger.county|295
|
||||
(4 rows)
|
||||
column_id|_obs_geometryscores_numgeoms_2500km_buffer
|
||||
us.census.tiger.block_group|165852
|
||||
us.census.tiger.census_tract|55283
|
||||
us.census.tiger.zcta5|27046
|
||||
us.census.tiger.county|2551
|
||||
(4 rows)
|
||||
_obs_geometryscores_500km_buffer_50_geoms
|
||||
t
|
||||
(1 row)
|
||||
@@ -186,6 +261,12 @@ t
|
||||
_obs_geometryscores_500km_buffer_25000_geoms
|
||||
t
|
||||
(1 row)
|
||||
testarea_uses_tract
|
||||
t
|
||||
(1 row)
|
||||
points_use_bg
|
||||
t
|
||||
(1 row)
|
||||
_total_pop_in_legacy_builder_metadata
|
||||
t
|
||||
(1 row)
|
||||
|
||||
1
src/pg/test/fixtures/drop_fixtures.sql
vendored
@@ -12,6 +12,7 @@ DROP TABLE IF EXISTS observatory.obs_meta_numer;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_denom;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_geom;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_timespan;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_geom_numer_timespan;
|
||||
DROP TABLE IF EXISTS observatory.obs_column_table_tile;
|
||||
DROP TABLE IF EXISTS observatory.obs_column_table_tile_simple;
|
||||
DROP TABLE IF EXISTS observatory.obs_78fb6c1d6ff6505225175922c2c389ce48d7632c;
|
||||
|
||||
177463
src/pg/test/fixtures/load_fixtures.sql
vendored
File diff suppressed because one or more lines are too long
@@ -268,7 +268,7 @@ SELECT
|
||||
(meta->0->>'numer_name') = 'Total Population' numer_name,
|
||||
(meta->0->>'denom_id') IS NULL denom_id,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'area' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes one partial measure with "best" metadata
|
||||
@@ -290,7 +290,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for polygon completes one partial measure with "best" metadata
|
||||
@@ -308,7 +308,7 @@ SELECT
|
||||
(meta->0->>'numer_name') = 'Total Population' numer_name,
|
||||
(meta->0->>'denom_id') IS NULL denom_id,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'area' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for polygon completes one partial measure with "best" metadata
|
||||
@@ -330,13 +330,13 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with "best"
|
||||
-- metadata, includes geom alternatives if asked
|
||||
WITH meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01001002"}]', null, 2) meta)
|
||||
'[{"numer_id": "us.census.acs.B01001002", "max_score_rank": 2}]', null, 2) meta)
|
||||
SELECT
|
||||
(meta->0->>'id')::integer = 1 id,
|
||||
(meta->0->>'numer_id') = 'us.census.acs.B01001002' numer_id,
|
||||
@@ -352,7 +352,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization,
|
||||
(meta->0->>'normalization') = 'denominated' normalization,
|
||||
(meta->1->>'id')::integer = 1 id,
|
||||
(meta->1->>'numer_id') = 'us.census.acs.B01001002' numer_id,
|
||||
(meta->1->>'timespan_rank')::integer = 1 timespan_rank,
|
||||
@@ -367,7 +367,7 @@ SELECT
|
||||
(meta->1->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->1->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->1->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
|
||||
(meta->1->>'normalization') IS NULL normalization
|
||||
(meta->1->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with "best" metadata
|
||||
@@ -389,7 +389,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with conflicting
|
||||
@@ -398,6 +398,26 @@ SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01001002", "denom_id": "us.census.acs.B01001002", "geom_id": "us.census.tiger.census_tract"}]') IS NULL
|
||||
AS obs_getmeta_conflicting_metadata;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom"}]'
|
||||
)->0->>'suggested_name' = 'total_pop_2010_2014' obs_getmeta_suggested_name;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request with implicit area norm
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001"}]'
|
||||
)->0->>'suggested_name' = 'total_pop_per_sq_km_2010_2014' obs_getmeta_suggested_name_implicit_area;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request with area norm
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001", "normalization": "area"}]'
|
||||
)->0->>'suggested_name' = 'total_pop_per_sq_km_2010_2014' obs_getmeta_suggested_name_area;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request with denom
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01001002", "normalization": "denom"}]'
|
||||
)->0->>'suggested_name' = 'male_pop_rate_2010_2014' obs_getmeta_suggested_name_denom;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by id with empty list/null
|
||||
WITH data AS (SELECT * FROM cdb_observatory.OBS_GetData(ARRAY[]::TEXT[], null))
|
||||
SELECT ARRAY_AGG(data) IS NULL AS obs_getdata_geomval_empty_null FROM data;
|
||||
@@ -576,6 +596,18 @@ SELECT id = 1 id,
|
||||
abs((data->1->>'value')::Numeric - 0.4902) / 0.4902 < 0.001 data_polygon_measure_two_null
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with two measures and one return null
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"numer_id": "us.census.acs.B19013001_quantile"}, {"numer_id": "us.census.acs.B01001002"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta)))
|
||||
SELECT id = 1 id,
|
||||
(data->0->>'value') is NULL data_polygon_measure_one_null,
|
||||
abs((data->1->>'value')::Numeric - 0.4902) / 0.4902 < 0.001 data_polygon_measure_two_null
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with two standard measures predenom normalization
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
@@ -662,25 +694,25 @@ FROM data;
|
||||
-- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta), false))
|
||||
SELECT every(id = 1) is TRUE id,
|
||||
count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
|
||||
abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop
|
||||
abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure + one text
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta), false))
|
||||
SELECT every(id = 1) is TRUE id,
|
||||
count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
|
||||
abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop,
|
||||
abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop,
|
||||
array_agg(distinct data->2->>'value') = '{"Block Group 1","Block Group 2","Block Group 3","Block Group 4","Block Group 5"}' correct_bg_names
|
||||
FROM data;
|
||||
|
||||
@@ -798,3 +830,152 @@ SELECT json_typeof(data->0->'value') = 'array' ary_type,
|
||||
AS OBS_GetData_API_geomrefs_args_string_return
|
||||
FROM cdb_observatory.obs_getdata(array['36047'],
|
||||
'[{"numer_type": "text", "numer_colname": "obs_getboundarybyid", "api_method": "obs_getboundarybyid", "api_args": ["us.census.tiger.county"]}]');
|
||||
|
||||
-- Ensure consistent results below.
|
||||
select setseed(0);
|
||||
|
||||
-- Check that a random assortment of block groups in Brooklyn returns accurate data
|
||||
WITH _geoms AS (
|
||||
SELECT
|
||||
(data->0->>'value')::geometry the_geom,
|
||||
data->0->>'geomref' geom_ref,
|
||||
(data->1->>'value')::numeric total_pop
|
||||
FROM cdb_observatory.OBS_GetData(
|
||||
array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
|
||||
(SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
|
||||
'[{"geom_id": "us.census.tiger.block_group"},
|
||||
{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group", "normalization": "predenom"}]')),
|
||||
FALSE
|
||||
)
|
||||
WHERE data->0->>'geomref' LIKE '36047%'
|
||||
ORDER BY RANDOM()
|
||||
), geoms AS (
|
||||
SELECT *, row_number() OVER () cartodb_id FROM _geoms
|
||||
), samples AS (
|
||||
SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
|
||||
), filtered AS (
|
||||
SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
|
||||
), summary AS (
|
||||
SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
|
||||
COUNT(*)::INT cnt,
|
||||
ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
|
||||
SUM(ST_Area(the_geom))::Numeric sumarea
|
||||
FROM filtered
|
||||
GROUP BY sample
|
||||
), meta AS (
|
||||
SELECT sample, cdb_observatory.OBS_GetMeta(extent,
|
||||
('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
|
||||
1, 1, cnt) meta
|
||||
FROM summary
|
||||
GROUP BY sample, extent, cnt, sumarea
|
||||
), results AS (
|
||||
SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
|
||||
FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
|
||||
WHERE summary.sample = meta.sample
|
||||
) SELECT sample bg_sample
|
||||
, MAX(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_max_error
|
||||
, AVG(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_avg_error
|
||||
, MIN(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 bg_min_error
|
||||
FROM geoms, results
|
||||
WHERE cartodb_id = id
|
||||
GROUP BY sample
|
||||
ORDER BY sample
|
||||
;
|
||||
|
||||
-- Check that a random assortment of tracts in Brooklyn returns accurate data
|
||||
WITH _geoms AS (
|
||||
SELECT
|
||||
(data->0->>'value')::geometry the_geom,
|
||||
data->0->>'geomref' geom_ref,
|
||||
(data->1->>'value')::numeric total_pop
|
||||
FROM cdb_observatory.OBS_GetData(
|
||||
array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
|
||||
(SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
|
||||
'[{"geom_id": "us.census.tiger.census_tract"},
|
||||
{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.census_tract", "normalization": "predenom"}]')),
|
||||
FALSE
|
||||
)
|
||||
WHERE data->0->>'geomref' LIKE '36047%'
|
||||
ORDER BY RANDOM()
|
||||
), geoms AS (
|
||||
SELECT *, row_number() OVER () cartodb_id FROM _geoms
|
||||
), samples AS (
|
||||
SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
|
||||
), filtered AS (
|
||||
SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
|
||||
), summary AS (
|
||||
SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
|
||||
COUNT(*)::INT cnt,
|
||||
ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
|
||||
SUM(ST_Area(the_geom))::Numeric sumarea
|
||||
FROM filtered
|
||||
GROUP BY sample
|
||||
), meta AS (
|
||||
SELECT sample, cdb_observatory.OBS_GetMeta(extent,
|
||||
('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
|
||||
1, 1, cnt) meta
|
||||
FROM summary
|
||||
GROUP BY sample, extent, cnt, sumarea
|
||||
), results AS (
|
||||
SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
|
||||
FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
|
||||
WHERE summary.sample = meta.sample
|
||||
) SELECT sample tract_sample
|
||||
, MAX(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_max_error
|
||||
, AVG(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_avg_error
|
||||
, MIN(100 * abs((geoms.total_pop - val) / Coalesce(NullIf(total_pop, 0), NULL)))::Numeric(10, 2) < 10 tract_min_error
|
||||
FROM geoms, results
|
||||
WHERE cartodb_id = id
|
||||
GROUP BY sample
|
||||
ORDER BY sample
|
||||
;
|
||||
|
||||
-- Check that a random assortment of block group points in Brooklyn returns accurate data
|
||||
WITH _geoms AS (
|
||||
SELECT
|
||||
ST_PointOnSurface((data->0->>'value')::geometry) the_geom,
|
||||
data->0->>'geomref' geom_ref,
|
||||
(data->1->>'value')::numeric total_pop
|
||||
FROM cdb_observatory.OBS_GetData(
|
||||
array[(st_buffer(cdb_observatory._testpoint(), 0.2), 1)::geomval],
|
||||
(SELECT cdb_observatory.OBS_GetMeta(ST_MakeEnvelope(-179, 89, 179, -89, 4326),
|
||||
'[{"geom_id": "us.census.tiger.block_group"},
|
||||
{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group", "normalization": "predenom"}]')),
|
||||
FALSE
|
||||
)
|
||||
WHERE data->0->>'geomref' LIKE '36047%'
|
||||
), geoms AS (
|
||||
SELECT *, row_number() OVER () cartodb_id FROM _geoms
|
||||
), samples AS (
|
||||
SELECT COUNT(*) cnt, unnest(ARRAY[1, 2, 3, 5, 10, 25, 50, 100, COUNT(*)]) sample FROM geoms
|
||||
), filtered AS (
|
||||
SELECT * FROM geoms, samples WHERE cartodb_id % (cnt / sample) = 0
|
||||
), summary AS (
|
||||
SELECT sample, ST_SetSRID(ST_Extent(the_geom), 4326) extent,
|
||||
COUNT(*)::INT cnt,
|
||||
ARRAY_AGG((the_geom, cartodb_id)::geomval) geomvals,
|
||||
SUM(ST_Area(the_geom))::Numeric sumarea
|
||||
FROM filtered
|
||||
GROUP BY sample
|
||||
), meta AS (
|
||||
SELECT sample, cdb_observatory.OBS_GetMeta(extent,
|
||||
('[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "target_area": ' || sumarea || '}]')::JSON,
|
||||
1, 1, cnt) meta
|
||||
FROM summary
|
||||
GROUP BY sample, extent, cnt, sumarea
|
||||
), results AS (
|
||||
SELECT summary.sample, id, meta->0->>'geom_id' geom_id, (data->0->>'value')::Numeric as val
|
||||
FROM summary, meta, LATERAL cdb_observatory.OBS_GetData(geomvals, meta) data
|
||||
WHERE summary.sample = meta.sample
|
||||
) SELECT
|
||||
BOOL_AND(abs((geoms.total_pop - val) /
|
||||
Coalesce(NullIf(total_pop, 0), 1)) = 0) is True no_bg_point_error
|
||||
FROM geoms, results
|
||||
WHERE cartodb_id = id
|
||||
;
|
||||
|
||||
-- OBS_MetadataValidation
|
||||
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B01003001","denom_id": null,"normalization": "prenormalized","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B25058001","denom_id": null,"normalization": "denominated","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B15003001","denom_id": null,"normalization": "denominated","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
|
||||
@@ -119,6 +119,142 @@ FROM cdb_observatory.OBS_GetAvailableNumerators(
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getavailablenumerators_no_total_pop_1996;
|
||||
|
||||
--
|
||||
-- _OBS_GetNumerators tests
|
||||
--
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators())
|
||||
AS _obs_getnumerators_usa_pop_in_all;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_usa_pop_in_nyc_point;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakeEnvelope(
|
||||
-169.8046875, 21.289374355860424,
|
||||
-47.4609375, 72.0739114882038
|
||||
), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_usa_pop_in_usa_extents;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(0, 0), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_no_usa_pop_not_in_zero_point;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
subsection_tags => ARRAY['subsection/tags.age_gender']
|
||||
))
|
||||
AS _obs_getnumerators_usa_pop_in_age_gender_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
subsection_tags => ARRAY['subsection/tags.income']
|
||||
))
|
||||
AS _obs_getnumerators_no_pop_in_income_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01001002' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
denom_id => 'us.census.acs.B01003001'
|
||||
) WHERE valid_denom = True)
|
||||
AS _obs_getnumerators_male_pop_denom_by_total_pop;
|
||||
|
||||
SELECT 'us.census.acs.B19013001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
denom_id => 'us.census.acs.B01003001'
|
||||
) WHERE valid_denom = True)
|
||||
AS _obs_getnumerators_no_income_denom_by_total_pop;
|
||||
|
||||
SELECT 'us.zillow.AllHomes_Zhvi' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
geom_id => 'us.census.tiger.zcta5'
|
||||
) WHERE valid_geom = True)
|
||||
AS _obs_getnumerators_zillow_at_zcta5;
|
||||
|
||||
SELECT 'us.zillow.AllHomes_Zhvi' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
geom_id => 'us.census.tiger.block_group'
|
||||
) WHERE valid_geom = True)
|
||||
AS _obs_getnumerators_no_zillow_at_block_group;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
timespan => '2010 - 2014'
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getnumerators_total_pop_2010_2014;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
timespan => '1996'
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getnumerators_no_total_pop_1996;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
name => 'tot'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_name;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_section;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.ca}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_in_canada;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}',
|
||||
subsection_tags => '{subsection/tags.age_gender}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}',
|
||||
subsection_tags => '{subsection/tags.employment}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_in_employment_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
ids => '{us.census.acs.B01003001}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_id;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
ids => '{us.census.acs.B01003002}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_with_other_id;
|
||||
|
||||
--
|
||||
-- OBS_GetAvailableDenominators tests
|
||||
--
|
||||
@@ -289,6 +425,11 @@ FROM cdb_observatory.OBS_GetAvailableGeometries(
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getavailablegeometries_bg_not_1996;
|
||||
|
||||
SELECT 'subsection/tags.boundary' IN (SELECT (Jsonb_Each(geom_tags)).key
|
||||
FROM cdb_observatory.OBS_GetAvailableGeometries(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)
|
||||
)) AS _obs_getavailablegeometries_has_boundary_tag;
|
||||
|
||||
--
|
||||
-- OBS_GetAvailableTimespans tests
|
||||
--
|
||||
@@ -360,9 +501,9 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
'us.census.tiger.county', 'us.census.tiger.zcta5'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county']
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC)
|
||||
= ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.county', 'us.census.tiger.zcta5']
|
||||
AS _obs_geometryscores_5km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 5000)::Geometry(Geometry, 4326),
|
||||
@@ -390,60 +531,55 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
ARRAY['us.census.tiger.county', 'us.census.tiger.zcta5',
|
||||
'us.census.tiger.census_tract', 'us.census.tiger.block_group']
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC)
|
||||
= ARRAY['us.census.tiger.county', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.block_group']
|
||||
AS _obs_geometryscores_2500km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 2500000)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
ARRAY['us.census.tiger.county', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.block_group'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
|
||||
SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text
|
||||
= '{ "us.census.tiger.block_group" : 9, "us.census.tiger.census_tract" : 3, "us.census.tiger.zcta5" : 0, "us.census.tiger.county" : 0 }'
|
||||
AS _obs_geometryscores_numgeoms_500m_buffer
|
||||
SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_500m_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 500)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
WHERE table_id LIKE '%2015%'
|
||||
ORDER BY numgeoms DESC;
|
||||
|
||||
SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
|
||||
'{ "us.census.tiger.block_group" : 880, "us.census.tiger.census_tract" : 310, "us.census.tiger.zcta5" : 45, "us.census.tiger.county" : 1 }'
|
||||
AS _obs_geometryscores_numgeoms_5km_buffer
|
||||
SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_5km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 5000)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
WHERE table_id LIKE '%2015%'
|
||||
ORDER BY numgeoms DESC;
|
||||
|
||||
SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
|
||||
'{ "us.census.tiger.block_group" : 11531, "us.census.tiger.census_tract" : 3601, "us.census.tiger.zcta5" : 550, "us.census.tiger.county" : 14 }'
|
||||
AS _obs_geometryscores_numgeoms_50km_buffer
|
||||
SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_50km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 50000)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
WHERE table_id LIKE '%2015%'
|
||||
ORDER BY numgeoms DESC;
|
||||
|
||||
SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
|
||||
'{ "us.census.tiger.block_group" : 48917, "us.census.tiger.census_tract" : 15969, "us.census.tiger.zcta5" : 6534, "us.census.tiger.county" : 314 }'
|
||||
AS _obs_geometryscores_numgeoms_500km_buffer
|
||||
SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_500km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 500000)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
WHERE table_id LIKE '%2015%'
|
||||
ORDER BY numgeoms DESC;
|
||||
|
||||
SELECT JSON_Object_Agg(column_id, numgeoms::int ORDER BY numgeoms DESC)::Text =
|
||||
'{ "us.census.tiger.block_group" : 169191, "us.census.tiger.census_tract" : 56469, "us.census.tiger.zcta5" : 26525, "us.census.tiger.county" : 2753 }'
|
||||
AS _obs_geometryscores_numgeoms_2500km_buffer
|
||||
SELECT column_id, numgeoms::int AS _obs_geometryscores_numgeoms_2500km_buffer
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 2500000)::Geometry(Geometry, 4326),
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'])
|
||||
WHERE table_id LIKE '%2015%';
|
||||
WHERE table_id LIKE '%2015%'
|
||||
ORDER BY numgeoms DESC;
|
||||
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
ARRAY['us.census.tiger.county', 'us.census.tiger.zcta5',
|
||||
@@ -475,9 +611,9 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'], 2500)
|
||||
WHERE table_id LIKE '%2015%';
|
||||
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county']
|
||||
SELECT ARRAY_AGG(column_id ORDER BY score DESC)
|
||||
= ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
|
||||
'us.census.tiger.county', 'us.census.tiger.zcta5']
|
||||
AS _obs_geometryscores_500km_buffer_25000_geoms
|
||||
FROM cdb_observatory._OBS_GetGeometryScores(
|
||||
ST_Buffer(ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)::Geography, 50000)::Geometry(Geometry, 4326),
|
||||
@@ -485,6 +621,44 @@ SELECT ARRAY_AGG(column_id ORDER BY score DESC) =
|
||||
'us.census.tiger.zcta5', 'us.census.tiger.county'], 25000)
|
||||
WHERE table_id LIKE '%2015%';
|
||||
|
||||
-- Check that one small geom approximates tract data
WITH geoms AS (SELECT cdb_observatory._testarea() the_geom),
summary AS (SELECT ST_SetSRID(ST_Extent(the_geom), 4326) extent,
                   COUNT(*)::INT cnt,
                   SUM(ST_Area(the_geom))::Numeric sumarea
            FROM geoms)
SELECT column_id = 'us.census.tiger.census_tract' testarea_uses_tract
FROM summary, LATERAL (
  SELECT *
  FROM cdb_observatory._OBS_GetGeometryScores(extent,
    ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
          'us.census.tiger.zcta5', 'us.census.tiger.county'],
    cnt, sumarea)) foo
ORDER BY score DESC LIMIT 1;
|
||||
-- Check that randomly distributed points always use smallest geometry if we
-- order by numgeoms desc
WITH geoms as (SELECT UNNEST(ARRAY[
  cdb_observatory._testpoint(),
  st_translate(cdb_observatory._testpoint(), -0.003, 0),
  st_translate(cdb_observatory._testpoint(), -0.006, 0)
]) the_geom),
summary as (SELECT
  ST_SetSRID(ST_Extent(the_geom), 4326) extent,
  SUM(ST_Area(the_geom))::Numeric area,
  COUNT(*)::INTEGER cnt
  FROM geoms
)
SELECT column_id = 'us.census.tiger.block_group' points_use_bg
FROM summary, LATERAL (
  SELECT * FROM cdb_observatory._OBS_GetGeometryScores(
    extent,
    ARRAY['us.census.tiger.block_group', 'us.census.tiger.census_tract',
          'us.census.tiger.zcta5', 'us.census.tiger.county'],
    cnt, area)) foo
WHERE table_id LIKE '%2015%'
ORDER BY numgeoms DESC LIMIT 1;
|
||||
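The test above relies on `ORDER BY numgeoms DESC LIMIT 1` picking the boundary with the most geometries inside the extent, which is expected to be the smallest one. A rough Python sketch of that selection is given here; the helper name `smallest_boundary` is hypothetical, and the counts are purely illustrative figures rather than real query output.

```python
def smallest_boundary(numgeoms_by_column):
    """Return the column_id with the highest geometry count.

    Equivalent in spirit to ORDER BY numgeoms DESC LIMIT 1: the boundary
    that splits the extent into the most pieces is the finest-grained one.
    Ties are broken by dictionary order here, which SQL does not guarantee.
    """
    return max(numgeoms_by_column, key=numgeoms_by_column.get)


# Illustrative counts only (assumed for the example).
example_counts = {
    'us.census.tiger.block_group': 9,
    'us.census.tiger.census_tract': 3,
    'us.census.tiger.zcta5': 0,
    'us.census.tiger.county': 0,
}
best = smallest_boundary(example_counts)
```

Ordering by `numgeoms` rather than `score` sidesteps the scoring heuristic entirely, so the assertion only depends on how many geometries intersect the extent.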
--
|
||||
-- OBS_LegacyBuilderMetadata tests
|
||||
--
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
from nose.tools import assert_equal, assert_is_not_none
|
||||
from nose.plugins.skip import SkipTest
|
||||
from nose_parameterized import parameterized
|
||||
|
||||
from itertools import izip_longest
|
||||
@@ -55,83 +54,50 @@ SKIP_COLUMNS = set([
|
||||
u'us.census.tiger.mtfcc',
|
||||
u'whosonfirst.wof_county_name',
|
||||
u'whosonfirst.wof_region_name',
|
||||
'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
, 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
, 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
, 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
, 'uk.ons.LC3202WA0007'
|
||||
, 'uk.ons.LC3202WA0010'
|
||||
, 'uk.ons.LC3202WA0004'
|
||||
, 'uk.ons.LC3204WA0004'
|
||||
, 'uk.ons.LC3204WA0007'
|
||||
, 'uk.ons.LC3204WA0010'
|
||||
u'fr.insee.P12_RP_CHOS',
|
||||
u'fr.insee.P12_RP_HABFOR',
|
||||
u'fr.insee.P12_RP_EAUCH',
|
||||
u'fr.insee.P12_RP_BDWC',
|
||||
u'fr.insee.P12_RP_MIDUR',
|
||||
u'fr.insee.P12_RP_CLIM',
|
||||
u'fr.insee.P12_RP_MIBOIS',
|
||||
u'fr.insee.P12_RP_CASE',
|
||||
u'fr.insee.P12_RP_TTEGOU',
|
||||
u'fr.insee.P12_RP_ELEC',
|
||||
u'fr.insee.P12_ACTOCC15P_ILT45D',
|
||||
u'fr.insee.P12_RP_CHOS',
|
||||
u'fr.insee.P12_RP_HABFOR',
|
||||
u'fr.insee.P12_RP_EAUCH',
|
||||
u'fr.insee.P12_RP_BDWC',
|
||||
u'fr.insee.P12_RP_MIDUR',
|
||||
u'fr.insee.P12_RP_CLIM',
|
||||
u'fr.insee.P12_RP_MIBOIS',
|
||||
u'fr.insee.P12_RP_CASE',
|
||||
u'fr.insee.P12_RP_TTEGOU',
|
||||
u'fr.insee.P12_RP_ELEC',
|
||||
u'fr.insee.P12_ACTOCC15P_ILT45D',
|
||||
u'uk.ons.LC3202WA0007',
|
||||
u'uk.ons.LC3202WA0010',
|
||||
u'uk.ons.LC3202WA0004',
|
||||
u'uk.ons.LC3204WA0004',
|
||||
u'uk.ons.LC3204WA0007',
|
||||
u'uk.ons.LC3204WA0010',
|
||||
u'br.geo.subdistritos_name'
|
||||
])
|
||||
|
||||
MEASURE_COLUMNS = query('''
|
||||
SELECT ARRAY_AGG(DISTINCT numer_id) numer_ids,
|
||||
SELECT cdb_observatory.FIRST(distinct numer_id) numer_ids,
|
||||
numer_aggregate,
|
||||
denom_reltype,
|
||||
section_tags
|
||||
denom_reltype
|
||||
FROM observatory.obs_meta
|
||||
WHERE numer_weight > 0
|
||||
AND numer_id NOT IN ('{skip}')
|
||||
AND numer_id NOT LIKE 'eu.%' --Skipping Eurostat
|
||||
AND section_tags IS NOT NULL
|
||||
AND subsection_tags IS NOT NULL
|
||||
GROUP BY numer_aggregate, section_tags, denom_reltype
|
||||
GROUP BY numer_id, numer_aggregate, denom_reltype
|
||||
'''.format(skip="', '".join(SKIP_COLUMNS))).fetchall()
|
||||
|
||||
#CATEGORY_COLUMNS = query('''
|
||||
#SELECT distinct numer_id
|
||||
#FROM observatory.obs_meta
|
||||
#WHERE numer_type ILIKE 'text'
|
||||
#AND numer_weight > 0
|
||||
#''').fetchall()
|
||||
#
|
||||
#BOUNDARY_COLUMNS = query('''
|
||||
#SELECT id FROM observatory.obs_column
|
||||
#WHERE type ILIKE 'geometry'
|
||||
#AND weight > 0
|
||||
#''').fetchall()
|
||||
#
|
||||
#US_CENSUS_MEASURE_COLUMNS = query('''
|
||||
#SELECT distinct numer_name
|
||||
#FROM observatory.obs_meta
|
||||
#WHERE numer_type ILIKE 'numeric'
|
||||
#AND 'us.census.acs' = ANY (subsection_tags)
|
||||
#AND numer_weight > 0
|
||||
#''').fetchall()
|
||||
|
||||
|
||||
#def default_geometry_id(column_id):
|
||||
# '''
|
||||
# Returns default test point for the column_id.
|
||||
# '''
|
||||
# if column_id == 'whosonfirst.wof_disputed_geom':
|
||||
# return 'ST_SetSRID(ST_MakePoint(76.57, 33.78), 4326)'
|
||||
# elif column_id == 'whosonfirst.wof_marinearea_geom':
|
||||
# return 'ST_SetSRID(ST_MakePoint(-68.47, 43.33), 4326)'
|
||||
# elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
# 'us.census.tiger.school_district_secondary',
|
||||
# 'us.census.tiger.school_district_elementary_clipped',
|
||||
# 'us.census.tiger.school_district_secondary_clipped'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-73.7067, 40.7025), 4326)'
|
||||
# elif column_id.startswith('es.ine'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-2.51141249535454, 42.8226119029222), 4326)'
|
||||
# elif column_id.startswith('us.zillow'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-81.3544048197256, 28.3305906291771), 4326)'
|
||||
# elif column_id.startswith('ca.'):
|
||||
# return ''
|
||||
# else:
|
||||
# return 'ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)'
|
||||
|
||||
|
||||
def default_lonlat(column_id):
|
||||
'''
|
||||
@@ -141,11 +107,6 @@ def default_lonlat(column_id):
|
||||
return (76.57, 33.78)
|
||||
elif column_id == 'whosonfirst.wof_marinearea_geom':
|
||||
return (-68.47, 43.33)
|
||||
elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
'us.census.tiger.school_district_secondary',
|
||||
'us.census.tiger.school_district_elementary_clipped',
|
||||
'us.census.tiger.school_district_secondary_clipped'):
|
||||
return (40.7025, -73.7067)
|
||||
elif column_id.startswith('uk'):
|
||||
if 'WA' in column_id:
|
||||
return (51.46844551219723, -3.184833526611328)
|
||||
@@ -157,30 +118,19 @@ def default_lonlat(column_id):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('mx.'):
|
||||
return (19.41347699386547, -99.17019367218018)
|
||||
elif column_id.startswith('th.'):
|
||||
return (13.725377712079784, 100.49263000488281)
|
||||
# cols for French Guyana only
|
||||
#elif column_id in ('fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
# , 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
# , 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
# , 'fr.insee.P12_ACTOCC15P_ILT45D'):
|
||||
# return (4.938408371206558, -52.32908248901367)
|
||||
elif column_id.startswith('fr.'):
|
||||
return (48.860875144709475, 2.3613739013671875)
|
||||
elif column_id.startswith('ca.'):
|
||||
return (43.65594991256823, -79.37965393066406)
|
||||
elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
'us.census.tiger.school_district_secondary',
|
||||
'us.census.tiger.school_district_elementary_clipped',
|
||||
'us.census.tiger.school_district_secondary_clipped',
|
||||
'us.census.tiger.school_district_elementary_geoname',
|
||||
'us.census.tiger.school_district_secondary_geoname'):
|
||||
return (40.7025, -73.7067)
|
||||
elif column_id.startswith('us.census.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.dma.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.ihme.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.bls.'):
|
||||
@@ -191,8 +141,6 @@ def default_lonlat(column_id):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.epa.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('eu.'):
|
||||
raise SkipTest('No tests for Eurostat!')
|
||||
elif column_id.startswith('br.'):
|
||||
return (-23.53, -46.63)
|
||||
elif column_id.startswith('au.'):
|
||||
@@ -201,56 +149,65 @@ def default_lonlat(column_id):
|
||||
raise Exception('No catalog point set for {}'.format(column_id))
|
||||
|
||||
|
||||
def default_point(column_id):
    lat, lng = default_lonlat(column_id)
def default_point(test_point):
    lat, lng = test_point
    return 'ST_SetSRID(ST_MakePoint({lng}, {lat}), 4326)'.format(
        lat=lat, lng=lng)


def default_area(column_id):
def default_area(test_point):
    '''
    Returns default test area for the column_id
    '''
    point = default_point(column_id)
    point = default_point(test_point)
    area = 'ST_Transform(ST_Buffer(ST_Transform({point}, 3857), 250), 4326)'.format(
        point=point)
    return area
|
||||
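The two helpers above now take a `(lat, lng)` tuple directly and turn it into PostGIS SQL text. Because `ST_MakePoint` expects x (longitude) before y (latitude), the tuple order is swapped when the string is built. The sketch below restates that under hypothetical names (`point_sql`, `buffered_area_sql`) and with an assumed `radius` parameter that the original hard-codes as 250; note that a buffer in EPSG:3857 is measured in Web Mercator units, which only approximate ground metres away from the equator.

```python
def point_sql(test_point):
    """Build SQL for a WGS84 point from a (lat, lng) tuple.

    ST_MakePoint takes x (longitude) first, then y (latitude).
    """
    lat, lng = test_point
    return 'ST_SetSRID(ST_MakePoint({lng}, {lat}), 4326)'.format(
        lat=lat, lng=lng)


def buffered_area_sql(test_point, radius=250):
    """Build SQL for a circular test area around the point.

    The point is projected to EPSG:3857, buffered by `radius` projected
    units, and transformed back to EPSG:4326.
    """
    return ('ST_Transform(ST_Buffer(ST_Transform({point}, 3857), {radius}), '
            '4326)').format(point=point_sql(test_point), radius=radius)
```

Passing the point in explicitly, rather than a column id, lets one test point be shared by a whole batch of columns.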
#@parameterized(US_CENSUS_MEASURE_COLUMNS)
|
||||
#def test_get_us_census_measure_points(name):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetUSCensusMeasure({point}, '{name}')
|
||||
# '''.format(name=name.replace("'", "''"),
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point('')))
|
||||
# rows = resp.fetchall()
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
def filter_points():
    return MEASURE_COLUMNS


def grouped_measure_columns():
    for numer_ids, numer_aggregate, denom_reltype, section_tags in MEASURE_COLUMNS:
def filter_areas():
    filtered = []
    for numer_ids, numer_aggregate, denom_reltype in MEASURE_COLUMNS:
        if numer_aggregate is None or numer_aggregate.lower() not in ('sum', 'median', 'average'):
            continue
        if numer_aggregate.lower() in ('median', 'average') \
                and (denom_reltype is None or denom_reltype.lower() != 'universe'):
            continue
        filtered.append((numer_ids, numer_aggregate, denom_reltype))

    return filtered
|
||||
|
||||
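The filter above decides which measures can be tested against an area rather than a point: sums always qualify, while medians and averages qualify only when their denominator relationship is `universe`. As a small sketch, the same rule can be written as a standalone predicate; the name `is_area_testable` is hypothetical and does not appear in the test suite.

```python
def is_area_testable(numer_aggregate, denom_reltype):
    """Return True if a measure can be interpolated over an area.

    Sums always qualify; medians and averages qualify only with a
    'universe' denominator relationship. Comparison is case-insensitive.
    """
    if numer_aggregate is None:
        return False
    aggregate = numer_aggregate.lower()
    if aggregate not in ('sum', 'median', 'average'):
        return False
    if aggregate in ('median', 'average'):
        return denom_reltype is not None and denom_reltype.lower() == 'universe'
    return True
```

Moving this check out of the test body and into a pre-filter means excluded columns never become test cases at all, instead of being generated and silently returning early.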
def grouped_measure_columns(filtered_columns):
    groupbypoint = dict()
    for row in filtered_columns:
        numer_ids = row[0]
        point = default_lonlat(numer_ids)
        if point in groupbypoint:
            groupbypoint[point].append(numer_ids)
        else:
            groupbypoint[point] = [numer_ids]

    for point, numer_ids in groupbypoint.iteritems():
        for colgroup in grouper(numer_ids, 50):
            yield [c for c in colgroup if c], numer_aggregate, denom_reltype, section_tags
            yield point, [c for c in colgroup if c]
|
||||
|
||||
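The rewritten generator buckets column ids by their test point and then yields them in batches of at most 50, so each test case queries many measures at one location. Below is a self-contained sketch of that pattern in Python 3 (the original targets Python 2 with `izip_longest` and `iteritems`). The `grouper` shown is assumed to match the usual itertools recipe, since its definition is not visible in this diff, and `group_by_point` and `lonlat_for` are hypothetical names.

```python
from collections import defaultdict
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def group_by_point(rows, lonlat_for, batch_size=50):
    """Yield (point, [numer_id, ...]) batches grouped by test point.

    `rows` are tuples whose first element is a numer_id; `lonlat_for`
    maps a numer_id to its test point tuple.
    """
    by_point = defaultdict(list)
    for row in rows:
        numer_id = row[0]
        by_point[lonlat_for(numer_id)].append(numer_id)

    for point, numer_ids in by_point.items():
        for batch in grouper(numer_ids, batch_size):
            # Drop the None padding added to the final short batch.
            yield point, [c for c in batch if c]
```

A quick usage example with a toy lookup:

```python
rows = [('a.1',), ('a.2',), ('b.1',), ('a.3',)]
lookup = lambda cid: (1.0, 2.0) if cid.startswith('a.') else (3.0, 4.0)
batches = list(group_by_point(rows, lookup, batch_size=2))
```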
@parameterized(grouped_measure_columns())
def test_get_measure_points(numer_ids, numer_aggregate, denom_reltype, section_tags):
    _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_point(numer_ids[0]))
@parameterized(grouped_measure_columns(filter_points()))
def test_get_measure_points(point, numer_ids):
    _test_measures(numer_ids, default_point(point))


@parameterized(grouped_measure_columns())
def test_get_measure_areas(numer_ids, numer_aggregate, denom_reltype, section_tags):
    if numer_aggregate is None or numer_aggregate.lower() not in ('sum', 'median', 'average'):
        return
    if numer_aggregate.lower() in ('median', 'average') \
            and (denom_reltype is None \
            or denom_reltype.lower() != 'universe'):
        return
    _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_area(numer_ids[0]))
@parameterized(grouped_measure_columns(filter_areas()))
def test_get_measure_areas(point, numer_ids):
    _test_measures(numer_ids, default_area(point))


def _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, geom):
def _test_measures(numer_ids, geom):
    in_params = []
    for numer_id in numer_ids:
        in_params.append({
@@ -283,90 +240,3 @@ def _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, geom
    assert_equal(len(vals), len(in_params))
    for i, val in enumerate(vals):
        assert_is_not_none(val, 'NULL for {}'.format(in_params[i]['numer_id']))
|
||||
|
||||
#@parameterized(CATEGORY_COLUMNS)
|
||||
#def test_get_category_areas(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetCategory({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(CATEGORY_COLUMNS)
|
||||
#def test_get_category_points(column_id):
|
||||
# if column_id in SKIP_COLUMNS:
|
||||
# raise SkipTest('Column {} should be skipped'.format(column_id))
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetCategory({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# rows = resp.fetchall()
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundaries_by_geometry(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundariesByGeometry({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_points_by_geometry(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetPointsByGeometry({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_points(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundary({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_id(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundaryId({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_by_id(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundaryById({geometry_id}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# geometry_id=default_geometry_id(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
|
||||