Compare commits
39 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | ff0989f8fc | | |
| | 0a753e95c0 | | |
| | b62e3ea963 | | |
| | 1da0b8cb6b | | |
| | a39de46531 | | |
| | 94b8e7492d | | |
| | 91ece26c06 | | |
| | 74b9d209c0 | | |
| | 4ae889dfdc | | |
| | 3353ad0a32 | | |
| | b4ef3c77a9 | | |
| | 90a2421b6e | | |
| | fd21709ca1 | | |
| | 3791511d7d | | |
| | fad541c3fc | | |
| | 48ed086fec | | |
| | 7e550cf909 | | |
| | 6ab17bf8be | | |
| | 1f7f8015ad | | |
| | 6066ef028d | | |
| | 3c2e997a85 | | |
| | cef99c6343 | | |
| | 50d975ce9b | | |
| | c56633dd2a | | |
| | 2b26c5ad64 | | |
| | 3ed18ca1f0 | | |
| | 028c93170c | | |
| | 8d52857f01 | | |
| | 9e36e11bb3 | | |
| | adae37631e | | |
| | 8b98b6b64a | | |
| | aedc45f2a8 | | |
| | 8612da57f7 | | |
| | 24a736c72e | | |
| | cde6d5bfba | | |
| | d1f4e570ad | | |
| | 415a4ccc05 | | |
| | ccb8092506 | | |
| | 6266262427 | | |
43
.travis.yml
Normal file
@@ -0,0 +1,43 @@
language: c
dist: precise

env:
  global:
    - PAGER=cat

before_install:
  - sudo add-apt-repository -y ppa:cartodb/postgresql-9.5
  - sudo add-apt-repository -y ppa:cartodb/gis
  - sudo add-apt-repository -y ppa:cartodb/gis-testing
  - sudo apt-get update

  # Install postgres db and build deps
  - sudo /etc/init.d/postgresql stop # stop travis default instance
  - sudo apt-get -y remove --purge postgresql-9.1
  - sudo apt-get -y remove --purge postgresql-9.2
  - sudo apt-get -y remove --purge postgresql-9.3
  - sudo apt-get -y remove --purge postgresql-9.4
  - sudo apt-get -y remove --purge postgresql-9.5
  - sudo rm -rf /var/lib/postgresql/
  - sudo rm -rf /var/log/postgresql/
  - sudo rm -rf /etc/postgresql/
  - sudo apt-get -y remove --purge postgis-2.2
  - sudo apt-get -y autoremove

  - sudo apt-get -y install postgresql-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-server-dev-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-plpython-9.5=9.5.2-3cdb3
  - sudo apt-get -y install postgresql-9.5-postgis-scripts=2.2.2.0-cdb2
  - sudo apt-get -y install postgresql-9.5-postgis-2.2=2.2.2.0-cdb2

  # configure it to accept local connections from postgres
  - echo -e "# TYPE DATABASE USER ADDRESS METHOD \nlocal all postgres trust\nlocal all all trust\nhost all all 127.0.0.1/32 trust" \
    | sudo tee /etc/postgresql/9.5/main/pg_hba.conf
  - sudo /etc/init.d/postgresql restart 9.5

install:
  - sudo make install

script:
  - cd src/pg
  - make test || { cat src/pg/test/regression.diffs; false; }
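For reference, the `echo -e` step in `before_install` overwrites `pg_hba.conf` with four lines. The Python sketch below only illustrates what that shell command writes; the `render_pg_hba` helper is hypothetical and is not part of the repository.

```python
# Hypothetical helper: mirrors the string passed to `echo -e` in .travis.yml.
PG_HBA_TEMPLATE = (
    "# TYPE DATABASE USER ADDRESS METHOD \n"
    "local all postgres trust\n"
    "local all all trust\n"
    "host all all 127.0.0.1/32 trust"
)


def render_pg_hba() -> str:
    """Return the pg_hba.conf text; `echo` itself appends the final newline."""
    return PG_HBA_TEMPLATE + "\n"


if __name__ == "__main__":
    print(render_pg_hba(), end="")
```

Every rule uses `trust`, so any local user can connect without a password. That is convenient on a throwaway CI machine and is not a setting to copy to a shared server.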
77
NEWS.md
@@ -1,4 +1,57 @@
|
||||
1.8.0 (2017-10-18)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* Add `number_geometries` field to `OBS_GetAvailableGeometries` in order to provide the number of geometries from the source data to be used in the score calculation ([#313](https://github.com/CartoDB/observatory-extension/issues/313))
|
||||
|
||||
1.7.0 (2017-08-18)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* Add Travis support to execute the extension tests ([#183](https://github.com/CartoDB/observatory-extension/issues/183))
|
||||
|
||||
__API Changes__
|
||||
|
||||
* Add new function `OBS_MetadataValidation` ([#303](https://github.com/CartoDB/observatory-extension/pull/303))
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
* Fixed parentheses for obs_getdata with ids
|
||||
* Fixed failing tests due to changes in the data dump for some TIGER geometries
|
||||
|
||||
1.6.0 (2017-07-20)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* `OBS_GetAvailableNumerators` was not designed with our UI in mind, which
caused a lot of trouble and forced many hacks to reconcile our UI needs with
the function's interface. This change adds a function that is a better fit
for our purposes. ([#300](https://github.com/CartoDB/observatory-extension/pull/300))
|
||||
* Now use the new meta table `obs_meta_geom_numer_timespan` to filter
|
||||
the geometries by geometries timespan and/or numerator timespan (which
|
||||
is what we get when we use the obs_getavailabletimespans) ([#302](https://github.com/CartoDB/observatory-extension/pull/302))
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
* Right now we're doing INNER JOINS when we JOIN the `_procgeoms` and
|
||||
the data so we end up with NULL value instead of id, NULL value. ([#298](https://github.com/CartoDB/observatory-extension/pull/298))
|
||||
|
||||
|
||||
1.5.1 (2017-05-16)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* Much improved performance for `OBS_GetData` when augmenting with several
|
||||
different geometries simultaneously ([#285](https://github.com/CartoDB/observatory-extension/pull/285))
|
||||
* Return the automatically assigned normalization type from `OBS_GetMeta`
|
||||
([#285](https://github.com/CartoDB/observatory-extension/pull/285))
|
||||
|
||||
1.5.0 (2017-04-24)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -12,6 +65,7 @@ __API Changes__
|
||||
([#282](https://github.com/CartoDB/observatory-extension/pull/282))
|
||||
|
||||
1.4.0 (2017-03-21)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -32,16 +86,19 @@ __Improvements__
|
||||
boundary selection
|
||||
|
||||
1.3.5 (2017-03-15)
|
||||
------------------
|
||||
|
||||
No changes. Artifact to allow for data update.
|
||||
|
||||
1.3.4 (2017-03-10)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
* Remove erroneously committed `RAISE NOTICE` in `OBS_GetData`
|
||||
|
||||
1.3.3 (2017-03-10)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -64,6 +121,7 @@ __Improvements__
|
||||
([#267](https://github.com/CartoDB/observatory-extension/pull/267))
|
||||
|
||||
1.3.2 (2017-03-02)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -71,6 +129,7 @@ __Bugfixes__
|
||||
This fixes issues with Camshaft.
|
||||
|
||||
1.3.1 (2017-02-16)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -81,6 +140,7 @@ __Improvements__
|
||||
called for measures for polygons
|
||||
|
||||
1.3.0 (2017-01-17)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -105,9 +165,8 @@ __Bugfixes__
|
||||
* Remove unnecessary dependency on `postgres_fdw`
|
||||
* `OBS_GetData()` now aggregates measures with mixed geoms correctly
|
||||
|
||||
__API Changes__
|
||||
|
||||
1.2.1 (2017-01-17)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -115,6 +174,7 @@ __Improvements__
|
||||
([#243](https://github.com/CartoDB/observatory-extension/pull/233))
|
||||
|
||||
1.2.0 (2016-12-28)
|
||||
------------------
|
||||
|
||||
__API Changes__
|
||||
|
||||
@@ -135,6 +195,7 @@ __Improvements__
|
||||
* Return both `table_id` and `column_id` from `_OBS_GetGeometryScores`
|
||||
|
||||
1.1.7 (2016-12-15)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -147,6 +208,7 @@ __Improvements__
|
||||
* Yields a ~50% improvement in performance for `_OBS_GetGeometryScores`.
|
||||
|
||||
1.1.6 (2016-12-08)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -173,6 +235,7 @@ __Improvements__
|
||||
- Add ability to persist results to JSON for graph visualization later
|
||||
|
||||
1.1.5 (2016-11-29)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -180,6 +243,7 @@ __Bugfixes__
|
||||
a geometry where it does not exist ([#220](https://github.com/CartoDB/observatory-extension/issues/220)).
|
||||
|
||||
1.1.4 (2016-11-21)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -187,10 +251,12 @@ __Bugfixes__
|
||||
`OBS_GetLegacyMetadata` ([#216](https://github.com/CartoDB/observatory-extension/issues/216)).
|
||||
|
||||
1.1.3 (2016-11-15)
|
||||
------------------
|
||||
|
||||
* Temporarily ignore EU data for the sake of testing
|
||||
|
||||
1.1.2 (2016-11-09)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -206,12 +272,14 @@ __API Changes (Internal)__
|
||||
* Add internal `_OBS_GetGeometryScores`
|
||||
|
||||
1.1.1 (2016-10-14)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
* Test points for Canada and France ([#204](https://github.com/CartoDB/observatory-extension/issues/120))
|
||||
|
||||
1.1.0 (2016-10-04)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -234,6 +302,7 @@ __API Changes__
|
||||
is also referred to here ([CartoDB/design#68](https://github.com/CartoDB/design/issues/68)).
|
||||
|
||||
1.0.7 (2016-09-20)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -245,6 +314,7 @@ __Improvements__
|
||||
* Automatic tests work for Canada and Thailand
|
||||
|
||||
1.0.6 (2016-09-08)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -252,6 +322,7 @@ __Improvements__
|
||||
framework logic from the observatory measure functions.
|
||||
|
||||
1.0.5 (2016-08-12)
|
||||
------------------
|
||||
|
||||
__Improvements__
|
||||
|
||||
@@ -259,6 +330,7 @@ __Improvements__
|
||||
any HTTP SQL API.
|
||||
|
||||
1.0.4 (2016-07-26)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
@@ -267,6 +339,7 @@ __Bugfixes__
|
||||
([#173](https://github.com/CartoDB/observatory-extension/issues/173))
|
||||
|
||||
1.0.3 (2016-07-25)
|
||||
------------------
|
||||
|
||||
__Bugfixes__
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Use the following functions to retrieve [Boundary](https://carto.com/docs/carto-engine/data/overview/#boundary-data) data. Data ranges from small areas (e.g. US Census Block Groups) to large areas (e.g. Countries). You can access boundaries by point location lookup, bounding box lookup, direct ID access and several other methods described below.
|
||||
|
||||
You can [access](https://carto.com/docs/carto-engine/data/accessing) boundaries through CARTO Builder. The same methods will work if you are using the CARTO Engine to develop your application. We [encourage you](http://docs/carto-engine/data/accessing/#best-practices) to use table modifying methods (UPDATE and INSERT) over dynamic methods (SELECT).
|
||||
You can [access](https://carto.com/docs/carto-engine/data/accessing) boundaries through CARTO Builder. The same methods will work if you are using the CARTO Engine to develop your application. We [encourage you](https://carto.com/docs/carto-engine/data/accessing/#best-practices) to use table modifying methods (UPDATE and INSERT) over dynamic methods (SELECT).
|
||||
|
||||
## OBS_GetBoundariesByGeometry(geom geometry, geometry_id text)
|
||||
|
||||
@@ -123,7 +123,7 @@ SET the_geom = OBS_GetBoundary(the_geom, 'us.census.tiger.block_group')
|
||||
|
||||
## OBS_GetBoundaryId(point_geometry, boundary_id)
|
||||
|
||||
The ```OBS_GetBoundaryId(point_geometry, boundary_id)``` returns a unique geometry_id for the boundary geometry that contains a given point geometry. See the [Boundary ID Glossary](http://docs/carto-engine/data/glossary/#boundary-ids). The method can be combined with ```OBS_GetBoundaryById(geometry_id)``` to create a point aggregation workflow.
|
||||
The ```OBS_GetBoundaryId(point_geometry, boundary_id)``` returns a unique geometry_id for the boundary geometry that contains a given point geometry. See the [Boundary ID Glossary](https://carto.com/docs/carto-engine/data/glossary/#boundary-ids). The method can be combined with ```OBS_GetBoundaryById(geometry_id)``` to create a point aggregation workflow.
|
||||
|
||||
#### Arguments
|
||||
|
||||
|
||||
@@ -228,7 +228,7 @@ SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
|
||||
WHERE valid_timespan IS True;
|
||||
```
|
||||
|
||||
## OBS_GetAvailableGeometries(bounds, filter_tags, numer_id, denom_id, timespan)
|
||||
## OBS_GetAvailableGeometries(bounds, filter_tags, numer_id, denom_id, timespan, number_geometries)
|
||||
|
||||
Return available geometries within a boundary and with the specified
|
||||
`filter_tags`.
|
||||
@@ -242,6 +242,7 @@ filter_tags | Text[] | a list of filters. Only geometries for which all of thes
|
||||
numer_id | Text | the ID of a numerator to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_numer` (optional)
|
||||
denom_id | Text | the ID of a denominator to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_denom` (optional)
|
||||
timespan | Text | the ID of a timespan to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_timespan` (optional)
|
||||
number_geometries | Integer | the number of geometries in the source data. Used to calculate the score more accurately and so determine which geometry best fits the provided extent (optional)
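
As a rough sketch of how a client might pass the new optional argument, the Python helper below assembles a parameterized query. The helper name `available_geometries_query`, the `%s` placeholder style (as used by psycopg2), the explicit casts, and the assumption that `NULL` means "not provided" are all illustrative; only the function name and argument order come from the signature above.

```python
from typing import Any, List, Optional, Tuple


def available_geometries_query(
    bounds_wkt: str,
    filter_tags: Optional[List[str]] = None,
    numer_id: Optional[str] = None,
    denom_id: Optional[str] = None,
    timespan: Optional[str] = None,
    number_geometries: Optional[int] = None,
) -> Tuple[str, Tuple[Any, ...]]:
    """Build the SQL text and its parameters; unset arguments become NULL."""
    sql = (
        "SELECT * FROM cdb_observatory.OBS_GetAvailableGeometries("
        "ST_GeomFromText(%s, 4326), %s::text[], %s::text, %s::text, "
        "%s::text, %s::integer)"
    )
    params = (
        bounds_wkt,
        filter_tags,
        numer_id,
        denom_id,
        timespan,
        number_geometries,
    )
    return sql, params


if __name__ == "__main__":
    # Example extent (WKT) and row count are made-up values.
    query, values = available_geometries_query(
        "POLYGON((-74.1 40.6, -73.8 40.6, -73.8 40.9, -74.1 40.9, -74.1 40.6))",
        number_geometries=500,
    )
    print(query)
    print(values)
```

The pair returned here could be handed to a database cursor, for example `cursor.execute(query, values)`.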
|
||||
|
||||
#### Returns
|
||||
|
||||
|
||||
@@ -108,7 +108,7 @@ The ```OBS_GetMeasure(polygon, measure_id)``` function returns any Data Observat
|
||||
Name |Description
|
||||
--- | ---
|
||||
polygon_geometry | a WGS84 polygon geometry (the_geom)
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
normalize | for measures that are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
|
||||
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
|
||||
time_span | time span of interest (e.g., 2010 - 2014)
|
||||
@@ -143,7 +143,7 @@ The ```OBS_GetMeasureById(geom_ref, measure_id, boundary_id)``` function returns
|
||||
Name |Description
|
||||
--- | ---
|
||||
geom_ref | a geometry reference (e.g., a US Census geoid)
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
|
||||
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
|
||||
time_span (optional) | time span of interest (e.g., 2010 - 2014). If `NULL` is passed, the measure from the most recent data will be used.
|
||||
|
||||
@@ -215,7 +215,7 @@ extent | A geometry of the extent of the input geometries
|
||||
metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
|
||||
num_timespan_options | How many historical time periods to include. Defaults to 1
|
||||
num_score_options | How many alternative boundary levels to include. Defaults to 1
|
||||
target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
|
||||
target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
|
||||
|
||||
The schema of the metadata input objects are as follows:
|
||||
|
||||
@@ -321,6 +321,55 @@ SELECT OBS_GetMeta(
|
||||
) FROM tablename
|
||||
```
|
||||
|
||||
## OBS_MetadataValidation(extent geometry, geometry_type text, metadata json, target_geoms)

The ```OBS_MetadataValidation``` function checks the extent, geometry type, and metadata intended for the ```OBS_GetMeta``` function against a set of known issues.

#### Arguments

Name | Description
---- | -----------
extent | A geometry of the extent of the input geometries
geometry_type | The geometry type of the source data
metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optional additional parameters about that column
target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest

The schema of the metadata input objects is as follows:

Metadata Input Key | Description
--- | -----------
numer_id | The identifier for the desired measurement. If left blank, a `geom_id` is specified and the column returns a geometry, instead of a measurement
geom_id | Identifier for a desired geographic boundary level used to calculate measures. If undefined, this is automatically assigned. If defined, `numer_id` is blank and the column returns a geometry, instead of a measurement
normalization | The desired normalization. One of 'area', 'prenormalized', or 'denominated'. 'Area' will normalize the measure per square kilometer, 'prenormalized' will return the original value, and 'denominated' will normalize by a denominator. If the metadata object specifies a geometry, this is ignored
denom_id | When `normalization` is 'denominated', this is the identifier for a desired normalization column. This is automatically assigned. If the metadata object specifies a geometry, this is ignored
numer_timespan | The desired timespan for the measurement. If left unspecified, it defaults to the most recent timespan available
geom_timespan | The desired timespan for the geometry. If left unspecified, it defaults to the timespan matching `numer_timespan`
target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area. Unit is square degrees WGS84. Set this to `0` if you want to use the smallest source geometry for this element of metadata, for example if you are passing in points
target_geoms | Override global `target_geoms` for this element of metadata
max_timespan_rank | Only include timespans of this recency (for example, `1` is only the most recent timespan). There is no limit by default
max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary). The default is `1`
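
The metadata argument is a JSON array of objects that use the keys listed above. A minimal Python sketch of assembling it follows; the `build_metadata` helper is hypothetical, and rejecting unknown keys is a choice made for this example, not documented behavior of the extension.

```python
import json
from typing import Any, Dict, List

# Keys taken from the "Metadata Input Key" table above.
ALLOWED_KEYS = {
    "numer_id", "geom_id", "normalization", "denom_id",
    "numer_timespan", "geom_timespan", "target_area",
    "target_geoms", "max_timespan_rank", "max_score_rank",
}


def build_metadata(measures: List[Dict[str, Any]]) -> str:
    """Serialize metadata input objects, dropping keys whose value is None."""
    cleaned = []
    for measure in measures:
        unknown = set(measure) - ALLOWED_KEYS
        if unknown:
            raise ValueError(f"unknown metadata keys: {sorted(unknown)}")
        cleaned.append({k: v for k, v in measure.items() if v is not None})
    return json.dumps(cleaned)


if __name__ == "__main__":
    print(build_metadata([
        {"numer_id": "us.census.acs.B01003001", "normalization": None},
        {"numer_id": "us.census.acs.B01001002", "max_score_rank": 1},
    ]))
```

Leaving a key out, instead of sending an explicit `null`, keeps the payload close to the documented defaults ("if left unspecified").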

#### Returns

Key | Description
--- | -----------
valid | A boolean field that represents if the validation was successful or not
errors | A text array with all possible errors

#### Examples

Validate metadata with two additional columns of US census data; using a boundary relevant for the geometry provided and the latest timespan. Limited to the most recent column, and the most relevant, based on the extent and density of input geometries in `tablename`.

```SQL
SELECT OBS_MetadataValidation(
  ST_SetSRID(ST_Extent(the_geom), 4326),
  ST_GeometryType(the_geom),
  '[{"numer_id": "us.census.acs.B01003001"}, {"numer_id": "us.census.acs.B01001002"}]',
  COUNT(*)::INTEGER
) FROM tablename
GROUP BY ST_GeometryType(the_geom)
```
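
A caller would typically inspect `valid` before going on to `OBS_GetMeta`. The sketch below assumes the result has already been fetched as a `(valid, errors)` pair; the `check_validation` helper, the exception class, and the sample error text are hypothetical.

```python
from typing import List, Optional, Tuple


class MetadataValidationError(Exception):
    """Raised when the validation result reports problems."""


def check_validation(row: Tuple[bool, Optional[List[str]]]) -> None:
    """Raise if `valid` is false, joining the reported errors into one message."""
    valid, errors = row
    if not valid:
        raise MetadataValidationError("; ".join(errors or ["unknown error"]))


if __name__ == "__main__":
    check_validation((True, []))  # passes silently
    try:
        check_validation((False, ["example error text"]))
    except MetadataValidationError as exc:
        print(f"metadata rejected: {exc}")
```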
|
||||
|
||||
## OBS_GetData(geomvals array[geomval], metadata json)
|
||||
|
||||
The ```OBS_GetData(geomvals, metadata)``` function returns a measure and/or
|
||||
@@ -465,7 +514,7 @@ WITH meta AS (
|
||||
'[{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.county"}]'
|
||||
) meta FROM tablename)
|
||||
SELECT id AS fips, (data->0->>'value')::Numeric AS pop_density
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG(fips) FROM tablename),
|
||||
(SELECT meta FROM meta))
|
||||
```
|
||||
|
||||
@@ -481,7 +530,7 @@ WITH meta AS (
|
||||
) meta FROM tablename),
|
||||
data as (
|
||||
SELECT id AS fips, (data->0->>'value') AS pop_density
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
|
||||
FROM OBS_GetData((SELECT ARRAY_AGG(fips) FROM tablename),
|
||||
(SELECT meta FROM meta)))
|
||||
UPDATE tablename
|
||||
SET pop_density = data.pop_density
|
||||
|
||||
2311
release/observatory--1.5.1.sql
Normal file
File diff suppressed because one or more lines are too long
2400
release/observatory--1.6.0.sql
Normal file
File diff suppressed because one or more lines are too long
2443
release/observatory--1.7.0.sql
Normal file
File diff suppressed because one or more lines are too long
2445
release/observatory--1.8.0.sql
Normal file
File diff suppressed because one or more lines are too long
@@ -1,5 +1,5 @@
|
||||
comment = 'CartoDB Observatory backend extension'
|
||||
default_version = '1.5.0'
|
||||
default_version = '1.8.0'
|
||||
requires = 'postgis'
|
||||
superuser = true
|
||||
schema = cdb_observatory
|
||||
|
||||
@@ -52,8 +52,8 @@ def get_tablename_query(column_id, boundary_id, timespan):
|
||||
METADATA_TABLES = ['obs_table', 'obs_column_table', 'obs_column', 'obs_column_tag',
|
||||
'obs_tag', 'obs_column_to_column', 'obs_dump_version', 'obs_meta',
|
||||
'obs_meta_numer', 'obs_meta_denom', 'obs_meta_geom',
|
||||
'obs_meta_timespan', 'obs_column_table_tile',
|
||||
'obs_column_table_tile_simple']
|
||||
'obs_meta_timespan', 'obs_meta_geom_numer_timespan',
|
||||
'obs_column_table_tile', 'obs_column_table_tile_simple']
|
||||
|
||||
FIXTURES = [
|
||||
('us.census.acs.B01003001_quantile', 'us.census.tiger.census_tract', '2010 - 2014'),
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
requests
|
||||
nose
|
||||
nose_parameterized
|
||||
psycopg2
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
comment = 'CartoDB Observatory backend extension'
|
||||
default_version = '1.5.0'
|
||||
default_version = '1.8.0'
|
||||
requires = 'postgis'
|
||||
superuser = true
|
||||
schema = cdb_observatory
|
||||
|
||||
@@ -166,28 +166,15 @@ BEGIN
|
||||
|
||||
EXECUTE format($string$
|
||||
WITH _filters AS (SELECT
|
||||
generate_series(1, array_length($3, 1)) id,
|
||||
(unnest($3))->>'numer_id' numer_id,
|
||||
(unnest($3))->>'denom_id' denom_id,
|
||||
(unnest($3))->>'geom_id' geom_id,
|
||||
(unnest($3))->>'numer_timespan' numer_timespan,
|
||||
(unnest($3))->>'geom_timespan' geom_timespan,
|
||||
(unnest($3))->>'normalization' normalization,
|
||||
(unnest($3))->>'max_timespan_rank' max_timespan_rank,
|
||||
(unnest($3))->>'max_score_rank' max_score_rank,
|
||||
((unnest($3))->>'target_geoms')::INTEGER target_geoms,
|
||||
((unnest($3))->>'target_area')::Numeric target_area
|
||||
row_number() over () id, *
|
||||
FROM json_to_recordset($3)
|
||||
AS x(numer_id TEXT, denom_id TEXT, geom_id TEXT, numer_timespan TEXT,
|
||||
geom_timespan TEXT, normalization TEXT, max_timespan_rank TEXT,
|
||||
max_score_rank TEXT, target_geoms INTEGER, target_area Numeric
|
||||
)
|
||||
), meta AS (SELECT
|
||||
id,
|
||||
f.numer_id,
|
||||
LOWER(TRIM(BOTH '_' FROM regexp_replace(CASE WHEN f.numer_id IS NOT NULL
|
||||
THEN CASE
|
||||
WHEN normalization ILIKE 'area%%' THEN numer_colname || ' per sq km'
|
||||
WHEN normalization ILIKE 'denom%%' THEN numer_colname || ' rate'
|
||||
ELSE numer_colname
|
||||
END || ' ' || m.numer_timespan
|
||||
ELSE geom_name || ' ' || m.geom_timespan
|
||||
END, '[^a-zA-Z0-9]+', '_', 'g'))) suggested_name,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_aggregate END numer_aggregate,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_colname END numer_colname,
|
||||
CASE WHEN f.numer_id IS NULL THEN NULL ELSE numer_geomref_colname END numer_geomref_colname,
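
The hunk above replaces the per-key `unnest($3)->>'...'` extraction with a single `json_to_recordset($3)` call. As a rough Python analogy of what that produces (an illustration, not the extension's code): each JSON object becomes one row, declared columns that are missing from an object come back as NULL, and keys outside the column list are ignored. The `to_recordset` name is hypothetical, and the sketch skips the type casts that PostgreSQL applies (for example `target_geoms INTEGER`).

```python
import json
from typing import Any, Dict, List

# Column names as declared in the AS x(...) clause of the hunk above.
COLUMNS = [
    "numer_id", "denom_id", "geom_id", "numer_timespan", "geom_timespan",
    "normalization", "max_timespan_rank", "max_score_rank",
    "target_geoms", "target_area",
]


def to_recordset(params_json: str) -> List[Dict[str, Any]]:
    """Turn a JSON array of objects into rows with a 1-based `id` column."""
    rows = []
    for position, obj in enumerate(json.loads(params_json), start=1):
        row: Dict[str, Any] = {"id": position}  # stands in for row_number() over ()
        row.update({column: obj.get(column) for column in COLUMNS})
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in to_recordset('[{"numer_id": "a"}, {"geom_id": "b", "extra": 1}]'):
        print(row)
```

Note that `row_number() over ()` carries no explicit ordering, so treating it as the array position is an assumption of this sketch.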
|
||||
@@ -217,7 +204,17 @@ BEGIN
|
||||
geom_description,
|
||||
geom_t_description,
|
||||
geom_type,
|
||||
normalization,
|
||||
Coalesce(normalization,
|
||||
-- automatically assign normalization to numeric numerators
|
||||
CASE WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
CASE WHEN denom_reltype ILIKE 'denominator' THEN 'denominated'
|
||||
WHEN numer_aggregate ILIKE 'sum' THEN 'area'
|
||||
WHEN numer_aggregate IN ('median', 'average') AND denom_reltype ILIKE 'universe'
|
||||
THEN 'prenormalized'
|
||||
ELSE 'prenormalized'
|
||||
END ELSE NULL
|
||||
END
|
||||
) normalization,
|
||||
max_timespan_rank,
|
||||
max_score_rank,
|
||||
target_geoms,
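
The `Coalesce(normalization, CASE ...)` expression added in the hunk above can be read as a small decision function. The Python sketch below mirrors that logic under stated assumptions: `is_numeric` is a stand-in for `cdb_observatory.isnumeric`, whose definition is not shown in this diff, and the type names it accepts are illustrative only.

```python
from typing import Optional

# Assumption: illustrative stand-in for cdb_observatory.isnumeric.
NUMERIC_TYPES = {"numeric", "integer", "float", "double precision"}


def is_numeric(numer_type: Optional[str]) -> bool:
    return (numer_type or "").lower() in NUMERIC_TYPES


def assign_normalization(
    normalization: Optional[str],
    numer_type: Optional[str],
    denom_reltype: Optional[str],
    numer_aggregate: Optional[str],
) -> Optional[str]:
    """Keep an explicit normalization, otherwise derive one as the SQL CASE does."""
    if normalization is not None:
        return normalization  # Coalesce: an explicit value wins
    if not is_numeric(numer_type):
        return None  # non-numeric numerators get no normalization
    # ILIKE without wildcards is a case-insensitive equality test.
    if (denom_reltype or "").lower() == "denominator":
        return "denominated"
    if (numer_aggregate or "").lower() == "sum":
        return "area"
    # The median/average-with-universe branch and the ELSE branch of the
    # SQL both yield 'prenormalized', so they collapse into one return.
    return "prenormalized"


if __name__ == "__main__":
    print(assign_normalization(None, "Numeric", "denominator", "sum"))
    print(assign_normalization(None, "Numeric", None, "sum"))
    print(assign_normalization(None, "Text", None, None))
```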
|
||||
@@ -249,7 +246,16 @@ BEGIN
|
||||
'score_rownum', row_number() over
|
||||
(PARTITION BY id, numer_timespan ORDER BY score DESC, Coalesce(denom_id, '')),
|
||||
'score', scores.score,
|
||||
'suggested_name', cdb_observatory.FIRST(meta.suggested_name),
|
||||
'suggested_name', cdb_observatory.FIRST(
|
||||
LOWER(TRIM(BOTH '_' FROM regexp_replace(CASE WHEN numer_id IS NOT NULL
|
||||
THEN CASE
|
||||
WHEN normalization ILIKE 'area%%' THEN numer_colname || ' per sq km'
|
||||
WHEN normalization ILIKE 'denom%%' THEN numer_colname || ' rate'
|
||||
ELSE numer_colname
|
||||
END || ' ' || numer_timespan
|
||||
ELSE geom_name || ' ' || geom_timespan
|
||||
END, '[^a-zA-Z0-9]+', '_', 'g')))
|
||||
),
|
||||
'numer_aggregate', cdb_observatory.FIRST(meta.numer_aggregate),
|
||||
'numer_colname', cdb_observatory.FIRST(meta.numer_colname),
|
||||
'numer_geomref_colname', cdb_observatory.FIRST(meta.numer_geomref_colname),
|
||||
@@ -305,7 +311,7 @@ BEGIN
|
||||
ELSE geom
|
||||
END,
|
||||
target_geoms,
|
||||
(SELECT ARRAY(SELECT json_array_elements_text(params))::json[]),
|
||||
params,
|
||||
num_timespan_options,
|
||||
num_score_options, numer_filters, geom_filters
|
||||
;
|
||||
@@ -587,14 +593,9 @@ RETURNS TABLE (
|
||||
)
|
||||
AS $$
|
||||
DECLARE
|
||||
geom_colspecs TEXT;
|
||||
geom_tables TEXT;
|
||||
geomrefs_alias TEXT;
|
||||
geomrefs_noalias TEXT;
|
||||
data_colspecs TEXT;
|
||||
data_tables TEXT;
|
||||
obs_wheres TEXT;
|
||||
user_wheres TEXT;
|
||||
procgeom_clauses TEXT;
|
||||
val_clauses TEXT;
|
||||
json_clause TEXT;
|
||||
geomtype TEXT;
|
||||
BEGIN
|
||||
IF params IS NULL OR JSON_ARRAY_LENGTH(params) = 0 OR ARRAY_LENGTH(geomvals, 1) IS NULL THEN
|
||||
@@ -604,222 +605,208 @@ BEGIN
|
||||
|
||||
geomtype := ST_GeometryType(geomvals[1].geom);
|
||||
|
||||
EXECUTE
|
||||
$query$
|
||||
WITH _meta AS (SELECT
|
||||
row_number() over () colid,
|
||||
meta->>'id' id,
|
||||
meta->>'numer_id' numer_id,
|
||||
meta->>'numer_aggregate' numer_aggregate,
|
||||
meta->>'numer_colname' numer_colname,
|
||||
meta->>'numer_geomref_colname' numer_geomref_colname,
|
||||
meta->>'numer_tablename' numer_tablename,
|
||||
meta->>'numer_type' numer_type,
|
||||
meta->>'denom_id' denom_id,
|
||||
meta->>'denom_aggregate' denom_aggregate,
|
||||
meta->>'denom_colname' denom_colname,
|
||||
meta->>'denom_geomref_colname' denom_geomref_colname,
|
||||
meta->>'denom_tablename' denom_tablename,
|
||||
meta->>'denom_type' denom_type,
|
||||
meta->>'denom_reltype' denom_reltype,
|
||||
meta->>'geom_id' geom_id,
|
||||
meta->>'geom_colname' geom_colname,
|
||||
meta->>'geom_geomref_colname' geom_geomref_colname,
|
||||
meta->>'geom_tablename' geom_tablename,
|
||||
meta->>'geom_type' geom_type,
|
||||
meta->>'numer_timespan' numer_timespan,
|
||||
meta->>'geom_timespan' geom_timespan,
|
||||
meta->>'normalization' normalization,
|
||||
meta->>'api_method' api_method,
|
||||
meta->'api_args' api_args
|
||||
FROM UNNEST($1) AS meta
|
||||
)
|
||||
/* Read metadata to generate clauses for query */
|
||||
EXECUTE $query$
|
||||
WITH _meta AS (SELECT
|
||||
row_number() over () colid, *
|
||||
FROM json_to_recordset($1)
|
||||
AS x(id TEXT, numer_id TEXT, numer_aggregate TEXT, numer_colname TEXT,
|
||||
numer_geomref_colname TEXT, numer_tablename TEXT, numer_type TEXT,
|
||||
denom_id TEXT, denom_aggregate TEXT, denom_colname TEXT,
|
||||
denom_geomref_colname TEXT, denom_tablename TEXT, denom_type TEXT,
|
||||
denom_reltype TEXT, geom_id TEXT, geom_colname TEXT,
|
||||
geom_geomref_colname TEXT, geom_tablename TEXT, geom_type TEXT,
|
||||
numer_timespan TEXT, geom_timespan TEXT, normalization TEXT,
|
||||
api_method TEXT, api_args JSON)
|
||||
),
|
||||
|
||||
-- Generate procgeom clauses.
|
||||
-- These join the users' geoms to the relevant geometries for the
|
||||
-- asked-for measures in the Observatory.
|
||||
_procgeom_clauses AS (
|
||||
SELECT
|
||||
String_Agg(DISTINCT
|
||||
CASE
|
||||
-- pass-through geom if user is requesting it only
|
||||
WHEN numer_id IS NULL AND api_method IS NULL THEN
|
||||
geom_tablename || '.' || geom_colname || ' AS geom_' || geom_tablename
|
||||
WHEN cdb_observatory.isnumeric(numer_type) AND api_method IS NULL THEN
|
||||
-- for numeric points with area normalization, include areas of underlying geoms
|
||||
CASE
|
||||
WHEN $2 = 'ST_Point' AND (LOWER(normalization) LIKE 'area%' OR
|
||||
(normalization IS NULL AND numer_aggregate ILIKE 'sum')) THEN
|
||||
' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
|
||||
' AS area_' || geom_tablename
|
||||
-- for numeric areas, include more complex calcs
|
||||
WHEN $2 != 'ST_Point' THEN
|
||||
'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
|
||||
' THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)' ||
|
||||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
|
||||
' THEN 1 ' ||
|
||||
' ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' ||
|
||||
geom_tablename || '.' || geom_colname || ')) / ' ||
|
||||
'Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0) ' ||
|
||||
'END pct_' || geom_tablename
|
||||
ELSE NULL
|
||||
END
|
||||
ELSE NULL END
|
||||
, ', ') AS geom_colspecs,
|
||||
String_Agg(DISTINCT 'observatory.' || geom_tablename, ', ') AS geom_tables,
|
||||
String_Agg(
|
||||
'JSON_Build_Object(' || CASE
|
||||
-- api-delivered values
|
||||
WHEN api_method IS NOT NULL THEN
|
||||
'''value'', ' ||
|
||||
'ARRAY_AGG( ' ||
|
||||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
|
||||
-- numeric internal values
|
||||
WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
'''value'', ' || CASE
|
||||
-- denominated
|
||||
WHEN LOWER(normalization) LIKE 'denom%' OR
|
||||
(normalization IS NULL AND LOWER(denom_reltype) LIKE 'denominator')
|
||||
THEN CASE
|
||||
-- denominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
|
||||
-- denominated polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
|
||||
ELSE
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
|
||||
' * pct_' || geom_tablename || '), 0) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- areaNormalized
|
||||
WHEN LOWER(normalization) LIKE 'area%' OR
|
||||
(normalization IS NULL AND numer_aggregate ILIKE 'sum')
|
||||
THEN CASE
|
||||
-- areaNormalized point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / area_' || geom_tablename || ')'
|
||||
-- areaNormalized polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / area of big geom
|
||||
ELSE
|
||||
--' NULL END '
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- median/average measures with universe
|
||||
WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
|
||||
denom_reltype ILIKE 'universe' AND
|
||||
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation weighted by universe
|
||||
-- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
|
||||
-- (10 * 1000 * 1) / (1000 * 1) = 10
|
||||
-- (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 1010 ≈ 10.4
|
||||
' SUM(' || numer_tablename || '.' || numer_colname ||
|
||||
' * ' || denom_tablename || '.' || denom_colname ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
|
||||
' * pct_' || geom_tablename || '), 0) ' ||
|
||||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- prenormalized for summable measures. point or summable only!
|
||||
WHEN numer_aggregate ILIKE 'sum' AND
|
||||
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation
|
||||
-- SUM (numer * (% user geom in OBS geom))
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * pct_' || geom_tablename ||
|
||||
' ) / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
|
||||
END
|
||||
-- Everything else. Point only!
|
||||
ELSE CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
|
||||
numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
|
||||
END
|
||||
END || '::' || numer_type
|
||||
|
||||
-- categorical/text
|
||||
WHEN LOWER(numer_type) LIKE 'text' THEN
|
||||
'''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
|
||||
|
||||
-- geometry
|
||||
WHEN numer_id IS NULL THEN
|
||||
'''geomref'', geomref_' || geom_tablename || ', ' ||
|
||||
'''value'', ' || 'cdb_observatory.FIRST(geom_' || geom_tablename ||
|
||||
')::TEXT'
|
||||
-- code below will return the intersection of the user's geom and the
|
||||
-- OBS geom
|
||||
--'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
|
||||
-- '.' || geom_colname || '))::TEXT'
|
||||
ELSE ''
|
||||
END || ')', ', ')
|
||||
AS colspecs,
|
||||
|
||||
-- geomrefs, used to separate out rows in case we don't want to merge
|
||||
-- results by user input IDs
|
||||
--
|
||||
-- api_method and geom_tablename are interchangeable since when an
|
||||
-- api_method is passed, geom_tablename is ignored
|
||||
String_Agg(DISTINCT COALESCE(geom_tablename, api_method) || '.' || geom_geomref_colname ||
|
||||
' AS geomref_' || COALESCE(geom_tablename, api_method), ', ') AS geomrefs_alias,
|
||||
|
||||
String_Agg(DISTINCT 'geomref_' || COALESCE(geom_tablename, api_method)
|
||||
, ', ') AS geomrefs_noalias,
|
||||
|
||||
(SELECT String_Agg(DISTINCT CASE
|
||||
-- External API
|
||||
WHEN tablename LIKE 'cdb_observatory.%' THEN
|
||||
'LATERAL (SELECT * FROM ' || tablename || ') ' ||
|
||||
REPLACE(split_part(tablename, '(', 1), 'cdb_observatory.', '')
|
||||
-- Internal obs_ table
|
||||
ELSE 'observatory.' || tablename
|
||||
END, ', ') FROM (
|
||||
SELECT DISTINCT UNNEST(tablenames_ary) tablename FROM (
|
||||
SELECT ARRAY_AGG(numer_tablename) ||
|
||||
ARRAY_AGG(denom_tablename) ||
|
||||
ARRAY_AGG('cdb_observatory.' || api_method || '(_procgeoms.geom' || COALESCE(', ' ||
|
||||
(SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
|
||||
FROM (SELECT json_array_elements(api_args) as val) as vals),
|
||||
'') || ')')
|
||||
tablenames_ary
|
||||
) tablenames_inner
|
||||
) tablenames_outer) data_tables,
|
||||
|
||||
String_Agg(DISTINCT array_to_string(ARRAY[
|
||||
CASE WHEN numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL
|
||||
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
|
||||
'_procgeoms.geomref_' || geom_tablename
|
||||
ELSE NULL END,
|
||||
CASE WHEN numer_tablename != denom_tablename
|
||||
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
|
||||
denom_tablename || '.' || denom_geomref_colname
|
||||
ELSE NULL END
|
||||
], ' AND '),
|
||||
' AND ') FILTER (WHERE numer_tablename != denom_tablename OR
|
||||
(numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL)) AS obs_wheres,
|
||||
|
||||
String_Agg(DISTINCT 'ST_Intersects(' || geom_tablename || '.' || geom_colname
|
||||
|| ', _geoms.geom)', ' AND ')
|
||||
AS user_wheres
|
||||
'_procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (' ||
|
||||
CASE WHEN api_method IS NULL THEN
|
||||
'SELECT _geoms.id, ' ||
|
||||
CASE $3 WHEN True THEN '_geoms.geom'
|
||||
ELSE geom_tablename || '.' || geom_colname
|
||||
END || ' AS geom, ' ||
|
||||
geom_tablename || '.' || geom_geomref_colname || ' AS geomref, ' ||
|
||||
CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
|
||||
' AS area'
|
||||
-- for numeric areas, include more complex calcs
|
||||
ELSE
|
||||
'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')
|
||||
THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
|
||||
WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom)
|
||||
THEN 1
|
||||
ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) /
|
||||
Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)
|
||||
END pct_obs'
|
||||
END || '
|
||||
FROM _geoms, observatory.' || geom_tablename || '
|
||||
WHERE ST_Intersects(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')'
|
||||
-- pass through input geometries for api_method
|
||||
ELSE 'SELECT _geoms.id, _geoms.geom FROM _geoms'
|
||||
END ||
|
||||
') '
|
||||
AS procgeom_clause
|
||||
FROM _meta
|
||||
;
|
||||
$query$
|
||||
INTO geom_colspecs, geom_tables, data_colspecs, geomrefs_alias,
|
||||
geomrefs_noalias, data_tables, obs_wheres, user_wheres
|
||||
USING (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]), geomtype;
|
||||
GROUP BY api_method, geom_tablename, geom_geomref_colname, geom_colname
|
||||
),
|
||||
|
||||
-- Generate val clauses.
|
||||
-- These perform interpolations or other necessary calculations to
|
||||
-- provide values according to users' geometries.
|
||||
_val_clauses AS (
|
||||
SELECT
|
||||
'_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' AS (
|
||||
SELECT _procgeoms.id, ' ||
|
||||
String_Agg('json_build_object(' || CASE
|
||||
-- api-delivered values
|
||||
WHEN api_method IS NOT NULL THEN
|
||||
'''value'', ' ||
|
||||
'ARRAY_AGG( ' ||
|
||||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
|
||||
-- numeric internal values
|
||||
WHEN cdb_observatory.isnumeric(numer_type) THEN
|
||||
'''value'', ' || CASE
|
||||
-- denominated
|
||||
WHEN LOWER(normalization) LIKE 'denom%'
|
||||
THEN CASE
|
||||
WHEN denom_tablename IS NULL THEN ' NULL '
|
||||
-- denominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
|
||||
-- denominated polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
|
||||
ELSE
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs ' ||
|
||||
' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs), 0) '
|
||||
END
|
||||
-- areaNormalized
|
||||
WHEN LOWER(normalization) LIKE 'area%'
|
||||
THEN CASE
|
||||
-- areaNormalized point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
|
||||
' / _procgeoms.area)'
|
||||
-- areaNormalized polygon interpolation
|
||||
-- SUM (numer * (% OBS geom in user geom)) / area of big geom
|
||||
ELSE
|
||||
--' NULL END '
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs' ||
|
||||
' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) '
|
||||
END
|
||||
-- median/average measures with universe
|
||||
WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
|
||||
denom_reltype ILIKE 'universe' AND LOWER(normalization) LIKE 'pre%'
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation weighted by universe
|
||||
-- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
|
||||
-- (10 * 1000 * 1) / (1000 * 1) = 10
|
||||
-- (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 1010 ≈ 10.4
|
||||
' SUM(' || numer_tablename || '.' || numer_colname ||
|
||||
' * ' || denom_tablename || '.' || denom_colname ||
|
||||
' * _procgeoms.pct_obs ' ||
|
||||
' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
|
||||
' * _procgeoms.pct_obs ' || '), 0) '
|
||||
END
|
||||
-- prenormalized for summable measures. point or summable only!
|
||||
WHEN numer_aggregate ILIKE 'sum' AND LOWER(normalization) LIKE 'pre%'
|
||||
THEN CASE
|
||||
-- predenominated point-in-poly
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
-- predenominated polygon interpolation
|
||||
-- SUM (numer * (% user geom in OBS geom))
|
||||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
|
||||
' * _procgeoms.pct_obs) '
|
||||
END
|
||||
-- Everything else. Point only!
|
||||
ELSE CASE
|
||||
WHEN $2 = 'ST_Point' THEN
|
||||
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
|
||||
ELSE
|
||||
' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
|
||||
numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
|
||||
END
|
||||
END || '::' || numer_type
|
||||
|
||||
-- categorical/text
|
||||
WHEN LOWER(numer_type) LIKE 'text' THEN
|
||||
'''value'', ' || 'MODE() WITHIN GROUP (ORDER BY ' || numer_tablename || '.' || numer_colname || ') '
|
||||
-- geometry
|
||||
WHEN numer_id IS NULL THEN
|
||||
'''geomref'', _procgeoms.geomref, ' ||
|
||||
'''value'', ' || 'cdb_observatory.FIRST(_procgeoms.geom)::TEXT'
|
||||
-- code below will return the intersection of the user's geom and the
|
||||
-- OBS geom
|
||||
--'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
|
||||
-- '.' || geom_colname || '))::TEXT'
|
||||
ELSE ''
|
||||
END
|
||||
|| ') val_' || colid, ', ')
|
||||
|| '
|
||||
FROM _procgeoms_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) || ' _procgeoms ' ||
|
||||
Coalesce(String_Agg(DISTINCT
|
||||
Coalesce('LEFT JOIN observatory.' || numer_tablename || ' ON _procgeoms.geomref = observatory.' || numer_tablename || '.' || numer_geomref_colname,
|
||||
', LATERAL (SELECT * FROM cdb_observatory.' || api_method || '(_procgeoms.geom' || Coalesce(', ' ||
|
||||
(SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
|
||||
FROM (SELECT JSON_Array_Elements(api_args) as val) as vals),
|
||||
'') || ')) AS ' || api_method)
|
||||
, ' '), '') ||
|
||||
CASE $3 WHEN True THEN E'\n GROUP BY _procgeoms.id ORDER BY _procgeoms.id '
|
||||
ELSE E'\n GROUP BY _procgeoms.id, _procgeoms.geomref
|
||||
ORDER BY _procgeoms.id, _procgeoms.geomref' END
|
||||
|| ')'
|
||||
AS val_clause,
|
||||
'_vals_' || Coalesce(geom_tablename || '_' || geom_geomref_colname, api_method) AS cte_name
|
||||
FROM _meta
|
||||
GROUP BY geom_tablename, geom_geomref_colname, geom_colname, api_method
|
||||
),
|
||||
|
||||
-- Generate clauses necessary to join together val_clauses
|
||||
_val_joins AS (
|
||||
SELECT String_Agg(a.cte_name || '.id = ' || b.cte_name || '.id ', ' AND ') val_joins
|
||||
FROM _val_clauses a, _val_clauses b
|
||||
WHERE a.cte_name != b.cte_name
|
||||
AND a.cte_name < b.cte_name
|
||||
),
|
||||
|
||||
-- Generate JSON clause. This puts together vals from val_clauses
|
||||
_json_clause AS (SELECT
|
||||
'SELECT ' || cdb_observatory.FIRST(cte_name) || '.id::INT,
|
||||
Array_to_JSON(ARRAY[' || (SELECT String_Agg('val_' || colid, ', ') FROM _meta) || '])
|
||||
FROM ' || String_Agg(cte_name, ', ') ||
|
||||
Coalesce(' WHERE ' || val_joins, '')
|
||||
AS json_clause
|
||||
FROM _val_clauses, _val_joins
|
||||
GROUP BY val_joins
|
||||
)
|
||||
|
||||
SELECT (SELECT String_Agg(procgeom_clause, E',\n ') FROM _procgeom_clauses),
|
||||
(SELECT String_Agg(val_clause, E',\n ') FROM _val_clauses),
|
||||
json_clause
|
||||
FROM _json_clause
|
||||
$query$ INTO
|
||||
procgeom_clauses,
|
||||
val_clauses,
|
||||
json_clause
|
||||
USING params, geomtype, merge;
|
||||
|
||||
/* Execute query */
|
||||
RETURN QUERY EXECUTE format($query$
|
||||
WITH _raw_geoms AS (%s),
|
||||
_geoms AS (SELECT id,
|
||||
@@ -827,27 +814,21 @@ BEGIN
|
||||
THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.00001)), 3)
|
||||
ELSE geom END geom
|
||||
FROM _raw_geoms),
|
||||
_procgeoms AS (SELECT _geoms.id, _geoms.geom %s %s
|
||||
FROM _geoms %s
|
||||
%s
|
||||
)
|
||||
SELECT _procgeoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
|
||||
FROM _procgeoms %s
|
||||
%s
|
||||
GROUP BY _procgeoms.id %s
|
||||
ORDER BY _procgeoms.id
|
||||
$query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1 THEN
|
||||
' SELECT $1[1].val as id, $1[1].geom as geom '
|
||||
ELSE
|
||||
' SELECT val as id, geom FROM UNNEST($1) '
|
||||
-- procgeom_clauses
|
||||
%s,
|
||||
|
||||
-- val_clauses
|
||||
%s
|
||||
|
||||
-- json_clause
|
||||
%s
|
||||
$query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1
|
||||
THEN ' SELECT $1[1].val as id, $1[1].geom as geom '
|
||||
ELSE ' SELECT val as id, geom FROM UNNEST($1) '
|
||||
END,
|
||||
', ' || NullIf(geomrefs_alias, ''),
|
||||
', ' || NullIf(geom_colspecs, ''),
|
||||
', ' || NullIf(geom_tables, ''),
|
||||
'WHERE ' || NullIf( user_wheres, ''),
|
||||
data_colspecs, ', ' || NullIf(data_tables, ''),
|
||||
'WHERE ' || NULLIF(obs_wheres, ''),
|
||||
CASE WHEN merge IS False THEN ', ' || geomrefs_noalias ELSE '' END)
|
||||
String_Agg(procgeom_clauses, E',\n '),
|
||||
String_Agg(val_clauses, E',\n '),
|
||||
json_clause)
|
||||
USING geomvals;
|
||||
RETURN;
|
||||
END;
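The universe-weighted interpolation used above for median/average measures can be checked by hand. The following is a minimal sketch in plain PostgreSQL, not part of the extension: the table `parts` and its three columns are hypothetical stand-ins for the numerator column, the denominator (universe) column and `_procgeoms.pct_obs`, and the two sample rows reuse the figures from the inline comment.

```sql
-- Hypothetical sample rows: two OBS geometries fully covered by the user geometry.
-- numer   = per-geometry median/average value
-- denom   = universe (weight) for that geometry
-- pct_obs = share of the OBS geometry covered by the user geometry
WITH parts (numer, denom, pct_obs) AS (
  VALUES (10::numeric, 1000::numeric, 1::numeric),
         (50::numeric,   10::numeric, 1::numeric)
)
SELECT SUM(numer * denom * pct_obs)
       / NULLIF(SUM(denom * pct_obs), 0) AS weighted_value
FROM parts;
-- 10500 / 1010, i.e. roughly 10.4
```

The `NULLIF(..., 0)` guard mirrors the generated SQL: when the summed weights are zero the result is NULL rather than a division-by-zero error.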
|
||||
@@ -1095,3 +1076,46 @@ BEGIN
|
||||
RETURN result;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql STABLE;
|
||||
|
||||
-- MetadataValidation checks the metadata parameters against the geometry type
|
||||
-- of the data in order to detect unsupported combinations
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory.obs_metadatavalidation(
|
||||
geometry_extent geometry(Geometry, 4326),
|
||||
geometry_type text,
|
||||
params JSON,
|
||||
target_geoms INTEGER DEFAULT NULL
|
||||
)
|
||||
RETURNS TABLE(valid boolean, errors text[]) AS $$
|
||||
DECLARE
|
||||
meta json;
|
||||
errors text[];
|
||||
BEGIN
|
||||
errors := (ARRAY[])::TEXT[];
|
||||
IF geometry_type IN ('ST_Polygon', 'ST_MultiPolygon') THEN
|
||||
FOR meta IN EXECUTE 'SELECT json_array_elements(cdb_observatory.OBS_GetMeta($1, $2, 1, 1, $3))' USING geometry_extent, params, target_geoms
|
||||
LOOP
|
||||
IF (meta->>'normalization' = 'denominated' AND meta->>'denom_id' is NULL) THEN
|
||||
errors := array_append(errors, 'Normalizated measure should have a numerator and a denominator. Please review the provided options.');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IS NULL) THEN
|
||||
errors := array_append(errors, 'For polygon geometries, aggregation is mandatory. Please review the provided options');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IN ('median', 'average') AND meta->>'denom_id' IS NULL) THEN
|
||||
errors := array_append(errors, 'Median or average aggregation for polygons requires a denominator to provide weights. Please review the provided options');
|
||||
END IF;
|
||||
IF (meta->>'numer_aggregate' IN ('median', 'average') AND meta->>'normalization' NOT LIKE 'pre%') THEN
|
||||
errors := array_append(errors, format('Median or average aggregation only supports prenormalized normalization, %s passed. Please review the provided options', meta->>'normalization'));
|
||||
END IF;
|
||||
END LOOP;
|
||||
|
||||
IF CARDINALITY(errors) > 0 THEN
|
||||
RETURN QUERY EXECUTE 'SELECT FALSE, $1' USING errors;
|
||||
ELSE
|
||||
RETURN QUERY SELECT TRUE, ARRAY[]::TEXT[];
|
||||
END IF;
|
||||
ELSE
|
||||
RETURN QUERY SELECT TRUE, ARRAY[]::TEXT[];
|
||||
END IF;
|
||||
RETURN;
|
||||
END;
|
||||
$$ LANGUAGE plpgsql STABLE;
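A call to the validation function defined above might look like the sketch below. It assumes the `cdb_observatory` extension and the test fixtures are installed; the measure id, the normalization and the `target_geoms` value of 500 are illustrative choices, not values taken from this change.

```sql
-- Sketch only: requires the cdb_observatory extension and its fixtures.
SELECT valid, errors
FROM cdb_observatory.obs_metadatavalidation(
  cdb_observatory._TestArea(),   -- geometry_extent
  'ST_Polygon',                  -- geometry_type; only polygon types are checked
  '[{"numer_id": "us.census.acs.B01003001", "normalization": "denominated"}]'::json,
  500                            -- target_geoms, forwarded to OBS_GetMeta (illustrative)
);
```

For any geometry type other than `ST_Polygon` or `ST_MultiPolygon`, the function body returns `TRUE` with an empty error array without inspecting the metadata.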
|
||||
|
||||
@@ -181,6 +181,86 @@ BEGIN
|
||||
END
|
||||
$$ LANGUAGE plpgsql;
|
||||
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory._OBS_GetNumerators(
|
||||
bounds GEOMETRY DEFAULT NULL,
|
||||
section_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
subsection_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
other_tags TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
ids TEXT[] DEFAULT ARRAY[]::TEXT[],
|
||||
name TEXT DEFAULT NULL,
|
||||
denom_id TEXT DEFAULT '',
|
||||
geom_id TEXT DEFAULT '',
|
||||
timespan TEXT DEFAULT ''
|
||||
) RETURNS TABLE (
|
||||
numer_id TEXT,
|
||||
numer_name TEXT,
|
||||
numer_description TEXT,
|
||||
numer_weight NUMERIC,
|
||||
numer_license TEXT,
|
||||
numer_source TEXT,
|
||||
numer_type TEXT,
|
||||
numer_aggregate TEXT,
|
||||
numer_extra JSONB,
|
||||
numer_tags JSONB,
|
||||
valid_denom BOOLEAN,
|
||||
valid_geom BOOLEAN,
|
||||
valid_timespan BOOLEAN
|
||||
) AS $$
|
||||
DECLARE
|
||||
where_clause_elements TEXT[];
|
||||
geom_clause TEXT;
|
||||
where_clause TEXT;
|
||||
BEGIN
|
||||
where_clause_elements := (ARRAY[])::TEXT[];
|
||||
where_clause := '';
|
||||
|
||||
IF bounds IS NOT NULL THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$ST_Intersects(the_geom, '%s'::geometry)$data$, bounds));
|
||||
END IF;
|
||||
IF cardinality(section_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, section_tags));
|
||||
END IF;
|
||||
IF cardinality(subsection_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, subsection_tags));
|
||||
END IF;
|
||||
IF cardinality(other_tags) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_tags ?| '%s'$data$, other_tags));
|
||||
END IF;
|
||||
IF cardinality(ids) > 0 THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_id = ANY('%s'::text[])$data$, ids));
|
||||
END IF;
|
||||
IF name IS NOT NULL AND name != '' THEN
|
||||
where_clause_elements := array_append(where_clause_elements, format($data$numer_name ilike '%%%s%%'$data$, name));
|
||||
END IF;
|
||||
IF cardinality(where_clause_elements) > 0 THEN
|
||||
where_clause := format($clause$WHERE %s$clause$, array_to_string(where_clause_elements, ' AND '));
|
||||
END IF;
|
||||
RAISE DEBUG '%', array_to_string(where_clause_elements, ' AND ');
|
||||
|
||||
RETURN QUERY
|
||||
EXECUTE
|
||||
format($string$
|
||||
SELECT numer_id::TEXT,
|
||||
numer_name::TEXT,
|
||||
numer_description::TEXT,
|
||||
numer_weight::NUMERIC,
|
||||
NULL::TEXT license,
|
||||
NULL::TEXT source,
|
||||
numer_type numer_type,
|
||||
numer_aggregate numer_aggregate,
|
||||
numer_extra::JSONB numer_extra,
|
||||
numer_tags numer_tags,
|
||||
$1 = ANY(denoms) valid_denom,
|
||||
$2 = ANY(geoms) valid_geom,
|
||||
$3 = ANY(timespans) valid_timespan
|
||||
FROM observatory.obs_meta_numer
|
||||
%s
|
||||
$string$, where_clause)
|
||||
USING denom_id, geom_id, timespan;
|
||||
RETURN;
|
||||
END
|
||||
$$ LANGUAGE plpgsql;
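As a usage sketch for `_OBS_GetNumerators`, the call below uses named arguments so that only the filters of interest are supplied. It assumes the extension and fixtures are installed; the name fragment, geometry id and timespan string are illustrative values.

```sql
-- Sketch only: requires the cdb_observatory extension and its fixtures.
SELECT numer_id, numer_name, valid_denom, valid_geom, valid_timespan
FROM cdb_observatory._OBS_GetNumerators(
  bounds   => cdb_observatory._TestPoint(),       -- filters rows via ST_Intersects
  name     => 'population',                       -- matched with ILIKE '%population%'
  geom_id  => 'us.census.tiger.block_group',      -- only affects the valid_geom flag
  timespan => '2010 - 2014'                       -- only affects the valid_timespan flag
);
```

Note that `denom_id`, `geom_id` and `timespan` do not filter rows; they only populate the `valid_*` flags, whereas `bounds`, the tag arrays, `ids` and `name` go into the WHERE clause.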
|
||||
|
||||
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableDenominators(
|
||||
bounds GEOMETRY DEFAULT NULL,
|
||||
filter_tags TEXT[] DEFAULT NULL,
|
||||
@@ -243,7 +323,8 @@ CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetAvailableGeometries(
|
||||
filter_tags TEXT[] DEFAULT NULL,
|
||||
numer_id TEXT DEFAULT NULL,
|
||||
denom_id TEXT DEFAULT NULL,
|
||||
timespan TEXT DEFAULT NULL
|
||||
timespan TEXT DEFAULT NULL,
|
||||
number_geoms INTEGER DEFAULT NULL
|
||||
) RETURNS TABLE (
|
||||
geom_id TEXT,
|
||||
geom_name TEXT,
|
||||
@@ -292,21 +373,34 @@ BEGIN
|
||||
geom_type::TEXT,
|
||||
geom_extra::JSONB,
|
||||
geom_tags::JSONB,
|
||||
$1 = ANY(numers) valid_numer,
|
||||
$2 = ANY(denoms) valid_denom,
|
||||
$3 = ANY(timespans) valid_timespan
|
||||
FROM observatory.obs_meta_geom
|
||||
$1 = ANY(numers) valid_numer,
|
||||
$2 = ANY(denoms) valid_denom,
|
||||
CASE WHEN $3 IS NOT NULL AND $3 != '' THEN
|
||||
-- Look for geometries where either a) the geometry's own timespans or b) the timespans of numerators linked to
|
||||
-- that geometry include the timespan passed. For example, for '2015 - 2015' this matches geometries with that
|
||||
-- timespan, or with a linked numerator that has '2015 - 2015' as one of its valid timespans.
|
||||
-- If a numer_id is passed, only numerators with that id are considered.
|
||||
CASE WHEN $1 IS NOT NULL AND $1 != '' THEN
|
||||
EXISTS (SELECT 1 FROM observatory.obs_meta_geom_numer_timespan onu WHERE o.geom_id = onu.geom_id AND onu.numer_id = $1 AND ($3 = ANY(onu.timespans) OR $3 IN (select(unnest(o.timespans)))))
|
||||
ELSE
|
||||
EXISTS (SELECT 1 FROM observatory.obs_meta_geom_numer_timespan onu WHERE o.geom_id = onu.geom_id AND ($3 = ANY(onu.timespans) OR $3 IN (select(unnest(o.timespans)))))
|
||||
END
|
||||
ELSE
|
||||
false
|
||||
END as valid_timespan
|
||||
FROM observatory.obs_meta_geom o
|
||||
WHERE %s (geom_tags ?& $4 OR CARDINALITY($4) = 0)
|
||||
), scores AS (
|
||||
SELECT * FROM cdb_observatory._OBS_GetGeometryScores($5,
|
||||
(SELECT ARRAY_AGG(geom_id) FROM available_geoms)
|
||||
SELECT * FROM cdb_observatory._OBS_GetGeometryScores(bounds => $5,
|
||||
filter_geom_ids => (SELECT ARRAY_AGG(geom_id) FROM available_geoms),
|
||||
desired_num_geoms => $6::integer
|
||||
)
|
||||
) SELECT available_geoms.*, score, numtiles, notnull_percent, numgeoms,
|
||||
) SELECT DISTINCT ON (geom_id) available_geoms.*, score, numtiles, notnull_percent, numgeoms,
|
||||
percentfill, estnumgeoms, meanmediansize
|
||||
FROM available_geoms, scores
|
||||
WHERE available_geoms.geom_id = scores.column_id
|
||||
$string$, geom_clause)
|
||||
USING numer_id, denom_id, timespan, filter_tags, bounds;
|
||||
USING numer_id, denom_id, timespan, filter_tags, bounds, number_geoms;
|
||||
RETURN;
|
||||
END
|
||||
$$ LANGUAGE plpgsql;
|
||||
|
||||
@@ -153,6 +153,9 @@ t
|
||||
obs_getmeta_suggested_name
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name_implicit_area
|
||||
t
|
||||
(1 row)
|
||||
obs_getmeta_suggested_name_area
|
||||
t
|
||||
(1 row)
|
||||
@@ -207,6 +210,9 @@ t|t|t
|
||||
id|data_polygon_measure_one_null|data_polygon_measure_two_null
|
||||
t|t|t
|
||||
(1 row)
|
||||
id|data_polygon_measure_one_null|data_polygon_measure_two_null
|
||||
t|t|t
|
||||
(1 row)
|
||||
id|data_polygon_measure_one_predenom|data_polygon_measure_two_predenom
|
||||
t|t|t
|
||||
(1 row)
|
||||
@@ -298,3 +304,12 @@ tract_sample|tract_max_error|tract_avg_error|tract_min_error
|
||||
no_bg_point_error
|
||||
t
|
||||
(1 row)
|
||||
valid|errors
|
||||
t|{}
|
||||
(1 row)
|
||||
valid|errors
|
||||
f|{"Median or average aggregation only supports prenormalized normalization, denominated passed. Please review the provided options"}
|
||||
(1 row)
|
||||
valid|errors
|
||||
f|{"Normalizated measure should have a numerator and a denominator. Please review the provided options."}
|
||||
(1 row)
|
||||
|
||||
@@ -48,6 +48,63 @@ t
|
||||
_obs_getavailablenumerators_no_total_pop_1996
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_all
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_nyc_point
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_usa_extents
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_usa_pop_not_in_zero_point
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_usa_pop_in_age_gender_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_pop_in_income_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_male_pop_denom_by_total_pop
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_income_denom_by_total_pop
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_zillow_at_zcta5
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_zillow_at_block_group
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_2010_2014
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_no_total_pop_1996
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_name
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_section
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_in_canada
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_in_employment_subsection
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_by_id
|
||||
t
|
||||
(1 row)
|
||||
_obs_getnumerators_total_pop_not_with_other_id
|
||||
t
|
||||
(1 row)
|
||||
_obs_getavailabledenominators_usa_pop_in_all
|
||||
t
|
||||
(1 row)
|
||||
|
||||
1
src/pg/test/fixtures/drop_fixtures.sql
vendored
@@ -12,6 +12,7 @@ DROP TABLE IF EXISTS observatory.obs_meta_numer;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_denom;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_geom;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_timespan;
|
||||
DROP TABLE IF EXISTS observatory.obs_meta_geom_numer_timespan;
|
||||
DROP TABLE IF EXISTS observatory.obs_column_table_tile;
|
||||
DROP TABLE IF EXISTS observatory.obs_column_table_tile_simple;
|
||||
DROP TABLE IF EXISTS observatory.obs_78fb6c1d6ff6505225175922c2c389ce48d7632c;
|
||||
|
||||
175993
src/pg/test/fixtures/load_fixtures.sql
vendored
File diff suppressed because one or more lines are too long
@@ -268,7 +268,7 @@ SELECT
|
||||
(meta->0->>'numer_name') = 'Total Population' numer_name,
|
||||
(meta->0->>'denom_id') IS NULL denom_id,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'area' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes one partial measure with "best" metadata
|
||||
@@ -290,7 +290,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for polygon completes one partial measure with "best" metadata
|
||||
@@ -308,7 +308,7 @@ SELECT
|
||||
(meta->0->>'numer_name') = 'Total Population' numer_name,
|
||||
(meta->0->>'denom_id') IS NULL denom_id,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'area' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for polygon completes one partial measure with "best" metadata
|
||||
@@ -330,7 +330,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with "best"
|
||||
@@ -352,7 +352,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.block_group' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization,
|
||||
(meta->0->>'normalization') = 'denominated' normalization,
|
||||
(meta->1->>'id')::integer = 1 id,
|
||||
(meta->1->>'numer_id') = 'us.census.acs.B01001002' numer_id,
|
||||
(meta->1->>'timespan_rank')::integer = 1 timespan_rank,
|
||||
@@ -367,7 +367,7 @@ SELECT
|
||||
(meta->1->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->1->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->1->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
|
||||
(meta->1->>'normalization') IS NULL normalization
|
||||
(meta->1->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with "best" metadata
|
||||
@@ -389,7 +389,7 @@ SELECT
|
||||
(meta->0->>'denom_type') = 'Numeric' denom_type,
|
||||
(meta->0->>'denom_name') = 'Total Population' denom_name,
|
||||
(meta->0->>'geom_id') = 'us.census.tiger.census_tract' geom_id,
|
||||
(meta->0->>'normalization') IS NULL normalization
|
||||
(meta->0->>'normalization') = 'denominated' normalization
|
||||
FROM meta;
|
||||
|
||||
-- OBS_GetMeta for point completes several partial measures with conflicting
|
||||
@@ -400,9 +400,14 @@ AS obs_getmeta_conflicting_metadata;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001"}]'
|
||||
'[{"numer_id": "us.census.acs.B01003001", "normalization": "predenom"}]'
|
||||
)->0->>'suggested_name' = 'total_pop_2010_2014' obs_getmeta_suggested_name;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request with implicit area norm
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001"}]'
|
||||
)->0->>'suggested_name' = 'total_pop_per_sq_km_2010_2014' obs_getmeta_suggested_name_implicit_area;
|
||||
|
||||
-- OBS_GetMeta provides suggested name for simple meta request with area norm
|
||||
SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
|
||||
'[{"numer_id": "us.census.acs.B01003001", "normalization": "area"}]'
|
||||
@@ -591,6 +596,18 @@ SELECT id = 1 id,
|
||||
abs((data->1->>'value')::Numeric - 0.4902) / 0.4902 < 0.001 data_polygon_measure_two_null
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with two measures, one of which returns null
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"numer_id": "us.census.acs.B19013001_quantile"}, {"numer_id": "us.census.acs.B01001002"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta)))
|
||||
SELECT id = 1 id,
|
||||
(data->0->>'value') is NULL data_polygon_measure_one_null,
|
||||
abs((data->1->>'value')::Numeric - 0.4902) / 0.4902 < 0.001 data_polygon_measure_two_null
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with two standard measures predenom normalization
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
@@ -677,25 +694,25 @@ FROM data;
|
||||
-- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta), false))
|
||||
SELECT every(id = 1) is TRUE id,
|
||||
count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
|
||||
abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop
|
||||
abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop
|
||||
FROM data;
|
||||
|
||||
-- OBS_GetData/OBS_GetMeta by geom with polygons inside a polygon + one measure + one text
|
||||
WITH
|
||||
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
'[{"geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.acs.B01003001", "normalization": "predenom", "geom_id": "us.census.tiger.block_group"}, {"numer_id": "us.census.tiger.name", "geom_id": "us.census.tiger.block_group"}]') meta),
|
||||
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
|
||||
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
|
||||
(SELECT meta FROM meta), false))
|
||||
SELECT every(id = 1) is TRUE id,
|
||||
count(distinct (data->0->>'value')::geometry) = 16 correct_num_geoms,
|
||||
abs(sum((data->1->>'value')::numeric) - 15787) / 15787 < 0.001 correct_pop,
|
||||
abs(sum((data->1->>'value')::numeric) - 12327) / 12327 < 0.001 correct_pop,
|
||||
array_agg(distinct data->2->>'value') = '{"Block Group 1","Block Group 2","Block Group 3","Block Group 4","Block Group 5"}' correct_bg_names
|
||||
FROM data;
|
||||
|
||||
@@ -956,3 +973,9 @@ WITH _geoms AS (
|
||||
FROM geoms, results
|
||||
WHERE cartodb_id = id
|
||||
;
|
||||
|
||||
-- OBS_MetadataValidation
|
||||
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B01003001","denom_id": null,"normalization": "prenormalized","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B25058001","denom_id": null,"normalization": "denominated","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
SELECT * FROM cdb_observatory.OBS_MetadataValidation(NULL, 'ST_Polygon', '[{"numer_id": "us.census.acs.B15003001","denom_id": null,"normalization": "denominated","geom_id": null,"numer_timespan": "2010 - 2014"}]'::json, 500);
|
||||
|
||||
@@ -119,6 +119,142 @@ FROM cdb_observatory.OBS_GetAvailableNumerators(
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getavailablenumerators_no_total_pop_1996;
|
||||
|
||||
--
|
||||
-- _OBS_GetNumerators tests
|
||||
--
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators())
|
||||
AS _obs_getnumerators_usa_pop_in_all;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_usa_pop_in_nyc_point;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakeEnvelope(
|
||||
-169.8046875, 21.289374355860424,
|
||||
-47.4609375, 72.0739114882038
|
||||
), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_usa_pop_in_usa_extents;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(0, 0), 4326),
|
||||
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
)) AS _obs_getnumerators_no_usa_pop_not_in_zero_point;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
subsection_tags => ARRAY['subsection/tags.age_gender']
|
||||
))
|
||||
AS _obs_getnumerators_usa_pop_in_age_gender_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
subsection_tags => ARRAY['subsection/tags.income']
|
||||
))
|
||||
AS _obs_getnumerators_no_pop_in_income_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01001002' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
denom_id => 'us.census.acs.B01003001'
|
||||
) WHERE valid_denom = True)
|
||||
AS _obs_getnumerators_male_pop_denom_by_total_pop;
|
||||
|
||||
SELECT 'us.census.acs.B19013001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
denom_id => 'us.census.acs.B01003001'
|
||||
) WHERE valid_denom = True)
|
||||
AS _obs_getnumerators_no_income_denom_by_total_pop;
|
||||
|
||||
SELECT 'us.zillow.AllHomes_Zhvi' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
geom_id => 'us.census.tiger.zcta5'
|
||||
) WHERE valid_geom = True)
|
||||
AS _obs_getnumerators_zillow_at_zcta5;
|
||||
|
||||
SELECT 'us.zillow.AllHomes_Zhvi' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
geom_id => 'us.census.tiger.block_group'
|
||||
) WHERE valid_geom = True)
|
||||
AS _obs_getnumerators_no_zillow_at_block_group;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
timespan => '2010 - 2014'
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getnumerators_total_pop_2010_2014;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
timespan => '1996'
|
||||
) WHERE valid_timespan = True)
|
||||
AS _obs_getnumerators_no_total_pop_1996;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
name => 'tot'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_name;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_section;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.ca}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_in_canada;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}',
|
||||
subsection_tags => '{subsection/tags.age_gender}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
section_tags => '{section/tags.united_states}',
|
||||
subsection_tags => '{subsection/tags.employment}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_in_employment_subsection;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
ids => '{us.census.acs.B01003001}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_by_id;
|
||||
|
||||
SELECT 'us.census.acs.B01003001' NOT IN (SELECT numer_id
|
||||
FROM cdb_observatory._OBS_GetNumerators(
|
||||
ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326),
|
||||
ids => '{us.census.acs.B01003002}'
|
||||
))
|
||||
AS _obs_getnumerators_total_pop_not_with_other_id;
|
||||
|
||||
--
|
||||
-- OBS_GetAvailableDenominators tests
|
||||
--
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
from nose.tools import assert_equal, assert_is_not_none
|
||||
from nose.plugins.skip import SkipTest
|
||||
from nose_parameterized import parameterized
|
||||
|
||||
from itertools import izip_longest
|
||||
@@ -55,84 +54,50 @@ SKIP_COLUMNS = set([
|
||||
u'us.census.tiger.mtfcc',
|
||||
u'whosonfirst.wof_county_name',
|
||||
u'whosonfirst.wof_region_name',
|
||||
'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
, 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
, 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
, 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
, 'uk.ons.LC3202WA0007'
|
||||
, 'uk.ons.LC3202WA0010'
|
||||
, 'uk.ons.LC3202WA0004'
|
||||
, 'uk.ons.LC3204WA0004'
|
||||
, 'uk.ons.LC3204WA0007'
|
||||
, 'uk.ons.LC3204WA0010'
|
||||
, 'br.geo.subdistritos_name'
|
||||
u'fr.insee.P12_RP_CHOS',
|
||||
u'fr.insee.P12_RP_HABFOR',
|
||||
u'fr.insee.P12_RP_EAUCH',
|
||||
u'fr.insee.P12_RP_BDWC',
|
||||
u'fr.insee.P12_RP_MIDUR',
|
||||
u'fr.insee.P12_RP_CLIM',
|
||||
u'fr.insee.P12_RP_MIBOIS',
|
||||
u'fr.insee.P12_RP_CASE',
|
||||
u'fr.insee.P12_RP_TTEGOU',
|
||||
u'fr.insee.P12_RP_ELEC',
|
||||
u'fr.insee.P12_ACTOCC15P_ILT45D',
|
||||
u'fr.insee.P12_RP_CHOS',
|
||||
u'fr.insee.P12_RP_HABFOR',
|
||||
u'fr.insee.P12_RP_EAUCH',
|
||||
u'fr.insee.P12_RP_BDWC',
|
||||
u'fr.insee.P12_RP_MIDUR',
|
||||
u'fr.insee.P12_RP_CLIM',
|
||||
u'fr.insee.P12_RP_MIBOIS',
|
||||
u'fr.insee.P12_RP_CASE',
|
||||
u'fr.insee.P12_RP_TTEGOU',
|
||||
u'fr.insee.P12_RP_ELEC',
|
||||
u'fr.insee.P12_ACTOCC15P_ILT45D',
|
||||
u'uk.ons.LC3202WA0007',
|
||||
u'uk.ons.LC3202WA0010',
|
||||
u'uk.ons.LC3202WA0004',
|
||||
u'uk.ons.LC3204WA0004',
|
||||
u'uk.ons.LC3204WA0007',
|
||||
u'uk.ons.LC3204WA0010',
|
||||
u'br.geo.subdistritos_name'
|
||||
])
|
||||
|
||||
MEASURE_COLUMNS = query('''
|
||||
SELECT ARRAY_AGG(DISTINCT numer_id) numer_ids,
|
||||
SELECT cdb_observatory.FIRST(distinct numer_id) numer_ids,
|
||||
numer_aggregate,
|
||||
denom_reltype,
|
||||
section_tags
|
||||
denom_reltype
|
||||
FROM observatory.obs_meta
|
||||
WHERE numer_weight > 0
|
||||
AND numer_id NOT IN ('{skip}')
|
||||
AND numer_id NOT LIKE 'eu.%' --Skipping Eurostat
|
||||
AND section_tags IS NOT NULL
|
||||
AND subsection_tags IS NOT NULL
|
||||
GROUP BY numer_aggregate, section_tags, denom_reltype
|
||||
GROUP BY numer_id, numer_aggregate, denom_reltype
|
||||
'''.format(skip="', '".join(SKIP_COLUMNS))).fetchall()
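The `NOT IN ('{skip}')` placeholder is filled by joining the skip list with `', '`, so the surrounding quotes in the template close the first and last items. A small sketch of that string assembly, using a two-item subset of `SKIP_COLUMNS` and a hypothetical `skip_clause` helper (written for Python 3, while the test module itself targets Python 2):

```python
SKIP_TEMPLATE = "AND numer_id NOT IN ('{skip}')"


def skip_clause(skip_columns):
    """Render the exclusion clause for a collection of column ids.

    sorted() is used here only to make the output deterministic; the
    original joins a set, whose iteration order is arbitrary.
    """
    return SKIP_TEMPLATE.format(skip="', '".join(sorted(skip_columns)))


clause = skip_clause({'uk.ons.LC3202WA0007', 'fr.insee.P12_RP_CHOS'})
print(clause)
```

Interpolating values into SQL this way is reasonable for a fixed list of constants in a test module; for values that come from outside the code, bound query parameters would be the safer choice.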
|
||||
|
||||
#CATEGORY_COLUMNS = query('''
|
||||
#SELECT distinct numer_id
|
||||
#FROM observatory.obs_meta
|
||||
#WHERE numer_type ILIKE 'text'
|
||||
#AND numer_weight > 0
|
||||
#''').fetchall()
|
||||
#
|
||||
#BOUNDARY_COLUMNS = query('''
|
||||
#SELECT id FROM observatory.obs_column
|
||||
#WHERE type ILIKE 'geometry'
|
||||
#AND weight > 0
|
||||
#''').fetchall()
|
||||
#
|
||||
#US_CENSUS_MEASURE_COLUMNS = query('''
|
||||
#SELECT distinct numer_name
|
||||
#FROM observatory.obs_meta
|
||||
#WHERE numer_type ILIKE 'numeric'
|
||||
#AND 'us.census.acs' = ANY (subsection_tags)
|
||||
#AND numer_weight > 0
|
||||
#''').fetchall()
|
||||
|
||||
|
||||
#def default_geometry_id(column_id):
|
||||
# '''
|
||||
# Returns default test point for the column_id.
|
||||
# '''
|
||||
# if column_id == 'whosonfirst.wof_disputed_geom':
|
||||
# return 'ST_SetSRID(ST_MakePoint(76.57, 33.78), 4326)'
|
||||
# elif column_id == 'whosonfirst.wof_marinearea_geom':
|
||||
# return 'ST_SetSRID(ST_MakePoint(-68.47, 43.33), 4326)'
|
||||
# elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
# 'us.census.tiger.school_district_secondary',
|
||||
# 'us.census.tiger.school_district_elementary_clipped',
|
||||
# 'us.census.tiger.school_district_secondary_clipped'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-73.7067, 40.7025), 4326)'
|
||||
# elif column_id.startswith('es.ine'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-2.51141249535454, 42.8226119029222), 4326)'
|
||||
# elif column_id.startswith('us.zillow'):
|
||||
# return 'ST_SetSRID(ST_MakePoint(-81.3544048197256, 28.3305906291771), 4326)'
|
||||
# elif column_id.startswith('ca.'):
|
||||
# return ''
|
||||
# else:
|
||||
# return 'ST_SetSRID(ST_MakePoint(-73.9, 40.7), 4326)'
|
||||
|
||||
|
||||
def default_lonlat(column_id):
|
||||
'''
|
||||
@@ -142,11 +107,6 @@ def default_lonlat(column_id):
|
||||
return (76.57, 33.78)
|
||||
elif column_id == 'whosonfirst.wof_marinearea_geom':
|
||||
return (-68.47, 43.33)
|
||||
elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
'us.census.tiger.school_district_secondary',
|
||||
'us.census.tiger.school_district_elementary_clipped',
|
||||
'us.census.tiger.school_district_secondary_clipped'):
|
||||
return (40.7025, -73.7067)
|
||||
elif column_id.startswith('uk'):
|
||||
if 'WA' in column_id:
|
||||
return (51.46844551219723, -3.184833526611328)
|
||||
@@ -158,30 +118,19 @@ def default_lonlat(column_id):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('mx.'):
|
||||
return (19.41347699386547, -99.17019367218018)
|
||||
elif column_id.startswith('th.'):
|
||||
return (13.725377712079784, 100.49263000488281)
|
||||
# cols for French Guiana only
|
||||
#elif column_id in ('fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
# , 'fr.insee.P12_ACTOCC15P_ILT45D'
|
||||
# , 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
|
||||
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
|
||||
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
|
||||
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
|
||||
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
|
||||
# , 'fr.insee.P12_ACTOCC15P_ILT45D'):
|
||||
# return (4.938408371206558, -52.32908248901367)
|
||||
elif column_id.startswith('fr.'):
|
||||
return (48.860875144709475, 2.3613739013671875)
|
||||
elif column_id.startswith('ca.'):
|
||||
return (43.65594991256823, -79.37965393066406)
|
||||
elif column_id in ('us.census.tiger.school_district_elementary',
|
||||
'us.census.tiger.school_district_secondary',
|
||||
'us.census.tiger.school_district_elementary_clipped',
|
||||
'us.census.tiger.school_district_secondary_clipped',
|
||||
'us.census.tiger.school_district_elementary_geoname',
|
||||
'us.census.tiger.school_district_secondary_geoname'):
|
||||
return (40.7025, -73.7067)
|
||||
elif column_id.startswith('us.census.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.dma.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.ihme.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.bls.'):
|
||||
@@ -192,8 +141,6 @@ def default_lonlat(column_id):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('us.epa.'):
|
||||
return (28.3305906291771, -81.3544048197256)
|
||||
elif column_id.startswith('eu.'):
|
||||
raise SkipTest('No tests for Eurostat!')
|
||||
elif column_id.startswith('br.'):
|
||||
return (-23.53, -46.63)
|
||||
elif column_id.startswith('au.'):
|
||||
@@ -202,56 +149,65 @@ def default_lonlat(column_id):
|
||||
raise Exception('No catalog point set for {}'.format(column_id))
|
||||
|
||||
|
||||
def default_point(column_id):
|
||||
lat, lng = default_lonlat(column_id)
|
||||
def default_point(test_point):
|
||||
lat, lng = test_point
|
||||
return 'ST_SetSRID(ST_MakePoint({lng}, {lat}), 4326)'.format(
|
||||
lat=lat, lng=lng)
|
||||
|
||||
|
||||
def default_area(column_id):
|
||||
def default_area(test_point):
|
||||
'''
|
||||
Returns default test area for the test_point
|
||||
'''
|
||||
point = default_point(column_id)
|
||||
point = default_point(test_point)
|
||||
area = 'ST_Transform(ST_Buffer(ST_Transform({point}, 3857), 250), 4326)'.format(
|
||||
point=point)
|
||||
return area
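After this change both helpers take a `(lat, lng)` tuple instead of a column id. The sketch below restates them as standalone Python 3 functions to show the SQL text they produce; the names `point_sql` and `area_sql` and the `radius` parameter are assumptions for illustration.

```python
def point_sql(test_point):
    """Render a (lat, lng) tuple as a PostGIS point expression.

    ST_MakePoint takes x (longitude) first, so the tuple order is
    swapped when it is formatted into the SQL text.
    """
    lat, lng = test_point
    return 'ST_SetSRID(ST_MakePoint({lng}, {lat}), 4326)'.format(
        lat=lat, lng=lng)


def area_sql(test_point, radius=250):
    """Render a buffered polygon around the point.

    The point is projected to EPSG:3857 so the buffer radius is given in
    projected units (nominally meters), then transformed back to 4326.
    """
    return ('ST_Transform(ST_Buffer(ST_Transform({point}, 3857), '
            '{radius}), 4326)').format(point=point_sql(test_point),
                                       radius=radius)


print(point_sql((40.7, -73.9)))
print(area_sql((40.7, -73.9)))
```

Passing the point directly means the lookup in `default_lonlat` happens once per group of columns rather than once per helper call.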
|
||||
|
||||
#@parameterized(US_CENSUS_MEASURE_COLUMNS)
|
||||
#def test_get_us_census_measure_points(name):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetUSCensusMeasure({point}, '{name}')
|
||||
# '''.format(name=name.replace("'", "''"),
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point('')))
|
||||
# rows = resp.fetchall()
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
def filter_points():
|
||||
return MEASURE_COLUMNS
|
||||
|
||||
|
||||
def grouped_measure_columns():
|
||||
for numer_ids, numer_aggregate, denom_reltype, section_tags in MEASURE_COLUMNS:
|
||||
def filter_areas():
|
||||
filtered = []
|
||||
for numer_ids, numer_aggregate, denom_reltype in MEASURE_COLUMNS:
|
||||
if numer_aggregate is None or numer_aggregate.lower() not in ('sum', 'median', 'average'):
|
||||
continue
|
||||
if numer_aggregate.lower() in ('median', 'average') \
|
||||
and (denom_reltype is None or denom_reltype.lower() != 'universe'):
|
||||
continue
|
||||
filtered.append((numer_ids, numer_aggregate, denom_reltype))
|
||||
|
||||
return filtered
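The area filter keeps a measure only when it can be aggregated over a polygon: sums always qualify, while medians and averages qualify only if their denominator relationship is `universe`. A self-contained sketch of that rule, using made-up `example.*` column ids (assumptions, not real catalog entries):

```python
def filter_area_rows(rows):
    """Keep (numer_id, numer_aggregate, denom_reltype) rows usable for areas."""
    kept = []
    for numer_id, numer_aggregate, denom_reltype in rows:
        aggregate = (numer_aggregate or '').lower()
        if aggregate not in ('sum', 'median', 'average'):
            continue  # unknown or missing aggregate type
        if aggregate in ('median', 'average') \
                and (denom_reltype or '').lower() != 'universe':
            continue  # no universe denominator to weight by
        kept.append((numer_id, numer_aggregate, denom_reltype))
    return kept


rows = [
    ('example.total_pop', 'sum', None),
    ('example.median_income', 'median', 'universe'),
    ('example.median_rent', 'median', 'denominator'),
    ('example.category', None, None),
]
print([row[0] for row in filter_area_rows(rows)])
```

Moving this filter out of the test body and in front of the grouping step means excluded columns never appear in a parameterized case at all, rather than producing a test that silently returns.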
|
||||
|
||||
|
||||
def grouped_measure_columns(filtered_columns):
|
||||
groupbypoint = dict()
|
||||
for row in filtered_columns:
|
||||
numer_ids = row[0]
|
||||
point = default_lonlat(numer_ids)
|
||||
if point in groupbypoint:
|
||||
groupbypoint[point].append(numer_ids)
|
||||
else:
|
||||
groupbypoint[point] = [numer_ids]
|
||||
|
||||
for point, numer_ids in groupbypoint.iteritems():
|
||||
for colgroup in grouper(numer_ids, 50):
|
||||
yield [c for c in colgroup if c], numer_aggregate, denom_reltype, section_tags
|
||||
yield point, [c for c in colgroup if c]
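The new generator first buckets column ids by their test point and then splits each bucket into chunks of at most 50, so each parameterized test queries one location with a bounded number of measures. The sketch below shows the same idea in Python 3. The `grouper` helper is not visible in this diff; here it is assumed to be the standard `itertools` recipe, which pads the last chunk with `None`, and `lookup_point` stands in for `default_lonlat`.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def group_by_point(rows, lookup_point, chunk_size=50):
    """Yield (point, [numer_id, ...]) pairs, at most chunk_size ids each."""
    by_point = {}
    for row in rows:
        numer_id = row[0]
        by_point.setdefault(lookup_point(numer_id), []).append(numer_id)
    for point, numer_ids in by_point.items():
        for chunk in grouper(numer_ids, chunk_size):
            # Drop the None padding added to the final chunk.
            yield point, [c for c in chunk if c]


def lookup_point(numer_id):
    # Hypothetical stand-in for default_lonlat: two fixed test points.
    return (48.86, 2.36) if numer_id.startswith('fr.') else (28.33, -81.35)


rows = [('us.a',), ('us.b',), ('fr.x',), ('us.c',)]
for point, ids in group_by_point(rows, lookup_point, chunk_size=2):
    print(point, ids)
```

Grouping by point rather than by aggregate and section tags is what lets the test signatures shrink to `(point, numer_ids)`.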
|
||||
|
||||
|
||||
@parameterized(grouped_measure_columns())
|
||||
def test_get_measure_points(numer_ids, numer_aggregate, denom_reltype, section_tags):
|
||||
_test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_point(numer_ids[0]))
|
||||
@parameterized(grouped_measure_columns(filter_points()))
|
||||
def test_get_measure_points(point, numer_ids):
|
||||
_test_measures(numer_ids, default_point(point))
|
||||
|
||||
|
||||
@parameterized(grouped_measure_columns())
|
||||
def test_get_measure_areas(numer_ids, numer_aggregate, denom_reltype, section_tags):
|
||||
if numer_aggregate is None or numer_aggregate.lower() not in ('sum', 'median', 'average'):
|
||||
return
|
||||
if numer_aggregate.lower() in ('median', 'average') \
|
||||
and (denom_reltype is None \
|
||||
or denom_reltype.lower() != 'universe'):
|
||||
return
|
||||
_test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_area(numer_ids[0]))
|
||||
@parameterized(grouped_measure_columns(filter_areas()))
|
||||
def test_get_measure_areas(point, numer_ids):
|
||||
_test_measures(numer_ids, default_area(point))
|
||||
|
||||
|
||||
def _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, geom):
|
||||
def _test_measures(numer_ids, geom):
|
||||
in_params = []
|
||||
for numer_id in numer_ids:
|
||||
in_params.append({
|
||||
@@ -284,90 +240,3 @@ def _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, geom
|
||||
assert_equal(len(vals), len(in_params))
|
||||
for i, val in enumerate(vals):
|
||||
assert_is_not_none(val, 'NULL for {}'.format(in_params[i]['numer_id']))
|
||||
|
||||
|
||||
#@parameterized(CATEGORY_COLUMNS)
|
||||
#def test_get_category_areas(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetCategory({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(CATEGORY_COLUMNS)
|
||||
#def test_get_category_points(column_id):
|
||||
# if column_id in SKIP_COLUMNS:
|
||||
# raise SkipTest('Column {} should be skipped'.format(column_id))
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetCategory({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# rows = resp.fetchall()
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundaries_by_geometry(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundariesByGeometry({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_points_by_geometry(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetPointsByGeometry({area}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# area=default_area(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_points(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundary({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_id(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundaryId({point}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# point=default_point(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
#@parameterized(BOUNDARY_COLUMNS)
|
||||
#def test_get_boundary_by_id(column_id):
|
||||
# resp = query('''
|
||||
#SELECT * FROM {schema}OBS_GetBoundaryById({geometry_id}, '{column_id}')
|
||||
# '''.format(column_id=column_id,
|
||||
# schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
# geometry_id=default_geometry_id(column_id)))
|
||||
# assert_equal(resp.status_code, 200)
|
||||
# rows = resp.json()['rows']
|
||||
# assert_equal(1, len(rows))
|
||||
# assert_is_not_none(rows[0][0])
|
||||
|
||||
|
||||
@@ -44,33 +44,7 @@ for q in (
|
||||
-73.81885528564453,41.745696344339564, 4326),
|
||||
'us.census.tiger.county_clipped')) foo
|
||||
ORDER BY ST_NPoints(the_geom) DESC
|
||||
LIMIT 50;''',
|
||||
'DROP TABLE IF EXISTS obs_perftest_country_simple',
|
||||
'''CREATE TABLE obs_perftest_country_simple (cartodb_id SERIAL PRIMARY KEY,
|
||||
geom GEOMETRY,
|
||||
name TEXT) ''',
|
||||
'''INSERT INTO obs_perftest_country_simple (geom, name)
|
||||
SELECT the_geom geom,
|
||||
geom_refs AS name
|
||||
FROM (SELECT * FROM {schema}OBS_GetBoundariesByGeometry(
|
||||
st_makeenvelope(-179,-89, 179,89, 4326),
|
||||
'whosonfirst.wof_country_geom')) foo
|
||||
ORDER BY ST_NPoints(the_geom) ASC
|
||||
LIMIT 50;''',
|
||||
'DROP TABLE IF EXISTS obs_perftest_country_complex',
|
||||
'''CREATE TABLE obs_perftest_country_complex (cartodb_id SERIAL PRIMARY KEY,
|
||||
geom GEOMETRY,
|
||||
name TEXT) ''',
|
||||
'''INSERT INTO obs_perftest_country_complex (geom, name)
|
||||
SELECT the_geom geom,
|
||||
geom_refs AS name
|
||||
FROM (SELECT * FROM {schema}OBS_GetBoundariesByGeometry(
|
||||
st_makeenvelope(-179,-89, 179,89, 4326),
|
||||
'whosonfirst.wof_country_geom')) foo
|
||||
ORDER BY ST_NPoints(the_geom) DESC
|
||||
LIMIT 50;''',
|
||||
#'''SET statement_timeout = 5000;'''
|
||||
):
|
||||
LIMIT 50;'''):
|
||||
q_formatted = q.format(
|
||||
schema='cdb_observatory.' if USE_SCHEMA else '',
|
||||
)
|
||||
@@ -118,15 +92,7 @@ def record(params, results):
|
||||
|
||||
('complex', '_OBS_GetGeometryScores', 'NULL', 1),
|
||||
('complex', '_OBS_GetGeometryScores', 'NULL', 500),
|
||||
('complex', '_OBS_GetGeometryScores', 'NULL', 3000),
|
||||
|
||||
('country_simple', '_OBS_GetGeometryScores', 'NULL', 1),
|
||||
('country_simple', '_OBS_GetGeometryScores', 'NULL', 500),
|
||||
('country_simple', '_OBS_GetGeometryScores', 'NULL', 5000),
|
||||
|
||||
('country_complex', '_OBS_GetGeometryScores', 'NULL', 1),
|
||||
('country_complex', '_OBS_GetGeometryScores', 'NULL', 500),
|
||||
('country_complex', '_OBS_GetGeometryScores', 'NULL', 5000),
|
||||
('complex', '_OBS_GetGeometryScores', 'NULL', 3000)
|
||||
])
|
||||
def test_getgeometryscores_performance(geom_complexity, api_method, filters, target_geoms):
|
||||
print api_method, geom_complexity, filters, target_geoms
|
||||
|
||||