58 Commits
1.3.0 ... 1.3.3

Author SHA1 Message Date
John Krauss
ec53d354e9 release 1.3.3 artifact 2017-03-10 19:48:23 +00:00
John Krauss
c1aa91da5b update NEWS.md 2017-03-10 19:36:00 +00:00
John Krauss
93ebd9aa0f test getdata across multiple input columns; remove dead code from autotest 2017-03-10 19:27:06 +00:00
John Krauss
4a29c060ef fix unittest bug, easier to read use of unnest, static geomvals when one passed in 2017-03-10 19:18:06 +00:00
John Krauss
1639bea74a mark relevant functions STABLE 2017-03-10 18:36:51 +00:00
John Krauss
765cbfcccc only do polygon operations when polygons passed in 2017-03-10 16:32:31 +00:00
John Krauss
c4f3c5d534 selectively pass through obs geometries and area calcs 2017-03-10 16:23:27 +00:00
John Krauss
d5e7d95824 fix performance regression on getboundariesbygeometry, where pct overlap was being unnecessarily calculated 2017-03-09 21:26:53 +00:00
John Krauss
3ff1b36d7f remove erroneous NOTICE 2017-03-09 20:46:38 +00:00
John Krauss
c28cdeb767 Merge branch 'release-v-1.3.3' into separate-geom-from-data-calcs 2017-03-09 18:09:45 +00:00
John Krauss
b1d672bfe4 Merge branch 'release-v-1.3.3' into faster-autotest 2017-03-09 18:07:28 +00:00
John Krauss
524d477f7b Merge remote-tracking branch 'origin/release-v-1.3.3' into release-v-1.3.3 2017-03-09 17:59:49 +00:00
John Krauss
7ef035580f avoid geom calculation when points are passed in 2017-03-09 17:29:41 +00:00
John Krauss
20b347528c tests passing 2017-03-09 17:14:20 +00:00
John Krauss
d070802f53 resolving API bugs 2017-03-09 16:17:58 +00:00
John Krauss
751f470049 Merge branch 'faster-autotest' into separate-geom-from-data-calcs 2017-03-09 14:58:26 +00:00
John Krauss
a1b5f01d57 Merge remote-tracking branch 'origin/develop' into faster-autotest 2017-03-09 14:50:48 +00:00
John Krauss
f2d2b32bf1 Merge remote-tracking branch 'origin/develop' into separate-geom-from-data-calcs 2017-03-09 14:50:01 +00:00
csobier
02413eb974 line 412, bad tag 2017-03-09 08:15:43 -05:00
csobier
1a4a2edbc6 lien 207 2017-03-09 08:10:23 -05:00
csobier
47c6453bbc tags 2017-03-09 08:05:59 -05:00
csobier
5f2daad408 wrong quotes around context, breaking docs 2017-03-09 07:36:10 -05:00
csobier
764a1ce7cd highlight missing, breaking docs 2017-03-09 07:29:19 -05:00
csobier
12235c7138 missing tag on line 387, breaking docs. 2017-03-09 07:16:49 -05:00
John Krauss
3f817f8e9a bugfixes, most unit tests passing 2017-03-09 05:03:25 +00:00
John Krauss
5ca2664a17 first pass much faster multicolumn getdata via precalcs 2017-03-09 04:12:38 +00:00
John Krauss
1b913c77c4 fix last oustanding bug with autotest 2017-03-08 23:18:07 +00:00
John Krauss
22eb6349c2 fix issues with python autotest failing for nulls, try removing case statements around geometries in getdata 2017-03-08 21:17:45 +00:00
John Krauss
862db2c33a Merge remote-tracking branch 'origin/release-v-1.3.3' into faster-autotest
Conflicts:
	src/pg/sql/40_observatory_utility.sql
2017-03-08 20:52:31 +00:00
John Krauss
e2f92d78cf much faster autotest by grouping in getdata, fixes to getdata to prevent hangs 2017-03-08 20:51:41 +00:00
john krauss
3df1ffc3c8 Merge pull request #265 from CartoDB/check-intersection-errors
Resolve intersection errors
2017-03-08 15:38:39 -05:00
John Krauss
6a60cfc417 Merge branch 'develop' into faster-autotest 2017-03-08 15:57:14 +00:00
John Krauss
3b6b1b4843 limit safe_intersection to SRID 4326, DRY out ST_MakeValid 2017-03-08 15:52:19 +00:00
John Krauss
460059f2cf Merge branch 'develop' into check-intersection-errors 2017-03-07 20:45:05 +00:00
John Krauss
fc111dd1e2 Merge branch 'obs-getavailableX-docs' into develop 2017-03-07 20:39:40 +00:00
John Krauss
7cbef7e1b5 Merge branch 'obs-getdata-getmeta-docs' into develop 2017-03-07 20:39:25 +00:00
John Krauss
deede798e9 fix non-noded intersection between shoreline clipped and non-shoreline clipped geometries by using a safe_intersection function 2017-03-07 20:38:12 +00:00
John Krauss
fd3918b29c fix divide-by-zero errors 2017-03-07 16:45:15 +00:00
John Krauss
cdf7b17a4d tmp commit 2017-03-07 15:29:09 +00:00
John Krauss
50ec6dddf6 release-v1.3.2 artifact 2017-03-02 21:16:22 +00:00
John Krauss
0ebe9babeb update tests 2017-03-02 21:09:23 +00:00
John Krauss
4fad32d5f2 fix and NEWS.md 2017-03-02 21:07:29 +00:00
John Krauss
f0efa1e2eb release v1.3.1 2017-03-01 16:33:35 +00:00
John Krauss
bcbd8a2be4 change OBS_GetLegacyMetadata to return median/average measures too when called for polygons 2017-03-01 16:03:14 +00:00
John Krauss
63ae7c1392 add obs_getavailableX metadata API docs 2017-02-28 21:33:06 +00:00
John Krauss
af671931d4 integrate michelles comments 2017-02-23 20:12:27 +00:00
John Krauss
71d891c067 handle blank aggregates 2017-02-16 17:53:38 +00:00
John Krauss
01b56fbfcc update fixtures 2017-02-16 17:38:18 +00:00
John Krauss
6215f6585c update NEWS.md 2017-02-16 17:23:47 +00:00
John Krauss
9bda063148 Merge branch 'nonsum-interpolation' into release-v-1.3.1 2017-02-16 17:20:38 +00:00
John Krauss
4a97689705 add point for au 2017-02-16 16:50:40 +00:00
John Krauss
2edb850a45 estimate average and median across arbitrary areas if universe is provided, otherwise return null and raise a notice. fixes #252 2017-02-10 01:08:10 +00:00
Michelle Ho
8120081d68 Typo fix
Typo fix of "measured" to "measure"
2017-02-06 16:37:27 -05:00
Michelle Ho
72ced1a7a7 Change 'raise' to 'raises'
Changes semantic meaning-- user does not raise the error, CARTO raises the error
2017-02-06 16:27:43 -05:00
Michelle Ho
d15b74a594 Change ``OBS_GetUSCensusMeasure`` 2017-02-06 16:18:56 -05:00
Michelle Ho
60ab773549 change point to polygon in GetUSCensusMeasure 2017-02-06 15:57:49 -05:00
Michelle Ho
01b70dd06e proof-reading changes 2017-02-06 14:58:07 -05:00
John Krauss
4b409cc9f4 first-pass docs for obs_getdata and obs_getmeta 2017-02-01 09:12:18 -05:00
23 changed files with 8798 additions and 952 deletions

39
NEWS.md
View File

@@ -1,3 +1,42 @@
1.3.3 (2017-03-10)
__Bugfixes__
* Resolve divide-by-zero errors in cases where the intersection of an
Observatory geometry and user geometry has 0 area
([#265](https://github.com/CartoDB/observatory-extension/pull/265))
* Run MakeValid on geometry's when intersecting, if necessary
([#268](https://github.com/CartoDB/observatory-extension/pull/268))
__Improvements__
* Add performance tests for multiple columns in `OBS_GetData`
* Major performance boost for `autotest.py` through the use of multi-column
`OBS_GetData` instead of separate `OBS_GetMeasure` calls for every single
measurement.
([#268](https://github.com/CartoDB/observatory-extension/pull/268))
* Major performance boost for `OBS_GetData` in cases where multiple columns are
requested. Previously, each additional column would result in a linear
slowdown, even if geometries could be reused.
([#267](https://github.com/CartoDB/observatory-extension/pull/267))
1.3.2 (2017-03-02)
__Bugfixes__
* Accept "prenormalized" as well as "predenominated" to bypass normalization.
This fixes issues with Camshaft.
1.3.1 (2017-02-16)
__Improvements__
* It is now possible to obtain measures that are averages or medians over
arbitrary polygons ([#254](https://github.com/CartoDB/observatory-extension/pull/254).
* Added test point for Australian data
* `OBS_GetLegacyMetadata` now returns median and averages in cases where it is
called for measures for polygons
1.3.0 (2017-01-17)
__API Changes__

View File

@@ -56,3 +56,306 @@ time_span | the timespan attached the boundary. this does not mean that the boun
```SQL
SELECT * FROM OBS_GetAvailableBoundaries(CDB_LatLng(40.7, -73.9))
```
## OBS_GetAvailableNumerators(bounds, filter_tags, denom_id, geom_id, timespan)
Return available numerators within a boundary and with the specified
`filter_tags`.
#### Arguments
Name | Type | Description
--- | --- | ---
bounds | Geometry(Geometry, 4326) | a geometry which some of the numerator's data must intersect with
filter_tags | Text[] | a list of filters. Only numerators for which all of these apply are returned `NULL` to ignore (optional)
denom_id | Text | the ID of a denominator to check whether the numerator is valid against. Will not reduce length of returned table, but will change values for `valid_denom` (optional)
geom_id | Text | the ID of a geometry to check whether the numerator is valid against. Will not reduce length of returned table, but will change values for `valid_geom` (optional)
timespan | Text | the ID of a timespan to check whether the numerator is valid against. Will not reduce length of returned table, but will change values for `valid_timespan` (optional)
#### Returns
A TABLE containing the following properties
Key | Type | Description
--- | ---- | -----------
numer_id | Text | The ID of the numerator
numer_name | Text | A human readable name for the numerator
numer_description | Text | Description of the numerator. Is sometimes NULL
numer_weight | Numeric | Numeric "weight" of the numerator. Ignored.
numer_license | Text | ID of the license for the numerator
numer_source | Text | ID of the source for the numerator
numer_type | Text | Postgres type of the numerator
numer_aggregate | Text | Aggregate type of the numerator. If `'SUM'`, this can be normalized by area
numer_extra | JSONB | Extra information about the numerator column. Ignored.
numer_tags | Text[] | Array of all tags applying to this numerator
valid_denom | Boolean | True if the `denom_id` argument is a valid denominator for this numerator, False otherwise
valid_geom | Boolean | True if the `geom_id` argument is a valid geometry for this numerator, False otherwise
valid_timespan | Boolean | True if the `timespan` argument is a valid timespan for this numerator, False otherwise
#### Examples
Obtain all numerators that are available within a small rectangle.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326))
```
Obtain all numerators that are available within a small rectangle and are for
the United States only.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), '{section/tags.united_states}');
```
Obtain all numerators that are available within a small rectangle and are
employment related for the United States only.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), '{section/tags.united_states, subsection/tags.employment}');
```
Obtain all numerators that are available within a small rectangle and are
related to both employment and age & gender for the United States only.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), '{section/tags.united_states, subsection/tags.employment, subsection/tags.age_gender}');
```
Obtain all numerators that work with US population (`us.census.acs.B01003001`)
as a denominator.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, 'us.census.acs.B01003001')
WHERE valid_denom IS True;
```
Obtain all numerators that work with US states (`us.census.tiger.state`)
as a geometry.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, 'us.census.tiger.state')
WHERE valid_geom IS True;
```
Obtain all numerators available in the timespan `2011 - 2015`.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableNumerators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, NULL, '2011 - 2015')
WHERE valid_timespan IS True;
```
## OBS_GetAvailableDenominators(bounds, filter_tags, numer_id, geom_id, timespan)
Return available denominators within a boundary and with the specified
`filter_tags`.
#### Arguments
Name | Type | Description
--- | --- | ---
bounds | Geometry(Geometry, 4326) | a geometry which some of the denominator's data must intersect with
filter_tags | Text[] | a list of filters. Only denominators for which all of these apply are returned `NULL` to ignore (optional)
numer_id | Text | the ID of a numerator to check whether the denominator is valid against. Will not reduce length of returned table, but will change values for `valid_numer` (optional)
geom_id | Text | the ID of a geometry to check whether the denominator is valid against. Will not reduce length of returned table, but will change values for `valid_geom` (optional)
timespan | Text | the ID of a timespan to check whether the denominator is valid against. Will not reduce length of returned table, but will change values for `valid_timespan` (optional)
#### Returns
A TABLE containing the following properties
Key | Type | Description
--- | ---- | -----------
denom_id | Text | The ID of the denominator
denom_name | Text | A human readable name for the denominator
denom_description | Text | Description of the denominator. Is sometimes NULL
denom_weight | Numeric | Numeric "weight" of the denominator. Ignored.
denom_license | Text | ID of the license for the denominator
denom_source | Text | ID of the source for the denominator
denom_type | Text | Postgres type of the denominator
denom_aggregate | Text | Aggregate type of the denominator. If `'SUM'`, this can be normalized by area
denom_extra | JSONB | Extra information about the denominator column. Ignored.
denom_tags | Text[] | Array of all tags applying to this denominator
valid_numer | Boolean | True if the `numer_id` argument is a valid numerator for this denominator, False otherwise
valid_geom | Boolean | True if the `geom_id` argument is a valid geometry for this denominator, False otherwise
valid_timespan | Boolean | True if the `timespan` argument is a valid timespan for this denominator, False otherwise
#### Examples
Obtain all denominators that are available within a small rectangle.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326));
```
Obtain all denominators that are available within a small rectangle and are for
the United States only.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), '{section/tags.united_states}');
```
Obtain all denominators for male population (`us.census.acs.B01001002`).
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, 'us.census.acs.B01001002')
WHERE valid_numer IS True;
```
Obtain all denominators that work with US states (`us.census.tiger.state`)
as a geometry.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, 'us.census.tiger.state')
WHERE valid_geom IS True;
```
Obtain all denominators available in the timespan `2011 - 2015`.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableDenominators(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, NULL, '2011 - 2015')
WHERE valid_timespan IS True;
```
## OBS_GetAvailableGeometries(bounds, filter_tags, numer_id, denom_id, timespan)
Return available geometries within a boundary and with the specified
`filter_tags`.
#### Arguments
Name | Type | Description
--- | --- | ---
bounds | Geometry(Geometry, 4326) | a geometry which must intersect the geometry
filter_tags | Text[] | a list of filters. Only geometries for which all of these apply are returned `NULL` to ignore (optional)
numer_id | Text | the ID of a numerator to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_numer` (optional)
denom_id | Text | the ID of a denominator to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_denom` (optional)
timespan | Text | the ID of a timespan to check whether the geometry is valid against. Will not reduce length of returned table, but will change values for `valid_timespan` (optional)
#### Returns
A TABLE containing the following properties
Key | Type | Description
--- | ---- | -----------
geom_id | Text | The ID of the geometry
geom_name | Text | A human readable name for the geometry
geom_description | Text | Description of the geometry. Is sometimes NULL
geom_weight | Numeric | Numeric "weight" of the geometry. Ignored.
geom_aggregate | Text | Aggregate type of the geometry. Ignored.
geom_license | Text | ID of the license for the geometry
geom_source | Text | ID of the source for the geometry
geom_type | Text | Postgres type of the geometry
geom_extra | JSONB | Extra information about the geometry column. Ignored.
geom_tags | Text[] | Array of all tags applying to this geometry
valid_numer | Boolean | True if the `numer_id` argument is a valid numerator for this geometry, False otherwise
valid_denom | Boolean | True if the `geom_id` argument is a valid geometry for this geometry, False otherwise
valid_timespan | Boolean | True if the `timespan` argument is a valid timespan for this geometry, False otherwise
score | Numeric | Score between 0 and 100 for this geometry, higher numbers mean that this geometry is a better choice for the passed extent
numtiles | Numeric | How many raster tiles were read for score, numgeoms, and percentfill estimates
numgeoms | Numeric | About how many of these geometries fit inside the passed extent
percentfill | Numeric | About what percentage of the passed extent is filled with these geometries
estnumgeoms | Numeric | Ignored
meanmediansize | Numeric | Ignored
#### Examples
Obtain all geometries that are available within a small rectangle.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableGeometries(
ST_MakeEnvelope(-74, 41, -73, 40, 4326));
```
Obtain all geometries that are available within a small rectangle and are for
the United States only.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableGeometries(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), '{section/tags.united_states}');
```
Obtain all geometries that work with total population (`us.census.acs.B01003001`).
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableGeometries(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, 'us.census.acs.B01003001')
WHERE valid_numer IS True;
```
Obtain all geometries with timespan `2015`.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableGeometries(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, NULL, '2015')
WHERE valid_timespan IS True;
```
## OBS_GetAvailableTimespans(bounds, filter_tags, numer_id, denom_id, geom_id)
Return available timespans within a boundary and with the specified
`filter_tags`.
#### Arguments
Name | Type | Description
--- | --- | ---
bounds | Geometry(Geometry, 4326) | a geometry which some of the timespan's data must intersect with
filter_tags | Text[] | a list of filters. Ignore
numer_id | Text | the ID of a numerator to check whether the timespans is valid against. Will not reduce length of returned table, but will change values for `valid_numer` (optional)
denom_id | Text | the ID of a denominator to check whether the timespans is valid against. Will not reduce length of returned table, but will change values for `valid_denom` (optional)
geom_id | Text | the ID of a geometry to check whether the timespans is valid against. Will not reduce length of returned table, but will change values for `valid_geom` (optional)
#### Returns
A TABLE containing the following properties
Key | Type | Description
--- | ---- | -----------
timespan_id | Text | The ID of the timespan
timespan_name | Text | A human readable name for the timespan
timespan_description | Text | Ignored
timespan_weight | Numeric | Ignored
timespan_license | Text | Ignored
timespan_source | Text | Ignored
timespan_aggregate | Text | Ignored
valid_numer | Boolean | True if the `numer_id` argument is a valid numerator for this timespan, False otherwise
valid_denom | Boolean | True if the `timespan` argument is a valid timespan for this timespan, False otherwise
valid_geom | Boolean | True if the `geom_id` argument is a valid geometry for this timespan, False otherwise
#### Examples
Obtain all timespans that are available within a small rectangle.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableTimespans(
ST_MakeEnvelope(-74, 41, -73, 40, 4326));
```
Obtain all timespans for total population (`us.census.acs.B01003001`).
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableTimespans(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, 'us.census.acs.B01003001')
WHERE valid_numer IS True;
```
Obtain all timespans that work with US states (`us.census.tiger.state`)
as a geometry.
```SQL
SELECT * FROM cdb_observatory.OBS_GetAvailableTimespans(
ST_MakeEnvelope(-74, 41, -73, 40, 4326), NULL, NULL, NULL, 'us.census.tiger.state')
WHERE valid_geom IS True;
```

View File

@@ -8,15 +8,15 @@ You can [access](https://carto.com/docs/carto-engine/data/accessing) measures th
## OBS_GetUSCensusMeasure(point geometry, measure_name text)
The ```OBS_GetUSCensusMeasure(point, measure_name)``` function returns a measure based on a subset of the US Census variables at a point location. The ```OBS_GetUSCensusMeasure``` function is limited to only a subset of all measures that are available in the Data Observatory, to access the full list, use measure IDs with the ```OBS_GetMeasure``` function below.
The ```OBS_GetUSCensusMeasure(point, measure_name)``` function returns a measure based on a subset of the US Census variables at a point location. The ```OBS_GetUSCensusMeasure``` function is limited to only a subset of all measures that are available in the Data Observatory. To access the full list, use measure IDs with the ```OBS_GetMeasure``` function below.
#### Arguments
Name |Description
--- | ---
point | a WGS84 point geometry (the_geom)
measure_name | a human readable name of a US Census variable. The list of measure_names is [available in the Glossary](https://carto.com/docs/carto-engine/data/glossary/#obsgetuscensusmeasure-names-table).
normalize | for measures that are are **sums** (e.g. population) the default normalization is 'area' and response comes back as a rate per square kilometer. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
measure_name | a human-readable name of a US Census variable. The list of measure_names is [available in the Glossary](https://carto.com/docs/carto-engine/data/glossary/#obsgetuscensusmeasure-names-table).
normalize | for measures that are **sums** (e.g. population) the default normalization is 'area' and response comes back as a rate per square kilometer. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
time_span | time span of interest (e.g., 2010 - 2014)
@@ -39,7 +39,7 @@ SET total_population = OBS_GetUSCensusMeasure(the_geom, 'Total Population')
## OBS_GetUSCensusMeasure(polygon geometry, measure_name text)
The ```OBS_GetUSCensusMeasure(point, measure_name)``` function returns a measure based on a subset of the US Census variables within a given polygon. The ```OBS_GetUSCensusMeasure``` function is limited to only a subset of all measures that are available in the Data Observatory, to access the full list, use the ```OBS_GetUSCensusMeasure``` function below.
The ```OBS_GetUSCensusMeasure(polygon, measure_name)``` function returns a measure based on a subset of the US Census variables within a given polygon. The ```OBS_GetUSCensusMeasure``` function is limited to only a subset of all measures that are available in the Data Observatory. To access the full list, use the ```OBS_GetMeasure``` function below.
#### Arguments
@@ -78,7 +78,7 @@ Name |Description
--- | ---
point | a WGS84 point geometry (the_geom)
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf)). It is important to note that these are different than 'measure_name' used in the Census based functions above.
normalize | for measures that are are **sums** (e.g. population) the default normalization is 'area' and response comes back as a rate per square kilometer. The other option is 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html). (optional)
normalize | for measures that are **sums** (e.g. population) the default normalization is 'area' and response comes back as a rate per square kilometer. The other option is 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html). (optional)
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
time_span | time span of interest (e.g., 2010 - 2014)
@@ -109,7 +109,7 @@ Name |Description
--- | ---
polygon_geometry | a WGS84 polygon geometry (the_geom)
measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
normalize | for measures that are are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
normalize | for measures that are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
time_span | time span of interest (e.g., 2010 - 2014)
@@ -132,7 +132,7 @@ SET household_count = OBS_GetMeasure(the_geom, 'us.census.acs.B11001001')
#### Errors
* If an unrecognized normalization type is input, raise an error: `'Only valid inputs for "normalize" are "area" (default) and "denominator".`
* If an unrecognized normalization type is input, raises error: `'Only valid inputs for "normalize" are "area" (default) and "denominator".`
## OBS_GetMeasureById(geom_ref text, measure_id text, boundary_id text)
@@ -195,3 +195,285 @@ Add the Category to an empty column text column based on point locations in your
UPDATE tablename
SET segmentation = OBS_GetCategory(the_geom, 'us.census.spielman_singleton_segments.X55')
```
## OBS_GetMeta(extent geometry, metadata json, max_timespan_rank, max_boundary_score_rank, num_target_geoms)
The ```OBS_GetMeta(extent, metadata)``` function returns a completed Data
Observatory metadata JSON Object for use in ```OBS_GetData(geomvals,
metadata)``` or ```OBS_GetData(ids, metadata)```. It is not possible to pass
metadata to those functions if it is not processed by ```OBS_GetMeta(extent,
metadata)``` first.
`OBS_GetMeta` makes it possible to automatically select appropriate timespans
and boundaries for the measurement you want.
#### Arguments
Name | Description
---- | -----------
extent | A geometry of the extent of the input geometries
metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
max_timespan_rank | How many historical time periods to include. Defaults to 1
max_boundary_score_rank | How many alternative boundary levels to include. Defaults to 1
num_target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
The schema of the metadata input objects are as follows:
Metadata Input Key | Description
--- | -----------
numer_id | The identifier for the desired measurement. If left blank, but a `geom_id` is specified, the column will return a geometry instead of a measurement.
geom_id | Identifier for a desired geographic boundary level to use when calculating measures. Will be automatically assigned if undefined. If defined but `numer_id` is blank, then the column will return a geometry instead of a measurement.
normalization | The desired normalization. One of 'area', 'prenormalized', or 'denominated'. 'Area' will normalize the measure per square kilometer, 'prenormalized' will return the original value, and 'denominated' will normalize by a denominator. Ignored if this metadata object specifies a geometry.
denom_id | Identifier for a desired normalization column in case `normalization` is 'denominated'. Will be automatically assigned if necessary. Ignored if this metadata object specifies a geometry.
numer_timespan | The desired timespan for the measurement. Defaults to most recent timespan available if left unspecified.
geom_timespan | The desired timespan for the geometry. Defaults to timespan matching numer_timespan if left unspecified.
#### Returns
A JSON array composed of metadata output objects.
Key | Description
--- | -----------
meta | A JSON array with completed metadata for the requested data, including all keys below
The schema of the metadata output objects are as follows. You should pass this
array as-is to ```OBS_GetData```. If you modify any values the function will
fail.
Metadata Output Key | Description
--- | -----------
numer_id | Identifier for desired measurement
numer_timespan | Timespan that will be used of the desired measurement
numer_name | Human-readable name of desired measure
numer_type | PostgreSQL/PostGIS type of desired measure
numer_colname | Internal identifier for column name
numer_tablename | Internal identifier for table
numer_geomref_colname | Internal identifier for geomref column name
denom_id | Identifier for desired normalization
denom_timespan | Timespan that will be used of the desired normalization
denom_name | Human-readable name of desired measure's normalization
denom_type | PostgreSQL/PostGIS type of desired measure's normalization
denom_colname | Internal identifier for normalization column name
denom_tablename | Internal identifier for normalization table
denom_geomref_colname | Internal identifier for normalization geomref column name
geom_id | Identifier for desired boundary geometry
geom_timespan | Timespan that will be used of the desired boundary geometry
geom_name | Human-readable name of desired boundary geometry
geom_type | PostgreSQL/PostGIS type of desired boundary geometry
geom_colname | Internal identifier for boundary geometry column name
geom_tablename | Internal identifier for boundary geometry table
geom_geomref_colname | Internal identifier for boundary geometry ref column name
timespan_rank | Ranking of this measurement by time, most recent is 1, second most recent 2, etc.
score | The score of this measurement's boundary compared to the `extent` and `num_target_geoms` passed in. Between 0 and 100.
score_rank | The ranking of this measurement's boundary, highest ranked is 1, second is 2, etc.
numer_aggregate | The aggregate type of the numerator, either `sum`, `average`, `median`, or blank
denom_aggregate | The aggregate type of the denominator, either `sum`, `average`, `median`, or blank
normalization | The sort of normalization that will be used for this measure, either `area`, `predenominated`, or `denominated`
#### Examples
Obtain metadata that can augment with one additional column of US population
data, using a boundary relevant for the geometry provided and latest timespan.
Limit to only the most recent column most relevant to the extent & density of
input geometries in `tablename`.
```SQL
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001"}]',
1, 1,
COUNT(*)
) FROM tablename
```
Obtain metadata that can augment with one additional column of US population
data, using census tract boundaries.
```SQL
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.census_tract"}]',
1, 1,
COUNT(*)
) FROM tablename
```
Obtain metadata that can augment with two additional columns, one for total
population and one for male population.
```SQL
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001"}, {"numer_id": "us.census.acs.B01001002"}]',
1, 1,
COUNT(*)
) FROM tablename
```
## OBS_GetData(geomvals array[geomval], metadata json)
The ```OBS_GetData(geomvals, metadata)``` function returns a measure and/or
geometry corresponding to the `metadata` JSON array for each every Geometry of
the `geomval` element in the `geomvals` array. The metadata argument must be
obtained from ```OBS_GetMeta(extent, metadata)```.
#### Arguments
Name | Description
---- | -----------
geomvals | An array of `geomval` elements, which are obtained by casting together a `Geometry` and a `Numeric`. This should be obtained by using `ARRAY_AGG((the_geom, cartodb_id)::geomval)` from the CARTO table one wishes to obtain data for.
metadata | A JSON array composed of metadata output objects from ```OBS_GetMeta(extent, metadata)```. The schema of the elements of the `metadata` JSON array corresponds to that of the output of ```OBS_GetMeta(extent, metadata)```, and this argument must be obtained from that function in order for the call to be valid.
#### Returns
A TABLE with the following schema, where each element of the input `geomvals`
array corresponds to one row:
Column | Type | Description
------ | ---- | -----------
id | Numeric | ID corresponding to the `val` component of an element of the input `geomvals` array
data | JSON | A JSON array with elements corresponding to the input `metadata` JSON array
Each `data` object has the following keys:
Key | Description
--- | -----------
value | The value of the measurement or geometry for the geometry corresponding to this row and measurement corresponding to this position in the `metadata` JSON array
To determine the appropriate cast for `value`, one can use the `numer_type`
or `geom_type` key corresponding to that value in the input `metadata` JSON
array.
#### Examples
Obtain population densities for every geometry in a table, keyed by cartodb_id:
```SQL
WITH meta AS (
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001"}]',
1, 1, COUNT(*)
) meta FROM tablename)
SELECT id AS cartodb_id, (data->0->>'value')::Numeric AS pop_density
FROM OBS_GetData((SELECT ARRAY_AGG((the_geom, cartodb_id)::geomval) FROM tablename),
(SELECT meta FROM meta))
```
Update a table with a blank numeric column called `pop_density` with population
densities:
```SQL
WITH meta AS (
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001"}]',
1, 1, COUNT(*)
) meta FROM tablename),
data AS (
SELECT id AS cartodb_id, (data->0->>'value')::Numeric AS pop_density
FROM OBS_GetData((SELECT ARRAY_AGG((the_geom, cartodb_id)::geomval) FROM tablename),
(SELECT meta FROM meta)))
UPDATE tablename
SET pop_density = data.pop_density
FROM data
WHERE cartodb_id = data.id
```
Update a table with two measurements at once, population density and household
density. The table should already have a Numeric column `pop_density` and
`household_density`.
```SQL
WITH meta AS (
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom),4326),
'[{"numer_id": "us.census.acs.B01003001"},{"numer_id": "us.census.acs.B11001001"}]',
1, 1, COUNT(*)
) meta from tablename),
data AS (
SELECT id,
data->0->>'value' AS pop_density,
data->1->>'value' AS household_density
FROM OBS_GetData((SELECT ARRAY_AGG((the_geom, cartodb_id)::geomval) FROM tablename),
(SELECT meta FROM meta)))
UPDATE tablename
SET pop_density = data.pop_density,
household_density = data.household_density
FROM data
WHERE cartodb_id = data.id
```
## OBS_GetData(ids array[text], metadata json)
The ```OBS_GetData(ids, metadata)``` function returns a measure and/or
geometry corresponding to the `metadata` JSON array for each every id of
the `ids` array. The metadata argument must be obtained from
`OBS_GetMeta(extent, metadata)`. When obtaining metadata, one must include
the `geom_id` corresponding to the boundary that the `ids` refer to.
#### Arguments
Name | Description
---- | -----------
ids | An array of `TEXT` elements. This should be obtained by using `ARRAY_AGG(col_of_geom_refs)` from the CARTO table one wishes to obtain data for.
metadata | A JSON array composed of metadata output objects from ```OBS_GetMeta(extent, metadata)```. The schema of the elements of the `metadata` JSON array corresponds to that of the output of ```OBS_GetMeta(extent, metadata)```, and this argument must be obtained from that function in order for the call to be valid.
For this function to work, the `metadata` argument must include a `geom_id`
that corresponds to the ids found in `col_of_geom_refs`.
#### Returns
A TABLE with the following schema, where each element of the input `ids` array
corresponds to one row:
Column | Type | Description
------ | ---- | -----------
id | Text | ID corresponding to an element of the input `ids` array
data | JSON | A JSON array with elements corresponding to the input `metadata` JSON array
Each `data` object has the following keys:
Key | Description
--- | -----------
value | The value of the measurement or geometry for the geometry corresponding to this row and measurement corresponding to this position in the `metadata` JSON array
To determine the appropriate cast for `value`, one can use the `numer_type`
or `geom_type` key corresponding to that value in the input `metadata` JSON
array.
#### Examples
Obtain population densities for every row of a table with FIPS code county IDs
(USA).
```SQL
WITH meta AS (
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.county"}]'
) meta FROM tablename)
SELECT id AS fips, (data->0->>'value')::Numeric AS pop_density
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
(SELECT meta FROM meta))
```
Update a table with population densities for every FIPS code county ID (USA).
This table has a blank column called `pop_density` and fips codes stored in a
column `fips`.
```SQL
WITH meta AS (
SELECT OBS_GetMeta(
ST_SetSRID(ST_Extent(the_geom), 4326),
'[{"numer_id": "us.census.acs.B01003001", "geom_id": "us.census.tiger.county"}]'
) meta FROM tablename),
data as (
SELECT id AS fips, (data->0->>'value') AS pop_density
FROM OBS_GetData((SELECT ARRAY_AGG((fips) FROM tablename),
(SELECT meta FROM meta)))
UPDATE tablename
SET pop_density = data.pop_density
FROM data
WHERE fips = data.id
```

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,5 +1,5 @@
comment = 'CartoDB Observatory backend extension'
default_version = '1.3.0'
default_version = '1.3.3'
requires = 'postgis'
superuser = true
schema = cdb_observatory

View File

@@ -214,6 +214,7 @@ FIXTURES = [
('us.census.tiger.fullname', 'us.census.tiger.pointlm_geom', '2016'),
('us.census.tiger.fullname', 'us.census.tiger.prisecroads_geom', '2016'),
('us.census.tiger.name', 'us.census.tiger.county', '2015'),
('us.census.tiger.name', 'us.census.tiger.county_clipped', '2015'),
('us.census.tiger.name', 'us.census.tiger.block_group', '2015'),
]
@@ -358,7 +359,10 @@ def main():
dump('*', tablename, 'WHERE geom && ST_MakeEnvelope(-74,40.69,-73.9,40.72, 4326)')
continue
elif 'whosonfirst' in table_id:
where = '(\'85632785\',\'85633051\',\'85633111\',\'85633147\',\'85633253\',\'85633267\')'
where = "('85632785','85633051','85633111','85633147','85633253','85633267')"
compare = 'IN'
elif 'county' in table_id and 'tiger' in table_id:
where = "('48061', '36047')"
compare = 'IN'
else:
where = '\'36047%\''

View File

@@ -1,5 +1,5 @@
comment = 'CartoDB Observatory backend extension'
default_version = '1.3.0'
default_version = '1.3.3'
requires = 'postgis'
superuser = true
schema = cdb_observatory

View File

@@ -205,6 +205,18 @@ END;
$$ LANGUAGE plpgsql;
-- Function we can call to raise an exception in the midst of a SQL statement
CREATE OR REPLACE FUNCTION cdb_observatory._OBS_RaiseNotice(
message TEXT
) RETURNS TEXT
AS $$
BEGIN
RAISE NOTICE '%', message;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- Create a function that always returns the first non-NULL item
CREATE OR REPLACE FUNCTION cdb_observatory.first_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE SQL IMMUTABLE STRICT AS $$
@@ -219,3 +231,40 @@ CREATE AGGREGATE cdb_observatory.FIRST (
basetype = anyelement,
stype = anyelement
);
CREATE OR REPLACE FUNCTION cdb_observatory.isnumeric (
typename varchar
)
RETURNS BOOLEAN LANGUAGE SQL IMMUTABLE STRICT AS $$
SELECT LOWER(typename) IN (
'smallint',
'integer',
'bigint',
'decimal',
'numeric',
'real',
'double precision'
)
$$;
-- Attempt to perform intersection, if there's an exception then buffer
-- https://gis.stackexchange.com/questions/50399/how-best-to-fix-a-non-noded-intersection-problem-in-postgis
CREATE OR REPLACE FUNCTION cdb_observatory.safe_intersection(
geom_a Geometry(Geometry, 4326),
geom_b Geometry(Geometry, 4326)
)
RETURNS Geometry(Geometry, 4326) AS
$$
BEGIN
RETURN ST_MakeValid(ST_Intersection(geom_a, geom_b));
EXCEPTION
WHEN OTHERS THEN
BEGIN
RETURN ST_MakeValid(ST_Intersection(ST_Buffer(geom_a, 0.0000001), ST_Buffer(geom_b, 0.0000001)));
EXCEPTION
WHEN OTHERS THEN
RETURN NULL;
END;
END
$$
LANGUAGE 'plpgsql' STABLE STRICT;

View File

@@ -96,7 +96,7 @@ BEGIN
USING geom, meta
RETURN;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetMeta(
@@ -174,6 +174,7 @@ BEGIN
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_tablename END denom_tablename,
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_name END denom_name,
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_type END denom_type,
CASE WHEN f.numer_id IS NULL THEN NULL ELSE denom_reltype END denom_reltype,
m.geom_id,
m.geom_timespan,
geom_colname,
@@ -212,16 +213,23 @@ BEGIN
'numer_geomref_colname', cdb_observatory.FIRST(meta.numer_geomref_colname),
'numer_tablename', cdb_observatory.FIRST(meta.numer_tablename),
'numer_type', cdb_observatory.FIRST(meta.numer_type),
--'numer_description', cdb_observatory.FIRST(meta.numer_description),
--'numer_t_description', cdb_observatory.FIRST(meta.numer_t_description),
'denom_aggregate', cdb_observatory.FIRST(meta.denom_aggregate),
'denom_colname', cdb_observatory.FIRST(denom_colname),
'denom_geomref_colname', cdb_observatory.FIRST(denom_geomref_colname),
'denom_tablename', cdb_observatory.FIRST(denom_tablename),
'denom_type', cdb_observatory.FIRST(meta.denom_type),
'denom_reltype', cdb_observatory.FIRST(meta.denom_reltype),
--'denom_description', cdb_observatory.FIRST(meta.denom_description),
--'denom_t_description', cdb_observatory.FIRST(meta.denom_t_description),
'geom_colname', cdb_observatory.FIRST(geom_colname),
'geom_geomref_colname', cdb_observatory.FIRST(geom_geomref_colname),
'geom_tablename', cdb_observatory.FIRST(geom_tablename),
'geom_type', cdb_observatory.FIRST(meta.geom_type),
'geom_timespan', cdb_observatory.FIRST(meta.geom_timespan),
--'geom_description', cdb_observatory.FIRST(meta.geom_description),
--'geom_t_description', cdb_observatory.FIRST(meta.geom_t_description),
'numer_timespan', cdb_observatory.FIRST(numer_timespan),
'numer_name', cdb_observatory.FIRST(numer_name),
'denom_name', cdb_observatory.FIRST(denom_name),
@@ -252,7 +260,7 @@ BEGIN
;
RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetMeasure(
@@ -301,7 +309,7 @@ BEGIN
map_type := 'areaNormalized';
ELSIF normalize ILIKE 'denom%' THEN
map_type := 'denominated';
ELSIF normalize ILIKE 'predenom%' THEN
ELSIF normalize ILIKE 'pre%' THEN
map_type := 'predenominated';
ELSE
-- defaults: area normalization for point if it's possible and none for
@@ -331,7 +339,7 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetMeasureById(
geom_ref TEXT,
@@ -366,7 +374,7 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
-- GetData that obtains data from array of geomrefs
@@ -409,6 +417,7 @@ BEGIN
(unnest($1))->>'denom_geomref_colname' denom_geomref_colname,
(unnest($1))->>'denom_tablename' denom_tablename,
(unnest($1))->>'denom_type' denom_type,
(unnest($1))->>'denom_reltype' denom_reltype,
(unnest($1))->>'geom_id' geom_id,
(unnest($1))->>'geom_colname' geom_colname,
(unnest($1))->>'geom_geomref_colname' geom_geomref_colname,
@@ -425,10 +434,10 @@ BEGIN
'JSON_Build_Object(' || CASE
WHEN api_method IS NOT NULL THEN
'''value'', ' ||
'cdb_observatory.FIRST( ' ||
api_method || '.' || numer_colname || ')::' || numer_type
'ARRAY_AGG( ' ||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
-- numeric internal values
WHEN LOWER(numer_type) LIKE 'numeric' THEN
WHEN cdb_observatory.isnumeric(numer_type) THEN
'''value'', ' || CASE
-- denominated
WHEN LOWER(normalization) LIKE 'denom%' OR (normalization IS NULL AND denom_id IS NOT NULL)
@@ -476,10 +485,16 @@ BEGIN
) tablenames_inner
) tablenames_outer) tablenames,
String_Agg(numer_tablename || '.' || numer_geomref_colname || ' = ' ||
geom_tablename || '.' || geom_geomref_colname ||
Coalesce(' AND ' || numer_tablename || '.' || numer_geomref_colname || ' = ' ||
denom_tablename || '.' || denom_geomref_colname, ''),
String_Agg(DISTINCT array_to_string(ARRAY[
CASE WHEN numer_tablename != geom_tablename
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
geom_tablename || '.' || geom_geomref_colname
ELSE NULL END,
CASE WHEN numer_tablename != denom_tablename
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
denom_tablename || '.' || denom_geomref_colname
ELSE NULL END
], ' AND '),
' AND ') AS obs_wheres,
String_Agg(geom_tablename || '.' || geom_geomref_colname || ' = ' ||
@@ -499,11 +514,14 @@ BEGIN
GROUP BY _geomrefs.id
ORDER BY _geomrefs.id
$query$, colspecs, tables,
'WHERE ' || NULLIF(ARRAY_TO_STRING(ARRAY[obs_wheres, user_wheres], ' AND '), ''))
'WHERE ' || NULLIF(ARRAY_TO_STRING(ARRAY[
Nullif(obs_wheres, ''), Nullif(user_wheres, '')
], ' AND '), '')
)
USING geomrefs;
RETURN;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
-- GetData that obtains data from array of (geom, id) geomvals.
@@ -518,127 +536,168 @@ RETURNS TABLE (
)
AS $$
DECLARE
colspecs TEXT;
geomrefs TEXT;
tables TEXT;
geom_colspecs TEXT;
geom_tables TEXT;
geomrefs_alias TEXT;
geomrefs_noalias TEXT;
data_colspecs TEXT;
data_tables TEXT;
obs_wheres TEXT;
user_wheres TEXT;
geomtype TEXT;
BEGIN
IF params IS NULL OR JSON_ARRAY_LENGTH(params) = 0 THEN
IF params IS NULL OR JSON_ARRAY_LENGTH(params) = 0 OR ARRAY_LENGTH(geomvals, 1) IS NULL THEN
RETURN QUERY EXECUTE $query$ SELECT NULL::INT, NULL::JSON LIMIT 0 $query$;
RETURN;
END IF;
geomtype := ST_GeometryType(geomvals[1].geom);
EXECUTE
$query$
WITH _meta AS (SELECT
generate_series(1, array_length($1, 1)) colid,
(unnest($1))->>'id' id,
(unnest($1))->>'numer_id' numer_id,
(unnest($1))->>'numer_aggregate' numer_aggregate,
(unnest($1))->>'numer_colname' numer_colname,
(unnest($1))->>'numer_geomref_colname' numer_geomref_colname,
(unnest($1))->>'numer_tablename' numer_tablename,
(unnest($1))->>'numer_type' numer_type,
(unnest($1))->>'denom_id' denom_id,
(unnest($1))->>'denom_aggregate' denom_aggregate,
(unnest($1))->>'denom_colname' denom_colname,
(unnest($1))->>'denom_geomref_colname' denom_geomref_colname,
(unnest($1))->>'denom_tablename' denom_tablename,
(unnest($1))->>'denom_type' denom_type,
(unnest($1))->>'geom_id' geom_id,
(unnest($1))->>'geom_colname' geom_colname,
(unnest($1))->>'geom_geomref_colname' geom_geomref_colname,
(unnest($1))->>'geom_tablename' geom_tablename,
(unnest($1))->>'geom_type' geom_type,
(unnest($1))->>'numer_timespan' numer_timespan,
(unnest($1))->>'geom_timespan' geom_timespan,
(unnest($1))->>'normalization' normalization,
(unnest($1))->>'api_method' api_method,
(unnest($1))->'api_args' api_args
row_number() over () colid,
meta->>'id' id,
meta->>'numer_id' numer_id,
meta->>'numer_aggregate' numer_aggregate,
meta->>'numer_colname' numer_colname,
meta->>'numer_geomref_colname' numer_geomref_colname,
meta->>'numer_tablename' numer_tablename,
meta->>'numer_type' numer_type,
meta->>'denom_id' denom_id,
meta->>'denom_aggregate' denom_aggregate,
meta->>'denom_colname' denom_colname,
meta->>'denom_geomref_colname' denom_geomref_colname,
meta->>'denom_tablename' denom_tablename,
meta->>'denom_type' denom_type,
meta->>'denom_reltype' denom_reltype,
meta->>'geom_id' geom_id,
meta->>'geom_colname' geom_colname,
meta->>'geom_geomref_colname' geom_geomref_colname,
meta->>'geom_tablename' geom_tablename,
meta->>'geom_type' geom_type,
meta->>'numer_timespan' numer_timespan,
meta->>'geom_timespan' geom_timespan,
meta->>'normalization' normalization,
meta->>'api_method' api_method,
meta->'api_args' api_args
FROM UNNEST($1) AS meta
)
SELECT String_Agg(
SELECT
String_Agg(DISTINCT
CASE
-- pass-through geom if user is requesting it only
WHEN numer_id IS NULL AND api_method IS NULL THEN
geom_tablename || '.' || geom_colname || ' AS geom_' || geom_tablename
WHEN cdb_observatory.isnumeric(numer_type) AND api_method IS NULL THEN
-- for numeric points with area normalization, include areas of underlying geoms
CASE
WHEN $2 = 'ST_Point' AND (LOWER(normalization) LIKE 'area%' OR
(normalization IS NULL AND numer_aggregate ILIKE 'sum')) THEN
' Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography), 0)/1000000 ' ||
' AS area_' || geom_tablename
-- for numeric areas, include more complex calcs
WHEN $2 != 'ST_Point' THEN
'CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
' THEN ST_Area(_geoms.geom) / Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0)' ||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
' THEN 1 ' ||
' ELSE ST_Area(cdb_observatory.safe_intersection(_geoms.geom, ' ||
geom_tablename || '.' || geom_colname || ')) / ' ||
'Nullif(ST_Area(' || geom_tablename || '.' || geom_colname || '), 0) ' ||
'END pct_' || geom_tablename
ELSE NULL
END
ELSE NULL END
, ', ') AS geom_colspecs,
String_Agg(DISTINCT 'observatory.' || geom_tablename, ', ') AS geom_tables,
String_Agg(
'JSON_Build_Object(' || CASE
-- api-delivered values
WHEN api_method IS NOT NULL THEN
'''value'', ' ||
'cdb_observatory.FIRST( ' ||
api_method || '.' || numer_colname || ')::' || numer_type
'ARRAY_AGG( ' ||
api_method || '.' || numer_colname || ')::' || numer_type || '[]'
-- numeric internal values
WHEN LOWER(numer_type) LIKE 'numeric' THEN
WHEN cdb_observatory.isnumeric(numer_type) THEN
'''value'', ' || CASE
-- denominated
WHEN LOWER(normalization) LIKE 'denom%' OR (normalization IS NULL AND denom_id IS NOT NULL)
THEN ' CASE ' ||
-- denominated point-in-poly or user polygon is same as OBS polygon
' WHEN ST_GeometryType(cdb_observatory.FIRST(_geoms.geom)) = ''ST_Point'' ' ||
' OR cdb_observatory.FIRST(_geoms.geom = ' || geom_tablename || '.' || geom_colname || ')' ||
' THEN cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))' ||
WHEN LOWER(normalization) LIKE 'denom%' OR
(normalization IS NULL AND LOWER(denom_reltype) LIKE 'denominator')
THEN CASE
-- denominated point-in-poly
WHEN $2 = 'ST_Point' THEN
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
' / NullIf(' || denom_tablename || '.' || denom_colname || ', 0))'
-- denominated polygon interpolation
-- SUM (numer * (% OBS geom in user geom)) / SUM (denom * (% OBS geom in user geom))
' ELSE ' ||
ELSE
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
' * CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
' THEN ST_Area(_geoms.geom) / ST_Area(' || geom_tablename || '.' || geom_colname || ') ' ||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
' THEN 1 ' ||
' ELSE (ST_Area(ST_Intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) ' ||
' / ST_Area(' || geom_tablename || '.' || geom_colname || '))' ||
' END) / '
' NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
' * CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
' THEN ST_Area(_geoms.geom) / ST_Area(' || geom_tablename || '.' || geom_colname || ') ' ||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
' THEN 1 ' ||
' ELSE (ST_Area(ST_Intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) ' ||
' / ST_Area(' || geom_tablename || '.' || geom_colname || '))' ||
' END), 0) ' ||
' / (COUNT(*) / COUNT(distinct ' || geom_tablename || '.' || geom_geomref_colname || ')) ' ||
' END '
' * pct_' || geom_tablename ||
' ) / NULLIF(SUM(' || denom_tablename || '.' || denom_colname || ' ' ||
' * pct_' || geom_tablename || '), 0) ' ||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
END
-- areaNormalized
WHEN LOWER(normalization) LIKE 'area%' OR (normalization IS NULL AND numer_aggregate ILIKE 'sum')
THEN ' CASE ' ||
-- areaNormalized point-in-poly or user polygon is the same as OBS polygon
' WHEN ST_GeometryType(cdb_observatory.FIRST(_geoms.geom)) = ''ST_Point'' ' ||
' OR cdb_observatory.FIRST(_geoms.geom = ' || geom_tablename || '.' || geom_colname || ')' ||
' THEN cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
' / (ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography)/1000000)) ' ||
WHEN LOWER(normalization) LIKE 'area%' OR
(normalization IS NULL AND numer_aggregate ILIKE 'sum')
THEN CASE
-- areaNormalized point-in-poly
WHEN $2 = 'ST_Point' THEN
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname ||
' / area_' || geom_tablename || ')'
-- areaNormalized polygon interpolation
-- SUM (numer * (% OBS geom in user geom)) / area of big geom
' ELSE ' ||
ELSE
--' NULL END '
' SUM((' || numer_tablename || '.' || numer_colname || ') ' ||
' * CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') THEN 1 ' ||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) THEN ' ||
' ST_Area(' || geom_tablename || '.' || geom_colname || ') ' ||
' / ST_Area(_geoms.geom)' ||
' ELSE (ST_Area(ST_Intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) ' ||
' / ST_Area(_geoms.geom))' ||
' END / (ST_Area(' || geom_tablename || '.' || geom_colname || '::Geography) / 1000000)) ' ||
' / (COUNT(*) / COUNT(distinct ' || geom_tablename || '.' || geom_geomref_colname || ')) ' ||
' END '
-- prenormalized
ELSE ' CASE ' ||
-- predenominated point-in-poly or user polygon is the same as OBS- polygon
' WHEN ST_GeometryType(cdb_observatory.FIRST(_geoms.geom)) = ''ST_Point'' ' ||
' OR cdb_observatory.FIRST(_geoms.geom = ' || geom_tablename || '.' || geom_colname || ')' ||
' THEN cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') ' ||
' ELSE ' ||
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
' * pct_' || geom_tablename ||
' ) / (Nullif(ST_Area(cdb_observatory.FIRST(_procgeoms.geom)::Geography), 0) / 1000000) ' ||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
END
-- median/average measures with universe
WHEN LOWER(numer_aggregate) IN ('median', 'average') AND
denom_reltype ILIKE 'universe' AND
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
THEN CASE
-- predenominated point-in-poly
WHEN $2 = 'ST_Point' THEN
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
ELSE
-- predenominated polygon interpolation weighted by universe
-- SUM (numer * denom * (% user geom in OBS geom)) / SUM (denom * (% user geom in OBS geom))
-- (10 * 1000 * 1) / (1000 * 1) = 10
-- (10 * 1000 * 1 + 50 * 10 * 1) / (1000 + 10) = 10500 / 10000 = 10.5
' SUM(' || numer_tablename || '.' || numer_colname ||
' * ' || denom_tablename || '.' || denom_colname ||
' * pct_' || geom_tablename ||
' ) / Nullif(SUM(' || denom_tablename || '.' || denom_colname ||
' * pct_' || geom_tablename || '), 0) ' ||
' / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
END
-- prenormalized for summable measures. point or summable only!
WHEN numer_aggregate ILIKE 'sum' AND
(normalization IS NULL OR LOWER(normalization) LIKE 'pre%')
THEN CASE
-- predenominated point-in-poly
WHEN $2 = 'ST_Point' THEN
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
ELSE
-- predenominated polygon interpolation
-- TODO should weight by universe instead of area
-- SUM (numer * (% user geom in OBS geom))
' SUM(' || numer_tablename || '.' || numer_colname || ' ' ||
' * CASE WHEN ST_Within(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ') ' ||
' THEN ST_Area(_geoms.geom) / ST_Area(' || geom_tablename || '.' || geom_colname || ') ' ||
' WHEN ST_Within(' || geom_tablename || '.' || geom_colname || ', _geoms.geom) ' ||
' THEN 1 ' ||
' ELSE (ST_Area(ST_Intersection(_geoms.geom, ' || geom_tablename || '.' || geom_colname || ')) ' ||
' / ST_Area(' || geom_tablename || '.' || geom_colname || '))' ||
' END) ' ||
' / (COUNT(*) / COUNT(distinct ' || geom_tablename || '.' || geom_geomref_colname || ')) ' ||
'END '
END || ':: ' || numer_type
' * pct_' || geom_tablename ||
' ) / (COUNT(*) / COUNT(distinct geomref_' || geom_tablename || ')) '
END
-- Everything else. Point only!
ELSE CASE
WHEN $2 = 'ST_Point' THEN
' cdb_observatory.FIRST(' || numer_tablename || '.' || numer_colname || ') '
ELSE
' cdb_observatory._OBS_RaiseNotice(''Cannot perform calculation over polygon for ' ||
numer_id || '/' || coalesce(denom_id, '') || '/' || geom_id || '/' || numer_timespan || ''')::Numeric '
END
END || '::' || numer_type
-- categorical/text
WHEN LOWER(numer_type) LIKE 'text' THEN
@@ -646,13 +705,13 @@ BEGIN
-- geometry
WHEN numer_id IS NULL THEN
'''geomref'', ' || geom_tablename || '.' || geom_geomref_colname || ', ' ||
'''value'', ' || 'cdb_observatory.FIRST(' || geom_tablename ||
'.' || geom_colname || ')::TEXT'
'''geomref'', geomref_' || geom_tablename || ', ' ||
'''value'', ' || 'cdb_observatory.FIRST(geom_' || geom_tablename ||
')::TEXT'
-- code below will return the intersection of the user's geom and the
-- OBS geom
--'"value": "'' || ' || 'cdb_observatory.FIRST(ST_Intersection(_geoms.geom, ' || geom_tablename ||
-- '.' || geom_colname || '))::TEXT || ''"'''
--'''value'', ' || 'ST_Union(cdb_observatory.safe_intersection(_geoms.geom, ' || geom_tablename ||
-- '.' || geom_colname || '))::TEXT'
ELSE ''
END || ')', ', ')
AS colspecs,
@@ -662,8 +721,11 @@ BEGIN
--
-- api_method and geom_tablename are interchangeable since when an
-- api_method is passed, geom_tablename is ignored
STRING_AGG(COALESCE(geom_tablename, api_method) ||
'.' || geom_geomref_colname, ', ') AS geomrefs,
String_Agg(DISTINCT COALESCE(geom_tablename, api_method) || '.' || geom_geomref_colname ||
' AS geomref_' || COALESCE(geom_tablename, api_method), ', ') AS geomrefs_alias,
String_Agg(DISTINCT 'geomref_' || COALESCE(geom_tablename, api_method)
, ', ') AS geomrefs_noalias,
(SELECT String_Agg(DISTINCT CASE
-- External API
@@ -676,51 +738,98 @@ BEGIN
SELECT DISTINCT UNNEST(tablenames_ary) tablename FROM (
SELECT ARRAY_AGG(numer_tablename) ||
ARRAY_AGG(denom_tablename) ||
ARRAY_AGG(geom_tablename) ||
ARRAY_AGG('cdb_observatory.' || api_method || '(_geoms.geom' || COALESCE(', ' ||
ARRAY_AGG('cdb_observatory.' || api_method || '(_procgeoms.geom' || COALESCE(', ' ||
(SELECT STRING_AGG(REPLACE(val::text, '"', ''''), ', ')
FROM (SELECT json_array_elements(api_args) as val) as vals),
'') || ')')
tablenames_ary
) tablenames_inner
) tablenames_outer) tablenames,
) tablenames_outer) data_tables,
String_Agg(DISTINCT numer_tablename || '.' || numer_geomref_colname || ' = ' ||
geom_tablename || '.' || geom_geomref_colname ||
Coalesce(' AND ' || numer_tablename || '.' || numer_geomref_colname || ' = ' ||
denom_tablename || '.' || denom_geomref_colname, ''),
' AND ') AS obs_wheres,
String_Agg(DISTINCT array_to_string(ARRAY[
CASE WHEN numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
'_procgeoms.geomref_' || geom_tablename
ELSE NULL END,
CASE WHEN numer_tablename != denom_tablename
THEN numer_tablename || '.' || numer_geomref_colname || ' = ' ||
denom_tablename || '.' || denom_geomref_colname
ELSE NULL END
], ' AND '),
' AND ') FILTER (WHERE numer_tablename != denom_tablename OR
(numer_tablename IS NOT NULL AND geom_tablename IS NOT NULL)) AS obs_wheres,
String_Agg('ST_Intersects(' || geom_tablename || '.' || geom_colname
String_Agg(DISTINCT 'ST_Intersects(' || geom_tablename || '.' || geom_colname
|| ', _geoms.geom)', ' AND ')
AS user_wheres
FROM _meta
;
$query$
INTO colspecs, geomrefs, tables, obs_wheres, user_wheres
USING (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]);
INTO geom_colspecs, geom_tables, data_colspecs, geomrefs_alias,
geomrefs_noalias, data_tables, obs_wheres, user_wheres
USING (SELECT ARRAY(SELECT json_array_elements_text(params))::json[]), geomtype;
RETURN QUERY EXECUTE format($query$
RAISE NOTICE '%', format($query$
WITH _raw_geoms AS (SELECT
(UNNEST($1)).val as id,
(UNNEST($1)).geom AS geom),
%L::integer as id,
%L::geometry AS geom),
_geoms AS (SELECT id,
CASE WHEN (ST_NPoints(geom) > 500)
THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.0001)), 3)
ELSE geom END geom
FROM _raw_geoms)
SELECT _geoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
FROM _geoms, %s
FROM _raw_geoms),
_procgeoms AS (SELECT _geoms.id, _geoms.geom %s %s
FROM _geoms %s
%s
)
SELECT _procgeoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
FROM _procgeoms %s
%s
GROUP BY _geoms.id %s
ORDER BY _geoms.id
$query$, colspecs, tables,
'WHERE ' || NULLIF(ARRAY_TO_STRING(ARRAY[obs_wheres, user_wheres], ' AND '), ''),
CASE WHEN merge IS False THEN ', ' || geomrefs ELSE '' END)
GROUP BY _procgeoms.id %s
ORDER BY _procgeoms.id
$query$, geomvals[1].val, geomvals[1].geom, ', ' || NullIf(geomrefs_alias, ''),
', ' || NullIf(geom_colspecs, ''),
', ' || NullIf(geom_tables, ''),
'WHERE ' || NullIf( user_wheres, ''),
data_colspecs, ', ' || NullIf(data_tables, ''),
'WHERE ' || NULLIF(obs_wheres, ''),
CASE WHEN merge IS False THEN ', ' || geomrefs_noalias ELSE '' END);
RETURN QUERY EXECUTE format($query$
WITH _raw_geoms AS (%s),
_geoms AS (SELECT id,
CASE WHEN (ST_NPoints(geom) > 500)
THEN ST_CollectionExtract(ST_MakeValid(ST_SimplifyVW(geom, 0.0001)), 3)
ELSE geom END geom
FROM _raw_geoms),
_procgeoms AS (SELECT _geoms.id, _geoms.geom %s %s
FROM _geoms %s
%s
)
SELECT _procgeoms.id::INT, Array_to_JSON(ARRAY[%s]::JSON[])
FROM _procgeoms %s
%s
GROUP BY _procgeoms.id %s
ORDER BY _procgeoms.id
$query$, CASE WHEN ARRAY_LENGTH(geomvals, 1) = 1 THEN
' SELECT $1[1].val as id, $1[1].geom as geom '
ELSE
' SELECT val as id, geom FROM UNNEST($1) '
END,
', ' || NullIf(geomrefs_alias, ''),
', ' || NullIf(geom_colspecs, ''),
', ' || NullIf(geom_tables, ''),
'WHERE ' || NullIf( user_wheres, ''),
data_colspecs, ', ' || NullIf(data_tables, ''),
'WHERE ' || NULLIF(obs_wheres, ''),
CASE WHEN merge IS False THEN ', ' || geomrefs_noalias ELSE '' END)
USING geomvals;
RETURN;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetCategory(
@@ -778,7 +887,7 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetUSCensusMeasure(
@@ -812,7 +921,7 @@ BEGIN
USING geom, measure_id, normalize, boundary_id, time_span;
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetUSCensusCategory(
@@ -848,7 +957,7 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetPopulation(
geom geometry(Geometry, 4326),
@@ -874,7 +983,7 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;
CREATE OR REPLACE FUNCTION cdb_observatory.OBS_GetSegmentSnapshot(
@@ -963,4 +1072,4 @@ BEGIN
RETURN result;
END;
$$ LANGUAGE plpgsql;
$$ LANGUAGE plpgsql STABLE;

View File

@@ -369,8 +369,10 @@ RETURNS TABLE (
DECLARE
aggregate_condition TEXT DEFAULT '';
BEGIN
IF aggregate_type IS NOT NULL THEN
aggregate_condition := format(' AND numer_aggregate = %L ', aggregate_type);
IF LOWER(aggregate_type) ILIKE 'sum' THEN
aggregate_condition := ' AND numer_aggregate IN (''sum'', ''median'', ''average'') ';
ELSIF aggregate_type IS NOT NULL THEN
aggregate_condition := format(' AND numer_aggregate ILIKE %L ', aggregate_type);
END IF;
RETURN QUERY
EXECUTE format($string$

View File

@@ -21,3 +21,7 @@ t
obs_dumpversion_notnull
t
(1 row)
ERROR: Error performing intersection: TopologyException: found non-noded intersection between LINESTRING (-97.1968 25.9574, -97.1971 25.9576) and LINESTRING (-97.197 25.9575, -97.1972 25.9576) at -97.19699802694231 25.957551976080605
complex_safe_intersection_works
t
(1 row)

View File

@@ -48,6 +48,15 @@ t
obs_getmeasure_out_of_bounds_geometry
t
(1 row)
obs_getmeasure_estimate_for_blank_aggregate
t
(1 row)
obs_getmeasure_per_capita_income_average
t
(1 row)
obs_getmeasure_median_capita_income_average
t
(1 row)
obs_getcategory_point
t
(1 row)
@@ -162,9 +171,15 @@ t|t|t
id|data_polygon_measure_area|nullcol
t|t|t
(1 row)
id|data_point_measure_prenormalized|nullcol
t|t|t
(1 row)
id|data_point_measure_predenominated|nullcol
t|t|t
(1 row)
id|data_polygon_measure_prenormalized|nullcol
t|t|t
(1 row)
id|data_polygon_measure_predenominated|nullcol
t|t|t
(1 row)
@@ -234,15 +249,15 @@ t|t
obs_getdata_api_geomvals_no_args
t
(1 row)
obs_getdata_api_geomvals_args_numer_return
t
ary_type|obs_getdata_api_geomvals_args_numer_return
t|t
(1 row)
obs_getdata_api_geomvals_args_string_return
t
ary_type|obs_getdata_api_geomvals_args_string_return
t|t
(1 row)
obs_getdata_api_geomrefs_args_numer_return
t
ary_type|obs_getdata_api_geomrefs_args_numer_return
t|t
(1 row)
obs_getdata_api_geomrefs_args_string_return
t
ary_type|obs_getdata_api_geomrefs_args_string_return
t|t
(1 row)

View File

@@ -192,10 +192,16 @@ t
_median_income_in_legacy_builder_metadata
t
(1 row)
_gini_in_legacy_builder_metadata
t
(1 row)
_total_pop_in_legacy_builder_metadata_sums
t
(1 row)
_median_income_not_in_legacy_builder_metadata_sums
_median_income_in_legacy_builder_metadata_sums
t
(1 row)
_gini_not_in_legacy_builder_metadata_sums
t
(1 row)
_no_dupe_subsections_in_legacy_builder_metadata

View File

@@ -26,6 +26,7 @@ DROP TABLE IF EXISTS observatory.obs_6c1309a64d8f3e6986061f4d1ca7b57743e75e74;
DROP TABLE IF EXISTS observatory.obs_0310c639744a2014bb1af82709228f05b59e7d3d;
DROP TABLE IF EXISTS observatory.obs_87a814e485deabe3b12545a537f693d16ca702c2;
DROP TABLE IF EXISTS observatory.obs_e32f8e59c7c8861ee5ee4029b3ace2af9a5c9caf;
DROP TABLE IF EXISTS observatory.obs_23cb5063486bd7cf36f17e89e5e65cd31b331f6e;
DROP TABLE IF EXISTS observatory.obs_1ea93bbc109c87c676b3270789dacf7a1430db6c;
DROP TABLE IF EXISTS observatory.obs_b393b5b88c6adda634b2071a8005b03c551b609a;
DROP TABLE IF EXISTS observatory.obs_8e30e6b3792430b410ba5b9e49cdc6a0d404d48f;

File diff suppressed because one or more lines are too long

View File

@@ -47,3 +47,15 @@ SELECT cdb_observatory._OBS_StandardizeMeasureName('test 343 %% 2 qqq }}{{}}') =
SELECT cdb_observatory.OBS_DumpVersion()
IS NOT NULL AS OBS_DumpVersion_notnull;
-- Should fail to perform intersection
SELECT ST_IsValid(ST_Intersection(
cdb_observatory.OBS_GetBoundaryByID('48061', 'us.census.tiger.county'),
cdb_observatory.OBS_GetBoundaryByID('48061', 'us.census.tiger.county_clipped')
)) AS complex_intersection_fails;
-- Should succeed in intersecting
SELECT ST_IsValid(cdb_observatory.safe_intersection(
cdb_observatory.OBS_GetBoundaryByID('48061', 'us.census.tiger.county'),
cdb_observatory.OBS_GetBoundaryByID('48061', 'us.census.tiger.county_clipped')
)) AS complex_safe_intersection_works;

View File

@@ -106,6 +106,21 @@ SELECT cdb_observatory.OBS_GetMeasure(
ST_SetSRID(st_point(0, 0), 4326),
'us.census.acs.B01003001') IS NULL As OBS_GetMeasure_out_of_bounds_geometry;
-- OBS_GetMeasure over arbitrary area for a measure we cannot estimate
SELECT cdb_observatory.OBS_GetMeasure(
ST_Buffer(cdb_observatory._testpoint(), 0.1),
'us.census.acs.B19083001') IS NULL As OBS_GetMeasure_estimate_for_blank_aggregate;
-- OBS_GetMeasure over arbitrary area for an average measure we can estimate
SELECT abs(cdb_observatory.OBS_GetMeasure(
ST_Buffer(cdb_observatory._testpoint(), 0.01),
'us.census.acs.B19301001') - 20025) / 20025 < 0.001 As OBS_GetMeasure_per_capita_income_average;
-- OBS_GetMeasure over arbitrary area for a median measure we can estimate
SELECT abs(cdb_observatory.OBS_GetMeasure(
ST_Buffer(cdb_observatory._testpoint(), 0.01),
'us.census.acs.B19013001') - 39266) / 39266 < 0.001 As OBS_GetMeasure_median_capita_income_average;
-- Point-based OBS_GetCategory
SELECT cdb_observatory.OBS_GetCategory(
cdb_observatory._TestPoint(), 'us.census.spielman_singleton_segments.X10') = 'Wealthy, urban without Kids' As OBS_GetCategory_point;
@@ -451,6 +466,19 @@ SELECT id = 1 id,
data->1 IS NULL nullcol
FROM data;
-- OBS_GetData/OBS_GetMeta by point geom with one standard measure predenom
-- called "prednormalized"
WITH
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
'[{"numer_id": "us.census.acs.B01003001", "normalization": "prenormalized"}]') meta),
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
ARRAY[(cdb_observatory._TestPoint(), 1)::geomval],
(SELECT meta FROM meta)))
SELECT id = 1 id,
abs((data->0->>'value')::Numeric - 1900) / 1900 < 0.001 data_point_measure_prenormalized,
data->1 IS NULL nullcol
FROM data;
-- OBS_GetData/OBS_GetMeta by point geom with one standard measure predenom
WITH
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestPoint(),
@@ -463,6 +491,19 @@ SELECT id = 1 id,
data->1 IS NULL nullcol
FROM data;
-- OBS_GetData/OBS_GetMeta by polygon geom with one standard measure predenom
-- called "prenormalized"
WITH
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
'[{"numer_id": "us.census.acs.B01003001", "normalization": "prenormalized"}]') meta),
data AS (SELECT * FROM cdb_observatory.OBS_GetData(
ARRAY[(cdb_observatory._TestArea(), 1)::geomval],
(SELECT meta FROM meta)))
SELECT id = 1 id,
abs((data->0->>'value')::Numeric - 12327) / 12327 < 0.001 data_polygon_measure_prenormalized,
data->1 IS NULL nullcol
FROM data;
-- OBS_GetData/OBS_GetMeta by polygon geom with one standard measure predenom
WITH
meta AS (SELECT cdb_observatory.OBS_GetMeta(cdb_observatory._TestArea(),
@@ -724,27 +765,36 @@ SELECT id = '36047048500' AS id,
FROM data;
-- OBS_GetData with an API + geomvals, no args
SELECT ARRAY['us.census.tiger.census_tract'] <@ array_agg(data->0->>'value') AS OBS_GetData_API_geomvals_no_args
SELECT (SELECT array_agg(json_array_elements::text) @> array['"us.census.tiger.census_tract"']
FROM json_array_elements(data->0->'value'))
AS OBS_GetData_API_geomvals_no_args
FROM cdb_observatory.obs_getdata(array[(cdb_observatory._testarea(), 1)::geomval],
'[{"numer_type": "text", "numer_colname": "boundary_id", "api_method": "obs_getavailableboundaries", "geom_geomref_colname": "boundary_id"}]',
false);
'[{"numer_type": "text", "numer_colname": "boundary_id", "api_method": "obs_getavailableboundaries"}]');
-- OBS_GetData with an API + geomvals, args, numeric
SELECT json_typeof(data->0->'value') = 'number' AS OBS_GetData_API_geomvals_args_numer_return
SELECT json_typeof(data->0->'value') = 'array' ary_type,
json_typeof(data->0->'value'->0) = 'number'
AS OBS_GetData_API_geomvals_args_numer_return
FROM cdb_observatory.obs_getdata(array[(cdb_observatory._testarea(), 1)::geomval],
'[{"numer_type": "numeric", "numer_colname": "obs_getmeasure", "api_method": "obs_getmeasure", "api_args": ["us.census.acs.B01003001"]}]', false);
'[{"numer_type": "numeric", "numer_colname": "obs_getmeasure", "api_method": "obs_getmeasure", "api_args": ["us.census.acs.B01003001"]}]');
-- OBS_GetData with an API + geomvals, args, text
SELECT json_typeof(data->0->'value') = 'string' AS OBS_GetData_API_geomvals_args_string_return
SELECT json_typeof(data->0->'value') = 'array' ary_type,
json_typeof(data->0->'value'->0) = 'string'
AS OBS_GetData_API_geomvals_args_string_return
FROM cdb_observatory.obs_getdata(array[(cdb_observatory._testarea(), 1)::geomval],
'[{"numer_type": "text", "numer_colname": "obs_getcategory", "api_method": "obs_getcategory", "api_args": ["us.census.spielman_singleton_segments.X55"]}]', false);
'[{"numer_type": "text", "numer_colname": "obs_getcategory", "api_method": "obs_getcategory", "api_args": ["us.census.spielman_singleton_segments.X55"]}]');
-- OBS_GetData with an API + geomrefs, args, numeric
SELECT json_typeof(data->0->'value') = 'number' AS OBS_GetData_API_geomrefs_args_numer_return
SELECT json_typeof(data->0->'value') = 'array' ary_type,
json_typeof(data->0->'value'->0) = 'number'
AS OBS_GetData_API_geomrefs_args_numer_return
FROM cdb_observatory.obs_getdata(array['36047076200'],
'[{"numer_type": "numeric", "numer_colname": "obs_getmeasurebyid", "api_method": "obs_getmeasurebyid", "api_args": ["us.census.acs.B01003001", "us.census.tiger.census_tract"]}]');
-- OBS_GetData with an API + geomrefs, args, text
SELECT json_typeof(data->0->'value') = 'string' AS OBS_GetData_API_geomrefs_args_string_return
SELECT json_typeof(data->0->'value') = 'array' ary_type,
json_typeof(data->0->'value'->0) = 'string'
AS OBS_GetData_API_geomrefs_args_string_return
FROM cdb_observatory.obs_getdata(array['36047'],
'[{"numer_type": "text", "numer_colname": "obs_getboundarybyid", "api_method": "obs_getboundarybyid", "api_args": ["us.census.tiger.county"]}]');

View File

@@ -499,15 +499,25 @@ SELECT 'us.census.acs.B19013001' IN (SELECT
FROM cdb_observatory.OBS_LegacyBuilderMetadata()
) AS _median_income_in_legacy_builder_metadata;
SELECT 'us.census.acs.B19083001' IN (SELECT
(jsonb_array_elements(((jsonb_array_elements(subsection))->'f1')->'columns')->'f1')->>'id' AS id
FROM cdb_observatory.OBS_LegacyBuilderMetadata()
) AS _gini_in_legacy_builder_metadata;
SELECT 'us.census.acs.B01003001' IN (SELECT
(jsonb_array_elements(((jsonb_array_elements(subsection))->'f1')->'columns')->'f1')->>'id' AS id
FROM cdb_observatory.OBS_LegacyBuilderMetadata('sum')
) AS _total_pop_in_legacy_builder_metadata_sums;
SELECT 'us.census.acs.B19013001' NOT IN (SELECT
SELECT 'us.census.acs.B19013001' IN (SELECT
(jsonb_array_elements(((jsonb_array_elements(subsection))->'f1')->'columns')->'f1')->>'id' AS id
FROM cdb_observatory.OBS_LegacyBuilderMetadata('sum')
) AS _median_income_not_in_legacy_builder_metadata_sums;
) AS _median_income_in_legacy_builder_metadata_sums;
SELECT 'us.census.acs.B19083001' NOT IN (SELECT
(jsonb_array_elements(((jsonb_array_elements(subsection))->'f1')->'columns')->'f1')->>'id' AS id
FROM cdb_observatory.OBS_LegacyBuilderMetadata('sum')
) AS _gini_not_in_legacy_builder_metadata_sums;
SELECT COUNT(*) = 0 _no_dupe_subsections_in_legacy_builder_metadata FROM (
SELECT name, subsection, count(*) FROM

View File

@@ -1,3 +1,4 @@
nose
nose-timer
nose_parameterized
psycopg2

View File

@@ -2,39 +2,21 @@ from nose.tools import assert_equal, assert_is_not_none
from nose.plugins.skip import SkipTest
from nose_parameterized import parameterized
from itertools import izip_longest
from util import query
from collections import OrderedDict
import json
def grouper(iterable, n, fillvalue=None):
"Collect data into fixed-length chunks or blocks"
# grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
args = [iter(iterable)] * n
return izip_longest(fillvalue=fillvalue, *args)
USE_SCHEMA = True
MEASURE_COLUMNS = query('''
SELECT distinct numer_id, numer_aggregate NOT ILIKE 'sum' as point_only
FROM observatory.obs_meta
WHERE numer_type ILIKE 'numeric'
AND numer_weight > 0
''').fetchall()
CATEGORY_COLUMNS = query('''
SELECT distinct numer_id
FROM observatory.obs_meta
WHERE numer_type ILIKE 'text'
AND numer_weight > 0
''').fetchall()
BOUNDARY_COLUMNS = query('''
SELECT id FROM observatory.obs_column
WHERE type ILIKE 'geometry'
AND weight > 0
''').fetchall()
US_CENSUS_MEASURE_COLUMNS = query('''
SELECT distinct numer_name
FROM observatory.obs_meta
WHERE numer_type ILIKE 'numeric'
AND 'us.census.acs.acs' = ANY (subsection_tags)
AND numer_weight > 0
''').fetchall()
SKIP_COLUMNS = set([
u'mx.inegi_columns.INDI18',
u'mx.inegi_columns.ECO40',
@@ -73,8 +55,61 @@ SKIP_COLUMNS = set([
u'us.census.tiger.mtfcc',
u'whosonfirst.wof_county_name',
u'whosonfirst.wof_region_name',
'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
, 'fr.insee.P12_ACTOCC15P_ILT45D'
, 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
, 'fr.insee.P12_ACTOCC15P_ILT45D'
, 'uk.ons.LC3202WA0007'
, 'uk.ons.LC3202WA0010'
, 'uk.ons.LC3202WA0004'
, 'uk.ons.LC3204WA0004'
, 'uk.ons.LC3204WA0007'
, 'uk.ons.LC3204WA0010'
])
MEASURE_COLUMNS = query('''
SELECT ARRAY_AGG(DISTINCT numer_id) numer_ids,
numer_aggregate,
denom_reltype,
section_tags
FROM observatory.obs_meta
WHERE numer_weight > 0
AND numer_id NOT IN ('{skip}')
AND section_tags IS NOT NULL
AND subsection_tags IS NOT NULL
GROUP BY numer_aggregate, section_tags, denom_reltype
'''.format(skip="', '".join(SKIP_COLUMNS))).fetchall()
#CATEGORY_COLUMNS = query('''
#SELECT distinct numer_id
#FROM observatory.obs_meta
#WHERE numer_type ILIKE 'text'
#AND numer_weight > 0
#''').fetchall()
#
#BOUNDARY_COLUMNS = query('''
#SELECT id FROM observatory.obs_column
#WHERE type ILIKE 'geometry'
#AND weight > 0
#''').fetchall()
#
#US_CENSUS_MEASURE_COLUMNS = query('''
#SELECT distinct numer_name
#FROM observatory.obs_meta
#WHERE numer_type ILIKE 'numeric'
#AND 'us.census.acs' = ANY (subsection_tags)
#AND numer_weight > 0
#''').fetchall()
#def default_geometry_id(column_id):
# '''
# Returns default test point for the column_id.
@@ -125,41 +160,43 @@ def default_lonlat(column_id):
elif column_id.startswith('th.'):
return (13.725377712079784, 100.49263000488281)
# cols for French Guyana only
elif column_id in ('fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
, 'fr.insee.P12_ACTOCC15P_ILT45D'
, 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
, 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
, 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
, 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
, 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
, 'fr.insee.P12_ACTOCC15P_ILT45D'):
return (4.938408371206558, -52.32908248901367)
#elif column_id in ('fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
# , 'fr.insee.P12_ACTOCC15P_ILT45D'
# , 'fr.insee.P12_RP_CHOS', 'fr.insee.P12_RP_HABFOR'
# , 'fr.insee.P12_RP_EAUCH', 'fr.insee.P12_RP_BDWC'
# , 'fr.insee.P12_RP_MIDUR', 'fr.insee.P12_RP_CLIM'
# , 'fr.insee.P12_RP_MIBOIS', 'fr.insee.P12_RP_CASE'
# , 'fr.insee.P12_RP_TTEGOU', 'fr.insee.P12_RP_ELEC'
# , 'fr.insee.P12_ACTOCC15P_ILT45D'):
# return (4.938408371206558, -52.32908248901367)
elif column_id.startswith('fr.'):
return (48.860875144709475, 2.3613739013671875)
elif column_id.startswith('ca.'):
return (43.65594991256823, -79.37965393066406)
elif column_id.startswith('us.census.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('us.dma.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('us.ihme.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('us.bls.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('us.qcew.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('whosonfirst.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('us.epa.'):
return (40.7, -73.9)
return (28.3305906291771, -81.3544048197256)
elif column_id.startswith('eu.'):
raise SkipTest('No tests for Eurostat!')
elif column_id.startswith('br.'):
return (-23.53, -46.63)
elif column_id.startswith('au.'):
return (-33.8806, 151.2131)
else:
raise Exception('No catalog point set for {}'.format(column_id))
@@ -179,46 +216,74 @@ def default_area(column_id):
point=point)
return area
@parameterized(US_CENSUS_MEASURE_COLUMNS)
def test_get_us_census_measure_points(name):
resp = query('''
SELECT * FROM {schema}OBS_GetUSCensusMeasure({point}, '{name}')
'''.format(name=name.replace("'", "''"),
schema='cdb_observatory.' if USE_SCHEMA else '',
point=default_point('')))
rows = resp.fetchall()
assert_equal(1, len(rows))
assert_is_not_none(rows[0][0])
#@parameterized(US_CENSUS_MEASURE_COLUMNS)
#def test_get_us_census_measure_points(name):
# resp = query('''
#SELECT * FROM {schema}OBS_GetUSCensusMeasure({point}, '{name}')
# '''.format(name=name.replace("'", "''"),
# schema='cdb_observatory.' if USE_SCHEMA else '',
# point=default_point('')))
# rows = resp.fetchall()
# assert_equal(1, len(rows))
# assert_is_not_none(rows[0][0])
@parameterized(MEASURE_COLUMNS)
def test_get_measure_areas(column_id, point_only):
if column_id in SKIP_COLUMNS:
raise SkipTest('Column {} should be skipped'.format(column_id))
if point_only:
def grouped_measure_columns():
for numer_ids, numer_aggregate, denom_reltype, section_tags in MEASURE_COLUMNS:
for colgroup in grouper(numer_ids, 50):
yield [c for c in colgroup if c], numer_aggregate, denom_reltype, section_tags
@parameterized(grouped_measure_columns())
def test_get_measure_points(numer_ids, numer_aggregate, denom_reltype, section_tags):
_test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_point(numer_ids[0]))
@parameterized(grouped_measure_columns())
def test_get_measure_areas(numer_ids, numer_aggregate, denom_reltype, section_tags):
if numer_aggregate is None or numer_aggregate.lower() not in ('sum', 'median', 'average'):
return
resp = query('''
SELECT * FROM {schema}OBS_GetMeasure({area}, '{column_id}')
'''.format(column_id=column_id,
schema='cdb_observatory.' if USE_SCHEMA else '',
area=default_area(column_id)))
rows = resp.fetchall()
assert_equal(1, len(rows))
assert_is_not_none(rows[0][0])
if numer_aggregate.lower() in ('median', 'average') \
and (denom_reltype is None \
or denom_reltype.lower() != 'universe'):
return
_test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, default_area(numer_ids[0]))
@parameterized(MEASURE_COLUMNS)
def test_get_measure_points(column_id, point_only):
if column_id in SKIP_COLUMNS:
raise SkipTest('Column {} should be skipped'.format(column_id))
resp = query('''
SELECT * FROM {schema}OBS_GetMeasure({point}, '{column_id}')
'''.format(column_id=column_id,
schema='cdb_observatory.' if USE_SCHEMA else '',
point=default_point(column_id)))
rows = resp.fetchall()
assert_equal(1, len(rows))
assert_is_not_none(rows[0][0])
def _test_measures(numer_ids, numer_aggregate, section_tags, denom_reltype, geom):
in_params = []
for numer_id in numer_ids:
in_params.append({
'numer_id': numer_id,
'normalization': 'predenominated'
})
params = query(u'''
SELECT {schema}OBS_GetMeta({geom}, '{in_params}')
'''.format(schema='cdb_observatory.' if USE_SCHEMA else '',
geom=geom,
in_params=json.dumps(in_params))).fetchone()[0]
# We can get duplicate IDs from multi-denominators, so for now we
# compress those measures into a single
params = OrderedDict([(p['id'], p) for p in params]).values()
assert_equal(len(params), len(in_params),
'Inconsistent out and in params for {}'.format(in_params))
q = u'''
SELECT * FROM {schema}OBS_GetData(ARRAY[({geom}, 1)::geomval], '{params}')
'''.format(schema='cdb_observatory.' if USE_SCHEMA else '',
geom=geom,
params=json.dumps(params).replace(u"'", "''"))
resp = query(q).fetchone()
assert_is_not_none(resp, 'NULL returned for {}'.format(in_params))
rawvals = resp[1]
vals = [v['value'] for v in rawvals]
assert_equal(len(vals), len(in_params))
for i, val in enumerate(vals):
assert_is_not_none(val, 'NULL for {}'.format(in_params[i]['numer_id']))
#@parameterized(CATEGORY_COLUMNS)
#def test_get_category_areas(column_id):
@@ -232,18 +297,18 @@ SELECT * FROM {schema}OBS_GetMeasure({point}, '{column_id}')
# assert_equal(1, len(rows))
# assert_is_not_none(rows[0][0])
@parameterized(CATEGORY_COLUMNS)
def test_get_category_points(column_id):
if column_id in SKIP_COLUMNS:
raise SkipTest('Column {} should be skipped'.format(column_id))
resp = query('''
SELECT * FROM {schema}OBS_GetCategory({point}, '{column_id}')
'''.format(column_id=column_id,
schema='cdb_observatory.' if USE_SCHEMA else '',
point=default_point(column_id)))
rows = resp.fetchall()
assert_equal(1, len(rows))
assert_is_not_none(rows[0][0])
#@parameterized(CATEGORY_COLUMNS)
#def test_get_category_points(column_id):
# if column_id in SKIP_COLUMNS:
# raise SkipTest('Column {} should be skipped'.format(column_id))
# resp = query('''
#SELECT * FROM {schema}OBS_GetCategory({point}, '{column_id}')
# '''.format(column_id=column_id,
# schema='cdb_observatory.' if USE_SCHEMA else '',
# point=default_point(column_id)))
# rows = resp.fetchall()
# assert_equal(1, len(rows))
# assert_is_not_none(rows[0][0])
#@parameterized(BOUNDARY_COLUMNS)
#def test_get_boundaries_by_geometry(column_id):

View File

@@ -74,7 +74,10 @@ for q in (
q_formatted = q.format(
schema='cdb_observatory.' if USE_SCHEMA else '',
)
start = time()
resp = query(q_formatted)
end = time()
print('{} for {}'.format(int(end - start), q_formatted))
if q.lower().startswith('insert'):
if resp.rowcount == 0:
raise Exception('''Performance fixture creation "{}" inserted 0 rows,
@@ -189,29 +192,21 @@ def test_getgeometryscores_performance(geom_complexity, api_method, filters, tar
('simple', 'OBS_GetCategory', None, 'geom', "'us.census.tiger.census_tract'"),
('simple', 'OBS_GetCategory', None, 'offset_geom', "'us.census.tiger.census_tract'"),
('complex', 'OBS_GetMeasure', 'predenominated', 'point', 'NULL'),
('complex', 'OBS_GetMeasure', 'predenominated', 'geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'predenominated', 'offset_geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'area', 'point', 'NULL'),
('complex', 'OBS_GetMeasure', 'area', 'geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'area', 'offset_geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'denominator', 'point', 'NULL'),
('complex', 'OBS_GetMeasure', 'denominator', 'geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'denominator', 'offset_geom', 'NULL'),
('complex', 'OBS_GetCategory', None, 'point', 'NULL'),
('complex', 'OBS_GetCategory', None, 'geom', 'NULL'),
('complex', 'OBS_GetCategory', None, 'offset_geom', 'NULL'),
('complex', 'OBS_GetMeasure', 'predenominated', 'point', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'predenominated', 'geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'predenominated', 'offset_geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'area', 'point', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'area', 'geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'area', 'offset_geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'denominator', 'point', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'denominator', 'geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetMeasure', 'denominator', 'offset_geom', "'us.census.tiger.county'"),
('complex', 'OBS_GetCategory', None, 'point', "'us.census.tiger.census_tract'"),
('complex', 'OBS_GetCategory', None, 'geom', "'us.census.tiger.census_tract'"),
('complex', 'OBS_GetCategory', None, 'offset_geom', "'us.census.tiger.census_tract'"),
])
@@ -273,78 +268,85 @@ def test_getmeasure_performance(geom_complexity, api_method, normalization, geom
('simple', 'denominator', 'geom', "'us.census.tiger.census_tract'"),
('simple', 'denominator', 'offset_geom', "'us.census.tiger.census_tract'"),
('complex', 'predenominated', 'point', 'null'),
('complex', 'predenominated', 'geom', 'null'),
('complex', 'predenominated', 'offset_geom', 'null'),
('complex', 'area', 'point', 'null'),
('complex', 'area', 'geom', 'null'),
('complex', 'area', 'offset_geom', 'null'),
('complex', 'denominator', 'point', 'null'),
('complex', 'denominator', 'geom', 'null'),
('complex', 'denominator', 'offset_geom', 'null'),
('complex', 'predenominated', 'point', "'us.census.tiger.county'"),
('complex', 'predenominated', 'geom', "'us.census.tiger.county'"),
('complex', 'predenominated', 'offset_geom', "'us.census.tiger.county'"),
('complex', 'area', 'point', "'us.census.tiger.county'"),
('complex', 'area', 'geom', "'us.census.tiger.county'"),
('complex', 'area', 'offset_geom', "'us.census.tiger.county'"),
('complex', 'denominator', 'point', "'us.census.tiger.county'"),
('complex', 'denominator', 'geom', "'us.census.tiger.county'"),
('complex', 'denominator', 'offset_geom', "'us.census.tiger.county'"),
])
def test_getmeasure_split_performance(geom_complexity, normalization, geom, boundary):
def test_getdata_performance(geom_complexity, normalization, geom, boundary):
print geom_complexity, normalization, geom, boundary
results = []
cols = ['us.census.acs.B01001002',
'us.census.acs.B01001003',
'us.census.acs.B01001004',
'us.census.acs.B01001005',
'us.census.acs.B01001006',
'us.census.acs.B01001007',
'us.census.acs.B01001008',
'us.census.acs.B01001009',
'us.census.acs.B01001010',
'us.census.acs.B01001011', ]
in_meta = [{"numer_id": col,
"normalization": normalization,
"geom_id": None if boundary.lower() == 'null' else boundary.replace("'", '')}
for col in cols]
rownums = (1, 5, 10, ) if geom_complexity == 'complex' else (10, 50, 100)
for rows in rownums:
stmt = '''
with data as (
SELECT id, data FROM {schema}OBS_GetData(
(SELECT array_agg(({geom}, cartodb_id)::geomval)
FROM obs_perftest_{complexity}
WHERE cartodb_id <= {n}),
(SELECT {schema}OBS_GetMeta(
(SELECT st_setsrid(st_extent({geom}), 4326)
FROM obs_perftest_{complexity}
WHERE cartodb_id <= {n}),
'[{{
"numer_id": "us.census.acs.B01001002",
"normalization": "{normalization}",
"geom_id": {boundary}
}}]'::JSON
))
))
UPDATE obs_perftest_{complexity}
SET measure = (data->0->>'value')::Numeric
FROM data
WHERE obs_perftest_{complexity}.cartodb_id = data.id
;
'''.format(
point_or_poly='point' if geom == 'point' else 'polygon',
complexity=geom_complexity,
schema='cdb_observatory.' if USE_SCHEMA else '',
normalization=normalization,
geom=geom,
boundary=boundary.replace("'", '"'),
n=rows)
start = time()
query(stmt)
end = time()
qps = (rows / (end - start))
results.append({
'rows': rows,
'qps': qps,
'stmt': stmt
})
print rows, ': ', qps, ' QPS'
if 'OBS_RECORD_TEST' in os.environ:
record({
'geom_complexity': geom_complexity,
'api_method': 'OBS_GetData',
'normalization': normalization,
'boundary': boundary,
'geom': geom
}, results)
for num_meta in (1, 10, ):
results = []
for rows in rownums:
stmt = '''
with data as (
SELECT id, data FROM {schema}OBS_GetData(
(SELECT array_agg(({geom}, cartodb_id)::geomval)
FROM obs_perftest_{complexity}
WHERE cartodb_id <= {n}),
(SELECT {schema}OBS_GetMeta(
(SELECT st_setsrid(st_extent({geom}), 4326)
FROM obs_perftest_{complexity}
WHERE cartodb_id <= {n}),
'{in_meta}'::JSON
))
))
UPDATE obs_perftest_{complexity}
SET measure = (data->0->>'value')::Numeric
FROM data
WHERE obs_perftest_{complexity}.cartodb_id = data.id
;
'''.format(
point_or_poly='point' if geom == 'point' else 'polygon',
complexity=geom_complexity,
schema='cdb_observatory.' if USE_SCHEMA else '',
geom=geom,
in_meta=json.dumps(in_meta[0:num_meta]),
n=rows)
start = time()
query(stmt)
end = time()
qps = (rows / (end - start))
results.append({
'rows': rows,
'qps': qps,
'stmt': stmt
})
print rows, ': ', qps, ' QPS'
if 'OBS_RECORD_TEST' in os.environ:
record({
'geom_complexity': geom_complexity,
'api_method': 'OBS_GetData',
'normalization': normalization,
'boundary': boundary,
'geom': geom,
'num_meta': str(num_meta)
}, results)