Compare commits

12 Commits

Author SHA1 Message Date
Andy Eschbacher
a9222018c0 adds docs 2018-03-07 17:03:30 -05:00
Andy Eschbacher
aa9d05f614 replace all refs of get_neighbor to get_weight_and_attr 2018-03-07 15:19:13 -05:00
Andy Eschbacher
e3aa99dae3 refactors internals of analysis data provider 2018-03-05 14:51:12 -05:00
Andy Eschbacher
3174b8797c small refactoring 2018-03-05 11:36:10 -05:00
Andy Eschbacher
5b0e75f1d3 moves to unclaimed number 2018-03-02 09:05:11 -05:00
Andy Eschbacher
39bb6884a3 Merge branch 'develop' into spatial_lag 2018-03-02 09:00:40 -05:00
Andy Eschbacher
09255b586b Merge branch 'develop' into spatial_lag 2018-01-10 12:13:38 -05:00
mehak-sachdeva
ac1dbf95c6 added sql tests for spatial lag 2017-03-10 14:41:44 -05:00
mehak-sachdeva
3685b885df adding spatial lag function, tests and changing data provider name in moran test 2017-03-10 11:50:29 -05:00
mehak-sachdeva
d764f7446f adding the spatial lag function 2017-03-09 09:39:22 -05:00
mehak-sachdeva
d500212426 typo 2017-03-04 08:53:56 -05:00
mehak-sachdeva
b8ee54ea2c creating the spatial lag function 2017-03-04 08:49:50 -05:00
22 changed files with 332 additions and 619 deletions

View File

@@ -1,7 +1,3 @@
0.8.0 (yyyy-mm-dd)
------------------
* Adds `CDB_MoransILocal*` functions that return spatial lag [#202](https://github.com/CartoDB/crankshaft/pull/202)
0.7.0 (2018-02-23)
------------------
* Updated Moran and Markov documentation [#179](https://github.com/CartoDB/crankshaft/pull/179) [#155](https://github.com/CartoDB/crankshaft/pull/155)

View File

@@ -1,6 +1,4 @@
## Moran's I - Spatial Autocorrelation
Note: these functions replace the functions in the _Areas of Interest_ family (still documented below). `CDB_MoransILocal` and `CDB_MoransILocalRate` perform the same analysis as their `CDB_AreasOfInterest*` counterparts but also return spatial lag information, which is needed for creating the Moran's I scatter plot. It is recommended to use the `CDB_MoransILocal*` variants instead, as they will be maintained and improved going forward.
## Areas of Interest Functions
A family of analyses to uncover groupings of areas with consistently high or low values (clusters) and smaller areas with values unlike those around them (outliers). A cluster is labeled by an 'HH' (high value compared to the entire dataset in an area with other high values), or its opposite 'LL'. An outlier is labeled by an 'LH' (low value surrounded by high values) or an 'HL' (the opposite). Each cluster and outlier classification has an associated p-value, a measure of how significant the pattern of highs and lows is compared to a random distribution.
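The cluster and outlier labels described above can be illustrated with a minimal sketch. It assumes each value and the average of its neighbors have already been standardized (so that 0 marks the dataset mean), and it treats values exactly at the mean as "low"; `classify_quadrant` is a hypothetical helper, not part of crankshaft.

```python
def classify_quadrant(value_std, lag_std):
    """Label one geometry from its standardized value and standardized neighbor average.

    Assumption: 0 is the dataset mean, and ties at 0 count as 'low'.
    """
    high_value = value_std > 0
    high_neighbors = lag_std > 0
    if high_value and high_neighbors:
        return 'HH'  # high value among high neighbors (cluster)
    if not high_value and not high_neighbors:
        return 'LL'  # low value among low neighbors (cluster)
    if high_value:
        return 'HL'  # high value surrounded by lows (outlier)
    return 'LH'      # low value surrounded by highs (outlier)


# Hypothetical (value, neighbor average) pairs, one per quadrant
pairs = [(1.2, 0.8), (-0.5, -1.1), (0.9, -0.3), (-0.7, 0.4)]
labels = [classify_quadrant(v, lag) for v, lag in pairs]
print(labels)  # ['HH', 'LL', 'HL', 'LH']
```

The label alone says nothing about significance; in the functions documented here each classification is paired with a p-value.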
@@ -11,108 +9,7 @@ These functions have two forms: local and global. The local versions classify ev
* Rows with null values will be omitted from this analysis. To ensure they are added to the analysis, fill the null-valued cells with an appropriate value such as the mean of a column, the mean of the most recent two time steps, or use a `LEFT JOIN` to get null outputs from the analysis.
* Input query can only accept tables (datasets) in the user's database account. Common table expressions (CTEs) do not work as an input unless specified within the `subquery` argument.
### CDB_MoransILocal(subquery text, column_name text)
This function classifies your data as being part of a cluster, as an outlier, or not part of a pattern based on the significance of a classification. The classification happens through an autocorrelation statistic called Local Moran's I.
#### Arguments
| Name | Type | Description |
|------|------|-------------|
| subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., `SELECT * FROM interesting_table`). This query must have the geometry column name `the_geom` and id column name `cartodb_id` unless otherwise specified in the input arguments |
| column_name | TEXT | Name of the column used for the analysis, passed as a quoted string (e.g., `'interesting_value'`, not `interesting_value` without single quotes). |
| weight type (optional) | TEXT | Type of weight to use when finding neighbors. Currently available options are 'knn' (default) and 'queen'. Read more about weight types in [PySAL's weights documentation](https://pysal.readthedocs.io/en/v1.11.0/users/tutorials/weights.html). |
| num_ngbrs (optional) | INT | Number of neighbors if using k-nearest neighbors weight type. Defaults to 5. |
| permutations (optional) | INT | Number of permutations to check against a random arrangement of the values in `column_name`. This influences the accuracy of the output field `significance`. Defaults to 99. |
| geom_col (optional) | TEXT | The column name for the geometries. Defaults to `'the_geom'` |
| id_col (optional) | TEXT | The column name for the unique ID of each geometry/value pair. Defaults to `'cartodb_id'`. |
#### Returns
A table with the following columns.
| Column Name | Type | Description |
|-------------|------|-------------|
| quads | TEXT | Classification of geometry. Result is one of 'HH' (a high value with neighbors high on average), 'LL' (opposite of 'HH'), 'HL' (a high value surrounded by lows on average), and 'LH' (opposite of 'HL'). Null values are returned when nulls exist in the original data. |
| significance | NUMERIC | The statistical significance (from 0 to 1) of a cluster or outlier classification. Lower numbers are more significant. |
| spatial\_lag | NUMERIC | The 'average' of the neighbors of the value in this row. The average is calculated from its neighborhood -- defined by `weight_type`. |
| spatial\_lag\_std | NUMERIC | The standardized version of `spatial_lag` -- that is, centered on the mean and divided by the standard deviation. Useful as the y-axis in a Moran's I scatter plot. |
| orig\_val | NUMERIC | Values from `column_name`. |
| orig\_val\_std | NUMERIC | Values from `column_name` but centered on the mean and divided by the standard deviation. Useful as the x-axis in Moran's I scatter plots. |
| moran\_stat | NUMERIC | Value of Moran's I (spatial autocorrelation measure) for the geometry with id of `rowid` |
| rowid | INT | Row id of the values which correspond to the input rows. |
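To make the `spatial_lag`, `spatial_lag_std`, and `orig_val_std` columns concrete, here is a minimal sketch of how such values could be derived. It assumes the lag is a plain average over each row's neighbors and that standardization uses the population standard deviation; the document does not state which variant crankshaft uses, and the `values` and `neighbors` inputs are made up for illustration.

```python
from math import sqrt


def standardize(values):
    """Center on the mean and divide by the (population) standard deviation.

    Assumption: population std (divide by n); fails if all values are equal.
    """
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def spatial_lag(values, neighbors):
    """Average the values of each row's neighbors (equal weight per neighbor)."""
    return [sum(values[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]


# Hypothetical data: four areas in a chain, each listing its neighbors by index
values = [10.0, 12.0, 2.0, 4.0]
neighbors = [[1], [0, 2], [1, 3], [2]]

lag = spatial_lag(values, neighbors)   # [12.0, 6.0, 8.0, 2.0]
orig_val_std = standardize(values)     # x-axis of a Moran's I scatter plot
spatial_lag_std = standardize(lag)     # y-axis of a Moran's I scatter plot
print(lag)
```

Plotting `orig_val_std` against `spatial_lag_std` gives the Moran's I scatter plot mentioned in the column descriptions.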
#### Example Usage
```sql
SELECT
c.the_geom,
m.quads,
m.significance,
c.num_cyclists_per_total_population
FROM
cdb_crankshaft.CDB_MoransILocal(
'SELECT * FROM commute_data',
'num_cyclists_per_total_population') As m
JOIN commute_data As c
ON c.cartodb_id = m.rowid;
```
### CDB_MoransILocalRate(subquery text, numerator text, denominator text)
Just like `CDB_MoransILocal`, this function classifies your data as being part of a cluster, as an outlier, or not part of a pattern based on the significance of a classification. This function differs in that it calculates the classifications from the input `numerator` and `denominator` columns, finding the clusters and outliers in the rate formed by those two values.
#### Arguments
| Name | Type | Description |
|------|------|-------------|
| subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., `SELECT * FROM interesting_table`). This query must have the geometry column name `the_geom` and id column name `cartodb_id` unless otherwise specified in the input arguments |
| numerator | TEXT | Name of the numerator for forming a rate to be used in analysis. |
| denominator | TEXT | Name of the denominator for forming a rate to be used in analysis. |
| weight type (optional) | TEXT | Type of weight to use when finding neighbors. Currently available options are 'knn' (default) and 'queen'. Read more about weight types in [PySAL's weights documentation](https://pysal.readthedocs.io/en/v1.11.0/users/tutorials/weights.html). |
| num_ngbrs (optional) | INT | Number of neighbors if using k-nearest neighbors weight type. Defaults to 5. |
| permutations (optional) | INT | Number of permutations to check against a random arrangement of the values in `column_name`. This influences the accuracy of the output field `significance`. Defaults to 99. |
| geom_col (optional) | TEXT | The column name for the geometries. Defaults to `the_geom` |
| id_col (optional) | TEXT | The column name for the unique ID of each geometry/value pair. Defaults to `cartodb_id`. |
#### Returns
A table with the following columns.
| Column Name | Type | Description |
|-------------|------|-------------|
| quads | TEXT | Classification of geometry. Result is one of 'HH' (a high value with neighbors high on average), 'LL' (opposite of 'HH'), 'HL' (a high value surrounded by lows on average), and 'LH' (opposite of 'HL'). Null values are returned when nulls exist in the original data. |
| significance | NUMERIC | The statistical significance (from 0 to 1) of a cluster or outlier classification. Lower numbers are more significant. |
| spatial\_lag | NUMERIC | The 'average' of the neighbors of the value in this row. The average is calculated from its neighborhood -- defined by `weight_type`. |
| spatial\_lag\_std | NUMERIC | The standardized version of `spatial_lag` -- that is, centered on the mean and divided by the standard deviation. |
| orig\_val | NUMERIC | Standardized rate (centered on the mean and normalized by the standard deviation) calculated from `numerator` and `denominator`. This is calculated by [Assuncao Rate](http://pysal.readthedocs.io/en/latest/library/esda/smoothing.html?highlight=assuncao#pysal.esda.smoothing.assuncao_rate) in the PySAL library. |
| orig\_val\_std | NUMERIC | Values from `column_name` but centered on the mean and divided by the standard deviation. Useful as the x-axis in Moran's I scatter plots. |
| moran\_stat | NUMERIC | Value of Moran's I (spatial autocorrelation measure) for the geometry with id of `rowid` |
| rowid | INT | Row id of the values which correspond to the input rows. |
#### Example Usage
```sql
SELECT
c.the_geom,
m.quads,
m.significance,
c.cyclists_per_total_population
FROM
cdb_crankshaft.CDB_MoransILocalRate(
'SELECT * FROM commute_data',
'num_cyclists',
'total_population') As m
JOIN commute_data As c
ON c.cartodb_id = m.rowid;
```
### CDB_AreasOfInterestLocal(subquery text, column_name text) (deprecated)
### CDB_AreasOfInterestLocal(subquery text, column_name text)
This function classifies your data as being part of a cluster, as an outlier, or not part of a pattern based on the significance of a classification. The classification happens through an autocorrelation statistic called Local Moran's I.
@@ -158,7 +55,7 @@ JOIN commute_data As c
ON c.cartodb_id = aoi.rowid;
```
### CDB_AreasOfInterestGlobal(subquery text, column_name text) (deprecated)
### CDB_AreasOfInterestGlobal(subquery text, column_name text)
This function identifies the extent to which geometries cluster (the groupings of geometries with similarly high or low values relative to the mean) or form outliers (areas where geometries have values opposite of their neighbors). The output of this function gives values between -1 and 1 as well as a significance of that classification. Values close to 0 mean that there is little to no spatial pattern in the values compared to what one would see in a randomly distributed collection of geometries and values.
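As a rough illustration of the global statistic described above, the following sketch computes the standard global Moran's I formula on a dense weight matrix. It is not crankshaft's implementation (which delegates to PySAL); the `morans_i` helper and the four-area chain are assumptions made for the example.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2).

    `weights` is a dense n x n matrix with zeros on the diagonal;
    z_i is the deviation of value i from the mean.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical chain of four areas: each touches the next one
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = morans_i([1, 1, 0, 0], w)    # similar values sit together: positive
alternating = morans_i([1, 0, 1, 0], w)  # neighbors always differ: negative
print(clustered, alternating)
```

Positive results indicate clustering and negative results indicate that neighbors tend to hold opposite values; the significance returned by the function says whether either departs meaningfully from a random arrangement.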
@@ -194,7 +91,7 @@ FROM
'num_cyclists_per_total_population')
```
### CDB_AreasOfInterestLocalRate(subquery text, numerator_column text, denominator_column text) (deprecated)
### CDB_AreasOfInterestLocalRate(subquery text, numerator_column text, denominator_column text)
Just like `CDB_AreasOfInterestLocal`, this function classifies your data as being part of a cluster, as an outlier, or not part of a pattern based on the significance of a classification. This function differs in that it calculates the classifications from the input `numerator` and `denominator` columns, finding the clusters and outliers in the rate formed by those two values.
@@ -241,7 +138,7 @@ JOIN commute_data As c
ON c.cartodb_id = aoi.rowid;
```
### CDB_AreasOfInterestGlobalRate(subquery text, column_name text) (deprecated)
### CDB_AreasOfInterestGlobalRate(subquery text, column_name text)
This function identifies the extent to which geometries cluster (the groupings of geometries with similarly high or low values relative to the mean) or form outliers (areas where geometries have values opposite of their neighbors). The output of this function gives values between -1 and 1 as well as a significance of that classification. Values close to 0 mean that there is little to no spatial pattern in the values compared to what one would see in a randomly distributed collection of geometries and values.
@@ -281,7 +178,7 @@ FROM
## Hotspot, Coldspot, and Outlier Functions
These functions are convenience functions for extracting only information that you are interested in exposing based on the outputs of the `CDB_MoransI*` functions. For instance, you can use `CDB_GetSpatialHotspots` to output only the classifications of `HH` and `HL`.
These functions are convenience functions for extracting only information that you are interested in exposing based on the outputs of the `CDB_AreasOfInterest` functions. For instance, you can use `CDB_GetSpatialHotspots` to output only the classifications of `HH` and `HL`.
### Non-rate functions

View File

@@ -17,7 +17,7 @@ AS $$
num_ngbrs, permutations, geom_col, id_col)
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
-- Moran's I Local (internal function) - DEPRECATED
-- Moran's I Local (internal function)
CREATE OR REPLACE FUNCTION
_CDB_AreasOfInterestLocal(
subquery TEXT,
@@ -27,82 +27,16 @@ CREATE OR REPLACE FUNCTION
permutations INT,
geom_col TEXT,
id_col TEXT)
RETURNS TABLE (
moran NUMERIC,
quads TEXT,
significance NUMERIC,
rowid INT,
vals NUMERIC)
RETURNS TABLE (moran NUMERIC, quads TEXT, significance NUMERIC, rowid INT, vals NUMERIC)
AS $$
from crankshaft.clustering import Moran
moran = Moran()
result = moran.local_stat(subquery, column_name, w_type,
num_ngbrs, permutations, geom_col, id_col)
# remove spatial lag
return [(r[6], r[0], r[1], r[7], r[5]) for r in result]
# TODO: use named parameters or a dictionary
return moran.local_stat(subquery, column_name, w_type,
num_ngbrs, permutations, geom_col, id_col)
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
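The index shuffle `(r[6], r[0], r[1], r[7], r[5])` in the deprecated wrapper is easier to follow with the column positions spelled out. The sketch below assumes `local_stat` yields tuples in the same order as the `RETURNS TABLE` clause of `_CDB_MoransILocal`; the sample row is invented.

```python
# Assumed tuple order of moran.local_stat(), taken from the
# RETURNS TABLE clause of _CDB_MoransILocal in this diff
MORAN_LOCAL_COLUMNS = (
    'quads',            # 0
    'significance',     # 1
    'spatial_lag',      # 2
    'spatial_lag_std',  # 3
    'orig_val',         # 4
    'orig_val_std',     # 5
    'moran_stat',       # 6
    'rowid',            # 7
)


def to_legacy_row(r):
    """Reorder one row into the (moran, quads, significance, rowid, vals) layout.

    Drops both spatial lag columns; `vals` is taken from orig_val_std (index 5).
    """
    return (r[6], r[0], r[1], r[7], r[5])


# Hypothetical result row, for illustration only
row = ('HH', 0.01, 1.5, 0.9, 12.0, 1.1, 0.7, 42)
print(to_legacy_row(row))  # (0.7, 'HH', 0.01, 42, 1.1)
```

The rate variant in this diff uses `r[4]` rather than `r[5]` for `vals`, that is, `orig_val` instead of `orig_val_std`.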
-- Moran's I Local (internal function)
CREATE OR REPLACE FUNCTION
_CDB_MoransILocal(
subquery TEXT,
column_name TEXT,
w_type TEXT,
num_ngbrs INT,
permutations INT,
geom_col TEXT,
id_col TEXT)
RETURNS TABLE (
quads TEXT,
significance NUMERIC,
spatial_lag NUMERIC,
spatial_lag_std NUMERIC,
orig_val NUMERIC,
orig_val_std NUMERIC,
moran_stat NUMERIC,
rowid INT)
AS $$
from crankshaft.clustering import Moran
moran = Moran()
return moran.local_stat(subquery, column_name, w_type,
num_ngbrs, permutations, geom_col, id_col)
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
-- Moran's I Local (public-facing function)
-- Replaces CDB_AreasOfInterestLocal
CREATE OR REPLACE FUNCTION
CDB_MoransILocal(
subquery TEXT,
column_name TEXT,
w_type TEXT DEFAULT 'knn',
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_col TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id')
RETURNS TABLE (
quads TEXT,
significance NUMERIC,
spatial_lag NUMERIC,
spatial_lag_std NUMERIC,
orig_val NUMERIC,
orig_val_std NUMERIC,
moran_stat NUMERIC,
rowid INT)
AS $$
SELECT
quads, significance, spatial_lag, spatial_lag_std,
orig_val, orig_val_std, moran_stat, rowid
FROM cdb_crankshaft._CDB_MoransILocal(
subquery, column_name, w_type,
num_ngbrs, permutations, geom_col, id_col);
$$ LANGUAGE SQL VOLATILE PARALLEL UNSAFE;
-- Moran's I Local (public-facing function) - DEPRECATED
CREATE OR REPLACE FUNCTION
CDB_AreasOfInterestLocal(
subquery TEXT,
@@ -198,7 +132,7 @@ AS $$
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
-- Moran's I Local Rate (internal function) - DEPRECATED
-- Moran's I Local Rate (internal function)
CREATE OR REPLACE FUNCTION
_CDB_AreasOfInterestLocalRate(
subquery TEXT,
@@ -210,22 +144,15 @@ CREATE OR REPLACE FUNCTION
geom_col TEXT,
id_col TEXT)
RETURNS
TABLE(
moran NUMERIC,
quads TEXT,
significance NUMERIC,
rowid INT,
vals NUMERIC)
TABLE(moran NUMERIC, quads TEXT, significance NUMERIC, rowid INT, vals NUMERIC)
AS $$
from crankshaft.clustering import Moran
moran = Moran()
# TODO: use named parameters or a dictionary
result = moran.local_rate_stat(subquery, numerator, denominator, w_type, num_ngbrs, permutations, geom_col, id_col)
# remove spatial lag
return [(r[6], r[0], r[1], r[7], r[4]) for r in result]
return moran.local_rate_stat(subquery, numerator, denominator, w_type, num_ngbrs, permutations, geom_col, id_col)
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
-- Moran's I Local Rate (public-facing function) - DEPRECATED
-- Moran's I Local Rate (public-facing function)
CREATE OR REPLACE FUNCTION
CDB_AreasOfInterestLocalRate(
subquery TEXT,
@@ -245,75 +172,6 @@ AS $$
$$ LANGUAGE SQL VOLATILE PARALLEL UNSAFE;
-- Internal function
CREATE OR REPLACE FUNCTION
_CDB_MoransILocalRate(
subquery TEXT,
numerator TEXT,
denominator TEXT,
w_type TEXT,
num_ngbrs INT,
permutations INT,
geom_col TEXT,
id_col TEXT)
RETURNS
TABLE(
quads TEXT,
significance NUMERIC,
spatial_lag NUMERIC,
spatial_lag_std NUMERIC,
orig_val NUMERIC,
orig_val_std NUMERIC,
moran_stat NUMERIC,
rowid INT)
AS $$
from crankshaft.clustering import Moran
moran = Moran()
return moran.local_rate_stat(
subquery,
numerator,
denominator,
w_type,
num_ngbrs,
permutations,
geom_col,
id_col
)
$$ LANGUAGE plpythonu VOLATILE PARALLEL UNSAFE;
-- Moran's I Rate
-- Replaces CDB_AreasOfInterestLocalRate
CREATE OR REPLACE FUNCTION
CDB_MoransILocalRate(
subquery TEXT,
numerator TEXT,
denominator TEXT,
w_type TEXT DEFAULT 'knn',
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_col TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id')
RETURNS
TABLE(
quads TEXT,
significance NUMERIC,
spatial_lag NUMERIC,
spatial_lag_std NUMERIC,
orig_val NUMERIC,
orig_val_std NUMERIC,
moran_stat NUMERIC,
rowid INT)
AS $$
SELECT
quads, significance, spatial_lag, spatial_lag_std,
orig_val, orig_val_std, moran_stat, rowid
FROM cdb_crankshaft._CDB_MoransILocalRate(
subquery, numerator, denominator, w_type,
num_ngbrs, permutations, geom_col, id_col);
$$ LANGUAGE SQL VOLATILE PARALLEL UNSAFE;
-- Moran's I Local Rate only for HH and HL (public-facing function)
CREATE OR REPLACE FUNCTION
CDB_GetSpatialHotspotsRate(

View File

@@ -0,0 +1,16 @@
-- Spatial Lag with kNN neighbors (internal function)
CREATE OR REPLACE FUNCTION
CDB_SpatialLag(
subquery TEXT,
column_name TEXT,
w_type TEXT DEFAULT 'knn',
num_ngbrs INT DEFAULT 5,
geom_col TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id')
RETURNS TABLE (spatial_lag NUMERIC, rowid INT)
AS $$
from crankshaft.spatial_lag import SpatialLag
s_lag = SpatialLag()
return s_lag.spatial_lag(subquery, column_name, w_type,
num_ngbrs, geom_col, id_col)
$$ LANGUAGE plpythonu;
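For readers unfamiliar with the `'knn'` default, here is a minimal sketch of a k-nearest-neighbors spatial lag. It assumes each geometry has been reduced to a single representative point and uses straight-line distance; the real `SpatialLag` class is not shown in this diff and may differ (for example in how PySAL builds weights or breaks ties). The `knn_neighbors` and `knn_spatial_lag` helpers are hypothetical.

```python
from math import hypot


def knn_neighbors(points, k):
    """For each point, return the indices of its k nearest other points.

    Ties in distance are broken by lower index (an assumption of this sketch).
    """
    neighbors = []
    for i, (xi, yi) in enumerate(points):
        by_distance = sorted(
            (hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in by_distance[:k]])
    return neighbors


def knn_spatial_lag(points, values, k):
    """Average the values of each point's k nearest neighbors."""
    return [sum(values[j] for j in nbrs) / len(nbrs)
            for nbrs in knn_neighbors(points, k)]


# Hypothetical points on a line; the last one is far from the rest
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
values = [1.0, 2.0, 3.0, 10.0]
lags = knn_spatial_lag(points, values, k=2)
print(lags)  # [2.5, 2.0, 1.5, 2.5]
```

Note that the isolated point still receives two neighbors, which is the main behavioral difference from contiguity-based weights such as `'queen'`.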

View File

@@ -68,63 +68,6 @@ code|quads
(52 rows)
_cdb_random_seeds
(1 row)
code|quads|diff_orig|expected|moran_stat_not_null|significance_not_null|value_comparison
01|HH|t|t|t|t|t
02|HL|t|t|t|t|t
03|LL|t|t|t|t|t
04|LL|t|t|t|t|t
05|LH|t|t|t|t|t
06|LL|t|t|t|t|t
07|HH|t|t|t|t|t
08|HH|t|t|t|t|t
09|HH|t|t|t|t|t
10|LL|t|t|t|t|t
11|LL|t|t|t|t|t
12|LL|t|t|t|t|t
13|HL|t|t|t|t|t
14|LL|t|t|t|t|t
15|LL|t|t|t|t|t
16|HH|t|t|t|t|t
17|HH|t|t|t|t|t
18|LL|t|t|t|t|t
19|HH|t|t|t|t|t
20|HH|t|t|t|t|t
21|LL|t|t|t|t|t
22|HH|t|t|t|t|t
23|LL|t|t|t|t|t
24|LL|t|t|t|t|t
25|HH|t|t|t|t|t
26|HH|t|t|t|t|t
27|LL|t|t|t|t|t
28|HH|t|t|t|t|t
29|LL|t|t|t|t|t
30|LL|t|t|t|t|t
31|HH|t|t|t|t|t
32|LL|t|t|t|t|t
33|HL|t|t|t|t|t
34|LH|t|t|t|t|t
35|LL|t|t|t|t|t
36|LL|t|t|t|t|t
37|HL|t|t|t|t|t
38|HL|t|t|t|t|t
39|HH|t|t|t|t|t
40|HH|t|t|t|t|t
41|HL|t|t|t|t|t
42|LH|t|t|t|t|t
43|LH|t|t|t|t|t
44|LL|t|t|t|t|t
45|LH|t|t|t|t|t
46|LL|t|t|t|t|t
47|LL|t|t|t|t|t
48|HH|t|t|t|t|t
49|LH|t|t|t|t|t
50|HH|t|t|t|t|t
51|LL|t|t|t|t|t
52|LL|t|t|t|t|t
(52 rows)
_cdb_random_seeds
(1 row)
code|quads
01|HH
@@ -261,63 +204,6 @@ code|quads
(52 rows)
_cdb_random_seeds
(1 row)
code|quads|diff_orig|expected|moran_stat_not_null|significance_not_null
01|HH|t|t|t|t
02|HL|t|t|t|t
03|LL|t|t|t|t
04|LL|t|t|t|t
05|LH|t|t|t|t
06|LL|t|t|t|t
07|HH|t|t|t|t
08|HH|t|t|t|t
09|HH|t|t|t|t
10|LL|t|t|t|t
11|LL|t|t|t|t
12|LL|t|t|t|t
13|HL|t|t|t|t
14|LL|t|t|t|t
15|LL|t|t|t|t
16|HH|t|t|t|t
17|HH|t|t|t|t
18|LL|t|t|t|t
19|HH|t|t|t|t
20|HH|t|t|t|t
21|LL|t|t|t|t
22|HH|t|t|t|t
23|LL|t|t|t|t
24|LL|t|t|t|t
25|HH|t|t|t|t
26|HH|t|t|t|t
27|LL|t|t|t|t
28|HH|t|t|t|t
29|LL|t|t|t|t
30|LL|t|t|t|t
31|HH|t|t|t|t
32|LL|t|t|t|t
33|HL|t|t|t|t
34|LH|t|t|t|t
35|LL|t|t|t|t
36|LL|t|t|t|t
37|HL|t|t|t|t
38|HL|t|t|t|t
39|HH|t|t|t|t
40|HH|t|t|t|t
41|HL|t|t|t|t
42|LH|t|t|t|t
43|LH|t|t|t|t
44|LL|t|t|t|t
45|LH|t|t|t|t
46|LL|t|t|t|t
47|LL|t|t|t|t
48|HH|t|t|t|t
49|LH|t|t|t|t
50|HH|t|t|t|t
51|LL|t|t|t|t
52|LL|t|t|t|t
(52 rows)
_cdb_random_seeds
(1 row)
code|quads
01|HH

View File

@@ -0,0 +1,47 @@
\pset format unaligned
\set ECHO all
\i test/fixtures/spatial_lag_file.sql
SET client_min_messages TO WARNING;
\set ECHO none
rowid|spatial_lag
26|1.5558245520965668
36|0.86458398182170748
47|0.95270659529918711
48|1.2321749304579666
52|0.7067108549416905
63|1.7648393915166776
70|8.5933781209139841
74|8.585129430302981
81|1.2762005195217829
96|8.5862165805895536
97|4.4808366319910027
112|11.952508899443769
113|9.8633331700241413
116|11.950331957611654
117|4.6317289221482971
119|1.5003998060451806
125|4.623201011124598
140|0.81320866139607006
143|1.1652798506659032
145|1.0029589151893603
146|0.73142355952732685
148|0.89698376389437184
151|0.89265718517960235
153|4.4672874115845964
154|0.79363641536354224
160|4.4995909659653517
174|0.99031984840512266
201|0.9850340860853366
203|0.95995789366679918
205|0.44649384203794357
210|0.84287567979703126
216|0.45638121464530523
221|0.83657469267602325
222|0.32468328149670383
226|0.96856924566947733
234|0.98029560059066778
236|0.36577884387846982
239|0.46532563118851461
265|0.55308241153547621
274|0.37779583501369646
(40 rows)

View File

@@ -0,0 +1,45 @@
SET client_min_messages TO WARNING;
\set ECHO none
-- test table (Manhattan census tracts with bachelors' degree and above data)
CREATE TABLE spatial_lag_file (cartodb_id integer, the_geom geometry, value float);
INSERT INTO spatial_lag_file VALUES
geometry, 0),
geometry, 2.1228813559322),
(125,'0106000020E61000000100000001030000000100000013000000EF535568207E52C0C59272F73962444088BB7A15197E52C055C1A8A44E624440F775E09C117E52C0C119FCFD62624440C5387F130A7E52C07061DD7877624440AB3DEC85027E52C02F14B01D8C624440C1C6F5EFFA7D52C08A01124DA06244405A2E1B9DF37D52C04AB4E4F1B4624440E411DC48D97D52C0CC24EA059F6244403F7100FDBE7D52C0BA4BE2AC88624440D0B69A75C67D52C06B0C3A2174624440793E03EACD7D52C0BDC458A65F624440F241CF66D57D52C06E85B01A4B6244406B459BE3DC7D52C0F0C16B9736624440E4486760E47D52C012F6ED2422624440758E01D9EB7D52C0F33AE2900D62444036583849F37D52C075779D0DF96144404C6C3EAE0D7E52C0466117450F624440094FE8F5277E52C07653CA6B25624440EF535568207E52C0C59272F739624440'::geometry,1.29238643634037),
(112,'0106000020E6100000010000000103000000010000000A0000000D7217618A7E52C022C66B5ED5614440EDB94C4D827E52C010C99063EB61444092CF2B9E7A7E52C0B22E6EA301624440548D5E0D507E52C059C0046EDD6144401EE1B4E0457E52C08202EFE4D36144403EB324404D7E52C0E525FF93BF6144406FD6E07D557E52C0B07614E7A8614440EE96E4805D7E52C0207C28D19261444074F04C68927E52C080608E1EBF6144400D7217618A7E52C022C66B5ED5614440'::geometry, 3.42171717171717),
(174,'0106000020E6100000010000000103000000010000001500000046ED7E15E07C52C00F7BA180ED64444073B8567BD87C52C059FD1186016544406094A0BFD07C52C02A1BD65416654440933655F7C87C52C049D6E1E82A654440A9BF5E61C17C52C098158A743F65444030BC92E4B97C52C0A565A4DE53654440FE7E315BB27C52C0A70705A5686544402F3196E9977C52C083C30B2252654440724EECA17D7C52C0234DBC033C654440E57ADB4C857C52C03FC4060B27654440054D4BAC8C7C52C0145B41D31265444061376C5B947C52C0605628D2FD644440C8CF46AE9B7C52C077DCF0BBE9644440124F7633A37C52C0B7291E17D5644440FCC56CC9AA7C52C0567F8461C0644440BD8FA339B27C52C078B306EFAB6444401A16A3AEB57C52C0E8BE9CD9AE6444402AE5B512BA7C52C0E065868DB264444091B41B7DCC7C52C0A9A5B915C264444013656F29E77C52C02CF2EB87D864444046ED7E15E07C52C00F7BA180ED644440'::geometry, 0.30903869820867),
(81,'0106000020E6100000010000000103000000010000000F0000001FBE4C14218052C07C4276DEC65E44404C89247A198052C01EDC9DB5DB5E4440F1845E7F128052C0AF3E1EFAEE5E4440958098840B8052C0A0A9D72D025F444087A8C29FE17F52C0414AECDADE5E4440F4FA93F8DC7F52C096404AECDA5E44409B3BFA5FAE7F52C02DCF83BBB35E4440C7BB2363B57F52C09C6C0377A05E4440F33B4D66BC7F52C0AC014A438D5E4440DDB243FCC37F52C02881CD39785E44400FD6FF39CC7F52C0452C62D8615E444095607138F37F52C0B75ED383825E4440689599D2FA7F52C09C3237DF885E4440389F3A56298052C0170FEF39B05E44401FBE4C14218052C07C4276DEC65E4440'::geometry, 0.504633247739198),
(145,'0106000020E6100000010000000103000000010000000B000000B6D8EDB3CA7C52C09CA4F9635A6344400E6B2A8BC27C52C00E863AAC70634440E525FF93BF7C52C062484E266E6344409DD497A59D7C52C04D8237A4516344407DAEB6627F7C52C0A8C7B60C38634440F3599E07777C52C01C3F541A316344403CBF28417F7C52C0696E85B01A63444056BABBCE867C52C0AABBB20B0663444018778368AD7C52C0990F0874266344404147AB5AD27C52C03CFA5FAE45634440B6D8EDB3CA7C52C09CA4F9635A634440'::geometry, 0.581484315225708),
(143,'0106000020E61000000100000001030000000100000016000000CD785BE9B57D52C065C746205E6344403C4D66BCAD7D52C01898158A74634440AB21718FA57D52C0FAEC80EB8A6344402F34D769A47D52C04CDF6B088E634440A968ACFD9D7D52C09C86A8C29F634440EF75525F967D52C09E280989B46344401C412AC58E7D52C0103E9468C963444097A949F0867D52C0C9FFE4EFDE6344404563EDEF6C7D52C0E1ED4108C8634440FFCA4A93527D52C09F909DB7B163444002840F255A7D52C09EEE3CF19C634440ECFA05BB617D52C0CCD0782288634440BE2F2E55697D52C09BAA7B64736344402D211FF46C7D52C05F27F56569634440A8A624EB707D52C0F910548D5E6344405114E813797D52C0B7B3AF3C48634440E23FDD40817D52C004E3E0D231634440FB3A70CE887D52C00341800C1D634440B2D826158D7D52C08A743FA720634440FA0CA837A37D52C0A4A65D4C3363444016F88A6EBD7D52C034A1496249634440CD785BE9B57D52C065C746205E634440'::geometry, 0.620102977389747),
(210,'0106000020E6100000010000000103000000010000000C00000030629F008A7D52C06B10E6762F674440DF6FB4E3867D52C0A8C7B60C38674440412E71E4817D52C0EE5C18E94567444060AC6F60727D52C02C9FE57970674440B0FF3A376D7D52C05DBF60376C6744407A50508A567D52C0A3957B8159674440130CE71A667D52C035CF11F92E6744406FF607CA6D7D52C0E2067C7E186744409D2CB5DE6F7D52C04DA088450C6744405BEF37DA717D52C08F72309B00674440629F008A917D52C01CD13DEB1A67444030629F008A7D52C06B10E6762F674440'::geometry,0.957387935805202),
geometry, 0.322135750172397),
(26,'0106000020E6100000010000000103000000010000000E0000003FE257ACE17E52C049F59D5F945C444049BDA772DA7E52C0E014562AA85C4440CAE2FE23D37E52C0F9122A38BC5C44400726378AAC7E52C00ABFD4CF9B5C44403FAC376A857E52C074B680D07A5C444017B83CD68C7E52C08A3C49BA665C44406A12BC218D7E52C001A1F5F0655C4440BBB6B75B927E52C01557957D575C4440F69A1E14947E52C0B22D03CE525C4440BEBD6BD0977E52C093C3279D485C4440BC3B32569B7E52C0EC89AE0B3F5C4440A3CEDC43C27E52C0101FD8F15F5C44404D49D6E1E87E52C04162BB7B805C44403FE257ACE17E52C049F59D5F945C4440'::geometry, 0.724247614387081),
geometry, 0.371976866456362),
(140,'0106000020E6100000010000000103000000010000001100000035B8AD2D3C7D52C0573D601E32634440E15D2EE23B7D52C0E0D8B3E73263444062838593347D52C059DFC0E446634440A8902BF52C7D52C02AFD84B35B6344408E959867257D52C08BA71E6970634440A31EA2D11D7D52C02D41464085634440156F641EF97C52C0AE2CD159666344404147AB5AD27C52C03CFA5FAE456344408BC6DADFD97C52C0C9E4D4CE30634440753DD175E17C52C0F8C610001C63444077F69507E97C52C026A94C310763444070D1C952EB7C52C011514CDE00634440616D8C9DF07C52C0B493C151F2624440DA56B3CEF87C52C042B28009DC62444067B8019F1F7D52C0B4E4F1B4FC6244400DAA0D4E447D52C0A46C91B41B63444035B8AD2D3C7D52C0573D601E32634440'::geometry, 0.382920528795016),
geometry,0.568195793875292),
(148,'0106000020E61000000100000001030000000100000009000000B9A7AB3B167D52C0FF5E0A0F9A634440113AE8120E7D52C0E1B37570B06344409ACC785BE97C52C021B07268916344400E6B2A8BC27C52C00E863AAC70634440B6D8EDB3CA7C52C09CA4F9635A6344404147AB5AD27C52C03CFA5FAE45634440156F641EF97C52C0AE2CD15966634440A31EA2D11D7D52C02D41464085634440B9A7AB3B167D52C0FF5E0A0F9A634440'::geometry, 1.11136007170065),
(216,'0106000020E6100000010000000103000000010000001000000012F5824F737C52C02E9276A38F6744406A6D1ADB6B7C52C06B662D05A467444008AC1C5A647C52C019AE0E80B8674440284701A2607C52C008944DB9C267444030A017EE5C7C52C0F6798CF2CC6744402810768A557C52C0753DD175E167444014ECBFCE4D7C52C0A663CE33F6674440BE16F4DE187C52C0F8E12021CA674440723271AB207C52C062F6B2EDB46744401BBAD91F287C52C013B70A62A0674440C44142942F7C52C035EB8CEF8B6744403D450E11377C52C0581F0F7D77674440B648DA8D3E7C52C0EBC6BB23636744408F54DFF9457C52C00DFB3DB14E674440A33A1DC87A7C52C0804A95287B67444012F5824F737C52C02E9276A38F674440'::geometry,0.362781954887218),
geometry,0.844695710009977),
(239,'0106000020E6100000010000000103000000010000000F000000357D76C0757C52C086713788D6684440A437DC476E7C52C005357C0BEB684440CB2BD7DB667C52C0718DCF64FF6844403AE63C635F7C52C07FDDE9CE13694440F96871C6307C52C0166C239EEC6844401904560E2D7C52C023827170E9684440C66CC9AA087C52C0A46DFC89CA684440E198654F027C52C02A1C412AC568444060730E9E097C52C0FFB27BF2B068444051C1E105117C52C003CE52B29C684440410FB56D187C52C0376DC66988684440E4BF4010207C52C095D39E92736844406D0377A04E7C52C06FB88FDC9A684440F646AD307D7C52C079211D1EC2684440357D76C0757C52C086713788D6684440'::geometry,0.141483961550923),
(274,'0106000020E6100000010000000103000000010000001C0000002AC423F1F27B52C04B5AF10D856D44409F0436E7E07B52C0BB0D6ABFB56D44400D6E6B0BCF7B52C0726DA818E76D4440C11E1329CD7B52C050508A56EE6D444082E50819C87B52C0B262B83A006E4440412AC58EC67B52C0BEDC2747016E44401FA2D11DC47B52C0ED2C7AA7026E4440B55373B9C17B52C0FE63213A046E4440033FAA61BF7B52C02106BAF6056E44400A647616BD7B52C0551344DD076E44409A3E3BE0BA7B52C09C8BBFED096E4440E35295B6B87B52C0F46E2C280C6E4440B41CE8A1B67B52C05EBD8A8C0E6E4440D93D7958A87B52C0B68311FB046E4440DFFAB0DEA87B52C092AD2EA7046E4440153C855CA97B52C0FE63213A046E44407901F6D1A97B52C0F8A6E9B3036E44403ECF9F36AA7B52C081768714036E444061A5828AAA7B52C06A4E5E64026E4440552E54FEB57B52C0E0A293A5D66D4440F9F884ECBC7B52C0E78D93C2BC6D44408D429259BD7B52C094675E0EBB6D444062F9F36DC17B52C06762BA10AB6D4440F1F09E03CB7B52C01C0A9FAD836D44405ABDC3EDD07B52C01CB3EC49606D44400C74ED0BE87B52C00FEECEDA6D6D4440CA5356D3F57B52C0EED0B018756D44402AC423F1F27B52C04B5AF10D856D4440'::geometry,0.996865203761755),
(97,'0106000020E61000000100000001030000000100000012000000E3E313B2F37E52C068CD8FBFB4604440529E7939EC7E52C046990D32C9604440C158DFC0E47E52C0D6C743DFDD60444030134548DD7E52C031B5A50EF26044402D5A80B6D57E52C0325706D506614440554E7B4ACE7E52C0102384471B614440ACC612D6C67E52C0BE6A65C22F61444075AC527AA67E52C0F60D4C6E1461444056F146E6917E52C07B9FAA420361444070ECD973997E52C05CE49EAEEE604440A80018CFA07E52C06DAD2F12DA604440F17F4754A87E52C05F5D15A8C56044403BFF76D9AF7E52C0111E6D1CB1604440FCC8AD49B77E52C0D449B6BA9C6044405D8AABCABE7E52C0B48EAA26886044407C4276DEC67E52C02594BE1072604440EA5910CAFB7E52C0D8D2A3A99E604440E3E313B2F37E52C068CD8FBFB4604440'::geometry,39.3959731543624),
(236,'0106000020E6100000010000000103000000010000001400000096ECD808C47C52C0855FEAE74D6944404C6DA983BC7C52C033A7CB62626944407461A417B57C52C0E1EEACDD76694440B3976DA7AD7C52C0EE3EC7478B694440C3633F8BA57C52C03C4A253CA16944405E30B8E68E7C52C05E4A5D328E694440C9ACDEE1767C52C0C26D6DE179694440D7A546E8677C52C039B5334C6D69444044A7E7DD587C52C06F0D6C95606944401B65FD66627C52C00F63D2DF4B69444094347F4C6B7C52C042CEFBFF38694440062D2460747C52C01074B4AA25694440A8A9656B7D7C52C02009FB7612694440C45DBD8A8C7C52C0EAB0C22D1F694440249D8191977C52C00F0C207C28694440D2E1218C9F7C52C0B8AD2D3C2F69444030B95164AD7C52C0D6E3BED53A694440927A4FE5B47C52C0698B6B7C2669444027327381CB7C52C05AF624B03969444096ECD808C47C52C0855FEAE74D694440'::geometry,0.639217898101147),
(146,'0106000020E6100000010000000103000000010000001E0000000E6B2A8BC27C52C00E863AAC7063444006F52D73BA7C52C00EF450DB86634440A43330F2B27C52C0FE2AC0779B634440E469F981AB7C52C0AC72A1F2AF634440EB1D6E87867C52C0EB6E9EEA906344403CA583F57F7C52C08F368E588B63444065187783687C52C087A3AB747763444074965984627C52C060AC6F6072634440EC2E5052607C52C08BA71E69706344404A7EC4AF587C52C039EFFFE384634440BB253960577C52C0DAE6C6F484634440815B77F3547C52C04B5AF10D8563444058E20165537C52C0EC51B81E85634440618907944D7C52C073B9C15087634440280D350A497C52C0FFCD8B135F6344401C7920B2487C52C052D7DAFB54634440EFC7ED974F7C52C0618907944D6344408F71C5C5517C52C090BC732843634440B682A625567C52C0064B75012F634440C7D97404707C52C05323F433F5624440C68B8521727C52C054573ECBF3624440B9C32632737C52C01EE21FB6F462444040F7E5CC767C52C0E04735ECF762444056BABBCE867C52C0AABBB20B066344403CBF28417F7C52C0696E85B01A634440F3599E07777C52C01C3F541A316344407DAEB6627F7C52C0A8C7B60C386344409DD497A59D7C52C04D8237A451634440E525FF93BF7C52C062484E266E6344400E6B2A8BC27C52C00E863AAC70634440'::geometry,0.480781758957655),
(154,'0106000020E61000000100000001030000000100000009000000B9C667B27F7D52C0728A8EE4F263444087A3AB74777D52C08463963D096444409B3C65355D7D52C0F568AA27F36344409C6A2DCC427D52C05303CDE7DC634440155454FD4A7D52C071AE6186C6634440FFCA4A93527D52C09F909DB7B16344404563EDEF6C7D52C0E1ED4108C863444097A949F0867D52C0C9FFE4EFDE634440B9C667B27F7D52C0728A8EE4F2634440'::geometry,1.76741803278689),
(226,'0106000020E6100000010000000103000000010000001000000066A19DD32C7D52C0992D5915E166444047E9D2BF247D52C0C91F0C3CF76644403F73D6A71C7D52C0F911BF620D674440DDB1D826157D52C0E9482EFF21674440F73B1405FA7C52C0A12E52280B67444084F23E8EE67C52C00F643DB5FA6644402B33A5F5B77C52C0766EDA8CD366444074B2D47ABF7C52C0E63FA4DFBE664440946A9F8EC77C52C0B64DF1B8A86644407EC7F0D8CF7C52C0809E060C92664440ABAFAE0AD47C52C0A3570394866644409E996038D77C52C066A032FE7D664440F3E670ADF67C52C0A661F888986644403F1F65C4057D52C0A08D5C37A566444098DEFE5C347D52C0A9F6E978CC66444066A19DD32C7D52C0992D5915E1664440'::geometry,0.319725194873827),
(74,'0106000020E610000001000000010300000001000000130000000F0C207C287F52C08A56EE05665F444037001B10217F52C02733DE567A5F444088A1D5C9197F52C0B1A4DC7D8E5F4440AF95D05D127F52C00C923EADA25F44408FC360FE0A7F52C079EA9106B75F4440C93CF207037F52C0850662D9CC5F4440D331E719FB7E52C021AF0793E25F4440FE0C6FD6E07E52C0D3A3A99ECC5F4440BE4BA94BC67E52C09146054EB65F444079245E9ECE7E52C05A2F8672A25F44408C48145AD67E52C0D7169E978A5F44400C23BDA8DD7E52C03B3AAE46765F44405B79C9FFE47E52C0E04C4C17625F44407B4B395FEC7E52C014ECBFCE4D5F44409B1DA9BEF37E52C0E882FA96395F4440A3AD4A22FB7E52C01D226E4E255F444013F3ACA4157F52C02FFB75A73B5F4440FF59F3E32F7F52C0BEF561BD515F44400F0C207C287F52C08A56EE05665F4440'::geometry,1.11407249466951),
geometry,1.77398720682303),
(153,'0106000020E61000000100000001030000000100000012000000242A5437177F52C04A79AD84EE624440F2ECF2AD0F7F52C099B85510036344401AE1ED41087F52C07784D382176344408F72309B007F52C0EF5696E82C634440B5FE9600FC7E52C0444DF4F9286344403067B62BF47E52C03CA3AD4A2263444059DAA9B9DC7E52C0F3203D450E634440BE688F17D27E52C0620FED63056344407DEBC37AA37E52C0EDEF6C8FDE62444037DE1D19AB7E52C01CD2A8C0C9624440815D4D9EB27E52C09D0E643DB5624440B39AAE27BA7E52C04FCFBBB1A0624440A3E8818FC17E52C071033E3F8C6244405437177FDB7E52C0EF92382BA2624440672AC423F17E52C056629E95B4624440E08101840F7F52C05B25581CCE624440A304FD851E7F52C0E3DD91B1DA624440242A5437177F52C04A79AD84EE624440'::geometry,0.483653522076171),
geometry,1.22920021470746),
geometry,1.22764474083055),
geometry,0.143149284253579),
geometry,0.336783308690652),
(116,'0106000020E6100000010000000103000000010000000D0000001C28F04E3E7E52C04E637B2DE86144405B5EB9DE367E52C0EA3F6B7EFC61444053CE177B2F7E52C098874CF910624440094FE8F5277E52C07653CA6B256244404C6C3EAE0D7E52C0466117450F62444036583849F37D52C075779D0DF9614440AF5B04C6FA7D52C0C72FBC92E4614440111D0247027E52C078F01307D061444060730E9E097E52C0BEFA78E8BB61444033A83638117E52C08DD47B2AA761444061FE0A992B7E52C01CCF6740BD6144401EE1B4E0457E52C08202EFE4D36144401C28F04E3E7E52C04E637B2DE8614440'::geometry,3.43260188087774),
(201,'0106000020E61000000100000001030000000100000014000000F4A62215C67D52C0A1BB24CE8A6644404B1FBAA0BE7D52C0616EF7729F66444002A08A1BB77D52C0C2189128B4664440F3380CE6AF7D52C046CD57C9C766444005FC1A49827D52C066F7E461A16644408A3F8A3A737D52C073BC02D193664440172D40DB6A7D52C0417FA1478C66444051F4C0C7607D52C01633C2DB836644406D54A703597D52C0EF6FD05E7D664440A3C9C518587D52C03750E09D7C664440350C1F11537D52C07BDB4C857866444069FF03AC557D52C0AE635C7171664440C09481035A7D52C0AE4676A56566444021567F84617D52C000FF942A516644405393E00D697D52C0CFD8976C3C664440CC7C073F717D52C08D7BF31B26664440B47405DB887D52C0E868554B3A66444043554CA59F7D52C0D8D30E7F4D664440CC988235CE7D52C012C138B874664440F4A62215C67D52C0A1BB24CE8A664440'::geometry,0.246595904363675),
(113,'0106000020E6100000010000000103000000010000000B0000006FD6E07D557E52C0B07614E7A86144403EB324404D7E52C0E525FF93BF6144401EE1B4E0457E52C08202EFE4D361444061FE0A992B7E52C01CCF6740BD61444033A83638117E52C08DD47B2AA7614440ACAB02B5187E52C00E1137A792614440E3A59BC4207E52C06DAB59677C614440035E66D8287E52C07EA834626661444049F60835437E52C07F164B917C614440EE96E4805D7E52C0207C28D1926144406FD6E07D557E52C0B07614E7A8614440'::geometry,13.8675958188153),
(63,'0106000020E6100000010000000103000000010000001400000069E388B5F87F52C0317E1AF7E65D444090A339B2F27F52C0DA70581AF85D4440AC0320EEEA7F52C0E1CFF0660D5E44405FCE6C57E87F52C04F3FA88B145E44409D9CA1B8E37F52C0E962D34A215E444042B28009DC7F52C09D67EC4B365E4440707D586FD47F52C03F0114234B5E44400FD6FF39CC7F52C0452C62D8615E4440FAEFC16B977F52C0A3586E69355E44405B971AA19F7F52C06DA983BC1E5E4440E605D847A77F52C0BAA46ABB095E4440A1F831E6AE7F52C077137CD3F45D4440C7A17E17B67F52C093567C43E15D44403AE8120EBD7F52C061FC34EECD5D44409015FC36C47F52C08FAA2688BA5D4440C3BAF1EEC87F52C0D1B01875AD5D4440963D096CCE7F52C0028063CF9E5D444029594E42E97F52C000529B38B95D4440DCF5D214018052C01EA5129ED05D444069E388B5F87F52C0317E1AF7E65D4440'::geometry,0.687203791469194),
(151,'0106000020E61000000100000001030000000100000009000000155454FD4A7D52C071AE6186C66344409C6A2DCC427D52C05303CDE7DC63444015E3FC4D287D52C071AE6186C6634440113AE8120E7D52C0E1B37570B0634440B9A7AB3B167D52C0FF5E0A0F9A634440A31EA2D11D7D52C02D41464085634440A8C7B60C387D52C05D33F9669B634440FFCA4A93527D52C09F909DB7B1634440155454FD4A7D52C071AE6186C6634440'::geometry,1.27231418370659),
geometry,1.1443433029909),
geometry,0.120432321152856),
geometry,2.93929712460064),
(47,'0106000020E610000001000000010300000001000000120000006519E258178052C0E692AAED265C44407D586FD40A8052C007D3307C445C4440AEF36F97FD7F52C0D94125AE635C44408DB62A89EC7F52C01EA9BEF38B5C4440F98381E7DE7F52C0BE839F38805C4440D86322A5D97F52C00EBDC5C37B5C4440B9C5FCDCD07F52C01E6FF25B745C44400A849D62D57F52C0AF5DDA70585C4440C1559E40D87F52C0DA3BA3AD4A5C4440376C5B94D97F52C0C5E3A25A445C44405A1135D1E77F52C0300F99F2215C444048C32973F37F52C0AAD55757055C4440B1F7E28BF67F52C00725CCB4FD5B4440185C7347FF7F52C06C7C26FBE75B44407FA65EB7088052C08E3BA583F55B44408A5759DB148052C0BB0CFFE9065C4440AF9811DE1E8052C06667D13B155C44406519E258178052C0E692AAED265C4440'::geometry,3.73983739837398)

View File

@@ -24,25 +24,6 @@ SELECT ppoints.code, m.quads
SELECT cdb_crankshaft._cdb_random_seeds(1234);
-- Moran's I local
SELECT
ppoints.code, m.quads,
abs(avg(m.orig_val_std) OVER ()) < 1e-6 as diff_orig,
CASE WHEN m.quads = 'HL' THEN m.orig_val_std > m.spatial_lag_std
WHEN m.quads = 'HH' THEN m.orig_val_std >= 0 and m.spatial_lag_std >= 0
WHEN m.quads = 'LH' THEN m.orig_val_std < m.spatial_lag_std
WHEN m.quads = 'LL' THEN m.orig_val_std <= 0 and m.spatial_lag_std <= 0
ELSE null END as expected,
moran_stat is not null moran_stat_not_null,
significance >= 0.001 significance_not_null, -- greater than 1/1000 (default)
abs(m.orig_val - ppoints.value) <= 1e-6 as value_comparison
FROM ppoints
JOIN cdb_crankshaft.CDB_MoransILocal('SELECT * FROM ppoints', 'value') m
ON ppoints.cartodb_id = m.rowid
ORDER BY ppoints.code;
SELECT cdb_crankshaft._cdb_random_seeds(1234);
-- Spatial Hotspots
SELECT ppoints.code, m.quads
FROM ppoints
@@ -80,24 +61,6 @@ SELECT ppoints2.code, m.quads
SELECT cdb_crankshaft._cdb_random_seeds(1234);
-- Moran's I local rate
SELECT
ppoints2.code, m.quads,
abs(avg(m.orig_val_std) OVER ()) < 1e-6 as diff_orig,
CASE WHEN m.quads = 'HL' THEN m.orig_val_std > m.spatial_lag_std
WHEN m.quads = 'HH' THEN m.orig_val_std >= 0 and m.spatial_lag_std >= 0
WHEN m.quads = 'LH' THEN m.orig_val_std < m.spatial_lag_std
WHEN m.quads = 'LL' THEN m.orig_val_std <= 0 and m.spatial_lag_std <= 0
ELSE null END as expected,
moran_stat is not null moran_stat_not_null,
significance >= 0.001 significance_not_null -- greater than 1/1000 (default)
FROM ppoints2
JOIN cdb_crankshaft.CDB_MoransILocalRate('SELECT * FROM ppoints2', 'numerator', 'denominator') m
ON ppoints2.cartodb_id = m.rowid
ORDER BY ppoints2.code;
SELECT cdb_crankshaft._cdb_random_seeds(1234);
-- Spatial Hotspots (rate)
SELECT ppoints2.code, m.quads
FROM ppoints2
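
The `expected` column in the two Moran's I local queries above checks that each quadrant label agrees with the signs of the standardized value and its standardized spatial lag. A minimal Python sketch of that relationship follows. `classify_quadrant` is a hypothetical helper, not part of crankshaft, and it assumes a value counts as "high" only when it is strictly greater than zero.

```python
def classify_quadrant(value_std, lag_std):
    """Label a feature by the sign of its standardized value and lag.

    Hypothetical helper: the first letter describes the feature itself,
    the second describes its neighborhood (the spatial lag).
    """
    high_value = value_std > 0
    high_lag = lag_std > 0
    if high_value and high_lag:
        return 'HH'
    if not high_value and high_lag:
        return 'LH'
    if not high_value and not high_lag:
        return 'LL'
    return 'HL'


if __name__ == '__main__':
    # (standardized value, standardized spatial lag) pairs
    for pair in [(1.2, 0.4), (-0.3, 0.9), (-1.1, -0.2), (0.8, -0.5)]:
        print(pair, classify_quadrant(*pair))
```

Under this convention an `HL` feature always has a standardized value above its lag and an `LH` feature always has one below it, which is what the SQL `CASE` expression asserts.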

View File

@@ -0,0 +1,11 @@
\pset format unaligned
\set ECHO all
\i test/fixtures/spatial_lag_file.sql
-- Spatial Lag test
SELECT m.rowid, m.spatial_lag
FROM spatial_lag_file
JOIN cdb_crankshaft.CDB_SpatialLag('SELECT * FROM spatial_lag_file', 'value', 'knn',5, 'the_geom','cartodb_id') m
ON spatial_lag_file.cartodb_id = m.rowid
ORDER BY spatial_lag_file.cartodb_id;

View File

@@ -26,39 +26,63 @@ def verify_data(func):
class AnalysisDataProvider(object):
@verify_data
def get_getis(self, w_type, params):
"""fetch data for getis ord's g"""
query = pu.construct_neighbor_query(w_type, params)
return plpy.execute(query)
def get_weight_and_attrs(self, w_type, params):
"""fetch data for moran's i, getis, and spatial markov analyses
This method returns a feature id, a list of its neighbors' ids, and the
attribute(s) of the feature.
@verify_data
def get_markov(self, w_type, params):
"""fetch data for spatial markov"""
query = pu.construct_neighbor_query(w_type, params)
return plpy.execute(query)
@verify_data
def get_moran(self, w_type, params):
"""fetch data for moran's i analyses"""
Args:
w_type (str): Type of weight. One of ``knn`` (default) or
``queen``.
params (:obj:`dict`): Parameters for data retrieval. The keys are
defined below, with the descriptions of their values.
- `id_col` (str): Name of database index. Defaults to
`cartodb_id`
- `geom_col` (str): Geometry column. Defaults to `the_geom`.
- `subquery` (str): Query to get access to data
- `num_ngbrs` (int, optional): Number of neighbors if using kNN
- `time_cols` (list of str, optional): If using with spatial
markov, this is a list of columns for the analysis. They should
be ordered in time.
- `numerator` (str, optional): The numerator in Moran's I local
rate
- `denominator` (str, optional): Used in conjunction with
`numerator`.
"""
if w_type == 'queen':
geom_type = plpy.execute('''
SELECT DISTINCT ST_GeometryType("{geom_col}") as g
FROM ({subquery}) as _w
WHERE "{geom_col}" is not null;
'''.format(
geom_col=params.get('geom_col'),
subquery=params.get('subquery')
))
if geom_type[0]['g'] not in ('ST_Polygon', 'ST_MultiPolygon'):
raise plpy.error(
'Polygon geometries are needed when using `queen` weights '
'with this analysis. {} was found instead.'.format(
geom_type[0]['g']))
query = pu.construct_neighbor_query(w_type, params)
return plpy.execute(query)
@verify_data
def get_nonspatial_kmeans(self, params):
"""
Fetch data for non-spatial k-means.
Fetch data for non-spatial k-means.
Inputs - a dict (params) with the following keys:
colnames: a (text) list of column names (e.g.,
`['andy', 'cookie']`)
id_col: the name of the id column (e.g., `'cartodb_id'`)
subquery: the subquery for exposing the data (e.g.,
SELECT * FROM favorite_things)
Output:
A SQL query for packaging the data for consumption within
`KMeans().nonspatial`. Format will be a list of length one,
with the first element a dict with keys ('rowid', 'attr1',
'attr2', ...)
Args:
params (:obj:`dict`) - A :obj:`dict` with the following keys:
- colnames: a (text) list of column names (e.g.,
`['andy', 'cookie']`)
- id_col: the name of the id column (e.g., `'cartodb_id'`)
- subquery: the subquery for exposing the data (e.g.,
SELECT * FROM favorite_things)
Returns:
`plpy.response`: A response from the database. The data has been
packaged for consumption within `KMeans().nonspatial`. Format will be a
list of length one, with the first element a dict with keys
('rowid', 'attr1', 'attr2', ...)
"""
agg_cols = ', '.join([
'array_agg({0}) As arr_col{1}'.format(val, idx+1)

View File

@@ -37,7 +37,7 @@ class Getis(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_getis(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
attr_vals = pu.get_attributes(result)
# build PySAL weight object

View File

@@ -1,29 +1,21 @@
"""
Moran's I geostatistics (global clustering & outliers presence)
Functionality relies on a combination of `PySAL
<http://pysal.readthedocs.io/en/latest/>`__ and the data provider supplied in
the class instantiation (which defaults to PostgreSQL's plpy module's `database
access functions <https://www.postgresql.org/docs/10/static/plpython.html>`__).
"""
from collections import OrderedDict
# TODO: Fill in local neighbors which have null/NoneType values with the
# average of their neighborhood
import pysal as ps
from collections import OrderedDict
from crankshaft.analysis_data_provider import AnalysisDataProvider
# crankshaft module
import crankshaft.pysal_utils as pu
from crankshaft.analysis_data_provider import AnalysisDataProvider
# High level interface ---------------------------------------
class Moran(object):
"""Class for calculation of Moran's I statistics (global, local, and local
rate)
Parameters:
data_provider (:obj:`AnalysisDataProvider`): Class for fetching data. See
the `crankshaft.analysis_data_provider` module for more information.
"""
def __init__(self, data_provider=None):
if data_provider is None:
self.data_provider = AnalysisDataProvider()
@@ -36,26 +28,7 @@ class Moran(object):
Moran's I (global)
Implementation building neighbors with a PostGIS database and Moran's I
core clusters with PySAL.
Args:
subquery (str): Query to give access to the data needed. This query
must give access to ``attr_name``, ``geom_col``, and ``id_col``.
attr_name (str): Column name of data to analyze
w_type (str): Type of spatial weight. Must be one of `knn`
or `queen`. See `PySAL documentation
<http://pysal.readthedocs.io/en/latest/users/tutorials/weights.html>`__
for more information.
num_ngbrs (int): If using `knn` for ``w_type``, this
specifies the number of neighbors to be used to define the spatial
neighborhoods.
permutations (int): Number of permutations for performing
conditional randomization to find the p-value. Higher numbers
take longer to return results.
geom_col (str): Name of the geometry column in the dataset for
finding the spatial neighborhoods.
id_col (str): Row index for each value. Usually the database index.
Andy Eschbacher
"""
params = OrderedDict([("id_col", id_col),
("attr1", attr_name),
@@ -63,7 +36,7 @@ class Moran(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_moran(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
# collect attributes
attr_vals = pu.get_attributes(result)
@@ -80,38 +53,8 @@ class Moran(object):
def local_stat(self, subquery, attr,
w_type, num_ngbrs, permutations, geom_col, id_col):
"""
Moran's I (local)
Args:
subquery (str): Query to give access to the data needed. This query
must give access to ``attr``, ``geom_col``, and ``id_col``.
attr (str): Column name of data to analyze
w_type (str): Type of spatial weight. Must be one of `knn`
or `queen`. See `PySAL documentation
<http://pysal.readthedocs.io/en/latest/users/tutorials/weights.html>`__
for more information.
num_ngbrs (int): If using `knn` for ``w_type``, this
specifies the number of neighbors to be used to define the spatial
neighborhoods.
permutations (int): Number of permutations for performing
conditional randomization to find the p-value. Higher numbers
take longer to return results.
geom_col (str): Name of the geometry column in the dataset for
finding the spatial neighborhoods.
id_col (str): Row index for each value. Usually the database index.
Returns:
list of tuples: Where each tuple consists of the following values:
- quadrants classification (one of `HH`, `HL`, `LL`, or `LH`)
- p-value
- spatial lag
- standardized spatial lag (centered on the mean, normalized by the
standard deviation)
- original value
- standardized value
- Moran's I statistic
- original row index
Moran's I implementation for PL/Python
Andy Eschbacher
"""
# geometries with attributes that are null are ignored
@@ -123,7 +66,7 @@ class Moran(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_moran(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
attr_vals = pu.get_attributes(result)
weight = pu.get_weight(result, w_type, num_ngbrs)
@@ -135,45 +78,13 @@ class Moran(object):
# find quadrants for each geometry
quads = quad_position(lisa.q)
# calculate spatial lag
lag = ps.weights.spatial_lag.lag_spatial(weight, lisa.y)
lag_std = ps.weights.spatial_lag.lag_spatial(weight, lisa.z)
return zip(
quads,
lisa.p_sim,
lag,
lag_std,
lisa.y,
lisa.z,
lisa.Is,
weight.id_order
)
return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)
def global_rate_stat(self, subquery, numerator, denominator,
w_type, num_ngbrs, permutations, geom_col, id_col):
"""
Moran's I Rate (global)
Args:
subquery (str): Query to give access to the data needed. This query
must give access to ``numerator``, ``denominator``, ``geom_col``, and ``id_col``.
numerator (str): Column name of numerator to analyze
denominator (str): Column name of the denominator
w_type (str): Type of spatial weight. Must be one of `knn`
or `queen`. See `PySAL documentation
<http://pysal.readthedocs.io/en/latest/users/tutorials/weights.html>`__
for more information.
num_ngbrs (int): If using `knn` for ``w_type``, this
specifies the number of neighbors to be used to define the spatial
neighborhoods.
permutations (int): Number of permutations for performing
conditional randomization to find the p-value. Higher numbers
takes a longer time for getting results.
geom_col (str): Name of the geometry column in the dataset for
finding the spatial neighborhoods.
id_col (str): Row index for each value. Usually the database index.
Andy Eschbacher
"""
params = OrderedDict([("id_col", id_col),
("attr1", numerator),
@@ -182,7 +93,7 @@ class Moran(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_moran(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
# collect attributes
numer = pu.get_attributes(result, 1)
@@ -199,39 +110,8 @@ class Moran(object):
def local_rate_stat(self, subquery, numerator, denominator,
w_type, num_ngbrs, permutations, geom_col, id_col):
"""
Moran's I Local Rate
Args:
subquery (str): Query to give access to the data needed. This query
must give access to ``numerator``, ``denominator``, ``geom_col``, and ``id_col``.
numerator (str): Column name of numerator to analyze
denominator (str): Column name of the denominator
w_type (str): Type of spatial weight. Must be one of `knn`
or `queen`. See `PySAL documentation
<http://pysal.readthedocs.io/en/latest/users/tutorials/weights.html>`__
for more information.
num_ngbrs (int): If using `knn` for ``w_type``, this
specifies the number of neighbors to be used to define the spatial
neighborhoods.
permutations (int): Number of permutations for performing
conditional randomization to find the p-value. Higher numbers
take longer to return results.
geom_col (str): Name of the geometry column in the dataset for
finding the spatial neighborhoods.
id_col (str): Row index for each value. Usually the database index.
Returns:
list of tuples: Where each tuple consists of the following values:
- quadrants classification (one of `HH`, `HL`, `LL`, or `LH`)
- p-value
- spatial lag
- standardized spatial lag (centered on the mean, normalized by the
standard deviation)
- original value (roughly numerator divided by denominator)
- standardized value
- Moran's I statistic
- original row index
Moran's I Local Rate
Andy Eschbacher
"""
# geometries with values that are null are ignored
# resulting in a collection of not as near neighbors
@@ -243,7 +123,7 @@ class Moran(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_moran(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
# collect attributes
numer = pu.get_attributes(result, 1)
@@ -258,20 +138,7 @@ class Moran(object):
# find quadrants for each geometry
quads = quad_position(lisa.q)
# spatial lag
lag = ps.weights.spatial_lag.lag_spatial(weight, lisa.y)
lag_std = ps.weights.spatial_lag.lag_spatial(weight, lisa.z)
return zip(
quads,
lisa.p_sim,
lag,
lag_std,
lisa.y,
lisa.z,
lisa.Is,
weight.id_order
)
return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)
def local_bivariate_stat(self, subquery, attr1, attr2,
permutations, geom_col, id_col,
@@ -287,7 +154,7 @@ class Moran(object):
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_moran(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
# collect attributes
attr1_vals = pu.get_attributes(result, 1)
@@ -310,12 +177,12 @@ class Moran(object):
def map_quads(coord):
"""
Map a quadrant number to Moran's I designation
HH=1, LH=2, LL=3, HL=4
Args:
coord (int): quadrant of a specific measurement
Returns:
classification (one of 'HH', 'LH', 'LL', or 'HL')
Map a quadrant number to Moran's I designation
HH=1, LH=2, LL=3, HL=4
Input:
@param coord (int): quadrant of a specific measurement
Output:
classification (one of 'HH', 'LH', 'LL', or 'HL')
"""
if coord == 1:
return 'HH'
@@ -325,17 +192,17 @@ def map_quads(coord):
return 'LL'
elif coord == 4:
return 'HL'
return None
else:
return None
def quad_position(quads):
"""
Map all quads
Args:
quads (:obj:`numpy.ndarray`): an array of quads classified by
1-4 (PySAL default)
Returns:
list: an array of quads classified by 'HH', 'LL', etc.
Produce Moran's I classification based on n
Input:
@param quads ndarray: an array of quads classified by
1-4 (PySAL default)
Output:
@param list: an array of quads classified by 'HH', 'LL', etc.
"""
return [map_quads(q) for q in quads]
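
The `if`/`elif` chain in `map_quads` is equivalent to a dictionary lookup. The sketch below shows that alternative under the same 1 to 4 numbering. `QUAD_LABELS`, `map_quads_lookup`, and `quad_position_lookup` are hypothetical names used only for illustration; the diff itself keeps the explicit chain.

```python
# Quadrant number -> Moran's I designation, as listed in the docstring
QUAD_LABELS = {1: 'HH', 2: 'LH', 3: 'LL', 4: 'HL'}


def map_quads_lookup(coord):
    """Return the label for a quadrant number, or None if unrecognized."""
    return QUAD_LABELS.get(coord)


def quad_position_lookup(quads):
    """Map every quadrant number in an iterable to its label."""
    return [map_quads_lookup(q) for q in quads]


if __name__ == '__main__':
    print(quad_position_lookup([1, 2, 3, 4, 5]))
```

An unrecognized quadrant number maps to `None` in both forms, so callers see the same result either way.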

View File

@@ -9,14 +9,16 @@ import pysal as ps
def construct_neighbor_query(w_type, query_vals):
"""Return query (a string) used for finding neighbors
@param w_type text: type of neighbors to calculate ('knn' or 'queen')
@param query_vals dict: values used to construct the query
Args:
w_type (:obj:`str`): type of neighbors to calculate. One of 'knn'
or 'queen'.
query_vals (:obj:`dict`): values used to construct the query
"""
if w_type.lower() == 'knn':
return knn(query_vals)
else:
if w_type.lower() == 'queen':
return queen(query_vals)
return knn(query_vals)
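
The version of `construct_neighbor_query` that tests for `queen` first treats kNN as the default, so any other weight type, including a misspelled one, yields a kNN query. The sketch below illustrates that behavior. The `knn` and `queen` bodies are placeholder stubs that return descriptive strings; the real query builders are not shown in this diff.

```python
def knn(query_vals):
    """Placeholder stub standing in for the real kNN query builder."""
    return 'knn query with {} neighbors'.format(query_vals['num_ngbrs'])


def queen(query_vals):
    """Placeholder stub standing in for the real queen query builder."""
    return 'queen query'


def construct_neighbor_query(w_type, query_vals):
    """Return a queen query when requested, otherwise fall back to kNN."""
    if w_type.lower() == 'queen':
        return queen(query_vals)
    return knn(query_vals)


if __name__ == '__main__':
    vals = {'num_ngbrs': 5}
    for w_type in ('knn', 'Queen', 'rook'):
        print(w_type, '->', construct_neighbor_query(w_type, vals))
```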
# Build weight object

View File

@@ -61,7 +61,7 @@ class Markov(object):
"subquery": subquery,
"num_ngbrs": num_ngbrs}
result = self.data_provider.get_markov(w_type, params)
result = self.data_provider.get_weight_and_attrs(w_type, params)
# build weight
weights = pu.get_weight(result, w_type)

View File

@@ -0,0 +1,2 @@
"""Import all functions for spatial lag"""
from spatial_lag import SpatialLag

View File

@@ -0,0 +1,46 @@
"""
Spatial Lag (using local kNN neighbors identifying spatial lag for a feature)
"""
from collections import OrderedDict
import pysal as ps
# crankshaft module
from crankshaft.analysis_data_provider import AnalysisDataProvider
import crankshaft.pysal_utils as pu
# High level interface ---------------------------------------
class SpatialLag(object):
def __init__(self, data_provider=None):
if data_provider is None:
self.data_provider = AnalysisDataProvider()
else:
self.data_provider = data_provider
def spatial_lag(self, subquery, attr,
w_type, num_ngbrs, geom_col, id_col):
"""
Querying spatial lags for kNN neighbors
"""
# geometries with attributes that are null are ignored
# resulting in a collection of not as near neighbors
params = OrderedDict([("id_col", id_col),
("attr1", attr),
("geom_col", geom_col),
("subquery", subquery),
("num_ngbrs", num_ngbrs)])
result = self.data_provider.get_weight_and_attrs(w_type, params)
attr_vals = pu.get_attributes(result)
weight = pu.get_weight(result, w_type, num_ngbrs)
# calculate spatial_lag values
spatial_lag = ps.weights.spatial_lag.lag_spatial(weight, attr_vals)
return zip(spatial_lag, weight.id_order)
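
To make the quantity returned by `spatial_lag` concrete, the sketch below computes a lag in plain Python. It assumes row-standardized weights, in which case the lag of a feature is the mean of its neighbors' values. That is an assumption: this diff does not show how `pu.get_weight` builds or transforms the weights. The function name and the dictionary inputs are hypothetical, and every feature is assumed to have at least one neighbor.

```python
def spatial_lag_sketch(neighbors, values):
    """Return (lag, id) pairs, where lag is the mean of neighbor values.

    neighbors: dict mapping feature id -> list of neighbor ids
    values: dict mapping feature id -> attribute value
    """
    lags = []
    for feature_id in sorted(neighbors):
        ngbr_ids = neighbors[feature_id]
        # Row-standardized weights: each neighbor contributes 1/k
        lag = sum(values[n] for n in ngbr_ids) / float(len(ngbr_ids))
        lags.append((lag, feature_id))
    return lags


if __name__ == '__main__':
    neighbors = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
    values = {1: 10.0, 2: 20.0, 3: 60.0}
    for lag, feature_id in spatial_lag_sketch(neighbors, values):
        print(feature_id, lag)
```

The pairs are ordered as (lag, id) to mirror the `zip(spatial_lag, weight.id_order)` return value above.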

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -41,7 +41,7 @@ class FakeDataProvider(AnalysisDataProvider):
def __init__(self, mock_data):
self.mock_result = mock_data
def get_getis(self, w_type, param):
def get_weight_and_attrs(self, w_type, param):
return self.mock_result

View File

@@ -14,7 +14,7 @@ class FakeDataProvider(AnalysisDataProvider):
def __init__(self, mock_data):
self.mock_result = mock_data
def get_moran(self, w_type, params):
def get_weight_and_attrs(self, w_type, params):
return self.mock_result
@@ -71,10 +71,10 @@ class MoranTest(unittest.TestCase):
random_seeds.set_random_seeds(1234)
result = moran.local_stat('subquery', 'value',
'knn', 5, 99, 'the_geom', 'cartodb_id')
result = [(row[0], row[6]) for row in result]
result = [(row[0], row[1]) for row in result]
zipped_values = zip(result, self.moran_data)
for ([res_quad, res_val], [exp_val, exp_quad]) in zipped_values:
for ([res_val, res_quad], [exp_val, exp_quad]) in zipped_values:
self.assertAlmostEqual(res_val, exp_val)
self.assertEqual(res_quad, exp_quad)
@@ -89,11 +89,11 @@ class MoranTest(unittest.TestCase):
moran = Moran(FakeDataProvider(data))
result = moran.local_rate_stat('subquery', 'numerator', 'denominator',
'knn', 5, 99, 'the_geom', 'cartodb_id')
result = [(row[0], row[6]) for row in result]
result = [(row[0], row[1]) for row in result]
zipped_values = zip(result, self.moran_data)
for ([res_quad, res_val], [exp_val, exp_quad]) in zipped_values:
for ([res_val, res_quad], [exp_val, exp_quad]) in zipped_values:
self.assertAlmostEqual(res_val, exp_val)
def test_moran(self):

View File

@@ -17,7 +17,7 @@ class FakeDataProvider(AnalysisDataProvider):
def __init__(self, data):
self.mock_result = data
def get_markov(self, w_type, params):
def get_weight_and_attrs(self, w_type, params):
return self.mock_result

View File

@@ -0,0 +1,51 @@
import unittest
import numpy as np
from helper import fixture_file
from crankshaft.spatial_lag import SpatialLag
from crankshaft.analysis_data_provider import AnalysisDataProvider
import crankshaft.pysal_utils as pu
from crankshaft import random_seeds
import json
from collections import OrderedDict
class FakeDataProvider(AnalysisDataProvider):
"""Data provider for existing parsed data"""
def __init__(self, mock_data):
self.mock_result = mock_data
def get_weight_and_attrs(self, w_type, params): # pylint: disable=unused-argument
"""mock get_weight_and_attrs"""
return self.mock_result
class SpatialLagTest(unittest.TestCase):
"""Testing class for Spatial Lag function"""
def setUp(self):
self.params = {"id_col": "cartodb_id",
"attr1": "mehak",
"subquery": "SELECT * FROM m_list",
"geom_col": "the_geom",
"num_ngbrs": 10}
self.neighbors_data = json.loads(
open(fixture_file('lag_data.json')).read())
self.lag_result = json.loads(
open(fixture_file('lag_result.json')).read())
def test_local_stat(self):
"""Test Spatial Lag function"""
data = [OrderedDict([('id', d['id']),
('attr1', d['value']),
('neighbors', d['neighbors'])])
for d in self.neighbors_data]
spatial = SpatialLag(FakeDataProvider(data))
result = spatial.spatial_lag('subquery', 'value',
'knn', 5, 'the_geom', 'cartodb_id')
result = [(row[0], row[1]) for row in result]
zipped_values = zip(result, self.lag_result)
for ([res_lag, _], [_, exp_lag]) in zipped_values:
self.assertEqual(res_lag, exp_lag)
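
Each test module in this diff swaps the database-backed provider for a fake that returns canned rows from `get_weight_and_attrs`. The self-contained sketch below shows that injection pattern in isolation. `NeighborCounter` is a hypothetical stand-in for an analysis class and is not part of crankshaft; the fake provider here does not subclass `AnalysisDataProvider` so that the example runs without the package installed.

```python
import unittest


class FakeDataProvider(object):
    """Data provider that returns pre-parsed rows instead of querying."""
    def __init__(self, mock_data):
        self.mock_result = mock_data

    def get_weight_and_attrs(self, w_type, params):
        # Arguments are ignored; the canned rows are returned as-is
        return self.mock_result


class NeighborCounter(object):
    """Hypothetical analysis class that depends only on a data provider."""
    def __init__(self, data_provider):
        self.data_provider = data_provider

    def count(self, w_type, params):
        rows = self.data_provider.get_weight_and_attrs(w_type, params)
        return [(row['id'], len(row['neighbors'])) for row in rows]


class NeighborCounterTest(unittest.TestCase):
    """Exercise the analysis class without touching a database."""
    def test_count(self):
        rows = [{'id': 1, 'attr1': 2.0, 'neighbors': [2, 3]},
                {'id': 2, 'attr1': 4.0, 'neighbors': [1]}]
        counter = NeighborCounter(FakeDataProvider(rows))
        self.assertEqual(counter.count('knn', {}), [(1, 2), (2, 1)])
```

Saved to a file, the test case can be run with `python -m unittest` followed by the module name.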