updating to run in crankshaft. Output still wrong but getting there

adding deps
missing init
2016-05-20 20:44:40 +00:00 · 2016-05-18 17:53:35 -04:00 · 2016-05-18 17:34:22 -04:00 · 2016-05-18 17:32:14 -04:00 · 2016-05-18 17:22:42 -04:00 · 2016-03-30 15:40:29 -04:00
59 changed files with 2912 additions and 679 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,3 @@
+envs/
+*.pyc
+.DS_Store
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,91 @@
+# Development process
+
+Please read the Working Process/Quickstart Guide in [README.md](https://github.com/CartoDB/crankshaft/blob/master/README.md) first.
+
+For any modification of crankshaft, such as adding new features,
+refactoring or bug-fixing, a topic branch must be created out of the `develop`
+branch and be used for the development process.
+
+Modifications are done inside `src/pg/sql` and `src/py/crankshaft`.
+
+Take into account:
+
+*  Tests must be added for any new functionality
+   (inside `src/pg/test`, `src/py/crankshaft/test`) as well as to
+   detect any bugs that are being fixed.
+*  Add or modify the corresponding documentation files in the `doc` folder.
+   Since we expect to have highly technical functions here, an extensive
+   background explanation would be of great help to users of this extension.
+*  Convention: snake case (i.e. `snake_case` and not `CamelCase`)
+   shall be used for all function names.
+   Prefix function names intended for public use with `cdb_`
+   and private functions (to be used only internally inside
+   the extension)  with `_cdb_`.
+
+Once the code is ready to be tested, update the local development installation
+with `sudo make install`.
+This will update the 'dev' version of the extension in `src/pg/` and
+make it available to PostgreSQL.
+It will also install the python package (crankshaft) in a virtual
+environment `envs/dev`.
+
+The version number of the Python package, defined in
+`src/py/crankshaft/setup.py`, will be overridden when
+the package is released and will always match the extension version number,
+but for development it shall be kept as '0.0.0'.
+
+Run the tests with `make test`.
+
+To use the python extension for custom tests, activate the virtual
+environment with:
+
+```
+source envs/dev/bin/activate
+```
+
+Update extension in a working database with:
+
+* `ALTER EXTENSION crankshaft VERSION TO 'current';`
+  `ALTER EXTENSION crankshaft VERSION TO 'dev';`
+
+Note: we always keep the current development version installed as 'dev';
+we update through the 'current' alias to allow changing the extension
+contents but not the version identifier. This will fail if the
+changes involve incompatible function changes such as a different
+return type; in that case the offending function (or the whole extension)
+should be dropped manually before the update.
+
+If the extension has not previously been installed in a database,
+it can be installed directly with:
+
+* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`
+
+Note: the development extension uses the development python virtual
+environment automatically.
+
+Before proceeding to the release process, peer review of the code is
+a must.
+
+Once the feature or bugfix is completed and all the tests are passing
+a Pull-Request shall be created on the topic branch, reviewed by a peer
+and then merged back into the `develop` branch when all CI tests pass.
+
+When the changes in the `develop` branch are to be released in a new
+version of the extension, a PR must be created on the `develop` branch.
+
+The release manager will take hold of the PR at this moment to proceed
+to the release process for a new revision of the extension.
+
+## Relevant development tasks available in the Makefile
+
+```
+* `make help` show a short description of the available targets
+
+* `sudo make install` will generate the extension scripts for the development
+  version ('dev'/'current') and install the python package into the
+  development virtual environment `envs/dev`.
+  Intended for use by developers.
+
+* `make test` will run the tests for the installed development extension.
+  Intended for use by developers.
+```
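The naming convention described in CONTRIBUTING.md (snake case, `cdb_` for public functions, `_cdb_` for private ones) could be checked mechanically. The following is only a sketch: `classify_function_name` and its regular expression are hypothetical helpers for illustration and are not part of crankshaft.

```python
import re

# Hypothetical helper, not part of crankshaft: encodes the convention
# "snake_case, public names start with cdb_, private names with _cdb_".
_NAME_RE = re.compile(r"^_?cdb_[a-z0-9]+(_[a-z0-9]+)*$")


def classify_function_name(name):
    """Return 'public', 'private', or None if the name breaks the convention."""
    if not _NAME_RE.match(name):
        return None
    return "private" if name.startswith("_cdb_") else "public"


if __name__ == "__main__":
    for candidate in ("cdb_moran_local", "_cdb_random_seeds", "cdbMoranLocal"):
        print(candidate, classify_function_name(candidate))
```

A check like this could run as part of the Python tests, but whether that is worthwhile is a project decision.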
--- a/DEPLOYING.md
+++ b/DEPLOYING.md
@@ -1,43 +0,0 @@
-# Workflow
-
-... (branching/merging flow)
-
-# Deployment
-
-...
-
-Deployment to db servers: the next command will install both the Python
-package and the extension.
-
-```
-sudo make install
-```
-
-Installing only the Python package:
-
-```
-sudo pip install python/crankshaft --upgrade
-```
-
-Caveat: note that `pip install ./crankshaft` will install
-from local files, but `pip install crankshaft` will not.
-
-CI: Install and run the tests on the installed extension and package:
-
-```
-(sudo make install && PGUSER=postgres make testinstalled)
-```
-
-Installing the extension in user databases:
-Once installed in a server, the extension can be added
-to a database with the next SQL command:
-
-```
-CREATE EXTENSION crankshaft;
-```
-
-To upgrade the extension to an specific version X.Y.Z:
-
-```
-ALTER EXTENSION crankshaft UPGRADE TO 'X.Y.Z';
-```
--- a/Makefile
+++ b/Makefile
@@ -1,13 +1,70 @@
+include ./Makefile.global
+
 EXT_DIR = src/pg
 PYP_DIR = src/py

 .PHONY: install
 .PHONY: run_tests
+.PHONY: release
+.PHONY: deploy

-install:
+# Generate and install development versions of the extension
+# and python package.
+# The extension is named 'dev' with a 'current' alias for easily upgrading.
+# The Python package is installed in a virtual environment envs/dev/
+# Requires sudo.
+install: ## Generate and install development version of the extension; requires sudo.
 	$(MAKE) -C $(PYP_DIR) install
 	$(MAKE) -C $(EXT_DIR) install

-testinstalled:
-	$(MAKE) -C $(PYP_DIR) testinstalled
-	$(MAKE) -C $(EXT_DIR) installcheck
+# Run the tests for the installed development extension and
+# python package
+test:   ## Run the tests for the development version of the extension
+	$(MAKE) -C $(PYP_DIR) test
+	$(MAKE) -C $(EXT_DIR) test
+
+# Generate a new release into release
+release: ## Generate a new release of the extension. Only for release manager
+	$(MAKE) -C $(EXT_DIR) release
+	$(MAKE) -C $(PYP_DIR) release
+
+# Install the current release.
+# The Python package is installed in a virtual environment envs/X.Y.Z/
+# Requires sudo.
+# Use the RELEASE_VERSION environment variable to deploy a specific version:
+#     sudo make deploy RELEASE_VERSION=1.0.0
+deploy: ## Deploy a released extension. Only for release manager. Requires sudo.
+	$(MAKE) -C $(EXT_DIR) deploy
+	$(MAKE) -C $(PYP_DIR) deploy
+
+# Cleanup development extension script files
+clean-dev: ## clean up development extension script files
+	rm -f src/pg/$(EXTENSION)--*.sql
+
+# Cleanup all releases
+clean-releases: ## clean up all releases
+	rm -rf release/python/*
+	rm -f release/$(EXTENSION)--*.sql
+	rm -f release/$(EXTENSION).control
+
+# Cleanup current/specific version
+clean-release: ## clean up current release
+	rm -rf release/python/$(RELEASE_VERSION)
+	rm -f release/$(RELEASE_VERSION)--*.sql
+
+# Cleanup all virtual environments
+clean-environments: ## clean up all virtual environments
+	rm -rf envs/*
+
+clean-all: clean-dev clean-release clean-environments
+
+help:
+	@IFS=$$'\n' ; \
+	help_lines=(`fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//'`); \
+	for help_line in $${help_lines[@]}; do \
+		IFS=$$'#' ; \
+		help_split=($$help_line) ; \
+		help_command=`echo $${help_split[0]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
+		help_info=`echo $${help_split[2]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
+		printf "%-30s %s\n" $$help_command $$help_info ; \
+	done
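The `help` target above builds its output from the `## ...` comments that follow each target name. As a rough illustration of that idea only, the same extraction can be sketched in Python. `parse_help` is a hypothetical name, and the regular expression is an assumption about what counts as a documented target; it is not a translation of the shell recipe.

```python
import re

# A documented target looks like "name: [deps] ## description".
# Recipe lines (which start with a tab) and plain comments do not match.
_HELP_RE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:[^#]*##\s*(.*?)\s*$")


def parse_help(makefile_text):
    """Return a list of (target, description) pairs for documented targets."""
    matches = (_HELP_RE.match(line) for line in makefile_text.splitlines())
    return [m.groups() for m in matches if m]


if __name__ == "__main__":
    sample = (
        "install: ## Generate and install development version\n"
        "\t$(MAKE) -C src/py install\n"
        "test:   ## Run the tests\n"
    )
    for target, description in parse_help(sample):
        print("%-30s %s" % (target, description))
```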
--- a/Makefile.global
+++ b/Makefile.global
@@ -0,0 +1,6 @@
+SELF_DIR         := $(dir $(lastword $(MAKEFILE_LIST)))
+EXTENSION        = crankshaft
+PACKAGE          = crankshaft
+EXTVERSION       = $(shell grep default_version $(SELF_DIR)/src/pg/$(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
+RELEASE_VERSION ?= $(EXTVERSION)
+SED              = sed
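`Makefile.global` derives `EXTVERSION` from the `default_version` entry of the control file and lets `RELEASE_VERSION` default to it unless the caller has already set it. A minimal Python sketch of the same lookup follows. The function names and the sample control-file text are assumptions made for illustration.

```python
import re

# Same pattern as the sed expression: default_version = 'X.Y.Z'
_DEFAULT_VERSION_RE = re.compile(r"default_version\s*=\s*'([^']*)'")


def read_default_version(control_text):
    """Return the default_version value from control-file text, or None."""
    match = _DEFAULT_VERSION_RE.search(control_text)
    return match.group(1) if match else None


def resolve_release_version(control_text, environ):
    """Mimic `RELEASE_VERSION ?= $(EXTVERSION)`: a value that is already
    defined wins, otherwise fall back to the control-file version."""
    return environ.get("RELEASE_VERSION", read_default_version(control_text))


if __name__ == "__main__":
    # Illustrative control-file content, not copied from the repository.
    control = "comment = 'example'\ndefault_version = '0.0.2'\n"
    print(resolve_release_version(control, {}))
    print(resolve_release_version(control, {"RELEASE_VERSION": "1.0.0"}))
```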
--- a/NEWS.md
+++ b/NEWS.md
@@ -0,0 +1,7 @@
+0.0.2 (2016-03-16)
+------------------
+* New versioning approach using per-version Python virtual environments
+
+0.0.1 (2016-02-22)
+------------------
+* Preliminary release
--- a/README.md
+++ b/README.md
@@ -9,94 +9,63 @@ CartoDB Spatial Analysis extension for PostgreSQL.
 * - *src/pg* contains the PostgreSQL extension source code
 * - *src/py* Python module source code
 * *release* reseleased versions
+* *envs* base directory for Python virtual environments

 ## Requirements

 * pip, virtualenv, PostgreSQL
-* python-scipy system package (see src/py/README.md)
+* python-scipy system package (see [src/py/README.md](https://github.com/CartoDB/crankshaft/blob/master/src/py/README.md))

-# Working Process
+# Working Process -- Quickstart Guide

-## Development
+We distinguish two roles regarding the development cycle of crankshaft:

-Work in `src/pg/sql`, `src/py/crankshaft`;
-use a topic branch. See src/py/README.md
-for the procedure to work with the Python local environment.
+* *developers* will implement new functionality and bugfixes into
+  the codebase and will request new releases of the extension.
+* A *release manager* will attend to these requests and will handle
+  the release process. The release process is sequential:
+  no concurrent releases will ever be in the works.

-Take into account:
+We use the default `develop` branch as the basis for development.
+The `master` branch is used to merge and tag releases to be
+deployed in production.

-*  Always remember to add tests for any new functionality
-   documentation.
-*  Add or modify the corresponding documentation files in the `doc` folder.
-   Since we expect to have highly technical functions here, an extense
-   background explanation would be of great help to users of this extension.
-*  Convention: Use snake case (i.e. `snake_case` and not `CamelCase`) for all
-   functions. Prefix functions intended for public use with `cdb_`
-   and private functions (to be used only internally inside
-   the extension)  with `_cdb_`.
+Developers shall create a new topic branch from `develop` for any new feature
+or bugfix and commit their changes to it and eventually merge back into
+the `develop` branch. When a new release is required a Pull Request
+will be opened against the `develop` branch.

-Update local installation with `sudo make install`
-(this will update the 'dev' version of the extension in 'src/pg/')
+The `develop` pull requests will be handled by the release manager,
+who will merge into master where new releases are prepared and tagged.
+The `master` branch is the sole responsibility of the release manager
+and developers must not commit or merge into it.

-Run the tests with `PGUSER=postgres make test`
+## Development Guidelines

-Update extension in working database with
+For a detailed description of the development process please see
+the [CONTRIBUTING.md](https://github.com/CartoDB/crankshaft/blob/master/CONTRIBUTING.md) guide.

-* `ALTER EXTENSION crankshaft VERSION TO 'current';`
-  `ALTER EXTENSION crankshaft VERSION TO 'dev';`
+Any modification to the source code (`src/pg/sql` for the SQL extension,
+`src/py/crankshaft` for the Python package) shall always be done
+in a topic branch created from the `develop` branch.

-Note: we keep the current development version install as 'dev' always;
-we update through the 'current' alias to allow changing the extension
-contents but not the version identifier. This will fail if the
-changes involve incompatible function changes such as a different
-return type; in that case the offending function (or the whole extension)
-should be dropped manually before the update.
+Tests, documentation and peer code reviewing are required for all
+modifications.

-If the extension has not previously been installed in a database
-we can:
+The tests (both for SQL and Python) are executed by running,
+from the top directory:

-* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`
-
-Once the tests are succeeding a new Pull-Request can be created.
-CI-tests must be checked to be successfull.
-
-Before merging a topic branch peer code reviewing of the code is a must.
+```
+sudo make install
+make test
+```

+To request a new release, which will be handled by the
+release manager, a Pull Request must be created in the `develop`
+branch.

 ## Release

-The release process of a new version of the extension
-shall by performed by the designated *Release Manager*.
-
-Note that we expect to gradually automate this process.
-
-Having checkout the topic branch of the PR to be released:
-
-The version number in `pg/cranckshaft.control` must first be updated.
-To do so [Semantic Versioning 2.0](http://semver.org/) is in order.
-
-We now will explain the process for the case of backwards-compatible
-releases (updating the minor or patch version numbers).
-
-TODO: document the complex case of major releases.
-
-The next command must be executed to produce the main installation
-script for the new release, `release/cranckshaft--X.Y.Z.sql`.
-
-```
-make release
-```
-
-Then, the release manager shall produce upgrade and downgrade scripts
-to migrate to/from the previous release. In the case of minor/patch
-releases this simply consist in extracting the functions that have changed
-and placing them in the proper `release/cranckshaft--X.Y.Z--A.B.C.sql`
-file.
-
-TODO: configure the local enviroment to be used by the release;
-currently should be directory `src/py/X.Y.Z`, but this must be fixed;
-a possibility to explore is to use the `cdb_conf` table.
-
-TODO: testing procedure for the new release
-
-TODO: push, merge, tag, deploy procedures.
+The release and deployment process is described in the
+[RELEASE.md](https://github.com/CartoDB/crankshaft/blob/master/RELEASE.md) guide and it is the responsibility of the designated
+release manager.
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -0,0 +1,93 @@
+# Release & Deployment Process
+
+Please read the Working Process/Quickstart Guide in README.md
+and the Development guidelines in CONTRIBUTING.md.
+
+The release process of a new version of the extension
+shall be performed by the designated *Release Manager*.
+
+Note that we expect to gradually automate more of this process.
+
+Once the PR to be released has been checked, it shall be
+merged back into the `master` branch to prepare the new release.
+
+The version number in `src/pg/crankshaft.control` must first be updated,
+following [Semantic Versioning 2.0](http://semver.org/).
+
+The `NEWS.md` file shall then be updated.
+
+We now will explain the process for the case of backwards-compatible
+releases (updating the minor or patch version numbers).
+
+TODO: document the complex case of major releases.
+
+The next command must be executed to produce the main installation
+script for the new release, `release/crankshaft--X.Y.Z.sql`, and
+also to copy the python package to `release/python/X.Y.Z/crankshaft`.
+
+```
+make release
+```
+
+Then, the release manager shall produce upgrade and downgrade scripts
+to migrate to/from the previous release. In the case of minor/patch
+releases this simply consists of extracting the functions that have changed
+and placing them in the proper `release/crankshaft--X.Y.Z--A.B.C.sql`
+file.
+
+The new release can be deployed for staging/smoke tests with this command:
+
+```
+sudo make deploy
+```
+
+This will copy the current 'X.Y.Z' released version of the extension to
+PostgreSQL. The corresponding Python extension will be installed in a
+virtual environment in `envs/X.Y.Z`.
+
+It can be activated with:
+
+```
+source envs/X.Y.Z/bin/activate
+```
+
+But note that this is needed only for using the package directly;
+the 'X.Y.Z' version of the extension will automatically use the
+python package from this virtual environment.
+
+The `sudo make deploy` operation can also be used for installing
+the new version after it has been released.
+
+To install a specific version 'X.Y.Z' different from the current one
+(which must be present in `release/`) you can run:
+
+```
+sudo make deploy RELEASE_VERSION=X.Y.Z
+```
+
+TODO: testing procedure for the new release.
+
+TODO: procedure for staging deployment.
+
+TODO: procedure for merging to master, tagging and deploying
+in production.
+
+## Relevant release & deployment tasks available in the Makefile
+
+```
+* `make help` show a short description of the available targets
+
+* `make release` will generate a new release (version number defined in
+  `src/pg/crankshaft.control`) into `release/`.
+  Intended for use by the release manager.
+
+* `sudo make deploy` will install the current release X.Y.Z from the
+  `release/` files into PostgreSQL and a Python virtual environment
+  `envs/X.Y.Z`.
+  Intended for use by the release manager and deployment jobs.
+
+* `sudo make deploy RELEASE_VERSION=X.Y.Z` will install specified version
+  previously generated in `release/`
+  into PostgreSQL and a Python virtual environment `envs/X.Y.Z`.
+  Intended for use by the release manager and deployment jobs.
+```
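RELEASE.md names several per-version artifacts: the install script, the upgrade and downgrade scripts, the copied Python package and the virtual environment. The sketch below only spells out those naming patterns for a given pair of versions. `release_artifacts` is a hypothetical helper, not a function of the project, and it assumes the usual PostgreSQL `name--old--new.sql` ordering for migration scripts, which matches the `crankshaft--0.0.1--0.0.2.sql` file in this change.

```python
def release_artifacts(extension, new_version, previous_version):
    """Return the paths a release of `new_version` is expected to involve."""
    return {
        # Main installation script produced by `make release`
        "install": "release/%s--%s.sql" % (extension, new_version),
        # Hand-written migration scripts to/from the previous release
        "upgrade": "release/%s--%s--%s.sql"
        % (extension, previous_version, new_version),
        "downgrade": "release/%s--%s--%s.sql"
        % (extension, new_version, previous_version),
        # Copy of the Python package and the virtualenv used by `make deploy`
        "python": "release/python/%s/%s" % (new_version, extension),
        "virtualenv": "envs/%s" % new_version,
    }


if __name__ == "__main__":
    for kind, path in sorted(release_artifacts("crankshaft", "0.0.2", "0.0.1").items()):
        print("%-10s %s" % (kind, path))
```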
--- a/TODO.md
+++ b/TODO.md
@@ -1,9 +0,0 @@
-* [x] Support versioning
-* [x] Test use of `plpy` from python Package
-* [x] Add `pysal` etc. dependencies
-* [x] Define documentation practices (general, per extension/package?)
-* [x] Add initial function set (WIP)
-* Unify style of function comments
-* [x] Add integration tests
-* Make target to open a new version development (create symlinks, etc.)
-* [x] Should add cartodb ext. as a dependency?
--- a/release/.gitignore
+++ b/release/.gitignore
--- a/release/crankshaft--0.0.1--0.0.2.sql
+++ b/release/crankshaft--0.0.1--0.0.2.sql
@@ -0,0 +1,74 @@
+CREATE OR REPLACE FUNCTION cdb_crankshaft.cdb_crankshaft_version()
+RETURNS text AS $$
+  SELECT '0.0.2'::text;
+$$ language 'sql' STABLE STRICT;
+
+CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_internal_version()
+RETURNS text AS $$
+  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
+$$ language 'sql' STABLE STRICT;
+CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_virtualenvs_path()
+RETURNS text
+AS $$
+  BEGIN
+    RETURN '/home/ubuntu/crankshaft/envs';
+  END;
+$$ language plpgsql IMMUTABLE STRICT;
+
+CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_activate_py()
+RETURNS VOID
+AS $$
+    import os
+    # plpy.notice('%',str(os.environ))
+    # activate virtualenv
+    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
+    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
+    default_venv_path = os.path.join(base_path, crankshaft_version)
+    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
+    activate_path = venv_path + '/bin/activate_this.py'
+    exec(open(activate_path).read(), dict(__file__=activate_path))
+$$ LANGUAGE plpythonu;
+
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft import random_seeds
+  random_seeds.set_random_seeds(seed_value)
+$$ LANGUAGE plpythonu;
+-- Moran's I
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft.cdb_moran_local (
+      t TEXT,
+  	  attr TEXT,
+  	  significance float DEFAULT 0.05,
+  	  num_ngbrs INT DEFAULT 5,
+  	  permutations INT DEFAULT 99,
+  	  geom_column TEXT DEFAULT 'the_geom',
+  	  id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn')
+RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft.clustering import moran_local
+  # TODO: use named parameters or a dictionary
+  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft.cdb_moran_local_rate(t TEXT,
+		 numerator TEXT,
+		 denominator TEXT,
+		 significance FLOAT DEFAULT 0.05,
+		 num_ngbrs INT DEFAULT 5,
+		 permutations INT DEFAULT 99,
+		 geom_column TEXT DEFAULT 'the_geom',
+		 id_col TEXT DEFAULT 'cartodb_id',
+		 w_type TEXT DEFAULT 'knn')
+RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft.clustering import moran_local_rate
+  # TODO: use named parameters or a dictionary
+  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
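The path logic inside `_cdb_crankshaft_activate_py()` in the script above can be read in isolation: the virtualenv defaults to `<base path>/<installed version>`, the `CRANKSHAFT_VENV` environment variable overrides it, and `bin/activate_this.py` is then executed. The sketch below reproduces only the path resolution as a pure function. `resolve_activate_path` is a hypothetical name, and `posixpath` is used instead of `os.path` so the result does not depend on the platform running the example.

```python
import posixpath


def resolve_activate_path(base_path, internal_version, environ):
    """Return the activate_this.py path the extension would execute.

    base_path        -- what _cdb_crankshaft_virtualenvs_path() returns
    internal_version -- what _cdb_crankshaft_internal_version() returns
    environ          -- mapping of environment variables (e.g. os.environ)
    """
    default_venv_path = posixpath.join(base_path, internal_version)
    # CRANKSHAFT_VENV, when set, replaces the per-version default.
    venv_path = environ.get("CRANKSHAFT_VENV", default_venv_path)
    return posixpath.join(venv_path, "bin", "activate_this.py")


if __name__ == "__main__":
    print(resolve_activate_path("/home/ubuntu/crankshaft/envs", "dev", {}))
    print(resolve_activate_path("/home/ubuntu/crankshaft/envs", "dev",
                                {"CRANKSHAFT_VENV": "/tmp/venv"}))
```

Keeping this step separate from the `exec` call makes it possible to test the lookup without a database or a real virtualenv.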
--- a/release/crankshaft--0.0.1.sql
+++ b/release/crankshaft--0.0.1.sql
@@ -0,0 +1,148 @@
+--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
+-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
+-- Internal function.
+-- Set the seeds of the RNGs (Random Number Generators)
+-- used internally.
+CREATE OR REPLACE FUNCTION
+_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
+AS $$
+  from crankshaft import random_seeds
+  random_seeds.set_random_seeds(seed_value)
+$$ LANGUAGE plpythonu;
+-- Moran's I
+CREATE OR REPLACE FUNCTION
+  cdb_moran_local (
+      t TEXT,
+  	  attr TEXT,
+  	  significance float DEFAULT 0.05,
+  	  num_ngbrs INT DEFAULT 5,
+  	  permutations INT DEFAULT 99,
+  	  geom_column TEXT DEFAULT 'the_geom',
+  	  id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn')
+RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
+AS $$
+  from crankshaft.clustering import moran_local
+  # TODO: use named parameters or a dictionary
+  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+
+-- Moran's I Local Rate
+CREATE OR REPLACE FUNCTION
+  cdb_moran_local_rate(t TEXT,
+		 numerator TEXT,
+		 denominator TEXT,
+		 significance FLOAT DEFAULT 0.05,
+		 num_ngbrs INT DEFAULT 5,
+		 permutations INT DEFAULT 99,
+		 geom_column TEXT DEFAULT 'the_geom',
+		 id_col TEXT DEFAULT 'cartodb_id',
+		 w_type TEXT DEFAULT 'knn')
+RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+AS $$
+  from crankshaft.clustering import moran_local_rate
+  # TODO: use named parameters or a dictionary
+  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+-- Function by Stuart Lynn for a simple interpolation of a value
+-- from a polygon table over an arbitrary polygon
+-- (weighted by the area proportion overlapped)
+-- Areal weighting is a very simple form of areal interpolation.
+--
+-- Parameters:
+--   * geom a Polygon geometry which defines the area where a value will be
+--     estimated as the area-weighted sum of a given table/column
+--   * target_table_name table name of the table that provides the values
+--   * target_column column name of the column that provides the values
+--   * schema_name optional parameter to define the schema the target table
+--     belongs to, which is necessary if it's not in the search_path.
+--     Note that target_table_name should never include the schema in it.
+-- Return value:
+--   Areal-weighted interpolation of the column values over the geometry
+CREATE OR REPLACE
+FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
+  RETURNS numeric AS
+$$
+DECLARE
+	result numeric;
+  qualified_name text;
+BEGIN
+  IF schema_name IS NULL THEN
+    qualified_name := Format('%I', target_table_name);
+  ELSE
+    qualified_name := Format('%I.%s', schema_name, target_table_name);
+  END IF;
+  EXECUTE Format('
+    SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
+    FROM %s AS a
+    WHERE $1 && a.the_geom
+  ', target_column, qualified_name)
+  USING geom
+  INTO result;
+  RETURN result;
+END;
+$$ LANGUAGE plpgsql;
+--
+-- Creates N points randomly distributed around the polygon
+--
+-- @param geom - the geometry to be turned into points
+--
+-- @param no_points - the number of points to generate
+--
+-- @param max_iter_per_point - the function generates points in the polygon's bounding box
+-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
+-- misses per point the function accepts before giving up.
+--
+-- Returns: Multipoint with the requested points
+CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
+RETURNS GEOMETRY AS $$
+DECLARE
+  extent GEOMETRY;
+  test_point Geometry;
+  width                NUMERIC;
+  height               NUMERIC;
+  x0                   NUMERIC;
+  y0                   NUMERIC;
+  xp                   NUMERIC;
+  yp                   NUMERIC;
+  no_left              INTEGER;
+  remaining_iterations INTEGER;
+  points               GEOMETRY[];
+  bbox_line            GEOMETRY;
+  intersection_line    GEOMETRY;
+BEGIN
+  extent  := ST_Envelope(geom);
+  width   := ST_XMax(extent) - ST_XMIN(extent);
+  height  := ST_YMax(extent) - ST_YMIN(extent);
+  x0 	  := ST_XMin(extent);
+  y0 	  := ST_YMin(extent);
+  no_left := no_points;
+
+  LOOP
+    if(no_left=0) THEN
+      EXIT;
+    END IF;
+    yp = y0 + height*random();
+    bbox_line  = ST_MakeLine(
+      ST_SetSRID(ST_MakePoint(yp, x0),4326),
+      ST_SetSRID(ST_MakePoint(yp, x0+width),4326)
+    );
+    intersection_line = ST_Intersection(bbox_line,geom);
+  	test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
+	  points := points || test_point;
+	  no_left = no_left - 1 ;
+  END LOOP;
+  RETURN ST_Collect(points);
+END;
+$$
+LANGUAGE plpgsql VOLATILE;
+-- Make sure by default there are no permissions for publicuser
+-- NOTE: this happens at extension creation time, as part of an implicit transaction.
+-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
+
+-- Grant permissions on the schema to publicuser (but just the schema)
+GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
+
+-- Revoke execute permissions on all functions in the schema by default
+-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;
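The query in `cdb_overlap_sum` in the script above sums, over every row whose geometry overlaps the input polygon, the column value scaled by the fraction of that row's area covered by the polygon. The arithmetic alone can be sketched as follows. The areas are made-up numbers standing in for `ST_Area` results, and `overlap_sum` is a hypothetical name.

```python
def overlap_sum(rows):
    """Area-weighted sum.

    rows -- iterable of (value, polygon_area, intersection_area) tuples,
            one per source polygon that overlaps the target geometry.
    """
    return sum(value * intersection / area
               for value, area, intersection in rows)


if __name__ == "__main__":
    rows = [
        (100.0, 4.0, 1.0),  # a quarter of this polygon is covered
        (50.0, 2.0, 2.0),   # this polygon is fully covered
    ]
    print(overlap_sum(rows))
```

One difference from the SQL: with no overlapping rows this sketch returns `0`, whereas SQL `sum` over an empty set yields `NULL`.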
--- a/release/crankshaft--0.0.2--0.0.1.sql
+++ b/release/crankshaft--0.0.2--0.0.1.sql
@@ -0,0 +1,44 @@
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
+AS $$
+  from crankshaft import random_seeds
+  random_seeds.set_random_seeds(seed_value)
+$$ LANGUAGE plpythonu;
+
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft.cdb_moran_local (
+      t TEXT,
+  	  attr TEXT,
+  	  significance float DEFAULT 0.05,
+  	  num_ngbrs INT DEFAULT 5,
+  	  permutations INT DEFAULT 99,
+  	  geom_column TEXT DEFAULT 'the_geom',
+  	  id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn')
+RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
+AS $$
+  from crankshaft.clustering import moran_local
+  # TODO: use named parameters or a dictionary
+  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+
+CREATE OR REPLACE FUNCTION
+cdb_crankshaft.cdb_moran_local_rate(t TEXT,
+		 numerator TEXT,
+		 denominator TEXT,
+		 significance FLOAT DEFAULT 0.05,
+		 num_ngbrs INT DEFAULT 5,
+		 permutations INT DEFAULT 99,
+		 geom_column TEXT DEFAULT 'the_geom',
+		 id_col TEXT DEFAULT 'cartodb_id',
+		 w_type TEXT DEFAULT 'knn')
+RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+AS $$
+  from crankshaft.clustering import moran_local_rate
+  # TODO: use named parameters or a dictionary
+  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+
+DROP FUNCTION IF EXISTS cdb_crankshaft.cdb_crankshaft_version();
+DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_internal_version();
+DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_activate_py();
--- a/release/crankshaft--0.0.2.sql
+++ b/release/crankshaft--0.0.2.sql
@@ -0,0 +1,186 @@
+--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
+-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
+-- Version number of the extension release
+CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
+RETURNS text AS $$
+  SELECT '0.0.2'::text;
+$$ language 'sql' STABLE STRICT;
+
+-- Internal identifier of the installed extension instance
+-- e.g. 'dev' for current development version
+CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
+RETURNS text AS $$
+  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
+$$ language 'sql' STABLE STRICT;
+CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
+RETURNS text
+AS $$
+  BEGIN
+    -- RETURN '/opt/virtualenvs/crankshaft';
+    RETURN '/home/ubuntu/crankshaft/envs';
+  END;
+$$ language plpgsql IMMUTABLE STRICT;
+
+-- Use the crankshaft python module
+CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
+RETURNS VOID
+AS $$
+    import os
+    # plpy.notice('%',str(os.environ))
+    # activate virtualenv
+    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
+    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
+    default_venv_path = os.path.join(base_path, crankshaft_version)
+    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
+    activate_path = venv_path + '/bin/activate_this.py'
+    exec(open(activate_path).read(), dict(__file__=activate_path))
+$$ LANGUAGE plpythonu;
+-- Internal function.
+-- Set the seeds of the RNGs (Random Number Generators)
+-- used internally.
+CREATE OR REPLACE FUNCTION
+_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft import random_seeds
+  random_seeds.set_random_seeds(seed_value)
+$$ LANGUAGE plpythonu;
+-- Moran's I
+CREATE OR REPLACE FUNCTION
+  cdb_moran_local (
+      t TEXT,
+  	  attr TEXT,
+  	  significance float DEFAULT 0.05,
+  	  num_ngbrs INT DEFAULT 5,
+  	  permutations INT DEFAULT 99,
+  	  geom_column TEXT DEFAULT 'the_geom',
+  	  id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn')
+RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft.clustering import moran_local
+  # TODO: use named parameters or a dictionary
+  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+
+-- Moran's I Local Rate
+CREATE OR REPLACE FUNCTION
+  cdb_moran_local_rate(t TEXT,
+		 numerator TEXT,
+		 denominator TEXT,
+		 significance FLOAT DEFAULT 0.05,
+		 num_ngbrs INT DEFAULT 5,
+		 permutations INT DEFAULT 99,
+		 geom_column TEXT DEFAULT 'the_geom',
+		 id_col TEXT DEFAULT 'cartodb_id',
+		 w_type TEXT DEFAULT 'knn')
+RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+AS $$
+  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
+  from crankshaft.clustering import moran_local_rate
+  # TODO: use named parameters or a dictionary
+  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+$$ LANGUAGE plpythonu;
+-- Function by Stuart Lynn for a simple interpolation of a value
+-- from a polygon table over an arbitrary polygon
+-- (weighted by the area proportion overlapped)
+-- Areal weighting is a very simple form of areal interpolation.
+--
+-- Parameters:
+--   * geom a Polygon geometry which defines the area where a value will be
+--     estimated as the area-weighted sum of a given table/column
+--   * target_table_name table name of the table that provides the values
+--   * target_column column name of the column that provides the values
+--   * schema_name optional parameter to define the schema the target table
+--     belongs to, which is necessary if it's not in the search_path.
+--     Note that target_table_name should never include the schema in it.
+-- Return value:
+--   Areal-weighted interpolation of the column values over the geometry
+CREATE OR REPLACE
+FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
+  RETURNS numeric AS
+$$
+DECLARE
+	result numeric;
+  qualified_name text;
+BEGIN
+  IF schema_name IS NULL THEN
+    qualified_name := Format('%I', target_table_name);
+  ELSE
+    qualified_name := Format('%I.%I', schema_name, target_table_name);
+  END IF;
+  EXECUTE Format('
+    SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
+    FROM %s AS a
+    WHERE $1 && a.the_geom
+  ', target_column, qualified_name)
+  USING geom
+  INTO result;
+  RETURN result;
+END;
+$$ LANGUAGE plpgsql;
+--
+-- Creates N points randomly distributed within the polygon
+--
+-- @param geom - the geometry to be turned into points
+--
+-- @param no_points - the number of points to generate
+--
+-- @param max_iter_per_point - the function generates points in the polygon's bounding box
+-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
+-- misses per point the function accepts before giving up.
+--
+-- Returns: Multipoint with the requested points
+CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
+RETURNS GEOMETRY AS $$
+DECLARE
+  extent GEOMETRY;
+  test_point Geometry;
+  width                NUMERIC;
+  height               NUMERIC;
+  x0                   NUMERIC;
+  y0                   NUMERIC;
+  xp                   NUMERIC;
+  yp                   NUMERIC;
+  no_left              INTEGER;
+  remaining_iterations INTEGER;
+  points               GEOMETRY[];
+  bbox_line            GEOMETRY;
+  intersection_line    GEOMETRY;
+BEGIN
+  extent  := ST_Envelope(geom);
+  width   := ST_XMax(extent) - ST_XMIN(extent);
+  height  := ST_YMax(extent) - ST_YMIN(extent);
+  x0 	  := ST_XMin(extent);
+  y0 	  := ST_YMin(extent);
+  no_left := no_points;
+
+  LOOP
+    if(no_left=0) THEN
+      EXIT;
+    END IF;
+    yp = y0 + height*random();
+    bbox_line  = ST_MakeLine(
+      ST_SetSRID(ST_MakePoint(x0, yp),4326),
+      ST_SetSRID(ST_MakePoint(x0+width, yp),4326)
+    );
+    intersection_line = ST_Intersection(bbox_line,geom);
+    test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
+    points := points || test_point;
+    no_left = no_left - 1 ;
+  END LOOP;
+  RETURN ST_Collect(points);
+END;
+$$
+LANGUAGE plpgsql VOLATILE;
+-- Make sure by default there are no permissions for publicuser
+-- NOTE: this happens at extension creation time, as part of an implicit transaction.
+-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
+
+-- Grant permissions on the schema to publicuser (but just the schema)
+GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
+
+-- Revoke execute permissions on all functions in the schema by default
+-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;
--- a/release/crankshaft.control
+++ b/release/crankshaft.control
@@ -0,0 +1,5 @@
+comment = 'CartoDB Spatial Analysis extension'
+default_version = '0.0.2'
+requires = 'plpythonu, postgis, cartodb'
+superuser = true
+schema = cdb_crankshaft
--- a/release/python/.gitignore
+++ b/release/python/.gitignore
--- a/release/python/0.0.1/crankshaft/crankshaft/__init__.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/__init__.py
@@ -0,0 +1,2 @@
+import random_seeds
+import clustering
--- a/release/python/0.0.1/crankshaft/crankshaft/clustering/__init__.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/clustering/__init__.py
@@ -0,0 +1 @@
+from moran import *
--- a/release/python/0.0.1/crankshaft/crankshaft/clustering/moran.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/clustering/moran.py
@@ -0,0 +1,321 @@
+"""
+Moran's I geostatistics (global clustering & outliers presence)
+"""
+
+# TODO: Fill in local neighbors which have null/NoneType values with the
+#       average of their neighborhood
+
+import numpy as np
+import pysal as ps
+import plpy
+
+# High level interface ---------------------------------------
+
+def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    """
+    Moran's I implementation for PL/Python
+    Andy Eschbacher
+    """
+    # TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
+    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
+
+    plpy.notice('** Constructing query')
+
+    # geometries with null attributes are ignored, so the neighbors
+    # selected may be farther away than the true nearest ones
+
+    qvals = {"id_col": id_col,
+             "attr1": attr,
+             "geom_col": geom_column,
+             "table": t,
+             "num_ngbrs": num_ngbrs}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])
+
+    y = get_attributes(r, 1)
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.Moran_Local(y, w)
+
+    # find units of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+
+
+def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    """
+    Moran's I Local Rate
+    Andy Eschbacher
+    """
+
+    plpy.notice('** Constructing query')
+
+    # geometries with null attributes are ignored, so the neighbors
+    # selected may be farther away than the true nearest ones
+
+    qvals = {"id_col": id_col,
+             "numerator": numerator,
+             "denominator": denominator,
+             "geom_col": geom_column,
+             "table": t,
+             "num_ngbrs": num_ngbrs}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % err)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None], [None])
+
+        plpy.notice('r.nrows() = %d' % r.nrows())
+
+    ## collect attributes
+    numer = get_attributes(r, 1)
+    denom = get_attributes(r, 2)
+
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
+
+    # find units of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    ## TODO: Decide on which return values here
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
+
+def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    plpy.notice('** Constructing query')
+
+    qvals = {"num_ngbrs": num_ngbrs,
+             "attr1": attr1,
+             "attr2": attr2,
+             "table": t,
+             "geom_col": geom_column,
+             "id_col": id_col}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % err)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])
+
+    ## collect attributes
+    attr1_vals = get_attributes(r, 1)
+    attr2_vals = get_attributes(r, 2)
+
+    # create weights
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
+
+    plpy.notice("len of Is: %d" % len(lisa.Is))
+
+    # find clustering of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+
+
+# Low level functions ----------------------------------------
+
+def map_quads(coord):
+    """
+        Map a quadrant number to Moran's I designation
+        HH=1, LH=2, LL=3, HL=4
+        Input:
+        :param coord (int): quadrant of a specific measurement
+    """
+    if coord == 1:
+        return 'HH'
+    elif coord == 2:
+        return 'LH'
+    elif coord == 3:
+        return 'LL'
+    elif coord == 4:
+        return 'HL'
+    else:
+        return None
+
+def query_attr_select(params):
+    """
+        Create portion of SELECT statement for attributes involved in query.
+        :param params: dict of information used in query (column names,
+                       table name, etc.)
+    """
+
+    attrs = [k for k in params
+             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
+
+    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
+
+    attr_string = ""
+
+    for idx, val in enumerate(sorted(attrs)):
+        attr_string += template % {"col": val, "alias_num": idx + 1}
+
+    return attr_string
+
+def query_attr_where(params):
+    """
+        Create portion of WHERE clauses for weeding out NULL-valued geometries
+    """
+    attrs = sorted([k for k in params
+                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
+
+    attr_string = []
+
+    for attr in attrs:
+        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
+
+    if len(attrs) == 2:
+        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
+
+    out = " AND ".join(attr_string)
+
+    return out
+
+def knn(params):
+    """SQL query for k-nearest neighbors.
+        :param params: dict of values to fill template
+    """
+
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                              "FROM \"{table}\" As j " \
+                              "WHERE %(attr_where_j)s " \
+                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
+                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## SQL query for finding queens neighbors (all contiguous polygons)
+def queen(params):
+    """SQL query for queen neighbors.
+        :param params: dict of information to fill query
+    """
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                 "FROM \"{table}\" As j " \
+                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
+                 "%(attr_where_j)s)" \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## to add more weight methods open a ticket or pull request
+
+def get_query(w_type, query_vals):
+    """Return requested query.
+        :param w_type: type of neighbors to calculate (knn or queen)
+        :param query_vals: values used to construct the query
+    """
+
+    if w_type == 'knn':
+        return knn(query_vals)
+    else:
+        return queen(query_vals)
+
+def get_attributes(query_res, attr_num):
+    """
+        :param query_res: query results with attributes and neighbors
+        :param attr_num: attribute number (1, 2, ...)
+    """
+    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
+
+## Build weight object
+def get_weight(query_res, w_type='queen', num_ngbrs=5):
+    """
+        Construct PySAL weight from return value of query
+        :param query_res: query results with attributes and neighbors
+    """
+    if w_type == 'knn':
+        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
+        weights = {x['id']: row_normed_weights for x in query_res}
+    elif w_type == 'queen':
+        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
+                            if len(x['neighbors']) > 0
+                            else [] for x in query_res}
+
+    neighbors = {x['id']: x['neighbors'] for x in query_res}
+
+    return ps.W(neighbors, weights)
+
+def quad_position(quads):
+    """
+        Map an array of quadrant numbers to Moran's I designations
+    """
+
+    lisa_sig = np.array([map_quads(q) for q in quads])
+
+    return lisa_sig
+
+def lisa_sig_vals(pvals, quads, threshold):
+    """
+        Label units with p-value <= threshold by quadrant, others 'Not significant'
+    """
+
+    sig = (pvals <= threshold)
+
+    lisa_sig = np.empty(len(sig), np.chararray)
+
+    for idx, val in enumerate(sig):
+        if val:
+            lisa_sig[idx] = map_quads(quads[idx])
+        else:
+            lisa_sig[idx] = 'Not significant'
+
+    return lisa_sig
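The query builders in moran.py fill their templates in two passes: `%` interpolation first writes each parameter key wrapped in braces, and `str.format` then swaps those keys for the actual column names. The following is a minimal standalone sketch of that idea; `build_attr_select` and `RESERVED` are hypothetical names used for illustration, not part of the module. Note that the column names are pasted in without any escaping, so a name containing a double quote would break the generated SQL.

```python
# Keys that describe the query itself rather than an attribute column.
RESERVED = ('id_col', 'geom_col', 'table', 'num_ngbrs')


def build_attr_select(params):
    """Return the attribute part of a SELECT list (hypothetical helper)."""
    attrs = sorted(k for k in params if k not in RESERVED)
    template = 'i."{%(col)s}"::numeric As attr%(alias_num)s, '
    # Pass 1: '%' interpolation leaves one '{key}' placeholder per attribute.
    staged = ''.join(template % {'col': key, 'alias_num': idx + 1}
                     for idx, key in enumerate(attrs))
    # Pass 2: str.format replaces each '{key}' with the real column name.
    return staged.format(**params)


params = {'id_col': 'cartodb_id', 'attr1': 'andy', 'attr2': 'jay_z',
          'table': 'a_list', 'geom_col': 'the_geom', 'num_ngbrs': 5}
attr_select = build_attr_select(params)
```

Sorting the keys is what ties `attr1` to alias 1 and `attr2` to alias 2; with the `numerator`/`denominator` keys used by the rate function, the sort places `denominator` first.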
--- a/release/python/0.0.1/crankshaft/crankshaft/random_seeds.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/random_seeds.py
@@ -0,0 +1,10 @@
+import random
+import numpy
+
+def set_random_seeds(value):
+    """
+    Set the seeds of the RNGs (Random Number Generators)
+    used internally.
+    """
+    random.seed(value)
+    numpy.random.seed(value)
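Seeding matters here because the significance values come from random permutations, so the tests can only compare against fixed fixtures if the generators start from a known state. A small sketch of the effect, using only the standard library `random` module (the module above also seeds NumPy's global generator in the same way); `sample_after_seed` is a hypothetical helper:

```python
import random


def sample_after_seed(seed, n=3):
    """Seed the stdlib generator, then draw n floats from it."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


# Re-seeding with the same value replays the same sequence.
first = sample_after_seed(1234)
second = sample_after_seed(1234)
```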
--- a/release/python/0.0.1/crankshaft/setup.py
+++ b/release/python/0.0.1/crankshaft/setup.py
@@ -0,0 +1,48 @@
+
+"""
+CartoDB Spatial Analysis Python Library
+See:
+https://github.com/CartoDB/crankshaft
+"""
+
+from setuptools import setup, find_packages
+
+setup(
+    name='crankshaft',
+
+    version='0.0.1',
+
+    description='CartoDB Spatial Analysis Python Library',
+
+    url='https://github.com/CartoDB/crankshaft',
+
+    author='Data Services Team - CartoDB',
+    author_email='dataservices@cartodb.com',
+
+    license='MIT',
+
+    classifiers=[
+        'Development Status :: 3 - Alpha',
+        'Intended Audience :: Mapping community',
+        'Topic :: Maps :: Mapping Tools',
+        'License :: OSI Approved :: MIT License',
+        'Programming Language :: Python :: 2.7',
+    ],
+
+    keywords='maps mapping tools spatial analysis geostatistics',
+
+    packages=find_packages(exclude=['contrib', 'docs', 'tests']),
+
+    extras_require={
+        'dev': ['unittest'],
+        'test': ['unittest', 'nose', 'mock'],
+    },
+
+    # The choice of component versions is dictated by what's
+    # provisioned in the production servers.
+    install_requires=['pysal==1.11.0','numpy==1.6.1','scipy==0.17.0'],
+
+    requires=['pysal', 'numpy'],
+
+    test_suite='test'
+)
--- a/release/python/0.0.1/crankshaft/test/fixtures/moran.json
+++ b/release/python/0.0.1/crankshaft/test/fixtures/moran.json
@@ -0,0 +1,52 @@
+[[0.9319096128346788, "HH"],
+[-1.135787401862846, "HL"],
+[0.11732030672508517, "Not significant"],
+[0.6152779669180425, "Not significant"],
+[-0.14657336660125297, "Not significant"],
+[0.6967858120189607, "Not significant"],
+[0.07949310115714454, "Not significant"],
+[0.4703198759258987, "Not significant"],
+[0.4421125200498064, "Not significant"],
+[0.5724288737143592, "Not significant"],
+[0.8970743435692062, "LL"],
+[0.18327334401918674, "Not significant"],
+[-0.01466729201304962, "Not significant"],
+[0.3481559372544409, "Not significant"],
+[0.06547094736902978, "Not significant"],
+[0.15482141569329988, "HH"],
+[0.4373841193538136, "Not significant"],
+[0.15971286468915544, "Not significant"],
+[1.0543588860308968, "Not significant"],
+[1.7372866900020818, "HH"],
+[1.091998586053999, "LL"],
+[0.1171572584252222, "Not significant"],
+[0.08438455015300014, "Not significant"],
+[0.06547094736902978, "Not significant"],
+[0.15482141569329985, "HH"],
+[1.1627044812890683, "HH"],
+[0.06547094736902978, "Not significant"],
+[0.795275137550483, "Not significant"],
+[0.18562939195219, "LL"],
+[0.3010757406693439, "Not significant"],
+[2.8205795942839376, "HH"],
+[0.11259190602909264, "Not significant"],
+[-0.07116352791516614, "Not significant"],
+[-0.09945240794119009, "Not significant"],
+[0.18562939195219, "LL"],
+[0.1832733440191868, "Not significant"],
+[-0.39054253768447705, "Not significant"],
+[-0.1672071289487642, "HL"],
+[0.3337669247916343, "Not significant"],
+[0.2584386102554792, "Not significant"],
+[-0.19733845476322634, "HL"],
+[-0.9379282899805409, "LH"],
+[-0.028770969951095866, "Not significant"],
+[0.051367269430983485, "Not significant"],
+[-0.2172548045913472, "LH"],
+[0.05136726943098351, "Not significant"],
+[0.04191046803899837, "Not significant"],
+[0.7482357030403517, "HH"],
+[-0.014585767863118111, "Not significant"],
+[0.5410013139159929, "Not significant"],
+[1.0223932668429925, "LL"],
+[1.4179402898927476, "LL"]]
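Each fixture row above pairs a local Moran's I value with the label that `lisa_sig_vals` assigns: the quadrant designation when the pseudo p-value is at or below the threshold, and 'Not significant' otherwise. A plain-Python sketch of that labelling rule follows; `label_significant` is a hypothetical helper and the p-values and quadrants are made-up illustrative inputs, not taken from the fixture.

```python
# Quadrant numbering used by map_quads: HH=1, LH=2, LL=3, HL=4.
QUAD_LABELS = {1: 'HH', 2: 'LH', 3: 'LL', 4: 'HL'}


def label_significant(p_values, quads, threshold):
    """Keep the quadrant label only where p <= threshold."""
    return [QUAD_LABELS.get(q) if p <= threshold else 'Not significant'
            for p, q in zip(p_values, quads)]


# Illustrative inputs only (assumed values).
labels = label_significant([0.01, 0.20, 0.05], [1, 3, 4], 0.05)
```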
--- a/release/python/0.0.1/crankshaft/test/fixtures/neighbors.json
+++ b/release/python/0.0.1/crankshaft/test/fixtures/neighbors.json
@@ -0,0 +1,54 @@
+[
+    {"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
+    {"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
+    {"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
+    {"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
+    {"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
+    {"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
+    {"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
+    {"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
+    {"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
+    {"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
+    {"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
+    {"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
+    {"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
+    {"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
+    {"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
+    {"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
+    {"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
+    {"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
+    {"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
+    {"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
+    {"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
+    {"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
+    {"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
+    {"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
+    {"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
+    {"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
+    {"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
+    {"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
+    {"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
+    {"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
+    {"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
+    {"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
+    {"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
+    {"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
+    {"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
+    {"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
+    {"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
+    {"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
+    {"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
+    {"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
+    {"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
+    {"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
+    {"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
+    {"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
+    {"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
+    {"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
+    {"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
+    {"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
+    {"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
+    {"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
+    {"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
+    {"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
+  ]
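Each row of this fixture lists the ids of an observation's five nearest neighbors. For `knn` weights, `get_weight` gives every neighbor the same row-standardized weight `1 / k`, so each row of weights sums to 1. A sketch of the two dictionaries it builds, without PySAL; `knn_weights` is a hypothetical helper and the two rows are copied from the fixture above:

```python
def knn_weights(rows, k):
    """Build neighbor and weight dicts keyed by observation id."""
    neighbors = {row['id']: row['neighbors'] for row in rows}
    # Every observation has exactly k neighbors, each weighted 1/k.
    weights = {row['id']: [1.0 / k] * k for row in rows}
    return neighbors, weights


rows = [
    {"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
    {"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
]
neighbors, weights = knn_weights(rows, k=5)
```

In the module these dictionaries are then handed to `ps.W(neighbors, weights)`.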
--- a/release/python/0.0.1/crankshaft/test/helper.py
+++ b/release/python/0.0.1/crankshaft/test/helper.py
@@ -0,0 +1,13 @@
+import unittest
+
+from mock_plpy import MockPlPy
+plpy = MockPlPy()
+
+import sys
+sys.modules['plpy'] = plpy
+
+import os
+
+def fixture_file(name):
+    dir = os.path.dirname(os.path.realpath(__file__))
+    return os.path.join(dir, 'fixtures', name)
--- a/release/python/0.0.1/crankshaft/test/mock_plpy.py
+++ b/release/python/0.0.1/crankshaft/test/mock_plpy.py
@@ -0,0 +1,34 @@
+import re
+
+class MockPlPy:
+    def __init__(self):
+        self._reset()
+
+    def _reset(self):
+        self.infos = []
+        self.notices = []
+        self.debugs = []
+        self.logs = []
+        self.warnings = []
+        self.errors = []
+        self.fatals = []
+        self.executes = []
+        self.results = []
+        self.prepares = []
+        self.results = []
+
+    def _define_result(self, query, result):
+        pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
+        self.results.append([pattern, result])
+
+    def notice(self, msg):
+        self.notices.append(msg)
+
+    def info(self, msg):
+        self.infos.append(msg)
+
+    def execute(self, query): # TODO: additional arguments
+        for result in self.results:
+            if result[0].match(query):
+                return result[1]
+        return []
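The mock resolves a query by trying each registered pattern with `re.match`, which only matches at the start of the string, case-insensitively here. A reduced sketch of that lookup shows the consequence; `TinyMockPlPy` is a hypothetical cut-down stand-in for the class above:

```python
import re


class TinyMockPlPy(object):
    """Cut-down stand-in that keeps only the query/result lookup."""

    def __init__(self):
        self.results = []

    def define_result(self, query, result):
        pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
        self.results.append((pattern, result))

    def execute(self, query):
        # The first pattern matching at the start of the query wins.
        for pattern, result in self.results:
            if pattern.match(query):
                return result
        return []


mock = TinyMockPlPy()
mock.define_result('select', [{'id': 1}])
hit = mock.execute('SELECT 1')     # matches despite the different case
miss = mock.execute('  SELECT 1')  # leading whitespace defeats re.match
```

A test that registers `'select'` therefore relies on the generated SQL starting with `SELECT`, which the `knn` and `queen` builders do.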
--- a/release/python/0.0.1/crankshaft/test/test_clustering_moran.py
+++ b/release/python/0.0.1/crankshaft/test/test_clustering_moran.py
@@ -0,0 +1,144 @@
+import unittest
+import numpy as np
+
+import unittest
+
+
+# from mock_plpy import MockPlPy
+# plpy = MockPlPy()
+#
+# import sys
+# sys.modules['plpy'] = plpy
+from helper import plpy, fixture_file
+
+import crankshaft.clustering as cc
+from crankshaft import random_seeds
+import json
+
+class MoranTest(unittest.TestCase):
+    """Testing class for Moran's I functions."""
+
+    def setUp(self):
+        plpy._reset()
+        self.params = {"id_col": "cartodb_id",
+                       "attr1": "andy",
+                       "attr2": "jay_z",
+                       "table": "a_list",
+                       "geom_col": "the_geom",
+                       "num_ngbrs": 321}
+        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
+        self.moran_data = json.loads(open(fixture_file('moran.json')).read())
+
+    def test_map_quads(self):
+        """Test map_quads."""
+        self.assertEqual(cc.map_quads(1), 'HH')
+        self.assertEqual(cc.map_quads(2), 'LH')
+        self.assertEqual(cc.map_quads(3), 'LL')
+        self.assertEqual(cc.map_quads(4), 'HL')
+        self.assertEqual(cc.map_quads(33), None)
+        self.assertEqual(cc.map_quads('andy'), None)
+
+    def test_query_attr_select(self):
+        """Test query_attr_select."""
+
+        ans = "i.\"{attr1}\"::numeric As attr1, " \
+              "i.\"{attr2}\"::numeric As attr2, "
+
+        self.assertEqual(cc.query_attr_select(self.params), ans)
+
+    def test_query_attr_where(self):
+        """Test query_attr_where."""
+
+        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" <> 0"
+
+        self.assertEqual(cc.query_attr_where(self.params), ans)
+
+    def test_knn(self):
+        """Test knn function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
+              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
+              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
+              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
+              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
+              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
+              "BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.knn(self.params), ans)
+
+    def test_queen(self):
+        """Test queen neighbors function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
+              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
+              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
+              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
+              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
+              "i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.queen(self.params), ans)
+
+    def test_get_query(self):
+        """Test get_query."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
+              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
+              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
+              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
+              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
+              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.get_query('knn', self.params), ans)
+
+    def test_get_attributes(self):
+        """Test get_attributes."""
+
+        ## need to add tests
+
+        self.assertEqual(True, True)
+
+    def test_get_weight(self):
+        """Test get_weight."""
+
+        self.assertEqual(True, True)
+
+
+    def test_quad_position(self):
+        """Test quad_position."""
+
+        quads = np.array([1, 2, 3, 4], np.int)
+
+        ans = np.array(['HH', 'LH', 'LL', 'HL'])
+        test_ans = cc.quad_position(quads)
+
+        self.assertTrue((test_ans == ans).all())
+
+    def test_moran_local(self):
+        """Test Moran's I local"""
+        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
+        plpy._define_result('select', data)
+        random_seeds.set_random_seeds(1234)
+        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = [(row[0], row[1]) for row in result]
+        expected = self.moran_data
+        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
+            self.assertAlmostEqual(res_val, exp_val)
+            self.assertEqual(res_quad, exp_quad)
+
+    def test_moran_local_rate(self):
+        """Test Moran's I rate"""
+        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
+        plpy._define_result('select', data)
+        random_seeds.set_random_seeds(1234)
+        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = [(row[0], row[1]) for row in result]
+        expected = self.moran_data
+        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
+            self.assertAlmostEqual(res_val, exp_val)
--- a/release/python/0.0.2/crankshaft/crankshaft/__init__.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/__init__.py
@@ -0,0 +1,2 @@
+import random_seeds
+import clustering
--- a/release/python/0.0.2/crankshaft/crankshaft/clustering/__init__.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/clustering/__init__.py
@@ -0,0 +1 @@
+from moran import *
--- a/release/python/0.0.2/crankshaft/crankshaft/clustering/moran.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/clustering/moran.py
@@ -0,0 +1,321 @@
+"""
+Moran's I geostatistics (global clustering & outliers presence)
+"""
+
+# TODO: Fill in local neighbors which have null/NoneType values with the
+#       average of their neighborhood
+
+import numpy as np
+import pysal as ps
+import plpy
+
+# High level interface ---------------------------------------
+
+def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    """
+    Moran's I implementation for PL/Python
+    Andy Eschbacher
+    """
+    # TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
+    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
+
+    plpy.notice('** Constructing query')
+
+    # geometries with null attributes are ignored, so the neighbors
+    # selected may be farther away than the true nearest ones
+
+    qvals = {"id_col": id_col,
+             "attr1": attr,
+             "geom_col": geom_column,
+             "table": t,
+             "num_ngbrs": num_ngbrs}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])
+
+    y = get_attributes(r, 1)
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.Moran_Local(y, w)
+
+    # find units of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+
+
+def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    """
+    Moran's I Local Rate
+    Andy Eschbacher
+    """
+
+    plpy.notice('** Constructing query')
+
+    # geometries with null attributes are ignored, so the neighbors
+    # selected may be farther away than the true nearest ones
+
+    qvals = {"id_col": id_col,
+             "numerator": numerator,
+             "denominator": denominator,
+             "geom_col": geom_column,
+             "table": t,
+             "num_ngbrs": num_ngbrs}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % err)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None], [None])
+
+        plpy.notice('r.nrows() = %d' % r.nrows())
+
+    ## collect attributes
+    numer = get_attributes(r, 1)
+    denom = get_attributes(r, 2)
+
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
+
+    # find units of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    ## TODO: Decide on which return values here
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
+
+def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+    plpy.notice('** Constructing query')
+
+    qvals = {"num_ngbrs": num_ngbrs,
+             "attr1": attr1,
+             "attr2": attr2,
+             "table": t,
+             "geom_col": geom_column,
+             "id_col": id_col}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % err)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])
+
+    ## collect attributes
+    attr1_vals = get_attributes(r, 1)
+    attr2_vals = get_attributes(r, 2)
+
+    # create weights
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
+
+    plpy.notice("len of Is: %d" % len(lisa.Is))
+
+    # find clustering of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+
+
+# Low level functions ----------------------------------------
+
+def map_quads(coord):
+    """
+        Map a quadrant number to Moran's I designation
+        HH=1, LH=2, LL=3, HL=4
+        Input:
+        :param coord (int): quadrant of a specific measurement
+    """
+    if coord == 1:
+        return 'HH'
+    elif coord == 2:
+        return 'LH'
+    elif coord == 3:
+        return 'LL'
+    elif coord == 4:
+        return 'HL'
+    else:
+        return None
+
+def query_attr_select(params):
+    """
+        Create portion of SELECT statement for attributes involved in query.
+        :param params: dict of information used in query (column names,
+                       table name, etc.)
+    """
+
+    attrs = [k for k in params
+             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
+
+    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
+
+    attr_string = ""
+
+    for idx, val in enumerate(sorted(attrs)):
+        attr_string += template % {"col": val, "alias_num": idx + 1}
+
+    return attr_string
+
+def query_attr_where(params):
+    """
+        Create portion of WHERE clauses for weeding out NULL-valued geometries
+    """
+    attrs = sorted([k for k in params
+                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
+
+    attr_string = []
+
+    for attr in attrs:
+        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
+
+    if len(attrs) == 2:
+        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
+
+    out = " AND ".join(attr_string)
+
+    return out
+
+def knn(params):
+    """SQL query for k-nearest neighbors.
+        :param params: dict of values to fill template
+    """
+
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                              "FROM \"{table}\" As j " \
+                              "WHERE %(attr_where_j)s " \
+                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
+                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## SQL query for finding queens neighbors (all contiguous polygons)
+def queen(params):
+    """SQL query for queen neighbors.
+        :param params: dict of information to fill query
+    """
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                 "FROM \"{table}\" As j " \
+                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
+                 "%(attr_where_j)s)" \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## to add more weight methods open a ticket or pull request
+
+def get_query(w_type, query_vals):
+    """Return requested query.
+        :param w_type: type of neighbors to calculate (knn or queen)
+        :param query_vals: values used to construct the query
+    """
+
+    if w_type == 'knn':
+        return knn(query_vals)
+    else:
+        return queen(query_vals)
+
+def get_attributes(query_res, attr_num):
+    """
+        :param query_res: query results with attributes and neighbors
+        :param attr_num: attribute number (1, 2, ...)
+    """
+    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=float)
+
+## Build weight object
+def get_weight(query_res, w_type='queen', num_ngbrs=5):
+    """
+        Construct PySAL weight from return value of query
+        :param query_res: query results with attributes and neighbors
+    """
+    if w_type == 'knn':
+        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
+        weights = {x['id']: row_normed_weights for x in query_res}
+    else:  # 'queen', matching the fallback in get_query
+        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
+                            if len(x['neighbors']) > 0
+                            else [] for x in query_res}
+
+    neighbors = {x['id']: x['neighbors'] for x in query_res}
+
+    return ps.W(neighbors, weights)
+
+def quad_position(quads):
+    """
+        Map an array of quadrant numbers to Moran's I classifications
+    """
+
+    lisa_sig = np.array([map_quads(q) for q in quads])
+
+    return lisa_sig
+
+def lisa_sig_vals(pvals, quads, threshold):
+    """
+        Classify each unit by quadrant, or 'Not significant' if p > threshold
+    """
+
+    sig = (pvals <= threshold)
+
+    lisa_sig = np.empty(len(sig), dtype=object)
+
+    for idx, val in enumerate(sig):
+        if val:
+            lisa_sig[idx] = map_quads(quads[idx])
+        else:
+            lisa_sig[idx] = 'Not significant'
+
+    return lisa_sig
--- a/release/python/0.0.2/crankshaft/crankshaft/random_seeds.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/random_seeds.py
@@ -0,0 +1,10 @@
+import random
+import numpy
+
+def set_random_seeds(value):
+    """
+    Set the seeds of the RNGs (Random Number Generators)
+    used internally.
+    """
+    random.seed(value)
+    numpy.random.seed(value)
--- a/release/python/0.0.2/crankshaft/setup.py
+++ b/release/python/0.0.2/crankshaft/setup.py
@@ -0,0 +1,48 @@
+
+"""
+CartoDB Spatial Analysis Python Library
+See:
+https://github.com/CartoDB/crankshaft
+"""
+
+from setuptools import setup, find_packages
+
+setup(
+    name='crankshaft',
+
+    version='0.0.2',
+
+    description='CartoDB Spatial Analysis Python Library',
+
+    url='https://github.com/CartoDB/crankshaft',
+
+    author='Data Services Team - CartoDB',
+    author_email='dataservices@cartodb.com',
+
+    license='MIT',
+
+    classifiers=[
+        'Development Status :: 3 - Alpha',
+        'Intended Audience :: Mapping community',
+        'Topic :: Maps :: Mapping Tools',
+        'License :: OSI Approved :: MIT License',
+        'Programming Language :: Python :: 2.7',
+    ],
+
+    keywords='maps mapping tools spatial analysis geostatistics',
+
+    packages=find_packages(exclude=['contrib', 'docs', 'tests']),
+
+    extras_require={
+        'dev': ['unittest'],
+        'test': ['unittest', 'nose', 'mock'],
+    },
+
+    # The choice of component versions is dictated by what's
+    # provisioned in the production servers.
+    install_requires=['pysal==1.9.1'],
+
+    requires=['pysal', 'numpy'],
+
+    test_suite='test'
+)
--- a/release/python/0.0.2/crankshaft/test/fixtures/moran.json
+++ b/release/python/0.0.2/crankshaft/test/fixtures/moran.json
@@ -0,0 +1,52 @@
+[[0.9319096128346788, "HH"],
+[-1.135787401862846, "HL"],
+[0.11732030672508517, "Not significant"],
+[0.6152779669180425, "Not significant"],
+[-0.14657336660125297, "Not significant"],
+[0.6967858120189607, "Not significant"],
+[0.07949310115714454, "Not significant"],
+[0.4703198759258987, "Not significant"],
+[0.4421125200498064, "Not significant"],
+[0.5724288737143592, "Not significant"],
+[0.8970743435692062, "LL"],
+[0.18327334401918674, "Not significant"],
+[-0.01466729201304962, "Not significant"],
+[0.3481559372544409, "Not significant"],
+[0.06547094736902978, "Not significant"],
+[0.15482141569329988, "HH"],
+[0.4373841193538136, "Not significant"],
+[0.15971286468915544, "Not significant"],
+[1.0543588860308968, "Not significant"],
+[1.7372866900020818, "HH"],
+[1.091998586053999, "LL"],
+[0.1171572584252222, "Not significant"],
+[0.08438455015300014, "Not significant"],
+[0.06547094736902978, "Not significant"],
+[0.15482141569329985, "HH"],
+[1.1627044812890683, "HH"],
+[0.06547094736902978, "Not significant"],
+[0.795275137550483, "Not significant"],
+[0.18562939195219, "LL"],
+[0.3010757406693439, "Not significant"],
+[2.8205795942839376, "HH"],
+[0.11259190602909264, "Not significant"],
+[-0.07116352791516614, "Not significant"],
+[-0.09945240794119009, "Not significant"],
+[0.18562939195219, "LL"],
+[0.1832733440191868, "Not significant"],
+[-0.39054253768447705, "Not significant"],
+[-0.1672071289487642, "HL"],
+[0.3337669247916343, "Not significant"],
+[0.2584386102554792, "Not significant"],
+[-0.19733845476322634, "HL"],
+[-0.9379282899805409, "LH"],
+[-0.028770969951095866, "Not significant"],
+[0.051367269430983485, "Not significant"],
+[-0.2172548045913472, "LH"],
+[0.05136726943098351, "Not significant"],
+[0.04191046803899837, "Not significant"],
+[0.7482357030403517, "HH"],
+[-0.014585767863118111, "Not significant"],
+[0.5410013139159929, "Not significant"],
+[1.0223932668429925, "LL"],
+[1.4179402898927476, "LL"]]
--- a/release/python/0.0.2/crankshaft/test/fixtures/neighbors.json
+++ b/release/python/0.0.2/crankshaft/test/fixtures/neighbors.json
@@ -0,0 +1,54 @@
+[
+    {"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
+    {"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
+    {"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
+    {"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
+    {"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
+    {"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
+    {"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
+    {"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
+    {"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
+    {"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
+    {"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
+    {"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
+    {"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
+    {"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
+    {"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
+    {"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
+    {"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
+    {"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
+    {"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
+    {"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
+    {"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
+    {"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
+    {"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
+    {"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
+    {"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
+    {"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
+    {"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
+    {"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
+    {"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
+    {"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
+    {"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
+    {"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
+    {"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
+    {"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
+    {"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
+    {"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
+    {"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
+    {"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
+    {"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
+    {"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
+    {"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
+    {"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
+    {"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
+    {"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
+    {"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
+    {"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
+    {"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
+    {"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
+    {"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
+    {"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
+    {"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
+    {"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
+  ]
--- a/release/python/0.0.2/crankshaft/test/helper.py
+++ b/release/python/0.0.2/crankshaft/test/helper.py
@@ -0,0 +1,13 @@
+import unittest
+
+from mock_plpy import MockPlPy
+plpy = MockPlPy()
+
+import sys
+sys.modules['plpy'] = plpy
+
+import os
+
+def fixture_file(name):
+    dir = os.path.dirname(os.path.realpath(__file__))
+    return os.path.join(dir, 'fixtures', name)
--- a/release/python/0.0.2/crankshaft/test/mock_plpy.py
+++ b/release/python/0.0.2/crankshaft/test/mock_plpy.py
@@ -0,0 +1,34 @@
+import re
+
+class MockPlPy:
+    def __init__(self):
+        self._reset()
+
+    def _reset(self):
+        self.infos = []
+        self.notices = []
+        self.debugs = []
+        self.logs = []
+        self.warnings = []
+        self.errors = []
+        self.fatals = []
+        self.executes = []
+        self.results = []
+        self.prepares = []
+        self.results = []
+
+    def _define_result(self, query, result):
+        pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
+        self.results.append([pattern, result])
+
+    def notice(self, msg):
+        self.notices.append(msg)
+
+    def info(self, msg):
+        self.infos.append(msg)
+
+    def execute(self, query): # TODO: additional arguments
+        for result in self.results:
+            if result[0].match(query):
+                return result[1]
+        return []
--- a/release/python/0.0.2/crankshaft/test/test_clustering_moran.py
+++ b/release/python/0.0.2/crankshaft/test/test_clustering_moran.py
@@ -0,0 +1,144 @@
+import unittest
+import numpy as np
+
+import unittest
+
+
+# from mock_plpy import MockPlPy
+# plpy = MockPlPy()
+#
+# import sys
+# sys.modules['plpy'] = plpy
+from helper import plpy, fixture_file
+
+import crankshaft.clustering as cc
+from crankshaft import random_seeds
+import json
+
+class MoranTest(unittest.TestCase):
+    """Testing class for Moran's I functions."""
+
+    def setUp(self):
+        plpy._reset()
+        self.params = {"id_col": "cartodb_id",
+                       "attr1": "andy",
+                       "attr2": "jay_z",
+                       "table": "a_list",
+                       "geom_col": "the_geom",
+                       "num_ngbrs": 321}
+        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
+        self.moran_data = json.loads(open(fixture_file('moran.json')).read())
+
+    def test_map_quads(self):
+        """Test map_quads."""
+        self.assertEqual(cc.map_quads(1), 'HH')
+        self.assertEqual(cc.map_quads(2), 'LH')
+        self.assertEqual(cc.map_quads(3), 'LL')
+        self.assertEqual(cc.map_quads(4), 'HL')
+        self.assertEqual(cc.map_quads(33), None)
+        self.assertEqual(cc.map_quads('andy'), None)
+
+    def test_query_attr_select(self):
+        """Test query_attr_select."""
+
+        ans = "i.\"{attr1}\"::numeric As attr1, " \
+              "i.\"{attr2}\"::numeric As attr2, "
+
+        self.assertEqual(cc.query_attr_select(self.params), ans)
+
+    def test_query_attr_where(self):
+        """Test query_attr_where."""
+
+        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" <> 0"
+
+        self.assertEqual(cc.query_attr_where(self.params), ans)
+
+    def test_knn(self):
+        """Test knn function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
+              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
+              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
+              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
+              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
+              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
+              "BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.knn(self.params), ans)
+
+    def test_queen(self):
+        """Test queen neighbors function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
+              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
+              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
+              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
+              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
+              "i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.queen(self.params), ans)
+
+    def test_get_query(self):
+        """Test get_query."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
+              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
+              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
+              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
+              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
+              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.get_query('knn', self.params), ans)
+
+    def test_get_attributes(self):
+        """Test get_attributes."""
+
+        ## need to add tests
+
+        self.assertEqual(True, True)
+
+    def test_get_weight(self):
+        """Test get_weight."""
+
+        self.assertEqual(True, True)
+
+
+    def test_quad_position(self):
+        """Test quad_position."""
+
+        quads = np.array([1, 2, 3, 4], int)
+
+        ans = np.array(['HH', 'LH', 'LL', 'HL'])
+        test_ans = cc.quad_position(quads)
+
+        self.assertTrue((test_ans == ans).all())
+
+    def test_moran_local(self):
+        """Test Moran's I local"""
+        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
+        plpy._define_result('select', data)
+        random_seeds.set_random_seeds(1234)
+        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = [(row[0], row[1]) for row in result]
+        expected = self.moran_data
+        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
+            self.assertAlmostEqual(res_val, exp_val)
+            self.assertEqual(res_quad, exp_quad)
+
+    def test_moran_local_rate(self):
+        """Test Moran's I rate"""
+        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
+        plpy._define_result('select', data)
+        random_seeds.set_random_seeds(1234)
+        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = [(row[0], row[1]) for row in result]
+        expected = self.moran_data
+        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
+            self.assertAlmostEqual(res_val, exp_val)
--- a/src/pg/Makefile
+++ b/src/pg/Makefile
@@ -1,13 +1,15 @@
-# Generation of a new development version 'dev' (with an alias 'current' for
-# updating easily by upgrading to 'current', then 'dev')
+include ../../Makefile.global

-# sudo make install -- generate the 'dev' version from current source
-#                      and make it available to PostgreSQL
-# PGUSER=postgres make installcheck -- test the 'dev' extension
-
-SED = sed
-
-EXTENSION    = crankshaft
+# Development tasks:
+#
+# * install generates the control & script files into src/pg/
+#   and installs them into the PostgreSQL extensions directory;
+#   requires sudo. In addition to the current development version
+#   named 'dev', an alias 'current' is generated for ease of
+#   update (upgrade to 'current', then to 'dev').
+#   The Python module is installed in a virtualenv in envs/dev/
+# * test runs the tests for the currently generated development
+#   extension.

 DATA         = $(EXTENSION)--dev.sql \
 	             $(EXTENSION)--current--dev.sql \
@@ -16,7 +18,7 @@ DATA         = $(EXTENSION)--dev.sql \
 SOURCES_DATA_DIR = sql
 SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql)

-VIRTUALENV_PATH = $(realpath ../py/)
+VIRTUALENV_PATH = $(realpath ../../envs)
 ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH))

 REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \
@@ -36,13 +38,23 @@ include $(PGXS)
 # This seems to be needed at least for PG 9.3.11
 all: $(DATA)

+test: export PGUSER=postgres
+test: installcheck

-# WIP: goals for releasing the extension...
+# Release tasks

-EXTVERSION   = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
-
-../release/$(EXTENSION).control: $(EXTENSION).control
+../../release/$(EXTENSION).control: $(EXTENSION).control
 	cp $< $@

-release: ../release/$(EXTENSION).control
-	cp $(EXTENSION)--dev.sql $(EXTENSION)--$(EXTVERSION).sql
+# Prepare new release from the currently installed development version,
+# for the current version X.Y.Z (defined in the control file)
+# producing the extension script and control files in release/
+# and the python package in release/python/X.Y.Z/crankshaft/
+release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
+	$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > ../../release/$(EXTENSION)--$(EXTVERSION).sql
+
+# Install the current release into the PostgreSQL extensions directory
+# and the Python package in a virtual environment envs/X.Y.Z
+deploy:
+	$(INSTALL_DATA) ../../release/$(EXTENSION).control '$(DESTDIR)$(datadir)/extension/'
+	$(INSTALL_DATA) ../../release/*.sql '$(DESTDIR)$(datadir)/extension/'
--- a/src/pg/README.md
+++ b/src/pg/README.md
@@ -1,7 +0,0 @@
-
-# Running the tests:
-
-```
-sudo make install
-PGUSER=postgres make installcheck
-```
--- a/src/pg/crankshaft.control
+++ b/src/pg/crankshaft.control
@@ -1,5 +1,5 @@
 comment = 'CartoDB Spatial Analysis extension'
-default_version = '0.0.1'
+default_version = '0.0.2'
 requires = 'plpythonu, postgis, cartodb'
 superuser = true
 schema = cdb_crankshaft
--- a/src/pg/sql/01_version.sql
+++ b/src/pg/sql/01_version.sql
@@ -2,11 +2,11 @@
 CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
 RETURNS text AS $$
  SELECT '@@VERSION@@'::text;
-$$ language 'sql' IMMUTABLE STRICT;
+$$ language 'sql' STABLE STRICT;

 -- Internal identifier of the installed extension instence
 -- e.g. 'dev' for current development version
-CREATE OR REPLACE FUNCTION cdb_crankshaft_internal_version()
+CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
 RETURNS text AS $$
  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
-$$ language 'sql' IMMUTABLE STRICT;
+$$ language 'sql' STABLE STRICT;
--- a/src/pg/sql/02_py.sql
+++ b/src/pg/sql/02_py.sql
@@ -14,7 +14,7 @@ AS $$
    import os
    # plpy.notice('%',str(os.environ))
    # activate virtualenv
-    crankshaft_version = plpy.execute('SELECT cdb_crankshaft.cdb_crankshaft_internal_version()')[0]['cdb_crankshaft_internal_version']
+    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
    default_venv_path = os.path.join(base_path, crankshaft_version)
    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
--- a/src/pg/sql/10_moran.sql
+++ b/src/pg/sql/10_moran.sql
@@ -1,37 +1,89 @@
-- Moran's I
+-- Moran's I (global)
 CREATE OR REPLACE FUNCTION
-  cdb_moran_local (
-      t TEXT,
-  	  attr TEXT,
-  	  significance float DEFAULT 0.05,
-  	  num_ngbrs INT DEFAULT 5,
-  	  permutations INT DEFAULT 99,
-  	  geom_column TEXT DEFAULT 'the_geom',
-  	  id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn')
-RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
+  CDB_AreasOfInterest_Global (
+      subquery TEXT,
+      attr_name TEXT,
+      permutations INT DEFAULT 99,
+      geom_col TEXT DEFAULT 'the_geom',
+      id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn',
+      num_ngbrs INT DEFAULT 5)
+RETURNS TABLE (moran NUMERIC, significance NUMERIC)
 AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+  # TODO: use named parameters or a dictionary
+  return moran(subquery, attr_name, num_ngbrs, permutations, geom_col, id_col, w_type)
 $$ LANGUAGE plpythonu;

+-- Moran's I Local
+CREATE OR REPLACE FUNCTION
+  CDB_AreasOfInterest_Local(
+      subquery TEXT,
+      attr TEXT,
+      permutations INT DEFAULT 99,
+      geom_col TEXT DEFAULT 'the_geom',
+      id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn',
+      num_ngbrs INT DEFAULT 5)
+RETURNS TABLE (moran NUMERIC, quads TEXT, significance NUMERIC, ids INT, y NUMERIC)
+AS $$
+  from crankshaft.clustering import moran_local
+  # TODO: use named parameters or a dictionary
+  return moran_local(subquery, attr, permutations, geom_col, id_col, w_type, num_ngbrs)
+$$ LANGUAGE plpythonu;
+
+-- Moran's I Rate (global)
+CREATE OR REPLACE FUNCTION
+  CDB_AreasOfInterest_Global_Rate(
+      subquery TEXT,
+      numerator TEXT,
+      denominator TEXT,
+      permutations INT DEFAULT 99,
+      geom_col TEXT DEFAULT 'the_geom',
+      id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn',
+      num_ngbrs INT DEFAULT 5)
+RETURNS TABLE (moran FLOAT, significance FLOAT)
+AS $$
+  from crankshaft.clustering import moran_rate
+  # TODO: use named parameters or a dictionary
+  return moran_rate(subquery, numerator, denominator, permutations, geom_col, id_col, w_type, num_ngbrs)
+$$ LANGUAGE plpythonu;
+
+
 -- Moran's I Local Rate
 CREATE OR REPLACE FUNCTION
-  cdb_moran_local_rate(t TEXT,
-		 numerator TEXT,
-		 denominator TEXT,
-		 significance FLOAT DEFAULT 0.05,
-		 num_ngbrs INT DEFAULT 5,
-		 permutations INT DEFAULT 99,
-		 geom_column TEXT DEFAULT 'the_geom',
-		 id_col TEXT DEFAULT 'cartodb_id',
-		 w_type TEXT DEFAULT 'knn')
-RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+  CDB_AreasOfInterest_Local_Rate(
+      subquery TEXT,
+      numerator TEXT,
+      denominator TEXT,
+      permutations INT DEFAULT 99,
+      geom_col TEXT DEFAULT 'the_geom',
+      id_col TEXT DEFAULT 'cartodb_id',
+      w_type TEXT DEFAULT 'knn',
+      num_ngbrs INT DEFAULT 5)
+RETURNS
+TABLE(moran NUMERIC, quads TEXT, significance NUMERIC, ids INT, y NUMERIC)
 AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft.clustering import moran_local_rate
-  # TODO: use named parameters or a dictionary
-  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
+  # TODO: use named parameters or a dictionary
+  return moran_local_rate(subquery, numerator, denominator, permutations, geom_col, id_col, w_type, num_ngbrs)
 $$ LANGUAGE plpythonu;
+
+-- -- Moran's I Local Bivariate
+-- CREATE OR REPLACE FUNCTION
+--   cdb_moran_local_bv(
+--       subquery TEXT,
+--       attr1 TEXT,
+--       attr2 TEXT,
+--       permutations INT DEFAULT 99,
+--       geom_col TEXT DEFAULT 'the_geom',
+--       id_col TEXT DEFAULT 'cartodb_id',
+--       w_type TEXT DEFAULT 'knn',
+--       num_ngbrs INT DEFAULT 5)
+-- RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
+-- AS $$
+--   from crankshaft.clustering import moran_local_bv
+--   # TODO: use named parameters or a dictionary
+--   return moran_local_bv(subquery, attr1, attr2, permutations, geom_col, id_col, w_type, num_ngbrs)
+-- $$ LANGUAGE plpythonu;
--- a/src/pg/sql/50_contours.sql
+++ b/src/pg/sql/50_contours.sql
@@ -0,0 +1,32 @@
+
+CREATE OR REPLACE FUNCTION
+  _CDB_Contours (
+      subquery TEXT,
+      grid_size NUMERIC DEFAULT 100,
+      bandwidth NUMERIC DEFAULT 0.0001,
+      levels NUMERIC[] DEFAULT null
+      )
+RETURNS table (level Numeric, geom_text text )
+AS $$
+  from crankshaft.contours import cdb_generate_contours
+  # TODO: use named parameters or a dictionary
+  return cdb_generate_contours(subquery, grid_size, bandwidth, levels)
+$$ LANGUAGE plpythonu;
+
+
+CREATE OR REPLACE FUNCTION
+  CDB_Contours (
+    subquery TEXT,
+    grid_size NUMERIC DEFAULT 100,
+    bandwidth NUMERIC DEFAULT 0.0001,
+    levels NUMERIC[] DEFAULT null
+    )
+RETURNS table (level Numeric, geom geometry )
+AS $$
+BEGIN
+
+  RETURN QUERY
+    select cont.level as level, ST_GeomFromText(cont.geom_text, 4326)::geometry as geom from _CDB_Contours(subquery,grid_size,bandwidth,levels) as cont;
+END;
+$$ LANGUAGE plpgsql;
+
--- a/src/pg/test/expected/02_moran_test.out
+++ b/src/pg/test/expected/02_moran_test.out
@@ -110,7 +110,7 @@ INSERT INTO ppoints2 VALUES
 (24,'0101000020E61000009C5F91C5095C17C0C78784B15A4F4540'::geometry,'24','07',0.3, 1.0),
 (29,'0101000020E6100000C34D4A5B48E712C092E680892C684240'::geometry,'29','01',0.3, 1.0),
 (52,'0101000020E6100000406A545EB29A07C04E5F0BDA39A54140'::geometry,'52','19',0.0, 1.01)
-- Moral functions perform some nondeterministic computations
+-- Areas of Interest functions perform some nondeterministic computations
 -- (to estimate the significance); we will set the seeds for the RNGs
 -- that affect those results to have repeateble results
 SELECT cdb_crankshaft._cdb_random_seeds(1234);
@@ -121,67 +121,61 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN cdb_crankshaft.cdb_moran_local('ppoints', 'value') m
+  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local('SELECT * FROM ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;
-NOTICE:  ** Constructing query
-CONTEXT:  PL/Python function "cdb_moran_local"
-NOTICE:  ** Query returned with 52 rows
-CONTEXT:  PL/Python function "cdb_moran_local"
-NOTICE:  ** Finished calculations
-CONTEXT:  PL/Python function "cdb_moran_local"
- code |      quads      
------+-----------------
+ code | quads 
+------+-------
 01   | HH
 02   | HL
- 03   | Not significant
- 04   | Not significant
- 05   | Not significant
- 06   | Not significant
- 07   | Not significant
- 08   | Not significant
- 09   | Not significant
- 10   | Not significant
+ 03   | LL
+ 04   | LL
+ 05   | LH
+ 06   | LL
+ 07   | HH
+ 08   | HH
+ 09   | HH
+ 10   | LL
 11   | LL
- 12   | Not significant
- 13   | Not significant
- 14   | Not significant
- 15   | Not significant
+ 12   | LL
+ 13   | HL
+ 14   | LL
+ 15   | LL
 16   | HH
- 17   | Not significant
- 18   | Not significant
- 19   | Not significant
+ 17   | HH
+ 18   | LL
+ 19   | HH
 20   | HH
 21   | LL
- 22   | Not significant
- 23   | Not significant
- 24   | Not significant
+ 22   | HH
+ 23   | LL
+ 24   | LL
 25   | HH
 26   | HH
- 27   | Not significant
- 28   | Not significant
+ 27   | LL
+ 28   | HH
 29   | LL
- 30   | Not significant
+ 30   | LL
 31   | HH
- 32   | Not significant
- 33   | Not significant
- 34   | Not significant
+ 32   | LL
+ 33   | HL
+ 34   | LH
 35   | LL
- 36   | Not significant
- 37   | Not significant
+ 36   | LL
+ 37   | HL
 38   | HL
- 39   | Not significant
- 40   | Not significant
+ 39   | HH
+ 40   | HH
 41   | HL
 42   | LH
- 43   | Not significant
- 44   | Not significant
+ 43   | LH
+ 44   | LL
 45   | LH
- 46   | Not significant
- 47   | Not significant
+ 46   | LL
+ 47   | LL
 48   | HH
- 49   | Not significant
- 50   | Not significant
+ 49   | LH
+ 50   | HH
 51   | LL
 52   | LL
 (52 rows)
@@ -194,67 +188,61 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints2.code, m.quads
  FROM ppoints2
-  JOIN cdb_crankshaft.cdb_moran_local_rate('ppoints2', 'numerator', 'denominator') m
+  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local_Rate('SELECT * FROM ppoints2', 'numerator', 'denominator') m
    ON ppoints2.cartodb_id = m.ids
  ORDER BY ppoints2.code;
-NOTICE:  ** Constructing query
-CONTEXT:  PL/Python function "cdb_moran_local_rate"
-NOTICE:  ** Query returned with 51 rows
-CONTEXT:  PL/Python function "cdb_moran_local_rate"
-NOTICE:  ** Finished calculations
-CONTEXT:  PL/Python function "cdb_moran_local_rate"
- code |      quads      
------+-----------------
+ code | quads 
+------+-------
 01   | LL
- 02   | Not significant
- 03   | Not significant
- 04   | Not significant
- 05   | Not significant
- 06   | Not significant
- 07   | Not significant
- 08   | Not significant
+ 02   | LH
+ 03   | HH
+ 04   | HH
+ 05   | LL
+ 06   | HH
+ 07   | LL
+ 08   | LL
 09   | LL
- 10   | Not significant
+ 10   | HH
 11   | HH
- 12   | Not significant
- 13   | Not significant
- 14   | Not significant
- 15   | Not significant
- 16   | Not significant
+ 12   | HL
+ 13   | LL
+ 14   | HH
+ 15   | LL
+ 16   | LL
 17   | LL
- 18   | Not significant
- 19   | Not significant
+ 18   | LH
+ 19   | LL
 20   | LL
- 21   | Not significant
- 22   | Not significant
- 23   | Not significant
- 24   | Not significant
+ 21   | HH
+ 22   | LL
+ 23   | HL
+ 24   | LL
 25   | LL
 26   | LL
- 27   | Not significant
- 28   | Not significant
+ 27   | LL
+ 28   | LL
 29   | LH
- 30   | Not significant
+ 30   | HH
 31   | LL
- 32   | Not significant
- 33   | Not significant
- 34   | Not significant
+ 32   | LL
+ 33   | LL
+ 34   | LL
 35   | LH
- 36   | Not significant
- 37   | Not significant
+ 36   | HL
+ 37   | LH
 38   | LH
- 39   | Not significant
- 40   | Not significant
+ 39   | LL
+ 40   | LL
 41   | LH
 42   | HL
- 43   | Not significant
- 44   | Not significant
+ 43   | LL
+ 44   | HL
 45   | LL
- 46   | Not significant
- 47   | Not significant
+ 46   | HL
+ 47   | LL
 48   | LL
- 49   | Not significant
- 50   | Not significant
- 51   | Not significant
+ 49   | HL
+ 50   | LL
+ 51   | HH
 (51 rows)

--- a/src/pg/test/sql/02_moran_test.sql
+++ b/src/pg/test/sql/02_moran_test.sql
@@ -1,14 +1,14 @@
 \i test/fixtures/ppoints.sql
 \i test/fixtures/ppoints2.sql

-- Moral functions perform some nondeterministic computations
+-- Areas of Interest functions perform some nondeterministic computations
 -- (to estimate the significance); we will set the seeds for the RNGs
 -- that affect those results to have repeateble results
 SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN cdb_crankshaft.cdb_moran_local('ppoints', 'value') m
+  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local('SELECT * FROM ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;

@@ -16,6 +16,6 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints2.code, m.quads
  FROM ppoints2
-  JOIN cdb_crankshaft.cdb_moran_local_rate('ppoints2', 'numerator', 'denominator') m
+  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local_Rate('SELECT * FROM ppoints2', 'numerator', 'denominator') m
    ON ppoints2.cartodb_id = m.ids
  ORDER BY ppoints2.code;
--- a/src/pg/test/sql/90_permissions.sql
+++ b/src/pg/test/sql/90_permissions.sql
@@ -9,7 +9,7 @@ SET search_path TO public,cartodb,cdb_crankshaft;
 -- Exercise public functions
 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN cdb_moran_local('ppoints', 'value') m
+  JOIN CDB_AreasOfInterest_Local('SELECT * FROM ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;
 SELECT round(cdb_overlap_sum(
--- a/src/py/.gitignore
+++ b/src/py/.gitignore
@@ -1,2 +0,0 @@
-*.pyc
-dev/
--- a/src/py/Makefile
+++ b/src/py/Makefile
@@ -1,9 +1,22 @@
+include ../../Makefile.global
+
 # Install the package locally for development
 install:
-	virtualenv --system-site-packages dev
-	./dev/bin/pip install -I ./crankshaft
-	./dev/bin/pip install -I nose
+	virtualenv --system-site-packages ../../envs/dev
+	# source ../../envs/dev/bin/activate
+	../../envs/dev/bin/pip install -I ./crankshaft
+	../../envs/dev/bin/pip install -I nose

 # Test develpment install
-testinstalled:
-	./dev/bin/nosetests crankshaft/test/
+test:
+	../../envs/dev/bin/nosetests crankshaft/test/
+
+release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
+	mkdir -p ../../release/python/$(EXTVERSION)
+	cp -r ./$(PACKAGE) ../../release/python/$(EXTVERSION)/
+	$(SED) -i -r 's/version='"'"'[0-9]+\.[0-9]+\.[0-9]+'"'"'/version='"'"'$(EXTVERSION)'"'"'/g'  ../../release/python/$(EXTVERSION)/$(PACKAGE)/setup.py
+
+deploy:
+	virtualenv --system-site-packages $(VIRTUALENV_PATH)/$(RELEASE_VERSION)
+	$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I -U ../../release/python/$(RELEASE_VERSION)/$(PACKAGE)
+	$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I nose
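Note on the `release` target: it pins the Python package version by rewriting `setup.py` with `$(SED)`. As a rough illustration of what that substitution does, the same pattern can be sketched in Python. `pin_version` is a hypothetical helper (it does not exist in the repository) and the sample `setup.py` text is made up for the example.

```python
import re

# Same pattern the Makefile hands to sed: version='X.Y.Z' with single quotes
VERSION_RE = re.compile(r"version='[0-9]+\.[0-9]+\.[0-9]+'")


def pin_version(setup_py_text, ext_version):
    """Replace every version='X.Y.Z' occurrence with the extension version."""
    # A function replacement avoids any backslash handling in ext_version
    return VERSION_RE.sub(lambda m: "version='%s'" % ext_version,
                          setup_py_text)


# Hypothetical setup.py fragment, for illustration only
sample = "setup(name='crankshaft', version='0.0.1', packages=[])"
pinned = pin_version(sample, "0.1.0")
print(pinned)
```

Only single-quoted, three-part versions match, so a `setup.py` that uses double quotes would be left untouched by this pattern.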
--- a/src/py/README.md
+++ b/src/py/README.md
@@ -8,7 +8,7 @@ cd crankshaft
 nosetests test/
 ```

-## Notes about python dependencies
+## Notes about Python dependencies
 * This extension is targeted at production databases. Therefore certain restrictions must be assumed about the production environment vs other experimental environments.
 * We're using `pip` and `virtualenv` to generate a suitable isolated environment for python code that has  all the dependencies
 * Every dependency should be:
@@ -20,37 +20,29 @@ nosetests test/

 ---

-We have two possible approaches being considered as to how manage
-the Python virtual environment: using a pure virtual enviroment
-or combine it with some system packages that include depencencies
-for the *hard-to-compile* packages (and pin them in somewhat old versions).
+To avoid troublesome compilations/linkings we will use
+the available system package `python-scipy`.
+This package and its dependencies provide numpy 1.6.1
+and scipy 0.9.0. To be able to use these versions we cannot
+use PySAL 1.10 or later, so we'll stick to 1.9.1.

-### Alternative A: pure virtual environment
+```
+apt-get install -y python-scipy
+```

-In this case we will install all the packages needed in the
-virtual environment.
-This will involve, specially for the numerical packages compiling
-and linking code that uses a number of third party libraries,
-and requires having theses depencencies solved for the production
-environments.
+We'll use virtual environments to install our packages,
+but configured to also use system modules so that the
+mentioned scipy and numpy are used.

-#### Create and use a virtual env
-
-We'll use a virtual enviroment directory `dev`
-under the `src/pg` directory.
-
-    # Create the virtual environment for python
-    $ virtualenv dev
+    # Create a virtual environment for python
+    $ virtualenv --system-site-packages dev

    # Activate the virtualenv
    $ source dev/bin/activate

    # Install all the requirements
    # expect this to take a while, as it will trigger a few compilations
-    (dev) $ pip install -r requirements.txt
-
-    # Add a new pip to the party
-    (dev) $ pip install pandas
+    (dev) $ pip install -I ./crankshaft

 #### Test the libraries with that virtual env

@@ -94,37 +86,3 @@ Then, execute the tests with:
    import pysal
    import nose
    nose.runmodule('pysal')
-
-
-### Alternative B: using some packaged modules
-
-This option avoids troublesome compilations/linkings, at the cost
-of freezing some module versions as available in system packages,
-namely numpy 1.6.1 and scipy 0.9.0. (in turn, this implies
-the most recent version of PySAL we can use is 1.9.1)
-
-
-TODO: to use this alternative the python-scipy package must be
-installed (this will have to be included in server provisioning)
-
-```
-apt-get install -y python-scipy
-```
-
-#### Create and use a virtual env
-
-We'll use a `dev` enviroment as before, but will configure it to
-use also system modules.
-
-
-    # Create the virtual environment for python
-    $ virtualenv --system-site-packages dev
-
-    # Activate the virtualenv
-    $ source dev/bin/activate
-
-    # Install all the requirements
-    # expect this to take a while, as it will trigger a few compilations
-    (dev) $ pip install -I ./crankshaft
-
-Then we can proceed to testing as in Alternative A.
--- a/src/py/crankshaft/crankshaft/__init__.py
+++ b/src/py/crankshaft/crankshaft/__init__.py
@@ -1,2 +1,3 @@
 import random_seeds
 import clustering
+import contours
--- a/src/py/crankshaft/crankshaft/clustering/moran.py
+++ b/src/py/crankshaft/crankshaft/clustering/moran.py
@@ -5,143 +5,226 @@ Moran's I geostatistics (global clustering & outliers presence)
 # TODO: Fill in local neighbors which have null/NoneType values with the
 #       average of the their neighborhood

-import numpy as np
 import pysal as ps
 import plpy

+# crankshaft module
+import crankshaft.pysal_utils as pu
+
 # High level interface ---------------------------------------

-def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+def moran(subquery, attr_name,
+          permutations, geom_col, id_col, w_type, num_ngbrs):
+    """
+    Moran's I (global)
+    Implementation building neighbors with a PostGIS database and Moran's I
+     core clusters with PySAL.
+    Andy Eschbacher
+    """
+    qvals = {"id_col": id_col,
+             "attr1": attr_name,
+             "geom_col": geom_col,
+             "subquery": subquery,
+             "num_ngbrs": num_ngbrs}
+
+    query = pu.construct_neighbor_query(w_type, qvals)
+
+    plpy.notice('** Query: %s' % query)
+
+    try:
+        result = plpy.execute(query)
+        # if there are no neighbors, exit
+        if len(result) == 0:
+            return pu.empty_zipped_array(2)
+        plpy.notice('** Query returned with %d rows' % len(result))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % query)
+        plpy.notice('** Error: %s' % err)
+        plpy.error('Error: areas of interest query failed, check input parameters')
+        return pu.empty_zipped_array(2)
+
+    ## collect attributes
+    attr_vals = pu.get_attributes(result)
+
+    ## calculate weights
+    weight = pu.get_weight(result, w_type, num_ngbrs)
+
+    ## calculate moran global
+    moran_global = ps.esda.moran.Moran(attr_vals, weight,
+                                       permutations=permutations)
+
+    return zip([moran_global.I], [moran_global.EI])
+
+def moran_local(subquery, attr,
+                permutations, geom_col, id_col, w_type, num_ngbrs):
    """
    Moran's I implementation for PL/Python
    Andy Eschbacher
    """
-    # TODO: ensure that the significance output can be smaller that 1e-3 (0.001)
-    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
-
-    plpy.notice('** Constructing query')

    # geometries with attributes that are null are ignored
    # resulting in a collection of not as near neighbors

    qvals = {"id_col": id_col,
-            "attr1": attr,
-            "geom_col": geom_column,
-             "table": t,
+             "attr1": attr,
+             "geom_col": geom_col,
+             "subquery": subquery,
             "num_ngbrs": num_ngbrs}

-    q = get_query(w_type, qvals)
+    query = pu.construct_neighbor_query(w_type, qvals)

    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
+        result = plpy.execute(query)
+        # if there are no neighbors, exit
+        if len(result) == 0:
+            return pu.empty_zipped_array(5)
    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
+        plpy.notice('** Query failed: "%s"' % query)
+        plpy.error('Error: areas of interest query failed, check input parameters')
+        return pu.empty_zipped_array(5)

-    y = get_attributes(r, 1)
-    w = get_weight(r, w_type)
+    attr_vals = pu.get_attributes(result)
+    weight = pu.get_weight(result, w_type, num_ngbrs)

    # calculate LISA values
-    lisa = ps.Moran_Local(y, w)
+    lisa = ps.esda.moran.Moran_Local(attr_vals, weight,
+                                     permutations=permutations)

-    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+    # find quadrants for each geometry
+    quads = quad_position(lisa.q)

-    plpy.notice('** Finished calculations')
+    return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)

-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
-
-def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+def moran_rate(subquery, numerator, denominator,
+               permutations, geom_col, id_col, w_type, num_ngbrs):
    """
-    Moran's I Local Rate
+    Moran's I Rate (global)
    Andy Eschbacher
    """
-
-    plpy.notice('** Constructing query')
-
-    # geometries with attributes that are null are ignored
-    # resulting in a collection of not as near neighbors
-
    qvals = {"id_col": id_col,
-             "numerator": numerator,
-             "denominator": denominator,
-             "geom_col": geom_column,
-             "table": t,
+             "attr1": numerator,
+             "attr2": denominator,
+             "geom_col": geom_col,
+             "subquery": subquery,
             "num_ngbrs": num_ngbrs}

-    q = get_query(w_type, qvals)
+    query = pu.construct_neighbor_query(w_type, qvals)
+
+    plpy.notice('** Query: %s' % query)

    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
+        result = plpy.execute(query)
+        # if there are no neighbors, exit
+        if len(result) == 0:
+            return pu.empty_zipped_array(2)
+        plpy.notice('** Query returned with %d rows' % len(result))
    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Query failed: "%s"' % query)
+        plpy.error('Error: areas of interest query failed, check input parameters')
        plpy.notice('** Error: %s' % plpy.SPIError)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
-
-        plpy.notice('r.nrows() = %d' % r.nrows())
+        return pu.empty_zipped_array(2)

    ## collect attributes
-    numer = get_attributes(r, 1)
-    denom = get_attributes(r, 2)
+    numer = pu.get_attributes(result, 1)
+    denom = pu.get_attributes(result, 2)

-    w = get_weight(r, w_type, num_ngbrs)
+    weight = pu.get_weight(result, w_type, num_ngbrs)
+
+    ## calculate moran global rate
+    lisa_rate = ps.esda.moran.Moran_Rate(numer, denom, weight,
+                                         permutations=permutations)
+
+    return zip([lisa_rate.I], [lisa_rate.EI])
+
+def moran_local_rate(subquery, numerator, denominator,
+                     permutations, geom_col, id_col, w_type, num_ngbrs):
+    """
+        Moran's I Local Rate
+        Andy Eschbacher
+    """
+    # geometries with values that are null are ignored
+    # resulting in a collection of not as near neighbors
+
+    query = pu.construct_neighbor_query(w_type,
+                                     {"id_col": id_col,
+                                      "attr1": numerator,
+                                      "attr2": denominator,
+                                      "geom_col": geom_col,
+                                      "subquery": subquery,
+                                      "num_ngbrs": num_ngbrs})
+
+    try:
+        result = plpy.execute(query)
+        # if there are no neighbors, exit
+        if len(result) == 0:
+            return pu.empty_zipped_array(5)
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % query)
+        plpy.notice('** Error: %s' % err)
+        plpy.error('Error: areas of interest query failed, check input parameters')
+        return pu.empty_zipped_array(5)
+
+    ## collect attributes
+    numer = pu.get_attributes(result, 1)
+    denom = pu.get_attributes(result, 2)
+
+    weight = pu.get_weight(result, w_type, num_ngbrs)

    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
+    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, weight,
+                                          permutations=permutations)

    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+    quads = quad_position(lisa.q)

-    plpy.notice('** Finished calculations')
+    return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)

-    ## TODO: Decide on which return values here
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
-
-def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
+def moran_local_bv(subquery, attr1, attr2,
+                   permutations, geom_col, id_col, w_type, num_ngbrs):
+    """
+        Moran's I (local) Bivariate (untested)
+    """
    plpy.notice('** Constructing query')

    qvals = {"num_ngbrs": num_ngbrs,
             "attr1": attr1,
             "attr2": attr2,
-             "table": t,
-             "geom_col": geom_column,
+             "subquery": subquery,
+             "geom_col": geom_col,
             "id_col": id_col}

-    q = get_query(w_type, qvals)
+    query = pu.construct_neighbor_query(w_type, qvals)

    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
+        result = plpy.execute(query)
+        # if there are no neighbors, exit
+        if len(result) == 0:
+            return pu.empty_zipped_array(4)
    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Error: %s' % plpy.SPIError)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
+        plpy.notice('** Query failed: "%s"' % query)
+        plpy.error("Error: areas of interest query failed, " \
+                   "check input parameters")
+        return pu.empty_zipped_array(4)

    ## collect attributes
-    attr1_vals = get_attributes(r, 1)
-    attr2_vals = get_attributes(r, 2)
+    attr1_vals = pu.get_attributes(result, 1)
+    attr2_vals = pu.get_attributes(result, 2)

    # create weights
-    w = get_weight(r, w_type, num_ngbrs)
+    weight = pu.get_weight(result, w_type, num_ngbrs)

    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
+    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, weight,
+                                        permutations=permutations)

    plpy.notice("len of Is: %d" % len(lisa.Is))

    # find clustering of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+    lisa_sig = quad_position(lisa.q)

    plpy.notice('** Finished calculations')

-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, weight.id_order)

 # Low level functions ----------------------------------------

@@ -150,7 +233,9 @@ def map_quads(coord):
        Map a quadrant number to Moran's I designation
        HH=1, LH=2, LL=3, HL=4
        Input:
-        :param coord (int): quadrant of a specific measurement
+        @param coord (int): quadrant of a specific measurement
+        Output:
+            classification (one of 'HH', 'LH', 'LL', or 'HL')
    """
    if coord == 1:
        return 'HH'
@@ -163,159 +248,13 @@ def map_quads(coord):
    else:
        return None

-def query_attr_select(params):
-    """
-        Create portion of SELECT statement for attributes inolved in query.
-        :param params: dict of information used in query (column names,
-                       table name, etc.)
-    """
-
-    attrs = [k for k in params
-             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
-
-    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
-
-    attr_string = ""
-
-    for idx, val in enumerate(sorted(attrs)):
-        attr_string += template % {"col": val, "alias_num": idx + 1}
-
-    return attr_string
-
-def query_attr_where(params):
-    """
-        Create portion of WHERE clauses for weeding out NULL-valued geometries
-    """
-    attrs = sorted([k for k in params
-                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
-
-    attr_string = []
-
-    for attr in attrs:
-        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
-
-    if len(attrs) == 2:
-        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
-
-    out = " AND ".join(attr_string)
-
-    return out
-
-def knn(params):
-    """SQL query for k-nearest neighbors.
-        :param vars: dict of values to fill template
-    """
-
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                              "FROM \"{table}\" As j " \
-                              "WHERE %(attr_where_j)s " \
-                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
-                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## SQL query for finding queens neighbors (all contiguous polygons)
-def queen(params):
-    """SQL query for queen neighbors.
-        :param params: dict of information to fill query
-    """
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                 "FROM \"{table}\" As j " \
-                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
-                 "%(attr_where_j)s)" \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## to add more weight methods open a ticket or pull request
-
-def get_query(w_type, query_vals):
-    """Return requested query.
-        :param w_type: type of neighbors to calculate (knn or queen)
-        :param query_vals: values used to construct the query
-    """
-
-    if w_type == 'knn':
-        return knn(query_vals)
-    else:
-        return queen(query_vals)
-
-def get_attributes(query_res, attr_num):
-    """
-        :param query_res: query results with attributes and neighbors
-        :param attr_num: attribute number (1, 2, ...)
-    """
-    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
-
-## Build weight object
-def get_weight(query_res, w_type='queen', num_ngbrs=5):
-    """
-        Construct PySAL weight from return value of query
-        :param query_res: query results with attributes and neighbors
-    """
-    if w_type == 'knn':
-        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
-        weights = {x['id']: row_normed_weights for x in query_res}
-    elif w_type == 'queen':
-        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
-                            if len(x['neighbors']) > 0
-                            else [] for x in query_res}
-
-    neighbors = {x['id']: x['neighbors'] for x in query_res}
-
-    return ps.W(neighbors, weights)
-
 def quad_position(quads):
    """
        Produce Moran's I classification based of n
+        Input:
+        @param quads ndarray: an array of quads classified by
+          1-4 (PySAL default)
+        Output:
+        @return list: an array of quads classified by 'HH', 'LL', etc.
    """
-
-    lisa_sig = np.array([map_quads(q) for q in quads])
-
-    return lisa_sig
-
-def lisa_sig_vals(pvals, quads, threshold):
-    """
-        Produce Moran's I classification based of n
-    """
-
-    sig = (pvals <= threshold)
-
-    lisa_sig = np.empty(len(sig), np.chararray)
-
-    for idx, val in enumerate(sig):
-        if val:
-            lisa_sig[idx] = map_quads(quads[idx])
-        else:
-            lisa_sig[idx] = 'Not significant'
-
-    return lisa_sig
+    return [map_quads(q) for q in quads]
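Note on attribute ordering in the rate functions: `query_attr_select` sorts the non-reserved keys of the parameter dict before numbering the `attrN` aliases, so the alias order follows the alphabetical order of the keys and not the order of the function arguments. The sketch below mirrors only that filtering and sorting step. `attr_aliases` is a hypothetical helper written for this note, and it assumes the query template is later filled with `.format(**params)` as in the removed `knn` function.

```python
def attr_aliases(params):
    """Map each attrN alias to the column it would end up selecting.

    Mirrors the key filtering and sorting done in query_attr_select.
    """
    reserved = ('id_col', 'geom_col', 'subquery', 'num_ngbrs')
    keys = sorted(k for k in params if k not in reserved)
    return {"attr%d" % (idx + 1): params[key]
            for idx, key in enumerate(keys)}


base = {"id_col": "cartodb_id", "geom_col": "the_geom",
        "subquery": "SELECT * FROM ppoints2", "num_ngbrs": 5}

# Keys named attr1/attr2 keep the numerator first
by_position = attr_aliases(dict(base, attr1="numerator",
                                attr2="denominator"))
# Keys named numerator/denominator sort with the denominator first
by_name = attr_aliases(dict(base, numerator="numerator",
                            denominator="denominator"))
print(by_position)
print(by_name)
```

With `numerator`/`denominator` as keys, `attr1` resolves to the denominator column, so a caller reading attribute 1 as the numerator would get the two swapped.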
--- a/src/py/crankshaft/crankshaft/contours/__init__.py
+++ b/src/py/crankshaft/crankshaft/contours/__init__.py
@@ -0,0 +1 @@
+from contours import *
--- a/src/py/crankshaft/crankshaft/contours/contours.py
+++ b/src/py/crankshaft/crankshaft/contours/contours.py
@@ -0,0 +1,58 @@
+from scipy.stats import gaussian_kde
+from scipy.interpolate import griddata
+import numpy as np 
+from sklearn.neighbors import KernelDensity
+from skimage.measure import find_contours
+import plpy
+
+def cdb_generate_contours(query, grid_size, bandwidth, levels):
+    plpy.notice('one')
+    data   = plpy.execute( 'select ST_X(the_geom) as x , ST_Y(the_geom) as y from ({0}) as a '.format(query))
+    plpy.notice('two')
+
+    xs = [d['x'] for d in data]
+    ys = [d['y'] for d in data]
+    plpy.notice('three')
+    return generate_contours(xs,ys,grid_size,bandwidth,levels)
+  
+def scale_coord(coord, x_range,y_range,grid_size):
+    plpy.notice('ranges %s, %s' % (x_range, y_range))
+    return [coord[1]*(x_range[1]-x_range[0])/float(grid_size)+x_range[0],
+            coord[0]*(y_range[1]-y_range[0])/float(grid_size)+y_range[0]]
+    
+def make_wkt(data,x_range, y_range, grid_size):
+    joined = ','.join([' '.join(map(str,scale_coord(coord_pair, x_range, y_range, grid_size))) for coord_pair in data])
+    return '({0})'.format(joined)
+    
+def make_multi_line(data,x_range,y_range, grid_size):
+    joined = ','.join([ make_wkt(ring,x_range,y_range,grid_size)  for ring in data ])
+    return 'MULTILINESTRING({0})'.format(joined)
+
+def generate_contours(xs,ys, grid_res=100, bandwidth=0.001, levels=None):
+    plpy.notice("HERE")
+    max_y, min_y = np.max(ys), np.min(ys)
+    max_x, min_x = np.max(xs), np.min(xs)
+    positions = np.vstack([ys,xs]).T
+    grid_x,grid_y = np.meshgrid(np.linspace(min_x, max_x , grid_res), np.linspace(min_y, max_y, grid_res))
+    xy = np.vstack([grid_y.ravel(), grid_x.ravel()]).T
+    xy *= np.pi / 180.
+
+    plpy.notice(" Generating kernel density")
+    kde = KernelDensity(bandwidth=bandwidth, metric='haversine',
+                        kernel='gaussian', algorithm='ball_tree')
+    kde.fit(positions*np.pi/180.)
+    results = np.exp(kde.score_samples(xy))
+    results = results.reshape((grid_x.shape[0], grid_y.shape[0]))
+    
+    if not levels:
+        levels = np.linspace(results.min(), results.max(),60)
+    plpy.notice(' finding contours')
+    CS = [find_contours(results, level) for level in levels]
+    
+    vertices = []
+    for contours,level in zip(CS,levels):
+        if len(contours)>0:
+            multiline = make_multi_line(contours, (min_x,max_x), (min_y, max_y), grid_res)
+            vertices.append([level, multiline ])
+    plpy.notice('generated vertices, returning')
+    return vertices
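Note on the WKT assembly: the contour helpers turn grid-space vertices into a `MULTILINESTRING` string that `CDB_Contours` then parses with `ST_GeomFromText`. A minimal standalone sketch of that step, without `plpy`, could look like the following. The helper names are hypothetical, and it assumes each vertex is a (row, column) pair, which is how scikit-image documents the output of `find_contours`; the scaling by `grid_size` follows the code in this file.

```python
def to_xy(vertex, x_range, y_range, grid_size):
    """Scale one (row, column) grid vertex to an [x, y] pair."""
    row, col = vertex
    x = col * (x_range[1] - x_range[0]) / float(grid_size) + x_range[0]
    y = row * (y_range[1] - y_range[0]) / float(grid_size) + y_range[0]
    return [x, y]


def ring_wkt(ring, x_range, y_range, grid_size):
    """Format one contour as '(x y,x y,...)'."""
    points = [' '.join(map(str, to_xy(v, x_range, y_range, grid_size)))
              for v in ring]
    return '({0})'.format(','.join(points))


def multilinestring_wkt(rings, x_range, y_range, grid_size):
    """Join all contours of one level into a MULTILINESTRING."""
    parts = [ring_wkt(r, x_range, y_range, grid_size) for r in rings]
    return 'MULTILINESTRING({0})'.format(','.join(parts))


# Two made-up contours on a 100-cell grid covering x in [0, 10], y in [0, 20]
rings = [[(0, 0), (50, 25)], [(100, 100), (0, 100)]]
wkt = multilinestring_wkt(rings, (0.0, 10.0), (0.0, 20.0), 100)
print(wkt)
```

WKT expects each point as `x y`, so the column index has to map to the x range and the row index to the y range under the (row, column) assumption.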
--- a/src/py/crankshaft/crankshaft/pysal_utils/__init__.py
+++ b/src/py/crankshaft/crankshaft/pysal_utils/__init__.py
@@ -0,0 +1 @@
+from pysal_utils import *
--- a/src/py/crankshaft/crankshaft/pysal_utils/pysal_utils.py
+++ b/src/py/crankshaft/crankshaft/pysal_utils/pysal_utils.py
@@ -0,0 +1,152 @@
+"""
+    Utilities module for generic PySAL functionality, mainly centered on translating queries into numpy arrays or PySAL weights objects
+"""
+
+import numpy as np
+import pysal as ps
+
+def construct_neighbor_query(w_type, query_vals):
+    """Return query (a string) used for finding neighbors
+        @param w_type text: type of neighbors to calculate ('knn' or 'queen')
+        @param query_vals dict: values used to construct the query
+    """
+
+    if w_type == 'knn':
+        return knn(query_vals)
+    else:
+        return queen(query_vals)
+
+## Build weight object
+def get_weight(query_res, w_type='knn', num_ngbrs=5):
+    """
+        Construct PySAL weight from return value of query
+        @param query_res: query results with attributes and neighbors
+    """
+    if w_type == 'knn':
+        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
+        weights = {x['id']: row_normed_weights for x in query_res}
+    else:
+        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
+                            if len(x['neighbors']) > 0
+                            else [] for x in query_res}
+
+    neighbors = {x['id']: x['neighbors'] for x in query_res}
+
+    return ps.W(neighbors, weights)
+
+def query_attr_select(params):
+    """
+        Create portion of SELECT statement for attributes involved in query.
+        @param params: dict of information used in query (column names,
+                       table name, etc.)
+    """
+
+    attrs = [k for k in params
+             if k not in ('id_col', 'geom_col', 'subquery', 'num_ngbrs')]
+
+    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
+
+    attr_string = ""
+
+    for idx, val in enumerate(sorted(attrs)):
+        attr_string += template % {"col": val, "alias_num": idx + 1}
+
+    return attr_string
+
+def query_attr_where(params):
+    """
+        Create portion of WHERE clause for weeding out NULL-valued attributes
+    """
+    attrs = sorted([k for k in params
+                    if k not in ('id_col', 'geom_col', 'subquery', 'num_ngbrs')])
+
+    attr_string = []
+
+    for attr in attrs:
+        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
+
+    if len(attrs) == 2:
+        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
+
+    out = " AND ".join(attr_string)
+
+    return out
+
+def knn(params):
+    """SQL query for k-nearest neighbors.
+        @param params: dict of values to fill template
+    """
+
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                              "FROM ({subquery}) As j " \
+                              "WHERE " \
+                                "i.\"{id_col}\" <> j.\"{id_col}\" AND " \
+                                "%(attr_where_j)s " \
+                              "ORDER BY " \
+                                "j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
+                              "LIMIT {num_ngbrs})" \
+                ") As neighbors " \
+            "FROM ({subquery}) As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## SQL query for finding queens neighbors (all contiguous polygons)
+def queen(params):
+    """SQL query for queen neighbors.
+        @param params dict: information to fill query
+    """
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                 "FROM ({subquery}) As j " \
+                 "WHERE i.\"{id_col}\" <> j.\"{id_col}\" AND " \
+                       "ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
+                       "%(attr_where_j)s)" \
+                ") As neighbors " \
+            "FROM ({subquery}) As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## to add more weight methods open a ticket or pull request
+
+def get_attributes(query_res, attr_num=1):
+    """
+        @param query_res: query results with attributes and neighbors
+        @param attr_num: attribute number (1, 2, ...)
+    """
+    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=float)
+
+def empty_zipped_array(num_nones):
+    """
+        prepare return values for cases of empty weights objects (no neighbors)
+        Input:
+        @param num_nones int: number of columns (e.g., 4)
+        Output:
+        [(None, None, None, None)]
+    """
+
+    return [tuple([None] * num_nones)]
--- a/src/py/crankshaft/setup.py
+++ b/src/py/crankshaft/setup.py
@@ -10,7 +10,7 @@ from setuptools import setup, find_packages
 setup(
    name='crankshaft',

-    version='0.0.1',
+    version='0.0.0',

    description='CartoDB Spatial Analysis Python Library',

@@ -42,7 +42,7 @@ setup(
    # provisioned in the production servers.
    install_requires=['pysal==1.9.1'],

-    requires=['pysal', 'numpy' ],
+    requires=['pysal', 'numpy', 'sklearn', 'scikit-image'],

    test_suite='test'
 )
--- a/src/py/crankshaft/test/fixtures/moran.json
+++ b/src/py/crankshaft/test/fixtures/moran.json
@@ -1,52 +1,52 @@
 [[0.9319096128346788, "HH"],
 [-1.135787401862846, "HL"],
-[0.11732030672508517, "Not significant"],
-[0.6152779669180425, "Not significant"],
-[-0.14657336660125297, "Not significant"],
-[0.6967858120189607, "Not significant"],
-[0.07949310115714454, "Not significant"],
-[0.4703198759258987, "Not significant"],
-[0.4421125200498064, "Not significant"],
-[0.5724288737143592, "Not significant"],
+[0.11732030672508517, "LL"],
+[0.6152779669180425, "LL"],
+[-0.14657336660125297, "LH"],
+[0.6967858120189607, "LL"],
+[0.07949310115714454, "HH"],
+[0.4703198759258987, "HH"],
+[0.4421125200498064, "HH"],
+[0.5724288737143592, "LL"],
 [0.8970743435692062, "LL"],
-[0.18327334401918674, "Not significant"],
-[-0.01466729201304962, "Not significant"],
-[0.3481559372544409, "Not significant"],
-[0.06547094736902978, "Not significant"],
+[0.18327334401918674, "LL"],
+[-0.01466729201304962, "HL"],
+[0.3481559372544409, "LL"],
+[0.06547094736902978, "LL"],
 [0.15482141569329988, "HH"],
-[0.4373841193538136, "Not significant"],
-[0.15971286468915544, "Not significant"],
-[1.0543588860308968, "Not significant"],
+[0.4373841193538136, "HH"],
+[0.15971286468915544, "LL"],
+[1.0543588860308968, "HH"],
 [1.7372866900020818, "HH"],
 [1.091998586053999, "LL"],
-[0.1171572584252222, "Not significant"],
-[0.08438455015300014, "Not significant"],
-[0.06547094736902978, "Not significant"],
+[0.1171572584252222, "HH"],
+[0.08438455015300014, "LL"],
+[0.06547094736902978, "LL"],
 [0.15482141569329985, "HH"],
 [1.1627044812890683, "HH"],
-[0.06547094736902978, "Not significant"],
-[0.795275137550483, "Not significant"],
+[0.06547094736902978, "LL"],
+[0.795275137550483, "HH"],
 [0.18562939195219, "LL"],
-[0.3010757406693439, "Not significant"],
+[0.3010757406693439, "LL"],
 [2.8205795942839376, "HH"],
-[0.11259190602909264, "Not significant"],
-[-0.07116352791516614, "Not significant"],
-[-0.09945240794119009, "Not significant"],
+[0.11259190602909264, "LL"],
+[-0.07116352791516614, "HL"],
+[-0.09945240794119009, "LH"],
 [0.18562939195219, "LL"],
-[0.1832733440191868, "Not significant"],
-[-0.39054253768447705, "Not significant"],
+[0.1832733440191868, "LL"],
+[-0.39054253768447705, "HL"],
 [-0.1672071289487642, "HL"],
-[0.3337669247916343, "Not significant"],
-[0.2584386102554792, "Not significant"],
+[0.3337669247916343, "HH"],
+[0.2584386102554792, "HH"],
 [-0.19733845476322634, "HL"],
 [-0.9379282899805409, "LH"],
-[-0.028770969951095866, "Not significant"],
-[0.051367269430983485, "Not significant"],
+[-0.028770969951095866, "LH"],
+[0.051367269430983485, "LL"],
 [-0.2172548045913472, "LH"],
-[0.05136726943098351, "Not significant"],
-[0.04191046803899837, "Not significant"],
+[0.05136726943098351, "LL"],
+[0.04191046803899837, "LL"],
 [0.7482357030403517, "HH"],
-[-0.014585767863118111, "Not significant"],
-[0.5410013139159929, "Not significant"],
+[-0.014585767863118111, "LH"],
+[0.5410013139159929, "HH"],
 [1.0223932668429925, "LL"],
-[1.4179402898927476, "LL"]]
+[1.4179402898927476, "LL"]]
--- a/src/py/crankshaft/test/test_clustering_moran.py
+++ b/src/py/crankshaft/test/test_clustering_moran.py
@@ -1,8 +1,6 @@
 import unittest
 import numpy as np

-import unittest
-

 # from mock_plpy import MockPlPy
 # plpy = MockPlPy()
@@ -12,25 +10,26 @@ import unittest
 from helper import plpy, fixture_file

 import crankshaft.clustering as cc
+import crankshaft.pysal_utils as pu
 from crankshaft import random_seeds
 import json

 class MoranTest(unittest.TestCase):
-    """Testing class for Moran's I functions."""
+    """Testing class for Moran's I functions"""

    def setUp(self):
        plpy._reset()
        self.params = {"id_col": "cartodb_id",
                       "attr1": "andy",
                       "attr2": "jay_z",
-                       "table": "a_list",
+                       "subquery": "SELECT * FROM a_list",
                       "geom_col": "the_geom",
                       "num_ngbrs": 321}
        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
        self.moran_data = json.loads(open(fixture_file('moran.json')).read())

    def test_map_quads(self):
-        """Test map_quads."""
+        """Test map_quads"""
        self.assertEqual(cc.map_quads(1), 'HH')
        self.assertEqual(cc.map_quads(2), 'LH')
        self.assertEqual(cc.map_quads(3), 'LL')
@@ -38,80 +37,8 @@ class MoranTest(unittest.TestCase):
        self.assertEqual(cc.map_quads(33), None)
        self.assertEqual(cc.map_quads('andy'), None)

-    def test_query_attr_select(self):
-        """Test query_attr_select."""
-
-        ans = "i.\"{attr1}\"::numeric As attr1, " \
-              "i.\"{attr2}\"::numeric As attr2, "
-
-        self.assertEqual(cc.query_attr_select(self.params), ans)
-
-    def test_query_attr_where(self):
-        """Test query_attr_where."""
-
-        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" <> 0"
-
-        self.assertEqual(cc.query_attr_where(self.params), ans)
-
-    def test_knn(self):
-        """Test knn function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
-              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
-              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
-              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
-              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
-              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
-              "BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.knn(self.params), ans)
-
-    def test_queen(self):
-        """Test queen neighbors function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
-              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
-              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
-              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
-              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
-              "i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.queen(self.params), ans)
-
-    def test_get_query(self):
-        """Test get_query."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
-              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
-              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
-              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
-              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
-              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.get_query('knn', self.params), ans)
-
-    def test_get_attributes(self):
-        """Test get_attributes."""
-
-        ## need to add tests
-
-        self.assertEqual(True, True)
-
-    def test_get_weight(self):
-        """Test get_weight."""
-
-        self.assertEqual(True, True)
-
-
    def test_quad_position(self):
-        """Test lisa_sig_vals."""
+        """Test lisa_sig_vals"""

        quads = np.array([1, 2, 3, 4], np.int)

@@ -125,7 +52,7 @@ class MoranTest(unittest.TestCase):
        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
        plpy._define_result('select', data)
        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = cc.moran_local('subquery', 'value', 99, 'the_geom', 'cartodb_id', 'knn', 5)
        result = [(row[0], row[1]) for row in result]
        expected = self.moran_data
        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
@@ -137,8 +64,20 @@ class MoranTest(unittest.TestCase):
        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
        plpy._define_result('select', data)
        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
+        result = cc.moran_local_rate('subquery', 'numerator', 'denominator', 99, 'the_geom', 'cartodb_id', 'knn', 5)
+        self.assertIsNotNone(result)
        result = [(row[0], row[1]) for row in result]
        expected = self.moran_data
        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
            self.assertAlmostEqual(res_val, exp_val)
+
+    def test_moran(self):
+        """Test Moran's I global"""
+        data = [{ 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
+        plpy._define_result('select', data)
+        random_seeds.set_random_seeds(1235)
+        result = cc.moran('table', 'value', 99, 'the_geom', 'cartodb_id', 'knn', 5)
+        self.assertIsNotNone(result)
+        result_moran = result[0][0]
+        expected_moran = np.array([row[0] for row in self.moran_data]).mean()
+        self.assertAlmostEqual(expected_moran, result_moran, delta=10e-2)
--- a/src/py/crankshaft/test/test_pysal_utils.py
+++ b/src/py/crankshaft/test/test_pysal_utils.py
@@ -0,0 +1,107 @@
+import unittest
+
+import crankshaft.pysal_utils as pu
+from crankshaft import random_seeds
+
+
+class PysalUtilsTest(unittest.TestCase):
+    """Testing class for utility functions related to PySAL integrations"""
+
+    def setUp(self):
+        self.params = {"id_col": "cartodb_id",
+                       "attr1": "andy",
+                       "attr2": "jay_z",
+                       "subquery": "SELECT * FROM a_list",
+                       "geom_col": "the_geom",
+                       "num_ngbrs": 321}
+
+    def test_query_attr_select(self):
+        """Test query_attr_select"""
+
+        ans = "i.\"{attr1}\"::numeric As attr1, " \
+              "i.\"{attr2}\"::numeric As attr2, "
+
+        self.assertEqual(pu.query_attr_select(self.params), ans)
+
+    def test_query_attr_where(self):
+        """Test pu.query_attr_where"""
+
+        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND " \
+              "idx_replace.\"{attr2}\" IS NOT NULL AND " \
+              "idx_replace.\"{attr2}\" <> 0"
+
+        self.assertEqual(pu.query_attr_where(self.params), ans)
+
+    def test_knn(self):
+        """Test knn neighbors constructor"""
+
+        ans = "SELECT i.\"cartodb_id\" As id, " \
+                     "i.\"andy\"::numeric As attr1, " \
+                     "i.\"jay_z\"::numeric As attr2, " \
+                     "(SELECT ARRAY(SELECT j.\"cartodb_id\" " \
+                                   "FROM (SELECT * FROM a_list) As j " \
+                                   "WHERE " \
+                                    "i.\"cartodb_id\" <> j.\"cartodb_id\" AND " \
+                                    "j.\"andy\" IS NOT NULL AND " \
+                                    "j.\"jay_z\" IS NOT NULL AND " \
+                                    "j.\"jay_z\" <> 0 " \
+                                   "ORDER BY " \
+                                    "j.\"the_geom\" <-> i.\"the_geom\" ASC " \
+                      "LIMIT 321)) As neighbors " \
+              "FROM (SELECT * FROM a_list) As i " \
+              "WHERE i.\"andy\" IS NOT NULL AND " \
+                    "i.\"jay_z\" IS NOT NULL AND " \
+                    "i.\"jay_z\" <> 0 " \
+              "ORDER BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(pu.knn(self.params), ans)
+
+    def test_queen(self):
+        """Test queen neighbors constructor"""
+
+        ans = "SELECT i.\"cartodb_id\" As id, " \
+                     "i.\"andy\"::numeric As attr1, " \
+                     "i.\"jay_z\"::numeric As attr2, " \
+                     "(SELECT ARRAY(SELECT j.\"cartodb_id\" " \
+                                   "FROM (SELECT * FROM a_list) As j " \
+                                   "WHERE " \
+                                   "i.\"cartodb_id\" <> j.\"cartodb_id\" AND " \
+                                   "ST_Touches(i.\"the_geom\", " \
+                                              "j.\"the_geom\") AND " \
+                                   "j.\"andy\" IS NOT NULL AND " \
+                                   "j.\"jay_z\" IS NOT NULL AND " \
+                                   "j.\"jay_z\" <> 0)" \
+                                  ") As neighbors " \
+              "FROM (SELECT * FROM a_list) As i " \
+              "WHERE i.\"andy\" IS NOT NULL AND " \
+                    "i.\"jay_z\" IS NOT NULL AND " \
+                    "i.\"jay_z\" <> 0 " \
+              "ORDER BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(pu.queen(self.params), ans)
+
+    def test_construct_neighbor_query(self):
+        """Test construct_neighbor_query"""
+
+        # Compare to raw knn query
+        self.assertEqual(pu.construct_neighbor_query('knn', self.params),
+                         pu.knn(self.params))
+
+    def test_get_attributes(self):
+        """Test get_attributes"""
+
+        ## need to add tests
+
+        self.assertEqual(True, True)
+
+    def test_get_weight(self):
+        """Test get_weight"""
+
+        self.assertEqual(True, True)
+
+    def test_empty_zipped_array(self):
+        """Test empty_zipped_array"""
+        ans2 = [(None, None)]
+        ans4 = [(None, None, None, None)]
+        self.assertEqual(pu.empty_zipped_array(2), ans2)
+        self.assertEqual(pu.empty_zipped_array(4), ans4)
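
The expected strings in `test_query_attr_select` and `test_knn` above differ because the query builders fill their templates in two passes: `%`-formatting first writes `{attr1}`-style placeholders, and `str.format(**params)` later swaps them for real column names. The sketch below walks through that on the test's own parameter dict. It is a minimal illustration, not the module's code: `build_attr_select`, `RESERVED_KEYS` and `demo_params` are hypothetical names introduced here, and only the attribute portion of the SELECT is reproduced.

```python
# Illustrative sketch only. `build_attr_select`, `RESERVED_KEYS` and
# `demo_params` are hypothetical names; the real helper in the diff is
# `query_attr_select` in pysal_utils.py.

RESERVED_KEYS = ('id_col', 'geom_col', 'subquery', 'num_ngbrs')


def build_attr_select(params):
    """Pass 1: %-formatting emits `{attrN}` placeholders and numbered aliases."""
    template = 'i."{%(col)s}"::numeric As attr%(alias_num)s, '
    attrs = sorted(k for k in params if k not in RESERVED_KEYS)
    return "".join(template % {"col": key, "alias_num": idx + 1}
                   for idx, key in enumerate(attrs))


demo_params = {"id_col": "cartodb_id",
               "attr1": "andy",
               "attr2": "jay_z",
               "subquery": "SELECT * FROM a_list",
               "geom_col": "the_geom",
               "num_ngbrs": 321}

# After pass 1 the string still contains `{attr1}` and `{attr2}`.
stage_one = build_attr_select(demo_params)

# Pass 2: str.format replaces each placeholder with the actual column name.
stage_two = stage_one.format(**demo_params)

print(stage_one)
print(stage_two)
```

Both passes interpolate values directly into the SQL text, so column names and the subquery are assumed to come from trusted input.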
Author	SHA1	Message	Date
Ubuntu	1dbbb5ecaa	updating to run in crankshaft. Output still wrong but getting there	2016-05-20 20:44:40 +00:00
Stuart Lynn	a00c8df201	adding deps	2016-05-18 17:53:35 -04:00
Stuart Lynn	a216a06cbc	missing init	2016-05-18 17:34:22 -04:00
Stuart Lynn	874b5318ff	fixing bugs and adding contours to the payload	2016-05-18 17:32:14 -04:00
Stuart Lynn	e59befae82	first stab at contouring code	2016-05-18 17:22:42 -04:00
Andy Eschbacher	633b63bccc	Merge pull request #25 from CartoDB/improve-moran-queries-revisited adding condition to avoid self-comparison in neighbor queries	2016-03-30 15:40:29 -04:00
Andy Eschbacher	ea02f36235	adding condition to avoid self-comparison in neighbor queries	2016-03-30 15:37:51 -04:00
Andy Eschbacher	22b6aed7c1	Merge pull request #16 from CartoDB/proof-read-and-gitignore-update Proof read and gitignore update	2016-03-30 12:37:29 -04:00
Andy Eschbacher	f6e8524669	Merge pull request #19 from CartoDB/restructure-moran-redux Restructure moran redux	2016-03-30 12:10:36 -04:00
Andy Eschbacher	02b74813ac	add test for global moran	2016-03-30 12:09:49 -04:00
Andy Eschbacher	4c243bf1d3	correct func signatures	2016-03-30 11:44:44 -04:00
Andy Eschbacher	b0150d4fec	adding tests for pysal_utils	2016-03-30 08:27:14 -04:00
Andy Eschbacher	6bb4f36df5	extracting util code to new submodule	2016-03-30 08:10:35 -04:00
Andy Eschbacher	5a46f65e59	update tests to remove plpy notices	2016-03-30 08:09:48 -04:00
Andy Eschbacher	e56519f599	removed unneded comments, make outputs more consistent	2016-03-29 23:39:29 -07:00
Andy Eschbacher	8dd8ab37a5	refactored from pylint	2016-03-29 22:49:31 -07:00
Andy Eschbacher	06f5cf9951	standarizing error reporting	2016-03-29 12:34:23 -07:00
Andy Eschbacher	bc67ae8f69	changed name of functions for observatory	2016-03-29 12:18:52 -07:00
Andy Eschbacher	eecbe39547	updating tests	2016-03-22 10:42:44 -04:00
Andy Eschbacher	1578b17eb8	updated function flow without significance	2016-03-22 10:42:06 -04:00
Andy Eschbacher	3eda8ecd16	new signatures for moran (w/o significance)	2016-03-22 10:34:22 -04:00
Andy Eschbacher	0aa4d0a50e	typo fixes, linking, etc.	2016-03-21 08:51:10 -04:00
Andy Eschbacher	3b31da783a	adding mac ds_store ignore	2016-03-21 08:40:37 -04:00
Javier Goizueta	8762f6ca1c	Merge pull request #12 from CartoDB/feat-moran-free-queries Allow to pass free queries as `select * from table limit 100` in moran	2016-03-16 19:43:15 +01:00
Raul Ochoa	58c141d217	Allow to pass free queries as `select * from table limit 100` in moran	2016-03-16 19:40:06 +01:00
Javier Goizueta	5a7d3178dd	Release 0.0.2 This version is the first with the new versioning approach which uses separate per-version Pyhton virtual enironments.	2016-03-16 19:22:21 +01:00
Javier Goizueta	4903af6cdc	Add existing release 0.0.1 The existing 0.0.1 files are placed into their location in release/	2016-03-16 18:41:49 +01:00
Javier Goizueta	692014d694	Merge pull request #11 from CartoDB/new-versioning-package-varenv New versioning process (with multiple virtual environments)	2016-03-16 18:21:52 +01:00
Javier Goizueta	47e0253652	Fixes to the documentation	2016-03-16 18:18:59 +01:00
Javier Goizueta	9f03a9b075	Reorganize the documentation into separate files Keep a "Quickstart Guide" in the README, add separate detailed sections for development (CONTRIBUTING) and release/deployment (RELEASE).	2016-03-16 17:42:28 +01:00
Javier Goizueta	b5281d0681	Documentation clarifications and corrections.	2016-03-16 17:19:21 +01:00
Javier Goizueta	689ec8a925	Change version function from IMMUTABLE to STABLE These functions' results will change when the extension is updated.	2016-03-16 17:09:50 +01:00
Javier Goizueta	a7e42e93cc	Rename cdb_crankshaft_internal_version as internal function	2016-03-16 16:41:54 +01:00
Javier Goizueta	bad09ffd7b	Remove abandoned alternatives from the documentation	2016-03-16 16:30:03 +01:00
Javier Goizueta	4706442a1d	Add documentation about useful make targets	2016-03-16 15:56:19 +01:00
Javier Goizueta	935c7f9963	Add missing Makefile comment	2016-03-16 15:54:39 +01:00
Javier Goizueta	ef3bcaeee8	Restore commented-out make target	2016-03-16 15:52:47 +01:00
Javier Goizueta	4ffb2c9664	Review and fix the documentation	2016-03-16 15:45:13 +01:00
Javier Goizueta	dea6e2f1a7	Refactor the Makefile Separate concerns properly for each subdirectory's Makefile	2016-03-16 15:40:40 +01:00
Javier Goizueta	d13f167d47	Add RELEASE_VERSION option to make deploy Now make deploy installs by default the current version, but can be made to install any prior specific version using a environmnt varialbe RELEASE_VERSION	2016-03-16 14:38:18 +01:00
Javier Goizueta	a518034e65	Fix .pyc files need not only be ignored inside src/py	2016-03-16 11:13:26 +01:00
Javier Goizueta	24e4037995	Fix version number of released extension script	2016-03-16 11:11:16 +01:00
Javier Goizueta	82a738fe40	Fix make clean tasks	2016-03-16 10:18:07 +01:00
Javier Goizueta	e801c9cb60	Release tasks using release-specific virtual environments Refine the development process and define the procedure for releasing new versions.	2016-03-15 18:48:46 +01:00