Compare commits

...

25 Commits

Author SHA1 Message Date
Javier Goizueta
5a7d3178dd Release 0.0.2
This version is the first with the new versioning approach
which uses separate per-version Python virtual environments.
2016-03-16 19:22:21 +01:00
Javier Goizueta
4903af6cdc Add existing release 0.0.1
The existing 0.0.1 files are placed into their location in release/
2016-03-16 18:41:49 +01:00
Javier Goizueta
692014d694 Merge pull request #11 from CartoDB/new-versioning-package-varenv
New versioning process (with multiple virtual environments)
2016-03-16 18:21:52 +01:00
Javier Goizueta
47e0253652 Fixes to the documentation 2016-03-16 18:18:59 +01:00
Javier Goizueta
9f03a9b075 Reorganize the documentation into separate files
Keep a "Quickstart Guide" in the README, add separate
detailed sections for development (CONTRIBUTING) and
release/deployment (RELEASE).
2016-03-16 17:42:28 +01:00
Javier Goizueta
b5281d0681 Documentation clarifications and corrections. 2016-03-16 17:19:21 +01:00
Javier Goizueta
689ec8a925 Change version function from IMMUTABLE to STABLE
These functions' results will change when the extension
is updated.
2016-03-16 17:09:50 +01:00
Javier Goizueta
a7e42e93cc Rename cdb_crankshaft_internal_version as internal function 2016-03-16 16:41:54 +01:00
Javier Goizueta
bad09ffd7b Remove abandoned alternatives from the documentation 2016-03-16 16:30:03 +01:00
Javier Goizueta
4706442a1d Add documentation about useful make targets 2016-03-16 15:56:19 +01:00
Javier Goizueta
935c7f9963 Add missing Makefile comment 2016-03-16 15:54:39 +01:00
Javier Goizueta
ef3bcaeee8 Restore commented-out make target 2016-03-16 15:52:47 +01:00
Javier Goizueta
4ffb2c9664 Review and fix the documentation 2016-03-16 15:45:13 +01:00
Javier Goizueta
dea6e2f1a7 Refactor the Makefile
Separate concerns properly for each subdirectory's Makefile
2016-03-16 15:40:40 +01:00
Javier Goizueta
d13f167d47 Add RELEASE_VERSION option to make deploy
Now make deploy installs the current version by default,
but can be made to install any specific prior version using
an environment variable RELEASE_VERSION
2016-03-16 14:38:18 +01:00
Javier Goizueta
a518034e65 Fix: .pyc files need to be ignored not only inside src/py 2016-03-16 11:13:26 +01:00
Javier Goizueta
24e4037995 Fix version number of released extension script 2016-03-16 11:11:16 +01:00
Javier Goizueta
82a738fe40 Fix make clean tasks 2016-03-16 10:18:07 +01:00
Javier Goizueta
e801c9cb60 Release tasks using release-specific virtual environments
Refine the development process and define the procedure for
releasing new versions.
2016-03-15 18:48:46 +01:00
Javier Goizueta
0206cc6c44 Update documentation 2016-03-10 19:13:46 +01:00
Rafa de la Torre
b754ffe42a Add info about python dependencies 2016-03-10 18:06:21 +01:00
Javier Goizueta
0056f411b5 Set the path to virtualenvs in the Makefile
Also, version the virtualenv
2016-03-09 19:04:21 +01:00
Javier Goizueta
1810f02242 Use SciPy from system package python-scipy 2016-03-09 15:03:17 +01:00
Javier Goizueta
8e972128eb Modify sql code to use the python virtualenv 2016-03-09 15:00:50 +01:00
Javier Goizueta
cdd2d9e722 Directory reorganization and sketch of new versioning procedure 2016-03-08 19:35:02 +01:00
79 changed files with 2190 additions and 200 deletions

View File

@@ -1 +1,2 @@
envs/
*.pyc

View File

@@ -1,84 +1,91 @@
# Contributing guide
# Development process
## How to add new functions
Please read the Working Process/Quickstart Guide in README.md first.
Try to put as little logic in the SQL extension as possible and
just use it as a wrapper to the Python module functionality.
For any modification of crankshaft, such as adding new features,
refactoring or bug-fixing, a topic branch must be created out of the
`develop` branch and be used for the development process.
Once a function is defined it should never change its signature in subsequent
versions. To change a function's signature a new function with a different
name must be created.
Modifications are done inside `src/pg/sql` and `src/py/crankshaft`.
### Version numbers
Take into account:
The version of both the SQL extension and the Python package shall
follow the [Semantic Versioning 2.0](http://semver.org/) guidelines:
* Tests must be added for any new functionality
(inside `src/pg/test`, `src/py/crankshaft/test`) as well as to
detect any bugs that are being fixed.
* Add or modify the corresponding documentation files in the `doc` folder.
Since we expect to have highly technical functions here, an extensive
background explanation would be of great help to users of this extension.
* Convention: snake case (i.e. `snake_case` and not `CamelCase`)
shall be used for all function names.
Prefix function names intended for public use with `cdb_`
and private functions (to be used only internally inside
the extension) with `_cdb_`.
* When backwards incompatibility is introduced the major number is incremented
* When functionality is added (in a backwards-compatible manner) the minor number
is incremented
* When only fixes are introduced (backwards-compatible) the patch number is
incremented
Once the code is ready to be tested, update the local development installation
with `sudo make install`.
This will update the 'dev' version of the extension in `src/pg/` and
make it available to PostgreSQL.
It will also install the python package (crankshaft) in a virtual
environment `env/dev`.
### Python Package
The version number of the Python package, defined in
`src/py/crankshaft/setup.py`, will be overridden when
the package is released, and will always match the extension version
number; for development it shall be kept as '0.0.0'.
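For illustration only, a minimal Python sketch of that version override (the release task in `src/py/Makefile` does the equivalent with `sed`):
```python
import re

def set_package_version(setup_py_path, version):
    """Rewrite the version='X.Y.Z' field of a setup.py in place."""
    with open(setup_py_path) as f:
        source = f.read()
    source = re.sub(r"version='\d+\.\d+\.\d+'", "version='%s'" % version, source)
    with open(setup_py_path, 'w') as f:
        f.write(source)

# e.g.: set_package_version('release/python/0.0.2/crankshaft/setup.py', '0.0.2')
```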
...
Run the tests with `make test`.
### SQL Extension
* Generate a **new subfolder version** for `sql` and `test` folders to define
the new functions and tests
- Use symlinks to avoid file duplication between versions for files that
don't change
- Add new files or modify copies of the old files to add new functions or
modify existing functions (remember to rename a function if the signature
changes)
- Add or modify the corresponding documentation files in the `doc` folder.
Since we expect to have highly technical functions here, an extensive
background explanation would be of great help to users of this extension.
- Create tests for the new functions/behaviour
* Generate the **upgrade and downgrade files** for the extension
* Update the control file and the Makefile to generate the complete SQL
file for the new created version. After running `make` a new
file `crankshaft--X.Y.Z.sql` will be created for the current version.
Additional files for migrating to/from the previous version A.B.C should be
created:
- `crankshaft--X.Y.Z--A.B.C.sql`
- `crankshaft--A.B.C--X.Y.Z.sql`
All these new files must be added to git and pushed.
* Update the public docs! ;-)
## Conventions
# SQL
Use snake case (i.e. `snake_case` and not `CamelCase`) for all
functions. Prefix functions intended for public use with `cdb_`
and private functions (to be used only internally inside
the extension) with `_cdb_`.
# Python
...
## Testing
Running just the Python tests:
```
(cd python && make test)
```
To use the python extension for custom tests, activate the virtual
environment with:
```
source envs/dev/bin/activate
```
Installing the Extension and running just the PostgreSQL tests:
Update extension in a working database with:
* `ALTER EXTENSION crankshaft UPDATE TO 'current';`
`ALTER EXTENSION crankshaft UPDATE TO 'dev';`
Note: we always keep the current development version installed as 'dev';
we update through the 'current' alias to allow changing the extension
contents but not the version identifier. This will fail if the
changes involve incompatible function changes such as a different
return type; in that case the offending function (or the whole extension)
should be dropped manually before the update.
If the extension has not previously been installed in a database,
it can be installed directly with:
* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`
Note: the development extension uses the development python virtual
environment automatically.
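The mechanism behind this (see `_cdb_crankshaft_activate_py` in `src/pg/sql/02_py.sql`) is the standard virtualenv `activate_this.py` hook; a trimmed sketch of what runs inside PL/Python, with a placeholder base path:
```python
import os

# Pick the virtualenv matching the installed extension version ('dev' in
# development), unless CRANKSHAFT_VENV overrides it.
base_path = '/path/to/crankshaft/envs'  # substituted at build time
venv_path = os.environ.get('CRANKSHAFT_VENV', os.path.join(base_path, 'dev'))
activate_path = venv_path + '/bin/activate_this.py'

# Executing activate_this.py rewires sys.path so that packages installed
# in the virtualenv become importable from this interpreter.
exec(open(activate_path).read(), dict(__file__=activate_path))
```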
Before proceeding to the release process, peer review of the code is
a must.
Once the feature or bugfix is completed and all the tests are passing
a Pull-Request shall be created on the topic branch, reviewed by a peer
and then merged back into the `develop` branch when all CI tests pass.
When the changes in the `develop` branch are to be released in a new
version of the extension, a PR must be created on the `develop` branch.
The release manager will take over the PR at this point and proceed
with the release process for a new revision of the extension.
## Relevant development tasks available in the Makefile
```
(cd pg && sudo make install && PGUSER=postgres make installcheck)
```
* `make help` shows a short description of the available targets
Installing and testing everything:
* `sudo make install` will generate the extension scripts for the development
version ('dev'/'current') and install the python package into the
development virtual environment `envs/dev`.
Intended for use by developers.
```
sudo make install && PGUSER=postgres make testinstalled
```
* `make test` will run the tests for the installed development extension.
Intended for use by developers.

View File

@@ -1,43 +0,0 @@
# Workflow
... (branching/merging flow)
# Deployment
...
Deployment to db servers: the next command will install both the Python
package and the extension.
```
sudo make install
```
Installing only the Python package:
```
sudo pip install python/crankshaft --upgrade
```
Caveat: note that `pip install ./crankshaft` will install
from local files, but `pip install crankshaft` will not.
CI: Install and run the tests on the installed extension and package:
```
(sudo make install && PGUSER=postgres make testinstalled)
```
Installing the extension in user databases:
Once installed in a server, the extension can be added
to a database with the next SQL command:
```
CREATE EXTENSION crankshaft;
```
To upgrade the extension to a specific version X.Y.Z:
```
ALTER EXTENSION crankshaft UPDATE TO 'X.Y.Z';
```

View File

@@ -1,13 +1,70 @@
EXT_DIR = pg
PYP_DIR = python
include ./Makefile.global
EXT_DIR = src/pg
PYP_DIR = src/py
.PHONY: install
.PHONY: run_tests
.PHONY: release
.PHONY: deploy
install:
# Generate and install development versions of the extension
# and python package.
# The extension is named 'dev' with a 'current' alias for easy upgrading.
# The Python package is installed in a virtual environment envs/dev/
# Requires sudo.
install: ## Generate and install development version of the extension; requires sudo.
$(MAKE) -C $(PYP_DIR) install
$(MAKE) -C $(EXT_DIR) install
testinstalled:
$(MAKE) -C $(PYP_DIR) testinstalled
$(MAKE) -C $(EXT_DIR) installcheck
# Run the tests for the installed development extension and
# python package
test: ## Run the tests for the development version of the extension
$(MAKE) -C $(PYP_DIR) test
$(MAKE) -C $(EXT_DIR) test
# Generate a new release into release/
release: ## Generate a new release of the extension. Only for release manager
$(MAKE) -C $(EXT_DIR) release
$(MAKE) -C $(PYP_DIR) release
# Install the current release.
# The Python package is installed in a virtual environment envs/X.Y.Z/
# Requires sudo.
# Use the RELEASE_VERSION environment variable to deploy a specific version:
# sudo make deploy RELEASE_VERSION=1.0.0
deploy: ## Deploy a released extension. Only for release manager. Requires sudo.
$(MAKE) -C $(EXT_DIR) deploy
$(MAKE) -C $(PYP_DIR) deploy
# Cleanup development extension script files
clean-dev: ## clean up development extension script files
rm -f src/pg/$(EXTENSION)--*.sql
# Cleanup all releases
clean-releases: ## clean up all releases
rm -rf release/python/*
rm -f release/$(EXTENSION)--*.sql
rm -f release/$(EXTENSION).control
# Cleanup current/specific version
clean-release: ## clean up current release
rm -rf release/python/$(RELEASE_VERSION)
rm -f release/$(RELEASE_VERSION)--*.sql
# Cleanup all virtual environments
clean-environments: ## clean up all virtual environments
rm -rf envs/*
clean-all: clean-dev clean-release clean-environments
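# List each target annotated with a '##' comment along with its description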
help:
@IFS=$$'\n' ; \
help_lines=(`fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//'`); \
for help_line in $${help_lines[@]}; do \
IFS=$$'#' ; \
help_split=($$help_line) ; \
help_command=`echo $${help_split[0]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
help_info=`echo $${help_split[2]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
printf "%-30s %s\n" $$help_command $$help_info ; \
done

Makefile.global Normal file
View File

@@ -0,0 +1,6 @@
SELF_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
EXTENSION = crankshaft
PACKAGE = crankshaft
EXTVERSION = $(shell grep default_version $(SELF_DIR)/src/pg/$(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
RELEASE_VERSION ?= $(EXTVERSION)
SED = sed
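For clarity, the `EXTVERSION` line above just pulls `default_version` out of the extension control file; an equivalent sketch in Python:
```python
import re

def extension_version(control_path='src/pg/crankshaft.control'):
    """Return the default_version declared in the control file."""
    with open(control_path) as f:
        for line in f:
            match = re.match(r"default_version\s*=\s*'([^']*)'", line)
            if match:
                return match.group(1)

print(extension_version())  # e.g. 0.0.2
```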

NEWS.md Normal file
View File

@@ -0,0 +1,7 @@
0.0.2 (2016-03-16)
------------------
* New versioning approach using per-version Python virtual environments
0.0.1 (2016-02-22)
------------------
* Preliminary release

View File

@@ -4,9 +4,68 @@ CartoDB Spatial Analysis extension for PostgreSQL.
## Code organization
* *pg* contains the PostgreSQL extension source code
* *python* Python module
* *doc* documentation
* *src* source code
  - *src/pg* contains the PostgreSQL extension source code
  - *src/py* Python module source code
* *release* released versions
* *envs* base directory for Python virtual environments
## Requirements
* pip
* pip, virtualenv, PostgreSQL
* python-scipy system package (see src/py/README.md)
# Working Process -- Quickstart Guide
We distinguish two roles regarding the development cycle of crankshaft:
* *developers* will implement new functionality and bugfixes into
the codebase and will request new releases of the extension.
* A *release manager* will attend these requests and will handle
the release process. The release process is sequential:
no concurrent releases will ever be in the works.
We use the default `develop` branch as the basis for development.
The `master` branch is used to merge and tag releases to be
deployed in production.
Developers shall create a new topic branch from `develop` for any new feature
or bugfix and commit their changes to it and eventually merge back into
the `develop` branch. When a new release is required a Pull Request
will be opened against the `develop` branch.
The `develop` pull requests will be handled by the release manager,
who will merge into `master`, where new releases are prepared and tagged.
The `master` branch is the sole responsibility of the release manager
and developers must not commit or merge into it.
## Development Guidelines
For a detailed description of the development process please see
the CONTRIBUTING.md guide.
Any modification to the source code (`src/pg/sql` for the SQL extension,
`src/py/crankshaft` for the Python package) shall always be done
in a topic branch created from the `develop` branch.
Tests, documentation and peer code reviewing are required for all
modifications.
The tests (both for SQL and Python) are executed by running,
from the top directory:
```
sudo make install
make test
```
To request a new release, which will be handled by the
release manager, a Pull Request must be created against the `develop`
branch.
## Release
The release and deployment process is described in the
RELEASE.md guide and it is the responsibility of the designated
release manager.

RELEASE.md Normal file
View File

@@ -0,0 +1,93 @@
# Release & Deployment Process
Please read the Working Process/Quickstart Guide in README.md
and the Development guidelines in CONTRIBUTING.md.
The release process of a new version of the extension
shall be performed by the designated *Release Manager*.
Note that we expect to gradually automate more of this process.
Once the PR to be released has been checked, it shall be
merged back into the `master` branch to prepare the new release.
The version number in `src/pg/crankshaft.control` must first be updated,
following the [Semantic Versioning 2.0](http://semver.org/) guidelines.
The `NEWS.md` file must be updated as well.
We now will explain the process for the case of backwards-compatible
releases (updating the minor or patch version numbers).
TODO: document the complex case of major releases.
The next command must be executed to produce the main installation
script for the new release, `release/crankshaft--X.Y.Z.sql`, and
also to copy the python package to `release/python/X.Y.Z/crankshaft`.
```
make release
```
Then, the release manager shall produce upgrade and downgrade scripts
to migrate to/from the previous release. In the case of minor/patch
releases this simply consists of extracting the functions that have
changed and placing them in the proper
`release/crankshaft--X.Y.Z--A.B.C.sql` file.
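There is no tooling for this step yet (note the TODOs below); as a purely hypothetical aid, a Python sketch that lists the function definitions differing between two generated release scripts:
```python
def function_chunks(sql_path):
    """Split a release script into its CREATE OR REPLACE FUNCTION chunks."""
    chunks = open(sql_path).read().split('CREATE OR REPLACE FUNCTION')
    return set(chunk.strip() for chunk in chunks[1:])

def changed_functions(old_sql, new_sql):
    """Chunks present in the new release but not identical in the old one."""
    return function_chunks(new_sql) - function_chunks(old_sql)

for chunk in changed_functions('release/crankshaft--0.0.1.sql',
                               'release/crankshaft--0.0.2.sql'):
    print(chunk.splitlines()[0])  # the first line names the function
```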
The new release can be deployed for staging/smoke tests with this command:
```
sudo make deploy
```
This will copy the current 'X.Y.Z' released version of the extension to
PostgreSQL. The corresponding Python extension will be installed in a
virtual environment in `envs/X.Y.Z`.
It can be activated with:
```
source envs/X.Y.Z/bin/activate
```
But note that this is needed only for using the package directly;
the 'X.Y.Z' version of the extension will automatically use the
python package from this virtual environment.
The `sudo make deploy` operation can also be used for installing
the new version after it has been released.
To install a specific version 'X.Y.Z' different from the current one
(which must be present in `release/`) you can:
```
sudo make deploy RELEASE_VERSION=X.Y.Z
```
TODO: testing procedure for the new release.
TODO: procedure for staging deployment.
TODO: procedure for merging to master, tagging and deploying
in production.
## Relevant release & deployment tasks available in the Makefile
* `make help` shows a short description of the available targets
* `make release` will generate a new release (version number defined in
`src/pg/crankshaft.control`) into `release/`.
Intended for use by the release manager.
* `sudo make deploy` will install the current release X.Y.Z from the
`release/` files into PostgreSQL and a Python virtual environment
`envs/X.Y.Z`.
Intended for use by the release manager and deployment jobs.
* `sudo make deploy RELEASE_VERSION=X.Y.Z` will install specified version
previously generated in `release/`
into PostgreSQL and a Python virtual environment `envs/X.Y.Z`.
Intended for use by the release manager and deployment jobs.

View File

@@ -1,9 +0,0 @@
* [x] Support versioning
* [x] Test use of `plpy` from python Package
* [x] Add `pysal` etc. dependencies
* [x] Define documentation practices (general, per extension/package?)
* [x] Add initial function set (WIP)
* Unify style of function comments
* [x] Add integration tests
* Make target to open a new version development (create symlinks, etc.)
* [x] Should add cartodb ext. as a dependency?

pg/.gitignore vendored
View File

@@ -1,3 +0,0 @@
regression.diffs
regression.out
results/

View File

@@ -1,33 +0,0 @@
# Makefile to generate the extension out of separate sql source files.
# Once a version is released, it is not meant to be changed. E.g: once version 0.0.1 is out, it SHALL NOT be changed.
EXTENSION = crankshaft
EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
# The new version to be generated from templates
NEW_EXTENSION_ARTIFACT = $(EXTENSION)--$(EXTVERSION).sql
# DATA is a special variable used by postgres build infrastructure
# These are the files to be installed in the server shared dir,
# for installation from scratch, upgrades and downgrades.
# @see http://www.postgresql.org/docs/current/static/extend-pgxs.html
DATA = $(NEW_EXTENSION_ARTIFACT)
SOURCES_DATA_DIR = sql/$(EXTVERSION)
SOURCES_DATA = $(wildcard sql/$(EXTVERSION)/*.sql)
# The extension installation artifacts are stored in the base subdirectory
$(NEW_EXTENSION_ARTIFACT): $(SOURCES_DATA)
rm -f $@
cat $(SOURCES_DATA_DIR)/*.sql >> $@
REGRESS = $(notdir $(basename $(wildcard test/$(EXTVERSION)/sql/*test.sql)))
TEST_DIR = test/$(EXTVERSION)
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
# This seems to be needed at least for PG 9.3.11
all: $(DATA)

View File

@@ -1,7 +0,0 @@
# Running the tests:
```
sudo make install
PGUSER=postgres make installcheck
```

View File

@@ -1,6 +0,0 @@
-- Install dependencies
CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;

View File

@@ -1,11 +0,0 @@
# Install the package (needs root privileges)
install:
pip install ./crankshaft --upgrade
# Test from source code
test:
(cd crankshaft && nosetests test/)
# Test currently installed package
testinstalled:
nosetests crankshaft/test/

View File

@@ -1,9 +0,0 @@
# Crankshaft Python Package
...
### Run the tests
```bash
cd crankshaft
nosetests test/
```

release/.gitignore vendored Normal file
View File

View File

@@ -0,0 +1,74 @@
CREATE OR REPLACE FUNCTION cdb_crankshaft.cdb_crankshaft_version()
RETURNS text AS $$
SELECT '0.0.2'::text;
$$ language 'sql' STABLE STRICT;
CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_internal_version()
RETURNS text AS $$
SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
$$ language 'sql' STABLE STRICT;
CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_virtualenvs_path()
RETURNS text
AS $$
BEGIN
RETURN '/home/ubuntu/crankshaft/envs';
END;
$$ language plpgsql IMMUTABLE STRICT;
CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_activate_py()
RETURNS VOID
AS $$
import os
# plpy.notice('%',str(os.environ))
# activate virtualenv
crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
default_venv_path = os.path.join(base_path, crankshaft_version)
venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path)
activate_path = venv_path + '/bin/activate_this.py'
exec(open(activate_path).read(), dict(__file__=activate_path))
$$ LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION
cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
-- Moran's I
CREATE OR REPLACE FUNCTION
cdb_crankshaft.cdb_moran_local (
t TEXT,
attr TEXT,
significance float DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION
cdb_crankshaft.cdb_moran_local_rate(t TEXT,
numerator TEXT,
denominator TEXT,
significance FLOAT DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;

View File

@@ -0,0 +1,44 @@
CREATE OR REPLACE FUNCTION
cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION
cdb_crankshaft.cdb_moran_local (
t TEXT,
attr TEXT,
significance float DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
CREATE OR REPLACE FUNCTION
cdb_crankshaft.cdb_moran_local_rate(t TEXT,
numerator TEXT,
denominator TEXT,
significance FLOAT DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
DROP FUNCTION IF EXISTS cdb_crankshaft.cdb_crankshaft_version();
DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_internal_version();
DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_activate_py();

View File

@@ -0,0 +1,186 @@
--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
-- Version number of the extension release
CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
RETURNS text AS $$
SELECT '0.0.2'::text;
$$ language 'sql' STABLE STRICT;
-- Internal identifier of the installed extension instance
-- e.g. 'dev' for current development version
CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
RETURNS text AS $$
SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
$$ language 'sql' STABLE STRICT;
CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
RETURNS text
AS $$
BEGIN
-- RETURN '/opt/virtualenvs/crankshaft';
RETURN '/home/ubuntu/crankshaft/envs';
END;
$$ language plpgsql IMMUTABLE STRICT;
-- Use the crankshaft python module
CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
RETURNS VOID
AS $$
import os
# plpy.notice('%',str(os.environ))
# activate virtualenv
crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
default_venv_path = os.path.join(base_path, crankshaft_version)
venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path)
activate_path = venv_path + '/bin/activate_this.py'
exec(open(activate_path).read(), dict(__file__=activate_path))
$$ LANGUAGE plpythonu;
-- Internal function.
-- Set the seeds of the RNGs (Random Number Generators)
-- used internally.
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
-- Moran's I
CREATE OR REPLACE FUNCTION
cdb_moran_local (
t TEXT,
attr TEXT,
significance float DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
-- Moran's I Local Rate
CREATE OR REPLACE FUNCTION
cdb_moran_local_rate(t TEXT,
numerator TEXT,
denominator TEXT,
significance FLOAT DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
-- Function by Stuart Lynn for a simple interpolation of a value
-- from a polygon table over an arbitrary polygon
-- (weighted by the area proportion overlapped)
-- Areal weighting is a very simple form of areal interpolation.
--
-- Parameters:
-- * geom a Polygon geometry which defines the area where a value will be
-- estimated as the area-weighted sum of a given table/column
-- * target_table_name table name of the table that provides the values
-- * target_column column name of the column that provides the values
-- * schema_name optional parameter to define the schema the target table
-- belongs to, which is necessary if it's not in the search_path.
-- Note that target_table_name should never include the schema in it.
-- Return value:
-- Areal-weighted interpolation of the column values over the geometry
CREATE OR REPLACE
FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
RETURNS numeric AS
$$
DECLARE
result numeric;
qualified_name text;
BEGIN
IF schema_name IS NULL THEN
qualified_name := Format('%I', target_table_name);
ELSE
qualified_name := Format('%I.%s', schema_name, target_table_name);
END IF;
EXECUTE Format('
SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
FROM %s AS a
WHERE $1 && a.the_geom
', target_column, qualified_name)
USING geom
INTO result;
RETURN result;
END;
$$ LANGUAGE plpgsql;
--
-- Creates N points randomly distributed within the polygon
--
-- @param g - the geometry to be turned in to points
--
-- @param no_points - the number of points to generate
--
-- @param max_iter_per_point - the function generates points in the polygon's bounding box
-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
-- misses per point the function accepts before giving up.
--
-- Returns: Multipoint with the requested points
CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
RETURNS GEOMETRY AS $$
DECLARE
extent GEOMETRY;
test_point Geometry;
width NUMERIC;
height NUMERIC;
x0 NUMERIC;
y0 NUMERIC;
xp NUMERIC;
yp NUMERIC;
no_left INTEGER;
remaining_iterations INTEGER;
points GEOMETRY[];
bbox_line GEOMETRY;
intersection_line GEOMETRY;
BEGIN
extent := ST_Envelope(geom);
width := ST_XMax(extent) - ST_XMIN(extent);
height := ST_YMax(extent) - ST_YMIN(extent);
x0 := ST_XMin(extent);
y0 := ST_YMin(extent);
no_left := no_points;
LOOP
if(no_left=0) THEN
EXIT;
END IF;
yp = y0 + height*random();
bbox_line = ST_MakeLine(
ST_SetSRID(ST_MakePoint(yp, x0),4326),
ST_SetSRID(ST_MakePoint(yp, x0+width),4326)
);
intersection_line = ST_Intersection(bbox_line,geom);
test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
points := points || test_point;
no_left = no_left - 1 ;
END LOOP;
RETURN ST_Collect(points);
END;
$$
LANGUAGE plpgsql VOLATILE;
-- Make sure by default there are no permissions for publicuser
-- NOTE: this happens at extension creation time, as part of an implicit transaction.
-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
-- Grant permissions on the schema to publicuser (but just the schema)
GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
-- Revoke execute permissions on all functions in the schema by default
-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;

View File

@@ -1,5 +1,5 @@
comment = 'CartoDB Spatial Analysis extension'
default_version = '0.0.1'
default_version = '0.0.2'
requires = 'plpythonu, postgis, cartodb'
superuser = true
schema = cdb_crankshaft

release/python/.gitignore vendored Normal file
View File

View File

@@ -10,7 +10,7 @@ from setuptools import setup, find_packages
setup(
name='crankshaft',
version='0.0.1',
version='0.0.01',
description='CartoDB Spatial Analysis Python Library',

View File

@@ -0,0 +1,2 @@
import random_seeds
import clustering

View File

@@ -0,0 +1 @@
from moran import *

View File

@@ -0,0 +1,321 @@
"""
Moran's I geostatistics (global clustering & outliers presence)
"""
# TODO: Fill in local neighbors which have null/NoneType values with the
# average of their neighborhood
import numpy as np
import pysal as ps
import plpy
# High level interface ---------------------------------------
def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
"""
Moran's I implementation for PL/Python
Andy Eschbacher
"""
# TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
# TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
plpy.notice('** Constructing query')
# geometries with attributes that are null are ignored
# resulting in a collection of neighbors that are not as near
qvals = {"id_col": id_col,
"attr1": attr,
"geom_col": geom_column,
"table": t,
"num_ngbrs": num_ngbrs}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
y = get_attributes(r, 1)
w = get_weight(r, w_type)
# calculate LISA values
lisa = ps.Moran_Local(y, w)
# find units of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
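# The tuples above map one-to-one onto the SQL wrapper's
# RETURNS TABLE (moran, quads, significance, ids) columns.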
def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
"""
Moran's I Local Rate
Andy Eschbacher
"""
plpy.notice('** Constructing query')
# geometries with attributes that are null are ignored
# resulting in a collection of neighbors that are not as near
qvals = {"id_col": id_col,
"numerator": numerator,
"denominator": denominator,
"geom_col": geom_column,
"table": t,
"num_ngbrs": num_ngbrs}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Error: %s' % plpy.SPIError)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
plpy.notice('r.nrows() = %d' % r.nrows())
## collect attributes
numer = get_attributes(r, 1)
denom = get_attributes(r, 2)
w = get_weight(r, w_type, num_ngbrs)
# calculate LISA values
lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
# find units of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
## TODO: Decide on which return values here
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
plpy.notice('** Constructing query')
qvals = {"num_ngbrs": num_ngbrs,
"attr1": attr1,
"attr2": attr2,
"table": t,
"geom_col": geom_column,
"id_col": id_col}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Error: %s' % plpy.SPIError)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
## collect attributes
attr1_vals = get_attributes(r, 1)
attr2_vals = get_attributes(r, 2)
# create weights
w = get_weight(r, w_type, num_ngbrs)
# calculate LISA values
lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
plpy.notice("len of Is: %d" % len(lisa.Is))
# find clustering of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
# Low level functions ----------------------------------------
def map_quads(coord):
"""
Map a quadrant number to Moran's I designation
HH=1, LH=2, LL=3, HL=4
Input:
:param coord (int): quadrant of a specific measurement
"""
if coord == 1:
return 'HH'
elif coord == 2:
return 'LH'
elif coord == 3:
return 'LL'
elif coord == 4:
return 'HL'
else:
return None
def query_attr_select(params):
"""
Create portion of SELECT statement for attributes involved in the query.
:param params: dict of information used in query (column names,
table name, etc.)
"""
attrs = [k for k in params
if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
attr_string = ""
for idx, val in enumerate(sorted(attrs)):
attr_string += template % {"col": val, "alias_num": idx + 1}
return attr_string
def query_attr_where(params):
"""
Create portion of WHERE clauses for weeding out NULL-valued geometries
"""
attrs = sorted([k for k in params
if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
attr_string = []
for attr in attrs:
attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
if len(attrs) == 2:
attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
out = " AND ".join(attr_string)
return out
def knn(params):
"""SQL query for k-nearest neighbors.
:param vars: dict of values to fill template
"""
attr_select = query_attr_select(params)
attr_where = query_attr_where(params)
replacements = {"attr_select": attr_select,
"attr_where_i": attr_where.replace("idx_replace", "i"),
"attr_where_j": attr_where.replace("idx_replace", "j")}
query = "SELECT " \
"i.\"{id_col}\" As id, " \
"%(attr_select)s" \
"(SELECT ARRAY(SELECT j.\"{id_col}\" " \
"FROM \"{table}\" As j " \
"WHERE %(attr_where_j)s " \
"ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
"LIMIT {num_ngbrs} OFFSET 1 ) " \
") As neighbors " \
"FROM \"{table}\" As i " \
"WHERE " \
"%(attr_where_i)s " \
"ORDER BY i.\"{id_col}\" ASC;" % replacements
return query.format(**params)
## SQL query for finding queens neighbors (all contiguous polygons)
def queen(params):
"""SQL query for queen neighbors.
:param params: dict of information to fill query
"""
attr_select = query_attr_select(params)
attr_where = query_attr_where(params)
replacements = {"attr_select": attr_select,
"attr_where_i": attr_where.replace("idx_replace", "i"),
"attr_where_j": attr_where.replace("idx_replace", "j")}
query = "SELECT " \
"i.\"{id_col}\" As id, " \
"%(attr_select)s" \
"(SELECT ARRAY(SELECT j.\"{id_col}\" " \
"FROM \"{table}\" As j " \
"WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
"%(attr_where_j)s)" \
") As neighbors " \
"FROM \"{table}\" As i " \
"WHERE " \
"%(attr_where_i)s " \
"ORDER BY i.\"{id_col}\" ASC;" % replacements
return query.format(**params)
## to add more weight methods open a ticket or pull request
def get_query(w_type, query_vals):
"""Return requested query.
:param w_type: type of neighbors to calculate (knn or queen)
:param query_vals: values used to construct the query
"""
if w_type == 'knn':
return knn(query_vals)
else:
return queen(query_vals)
def get_attributes(query_res, attr_num):
"""
:param query_res: query results with attributes and neighbors
:param attr_num: attribute number (1, 2, ...)
"""
return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
## Build weight object
def get_weight(query_res, w_type='queen', num_ngbrs=5):
"""
Construct PySAL weight from return value of query
:param query_res: query results with attributes and neighbors
"""
if w_type == 'knn':
row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
weights = {x['id']: row_normed_weights for x in query_res}
elif w_type == 'queen':
weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
if len(x['neighbors']) > 0
else [] for x in query_res}
neighbors = {x['id']: x['neighbors'] for x in query_res}
return ps.W(neighbors, weights)
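# Illustrative example: with w_type='knn' and num_ngbrs=2, the rows
# {'id': 1, 'neighbors': [2, 3]} and {'id': 2, 'neighbors': [1, 3]}
# yield weights {1: [0.5, 0.5], 2: [0.5, 0.5]} in the returned ps.W.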
def quad_position(quads):
"""
Map an array of quadrant numbers to Moran's I designations
"""
lisa_sig = np.array([map_quads(q) for q in quads])
return lisa_sig
def lisa_sig_vals(pvals, quads, threshold):
"""
Produce Moran's I classification from p-values and quadrants,
marking values above the significance threshold as 'Not significant'
"""
sig = (pvals <= threshold)
lisa_sig = np.empty(len(sig), np.chararray)
for idx, val in enumerate(sig):
if val:
lisa_sig[idx] = map_quads(quads[idx])
else:
lisa_sig[idx] = 'Not significant'
return lisa_sig

View File

@@ -0,0 +1,10 @@
import random
import numpy
def set_random_seeds(value):
"""
Set the seeds of the RNGs (Random Number Generators)
used internally.
"""
random.seed(value)
numpy.random.seed(value)

View File

@@ -0,0 +1,48 @@
"""
CartoDB Spatial Analysis Python Library
See:
https://github.com/CartoDB/crankshaft
"""
from setuptools import setup, find_packages
setup(
name='crankshaft',
version='0.0.2',
description='CartoDB Spatial Analysis Python Library',
url='https://github.com/CartoDB/crankshaft',
author='Data Services Team - CartoDB',
author_email='dataservices@cartodb.com',
license='MIT',
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Mapping community',
'Topic :: Maps :: Mapping Tools',
'License :: OSI Approved :: MIT License',
'Programming Language :: Python :: 2.7',
],
keywords='maps mapping tools spatial analysis geostatistics',
packages=find_packages(exclude=['contrib', 'docs', 'tests']),
extras_require={
'dev': ['unittest'],
'test': ['unittest', 'nose', 'mock'],
},
# The choice of component versions is dictated by what's
# provisioned in the production servers.
install_requires=['pysal==1.9.1'],
requires=['pysal', 'numpy' ],
test_suite='test'
)

View File

@@ -0,0 +1,52 @@
[[0.9319096128346788, "HH"],
[-1.135787401862846, "HL"],
[0.11732030672508517, "Not significant"],
[0.6152779669180425, "Not significant"],
[-0.14657336660125297, "Not significant"],
[0.6967858120189607, "Not significant"],
[0.07949310115714454, "Not significant"],
[0.4703198759258987, "Not significant"],
[0.4421125200498064, "Not significant"],
[0.5724288737143592, "Not significant"],
[0.8970743435692062, "LL"],
[0.18327334401918674, "Not significant"],
[-0.01466729201304962, "Not significant"],
[0.3481559372544409, "Not significant"],
[0.06547094736902978, "Not significant"],
[0.15482141569329988, "HH"],
[0.4373841193538136, "Not significant"],
[0.15971286468915544, "Not significant"],
[1.0543588860308968, "Not significant"],
[1.7372866900020818, "HH"],
[1.091998586053999, "LL"],
[0.1171572584252222, "Not significant"],
[0.08438455015300014, "Not significant"],
[0.06547094736902978, "Not significant"],
[0.15482141569329985, "HH"],
[1.1627044812890683, "HH"],
[0.06547094736902978, "Not significant"],
[0.795275137550483, "Not significant"],
[0.18562939195219, "LL"],
[0.3010757406693439, "Not significant"],
[2.8205795942839376, "HH"],
[0.11259190602909264, "Not significant"],
[-0.07116352791516614, "Not significant"],
[-0.09945240794119009, "Not significant"],
[0.18562939195219, "LL"],
[0.1832733440191868, "Not significant"],
[-0.39054253768447705, "Not significant"],
[-0.1672071289487642, "HL"],
[0.3337669247916343, "Not significant"],
[0.2584386102554792, "Not significant"],
[-0.19733845476322634, "HL"],
[-0.9379282899805409, "LH"],
[-0.028770969951095866, "Not significant"],
[0.051367269430983485, "Not significant"],
[-0.2172548045913472, "LH"],
[0.05136726943098351, "Not significant"],
[0.04191046803899837, "Not significant"],
[0.7482357030403517, "HH"],
[-0.014585767863118111, "Not significant"],
[0.5410013139159929, "Not significant"],
[1.0223932668429925, "LL"],
[1.4179402898927476, "LL"]]

View File

@@ -0,0 +1,54 @@
[
{"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
{"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
{"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
{"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
{"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
{"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
{"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
{"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
{"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
{"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
{"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
{"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
{"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
{"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
{"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
{"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
{"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
{"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
{"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
{"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
{"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
{"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
{"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
{"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
{"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
{"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
{"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
{"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
{"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
{"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
{"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
{"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
{"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
{"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
{"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
{"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
{"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
{"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
{"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
{"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
{"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
{"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
{"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
{"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
{"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
{"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
{"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
{"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
{"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
{"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
{"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
{"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
]

View File

@@ -0,0 +1,13 @@
import unittest
from mock_plpy import MockPlPy
plpy = MockPlPy()
import sys
sys.modules['plpy'] = plpy
import os
def fixture_file(name):
dir = os.path.dirname(os.path.realpath(__file__))
return os.path.join(dir, 'fixtures', name)

View File

@@ -0,0 +1,34 @@
import re
class MockPlPy:
def __init__(self):
self._reset()
def _reset(self):
self.infos = []
self.notices = []
self.debugs = []
self.logs = []
self.warnings = []
self.errors = []
self.fatals = []
self.executes = []
self.results = []
self.prepares = []
self.results = []
def _define_result(self, query, result):
pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
self.results.append([pattern, result])
def notice(self, msg):
self.notices.append(msg)
def info(self, msg):
self.infos.append(msg)
def execute(self, query): # TODO: additional arguments
for result in self.results:
if result[0].match(query):
return result[1]
return []
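# Example usage (illustrative), mirroring the test suite:
#   plpy = MockPlPy()
#   plpy._define_result('select', [{'id': 1, 'attr1': 0.5}])
#   plpy.execute('SELECT ...')  # -> [{'id': 1, 'attr1': 0.5}]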

View File

@@ -0,0 +1,144 @@
import unittest
import numpy as np
# from mock_plpy import MockPlPy
# plpy = MockPlPy()
#
# import sys
# sys.modules['plpy'] = plpy
from helper import plpy, fixture_file
import crankshaft.clustering as cc
from crankshaft import random_seeds
import json
class MoranTest(unittest.TestCase):
"""Testing class for Moran's I functions."""
def setUp(self):
plpy._reset()
self.params = {"id_col": "cartodb_id",
"attr1": "andy",
"attr2": "jay_z",
"table": "a_list",
"geom_col": "the_geom",
"num_ngbrs": 321}
self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
self.moran_data = json.loads(open(fixture_file('moran.json')).read())
def test_map_quads(self):
"""Test map_quads."""
self.assertEqual(cc.map_quads(1), 'HH')
self.assertEqual(cc.map_quads(2), 'LH')
self.assertEqual(cc.map_quads(3), 'LL')
self.assertEqual(cc.map_quads(4), 'HL')
self.assertEqual(cc.map_quads(33), None)
self.assertEqual(cc.map_quads('andy'), None)
def test_query_attr_select(self):
"""Test query_attr_select."""
ans = "i.\"{attr1}\"::numeric As attr1, " \
"i.\"{attr2}\"::numeric As attr2, "
self.assertEqual(cc.query_attr_select(self.params), ans)
def test_query_attr_where(self):
"""Test query_attr_where."""
ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
"idx_replace.\"{attr2}\" IS NOT NULL AND "\
"idx_replace.\"{attr2}\" <> 0"
self.assertEqual(cc.query_attr_where(self.params), ans)
def test_knn(self):
"""Test knn function."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
"FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
"j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
"j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
"As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
"NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
"BY i.\"cartodb_id\" ASC;"
self.assertEqual(cc.knn(self.params), ans)
def test_queen(self):
"""Test queen neighbors function."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
"j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
"i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
"AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
"neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
"AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
"i.\"cartodb_id\" ASC;"
self.assertEqual(cc.queen(self.params), ans)
def test_get_query(self):
"""Test get_query."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
"j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
"NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
"ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
"OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
"i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
"i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
self.assertEqual(cc.get_query('knn', self.params), ans)
def test_get_attributes(self):
"""Test get_attributes."""
## need to add tests
self.assertEqual(True, True)
def test_get_weight(self):
"""Test get_weight."""
self.assertEqual(True, True)
def test_quad_position(self):
"""Test lisa_sig_vals."""
quads = np.array([1, 2, 3, 4], np.int)
ans = np.array(['HH', 'LH', 'LL', 'HL'])
test_ans = cc.quad_position(quads)
self.assertTrue((test_ans == ans).all())
def test_moran_local(self):
"""Test Moran's I local"""
data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
plpy._define_result('select', data)
random_seeds.set_random_seeds(1234)
result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
result = [(row[0], row[1]) for row in result]
expected = self.moran_data
for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
self.assertAlmostEqual(res_val, exp_val)
self.assertEqual(res_quad, exp_quad)
def test_moran_local_rate(self):
"""Test Moran's I rate"""
data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
plpy._define_result('select', data)
random_seeds.set_random_seeds(1234)
result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
result = [(row[0], row[1]) for row in result]
expected = self.moran_data
for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
self.assertAlmostEqual(res_val, exp_val)

src/pg/.gitignore vendored Normal file
View File

@@ -0,0 +1,6 @@
regression.diffs
regression.out
results/
crankshaft--dev.sql
crankshaft--dev--current.sql
crankshaft--current--dev.sql

src/pg/Makefile Normal file
View File

@@ -0,0 +1,60 @@
include ../../Makefile.global
# Development tasks:
#
# * install generates the control & script files into src/pg/
# and installs them into the PostgreSQL extensions directory;
# requires sudo. In addition to the current development version
# named 'dev', an alias 'current' is generated for ease of
# update (upgrade to 'current', then to 'dev').
# The python module is installed in a virtualenv in envs/dev/
# * test runs the tests for the currently generated Development
# extension.
DATA = $(EXTENSION)--dev.sql \
$(EXTENSION)--current--dev.sql \
$(EXTENSION)--dev--current.sql
SOURCES_DATA_DIR = sql
SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql)
VIRTUALENV_PATH = $(realpath ../../envs)
ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH))
REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \
-e 's/@@VIRTUALENV_PATH@@/$(ESC_VIRVIRTUALENV_PATH)/g'
$(DATA): $(SOURCES_DATA)
$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > $@
TEST_DIR = test
REGRESS = $(notdir $(basename $(wildcard $(TEST_DIR)/sql/*test.sql)))
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
# This seems to be needed at least for PG 9.3.11
all: $(DATA)
test: export PGUSER=postgres
test: installcheck
# Release tasks
../../release/$(EXTENSION).control: $(EXTENSION).control
cp $< $@
# Prepare new release from the currently installed development version,
# for the current version X.Y.Z (defined in the control file)
# producing the extension script and control files in release/
# and the python package in release/python/X.Y.Z/crankshaft/
release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > ../../release/$(EXTENSION)--$(EXTVERSION).sql
# Install the current release into the PostgreSQL extensions directory
# and the Python package in a virtual environment envs/X.Y.Z
deploy:
$(INSTALL_DATA) ../../release/$(EXTENSION).control '$(DESTDIR)$(datadir)/extension/'
$(INSTALL_DATA) ../../release/*.sql '$(DESTDIR)$(datadir)/extension/'

View File

@@ -0,0 +1,5 @@
comment = 'CartoDB Spatial Analysis extension'
default_version = '0.0.2'
requires = 'plpythonu, postgis, cartodb'
superuser = true
schema = cdb_crankshaft

src/pg/sql/01_version.sql Normal file
View File

@@ -0,0 +1,12 @@
-- Version number of the extension release
CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
RETURNS text AS $$
SELECT '@@VERSION@@'::text;
$$ language 'sql' STABLE STRICT;
-- Internal identifier of the installed extension instance
-- e.g. 'dev' for current development version
CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
RETURNS text AS $$
SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
$$ language 'sql' STABLE STRICT;

src/pg/sql/02_py.sql Normal file
View File

@@ -0,0 +1,23 @@
CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
RETURNS text
AS $$
BEGIN
-- RETURN '/opt/virtualenvs/crankshaft';
RETURN '@@VIRTUALENV_PATH@@';
END;
$$ language plpgsql IMMUTABLE STRICT;
-- Use the crankshaft python module
CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
RETURNS VOID
AS $$
import os
# plpy.notice('%',str(os.environ))
# activate virtualenv
crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
default_venv_path = os.path.join(base_path, crankshaft_version)
venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path)
activate_path = venv_path + '/bin/activate_this.py'
exec(open(activate_path).read(), dict(__file__=activate_path))
$$ LANGUAGE plpythonu;

View File

@@ -4,6 +4,7 @@
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;


@@ -11,6 +11,7 @@ CREATE OR REPLACE FUNCTION
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
@@ -29,6 +30,7 @@ CREATE OR REPLACE FUNCTION
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)


@@ -3,4 +3,4 @@ CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';


@@ -4,4 +4,4 @@ CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';

src/py/Makefile Normal file

@@ -0,0 +1,22 @@
include ../../Makefile.global
# Install the package locally for development
install:
virtualenv --system-site-packages ../../envs/dev
# source ../../envs/dev/bin/activate
../../envs/dev/bin/pip install -I ./crankshaft
../../envs/dev/bin/pip install -I nose
# Test development install
test:
../../envs/dev/bin/nosetests crankshaft/test/
release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
mkdir -p ../../release/python/$(EXTVERSION)
cp -r ./$(PACKAGE) ../../release/python/$(EXTVERSION)/
$(SED) -i -r 's/version='"'"'[0-9]+\.[0-9]+\.[0-9]+'"'"'/version='"'"'$(EXTVERSION)'"'"'/g' ../../release/python/$(EXTVERSION)/$(PACKAGE)/setup.py
deploy:
virtualenv --system-site-packages $(VIRTUALENV_PATH)/$(RELEASE_VERSION)
$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I -U ../../release/python/$(RELEASE_VERSION)/$(PACKAGE)
$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I nose

src/py/README.md Normal file

@@ -0,0 +1,88 @@
# Crankshaft Python Package
...
### Run the tests
```bash
cd crankshaft
nosetests test/
```
## Notes about python dependencies
* This extension is targeted at production databases, so assumptions about the environment must be more restrictive than in experimental settings.
* We're using `pip` and `virtualenv` to generate a suitable isolated environment for python code that has all the dependencies
* Every dependency should be:
- Added to the `setup.py` file
- Installed through it
- Tested, when they have a test suite.
- Fixed in the `requirements.txt`
* At present we use Python version 2.7.3
---
To avoid troublesome compilations/linkings we will use
the available system package `python-scipy`.
This package and its dependencies provide numpy 1.6.1
and scipy 0.9.0. To be able to use these versions we cannot use
PySAL 1.10 or later, so we'll stick to 1.9.1.
```
apt-get install -y python-scipy
```
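
As a quick sanity check, the versions provided by the system package can be verified from Python; a short sketch, assuming the versions quoted above:

```python
# Sanity check: confirm the system-provided versions described above.
import numpy
import scipy

print numpy.__version__  # expected: 1.6.1
print scipy.__version__  # expected: 0.9.0
```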
We'll use virtual environments to install our packages,
but configured to also use system modules so that the
above-mentioned scipy and numpy versions are used.
```bash
# Create a virtual environment for python
$ virtualenv --system-site-packages dev
# Activate the virtualenv
$ source dev/bin/activate
# Install all the requirements
# expect this to take a while, as it will trigger a few compilations
(dev) $ pip install -I ./crankshaft
```
#### Test the libraries with that virtual env
##### Test numpy library dependency:
```python
import numpy
numpy.test('full')
```
##### Run scipy tests
```python
import scipy
scipy.test('full')
```
##### Testing pysal
See http://pysal.readthedocs.org/en/latest/developers/testing.html
This will require putting this into `dev/lib/python2.7/site-packages/setup.cfg`:
```
[nosetests]
ignore-files=collection
exclude-dir=pysal/contrib
[wheel]
universal=1
```
And copying some files before executing the tests
(we'll use a temporary directory to run the tests from, because
some tests expect certain files in the current directory). The following
must be executed from the directory that contains the `dev` virtualenv:
```
cp dev/lib/python2.7/site-packages/pysal/examples/geodanet/* dev/local/lib/python2.7/site-packages/pysal/examples
mkdir -p test_tmp && cd test_tmp && cp ../dev/lib/python2.7/site-packages/pysal/examples/geodanet/* ./
```
Then, execute the tests with:
```python
import pysal
import nose
nose.runmodule('pysal')
```


@@ -0,0 +1,2 @@
import random_seeds
import clustering


@@ -0,0 +1 @@
from moran import *


@@ -0,0 +1,321 @@
"""
Moran's I geostatistics (global clustering & outliers presence)
"""
# TODO: Fill in local neighbors which have null/NoneType values with the
# average of their neighborhood
import numpy as np
import pysal as ps
import plpy
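# For reference (not part of the original module): PySAL's Moran_Local
# computes, for each unit i, the local Moran statistic
#
#     I_i = (z_i / m_2) * sum_j(w_ij * z_j)
#
# where z_i = x_i - mean(x) and m_2 = sum_i(z_i ** 2) / n. The quadrant
# codes (HH, LH, LL, HL) classify each unit's value against the spatial
# lag of its neighbors; see map_quads below.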
# High level interface ---------------------------------------
def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
"""
Moran's I implementation for PL/Python
Andy Eschbacher
"""
# TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
# TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
plpy.notice('** Constructing query')
# geometries with attributes that are null are ignored,
# so the resulting neighbors may not be the nearest ones
qvals = {"id_col": id_col,
"attr1": attr,
"geom_col": geom_column,
"table": t,
"num_ngbrs": num_ngbrs}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
y = get_attributes(r, 1)
w = get_weight(r, w_type)
# calculate LISA values
lisa = ps.Moran_Local(y, w)
# find units of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
"""
Moran's I Local Rate
Andy Eschbacher
"""
plpy.notice('** Constructing query')
# geometries with attributes that are null are ignored,
# so the resulting neighbors may not be the nearest ones
qvals = {"id_col": id_col,
"numerator": numerator,
"denominator": denominator,
"geom_col": geom_column,
"table": t,
"num_ngbrs": num_ngbrs}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Error: %s' % plpy.SPIError)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
plpy.notice('r.nrows() = %d' % r.nrows())
## collect attributes
numer = get_attributes(r, 1)
denom = get_attributes(r, 2)
w = get_weight(r, w_type, num_ngbrs)
# calculate LISA values
lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
# find units of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
## TODO: decide which values to return here
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
plpy.notice('** Constructing query')
qvals = {"num_ngbrs": num_ngbrs,
"attr1": attr1,
"attr2": attr2,
"table": t,
"geom_col": geom_column,
"id_col": id_col}
q = get_query(w_type, qvals)
try:
r = plpy.execute(q)
plpy.notice('** Query returned with %d rows' % len(r))
except plpy.SPIError:
plpy.notice('** Query failed: "%s"' % q)
plpy.notice('** Error: %s' % plpy.SPIError)
plpy.notice('** Exiting function')
return zip([None], [None], [None], [None])
## collect attributes
attr1_vals = get_attributes(r, 1)
attr2_vals = get_attributes(r, 2)
# create weights
w = get_weight(r, w_type, num_ngbrs)
# calculate LISA values
lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
plpy.notice("len of Is: %d" % len(lisa.Is))
# find clustering of significance
lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
plpy.notice('** Finished calculations')
return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
# Low level functions ----------------------------------------
def map_quads(coord):
"""
Map a quadrant number to Moran's I designation
HH=1, LH=2, LL=3, HL=4
Input:
:param coord (int): quadrant of a specific measurement
"""
if coord == 1:
return 'HH'
elif coord == 2:
return 'LH'
elif coord == 3:
return 'LL'
elif coord == 4:
return 'HL'
else:
return None
def query_attr_select(params):
"""
Create portion of SELECT statement for attributes involved in the query.
:param params: dict of information used in query (column names,
table name, etc.)
"""
attrs = [k for k in params
if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
attr_string = ""
for idx, val in enumerate(sorted(attrs)):
attr_string += template % {"col": val, "alias_num": idx + 1}
return attr_string
def query_attr_where(params):
"""
Create portion of WHERE clauses for weeding out rows with NULL-valued attributes (and zero-valued denominators)
"""
attrs = sorted([k for k in params
if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
attr_string = []
for attr in attrs:
attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
if len(attrs) == 2:
attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
out = " AND ".join(attr_string)
return out
def knn(params):
"""SQL query for k-nearest neighbors.
:param vars: dict of values to fill template
"""
attr_select = query_attr_select(params)
attr_where = query_attr_where(params)
replacements = {"attr_select": attr_select,
"attr_where_i": attr_where.replace("idx_replace", "i"),
"attr_where_j": attr_where.replace("idx_replace", "j")}
query = "SELECT " \
"i.\"{id_col}\" As id, " \
"%(attr_select)s" \
"(SELECT ARRAY(SELECT j.\"{id_col}\" " \
"FROM \"{table}\" As j " \
"WHERE %(attr_where_j)s " \
"ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
"LIMIT {num_ngbrs} OFFSET 1 ) " \
") As neighbors " \
"FROM \"{table}\" As i " \
"WHERE " \
"%(attr_where_i)s " \
"ORDER BY i.\"{id_col}\" ASC;" % replacements
return query.format(**params)
## SQL query for finding queen neighbors (all contiguous polygons)
def queen(params):
"""SQL query for queen neighbors.
:param params: dict of information to fill query
"""
attr_select = query_attr_select(params)
attr_where = query_attr_where(params)
replacements = {"attr_select": attr_select,
"attr_where_i": attr_where.replace("idx_replace", "i"),
"attr_where_j": attr_where.replace("idx_replace", "j")}
query = "SELECT " \
"i.\"{id_col}\" As id, " \
"%(attr_select)s" \
"(SELECT ARRAY(SELECT j.\"{id_col}\" " \
"FROM \"{table}\" As j " \
"WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
"%(attr_where_j)s)" \
") As neighbors " \
"FROM \"{table}\" As i " \
"WHERE " \
"%(attr_where_i)s " \
"ORDER BY i.\"{id_col}\" ASC;" % replacements
return query.format(**params)
## to add more weight methods open a ticket or pull request
def get_query(w_type, query_vals):
"""Return requested query.
:param w_type: type of neighbors to calculate (knn or queen)
:param query_vals: values used to construct the query
"""
if w_type == 'knn':
return knn(query_vals)
else:
return queen(query_vals)
def get_attributes(query_res, attr_num):
"""
:param query_res: query results with attributes and neighbors
:param attr_num: attribute number (1, 2, ...)
"""
return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
## Build weight object
def get_weight(query_res, w_type='queen', num_ngbrs=5):
"""
Construct PySAL weight from return value of query
:param query_res: query results with attributes and neighbors
"""
if w_type == 'knn':
row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
weights = {x['id']: row_normed_weights for x in query_res}
elif w_type == 'queen':
weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
if len(x['neighbors']) > 0
else [] for x in query_res}
neighbors = {x['id']: x['neighbors'] for x in query_res}
return ps.W(neighbors, weights)
def quad_position(quads):
"""
Map an array of quadrant numbers to Moran's I designations (HH, LH, LL, HL)
"""
lisa_sig = np.array([map_quads(q) for q in quads])
return lisa_sig
def lisa_sig_vals(pvals, quads, threshold):
"""
Produce Moran's I classifications, labeling units whose p-values exceed the significance threshold as 'Not significant'
"""
sig = (pvals <= threshold)
lisa_sig = np.empty(len(sig), np.chararray)
for idx, val in enumerate(sig):
if val:
lisa_sig[idx] = map_quads(quads[idx])
else:
lisa_sig[idx] = 'Not significant'
return lisa_sig
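
To make the weight-building step concrete, here is a small self-contained sketch of what `get_weight` produces for KNN input; the ids and neighbor lists below are made up for illustration:

```python
# Toy illustration of get_weight's KNN branch: each of the num_ngbrs
# neighbors receives the same row-normalized weight. Ids are made up.
import pysal as ps

query_res = [
    {'id': 1, 'neighbors': [2, 3]},
    {'id': 2, 'neighbors': [1, 3]},
    {'id': 3, 'neighbors': [1, 2]},
]

num_ngbrs = 2
row_normed_weights = [1.0 / num_ngbrs] * num_ngbrs
weights = {x['id']: row_normed_weights for x in query_res}
neighbors = {x['id']: x['neighbors'] for x in query_res}

w = ps.W(neighbors, weights)
print w.n           # 3
print w.weights[1]  # [0.5, 0.5]
```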


@@ -0,0 +1,10 @@
import random
import numpy
def set_random_seeds(value):
"""
Set the seeds of the RNGs (Random Number Generators)
used internally.
"""
random.seed(value)
numpy.random.seed(value)
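
Seeding both generators makes any computation that draws from either one reproducible across runs; a quick sketch:

```python
# Reproducibility sketch: identical seeds yield identical draws from
# both the stdlib and numpy generators.
import random
import numpy

random.seed(1234)
numpy.random.seed(1234)
first = (random.random(), numpy.random.rand())

random.seed(1234)
numpy.random.seed(1234)
second = (random.random(), numpy.random.rand())

assert first == second
```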


@@ -0,0 +1,48 @@
"""
CartoDB Spatial Analysis Python Library
See:
https://github.com/CartoDB/crankshaft
"""
from setuptools import setup, find_packages
setup(
name='crankshaft',
version='0.0.0',
description='CartoDB Spatial Analysis Python Library',
url='https://github.com/CartoDB/crankshaft',
author='Data Services Team - CartoDB',
author_email='dataservices@cartodb.com',
license='MIT',
classifiers=[
'Development Status :: 3 - Alpha',
'Intended Audience :: Mapping community',
'Topic :: Maps :: Mapping Tools',
'License :: OSI Approved :: MIT License',
'Programming Language :: Python :: 2.7',
],
keywords='maps mapping tools spatial analysis geostatistics',
packages=find_packages(exclude=['contrib', 'docs', 'tests']),
extras_require={
'dev': ['unittest'],
'test': ['unittest', 'nose', 'mock'],
},
# The choice of component versions is dictated by what's
# provisioned in the production servers.
install_requires=['pysal==1.9.1'],
requires=['pysal', 'numpy'],
test_suite='test'
)


@@ -0,0 +1,52 @@
[[0.9319096128346788, "HH"],
[-1.135787401862846, "HL"],
[0.11732030672508517, "Not significant"],
[0.6152779669180425, "Not significant"],
[-0.14657336660125297, "Not significant"],
[0.6967858120189607, "Not significant"],
[0.07949310115714454, "Not significant"],
[0.4703198759258987, "Not significant"],
[0.4421125200498064, "Not significant"],
[0.5724288737143592, "Not significant"],
[0.8970743435692062, "LL"],
[0.18327334401918674, "Not significant"],
[-0.01466729201304962, "Not significant"],
[0.3481559372544409, "Not significant"],
[0.06547094736902978, "Not significant"],
[0.15482141569329988, "HH"],
[0.4373841193538136, "Not significant"],
[0.15971286468915544, "Not significant"],
[1.0543588860308968, "Not significant"],
[1.7372866900020818, "HH"],
[1.091998586053999, "LL"],
[0.1171572584252222, "Not significant"],
[0.08438455015300014, "Not significant"],
[0.06547094736902978, "Not significant"],
[0.15482141569329985, "HH"],
[1.1627044812890683, "HH"],
[0.06547094736902978, "Not significant"],
[0.795275137550483, "Not significant"],
[0.18562939195219, "LL"],
[0.3010757406693439, "Not significant"],
[2.8205795942839376, "HH"],
[0.11259190602909264, "Not significant"],
[-0.07116352791516614, "Not significant"],
[-0.09945240794119009, "Not significant"],
[0.18562939195219, "LL"],
[0.1832733440191868, "Not significant"],
[-0.39054253768447705, "Not significant"],
[-0.1672071289487642, "HL"],
[0.3337669247916343, "Not significant"],
[0.2584386102554792, "Not significant"],
[-0.19733845476322634, "HL"],
[-0.9379282899805409, "LH"],
[-0.028770969951095866, "Not significant"],
[0.051367269430983485, "Not significant"],
[-0.2172548045913472, "LH"],
[0.05136726943098351, "Not significant"],
[0.04191046803899837, "Not significant"],
[0.7482357030403517, "HH"],
[-0.014585767863118111, "Not significant"],
[0.5410013139159929, "Not significant"],
[1.0223932668429925, "LL"],
[1.4179402898927476, "LL"]]


@@ -0,0 +1,54 @@
[
{"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
{"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
{"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
{"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
{"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
{"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
{"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
{"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
{"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
{"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
{"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
{"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
{"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
{"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
{"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
{"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
{"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
{"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
{"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
{"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
{"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
{"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
{"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
{"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
{"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
{"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
{"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
{"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
{"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
{"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
{"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
{"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
{"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
{"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
{"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
{"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
{"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
{"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
{"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
{"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
{"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
{"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
{"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
{"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
{"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
{"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
{"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
{"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
{"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
{"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
{"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
{"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
]


@@ -0,0 +1,13 @@
import unittest
from mock_plpy import MockPlPy
plpy = MockPlPy()
import sys
sys.modules['plpy'] = plpy
import os
def fixture_file(name):
dir = os.path.dirname(os.path.realpath(__file__))
return os.path.join(dir, 'fixtures', name)


@@ -0,0 +1,34 @@
import re
class MockPlPy:
def __init__(self):
self._reset()
def _reset(self):
self.infos = []
self.notices = []
self.debugs = []
self.logs = []
self.warnings = []
self.errors = []
self.fatals = []
self.executes = []
self.results = []
self.prepares = []
def _define_result(self, query, result):
pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
self.results.append([pattern, result])
def notice(self, msg):
self.notices.append(msg)
def info(self, msg):
self.infos.append(msg)
def execute(self, query): # TODO: additional arguments
for result in self.results:
if result[0].match(query):
return result[1]
return []


@@ -0,0 +1,144 @@
import unittest
import numpy as np
# from mock_plpy import MockPlPy
# plpy = MockPlPy()
#
# import sys
# sys.modules['plpy'] = plpy
from helper import plpy, fixture_file
import crankshaft.clustering as cc
from crankshaft import random_seeds
import json
class MoranTest(unittest.TestCase):
"""Testing class for Moran's I functions."""
def setUp(self):
plpy._reset()
self.params = {"id_col": "cartodb_id",
"attr1": "andy",
"attr2": "jay_z",
"table": "a_list",
"geom_col": "the_geom",
"num_ngbrs": 321}
self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
self.moran_data = json.loads(open(fixture_file('moran.json')).read())
def test_map_quads(self):
"""Test map_quads."""
self.assertEqual(cc.map_quads(1), 'HH')
self.assertEqual(cc.map_quads(2), 'LH')
self.assertEqual(cc.map_quads(3), 'LL')
self.assertEqual(cc.map_quads(4), 'HL')
self.assertEqual(cc.map_quads(33), None)
self.assertEqual(cc.map_quads('andy'), None)
def test_query_attr_select(self):
"""Test query_attr_select."""
ans = "i.\"{attr1}\"::numeric As attr1, " \
"i.\"{attr2}\"::numeric As attr2, "
self.assertEqual(cc.query_attr_select(self.params), ans)
def test_query_attr_where(self):
"""Test query_attr_where."""
ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
"idx_replace.\"{attr2}\" IS NOT NULL AND "\
"idx_replace.\"{attr2}\" <> 0"
self.assertEqual(cc.query_attr_where(self.params), ans)
def test_knn(self):
"""Test knn function."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
"FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
"j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
"j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
"As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
"NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
"BY i.\"cartodb_id\" ASC;"
self.assertEqual(cc.knn(self.params), ans)
def test_queen(self):
"""Test queen neighbors function."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
"j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
"i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
"AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
"neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
"AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
"i.\"cartodb_id\" ASC;"
self.assertEqual(cc.queen(self.params), ans)
def test_get_query(self):
"""Test get_query."""
ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
"i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
"j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
"NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
"ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
"OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
"i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
"i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
self.assertEqual(cc.get_query('knn', self.params), ans)
def test_get_attributes(self):
"""Test get_attributes."""
## need to add tests
self.assertEqual(True, True)
def test_get_weight(self):
"""Test get_weight."""
self.assertEqual(True, True)
def test_quad_position(self):
"""Test lisa_sig_vals."""
quads = np.array([1, 2, 3, 4], np.int)
ans = np.array(['HH', 'LH', 'LL', 'HL'])
test_ans = cc.quad_position(quads)
self.assertTrue((test_ans == ans).all())
def test_moran_local(self):
"""Test Moran's I local"""
data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
plpy._define_result('select', data)
random_seeds.set_random_seeds(1234)
result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
result = [(row[0], row[1]) for row in result]
expected = self.moran_data
for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
self.assertAlmostEqual(res_val, exp_val)
self.assertEqual(res_quad, exp_quad)
def test_moran_local_rate(self):
"""Test Moran's I rate"""
data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
plpy._define_result('select', data)
random_seeds.set_random_seeds(1234)
result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
result = [(row[0], row[1]) for row in result]
expected = self.moran_data
for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
self.assertAlmostEqual(res_val, exp_val)