Fix typo

Add info about python dependencies
Constrain version numbers of reqs a little
2016-03-10 10:11:11 +01:00 · 2016-03-09 18:51:04 +01:00 · 2016-03-09 17:45:50 +01:00 · 2016-03-09 14:40:02 +01:00
62 changed files with 726 additions and 2959 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +0,0 @@
-envs/
-*.pyc
-.DS_Store
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,91 +1,84 @@
-# Development process
+# Contributing guide

-Please read the Working Process/Quickstart Guide in [README.md](https://github.com/CartoDB/crankshaft/blob/master/README.md) first.
+## How to add new functions

-For any modification of crankshaft, such as adding new features,
-refactoring or bug-fixing, topic branch must be created out of the `develop`
-branch and be used for the development process.
+Try to put as little logic in the SQL extension as possible and
+just use it as a wrapper to the Python module functionality.

-Modifications are done inside `src/pg/sql` and `src/py/crankshaft`.
+Once a function is defined it should never change its signature in subsequent
+versions. To change a function's signature a new function with a different
+name must be created.

-Take into account:
+### Version numbers

-*  Tests must be added for any new functionality
-   (inside `src/pg/test`, `src/py/crankshaft/test`) as well as to
-   detect any bugs that are being fixed.
-*  Add or modify the corresponding documentation files in the `doc` folder.
-   Since we expect to have highly technical functions here, an extense
-   background explanation would be of great help to users of this extension.
-*  Convention: snake case(i.e. `snake_case` and not `CamelCase`)
-   shall be used for all function names.
-   Prefix function names intended for public use with `cdb_`
-   and private functions (to be used only internally inside
-   the extension)  with `_cdb_`.
+The version of both the SQL extension and the Python package shall
+follow the [Semantic Versioning 2.0](http://semver.org/) guidelines:

-Once the code is ready to be tested, update the local development installation
-with `sudo make install`.
-This will update the 'dev' version of the extension in `src/pg/` and
-make it available to PostgreSQL.
-It will also install the python package (crankshaft) in a virtual
-environment `env/dev`.
+* When backwards incompatibility is introduced the major number is incremented
+* When functionality is added (in a backwards-compatible manner) the minor number
+  is incremented
+* When only fixes are introduced (backwards-compatible) the patch number is
+  incremented

-The version number of the Python package, defined in
-`src/pg/crankshaft/setup.py` will be overridden when
-the package is released and always match the extension version number,
-but for development it shall be kept as '0.0.0'.
+### Python Package

-Run the tests with `make test`.
+...

-To use the python extension for custom tests, activate the virtual
-environment with:
+### SQL Extension
+
+* Generate a **new subfolder version** for `sql` and `test` folders to define
+  the new functions and tests
+  - Use symlinks to avoid file duplication between versions that don't update them
+  - Add new files or modify copies of the old files to add new functions or
+    modify existing functions (remember to rename a function if the signature
+    changes)
+  - Add or modify the corresponding documentation files in the `doc` folder.
+    Since we expect to have highly technical functions here, an extensive
+    background explanation would be of great help to users of this extension.
+  - Create tests for the new functions/behaviour
+
+* Generate the **upgrade and downgrade files** for the extension
+
+* Update the control file and the Makefile to generate the complete SQL
+  file for the newly created version. After running `make` a new
+  file `crankshaft--X.Y.Z.sql` will be created for the current version.
+  Additional files for migrating to/from the previous version A.B.C should be
+  created:
+  - `crankshaft--X.Y.Z--A.B.C.sql`
+  - `crankshaft--A.B.C--X.Y.Z.sql`
+  All these new files must be added to git and pushed.
+
+* Update the public docs! ;-)
+
+## Conventions
+
+### SQL
+
+Use snake case (i.e. `snake_case` and not `CamelCase`) for all
+functions. Prefix functions intended for public use with `cdb_`
+and private functions (to be used only internally inside
+the extension) with `_cdb_`.
+
+### Python
+
+...
+
+## Testing
+
+Running just the Python tests:

 ```
-source envs/dev/bin/activate
+(cd python && make test)
 ```

-Update extension in a working database with:
-
-* `ALTER EXTENSION crankshaft VERSION TO 'current';`
-  `ALTER EXTENSION crankshaft VERSION TO 'dev';`
-
-Note: we keep the current development version install as 'dev' always;
-we update through the 'current' alias to allow changing the extension
-contents but not the version identifier. This will fail if the
-changes involve incompatible function changes such as a different
-return type; in that case the offending function (or the whole extension)
-should be dropped manually before the update.
-
-If the extension has not previously been installed in a database,
-it can be installed directly with:
-
-* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`
-
-Note: the development extension uses the development python virtual
-environment automatically.
-
-Before proceeding to the release process peer code reviewing of the code is
-a must.
-
-Once the feature or bugfix is completed and all the tests are passing
-a Pull-Request shall be created on the topic branch, reviewed by a peer
-and then merged back into the `develop` branch when all CI tests pass.
-
-When the changes in the `develop` branch are to be released in a new
-version of the extension, a PR must be created on the `develop` branch.
-
-The release manage will take hold of the PR at this moment to proceed
-to the release process for a new revision of the extension.
-
-## Relevant development tasks available in the Makefile
+Installing the Extension and running just the PostgreSQL tests:

 ```
-* `make help` show a short description of the available targets
-
-* `sudo make install` will generate the extension scripts for the development
-  version ('dev'/'current') and install the python package into the
-  development virtual environment `envs/dev`.
-  Intended for use by developers.
-
-* `make test` will run the tests for the installed development extension.
-  Intended for use by developers.
+(cd pg && sudo make install && PGUSER=postgres make installcheck)
+```
+
+Installing and testing everything:
+
+```
+sudo make install && PGUSER=postgres make testinstalled
 ```
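
The version-number rules and migration file names described in the CONTRIBUTING.md changes above can be sketched in Python. This is only an illustration, not part of the commit: the helper names `bump_version` and `migration_script_names` and the change labels `'breaking'`, `'feature'` and `'fix'` are hypothetical.

```python
import re

# Plain X.Y.Z versions only; pre-release/build suffixes are out of scope here.
SEMVER_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def bump_version(version, change):
    """Return the next X.Y.Z version for a given kind of change.

    `change` is one of 'breaking', 'feature' or 'fix' (hypothetical labels).
    """
    match = SEMVER_RE.fullmatch(version)
    if match is None:
        raise ValueError("not an X.Y.Z version: %r" % (version,))
    major, minor, patch = (int(part) for part in match.groups())
    if change == "breaking":   # backwards incompatibility: major
        return "%d.0.0" % (major + 1)
    if change == "feature":    # backwards-compatible functionality: minor
        return "%d.%d.0" % (major, minor + 1)
    if change == "fix":        # backwards-compatible fixes only: patch
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("unknown change type: %r" % (change,))


def migration_script_names(old, new, extension="crankshaft"):
    """Names of the install, upgrade and downgrade scripts for a release."""
    return {
        "install": "%s--%s.sql" % (extension, new),
        "upgrade": "%s--%s--%s.sql" % (extension, old, new),
        "downgrade": "%s--%s--%s.sql" % (extension, new, old),
    }


if __name__ == "__main__":
    new_version = bump_version("0.0.1", "fix")
    print(new_version)
    print(migration_script_names("0.0.1", new_version))
```

The upgrade script name follows the `extension--old--new.sql` pattern that PostgreSQL uses for extension update scripts, which matches the `release/crankshaft--0.0.1--0.0.2.sql` file touched elsewhere in this diff.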
--- a/DEPLOYING.md
+++ b/DEPLOYING.md
@@ -0,0 +1,43 @@
+# Workflow
+
+... (branching/merging flow)
+
+# Deployment
+
+...
+
+Deployment to db servers: the next command will install both the Python
+package and the extension.
+
+```
+sudo make install
+```
+
+Installing only the Python package:
+
+```
+sudo pip install python/crankshaft --upgrade
+```
+
+Caveat: note that `pip install ./crankshaft` will install
+from local files, but `pip install crankshaft` will not.
+
+CI: Install and run the tests on the installed extension and package:
+
+```
+(sudo make install && PGUSER=postgres make testinstalled)
+```
+
+Installing the extension in user databases:
+Once installed in a server, the extension can be added
+to a database with the next SQL command:
+
+```
+CREATE EXTENSION crankshaft;
+```
+
+To upgrade the extension to a specific version X.Y.Z:
+
+```
+ALTER EXTENSION crankshaft UPDATE TO 'X.Y.Z';
+```
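
A deployment script could build the version-change statement described in DEPLOYING.md above instead of typing it by hand. The sketch below is an assumption-laden illustration, not part of the commit: the helper name `update_extension_sql` is hypothetical, and it only accepts plain X.Y.Z versions. PostgreSQL spells the command `ALTER EXTENSION ... UPDATE TO`.

```python
import re

# Accept only plain X.Y.Z versions so nothing unexpected is spliced into SQL.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def update_extension_sql(version, extension="crankshaft"):
    """Return the SQL statement that moves the extension to `version`."""
    if VERSION_RE.fullmatch(version) is None:
        raise ValueError("not an X.Y.Z version: %r" % (version,))
    return "ALTER EXTENSION %s UPDATE TO '%s';" % (extension, version)


if __name__ == "__main__":
    print(update_extension_sql("0.0.2"))
```

Validating the version string before formatting keeps the helper from producing malformed or unsafe SQL when it is fed a branch name or other free text.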
--- a/Makefile
+++ b/Makefile
@@ -1,70 +1,13 @@
-include ./Makefile.global
-
 EXT_DIR = src/pg
 PYP_DIR = src/py

 .PHONY: install
 .PHONY: run_tests
-.PHONY: release
-.PHONY: deploy

-# Generate and install developmet versions of the extension
-# and python package.
-# The extension is named 'dev' with a 'current' alias for easily upgrading.
-# The Python package is installed in a virtual environment envs/dev/
-# Requires sudo.
-install: ## Generate and install development version of the extension; requires sudo.
+install:
 	$(MAKE) -C $(PYP_DIR) install
 	$(MAKE) -C $(EXT_DIR) install

-# Run the tests for the installed development extension and
-# python package
-test:   ## Run the tests for the development version of the extension
-	$(MAKE) -C $(PYP_DIR) test
-	$(MAKE) -C $(EXT_DIR) test
-
-# Generate a new release into release
-release: ## Generate a new release of the extension. Only for telease manager
-	$(MAKE) -C $(EXT_DIR) release
-	$(MAKE) -C $(PYP_DIR) release
-
-# Install the current release.
-# The Python package is installed in a virtual environment envs/X.Y.Z/
-# Requires sudo.
-# Use the RELEASE_VERSION environment variable to deploy a specific version:
-#     sudo make deploy RELEASE_VERSION=1.0.0
-deploy: ## Deploy a released extension. Only for release manager. Requires sudo.
-	$(MAKE) -C $(EXT_DIR) deploy
-	$(MAKE) -C $(PYP_DIR) deploy
-
-# Cleanup development extension script files
-clean-dev: ## clean up development extension script files
-	rm -f src/pg/$(EXTENSION)--*.sql
-
-# Cleanup all releases
-clean-releases: ## clean up all releases
-	rm -rf release/python/*
-	rm -f release/$(EXTENSION)--*.sql
-	rm -f release/$(EXTENSION).control
-
-# Cleanup current/specific version
-clean-release: ## clean up current release
-	rm -rf release/python/$(RELEASE_VERSION)
-	rm -f release/$(RELEASE_VERSION)--*.sql
-
-# Cleanup all virtual environments
-clean-environments: ## clean up all virtual environments
-	rm -rf envs/*
-
-clean-all: clean-dev clean-release clean-environments
-
-help:
-	@IFS=$$'\n' ; \
-	help_lines=(`fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//'`); \
-	for help_line in $${help_lines[@]}; do \
-		IFS=$$'#' ; \
-		help_split=($$help_line) ; \
-		help_command=`echo $${help_split[0]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
-		help_info=`echo $${help_split[2]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \
-		printf "%-30s %s\n" $$help_command $$help_info ; \
-	done
+testinstalled:
+	$(MAKE) -C $(PYP_DIR) testinstalled
+	$(MAKE) -C $(EXT_DIR) installcheck
--- a/Makefile.global
+++ b/Makefile.global
@@ -1,6 +0,0 @@
-SELF_DIR         := $(dir $(lastword $(MAKEFILE_LIST)))
-EXTENSION        = crankshaft
-PACKAGE          = crankshaft
-EXTVERSION       = $(shell grep default_version $(SELF_DIR)/src/pg/$(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
-RELEASE_VERSION ?= $(EXTVERSION)
-SED              = sed
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,7 +0,0 @@
-0.0.2 (2016-03-16)
------------------
-* New versioning approach using per-version Python virtual environments
-
-0.0.1 (2016-02-22)
------------------
-* Preliminar release
--- a/README.md
+++ b/README.md
@@ -8,64 +8,83 @@ CartoDB Spatial Analysis extension for PostgreSQL.
 * *src* source code
 * - *src/pg* contains the PostgreSQL extension source code
 * - *src/py* Python module source code
-* *release* reseleased versions
-* *env* base directory for Python virtual environments
+* *release* released versions

 ## Requirements

 * pip, virtualenv, PostgreSQL
-* python-scipy system package (see [src/py/README.md](https://github.com/CartoDB/crankshaft/blob/master/src/py/README.md))

-# Working Process -- Quickstart Guide
+# Working Process

-We distinguish two roles regarding the development cycle of crankshaft:
+## Development

-* *developers* will implement new functionality and bugfixes into
-  the codebase and will request for new releases of the extension.
-* A *release manager* will attend these requests and will handle
-  the release process. The release process is sequential:
-  no concurrent releases will ever be in the works.
+Work in `src/pg/sql` and `src/py/crankshaft`;
+use a topic branch.

-We use the default `develop` branch as the basis for development.
-The `master` branch is used to merge and tag releases to be
-deployed in production.
+Update local installation with `sudo make install`
+(this will update the 'dev' version of the extension in 'src/pg/')

-Developers shall create a new topic branch from `develop` for any new feature
-or bugfix and commit their changes to it and eventually merge back into
-the `develop` branch. When a new release is required a Pull Request
-will be open against the `develop` branch.
+Run the tests with `PGUSER=postgres make test`

-The `develop` pull requests will be handled by the release manage,
-who will merge into master where new releases are prepared and tagged.
-The `master` branch is the sole responsibility of the release masters
-and developers must not commit or merge into it.
+Update the extension in a working database with:

-## Development Guidelines
+* `ALTER EXTENSION crankshaft UPDATE TO 'current';`
+  `ALTER EXTENSION crankshaft UPDATE TO 'dev';`

-For a detailed description of the development process please see
-the [CONTRIBUTING.md](https://github.com/CartoDB/crankshaft/blob/master/CONTRIBUTING.md) guide.
+Note: we always keep the current development version installed as 'dev';
+we update through the 'current' alias to allow changing the extension
+contents but not the version identifier. This will fail if the
+changes involve incompatible function changes such as a different
+return type; in that case the offending function (or the whole extension)
+should be dropped manually before the update.

-Any modification to the source code (`src/pg/sql` for the SQL extension,
-`src/py/crankshaft` for the Python package) shall always be done
-in a topic branch created from the `develop` branch.
+If the extension has not previously been installed in a database
+we can:

-Tests, documentation and peer code reviewing are required for all
-modifications.
+Add tests...

-The tests (both for SQL and Python) are executed by running,
-from the top directory:
+* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`

-```
-sudo make install
-make test
-```
+Test

-To request a new release, which will be handled by them
-release manager, a Pull Request must be created in the `develop`
-branch.
+Commit, push, create PR, wait for CI tests, CR, ...

 ## Release

-The release and deployment process is described in the
-[RELEASE.md](https://github.com/CartoDB/crankshaft/blob/master/RELEASE.md) guide and it is the responsibility of the designated
-release manager.
+To release the current development version
+(the working directory should be clean in the dev branch)
+
+(process to be gradually automated)
+
+For backwards-compatible changes (no changes to return values, number of arguments, etc.)
+choose a new version number 'X.Y.Z' by increasing either the patch level (no new functionality)
+or the minor level (new functionality).
+Update version in src/pg/crankshaft.control
+Copy release/crankshaft--current.sql to release/crankshaft--X.Y.Z.sql
+Prepare incremental downgrade, upgrade scripts....
+
+Python: ...
+
+Install the new release
+
+`make install-release`
+
+Test the new release
+
+`make test-release`
+
+Push the release
+
+Wait for CI tests
+
+Merge into master
+
+Deploy: install extension and python to production hosts,
+update extension in databases (limited to team users, data observatory, ...)
+
+Release manager role: ...
+
+.sql release scripts
+commit
+tests: staging....
+merge, tag, deploy...
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,93 +0,0 @@
-# Release & Deployment Process
-
-Please read the Working Process/Quickstart Guide in README.md
-and the Development guidelines in CONTRIBUTING.md.
-
-The release process of a new version of the extension
-shall be performed by the designated *Release Manager*.
-
-Note that we expect to gradually automate more of this process.
-
-Having checked PR to be released it shall be
-merged back into the `master` branch to prepare the new release.
-
-The version number in `pg/cranckshaft.control` must first be updated.
-To do so [Semantic Versioning 2.0](http://semver.org/) is in order.
-
-Thew `NEWS.md` will be updated.
-
-We now will explain the process for the case of backwards-compatible
-releases (updating the minor or patch version numbers).
-
-TODO: document the complex case of major releases.
-
-The next command must be executed to produce the main installation
-script for the new release, `release/cranckshaft--X.Y.Z.sql` and
-also to copy the python package to `release/python/X.Y.Z/crankshaft`.
-
-```
-make release
-```
-
-Then, the release manager shall produce upgrade and downgrade scripts
-to migrate to/from the previous release. In the case of minor/patch
-releases this simply consist in extracting the functions that have changed
-and placing them in the proper `release/cranckshaft--X.Y.Z--A.B.C.sql`
-file.
-
-The new release can be deployed for staging/smoke tests with this command:
-
-```
-sudo make deploy
-```
-
-This will copy the current 'X.Y.Z' released version of the extension to
-PostgreSQL. The corresponding Python extension will be installed in a
-virtual environment in `envs/X.Y.Z`.
-
-It can be activated with:
-
-```
-source envs/X.Y.Z/bin/activate
-```
-
-But note that this is needed only for using the package directly;
-the 'X.Y.Z' version of the extension will automatically use the
-python package from this virtual environment.
-
-The `sudo make deploy` operation can be also used for installing
-the new version after it has been released.
-
-To install a specific version 'X.Y.Z' different from the current one
-(which must be present in `releases/`) you can:
-
-```
-sudo make deploy RELEASE_VERSION=X.Y.Z
-```
-
-TODO: testing procedure for the new release.
-
-TODO: procedure for staging deployment.
-
-TODO: procedure for merging to master, tagging and deploying
-in production.
-
-## Relevant release & deployment tasks available in the Makefile
-
-```
-* `make help` show a short description of the available targets
-
-* `make release` will generate a new release (version number defined in
-  `src/pg/crankshaft.control`) into `release/`.
-  Intended for use by the release manager.
-
-* `sudo make deploy` will install the current release X.Y.Z from the
-  `release/` files into PostgreSQL and a Python virtual environment
-  `envs/X.Y.Z`.
-  Intended for use by the release manager and deployment jobs.
-
-* `sudo make deploy RELEASE_VERSION=X.Y.Z` will install specified version
-  previously generated in `release/`
-  into PostgreSQL and a Python virtual environment `envs/X.Y.Z`.
-  Intended for use by the release manager and deployment jobs.
-```
--- a/TODO.md
+++ b/TODO.md
@@ -0,0 +1,9 @@
+* [x] Support versioning
+* [x] Test use of `plpy` from python Package
+* [x] Add `pysal` etc. dependencies
+* [x] Define documentation practices (general, per extension/package?)
+* [x] Add initial function set (WIP)
+* Unify style of function comments
+* [x] Add integration tests
+* Make target to open a new version development (create symlinks, etc.)
+* [x] Should add cartodb ext. as a dependency?
--- a/release/.gitignore
+++ b/release/.gitignore
--- a/release/crankshaft--0.0.1--0.0.2.sql
+++ b/release/crankshaft--0.0.1--0.0.2.sql
@@ -1,74 +0,0 @@
-CREATE OR REPLACE FUNCTION cdb_crankshaft.cdb_crankshaft_version()
-RETURNS text AS $$
-  SELECT '0.0.2'::text;
-$$ language 'sql' STABLE STRICT;
-
-CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_internal_version()
-RETURNS text AS $$
-  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
-$$ language 'sql' STABLE STRICT;
-CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_virtualenvs_path()
-RETURNS text
-AS $$
-  BEGIN
-    RETURN '/home/ubuntu/crankshaft/envs';
-  END;
-$$ language plpgsql IMMUTABLE STRICT;
-
-CREATE OR REPLACE FUNCTION cdb_crankshaft._cdb_crankshaft_activate_py()
-RETURNS VOID
-AS $$
-    import os
-    # plpy.notice('%',str(os.environ))
-    # activate virtualenv
-    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
-    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
-    default_venv_path = os.path.join(base_path, crankshaft_version)
-    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
-    activate_path = venv_path + '/bin/activate_this.py'
-    exec(open(activate_path).read(), dict(__file__=activate_path))
-$$ LANGUAGE plpythonu;
-
-CREATE OR REPLACE FUNCTION
-cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft import random_seeds
-  random_seeds.set_random_seeds(seed_value)
-$$ LANGUAGE plpythonu;
-- Moran's I
-CREATE OR REPLACE FUNCTION
-cdb_crankshaft.cdb_moran_local (
-      t TEXT,
-  	  attr TEXT,
-  	  significance float DEFAULT 0.05,
-  	  num_ngbrs INT DEFAULT 5,
-  	  permutations INT DEFAULT 99,
-  	  geom_column TEXT DEFAULT 'the_geom',
-  	  id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn')
-RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
-
-CREATE OR REPLACE FUNCTION
-cdb_crankshaft.cdb_moran_local_rate(t TEXT,
-		 numerator TEXT,
-		 denominator TEXT,
-		 significance FLOAT DEFAULT 0.05,
-		 num_ngbrs INT DEFAULT 5,
-		 permutations INT DEFAULT 99,
-		 geom_column TEXT DEFAULT 'the_geom',
-		 id_col TEXT DEFAULT 'cartodb_id',
-		 w_type TEXT DEFAULT 'knn')
-RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft.clustering import moran_local_rate
-  # TODO: use named parameters or a dictionary
-  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
--- a/release/crankshaft--0.0.1.sql
+++ b/release/crankshaft--0.0.1.sql
@@ -1,148 +0,0 @@
--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
-- Internal function.
-- Set the seeds of the RNGs (Random Number Generators)
-- used internally.
-CREATE OR REPLACE FUNCTION
-_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
-AS $$
-  from crankshaft import random_seeds
-  random_seeds.set_random_seeds(seed_value)
-$$ LANGUAGE plpythonu;
-- Moran's I
-CREATE OR REPLACE FUNCTION
-  cdb_moran_local (
-      t TEXT,
-  	  attr TEXT,
-  	  significance float DEFAULT 0.05,
-  	  num_ngbrs INT DEFAULT 5,
-  	  permutations INT DEFAULT 99,
-  	  geom_column TEXT DEFAULT 'the_geom',
-  	  id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn')
-RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
-AS $$
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
-
-- Moran's I Local Rate
-CREATE OR REPLACE FUNCTION
-  cdb_moran_local_rate(t TEXT,
-		 numerator TEXT,
-		 denominator TEXT,
-		 significance FLOAT DEFAULT 0.05,
-		 num_ngbrs INT DEFAULT 5,
-		 permutations INT DEFAULT 99,
-		 geom_column TEXT DEFAULT 'the_geom',
-		 id_col TEXT DEFAULT 'cartodb_id',
-		 w_type TEXT DEFAULT 'knn')
-RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
-AS $$
-  from crankshaft.clustering import moran_local_rate
-  # TODO: use named parameters or a dictionary
-  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
-- Function by Stuart Lynn for a simple interpolation of a value
-- from a polygon table over an arbitrary polygon
-- (weighted by the area proportion overlapped)
-- Aereal weighting is a very simple form of aereal interpolation.
--
-- Parameters:
--   * geom a Polygon geometry which defines the area where a value will be
--     estimated as the area-weighted sum of a given table/column
--   * target_table_name table name of the table that provides the values
--   * target_column column name of the column that provides the values
--   * schema_name optional parameter to defina the schema the target table
--     belongs to, which is necessary if its not in the search_path.
--     Note that target_table_name should never include the schema in it.
-- Return value:
--   Aereal-weighted interpolation of the column values over the geometry
-CREATE OR REPLACE
-FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
-  RETURNS numeric AS
-$$
-DECLARE
-	result numeric;
-  qualified_name text;
-BEGIN
-  IF schema_name IS NULL THEN
-    qualified_name := Format('%I', target_table_name);
-  ELSE
-    qualified_name := Format('%I.%s', schema_name, target_table_name);
-  END IF;
-  EXECUTE Format('
-    SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
-    FROM %s AS a
-    WHERE $1 && a.the_geom
-  ', target_column, qualified_name)
-  USING geom
-  INTO result;
-  RETURN result;
-END;
-$$ LANGUAGE plpgsql;
--
-- Creates N points randomly distributed arround the polygon
--
-- @param g - the geometry to be turned in to points
--
-- @param no_points - the number of points to generate
--
-- @params max_iter_per_point - the function generates points in the polygon's bounding box
-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
-- misses per point the funciton accepts before giving up.
--
-- Returns: Multipoint with the requested points
-CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
-RETURNS GEOMETRY AS $$
-DECLARE
-  extent GEOMETRY;
-  test_point Geometry;
-  width                NUMERIC;
-  height               NUMERIC;
-  x0                   NUMERIC;
-  y0                   NUMERIC;
-  xp                   NUMERIC;
-  yp                   NUMERIC;
-  no_left              INTEGER;
-  remaining_iterations INTEGER;
-  points               GEOMETRY[];
-  bbox_line            GEOMETRY;
-  intersection_line    GEOMETRY;
-BEGIN
-  extent  := ST_Envelope(geom);
-  width   := ST_XMax(extent) - ST_XMIN(extent);
-  height  := ST_YMax(extent) - ST_YMIN(extent);
-  x0 	  := ST_XMin(extent);
-  y0 	  := ST_YMin(extent);
-  no_left := no_points;
-
-  LOOP
-    if(no_left=0) THEN
-      EXIT;
-    END IF;
-    yp = y0 + height*random();
-    bbox_line  = ST_MakeLine(
-      ST_SetSRID(ST_MakePoint(yp, x0),4326),
-      ST_SetSRID(ST_MakePoint(yp, x0+width),4326)
-    );
-    intersection_line = ST_Intersection(bbox_line,geom);
-  	test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
-	  points := points || test_point;
-	  no_left = no_left - 1 ;
-  END LOOP;
-  RETURN ST_Collect(points);
-END;
-$$
-LANGUAGE plpgsql VOLATILE;
-- Make sure by default there are no permissions for publicuser
-- NOTE: this happens at extension creation time, as part of an implicit transaction.
-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
-
-- Grant permissions on the schema to publicuser (but just the schema)
-GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
-
-- Revoke execute permissions on all functions in the schema by default
-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;
--- a/release/crankshaft--0.0.2.sql
+++ b/release/crankshaft--0.0.2.sql
@@ -1,186 +0,0 @@
--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
-- Version number of the extension release
-CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
-RETURNS text AS $$
-  SELECT '0.0.2'::text;
-$$ language 'sql' STABLE STRICT;
-
-- Internal identifier of the installed extension instence
-- e.g. 'dev' for current development version
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
-RETURNS text AS $$
-  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
-$$ language 'sql' STABLE STRICT;
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
-RETURNS text
-AS $$
-  BEGIN
-    -- RETURN '/opt/virtualenvs/crankshaft';
-    RETURN '/home/ubuntu/crankshaft/envs';
-  END;
-$$ language plpgsql IMMUTABLE STRICT;
-
-- Use the crankshaft python module
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
-RETURNS VOID
-AS $$
-    import os
-    # plpy.notice('%',str(os.environ))
-    # activate virtualenv
-    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
-    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
-    default_venv_path = os.path.join(base_path, crankshaft_version)
-    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
-    activate_path = venv_path + '/bin/activate_this.py'
-    exec(open(activate_path).read(), dict(__file__=activate_path))
-$$ LANGUAGE plpythonu;
-- Internal function.
-- Set the seeds of the RNGs (Random Number Generators)
-- used internally.
-CREATE OR REPLACE FUNCTION
-_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft import random_seeds
-  random_seeds.set_random_seeds(seed_value)
-$$ LANGUAGE plpythonu;
-- Moran's I
-CREATE OR REPLACE FUNCTION
-  cdb_moran_local (
-      t TEXT,
-  	  attr TEXT,
-  	  significance float DEFAULT 0.05,
-  	  num_ngbrs INT DEFAULT 5,
-  	  permutations INT DEFAULT 99,
-  	  geom_column TEXT DEFAULT 'the_geom',
-  	  id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn')
-RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
-
-- Moran's I Local Rate
-CREATE OR REPLACE FUNCTION
-  cdb_moran_local_rate(t TEXT,
-		 numerator TEXT,
-		 denominator TEXT,
-		 significance FLOAT DEFAULT 0.05,
-		 num_ngbrs INT DEFAULT 5,
-		 permutations INT DEFAULT 99,
-		 geom_column TEXT DEFAULT 'the_geom',
-		 id_col TEXT DEFAULT 'cartodb_id',
-		 w_type TEXT DEFAULT 'knn')
-RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
-AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
-  from crankshaft.clustering import moran_local_rate
-  # TODO: use named parameters or a dictionary
-  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
-$$ LANGUAGE plpythonu;
-- Function by Stuart Lynn for a simple interpolation of a value
-- from a polygon table over an arbitrary polygon
-- (weighted by the area proportion overlapped)
-- Aereal weighting is a very simple form of aereal interpolation.
--
-- Parameters:
--   * geom a Polygon geometry which defines the area where a value will be
--     estimated as the area-weighted sum of a given table/column
--   * target_table_name table name of the table that provides the values
--   * target_column column name of the column that provides the values
--   * schema_name optional parameter to defina the schema the target table
--     belongs to, which is necessary if its not in the search_path.
--     Note that target_table_name should never include the schema in it.
-- Return value:
--   Areal-weighted interpolation of the column values over the geometry
-CREATE OR REPLACE
-FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
-  RETURNS numeric AS
-$$
-DECLARE
-	result numeric;
-  qualified_name text;
-BEGIN
-  IF schema_name IS NULL THEN
-    qualified_name := Format('%I', target_table_name);
-  ELSE
-    qualified_name := Format('%I.%I', schema_name, target_table_name);
-  END IF;
-  EXECUTE Format('
-    SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
-    FROM %s AS a
-    WHERE $1 && a.the_geom
-  ', target_column, qualified_name)
-  USING geom
-  INTO result;
-  RETURN result;
-END;
-$$ LANGUAGE plpgsql;
--
-- Creates N points randomly distributed within the polygon
--
-- @param g - the geometry to be turned into points
--
-- @param no_points - the number of points to generate
--
-- @param max_iter_per_point - the function generates points in the polygon's bounding box
-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
-- misses per point the function accepts before giving up.
--
-- Returns: Multipoint with the requested points
-CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
-RETURNS GEOMETRY AS $$
-DECLARE
-  extent GEOMETRY;
-  test_point Geometry;
-  width                NUMERIC;
-  height               NUMERIC;
-  x0                   NUMERIC;
-  y0                   NUMERIC;
-  xp                   NUMERIC;
-  yp                   NUMERIC;
-  no_left              INTEGER;
-  remaining_iterations INTEGER;
-  points               GEOMETRY[];
-  bbox_line            GEOMETRY;
-  intersection_line    GEOMETRY;
-BEGIN
-  extent  := ST_Envelope(geom);
-  width   := ST_XMax(extent) - ST_XMIN(extent);
-  height  := ST_YMax(extent) - ST_YMIN(extent);
-  x0 	  := ST_XMin(extent);
-  y0 	  := ST_YMin(extent);
-  no_left := no_points;
-
-  LOOP
-    if(no_left=0) THEN
-      EXIT;
-    END IF;
-    yp = y0 + height*random();
-    bbox_line  = ST_MakeLine(
-      ST_SetSRID(ST_MakePoint(x0, yp),4326),
-      ST_SetSRID(ST_MakePoint(x0+width, yp),4326)
-    );
-    intersection_line = ST_Intersection(bbox_line,geom);
-  	test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
-	  points := points || test_point;
-	  no_left = no_left - 1 ;
-  END LOOP;
-  RETURN ST_Collect(points);
-END;
-$$
-LANGUAGE plpgsql VOLATILE;
-- Make sure by default there are no permissions for publicuser
-- NOTE: this happens at extension creation time, as part of an implicit transaction.
-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
-
-- Grant permissions on the schema to publicuser (but just the schema)
-GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
-
-- Revoke execute permissions on all functions in the schema by default
-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;
--- a/release/crankshaft.control
+++ b/release/crankshaft.control
@@ -1,5 +0,0 @@
-comment = 'CartoDB Spatial Analysis extension'
-default_version = '0.0.2'
-requires = 'plpythonu, postgis, cartodb'
-superuser = true
-schema = cdb_crankshaft
--- a/release/python/.gitignore
+++ b/release/python/.gitignore
--- a/release/python/0.0.1/crankshaft/crankshaft/__init__.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/__init__.py
@@ -1,2 +0,0 @@
-import random_seeds
-import clustering
--- a/release/python/0.0.1/crankshaft/crankshaft/clustering/__init__.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/clustering/__init__.py
@@ -1 +0,0 @@
-from moran import *
--- a/release/python/0.0.1/crankshaft/crankshaft/clustering/moran.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/clustering/moran.py
@@ -1,321 +0,0 @@
-"""
-Moran's I geostatistics (global clustering & outliers presence)
-"""
-
-# TODO: Fill in local neighbors which have null/NoneType values with the
-#       average of their neighborhood
-
-import numpy as np
-import pysal as ps
-import plpy
-
-# High level interface ---------------------------------------
-
-def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    """
-    Moran's I implementation for PL/Python
-    Andy Eschbacher
-    """
-    # TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
-    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
-
-    plpy.notice('** Constructing query')
-
-    # geometries with attributes that are null are ignored
-    # resulting in a collection of not as near neighbors
-
-    qvals = {"id_col": id_col,
-             "attr1": attr,
-             "geom_col": geom_column,
-             "table": t,
-             "num_ngbrs": num_ngbrs}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
-
-    y = get_attributes(r, 1)
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.Moran_Local(y, w)
-
-    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
-
-def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    """
-    Moran's I Local Rate
-    Andy Eschbacher
-    """
-
-    plpy.notice('** Constructing query')
-
-    # geometries with attributes that are null are ignored
-    # resulting in a collection of not as near neighbors
-
-    qvals = {"id_col": id_col,
-             "numerator": numerator,
-             "denominator": denominator,
-             "geom_col": geom_column,
-             "table": t,
-             "num_ngbrs": num_ngbrs}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError as err:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Error: %s' % err)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None], [None])
-
-        plpy.notice('r.nrows() = %d' % r.nrows())
-
-    ## collect attributes
-    numer = get_attributes(r, 1)
-    denom = get_attributes(r, 2)
-
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
-
-    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    ## TODO: Decide on which return values here
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
-
-def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    plpy.notice('** Constructing query')
-
-    qvals = {"num_ngbrs": num_ngbrs,
-             "attr1": attr1,
-             "attr2": attr2,
-             "table": t,
-             "geom_col": geom_column,
-             "id_col": id_col}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError as err:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Error: %s' % err)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
-
-    ## collect attributes
-    attr1_vals = get_attributes(r, 1)
-    attr2_vals = get_attributes(r, 2)
-
-    # create weights
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
-
-    plpy.notice("len of Is: %d" % len(lisa.Is))
-
-    # find clustering of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
-
-# Low level functions ----------------------------------------
-
-def map_quads(coord):
-    """
-        Map a quadrant number to Moran's I designation
-        HH=1, LH=2, LL=3, HL=4
-        Input:
-        :param coord (int): quadrant of a specific measurement
-    """
-    if coord == 1:
-        return 'HH'
-    elif coord == 2:
-        return 'LH'
-    elif coord == 3:
-        return 'LL'
-    elif coord == 4:
-        return 'HL'
-    else:
-        return None
-
-def query_attr_select(params):
-    """
-        Create portion of SELECT statement for attributes involved in query.
-        :param params: dict of information used in query (column names,
-                       table name, etc.)
-    """
-
-    attrs = [k for k in params
-             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
-
-    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
-
-    attr_string = ""
-
-    for idx, val in enumerate(sorted(attrs)):
-        attr_string += template % {"col": val, "alias_num": idx + 1}
-
-    return attr_string
-
-def query_attr_where(params):
-    """
-        Create portion of WHERE clauses for weeding out NULL-valued geometries
-    """
-    attrs = sorted([k for k in params
-                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
-
-    attr_string = []
-
-    for attr in attrs:
-        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
-
-    if len(attrs) == 2:
-        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
-
-    out = " AND ".join(attr_string)
-
-    return out
-
-def knn(params):
-    """SQL query for k-nearest neighbors.
-        :param params: dict of values to fill template
-    """
-
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                              "FROM \"{table}\" As j " \
-                              "WHERE %(attr_where_j)s " \
-                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
-                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## SQL query for finding queens neighbors (all contiguous polygons)
-def queen(params):
-    """SQL query for queen neighbors.
-        :param params: dict of information to fill query
-    """
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                 "FROM \"{table}\" As j " \
-                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
-                 "%(attr_where_j)s)" \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## to add more weight methods open a ticket or pull request
-
-def get_query(w_type, query_vals):
-    """Return requested query.
-        :param w_type: type of neighbors to calculate (knn or queen)
-        :param query_vals: values used to construct the query
-    """
-
-    if w_type == 'knn':
-        return knn(query_vals)
-    else:
-        return queen(query_vals)
-
-def get_attributes(query_res, attr_num):
-    """
-        :param query_res: query results with attributes and neighbors
-        :param attr_num: attribute number (1, 2, ...)
-    """
-    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
-
-## Build weight object
-def get_weight(query_res, w_type='queen', num_ngbrs=5):
-    """
-        Construct PySAL weight from return value of query
-        :param query_res: query results with attributes and neighbors
-    """
-    if w_type == 'knn':
-        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
-        weights = {x['id']: row_normed_weights for x in query_res}
-    elif w_type == 'queen':
-        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
-                            if len(x['neighbors']) > 0
-                            else [] for x in query_res}
-
-    neighbors = {x['id']: x['neighbors'] for x in query_res}
-
-    return ps.W(neighbors, weights)
-
-def quad_position(quads):
-    """
-        Map an array of quadrant numbers to Moran's I classifications
-    """
-
-    lisa_sig = np.array([map_quads(q) for q in quads])
-
-    return lisa_sig
-
-def lisa_sig_vals(pvals, quads, threshold):
-    """
-        Produce Moran's I classification for units with p-value <= threshold
-    """
-
-    sig = (pvals <= threshold)
-
-    lisa_sig = np.empty(len(sig), np.chararray)
-
-    for idx, val in enumerate(sig):
-        if val:
-            lisa_sig[idx] = map_quads(quads[idx])
-        else:
-            lisa_sig[idx] = 'Not significant'
-
-    return lisa_sig
--- a/release/python/0.0.1/crankshaft/crankshaft/random_seeds.py
+++ b/release/python/0.0.1/crankshaft/crankshaft/random_seeds.py
@@ -1,10 +0,0 @@
-import random
-import numpy
-
-def set_random_seeds(value):
-    """
-    Set the seeds of the RNGs (Random Number Generators)
-    used internally.
-    """
-    random.seed(value)
-    numpy.random.seed(value)
--- a/release/python/0.0.1/crankshaft/setup.py
+++ b/release/python/0.0.1/crankshaft/setup.py
@@ -1,48 +0,0 @@
-
-"""
-CartoDB Spatial Analysis Python Library
-See:
-https://github.com/CartoDB/crankshaft
-"""
-
-from setuptools import setup, find_packages
-
-setup(
-    name='crankshaft',
-
-    version='0.0.1',
-
-    description='CartoDB Spatial Analysis Python Library',
-
-    url='https://github.com/CartoDB/crankshaft',
-
-    author='Data Services Team - CartoDB',
-    author_email='dataservices@cartodb.com',
-
-    license='MIT',
-
-    classifiers=[
-        'Development Status :: 3 - Alpha',
-        'Intended Audience :: Mapping community',
-        'Topic :: Maps :: Mapping Tools',
-        'License :: OSI Approved :: MIT License',
-        'Programming Language :: Python :: 2.7',
-    ],
-
-    keywords='maps mapping tools spatial analysis geostatistics',
-
-    packages=find_packages(exclude=['contrib', 'docs', 'tests']),
-
-    extras_require={
-        'dev': ['unittest'],
-        'test': ['unittest', 'nose', 'mock'],
-    },
-
-    # The choice of component versions is dictated by what's
-    # provisioned in the production servers.
-    install_requires=['pysal==1.11.0','numpy==1.6.1','scipy==0.17.0'],
-
-    requires=['pysal', 'numpy'],
-
-    test_suite='test'
-)
--- a/release/python/0.0.1/crankshaft/test/fixtures/moran.json
+++ b/release/python/0.0.1/crankshaft/test/fixtures/moran.json
@@ -1,52 +0,0 @@
-[[0.9319096128346788, "HH"],
-[-1.135787401862846, "HL"],
-[0.11732030672508517, "Not significant"],
-[0.6152779669180425, "Not significant"],
-[-0.14657336660125297, "Not significant"],
-[0.6967858120189607, "Not significant"],
-[0.07949310115714454, "Not significant"],
-[0.4703198759258987, "Not significant"],
-[0.4421125200498064, "Not significant"],
-[0.5724288737143592, "Not significant"],
-[0.8970743435692062, "LL"],
-[0.18327334401918674, "Not significant"],
-[-0.01466729201304962, "Not significant"],
-[0.3481559372544409, "Not significant"],
-[0.06547094736902978, "Not significant"],
-[0.15482141569329988, "HH"],
-[0.4373841193538136, "Not significant"],
-[0.15971286468915544, "Not significant"],
-[1.0543588860308968, "Not significant"],
-[1.7372866900020818, "HH"],
-[1.091998586053999, "LL"],
-[0.1171572584252222, "Not significant"],
-[0.08438455015300014, "Not significant"],
-[0.06547094736902978, "Not significant"],
-[0.15482141569329985, "HH"],
-[1.1627044812890683, "HH"],
-[0.06547094736902978, "Not significant"],
-[0.795275137550483, "Not significant"],
-[0.18562939195219, "LL"],
-[0.3010757406693439, "Not significant"],
-[2.8205795942839376, "HH"],
-[0.11259190602909264, "Not significant"],
-[-0.07116352791516614, "Not significant"],
-[-0.09945240794119009, "Not significant"],
-[0.18562939195219, "LL"],
-[0.1832733440191868, "Not significant"],
-[-0.39054253768447705, "Not significant"],
-[-0.1672071289487642, "HL"],
-[0.3337669247916343, "Not significant"],
-[0.2584386102554792, "Not significant"],
-[-0.19733845476322634, "HL"],
-[-0.9379282899805409, "LH"],
-[-0.028770969951095866, "Not significant"],
-[0.051367269430983485, "Not significant"],
-[-0.2172548045913472, "LH"],
-[0.05136726943098351, "Not significant"],
-[0.04191046803899837, "Not significant"],
-[0.7482357030403517, "HH"],
-[-0.014585767863118111, "Not significant"],
-[0.5410013139159929, "Not significant"],
-[1.0223932668429925, "LL"],
-[1.4179402898927476, "LL"]]
--- a/release/python/0.0.1/crankshaft/test/fixtures/neighbors.json
+++ b/release/python/0.0.1/crankshaft/test/fixtures/neighbors.json
@@ -1,54 +0,0 @@
-[
-    {"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
-    {"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
-    {"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
-    {"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
-    {"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
-    {"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
-    {"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
-    {"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
-    {"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
-    {"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
-    {"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
-    {"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
-    {"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
-    {"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
-    {"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
-    {"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
-    {"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
-    {"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
-    {"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
-    {"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
-    {"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
-    {"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
-    {"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
-    {"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
-    {"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
-    {"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
-    {"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
-    {"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
-    {"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
-    {"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
-    {"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
-    {"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
-    {"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
-    {"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
-    {"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
-    {"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
-    {"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
-    {"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
-    {"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
-    {"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
-    {"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
-    {"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
-    {"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
-    {"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
-    {"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
-    {"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
-    {"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
-    {"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
-    {"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
-    {"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
-    {"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
-    {"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
-  ]
--- a/release/python/0.0.1/crankshaft/test/helper.py
+++ b/release/python/0.0.1/crankshaft/test/helper.py
@@ -1,13 +0,0 @@
-import unittest
-
-from mock_plpy import MockPlPy
-plpy = MockPlPy()
-
-import sys
-sys.modules['plpy'] = plpy
-
-import os
-
-def fixture_file(name):
-    dir = os.path.dirname(os.path.realpath(__file__))
-    return os.path.join(dir, 'fixtures', name)
--- a/release/python/0.0.1/crankshaft/test/mock_plpy.py
+++ b/release/python/0.0.1/crankshaft/test/mock_plpy.py
@@ -1,34 +0,0 @@
-import re
-
-class MockPlPy:
-    def __init__(self):
-        self._reset()
-
-    def _reset(self):
-        self.infos = []
-        self.notices = []
-        self.debugs = []
-        self.logs = []
-        self.warnings = []
-        self.errors = []
-        self.fatals = []
-        self.executes = []
-        self.results = []
-        self.prepares = []
-        self.results = []
-
-    def _define_result(self, query, result):
-        pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
-        self.results.append([pattern, result])
-
-    def notice(self, msg):
-        self.notices.append(msg)
-
-    def info(self, msg):
-        self.infos.append(msg)
-
-    def execute(self, query): # TODO: additional arguments
-       for result in self.results:
-          if result[0].match(query):
-            return result[1]
-       return []
--- a/release/python/0.0.1/crankshaft/test/test_clustering_moran.py
+++ b/release/python/0.0.1/crankshaft/test/test_clustering_moran.py
@@ -1,144 +0,0 @@
-import unittest
-import numpy as np
-
-import unittest
-
-
-# from mock_plpy import MockPlPy
-# plpy = MockPlPy()
-#
-# import sys
-# sys.modules['plpy'] = plpy
-from helper import plpy, fixture_file
-
-import crankshaft.clustering as cc
-from crankshaft import random_seeds
-import json
-
-class MoranTest(unittest.TestCase):
-    """Testing class for Moran's I functions."""
-
-    def setUp(self):
-        plpy._reset()
-        self.params = {"id_col": "cartodb_id",
-                       "attr1": "andy",
-                       "attr2": "jay_z",
-                       "table": "a_list",
-                       "geom_col": "the_geom",
-                       "num_ngbrs": 321}
-        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
-        self.moran_data = json.loads(open(fixture_file('moran.json')).read())
-
-    def test_map_quads(self):
-        """Test map_quads."""
-        self.assertEqual(cc.map_quads(1), 'HH')
-        self.assertEqual(cc.map_quads(2), 'LH')
-        self.assertEqual(cc.map_quads(3), 'LL')
-        self.assertEqual(cc.map_quads(4), 'HL')
-        self.assertEqual(cc.map_quads(33), None)
-        self.assertEqual(cc.map_quads('andy'), None)
-
-    def test_query_attr_select(self):
-        """Test query_attr_select."""
-
-        ans = "i.\"{attr1}\"::numeric As attr1, " \
-              "i.\"{attr2}\"::numeric As attr2, "
-
-        self.assertEqual(cc.query_attr_select(self.params), ans)
-
-    def test_query_attr_where(self):
-        """Test query_attr_where."""
-
-        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" <> 0"
-
-        self.assertEqual(cc.query_attr_where(self.params), ans)
-
-    def test_knn(self):
-        """Test knn function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
-              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
-              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
-              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
-              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
-              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
-              "BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.knn(self.params), ans)
-
-    def test_queen(self):
-        """Test queen neighbors function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
-              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
-              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
-              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
-              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
-              "i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.queen(self.params), ans)
-
-    def test_get_query(self):
-        """Test get_query."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
-              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
-              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
-              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
-              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
-              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.get_query('knn', self.params), ans)
-
-    def test_get_attributes(self):
-        """Test get_attributes."""
-
-        ## need to add tests
-
-        self.assertEqual(True, True)
-
-    def test_get_weight(self):
-        """Test get_weight."""
-
-        self.assertEqual(True, True)
-
-
-    def test_quad_position(self):
-        """Test quad_position."""
-
-        quads = np.array([1, 2, 3, 4], np.int)
-
-        ans = np.array(['HH', 'LH', 'LL', 'HL'])
-        test_ans = cc.quad_position(quads)
-
-        self.assertTrue((test_ans == ans).all())
-
-    def test_moran_local(self):
-        """Test Moran's I local"""
-        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
-        plpy._define_result('select', data)
-        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
-        result = [(row[0], row[1]) for row in result]
-        expected = self.moran_data
-        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
-            self.assertAlmostEqual(res_val, exp_val)
-            self.assertEqual(res_quad, exp_quad)
-
-    def test_moran_local_rate(self):
-        """Test Moran's I rate"""
-        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
-        plpy._define_result('select', data)
-        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
-        result = [(row[0], row[1]) for row in result]
-        expected = self.moran_data
-        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
-            self.assertAlmostEqual(res_val, exp_val)
--- a/release/python/0.0.2/crankshaft/crankshaft/__init__.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/__init__.py
@@ -1,2 +0,0 @@
-import random_seeds
-import clustering
--- a/release/python/0.0.2/crankshaft/crankshaft/clustering/__init__.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/clustering/__init__.py
@@ -1 +0,0 @@
-from moran import *
--- a/release/python/0.0.2/crankshaft/crankshaft/clustering/moran.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/clustering/moran.py
@@ -1,321 +0,0 @@
-"""
-Moran's I geostatistics (global clustering & outliers presence)
-"""
-
-# TODO: Fill in local neighbors which have null/NoneType values with the
-#       average of their neighborhood
-
-import numpy as np
-import pysal as ps
-import plpy
-
-# High level interface ---------------------------------------
-
-def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    """
-    Moran's I implementation for PL/Python
-    Andy Eschbacher
-    """
-    # TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
-    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
-
-    plpy.notice('** Constructing query')
-
-    # geometries with attributes that are null are ignored
-    # resulting in a collection of not as near neighbors
-
-    qvals = {"id_col": id_col,
-             "attr1": attr,
-             "geom_col": geom_column,
-             "table": t,
-             "num_ngbrs": num_ngbrs}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
-
-    y = get_attributes(r, 1)
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.Moran_Local(y, w)
-
-    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
-
-def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    """
-    Moran's I Local Rate
-    Andy Eschbacher
-    """
-
-    plpy.notice('** Constructing query')
-
-    # geometries with attributes that are null are ignored
-    # resulting in a collection of not as near neighbors
-
-    qvals = {"id_col": id_col,
-             "numerator": numerator,
-             "denominator": denominator,
-             "geom_col": geom_column,
-             "table": t,
-             "num_ngbrs": num_ngbrs}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError as err:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Error: %s' % err)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None], [None])
-
-        plpy.notice('r.nrows() = %d' % r.nrows())
-
-    ## collect attributes
-    numer = get_attributes(r, 1)
-    denom = get_attributes(r, 2)
-
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
-
-    # find units of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    ## TODO: Decide on which return values here
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
-
-def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
-    plpy.notice('** Constructing query')
-
-    qvals = {"num_ngbrs": num_ngbrs,
-             "attr1": attr1,
-             "attr2": attr2,
-             "table": t,
-             "geom_col": geom_column,
-             "id_col": id_col}
-
-    q = get_query(w_type, qvals)
-
-    try:
-        r = plpy.execute(q)
-        plpy.notice('** Query returned with %d rows' % len(r))
-    except plpy.SPIError:
-        plpy.notice('** Query failed: "%s"' % q)
-        plpy.notice('** Error: %s' % plpy.SPIError)
-        plpy.notice('** Exiting function')
-        return zip([None], [None], [None], [None])
-
-    ## collect attributes
-    attr1_vals = get_attributes(r, 1)
-    attr2_vals = get_attributes(r, 2)
-
-    # create weights
-    w = get_weight(r, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w)
-
-    plpy.notice("len of Is: %d" % len(lisa.Is))
-
-    # find clustering of significance
-    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
-
-    plpy.notice('** Finished calculations')
-
-    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
-
-
-# Low level functions ----------------------------------------
-
-def map_quads(coord):
-    """
-        Map a quadrant number to Moran's I designation
-        HH=1, LH=2, LL=3, HL=4
-        Input:
-        :param coord (int): quadrant of a specific measurement
-    """
-    if coord == 1:
-        return 'HH'
-    elif coord == 2:
-        return 'LH'
-    elif coord == 3:
-        return 'LL'
-    elif coord == 4:
-        return 'HL'
-    else:
-        return None
-
-def query_attr_select(params):
-    """
-        Create portion of SELECT statement for attributes inolved in query.
-        :param params: dict of information used in query (column names,
-                       table name, etc.)
-    """
-
-    attrs = [k for k in params
-             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
-
-    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
-
-    attr_string = ""
-
-    for idx, val in enumerate(sorted(attrs)):
-        attr_string += template % {"col": val, "alias_num": idx + 1}
-
-    return attr_string
-
-def query_attr_where(params):
-    """
-        Create portion of WHERE clauses for weeding out NULL-valued geometries
-    """
-    attrs = sorted([k for k in params
-                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
-
-    attr_string = []
-
-    for attr in attrs:
-        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
-
-    if len(attrs) == 2:
-        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
-
-    out = " AND ".join(attr_string)
-
-    return out
-
-def knn(params):
-    """SQL query for k-nearest neighbors.
-        :param vars: dict of values to fill template
-    """
-
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                              "FROM \"{table}\" As j " \
-                              "WHERE %(attr_where_j)s " \
-                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
-                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## SQL query for finding queens neighbors (all contiguous polygons)
-def queen(params):
-    """SQL query for queen neighbors.
-        :param params: dict of information to fill query
-    """
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                 "FROM \"{table}\" As j " \
-                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
-                 "%(attr_where_j)s)" \
-                ") As neighbors " \
-            "FROM \"{table}\" As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## to add more weight methods open a ticket or pull request
-
-def get_query(w_type, query_vals):
-    """Return requested query.
-        :param w_type: type of neighbors to calculate (knn or queen)
-        :param query_vals: values used to construct the query
-    """
-
-    if w_type == 'knn':
-        return knn(query_vals)
-    else:
-        return queen(query_vals)
-
-def get_attributes(query_res, attr_num):
-    """
-        :param query_res: query results with attributes and neighbors
-        :param attr_num: attribute number (1, 2, ...)
-    """
-    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
-
-## Build weight object
-def get_weight(query_res, w_type='queen', num_ngbrs=5):
-    """
-        Construct PySAL weight from return value of query
-        :param query_res: query results with attributes and neighbors
-    """
-    if w_type == 'knn':
-        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
-        weights = {x['id']: row_normed_weights for x in query_res}
-    elif w_type == 'queen':
-        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
-                            if len(x['neighbors']) > 0
-                            else [] for x in query_res}
-
-    neighbors = {x['id']: x['neighbors'] for x in query_res}
-
-    return ps.W(neighbors, weights)
-
-def quad_position(quads):
-    """
-        Produce Moran's I classification based of n
-    """
-
-    lisa_sig = np.array([map_quads(q) for q in quads])
-
-    return lisa_sig
-
-def lisa_sig_vals(pvals, quads, threshold):
-    """
-        Produce Moran's I classification based of n
-    """
-
-    sig = (pvals <= threshold)
-
-    lisa_sig = np.empty(len(sig), np.chararray)
-
-    for idx, val in enumerate(sig):
-        if val:
-            lisa_sig[idx] = map_quads(quads[idx])
-        else:
-            lisa_sig[idx] = 'Not significant'
-
-    return lisa_sig
--- a/release/python/0.0.2/crankshaft/crankshaft/random_seeds.py
+++ b/release/python/0.0.2/crankshaft/crankshaft/random_seeds.py
@@ -1,10 +0,0 @@
-import random
-import numpy
-
-def set_random_seeds(value):
-    """
-    Set the seeds of the RNGs (Random Number Generators)
-    used internally.
-    """
-    random.seed(value)
-    numpy.random.seed(value)
--- a/release/python/0.0.2/crankshaft/setup.py
+++ b/release/python/0.0.2/crankshaft/setup.py
@@ -1,48 +0,0 @@
-
-"""
-CartoDB Spatial Analysis Python Library
-See:
-https://github.com/CartoDB/crankshaft
-"""
-
-from setuptools import setup, find_packages
-
-setup(
-    name='crankshaft',
-
-    version='0.0.2',
-
-    description='CartoDB Spatial Analysis Python Library',
-
-    url='https://github.com/CartoDB/crankshaft',
-
-    author='Data Services Team - CartoDB',
-    author_email='dataservices@cartodb.com',
-
-    license='MIT',
-
-    classifiers=[
-        'Development Status :: 3 - Alpha',
-        'Intended Audience :: Mapping comunity',
-        'Topic :: Maps :: Mapping Tools',
-        'License :: OSI Approved :: MIT License',
-        'Programming Language :: Python :: 2.7',
-    ],
-
-    keywords='maps mapping tools spatial analysis geostatistics',
-
-    packages=find_packages(exclude=['contrib', 'docs', 'tests']),
-
-    extras_require={
-        'dev': ['unittest'],
-        'test': ['unittest', 'nose', 'mock'],
-    },
-
-    # The choice of component versions is dictated by what's
-    # provisioned in the production servers.
-    install_requires=['pysal==1.9.1'],
-
-    requires=['pysal', 'numpy' ],
-
-    test_suite='test'
-)
--- a/release/python/0.0.2/crankshaft/test/fixtures/moran.json
+++ b/release/python/0.0.2/crankshaft/test/fixtures/moran.json
@@ -1,52 +0,0 @@
-[[0.9319096128346788, "HH"],
-[-1.135787401862846, "HL"],
-[0.11732030672508517, "Not significant"],
-[0.6152779669180425, "Not significant"],
-[-0.14657336660125297, "Not significant"],
-[0.6967858120189607, "Not significant"],
-[0.07949310115714454, "Not significant"],
-[0.4703198759258987, "Not significant"],
-[0.4421125200498064, "Not significant"],
-[0.5724288737143592, "Not significant"],
-[0.8970743435692062, "LL"],
-[0.18327334401918674, "Not significant"],
-[-0.01466729201304962, "Not significant"],
-[0.3481559372544409, "Not significant"],
-[0.06547094736902978, "Not significant"],
-[0.15482141569329988, "HH"],
-[0.4373841193538136, "Not significant"],
-[0.15971286468915544, "Not significant"],
-[1.0543588860308968, "Not significant"],
-[1.7372866900020818, "HH"],
-[1.091998586053999, "LL"],
-[0.1171572584252222, "Not significant"],
-[0.08438455015300014, "Not significant"],
-[0.06547094736902978, "Not significant"],
-[0.15482141569329985, "HH"],
-[1.1627044812890683, "HH"],
-[0.06547094736902978, "Not significant"],
-[0.795275137550483, "Not significant"],
-[0.18562939195219, "LL"],
-[0.3010757406693439, "Not significant"],
-[2.8205795942839376, "HH"],
-[0.11259190602909264, "Not significant"],
-[-0.07116352791516614, "Not significant"],
-[-0.09945240794119009, "Not significant"],
-[0.18562939195219, "LL"],
-[0.1832733440191868, "Not significant"],
-[-0.39054253768447705, "Not significant"],
-[-0.1672071289487642, "HL"],
-[0.3337669247916343, "Not significant"],
-[0.2584386102554792, "Not significant"],
-[-0.19733845476322634, "HL"],
-[-0.9379282899805409, "LH"],
-[-0.028770969951095866, "Not significant"],
-[0.051367269430983485, "Not significant"],
-[-0.2172548045913472, "LH"],
-[0.05136726943098351, "Not significant"],
-[0.04191046803899837, "Not significant"],
-[0.7482357030403517, "HH"],
-[-0.014585767863118111, "Not significant"],
-[0.5410013139159929, "Not significant"],
-[1.0223932668429925, "LL"],
-[1.4179402898927476, "LL"]]
--- a/release/python/0.0.2/crankshaft/test/fixtures/neighbors.json
+++ b/release/python/0.0.2/crankshaft/test/fixtures/neighbors.json
@@ -1,54 +0,0 @@
-[
-    {"neighbors": [48, 26, 20, 9, 31], "id": 1, "value": 0.5},
-    {"neighbors": [30, 16, 46, 3, 4], "id": 2, "value": 0.7},
-    {"neighbors": [46, 30, 2, 12, 16], "id": 3, "value": 0.2},
-    {"neighbors": [18, 30, 23, 2, 52], "id": 4, "value": 0.1},
-    {"neighbors": [47, 40, 45, 37, 28], "id": 5, "value": 0.3},
-    {"neighbors": [10, 21, 41, 14, 37], "id": 6, "value": 0.05},
-    {"neighbors": [8, 17, 43, 25, 12], "id": 7, "value": 0.4},
-    {"neighbors": [17, 25, 43, 22, 7], "id": 8, "value": 0.7},
-    {"neighbors": [39, 34, 1, 26, 48], "id": 9, "value": 0.5},
-    {"neighbors": [6, 37, 5, 45, 49], "id": 10, "value": 0.04},
-    {"neighbors": [51, 41, 29, 21, 14], "id": 11, "value": 0.08},
-    {"neighbors": [44, 46, 43, 50, 3], "id": 12, "value": 0.2},
-    {"neighbors": [45, 23, 14, 28, 18], "id": 13, "value": 0.4},
-    {"neighbors": [41, 29, 13, 23, 6], "id": 14, "value": 0.2},
-    {"neighbors": [36, 27, 32, 33, 24], "id": 15, "value": 0.3},
-    {"neighbors": [19, 2, 46, 44, 28], "id": 16, "value": 0.4},
-    {"neighbors": [8, 25, 43, 7, 22], "id": 17, "value": 0.6},
-    {"neighbors": [23, 4, 29, 14, 13], "id": 18, "value": 0.3},
-    {"neighbors": [42, 16, 28, 26, 40], "id": 19, "value": 0.7},
-    {"neighbors": [1, 48, 31, 26, 42], "id": 20, "value": 0.8},
-    {"neighbors": [41, 6, 11, 14, 10], "id": 21, "value": 0.1},
-    {"neighbors": [25, 50, 43, 31, 44], "id": 22, "value": 0.4},
-    {"neighbors": [18, 13, 14, 4, 2], "id": 23, "value": 0.1},
-    {"neighbors": [33, 49, 34, 47, 27], "id": 24, "value": 0.3},
-    {"neighbors": [43, 8, 22, 17, 50], "id": 25, "value": 0.4},
-    {"neighbors": [1, 42, 20, 31, 48], "id": 26, "value": 0.6},
-    {"neighbors": [32, 15, 36, 33, 24], "id": 27, "value": 0.3},
-    {"neighbors": [40, 45, 19, 5, 13], "id": 28, "value": 0.8},
-    {"neighbors": [11, 51, 41, 14, 18], "id": 29, "value": 0.3},
-    {"neighbors": [2, 3, 4, 46, 18], "id": 30, "value": 0.1},
-    {"neighbors": [20, 26, 1, 50, 48], "id": 31, "value": 0.9},
-    {"neighbors": [27, 36, 15, 49, 24], "id": 32, "value": 0.3},
-    {"neighbors": [24, 27, 49, 34, 32], "id": 33, "value": 0.4},
-    {"neighbors": [47, 9, 39, 40, 24], "id": 34, "value": 0.3},
-    {"neighbors": [38, 51, 11, 21, 41], "id": 35, "value": 0.3},
-    {"neighbors": [15, 32, 27, 49, 33], "id": 36, "value": 0.2},
-    {"neighbors": [49, 10, 5, 47, 24], "id": 37, "value": 0.5},
-    {"neighbors": [35, 21, 51, 11, 41], "id": 38, "value": 0.4},
-    {"neighbors": [9, 34, 48, 1, 47], "id": 39, "value": 0.6},
-    {"neighbors": [28, 47, 5, 9, 34], "id": 40, "value": 0.5},
-    {"neighbors": [11, 14, 29, 21, 6], "id": 41, "value": 0.4},
-    {"neighbors": [26, 19, 1, 9, 31], "id": 42, "value": 0.2},
-    {"neighbors": [25, 12, 8, 22, 44], "id": 43, "value": 0.3},
-    {"neighbors": [12, 50, 46, 16, 43], "id": 44, "value": 0.2},
-    {"neighbors": [28, 13, 5, 40, 19], "id": 45, "value": 0.3},
-    {"neighbors": [3, 12, 44, 2, 16], "id": 46, "value": 0.2},
-    {"neighbors": [34, 40, 5, 49, 24], "id": 47, "value": 0.3},
-    {"neighbors": [1, 20, 26, 9, 39], "id": 48, "value": 0.5},
-    {"neighbors": [24, 37, 47, 5, 33], "id": 49, "value": 0.2},
-    {"neighbors": [44, 22, 31, 42, 26], "id": 50, "value": 0.6},
-    {"neighbors": [11, 29, 41, 14, 21], "id": 51, "value": 0.01},
-    {"neighbors": [4, 18, 29, 51, 23], "id": 52, "value": 0.01}
-  ]
--- a/release/python/0.0.2/crankshaft/test/helper.py
+++ b/release/python/0.0.2/crankshaft/test/helper.py
@@ -1,13 +0,0 @@
-import unittest
-
-from mock_plpy import MockPlPy
-plpy = MockPlPy()
-
-import sys
-sys.modules['plpy'] = plpy
-
-import os
-
-def fixture_file(name):
-    dir = os.path.dirname(os.path.realpath(__file__))
-    return os.path.join(dir, 'fixtures', name)
--- a/release/python/0.0.2/crankshaft/test/mock_plpy.py
+++ b/release/python/0.0.2/crankshaft/test/mock_plpy.py
@@ -1,34 +0,0 @@
-import re
-
-class MockPlPy:
-    def __init__(self):
-        self._reset()
-
-    def _reset(self):
-        self.infos = []
-        self.notices = []
-        self.debugs = []
-        self.logs = []
-        self.warnings = []
-        self.errors = []
-        self.fatals = []
-        self.executes = []
-        self.results = []
-        self.prepares = []
-        self.results = []
-
-    def _define_result(self, query, result):
-        pattern = re.compile(query, re.IGNORECASE | re.MULTILINE)
-        self.results.append([pattern, result])
-
-    def notice(self, msg):
-        self.notices.append(msg)
-
-    def info(self, msg):
-        self.infos.append(msg)
-
-    def execute(self, query): # TODO: additional arguments
-       for result in self.results:
-          if result[0].match(query):
-            return result[1]
-       return []
--- a/release/python/0.0.2/crankshaft/test/test_clustering_moran.py
+++ b/release/python/0.0.2/crankshaft/test/test_clustering_moran.py
@@ -1,144 +0,0 @@
-import unittest
-import numpy as np
-
-import unittest
-
-
-# from mock_plpy import MockPlPy
-# plpy = MockPlPy()
-#
-# import sys
-# sys.modules['plpy'] = plpy
-from helper import plpy, fixture_file
-
-import crankshaft.clustering as cc
-from crankshaft import random_seeds
-import json
-
-class MoranTest(unittest.TestCase):
-    """Testing class for Moran's I functions."""
-
-    def setUp(self):
-        plpy._reset()
-        self.params = {"id_col": "cartodb_id",
-                       "attr1": "andy",
-                       "attr2": "jay_z",
-                       "table": "a_list",
-                       "geom_col": "the_geom",
-                       "num_ngbrs": 321}
-        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
-        self.moran_data = json.loads(open(fixture_file('moran.json')).read())
-
-    def test_map_quads(self):
-        """Test map_quads."""
-        self.assertEqual(cc.map_quads(1), 'HH')
-        self.assertEqual(cc.map_quads(2), 'LH')
-        self.assertEqual(cc.map_quads(3), 'LL')
-        self.assertEqual(cc.map_quads(4), 'HL')
-        self.assertEqual(cc.map_quads(33), None)
-        self.assertEqual(cc.map_quads('andy'), None)
-
-    def test_query_attr_select(self):
-        """Test query_attr_select."""
-
-        ans = "i.\"{attr1}\"::numeric As attr1, " \
-              "i.\"{attr2}\"::numeric As attr2, "
-
-        self.assertEqual(cc.query_attr_select(self.params), ans)
-
-    def test_query_attr_where(self):
-        """Test query_attr_where."""
-
-        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
-              "idx_replace.\"{attr2}\" <> 0"
-
-        self.assertEqual(cc.query_attr_where(self.params), ans)
-
-    def test_knn(self):
-        """Test knn function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
-              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
-              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
-              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
-              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
-              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
-              "BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.knn(self.params), ans)
-
-    def test_queen(self):
-        """Test queen neighbors function."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
-              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
-              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
-              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
-              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
-              "i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.queen(self.params), ans)
-
-    def test_get_query(self):
-        """Test get_query."""
-
-        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
-              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
-              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
-              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
-              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
-              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
-              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
-              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(cc.get_query('knn', self.params), ans)
-
-    def test_get_attributes(self):
-        """Test get_attributes."""
-
-        ## need to add tests
-
-        self.assertEqual(True, True)
-
-    def test_get_weight(self):
-        """Test get_weight."""
-
-        self.assertEqual(True, True)
-
-
-    def test_quad_position(self):
-        """Test lisa_sig_vals."""
-
-        quads = np.array([1, 2, 3, 4], np.int)
-
-        ans = np.array(['HH', 'LH', 'LL', 'HL'])
-        test_ans = cc.quad_position(quads)
-
-        self.assertTrue((test_ans == ans).all())
-
-    def test_moran_local(self):
-        """Test Moran's I local"""
-        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
-        plpy._define_result('select', data)
-        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
-        result = [(row[0], row[1]) for row in result]
-        expected = self.moran_data
-        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
-            self.assertAlmostEqual(res_val, exp_val)
-            self.assertEqual(res_quad, exp_quad)
-
-    def test_moran_local_rate(self):
-        """Test Moran's I rate"""
-        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
-        plpy._define_result('select', data)
-        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
-        result = [(row[0], row[1]) for row in result]
-        expected = self.moran_data
-        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
-            self.assertAlmostEqual(res_val, exp_val)
--- a/src/pg/Makefile
+++ b/src/pg/Makefile
@@ -1,15 +1,11 @@
-include ../../Makefile.global
+# Generation of a new development version 'dev' (with an alias 'current' for
+# updating easily by upgrading to 'current', then 'dev')

-# Development tasks:
-#
-# * install generates the control & script files into src/pg/
-#   and installs then into the PostgreSQL extensions directory;
-#   requires sudo. In additionof the current development version
-#   named 'dev', an alias 'current' is generating for ease of
-#   update (upgrade to 'current', then to 'dev').
-#   the python module is installed in a virtualenv in envs/dev/
-# * test runs the tests for the currently generated Development
-#   extension.
+# sudo make install -- generate the 'dev' version from current source
+#                      and make it available to PostgreSQL
+# PGUSER=postgres make installcheck -- test the 'dev' extension
+
+EXTENSION    = crankshaft

 DATA         = $(EXTENSION)--dev.sql \
 	             $(EXTENSION)--current--dev.sql \
@@ -18,14 +14,8 @@ DATA         = $(EXTENSION)--dev.sql \
 SOURCES_DATA_DIR = sql
 SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql)

-VIRTUALENV_PATH = $(realpath ../../envs)
-ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH))
-
-REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \
-               -e 's/@@VIRTUALENV_PATH@@/$(ESC_VIRVIRTUALENV_PATH)/g'
-
 $(DATA): $(SOURCES_DATA)
-	$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > $@
+	cat $(SOURCES_DATA_DIR)/*.sql > $@

 TEST_DIR = test
 REGRESS = $(notdir $(basename $(wildcard $(TEST_DIR)/sql/*test.sql)))
@@ -38,23 +28,14 @@ include $(PGXS)
 # This seems to be needed at least for PG 9.3.11
 all: $(DATA)

-test: export PGUSER=postgres
-test: installcheck
+# WIP: goals for releasing the extension...

-# Release tasks
+EXTVERSION   = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")

-../../release/$(EXTENSION).control: $(EXTENSION).control
+../release/$(EXTENSION).control: $(EXTENSION).control
 	cp $< $@

-# Prepare new release from the currently installed development version,
-# for the current version X.Y.Z (defined in the control file)
-# producing the extension script and control files in releases/
-# and the python package in releases/python/X.Y.Z/crankshaft/
-release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
-	$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > ../../release/$(EXTENSION)--$(EXTVERSION).sql
-
-# Install the current relese into the PostgreSQL extensions directory
-# and the Python package in a virtual environment envs/X.Y.Z
-deploy:
-	$(INSTALL_DATA) ../../release/$(EXTENSION).control '$(DESTDIR)$(datadir)/extension/'
-	$(INSTALL_DATA) ../../release/*.sql '$(DESTDIR)$(datadir)/extension/'
+release: ../release/$(EXTENSION).control
+	cp $(EXTENSION)--dev.sql $(EXTENSION)--$(EXTVERSION).sql
+	# pending: create upgrade/downgrade scripts,
+	#          commit, push, tag....
--- a/src/pg/README.md
+++ b/src/pg/README.md
@@ -0,0 +1,7 @@
+
+# Running the tests:
+
+```
+sudo make install
+PGUSER=postgres make installcheck
+```
--- a/src/pg/crankshaft.control
+++ b/src/pg/crankshaft.control
@@ -1,5 +1,5 @@
 comment = 'CartoDB Spatial Analysis extension'
-default_version = '0.0.2'
+default_version = '0.0.1'
 requires = 'plpythonu, postgis, cartodb'
 superuser = true
 schema = cdb_crankshaft
--- a/src/pg/sql/01_random_seeds.sql
+++ b/src/pg/sql/01_random_seeds.sql
@@ -4,7 +4,6 @@
 CREATE OR REPLACE FUNCTION
 _cdb_random_seeds (seed_value INTEGER) RETURNS VOID
 AS $$
-  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft import random_seeds
  random_seeds.set_random_seeds(seed_value)
 $$ LANGUAGE plpythonu;
--- a/src/pg/sql/01_version.sql
+++ b/src/pg/sql/01_version.sql
@@ -1,12 +0,0 @@
-- Version number of the extension release
-CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
-RETURNS text AS $$
-  SELECT '@@VERSION@@'::text;
-$$ language 'sql' STABLE STRICT;
-
-- Internal identifier of the installed extension instence
-- e.g. 'dev' for current development version
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version()
-RETURNS text AS $$
-  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
-$$ language 'sql' STABLE STRICT;
--- a/release/crankshaft--0.0.2--0.0.1.sql
+++ b/release/crankshaft--0.0.2--0.0.1.sql
@@ -1,12 +1,6 @@
+-- Moran's I
 CREATE OR REPLACE FUNCTION
-cdb_crankshaft._cdb_random_seeds (seed_value INTEGER) RETURNS VOID
-AS $$
-  from crankshaft import random_seeds
-  random_seeds.set_random_seeds(seed_value)
-$$ LANGUAGE plpythonu;
-
-CREATE OR REPLACE FUNCTION
-cdb_crankshaft.cdb_moran_local (
+  cdb_moran_local (
      t TEXT,
  	  attr TEXT,
  	  significance float DEFAULT 0.05,
@@ -22,8 +16,9 @@ AS $$
  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
 $$ LANGUAGE plpythonu;

+-- Moran's I Local Rate
 CREATE OR REPLACE FUNCTION
-cdb_crankshaft.cdb_moran_local_rate(t TEXT,
+  cdb_moran_local_rate(t TEXT,
 		 numerator TEXT,
 		 denominator TEXT,
 		 significance FLOAT DEFAULT 0.05,
@@ -38,7 +33,3 @@ AS $$
  # TODO: use named parameters or a dictionary
  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
 $$ LANGUAGE plpythonu;
-
-DROP FUNCTION IF EXISTS cdb_crankshaft.cdb_crankshaft_version();
-DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_internal_version();
-DROP FUNCTION IF EXISTS cdb_crankshaft._cdb_crankshaft_activate_py();
--- a/src/pg/sql/02_py.sql
+++ b/src/pg/sql/02_py.sql
@@ -1,23 +0,0 @@
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
-RETURNS text
-AS $$
-  BEGIN
-    -- RETURN '/opt/virtualenvs/crankshaft';
-    RETURN '@@VIRTUALENV_PATH@@';
-  END;
-$$ language plpgsql IMMUTABLE STRICT;
-
-- Use the crankshaft python module
-CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
-RETURNS VOID
-AS $$
-    import os
-    # plpy.notice('%',str(os.environ))
-    # activate virtualenv
-    crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version']
-    base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
-    default_venv_path = os.path.join(base_path, crankshaft_version)
-    venv_path =  os.environ.get('CRANKSHAFT_VENV', default_venv_path)
-    activate_path = venv_path + '/bin/activate_this.py'
-    exec(open(activate_path).read(), dict(__file__=activate_path))
-$$ LANGUAGE plpythonu;
--- a/src/pg/sql/03_overlap_sum.sql
+++ b/src/pg/sql/03_overlap_sum.sql
--- a/src/pg/sql/04_dot_density.sql
+++ b/src/pg/sql/04_dot_density.sql
--- a/src/pg/sql/10_moran.sql
+++ b/src/pg/sql/10_moran.sql
@@ -1,89 +0,0 @@
-- Moran's I (global)
-CREATE OR REPLACE FUNCTION
-  CDB_AreasOfInterest_Global (
-      subquery TEXT,
-      attr_name TEXT,
-      permutations INT DEFAULT 99,
-      geom_col TEXT DEFAULT 'the_geom',
-      id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn',
-      num_ngbrs INT DEFAULT 5)
-RETURNS TABLE (moran NUMERIC, significance NUMERIC)
-AS $$
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran(subquery, attr, num_ngbrs, permutations, geom_col, id_col, w_type)
-$$ LANGUAGE plpythonu;
-
-- Moran's I Local
-CREATE OR REPLACE FUNCTION
-  CDB_AreasOfInterest_Local(
-      subquery TEXT,
-      attr TEXT,
-      permutations INT DEFAULT 99,
-      geom_col TEXT DEFAULT 'the_geom',
-      id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn',
-      num_ngbrs INT DEFAULT 5)
-RETURNS TABLE (moran NUMERIC, quads TEXT, significance NUMERIC, ids INT, y NUMERIC)
-AS $$
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_local(subquery, attr, permutations, geom_col, id_col, w_type, num_ngbrs)
-$$ LANGUAGE plpythonu;
-
-- Moran's I Rate (global)
-CREATE OR REPLACE FUNCTION
-  CDB_AreasOfInterest_Global_Rate(
-      subquery TEXT,
-      numerator TEXT,
-      denominator TEXT,
-      permutations INT DEFAULT 99,
-      geom_col TEXT DEFAULT 'the_geom',
-      id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn',
-      num_ngbrs INT DEFAULT 5)
-RETURNS TABLE (moran FLOAT, significance FLOAT)
-AS $$
-  from crankshaft.clustering import moran_local
-  # TODO: use named parameters or a dictionary
-  return moran_rate(subquery, numerator, denominator, permutations, geom_col, id_col, w_type, num_ngbrs)
-$$ LANGUAGE plpythonu;
-
-
-- Moran's I Local Rate
-CREATE OR REPLACE FUNCTION
-  CDB_AreasOfInterest_Local_Rate(
-      subquery TEXT,
-      numerator TEXT,
-      denominator TEXT,
-      permutations INT DEFAULT 99,
-      geom_col TEXT DEFAULT 'the_geom',
-      id_col TEXT DEFAULT 'cartodb_id',
-      w_type TEXT DEFAULT 'knn',
-      num_ngbrs INT DEFAULT 5)
-RETURNS
-TABLE(moran NUMERIC, quads TEXT, significance NUMERIC, ids INT, y NUMERIC)
-AS $$
-  from crankshaft.clustering import moran_local_rate
-  # TODO: use named parameters or a dictionary
-  return moran_local_rate(subquery, numerator, denominator, permutations, geom_col, id_col, w_type, num_ngbrs)
-$$ LANGUAGE plpythonu;
-
-- -- Moran's I Local Bivariate
-- CREATE OR REPLACE FUNCTION
--   cdb_moran_local_bv(
--       subquery TEXT,
--       attr1 TEXT,
--       attr2 TEXT,
--       permutations INT DEFAULT 99,
--       geom_col TEXT DEFAULT 'the_geom',
--       id_col TEXT DEFAULT 'cartodb_id',
--       w_type TEXT DEFAULT 'knn',
--       num_ngbrs INT DEFAULT 5)
-- RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
-- AS $$
--   from crankshaft.clustering import moran_local_bv
--   # TODO: use named parameters or a dictionary
--   return moran_local_bv(t, attr1, attr2, permutations, geom_col, id_col, w_type, num_ngbrs)
-- $$ LANGUAGE plpythonu;
--- a/src/pg/sql/50_contours.sql
+++ b/src/pg/sql/50_contours.sql
@@ -1,32 +0,0 @@
-
-CREATE OR REPLACE FUNCTION
-  _CDB_Contours (
-      subquery TEXT,
-      grid_size NUMERIC DEFAULT 100,
-      bandwidth NUMERIC DEFAULT 0.0001,
-      levels NUMERIC[] DEFAULT null
-      )
-RETURNS table (level Numeric, geom_text text )
-AS $$
-  from crankshaft.contours import cdb_generate_contours
-  # TODO: use named parameters or a dictionary
-  return cdb_generate_contours(subquery, grid_size, bandwidth, levels)
-$$ LANGUAGE plpythonu;
-
-
-CREATE OR REPLACE FUNCTION
-  CDB_Contours (
-    subquery TEXT,
-    grid_size NUMERIC DEFAULT 100,
-    bandwidth NUMERIC DEFAULT 0.0001,
-    levels NUMERIC[] DEFAULT null
-    )
-RETURNS table (level Numeric, geom geometry )
-AS $$
-BEGIN
-
-  RETURN QUERY
-    select cont.level as level, ST_GeomFromText(cont.geom_text, 4326)::geometry as geom from _CDB_Contours(subquery,grid_size,bandwidth,levels) as cont;
-END;
-$$ LANGUAGE plpgsql;
-
--- a/src/pg/test/expected/02_moran_test.out
+++ b/src/pg/test/expected/02_moran_test.out
@@ -110,7 +110,7 @@ INSERT INTO ppoints2 VALUES
 (24,'0101000020E61000009C5F91C5095C17C0C78784B15A4F4540'::geometry,'24','07',0.3, 1.0),
 (29,'0101000020E6100000C34D4A5B48E712C092E680892C684240'::geometry,'29','01',0.3, 1.0),
 (52,'0101000020E6100000406A545EB29A07C04E5F0BDA39A54140'::geometry,'52','19',0.0, 1.01)
-- Areas of Interest functions perform some nondeterministic computations
+-- Moran functions perform some nondeterministic computations
 -- (to estimate the significance); we will set the seeds for the RNGs
 -- that affect those results to have repeatable results
 SELECT cdb_crankshaft._cdb_random_seeds(1234);
@@ -121,61 +121,67 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local('SELECT * FROM ppoints', 'value') m
+  JOIN cdb_crankshaft.cdb_moran_local('ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;
- code | quads 
------+-------
+NOTICE:  ** Constructing query
+CONTEXT:  PL/Python function "cdb_moran_local"
+NOTICE:  ** Query returned with 52 rows
+CONTEXT:  PL/Python function "cdb_moran_local"
+NOTICE:  ** Finished calculations
+CONTEXT:  PL/Python function "cdb_moran_local"
+ code |      quads      
+------+-----------------
 01   | HH
 02   | HL
- 03   | LL
- 04   | LL
- 05   | LH
- 06   | LL
- 07   | HH
- 08   | HH
- 09   | HH
- 10   | LL
+ 03   | Not significant
+ 04   | Not significant
+ 05   | Not significant
+ 06   | Not significant
+ 07   | Not significant
+ 08   | Not significant
+ 09   | Not significant
+ 10   | Not significant
 11   | LL
- 12   | LL
- 13   | HL
- 14   | LL
- 15   | LL
+ 12   | Not significant
+ 13   | Not significant
+ 14   | Not significant
+ 15   | Not significant
 16   | HH
- 17   | HH
- 18   | LL
- 19   | HH
+ 17   | Not significant
+ 18   | Not significant
+ 19   | Not significant
 20   | HH
 21   | LL
- 22   | HH
- 23   | LL
- 24   | LL
+ 22   | Not significant
+ 23   | Not significant
+ 24   | Not significant
 25   | HH
 26   | HH
- 27   | LL
- 28   | HH
+ 27   | Not significant
+ 28   | Not significant
 29   | LL
- 30   | LL
+ 30   | Not significant
 31   | HH
- 32   | LL
- 33   | HL
- 34   | LH
+ 32   | Not significant
+ 33   | Not significant
+ 34   | Not significant
 35   | LL
- 36   | LL
- 37   | HL
+ 36   | Not significant
+ 37   | Not significant
 38   | HL
- 39   | HH
- 40   | HH
+ 39   | Not significant
+ 40   | Not significant
 41   | HL
 42   | LH
- 43   | LH
- 44   | LL
+ 43   | Not significant
+ 44   | Not significant
 45   | LH
- 46   | LL
- 47   | LL
+ 46   | Not significant
+ 47   | Not significant
 48   | HH
- 49   | LH
- 50   | HH
+ 49   | Not significant
+ 50   | Not significant
 51   | LL
 52   | LL
 (52 rows)
@@ -188,61 +194,67 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints2.code, m.quads
  FROM ppoints2
-  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local_Rate('SELECT * FROM ppoints2', 'numerator', 'denominator') m
+  JOIN cdb_crankshaft.cdb_moran_local_rate('ppoints2', 'numerator', 'denominator') m
    ON ppoints2.cartodb_id = m.ids
  ORDER BY ppoints2.code;
- code | quads 
------+-------
+NOTICE:  ** Constructing query
+CONTEXT:  PL/Python function "cdb_moran_local_rate"
+NOTICE:  ** Query returned with 51 rows
+CONTEXT:  PL/Python function "cdb_moran_local_rate"
+NOTICE:  ** Finished calculations
+CONTEXT:  PL/Python function "cdb_moran_local_rate"
+ code |      quads      
+------+-----------------
 01   | LL
- 02   | LH
- 03   | HH
- 04   | HH
- 05   | LL
- 06   | HH
- 07   | LL
- 08   | LL
+ 02   | Not significant
+ 03   | Not significant
+ 04   | Not significant
+ 05   | Not significant
+ 06   | Not significant
+ 07   | Not significant
+ 08   | Not significant
 09   | LL
- 10   | HH
+ 10   | Not significant
 11   | HH
- 12   | HL
- 13   | LL
- 14   | HH
- 15   | LL
- 16   | LL
+ 12   | Not significant
+ 13   | Not significant
+ 14   | Not significant
+ 15   | Not significant
+ 16   | Not significant
 17   | LL
- 18   | LH
- 19   | LL
+ 18   | Not significant
+ 19   | Not significant
 20   | LL
- 21   | HH
- 22   | LL
- 23   | HL
- 24   | LL
+ 21   | Not significant
+ 22   | Not significant
+ 23   | Not significant
+ 24   | Not significant
 25   | LL
 26   | LL
- 27   | LL
- 28   | LL
+ 27   | Not significant
+ 28   | Not significant
 29   | LH
- 30   | HH
+ 30   | Not significant
 31   | LL
- 32   | LL
- 33   | LL
- 34   | LL
+ 32   | Not significant
+ 33   | Not significant
+ 34   | Not significant
 35   | LH
- 36   | HL
- 37   | LH
+ 36   | Not significant
+ 37   | Not significant
 38   | LH
- 39   | LL
- 40   | LL
+ 39   | Not significant
+ 40   | Not significant
 41   | LH
 42   | HL
- 43   | LL
- 44   | HL
+ 43   | Not significant
+ 44   | Not significant
 45   | LL
- 46   | HL
- 47   | LL
+ 46   | Not significant
+ 47   | Not significant
 48   | LL
- 49   | HL
- 50   | LL
- 51   | HH
+ 49   | Not significant
+ 50   | Not significant
+ 51   | Not significant
 (51 rows)

--- a/src/pg/test/sql/02_moran_test.sql
+++ b/src/pg/test/sql/02_moran_test.sql
@@ -1,14 +1,14 @@
 \i test/fixtures/ppoints.sql
 \i test/fixtures/ppoints2.sql

-- Areas of Interest functions perform some nondeterministic computations
+-- Moran functions perform some nondeterministic computations
 -- (to estimate the significance); we will set the seeds for the RNGs
 -- that affect those results to have repeatable results
 SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local('SELECT * FROM ppoints', 'value') m
+  JOIN cdb_crankshaft.cdb_moran_local('ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;

@@ -16,6 +16,6 @@ SELECT cdb_crankshaft._cdb_random_seeds(1234);

 SELECT ppoints2.code, m.quads
  FROM ppoints2
-  JOIN cdb_crankshaft.CDB_AreasOfInterest_Local_Rate('SELECT * FROM ppoints2', 'numerator', 'denominator') m
+  JOIN cdb_crankshaft.cdb_moran_local_rate('ppoints2', 'numerator', 'denominator') m
    ON ppoints2.cartodb_id = m.ids
  ORDER BY ppoints2.code;
--- a/src/pg/test/sql/90_permissions.sql
+++ b/src/pg/test/sql/90_permissions.sql
@@ -9,7 +9,7 @@ SET search_path TO public,cartodb,cdb_crankshaft;
 -- Exercise public functions
 SELECT ppoints.code, m.quads
  FROM ppoints
-  JOIN CDB_AreasOfInterest_Local('ppoints', 'value') m
+  JOIN cdb_moran_local('ppoints', 'value') m
    ON ppoints.cartodb_id = m.ids
  ORDER BY ppoints.code;
 SELECT round(cdb_overlap_sum(
--- a/src/py/.gitignore
+++ b/src/py/.gitignore
@@ -0,0 +1,2 @@
+*.pyc
+dev/
--- a/src/py/Makefile
+++ b/src/py/Makefile
@@ -1,22 +1,9 @@
-include ../../Makefile.global
-
 # Install the package locally for development
 install:
-	virtualenv --system-site-packages ../../envs/dev
-	# source ../../envs/dev/bin/activate
-	../../envs/dev/bin/pip install -I ./crankshaft
-	../../envs/dev/bin/pip install -I nose
+	virtualenv dev
+	./dev/bin/pip install ./crankshaft --upgrade
+	./dev/bin/pip install nose

 # Test development install
-test:
-	../../envs/dev/bin/nosetests crankshaft/test/
-
-release: ../../release/$(EXTENSION).control $(SOURCES_DATA)
-	mkdir -p ../../release/python/$(EXTVERSION)
-	cp -r ./$(PACKAGE) ../../release/python/$(EXTVERSION)/
-	$(SED) -i -r 's/version='"'"'[0-9]+\.[0-9]+\.[0-9]+'"'"'/version='"'"'$(EXTVERSION)'"'"'/g'  ../../release/python/$(EXTVERSION)/$(PACKAGE)/setup.py
-
-deploy:
-	virtualenv --system-site-packages $(VIRTUALENV_PATH)/$(RELEASE_VERSION)
-	$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I -U ../../release/python/$(RELEASE_VERSION)/$(PACKAGE)
-	$(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I nose
+testinstalled:
+	./dev/bin/nosetests crankshaft/test/
--- a/src/py/README.md
+++ b/src/py/README.md
@@ -8,7 +8,7 @@ cd crankshaft
 nosetests test/
 ```

-## Notes about Python dependencies
+## Notes about python dependencies
 * This extension is targeted at production databases. Therefore certain restrictions must be assumed about the production environment vs other experimental environments.
 * We're using `pip` and `virtualenv` to generate a suitable isolated environment for python code that has  all the dependencies
 * Every dependency should be:
@@ -20,69 +20,80 @@ nosetests test/

 ---

-To avoid troublesome compilations/linkings we will use
-the available system package `python-scipy`.
-This package and its dependencies provide numpy 1.6.1
-and scipy 0.9.0. To be able to use these versions we cannot
-PySAL 1.10 or later, so we'll stick to 1.9.1.
+### Sample session with virtualenv
+#### Create and use a virtual env

-```
-apt-get install -y python-scipy
-```
-
-We'll use virtual environments to install our packages,
-but configued to use also system modules so that the
-mentioned scipy and numpy are used.
-
-    # Create a virtual environment for python
-    $ virtualenv --system-site-packages dev
+    # Create the virtual environment for python
+    $ virtualenv myenv

    # Activate the virtualenv
-    $ source dev/bin/activate
+    $ source myenv/bin/activate

    # Install all the requirements
    # expect this to take a while, as it will trigger a few compilations
-    (dev) $ pip install -I ./crankshaft
+    (myenv) $ pip install -r requirements.txt
+
+    # Add a new pip to the party
+    (myenv) $ pip install pandas

 #### Test the libraries with that virtual env
-
 ##### Test numpy library dependency:

    import numpy
    numpy.test('full')

+Output:
+```
+======================================================================
+ERROR: test_multiarray.TestNewBufferProtocol.test_relaxed_strides
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "/home/ubuntu/www/crankshaft/src/py/dev2/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
+    self.test(*self.arg)
+  File "/home/ubuntu/www/crankshaft/src/py/dev2/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", line 5366, in test_relaxed_strides
+    fd.write(c.data)
+TypeError: 'buffer' does not have the buffer interface
+
+----------------------------------------------------------------------
+Ran 6153 tests in 84.561s
+
+FAILED (KNOWNFAIL=3, SKIP=5, errors=1)
+Out[2]: <nose.result.TextTestResult run=6153 errors=1 failures=0>
+```
+
+NOTE: this failure is expected with Python 2.7.3, which is the version embedded in our PostgreSQL installation.
+
+
 ##### Run scipy tests

    import scipy
    scipy.test('full')

+Output:
+```
+Ran 21562 tests in 321.610s
+
+OK (KNOWNFAIL=130, SKIP=1840)
+Out[2]: <nose.result.TextTestResult run=21562 errors=0 failures=0>
+```
+Ok, this looks good...
+
 ##### Testing pysal
-
 See [http://pysal.readthedocs.org/en/latest/developers/testing.html]

-This will require putting this into `dev/lib/python2.7/site-packages/setup.cfg`:
-
-```
-[nosetests]
-ignore-files=collection
-exclude-dir=pysal/contrib
-
-[wheel]
-universal=1
-```
-
-And copying some files before executing the tests:
-(we'll use a temporary directory from where the tests will be executed because
-some tests expect some files in the current directory). Next must be executed
-from
-
-```
-cp dev/lib/python2.7/site-packages/pysal/examples/geodanet/* dev/local/lib/python2.7/site-packages/pysal/examples
-mkdir -p test_tmp && cd test_tmp && cp ../dev/lib/python2.7/site-packages/pysal/examples/geodanet/* ./
-```
-
-Then, execute the tests with:
-
    import pysal
    import nose
    nose.runmodule('pysal')
+
+```
+Ran 537 tests in 42.182s
+
+FAILED (errors=48, failures=17)
+An exception has occurred, use %tb to see the full traceback.
+```
+
+This doesn't look good. Looking closer at the failures, many of them raise `IOError: [Errno 2] No such file or directory: 'streets.shp'`.
+
+The PySAL source tree includes a [config](https://github.com/pysal/pysal/blob/master/setup.cfg) file that seems to be missing from the pip package. Copying it to `lib/python2.7/site-packages` within the environment brings the count down to 17 failures.
+
+The remaining failures don't look good either. There seem to be two types: numerical precision errors, and arrays/matrices that are one element short when compared. TODO: fix this.
--- a/src/py/crankshaft/crankshaft/init.py
+++ b/src/py/crankshaft/crankshaft/init.py
@@ -1,3 +1,2 @@
 import random_seeds
 import clustering
-import contours
--- a/src/py/crankshaft/crankshaft/clustering/moran.py
+++ b/src/py/crankshaft/crankshaft/clustering/moran.py
@@ -5,226 +5,143 @@ Moran's I geostatistics (global clustering & outliers presence)
 # TODO: Fill in local neighbors which have null/NoneType values with the
 #       average of the their neighborhood

+import numpy as np
 import pysal as ps
 import plpy

-# crankshaft module
-import crankshaft.pysal_utils as pu
-
 # High level interface ---------------------------------------

-def moran(subquery, attr_name,
-          permutations, geom_col, id_col, w_type, num_ngbrs):
-    """
-    Moran's I (global)
-    Implementation building neighbors with a PostGIS database and Moran's I
-     core clusters with PySAL.
-    Andy Eschbacher
-    """
-    qvals = {"id_col": id_col,
-             "attr1": attr_name,
-             "geom_col": geom_col,
-             "subquery": subquery,
-             "num_ngbrs": num_ngbrs}
-
-    query = pu.construct_neighbor_query(w_type, qvals)
-
-    plpy.notice('** Query: %s' % query)
-
-    try:
-        result = plpy.execute(query)
-        # if there are no neighbors, exit
-        if len(result) == 0:
-            return pu.empty_zipped_array(2)
-        plpy.notice('** Query returned with %d rows' % len(result))
-    except plpy.SPIError:
-        plpy.error('Error: areas of interest query failed, check input parameters')
-        plpy.notice('** Query failed: "%s"' % query)
-        plpy.notice('** Error: %s' % plpy.SPIError)
-        return pu.empty_zipped_array(2)
-
-    ## collect attributes
-    attr_vals = pu.get_attributes(result)
-
-    ## calculate weights
-    weight = pu.get_weight(result, w_type, num_ngbrs)
-
-    ## calculate moran global
-    moran_global = ps.esda.moran.Moran(attr_vals, weight,
-                                       permutations=permutations)
-
-    return zip([moran_global.I], [moran_global.EI])
-
-def moran_local(subquery, attr,
-                permutations, geom_col, id_col, w_type, num_ngbrs):
+def moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
    """
    Moran's I implementation for PL/Python
    Andy Eschbacher
    """
+    # TODO: ensure that the significance output can be smaller than 1e-3 (0.001)
+    # TODO: make a wishlist of output features (zscores, pvalues, raw local lisa, what else?)
+
+    plpy.notice('** Constructing query')

    # geometries with attributes that are null are ignored
    # resulting in a collection of not as near neighbors

    qvals = {"id_col": id_col,
-             "attr1": attr,
-             "geom_col": geom_col,
-             "subquery": subquery,
+             "attr1": attr,
+             "geom_col": geom_column,
+             "table": t,
             "num_ngbrs": num_ngbrs}

-    query = pu.construct_neighbor_query(w_type, qvals)
+    q = get_query(w_type, qvals)

    try:
-        result = plpy.execute(query)
-        # if there are no neighbors, exit
-        if len(result) == 0:
-            return pu.empty_zipped_array(5)
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
    except plpy.SPIError:
-        plpy.error('Error: areas of interest query failed, check input parameters')
-        plpy.notice('** Query failed: "%s"' % query)
-        return pu.empty_zipped_array(5)
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])

-    attr_vals = pu.get_attributes(result)
-    weight = pu.get_weight(result, w_type, num_ngbrs)
+    y = get_attributes(r, 1)
+    w = get_weight(r, w_type, num_ngbrs)

    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local(attr_vals, weight,
-                                     permutations=permutations)
-
-    # find quadrants for each geometry
-    quads = quad_position(lisa.q)
-
-    return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)
-
-def moran_rate(subquery, numerator, denominator,
-               permutations, geom_col, id_col, w_type, num_ngbrs):
-    """
-    Moran's I Rate (global)
-    Andy Eschbacher
-    """
-    qvals = {"id_col": id_col,
-             "attr1": numerator,
-             "attr2": denominator,
-             "geom_col": geom_col,
-             "subquery": subquery,
-             "num_ngbrs": num_ngbrs}
-
-    query = pu.construct_neighbor_query(w_type, qvals)
-
-    plpy.notice('** Query: %s' % query)
-
-    try:
-        result = plpy.execute(query)
-        # if there are no neighbors, exit
-        if len(result) == 0:
-            return pu.empty_zipped_array(2)
-        plpy.notice('** Query returned with %d rows' % len(result))
-    except plpy.SPIError:
-        plpy.error('Error: areas of interest query failed, check input parameters')
-        plpy.notice('** Query failed: "%s"' % query)
-        plpy.notice('** Error: %s' % plpy.SPIError)
-        return pu.empty_zipped_array(2)
-
-    ## collect attributes
-    numer = pu.get_attributes(result, 1)
-    denom = pu.get_attributes(result, 2)
-
-    weight = pu.get_weight(result, w_type, num_ngbrs)
-
-    ## calculate moran global rate
-    lisa_rate = ps.esda.moran.Moran_Rate(numer, denom, weight,
-                                         permutations=permutations)
-
-    return zip([lisa_rate.I], [lisa_rate.EI])
-
-def moran_local_rate(subquery, numerator, denominator,
-                     permutations, geom_col, id_col, w_type, num_ngbrs):
-    """
-        Moran's I Local Rate
-        Andy Eschbacher
-    """
-    # geometries with values that are null are ignored
-    # resulting in a collection of not as near neighbors
-
-    query = pu.construct_neighbor_query(w_type,
-                                     {"id_col": id_col,
-                                      "numerator": numerator,
-                                      "denominator": denominator,
-                                      "geom_col": geom_col,
-                                      "subquery": subquery,
-                                      "num_ngbrs": num_ngbrs})
-
-    try:
-        result = plpy.execute(query)
-        # if there are no neighbors, exit
-        if len(result) == 0:
-            return pu.empty_zipped_array(5)
-    except plpy.SPIError:
-        plpy.error('Error: areas of interest query failed, check input parameters')
-        plpy.notice('** Query failed: "%s"' % query)
-        plpy.notice('** Error: %s' % plpy.SPIError)
-        return pu.empty_zipped_array(5)
-
-    ## collect attributes
-    numer = pu.get_attributes(result, 1)
-    denom = pu.get_attributes(result, 2)
-
-    weight = pu.get_weight(result, w_type, num_ngbrs)
-
-    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, weight,
-                                          permutations=permutations)
+    lisa = ps.Moran_Local(y, w, permutations=permutations)

    # find units of significance
-    quads = quad_position(lisa.q)
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)

-    return zip(lisa.Is, quads, lisa.p_sim, weight.id_order, lisa.y)
+    plpy.notice('** Finished calculations')

-def moran_local_bv(subquery, attr1, attr2,
-                   permutations, geom_col, id_col, w_type, num_ngbrs):
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+
+
+def moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
    """
-        Moran's I (local) Bivariate (untested)
+    Moran's I Local Rate
+    Andy Eschbacher
    """
+
+    plpy.notice('** Constructing query')
+
+    # geometries with attributes that are null are ignored
+    # resulting in a collection of not as near neighbors
+
+    qvals = {"id_col": id_col,
+             "attr1": numerator,
+             "attr2": denominator,
+             "geom_col": geom_column,
+             "table": t,
+             "num_ngbrs": num_ngbrs}
+
+    q = get_query(w_type, qvals)
+
+    try:
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
+    except plpy.SPIError as err:
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % err)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None], [None])
+
+    ## collect attributes
+    numer = get_attributes(r, 1)
+    denom = get_attributes(r, 2)
+
+    w = get_weight(r, w_type, num_ngbrs)
+
+    # calculate LISA values
+    lisa = ps.esda.moran.Moran_Local_Rate(numer, denom, w, permutations=permutations)
+
+    # find units of significance
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)
+
+    plpy.notice('** Finished calculations')
+
+    ## TODO: Decide on which return values here
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order, lisa.y)
+
+def moran_local_bv(t, attr1, attr2, significance, num_ngbrs, permutations, geom_column, id_col, w_type):
    plpy.notice('** Constructing query')

    qvals = {"num_ngbrs": num_ngbrs,
             "attr1": attr1,
             "attr2": attr2,
-             "subquery": subquery,
-             "geom_col": geom_col,
+             "table": t,
+             "geom_col": geom_column,
             "id_col": id_col}

-    query = pu.construct_neighbor_query(w_type, qvals)
+    q = get_query(w_type, qvals)

    try:
-        result = plpy.execute(query)
-        # if there are no neighbors, exit
-        if len(result) == 0:
-            return pu.empty_zipped_array(4)
+        r = plpy.execute(q)
+        plpy.notice('** Query returned with %d rows' % len(r))
    except plpy.SPIError:
-        plpy.error("Error: areas of interest query failed, " \
-                   "check input parameters")
-        plpy.notice('** Query failed: "%s"' % query)
-        return pu.empty_zipped_array(4)
+        plpy.notice('** Query failed: "%s"' % q)
+        plpy.notice('** Error: %s' % plpy.SPIError)
+        plpy.notice('** Exiting function')
+        return zip([None], [None], [None], [None])

    ## collect attributes
-    attr1_vals = pu.get_attributes(result, 1)
-    attr2_vals = pu.get_attributes(result, 2)
+    attr1_vals = get_attributes(r, 1)
+    attr2_vals = get_attributes(r, 2)

    # create weights
-    weight = pu.get_weight(result, w_type, num_ngbrs)
+    w = get_weight(r, w_type, num_ngbrs)

    # calculate LISA values
-    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, weight,
-                                        permutations=permutations)
+    lisa = ps.esda.moran.Moran_Local_BV(attr1_vals, attr2_vals, w,
+                                        permutations=permutations)

    plpy.notice("len of Is: %d" % len(lisa.Is))

    # find clustering of significance
-    lisa_sig = quad_position(lisa.q)
+    lisa_sig = lisa_sig_vals(lisa.p_sim, lisa.q, significance)

    plpy.notice('** Finished calculations')

-    return zip(lisa.Is, lisa_sig, lisa.p_sim, weight.id_order)
+    return zip(lisa.Is, lisa_sig, lisa.p_sim, w.id_order)
+

 # Low level functions ----------------------------------------

@@ -233,9 +150,7 @@ def map_quads(coord):
        Map a quadrant number to Moran's I designation
        HH=1, LH=2, LL=3, HL=4
        Input:
-        @param coord (int): quadrant of a specific measurement
-        Output:
-            classification (one of 'HH', 'LH', 'LL', or 'HL')
+        :param coord (int): quadrant of a specific measurement
    """
    if coord == 1:
        return 'HH'
@@ -248,13 +163,159 @@ def map_quads(coord):
    else:
        return None

+def query_attr_select(params):
+    """
+        Create portion of SELECT statement for attributes involved in query.
+        :param params: dict of information used in query (column names,
+                       table name, etc.)
+    """
+
+    attrs = [k for k in params
+             if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')]
+
+    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
+
+    attr_string = ""
+
+    for idx, val in enumerate(sorted(attrs)):
+        attr_string += template % {"col": val, "alias_num": idx + 1}
+
+    return attr_string
+
+def query_attr_where(params):
+    """
+        Create portion of WHERE clauses for weeding out NULL-valued geometries
+    """
+    attrs = sorted([k for k in params
+                    if k not in ('id_col', 'geom_col', 'table', 'num_ngbrs')])
+
+    attr_string = []
+
+    for attr in attrs:
+        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
+
+    if len(attrs) == 2:
+        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
+
+    out = " AND ".join(attr_string)
+
+    return out
+
+def knn(params):
+    """SQL query for k-nearest neighbors.
+        :param params: dict of values to fill template
+    """
+
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                              "FROM \"{table}\" As j " \
+                              "WHERE %(attr_where_j)s " \
+                              "ORDER BY j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
+                              "LIMIT {num_ngbrs} OFFSET 1 ) " \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## SQL query for finding queens neighbors (all contiguous polygons)
+def queen(params):
+    """SQL query for queen neighbors.
+        :param params: dict of information to fill query
+    """
+    attr_select = query_attr_select(params)
+    attr_where = query_attr_where(params)
+
+    replacements = {"attr_select": attr_select,
+                    "attr_where_i": attr_where.replace("idx_replace", "i"),
+                    "attr_where_j": attr_where.replace("idx_replace", "j")}
+
+    query = "SELECT " \
+                "i.\"{id_col}\" As id, " \
+                "%(attr_select)s" \
+                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
+                 "FROM \"{table}\" As j " \
+                 "WHERE ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
+                 "%(attr_where_j)s)" \
+                ") As neighbors " \
+            "FROM \"{table}\" As i " \
+            "WHERE " \
+                "%(attr_where_i)s " \
+            "ORDER BY i.\"{id_col}\" ASC;" % replacements
+
+    return query.format(**params)
+
+## to add more weight methods open a ticket or pull request
+
+def get_query(w_type, query_vals):
+    """Return requested query.
+        :param w_type: type of neighbors to calculate (knn or queen)
+        :param query_vals: values used to construct the query
+    """
+
+    if w_type == 'knn':
+        return knn(query_vals)
+    else:
+        return queen(query_vals)
+
+def get_attributes(query_res, attr_num):
+    """
+        :param query_res: query results with attributes and neighbors
+        :param attr_num: attribute number (1, 2, ...)
+    """
+    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=float)
+
+## Build weight object
+def get_weight(query_res, w_type='queen', num_ngbrs=5):
+    """
+        Construct PySAL weight from return value of query
+        :param query_res: query results with attributes and neighbors
+    """
+    if w_type == 'knn':
+        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
+        weights = {x['id']: row_normed_weights for x in query_res}
+    elif w_type == 'queen':
+        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
+                            if len(x['neighbors']) > 0
+                            else [] for x in query_res}
+
+    neighbors = {x['id']: x['neighbors'] for x in query_res}
+
+    return ps.W(neighbors, weights)
+
 def quad_position(quads):
    """
        Produce Moran's I classification based of n
-        Input:
-        @param quads ndarray: an array of quads classified by
-          1-4 (PySAL default)
-        Output:
-        @param list: an array of quads classied by 'HH', 'LL', etc.
    """
-    return [map_quads(q) for q in quads]
+
+    lisa_sig = np.array([map_quads(q) for q in quads])
+
+    return lisa_sig
+
+def lisa_sig_vals(pvals, quads, threshold):
+    """
+        Produce Moran's I classification for each observation, labelling
+        those whose p-value exceeds the threshold as 'Not significant'
+    """
+
+    sig = (pvals <= threshold)
+
+    lisa_sig = np.empty(len(sig), dtype=object)
+
+    for idx, val in enumerate(sig):
+        if val:
+            lisa_sig[idx] = map_quads(quads[idx])
+        else:
+            lisa_sig[idx] = 'Not significant'
+
+    return lisa_sig
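The `knn` and `queen` builders in `moran.py` above fill their SQL in two passes: `%`-interpolation first splices in the attribute fragments, which still contain `{key}` placeholders, and `str.format(**params)` then swaps every placeholder for its value. The following is a minimal sketch of that idea only; `attr_select` and `build_query` are hypothetical, trimmed-down stand-ins (no neighbor subquery, no WHERE clause), not the functions from the diff.

```python
# Hypothetical, simplified illustration of the two-pass templating;
# the real builders also select neighbors and filter NULL attributes.

def attr_select(params):
    """First pass: emit one '{key}' placeholder per attribute key."""
    skip = ('id_col', 'geom_col', 'table', 'num_ngbrs')
    attrs = sorted(k for k in params if k not in skip)
    template = 'i."{%(col)s}"::numeric As attr%(alias_num)s, '
    return ''.join(template % {"col": key, "alias_num": idx + 1}
                   for idx, key in enumerate(attrs))


def build_query(params):
    """Second pass: str.format() replaces every '{key}' with its value."""
    query = ('SELECT %(attr_select)si."{id_col}" As id FROM "{table}" As i;'
             % {"attr_select": attr_select(params)})
    return query.format(**params)


params = {"id_col": "cartodb_id", "geom_col": "the_geom",
          "table": "ppoints", "num_ngbrs": 5, "attr1": "value"}
print(build_query(params))
# SELECT i."value"::numeric As attr1, i."cartodb_id" As id FROM "ppoints" As i;
```

Two things follow from this design. The `attrN` aliases are assigned in sorted key order, so in this sketch keys named `numerator` and `denominator` would map `denominator` to `attr1`. And because the values are interpolated as plain strings, nothing is escaped: table and column names should come from trusted input.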
--- a/src/py/crankshaft/crankshaft/contours/init.py
+++ b/src/py/crankshaft/crankshaft/contours/init.py
@@ -1 +0,0 @@
-from contours import *
--- a/src/py/crankshaft/crankshaft/contours/contours.py
+++ b/src/py/crankshaft/crankshaft/contours/contours.py
@@ -1,58 +0,0 @@
-from scipy.stats import gaussian_kde
-from scipy.interpolate import griddata
-import numpy as np 
-from sklearn.neighbors import KernelDensity
-from skimage.measure import find_contours
-import plpy
-
-def cdb_generate_contours(query, grid_size, bandwidth, levels):
-    plpy.notice('one')
-    data   = plpy.execute( 'select ST_X(the_geom) as x , ST_Y(the_geom) as y from ({0}) as a '.format(query))
-    plpy.notice('two')
-
-    xs = [d['x'] for d in data]
-    ys = [d['y'] for d in data]
-    plpy.notice('three')
-    return generate_contours(xs,ys,grid_size,bandwidth,levels)
-  
-def scale_coord(coord, x_range,y_range,grid_size):
-    plpy.notice('ranges %,  % ', x_range, y_range)
-    return [coord[0]*(x_range[1]-x_range[0])/float(grid_size)+x_range[0],
-            coord[1]*(y_range[1]-y_range[0])/float(grid_size)+y_range[0]]
-    
-def make_wkt(data,x_range, y_range, grid_size):
-    joined = ','.join([' '.join(map(str,scale_coord(coord_pair, x_range, y_range, grid_size))) for coord_pair in data])
-    return '({0})'.format(joined)
-    
-def make_multi_line(data,x_range,y_range, grid_size):
-    joined = ','.join([ make_wkt(ring,x_range,y_range,grid_size)  for ring in data ])
-    return 'MULTILINESTRING({0})'.format(joined)
-
-def generate_contours(xs,ys, grid_res=100, bandwidth=0.001, levels=None):
-    plpy.notice("HERE")
-    max_y, min_y = np.max(ys), np.min(ys)
-    max_x, min_x = np.max(xs), np.min(xs)
-    positions = np.vstack([ys,xs]).T
-    grid_x,grid_y = np.meshgrid(np.linspace(min_x, max_x , grid_res), np.linspace(min_y, max_y, grid_res))
-    xy = np.vstack([grid_y.ravel(), grid_x.ravel()]).T
-    xy *= np.pi / 180.
-
-    plpy.notice(" Generating kernel density")
-    kde = KernelDensity(bandwidth=bandwidth, metric='haversine',
-                        kernel='gaussian', algorithm='ball_tree')
-    kde.fit(positions*np.pi/180.)
-    results = np.exp(kde.score_samples(xy))
-    results = results.reshape((grid_x.shape[0], grid_y.shape[0]))
-    
-    if not levels:
-        levels = np.linspace(results.min(), results.max(),60)
-    plpy.notice(' finding contours')
-    CS = [find_contours(results, level) for level in levels]
-    
-    vertices = []
-    for contours,level in zip(CS,levels):
-        if len(contours)>0:
-            multiline = make_multi_line(contours, (min_x,max_x), (min_y, max_y), grid_res)
-            vertices.append([level, multiline ])
-    plpy.notice('generated vertices retunring ?')
-    return vertices
--- a/src/py/crankshaft/crankshaft/pysal_utils/init.py
+++ b/src/py/crankshaft/crankshaft/pysal_utils/init.py
@@ -1 +0,0 @@
-from pysal_utils import *
--- a/src/py/crankshaft/crankshaft/pysal_utils/pysal_utils.py
+++ b/src/py/crankshaft/crankshaft/pysal_utils/pysal_utils.py
@@ -1,152 +0,0 @@
-"""
-    Utilities module for generic PySAL functionality, mainly centered on translating queries into numpy arrays or PySAL weights objects
-"""
-
-import numpy as np
-import pysal as ps
-
-def construct_neighbor_query(w_type, query_vals):
-    """Return query (a string) used for finding neighbors
-        @param w_type text: type of neighbors to calculate ('knn' or 'queen')
-        @param query_vals dict: values used to construct the query
-    """
-
-    if w_type == 'knn':
-        return knn(query_vals)
-    else:
-        return queen(query_vals)
-
-## Build weight object
-def get_weight(query_res, w_type='knn', num_ngbrs=5):
-    """
-        Construct PySAL weight from return value of query
-        @param query_res: query results with attributes and neighbors
-    """
-    if w_type == 'knn':
-        row_normed_weights = [1.0 / float(num_ngbrs)] * num_ngbrs
-        weights = {x['id']: row_normed_weights for x in query_res}
-    else:
-        weights = {x['id']: [1.0 / len(x['neighbors'])] * len(x['neighbors'])
-                            if len(x['neighbors']) > 0
-                            else [] for x in query_res}
-
-    neighbors = {x['id']: x['neighbors'] for x in query_res}
-
-    return ps.W(neighbors, weights)
-
-def query_attr_select(params):
-    """
-        Create portion of SELECT statement for attributes inolved in query.
-        @param params: dict of information used in query (column names,
-                       table name, etc.)
-    """
-
-    attrs = [k for k in params
-             if k not in ('id_col', 'geom_col', 'subquery', 'num_ngbrs')]
-
-    template = "i.\"{%(col)s}\"::numeric As attr%(alias_num)s, "
-
-    attr_string = ""
-
-    for idx, val in enumerate(sorted(attrs)):
-        attr_string += template % {"col": val, "alias_num": idx + 1}
-
-    return attr_string
-
-def query_attr_where(params):
-    """
-        Create portion of WHERE clauses for weeding out NULL-valued geometries
-    """
-    attrs = sorted([k for k in params
-                    if k not in ('id_col', 'geom_col', 'subquery', 'num_ngbrs')])
-
-    attr_string = []
-
-    for attr in attrs:
-        attr_string.append("idx_replace.\"{%s}\" IS NOT NULL" % attr)
-
-    if len(attrs) == 2:
-        attr_string.append("idx_replace.\"{%s}\" <> 0" % attrs[1])
-
-    out = " AND ".join(attr_string)
-
-    return out
-
-def knn(params):
-    """SQL query for k-nearest neighbors.
-        @param vars: dict of values to fill template
-    """
-
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                              "FROM ({subquery}) As j " \
-                              "WHERE " \
-                                "i.\"{id_col}\" <> j.\"{id_col}\" AND " \
-                                "%(attr_where_j)s " \
-                              "ORDER BY " \
-                                "j.\"{geom_col}\" <-> i.\"{geom_col}\" ASC " \
-                              "LIMIT {num_ngbrs})" \
-                ") As neighbors " \
-            "FROM ({subquery}) As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## SQL query for finding queens neighbors (all contiguous polygons)
-def queen(params):
-    """SQL query for queen neighbors.
-        @param params dict: information to fill query
-    """
-    attr_select = query_attr_select(params)
-    attr_where = query_attr_where(params)
-
-    replacements = {"attr_select": attr_select,
-                    "attr_where_i": attr_where.replace("idx_replace", "i"),
-                    "attr_where_j": attr_where.replace("idx_replace", "j")}
-
-    query = "SELECT " \
-                "i.\"{id_col}\" As id, " \
-                "%(attr_select)s" \
-                "(SELECT ARRAY(SELECT j.\"{id_col}\" " \
-                 "FROM ({subquery}) As j " \
-                 "WHERE i.\"{id_col}\" <> j.\"{id_col}\" AND " \
-                       "ST_Touches(i.\"{geom_col}\", j.\"{geom_col}\") AND " \
-                       "%(attr_where_j)s)" \
-                ") As neighbors " \
-            "FROM ({subquery}) As i " \
-            "WHERE " \
-                "%(attr_where_i)s " \
-            "ORDER BY i.\"{id_col}\" ASC;" % replacements
-
-    return query.format(**params)
-
-## to add more weight methods open a ticket or pull request
-
-def get_attributes(query_res, attr_num=1):
-    """
-        @param query_res: query results with attributes and neighbors
-        @param attr_num: attribute number (1, 2, ...)
-    """
-    return np.array([x['attr' + str(attr_num)] for x in query_res], dtype=np.float)
-
-def empty_zipped_array(num_nones):
-    """
-        prepare return values for cases of empty weights objects (no neighbors)
-        Input:
-        @param num_nones int: number of columns (e.g., 4)
-        Output:
-        [(None, None, None, None)]
-    """
-
-    return [tuple([None] * num_nones)]
--- a/src/py/crankshaft/setup.py
+++ b/src/py/crankshaft/setup.py
@@ -10,7 +10,7 @@ from setuptools import setup, find_packages
 setup(
    name='crankshaft',

-    version='0.0.0',
+    version='0.0.1',

    description='CartoDB Spatial Analysis Python Library',

@@ -40,9 +40,13 @@ setup(

    # The choice of component versions is dictated by what's
    # provisioned in the production servers.
-    install_requires=['pysal==1.9.1'],
+    install_requires=[
+        'numpy>=1.10.4,<2',
+        'scipy>=0.11,<1', # see https://github.com/pysal/pysal/blob/master/requirements.txt
+        'pysal>=1.11.0,<2',
+    ],

-    requires=['pysal', 'numpy', 'sklearn', 'scikit-image'],
+    requires=['pysal', 'numpy'],

    test_suite='test'
 )
--- a/src/py/crankshaft/test/fixtures/moran.json
+++ b/src/py/crankshaft/test/fixtures/moran.json
@@ -1,52 +1,52 @@
 [[0.9319096128346788, "HH"],
 [-1.135787401862846, "HL"],
-[0.11732030672508517, "LL"],
-[0.6152779669180425, "LL"],
-[-0.14657336660125297, "LH"],
-[0.6967858120189607, "LL"],
-[0.07949310115714454, "HH"],
-[0.4703198759258987, "HH"],
-[0.4421125200498064, "HH"],
-[0.5724288737143592, "LL"],
+[0.11732030672508517, "Not significant"],
+[0.6152779669180425, "Not significant"],
+[-0.14657336660125297, "Not significant"],
+[0.6967858120189607, "Not significant"],
+[0.07949310115714454, "Not significant"],
+[0.4703198759258987, "Not significant"],
+[0.4421125200498064, "Not significant"],
+[0.5724288737143592, "Not significant"],
 [0.8970743435692062, "LL"],
-[0.18327334401918674, "LL"],
-[-0.01466729201304962, "HL"],
-[0.3481559372544409, "LL"],
-[0.06547094736902978, "LL"],
+[0.18327334401918674, "Not significant"],
+[-0.01466729201304962, "Not significant"],
+[0.3481559372544409, "Not significant"],
+[0.06547094736902978, "Not significant"],
 [0.15482141569329988, "HH"],
-[0.4373841193538136, "HH"],
-[0.15971286468915544, "LL"],
-[1.0543588860308968, "HH"],
+[0.4373841193538136, "Not significant"],
+[0.15971286468915544, "Not significant"],
+[1.0543588860308968, "Not significant"],
 [1.7372866900020818, "HH"],
 [1.091998586053999, "LL"],
-[0.1171572584252222, "HH"],
-[0.08438455015300014, "LL"],
-[0.06547094736902978, "LL"],
+[0.1171572584252222, "Not significant"],
+[0.08438455015300014, "Not significant"],
+[0.06547094736902978, "Not significant"],
 [0.15482141569329985, "HH"],
 [1.1627044812890683, "HH"],
-[0.06547094736902978, "LL"],
-[0.795275137550483, "HH"],
+[0.06547094736902978, "Not significant"],
+[0.795275137550483, "Not significant"],
 [0.18562939195219, "LL"],
-[0.3010757406693439, "LL"],
+[0.3010757406693439, "Not significant"],
 [2.8205795942839376, "HH"],
-[0.11259190602909264, "LL"],
-[-0.07116352791516614, "HL"],
-[-0.09945240794119009, "LH"],
+[0.11259190602909264, "Not significant"],
+[-0.07116352791516614, "Not significant"],
+[-0.09945240794119009, "Not significant"],
 [0.18562939195219, "LL"],
-[0.1832733440191868, "LL"],
-[-0.39054253768447705, "HL"],
+[0.1832733440191868, "Not significant"],
+[-0.39054253768447705, "Not significant"],
 [-0.1672071289487642, "HL"],
-[0.3337669247916343, "HH"],
-[0.2584386102554792, "HH"],
+[0.3337669247916343, "Not significant"],
+[0.2584386102554792, "Not significant"],
 [-0.19733845476322634, "HL"],
 [-0.9379282899805409, "LH"],
-[-0.028770969951095866, "LH"],
-[0.051367269430983485, "LL"],
+[-0.028770969951095866, "Not significant"],
+[0.051367269430983485, "Not significant"],
 [-0.2172548045913472, "LH"],
-[0.05136726943098351, "LL"],
-[0.04191046803899837, "LL"],
+[0.05136726943098351, "Not significant"],
+[0.04191046803899837, "Not significant"],
 [0.7482357030403517, "HH"],
-[-0.014585767863118111, "LH"],
-[0.5410013139159929, "HH"],
+[-0.014585767863118111, "Not significant"],
+[0.5410013139159929, "Not significant"],
 [1.0223932668429925, "LL"],
-[1.4179402898927476, "LL"]]
+[1.4179402898927476, "LL"]]
--- a/src/py/crankshaft/test/test_clustering_moran.py
+++ b/src/py/crankshaft/test/test_clustering_moran.py
@@ -1,6 +1,8 @@
 import unittest
 import numpy as np

+import unittest
+

 # from mock_plpy import MockPlPy
 # plpy = MockPlPy()
@@ -10,26 +12,25 @@ import numpy as np
 from helper import plpy, fixture_file

 import crankshaft.clustering as cc
-import crankshaft.pysal_utils as pu
 from crankshaft import random_seeds
 import json

 class MoranTest(unittest.TestCase):
-    """Testing class for Moran's I functions"""
+    """Testing class for Moran's I functions."""

    def setUp(self):
        plpy._reset()
        self.params = {"id_col": "cartodb_id",
                       "attr1": "andy",
                       "attr2": "jay_z",
-                       "subquery": "SELECT * FROM a_list",
+                       "table": "a_list",
                       "geom_col": "the_geom",
                       "num_ngbrs": 321}
        self.neighbors_data = json.loads(open(fixture_file('neighbors.json')).read())
        self.moran_data = json.loads(open(fixture_file('moran.json')).read())

    def test_map_quads(self):
-        """Test map_quads"""
+        """Test map_quads."""
        self.assertEqual(cc.map_quads(1), 'HH')
        self.assertEqual(cc.map_quads(2), 'LH')
        self.assertEqual(cc.map_quads(3), 'LL')
@@ -37,8 +38,80 @@ class MoranTest(unittest.TestCase):
        self.assertEqual(cc.map_quads(33), None)
        self.assertEqual(cc.map_quads('andy'), None)

+    def test_query_attr_select(self):
+        """Test query_attr_select."""
+
+        ans = "i.\"{attr1}\"::numeric As attr1, " \
+              "i.\"{attr2}\"::numeric As attr2, "
+
+        self.assertEqual(cc.query_attr_select(self.params), ans)
+
+    def test_query_attr_where(self):
+        """Test query_attr_where."""
+
+        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" IS NOT NULL AND "\
+              "idx_replace.\"{attr2}\" <> 0"
+
+        self.assertEqual(cc.query_attr_where(self.params), ans)
+
+    def test_knn(self):
+        """Test knn function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT j.\"cartodb_id\" " \
+              "FROM \"a_list\" As j WHERE j.\"andy\" IS NOT NULL AND " \
+              "j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 ORDER BY " \
+              "j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 OFFSET 1 ) ) " \
+              "As neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT " \
+              "NULL AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER " \
+              "BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.knn(self.params), ans)
+
+    def test_queen(self):
+        """Test queen neighbors function."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE ST_Touches(" \
+              "i.\"the_geom\", j.\"the_geom\") AND j.\"andy\" IS NOT NULL " \
+              "AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0)) As " \
+              "neighbors FROM \"a_list\" As i WHERE i.\"andy\" IS NOT NULL " \
+              "AND i.\"jay_z\" IS NOT NULL AND i.\"jay_z\" <> 0 ORDER BY " \
+              "i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.queen(self.params), ans)
+
+    def test_get_query(self):
+        """Test get_query."""
+
+        ans = "SELECT i.\"cartodb_id\" As id, i.\"andy\"::numeric As attr1, " \
+              "i.\"jay_z\"::numeric As attr2, (SELECT ARRAY(SELECT " \
+              "j.\"cartodb_id\" FROM \"a_list\" As j WHERE j.\"andy\" IS " \
+              "NOT NULL AND j.\"jay_z\" IS NOT NULL AND j.\"jay_z\" <> 0 " \
+              "ORDER BY j.\"the_geom\" <-> i.\"the_geom\" ASC LIMIT 321 " \
+              "OFFSET 1 ) ) As neighbors FROM \"a_list\" As i WHERE " \
+              "i.\"andy\" IS NOT NULL AND i.\"jay_z\" IS NOT NULL AND " \
+              "i.\"jay_z\" <> 0 ORDER BY i.\"cartodb_id\" ASC;"
+
+        self.assertEqual(cc.get_query('knn', self.params), ans)
+
+    def test_get_attributes(self):
+        """Test get_attributes."""
+
+        ## need to add tests
+
+        self.assertEqual(True, True)
+
+    def test_get_weight(self):
+        """Test get_weight."""
+
+        self.assertEqual(True, True)
+
+
    def test_quad_position(self):
-        """Test lisa_sig_vals"""
+        """Test lisa_sig_vals."""

        quads = np.array([1, 2, 3, 4], np.int)

@@ -52,7 +125,7 @@ class MoranTest(unittest.TestCase):
        data = [ { 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
        plpy._define_result('select', data)
        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local('subquery', 'value', 99, 'the_geom', 'cartodb_id', 'knn', 5)
+        result = cc.moran_local('table', 'value', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
        result = [(row[0], row[1]) for row in result]
        expected = self.moran_data
        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
@@ -64,20 +137,8 @@ class MoranTest(unittest.TestCase):
        data = [ { 'id': d['id'], 'attr1': d['value'], 'attr2': 1, 'neighbors': d['neighbors'] } for d in self.neighbors_data]
        plpy._define_result('select', data)
        random_seeds.set_random_seeds(1234)
-        result = cc.moran_local_rate('subquery', 'numerator', 'denominator', 99, 'the_geom', 'cartodb_id', 'knn', 5)
-        print 'result == None? ', result == None
+        result = cc.moran_local_rate('table', 'numerator', 'denominator', 0.05, 5, 99, 'the_geom', 'cartodb_id', 'knn')
        result = [(row[0], row[1]) for row in result]
        expected = self.moran_data
        for ([res_val, res_quad], [exp_val, exp_quad]) in zip(result, expected):
            self.assertAlmostEqual(res_val, exp_val)
-
-    def test_moran(self):
-        """Test Moran's I global"""
-        data = [{ 'id': d['id'], 'attr1': d['value'], 'neighbors': d['neighbors'] } for d in self.neighbors_data]
-        plpy._define_result('select', data)
-        random_seeds.set_random_seeds(1235)
-        result = cc.moran('table', 'value', 99, 'the_geom', 'cartodb_id', 'knn', 5)
-        print 'result == None?', result == None
-        result_moran = result[0][0]
-        expected_moran = np.array([row[0] for row in self.moran_data]).mean()
-        self.assertAlmostEqual(expected_moran, result_moran, delta=10e-2)
--- a/src/py/crankshaft/test/test_pysal_utils.py
+++ b/src/py/crankshaft/test/test_pysal_utils.py
@@ -1,107 +0,0 @@
-import unittest
-
-import crankshaft.pysal_utils as pu
-from crankshaft import random_seeds
-
-
-class PysalUtilsTest(unittest.TestCase):
-    """Testing class for utility functions related to PySAL integrations"""
-
-    def setUp(self):
-        self.params = {"id_col": "cartodb_id",
-                       "attr1": "andy",
-                       "attr2": "jay_z",
-                       "subquery": "SELECT * FROM a_list",
-                       "geom_col": "the_geom",
-                       "num_ngbrs": 321}
-
-    def test_query_attr_select(self):
-        """Test query_attr_select"""
-
-        ans = "i.\"{attr1}\"::numeric As attr1, " \
-              "i.\"{attr2}\"::numeric As attr2, "
-
-        self.assertEqual(pu.query_attr_select(self.params), ans)
-
-    def test_query_attr_where(self):
-        """Test pu.query_attr_where"""
-
-        ans = "idx_replace.\"{attr1}\" IS NOT NULL AND " \
-              "idx_replace.\"{attr2}\" IS NOT NULL AND " \
-              "idx_replace.\"{attr2}\" <> 0"
-
-        self.assertEqual(pu.query_attr_where(self.params), ans)
-
-    def test_knn(self):
-        """Test knn neighbors constructor"""
-
-        ans = "SELECT i.\"cartodb_id\" As id, " \
-                     "i.\"andy\"::numeric As attr1, " \
-                     "i.\"jay_z\"::numeric As attr2, " \
-                     "(SELECT ARRAY(SELECT j.\"cartodb_id\" " \
-                                   "FROM (SELECT * FROM a_list) As j " \
-                                   "WHERE " \
-                                    "i.\"cartodb_id\" <> j.\"cartodb_id\" AND " \
-                                    "j.\"andy\" IS NOT NULL AND " \
-                                    "j.\"jay_z\" IS NOT NULL AND " \
-                                    "j.\"jay_z\" <> 0 " \
-                                   "ORDER BY " \
-                                    "j.\"the_geom\" <-> i.\"the_geom\" ASC " \
-                      "LIMIT 321)) As neighbors " \
-              "FROM (SELECT * FROM a_list) As i " \
-              "WHERE i.\"andy\" IS NOT NULL AND " \
-                    "i.\"jay_z\" IS NOT NULL AND " \
-                    "i.\"jay_z\" <> 0 " \
-              "ORDER BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(pu.knn(self.params), ans)
-
-    def test_queen(self):
-        """Test queen neighbors constructor"""
-
-        ans = "SELECT i.\"cartodb_id\" As id, " \
-                     "i.\"andy\"::numeric As attr1, " \
-                     "i.\"jay_z\"::numeric As attr2, " \
-                     "(SELECT ARRAY(SELECT j.\"cartodb_id\" " \
-                                   "FROM (SELECT * FROM a_list) As j " \
-                                   "WHERE " \
-                                   "i.\"cartodb_id\" <> j.\"cartodb_id\" AND " \
-                                   "ST_Touches(i.\"the_geom\", " \
-                                              "j.\"the_geom\") AND " \
-                                   "j.\"andy\" IS NOT NULL AND " \
-                                   "j.\"jay_z\" IS NOT NULL AND " \
-                                   "j.\"jay_z\" <> 0)" \
-                                  ") As neighbors " \
-              "FROM (SELECT * FROM a_list) As i " \
-              "WHERE i.\"andy\" IS NOT NULL AND " \
-                    "i.\"jay_z\" IS NOT NULL AND " \
-                    "i.\"jay_z\" <> 0 " \
-              "ORDER BY i.\"cartodb_id\" ASC;"
-
-        self.assertEqual(pu.queen(self.params), ans)
-
-    def test_construct_neighbor_query(self):
-        """Test construct_neighbor_query"""
-
-        # Compare to raw knn query
-        self.assertEqual(pu.construct_neighbor_query('knn', self.params),
-                         pu.knn(self.params))
-
-    def test_get_attributes(self):
-        """Test get_attributes"""
-
-        ## need to add tests
-
-        self.assertEqual(True, True)
-
-    def test_get_weight(self):
-        """Test get_weight"""
-
-        self.assertEqual(True, True)
-
-    def test_empty_zipped_array(self):
-        """Test empty_zipped_array"""
-        ans2 = [(None, None)]
-        ans4 = [(None, None, None, None)]
-        self.assertEqual(pu.empty_zipped_array(2), ans2)
-        self.assertEqual(pu.empty_zipped_array(4), ans4)
Author	SHA1	Message	Date
Javier Goizueta	8dcf420437	Fix typo	2016-03-10 10:11:11 +01:00
Rafa de la Torre	7cd15885dc	Add info about python dependencies	2016-03-09 18:51:04 +01:00
Rafa de la Torre	d766001bf4	Constraint version numbers of reqs a little	2016-03-09 17:45:50 +01:00
Rafa de la Torre	e76eb0c56f	Define install dependencies in order. I don't know if that actually affects the result, but just in case.	2016-03-09 14:40:02 +01:00
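
The `setup.py` hunk above replaces the exact pin `pysal==1.9.1` with bounded ranges such as `numpy>=1.10.4,<2`. As a rough illustration of what such a range admits, here is a minimal sketch that checks plain dotted numeric versions against a comma-separated spec. The helpers `parse_version` and `satisfies` are hypothetical names introduced for this example, not part of crankshaft or setuptools. Real installers follow the fuller PEP 440 rules (pre-releases, wildcards, and so on), which this sketch does not attempt to cover.

```python
import operator

# Comparison operators supported by this sketch (two-character symbols first,
# so that ">=" is not mistaken for ">").
_OPS = [(">=", operator.ge), ("<=", operator.le), ("==", operator.eq),
        (">", operator.gt), ("<", operator.lt)]


def parse_version(text):
    """Turn '1.10.4' into (1, 10, 4). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(version, length):
    """Right-pad a version tuple with zeros so '2' compares like '2.0.0'."""
    return version + (0,) * (length - len(version))


def satisfies(version, spec):
    """Return True if `version` meets every clause of a spec like '>=1.10.4,<2'."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                width = max(len(candidate), len(bound))
                if not compare(_pad(candidate, width), _pad(bound, width)):
                    return False
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return True


if __name__ == "__main__":
    # Ranges copied from the install_requires list in the diff above.
    print(satisfies("1.10.4", ">=1.10.4,<2"))   # numpy lower bound is inclusive
    print(satisfies("1.9.1", ">=1.11.0,<2"))    # the previously pinned pysal
    print(satisfies("2.0.0", ">=1.10.4,<2"))    # upper bound is exclusive
```

Under these assumptions, the previously pinned `pysal` 1.9.1 falls outside the new `>=1.11.0,<2` range, while any 1.x release from 1.11.0 onwards is accepted.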