Compare commits
10 Commits
add-new-de ... new-versio

Commits (SHA1):

- 0206cc6c44
- b754ffe42a
- 0056f411b5
- 1810f02242
- 8e972128eb
- cdd2d9e722
- 46c66476b5
- e03aac4d8f
- d885c16db2
- abfda1c75e
@@ -1,84 +0,0 @@
# Contributing guide

## How to add new functions

Try to put as little logic in the SQL extension as possible and
just use it as a wrapper to the Python module functionality.

Once a function is defined it should never change its signature in subsequent
versions. To change a function's signature a new function with a different
name must be created.

### Version numbers

The version of both the SQL extension and the Python package shall
follow the [Semantic Versioning 2.0](http://semver.org/) guidelines:

* When backwards incompatibility is introduced the major number is incremented
* When functionality is added (in a backwards-compatible manner) the minor number
  is incremented
* When only fixes are introduced (backwards-compatible) the patch number is
  incremented

### Python Package

...

### SQL Extension

* Generate a **new subfolder version** for the `sql` and `test` folders to define
  the new functions and tests
  - Use symlinks to avoid file duplication between versions that don't update them
  - Add new files or modify copies of the old files to add new functions or
    modify existing functions (remember to rename a function if the signature
    changes)
  - Add or modify the corresponding documentation files in the `doc` folder.
    Since we expect to have highly technical functions here, an extensive
    background explanation would be of great help to users of this extension.
  - Create tests for the new functions/behaviour

* Generate the **upgrade and downgrade files** for the extension

* Update the control file and the Makefile to generate the complete SQL
  file for the newly created version. After running `make` a new
  file `crankshaft--X.Y.Z.sql` will be created for the current version.
  Additional files for migrating to/from the previous version A.B.C should be
  created:
  - `crankshaft--X.Y.Z--A.B.C.sql`
  - `crankshaft--A.B.C--X.Y.Z.sql`
  All these new files must be added to git and pushed.

* Update the public docs! ;-)

## Conventions

### SQL

Use snake case (i.e. `snake_case` and not `CamelCase`) for all
functions. Prefix functions intended for public use with `cdb_`
and private functions (to be used only internally inside
the extension) with `_cdb_`.

### Python

...

## Testing

Running just the Python tests:

```
(cd python && make test)
```

Installing the Extension and running just the PostgreSQL tests:

```
(cd pg && sudo make install && PGUSER=postgres make installcheck)
```

Installing and testing everything:

```
sudo make install && PGUSER=postgres make testinstalled
```
Makefile (4 changed lines)
@@ -1,5 +1,5 @@
EXT_DIR = pg
PYP_DIR = python
EXT_DIR = src/pg
PYP_DIR = src/py

.PHONY: install
.PHONY: run_tests
README.md (96 changed lines)
@@ -4,9 +4,99 @@ CartoDB Spatial Analysis extension for PostgreSQL.

## Code organization

* *pg* contains the PostgreSQL extension source code
* *python* Python module
* *doc* documentation
* *src* source code
  - *src/pg* contains the PostgreSQL extension source code
  - *src/py* Python module source code
* *release* released versions

## Requirements

* pip
* pip, virtualenv, PostgreSQL
* python-scipy system package (see src/py/README.md)

# Working Process

## Development

Work in `src/pg/sql`, `src/py/crankshaft`;
use a topic branch. See src/py/README.md
for the procedure to work with the Python local environment.

Take into account:

* Always remember to add tests for any new functionality and documentation.
* Add or modify the corresponding documentation files in the `doc` folder.
  Since we expect to have highly technical functions here, an extensive
  background explanation would be of great help to users of this extension.
* Convention: Use snake case (i.e. `snake_case` and not `CamelCase`) for all
  functions. Prefix functions intended for public use with `cdb_`
  and private functions (to be used only internally inside
  the extension) with `_cdb_` (a naming sketch follows this list).
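To illustrate the naming convention, a minimal sketch (the function names below are made up for illustration and are not part of the extension):

```sql
-- Public function: snake_case with the cdb_ prefix (hypothetical name)
CREATE OR REPLACE FUNCTION cdb_example_stat(table_name TEXT)
RETURNS numeric AS $$
    SELECT 1.0::numeric;  -- placeholder body
$$ LANGUAGE sql;

-- Internal helper: the _cdb_ prefix marks it as private to the extension (hypothetical name)
CREATE OR REPLACE FUNCTION _cdb_example_helper(input_value numeric)
RETURNS numeric AS $$
    SELECT input_value * 2;
$$ LANGUAGE sql;
```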
Update the local installation with `sudo make install`
(this will update the 'dev' version of the extension in 'src/pg/').

Run the tests with `PGUSER=postgres make test`.

Update the extension in a working database with:

* `ALTER EXTENSION crankshaft UPDATE TO 'current';`
  `ALTER EXTENSION crankshaft UPDATE TO 'dev';`

Note: we always keep the current development version installed as 'dev';
we update through the 'current' alias to allow changing the extension
contents but not the version identifier. This will fail if the
changes involve incompatible function changes such as a different
return type; in that case the offending function (or the whole extension)
should be dropped manually before the update.
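For instance (a sketch only, with a hypothetical function name and signature), if a function's return type changed, the old definition has to be detached and dropped before retrying the two-step update:

```sql
-- Hypothetical example: detach and drop the definition whose return type changed,
-- then update through the 'current' alias and back to 'dev'.
ALTER EXTENSION crankshaft DROP FUNCTION cdb_example_stat(text);
DROP FUNCTION cdb_example_stat(text);
ALTER EXTENSION crankshaft UPDATE TO 'current';
ALTER EXTENSION crankshaft UPDATE TO 'dev';
```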
If the extension has not previously been installed in a database
we can:

* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`

Once the tests are passing, a new Pull Request can be created.
The CI tests must be checked to be successful.

Before merging a topic branch, peer review of the code is a must.


## Release

The release process of a new version of the extension
shall be performed by the designated *Release Manager*.

Note that we expect to gradually automate this process.

Having checked out the topic branch of the PR to be released:

The version number in `src/pg/crankshaft.control` must first be updated.
To do so, follow the [Semantic Versioning 2.0](http://semver.org/) guidelines.

We will now explain the process for the case of backwards-compatible
releases (updating the minor or patch version numbers).

TODO: document the complex case of major releases.

The next command must be executed to produce the main installation
script for the new release, `release/crankshaft--X.Y.Z.sql`:

```
make release
```

Then, the release manager shall produce the upgrade and downgrade scripts
to migrate to/from the previous release. In the case of minor/patch
releases this simply consists of extracting the functions that have changed
and placing them in the proper `release/crankshaft--X.Y.Z--A.B.C.sql`
file.
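As a sketch of what such a file could contain (the version numbers and the function below are hypothetical), the upgrade script only redefines the functions that changed between the two releases:

```sql
-- Hypothetical release/crankshaft--0.0.1--0.0.2.sql: only the functions
-- whose definition changed between 0.0.1 and 0.0.2 are included.
CREATE OR REPLACE FUNCTION cdb_example_stat(table_name TEXT)
RETURNS numeric AS $$
    SELECT 2.0::numeric;  -- body updated in 0.0.2
$$ LANGUAGE sql;
```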
TODO: configure the local environment to be used by the release;
currently it should be the directory `src/py/X.Y.Z`, but this must be fixed;
a possibility to explore is to use the `cdb_conf` table.

TODO: testing procedure for the new release.

TODO: push, merge, tag, deploy procedures.
pg/.gitignore (deleted, 3 lines)
@@ -1,3 +0,0 @@
regression.diffs
regression.out
results/
pg/Makefile (deleted, 30 lines)
@@ -1,30 +0,0 @@
# Makefile to generate the extension out of separate sql source files.
# Once a version is released, it is not meant to be changed. E.g: once version 0.0.1 is out, it SHALL NOT be changed.

EXTENSION = crankshaft
EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")

# The new version to be generated from templates
NEW_EXTENSION_ARTIFACT = $(EXTENSION)--$(EXTVERSION).sql

# DATA is a special variable used by postgres build infrastructure
# These are the files to be installed in the server shared dir,
# for installation from scratch, upgrades and downgrades.
# @see http://www.postgresql.org/docs/current/static/extend-pgxs.html
DATA = $(NEW_EXTENSION_ARTIFACT)

SOURCES_DATA_DIR = sql/$(EXTVERSION)
SOURCES_DATA = $(wildcard sql/$(EXTVERSION)/*.sql)

# The extension installation artifacts are stored in the base subdirectory
$(NEW_EXTENSION_ARTIFACT): $(SOURCES_DATA)
	rm -f $@
	cat $(SOURCES_DATA_DIR)/*.sql >> $@

REGRESS = $(notdir $(basename $(wildcard test/$(EXTVERSION)/sql/*test.sql)))
TEST_DIR = test/$(EXTVERSION)
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
@@ -1,148 +0,0 @@
--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
-- Internal function.
-- Set the seeds of the RNGs (Random Number Generators)
-- used internally.
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
  from crankshaft import random_seeds
  random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
-- Moran's I
CREATE OR REPLACE FUNCTION
  cdb_moran_local (
    t TEXT,
    attr TEXT,
    significance float DEFAULT 0.05,
    num_ngbrs INT DEFAULT 5,
    permutations INT DEFAULT 99,
    geom_column TEXT DEFAULT 'the_geom',
    id_col TEXT DEFAULT 'cartodb_id',
    w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
  from crankshaft.clustering import moran_local
  # TODO: use named parameters or a dictionary
  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;

-- Moran's I Local Rate
CREATE OR REPLACE FUNCTION
  cdb_moran_local_rate(t TEXT,
    numerator TEXT,
    denominator TEXT,
    significance FLOAT DEFAULT 0.05,
    num_ngbrs INT DEFAULT 5,
    permutations INT DEFAULT 99,
    geom_column TEXT DEFAULT 'the_geom',
    id_col TEXT DEFAULT 'cartodb_id',
    w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
  from crankshaft.clustering import moran_local_rate
  # TODO: use named parameters or a dictionary
  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
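A hypothetical usage sketch (the table and column names below are made up); the remaining parameters keep their defaults:

```sql
-- Local Moran's I for the 'price' column of a 'listings' table, using the
-- default significance, number of neighbours, permutations and weight type.
SELECT * FROM cdb_moran_local('listings', 'price');

-- Rate variant: numerator and denominator columns.
SELECT * FROM cdb_moran_local_rate('listings', 'sales', 'visits');
```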
-- Function by Stuart Lynn for a simple interpolation of a value
-- from a polygon table over an arbitrary polygon
-- (weighted by the area proportion overlapped)
-- Areal weighting is a very simple form of areal interpolation.
--
-- Parameters:
--   * geom a Polygon geometry which defines the area where a value will be
--     estimated as the area-weighted sum of a given table/column
--   * target_table_name table name of the table that provides the values
--   * target_column column name of the column that provides the values
--   * schema_name optional parameter to define the schema the target table
--     belongs to, which is necessary if it's not in the search_path.
--     Note that target_table_name should never include the schema in it.
-- Return value:
--   Areal-weighted interpolation of the column values over the geometry
CREATE OR REPLACE
FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
RETURNS numeric AS
$$
DECLARE
  result numeric;
  qualified_name text;
BEGIN
  IF schema_name IS NULL THEN
    qualified_name := Format('%I', target_table_name);
  ELSE
    qualified_name := Format('%I.%s', schema_name, target_table_name);
  END IF;
  EXECUTE Format('
    SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
    FROM %s AS a
    WHERE $1 && a.the_geom
  ', target_column, qualified_name)
  USING geom
  INTO result;
  RETURN result;
END;
$$ LANGUAGE plpgsql;
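A hypothetical usage sketch (the table, column and coordinates below are made up):

```sql
-- Area-weighted estimate of the 'population' column of 'census_polygons'
-- over an arbitrary query polygon.
SELECT cdb_overlap_sum(
    ST_SetSRID(ST_MakeEnvelope(-3.80, 40.30, -3.60, 40.50), 4326),
    'census_polygons',
    'population'
);
```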
--
-- Creates N points randomly distributed around the polygon
--
-- @param geom - the geometry to be turned into points
--
-- @param no_points - the number of points to generate
--
-- @param max_iter_per_point - the function generates points in the polygon's bounding box
-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
-- misses per point the function accepts before giving up.
--
-- Returns: Multipoint with the requested points
CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry, no_points Integer, max_iter_per_point Integer DEFAULT 1000)
RETURNS GEOMETRY AS $$
DECLARE
  extent GEOMETRY;
  test_point Geometry;
  width NUMERIC;
  height NUMERIC;
  x0 NUMERIC;
  y0 NUMERIC;
  xp NUMERIC;
  yp NUMERIC;
  no_left INTEGER;
  remaining_iterations INTEGER;
  points GEOMETRY[];
  bbox_line GEOMETRY;
  intersection_line GEOMETRY;
BEGIN
  extent := ST_Envelope(geom);
  width := ST_XMax(extent) - ST_XMIN(extent);
  height := ST_YMax(extent) - ST_YMIN(extent);
  x0 := ST_XMin(extent);
  y0 := ST_YMin(extent);
  no_left := no_points;

  LOOP
    IF (no_left = 0) THEN
      EXIT;
    END IF;
    yp = y0 + height*random();
    bbox_line = ST_MakeLine(
      ST_SetSRID(ST_MakePoint(yp, x0), 4326),
      ST_SetSRID(ST_MakePoint(yp, x0+width), 4326)
    );
    intersection_line = ST_Intersection(bbox_line, geom);
    test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)), random());
    points := points || test_point;
    no_left = no_left - 1;
  END LOOP;
  RETURN ST_Collect(points);
END;
$$
LANGUAGE plpgsql VOLATILE;
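A hypothetical usage sketch (the table name is made up): generate a multipoint of 100 random points inside each polygon of a table.

```sql
-- One multipoint of 100 random points per input polygon.
SELECT cartodb_id, cdb_dot_density(the_geom, 100) AS points
FROM my_polygons;
```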
-- Make sure by default there are no permissions for publicuser
-- NOTE: this happens at extension creation time, as part of an implicit transaction.
-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;

-- Grant permissions on the schema to publicuser (but just the schema)
GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;

-- Revoke execute permissions on all functions in the schema by default
-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;
@@ -1,6 +0,0 @@
-- Install dependencies
CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
@@ -1,11 +0,0 @@
# Install the package (needs root privileges)
install:
	pip install ./crankshaft --upgrade

# Test from source code
test:
	(cd crankshaft && nosetests test/)

# Test currently installed package
testinstalled:
	nosetests crankshaft/test/
@@ -1,9 +0,0 @@
# Crankshaft Python Package

...

### Run the tests

```bash
cd crankshaft
nosetests test/
```
src/pg/.gitignore (new file, 6 lines)
@@ -0,0 +1,6 @@
regression.diffs
regression.out
results/
crankshaft--dev.sql
crankshaft--dev--current.sql
crankshaft--current--dev.sql
src/pg/Makefile (new file, 48 lines)
@@ -0,0 +1,48 @@
# Generation of a new development version 'dev' (with an alias 'current' for
# updating easily by upgrading to 'current', then 'dev')

# sudo make install -- generate the 'dev' version from current source
#                      and make it available to PostgreSQL
# PGUSER=postgres make installcheck -- test the 'dev' extension

SED = sed

EXTENSION = crankshaft

DATA = $(EXTENSION)--dev.sql \
       $(EXTENSION)--current--dev.sql \
       $(EXTENSION)--dev--current.sql

SOURCES_DATA_DIR = sql
SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql)

VIRTUALENV_PATH = $(realpath ../py/)
ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH))

REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \
               -e 's/@@VIRTUALENV_PATH@@/$(ESC_VIRVIRTUALENV_PATH)/g'

$(DATA): $(SOURCES_DATA)
	$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > $@

TEST_DIR = test
REGRESS = $(notdir $(basename $(wildcard $(TEST_DIR)/sql/*test.sql)))
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

# This seems to be needed at least for PG 9.3.11
all: $(DATA)


# WIP: goals for releasing the extension...

EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")

../release/$(EXTENSION).control: $(EXTENSION).control
	cp $< $@

release: ../release/$(EXTENSION).control
	cp $(EXTENSION)--dev.sql $(EXTENSION)--$(EXTVERSION).sql
src/pg/sql/01_version.sql (new file, 12 lines)
@@ -0,0 +1,12 @@
-- Version number of the extension release
CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
RETURNS text AS $$
  SELECT '@@VERSION@@'::text;
$$ language 'sql' IMMUTABLE STRICT;

-- Internal identifier of the installed extension instance
-- e.g. 'dev' for current development version
CREATE OR REPLACE FUNCTION cdb_crankshaft_internal_version()
RETURNS text AS $$
  SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
$$ language 'sql' IMMUTABLE STRICT;
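For illustration, the `REPLACEMENTS` rule in `src/pg/Makefile` runs `sed` over these sources, so the installed file carries the resolved version string instead of the placeholder; assuming a hypothetical `0.0.1` in the control file, the generated function would read roughly:

```sql
-- Sketch of the generated output: '@@VERSION@@' replaced by the value taken
-- from the control file ('0.0.1' here is only an assumed example).
CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
RETURNS text AS $$
  SELECT '0.0.1'::text;
$$ language 'sql' IMMUTABLE STRICT;
```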
src/pg/sql/02_py.sql (new file, 23 lines)
@@ -0,0 +1,23 @@
CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
RETURNS text
AS $$
BEGIN
  -- RETURN '/opt/virtualenvs/crankshaft';
  RETURN '@@VIRTUALENV_PATH@@';
END;
$$ language plpgsql IMMUTABLE STRICT;

-- Use the crankshaft python module
CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
RETURNS VOID
AS $$
  import os
  # plpy.notice('%',str(os.environ))
  # activate virtualenv
  crankshaft_version = plpy.execute('SELECT cdb_crankshaft.cdb_crankshaft_internal_version()')[0]['cdb_crankshaft_internal_version']
  base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
  default_venv_path = os.path.join(base_path, crankshaft_version)
  venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path)
  activate_path = venv_path + '/bin/activate_this.py'
  exec(open(activate_path).read(), dict(__file__=activate_path))
$$ LANGUAGE plpythonu;
@@ -4,6 +4,7 @@
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft import random_seeds
  random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
@@ -11,6 +11,7 @@ CREATE OR REPLACE FUNCTION
    w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft.clustering import moran_local
  # TODO: use named parameters or a dictionary
  return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
@@ -29,6 +30,7 @@ CREATE OR REPLACE FUNCTION
    w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
  plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
  from crankshaft.clustering import moran_local_rate
  # TODO: use named parameters or a dictionary
  return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
@@ -3,4 +3,4 @@ CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';
@@ -4,4 +4,4 @@ CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;

-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';
1
python/.gitignore → src/py/.gitignore
vendored
1
python/.gitignore → src/py/.gitignore
vendored
@@ -1 +1,2 @@
*.pyc
dev/
src/py/Makefile (new file, 9 lines)
@@ -0,0 +1,9 @@
# Install the package locally for development
install:
	virtualenv --system-site-packages dev
	./dev/bin/pip install -I ./crankshaft
	./dev/bin/pip install -I nose

# Test development install
testinstalled:
	./dev/bin/nosetests crankshaft/test/
src/py/README.md (new file, 130 lines)
@@ -0,0 +1,130 @@
# Crankshaft Python Package

...

### Run the tests

```bash
cd crankshaft
nosetests test/
```

## Notes about python dependencies

* This extension is targeted at production databases. Therefore certain restrictions must be assumed about the production environment vs other experimental environments.
* We're using `pip` and `virtualenv` to generate a suitable isolated environment for the python code that has all the dependencies.
* Every dependency should be:
  - Added to the `setup.py` file
  - Installed through it
  - Tested, when they have a test suite
  - Fixed in the `requirements.txt`
* At present we use Python version 2.7.3

---

We have two possible approaches being considered as to how to manage
the Python virtual environment: using a pure virtual environment,
or combining it with some system packages that include dependencies
for the *hard-to-compile* packages (and pin them at somewhat old versions).

### Alternative A: pure virtual environment

In this case we will install all the packages needed in the
virtual environment.
This will involve, especially for the numerical packages, compiling
and linking code that uses a number of third-party libraries,
and requires having these dependencies resolved for the production
environments.

#### Create and use a virtual env

We'll use a virtual environment directory `dev`
under the `src/py` directory.

    # Create the virtual environment for python
    $ virtualenv dev

    # Activate the virtualenv
    $ source dev/bin/activate

    # Install all the requirements
    # expect this to take a while, as it will trigger a few compilations
    (dev) $ pip install -r requirements.txt

    # Add a new pip to the party
    (dev) $ pip install pandas

#### Test the libraries with that virtual env

##### Test numpy library dependency:

    import numpy
    numpy.test('full')

##### Run scipy tests

    import scipy
    scipy.test('full')

##### Testing pysal

See [http://pysal.readthedocs.org/en/latest/developers/testing.html]

This will require putting this into `dev/lib/python2.7/site-packages/setup.cfg`:

```
[nosetests]
ignore-files=collection
exclude-dir=pysal/contrib

[wheel]
universal=1
```

And copying some files before executing the tests
(we'll use a temporary directory from where the tests will be executed because
some tests expect some files in the current directory). The following must be
executed first:

```
cp dev/lib/python2.7/site-packages/pysal/examples/geodanet/* dev/local/lib/python2.7/site-packages/pysal/examples
mkdir -p test_tmp && cd test_tmp && cp ../dev/lib/python2.7/site-packages/pysal/examples/geodanet/* ./
```

Then, execute the tests with:

    import pysal
    import nose
    nose.runmodule('pysal')


### Alternative B: using some packaged modules

This option avoids troublesome compilations/linkings, at the cost
of freezing some module versions as available in system packages,
namely numpy 1.6.1 and scipy 0.9.0 (in turn, this implies
the most recent version of PySAL we can use is 1.9.1).


TODO: to use this alternative the python-scipy package must be
installed (this will have to be included in server provisioning):

```
apt-get install -y python-scipy
```

#### Create and use a virtual env

We'll use a `dev` environment as before, but will configure it to
also use system modules.

    # Create the virtual environment for python
    $ virtualenv --system-site-packages dev

    # Activate the virtualenv
    $ source dev/bin/activate

    # Install all the requirements
    # expect this to take a while, as it will trigger a few compilations
    (dev) $ pip install -I ./crankshaft

Then we can proceed to testing as in Alternative A.
@@ -10,7 +10,7 @@ from setuptools import setup, find_packages
setup(
    name='crankshaft',

    version='0.0.01',
    version='0.0.1',

    description='CartoDB Spatial Analysis Python Library',

@@ -40,9 +40,9 @@ setup(

    # The choice of component versions is dictated by what's
    # provisioned in the production servers.
    install_requires=['pysal==1.11.0','numpy==1.6.1','scipy==0.17.0'],
    install_requires=['pysal==1.9.1'],

    requires=['pysal', 'numpy'],
    requires=['pysal', 'numpy' ],

    test_suite='test'
)