Compare commits

..

6 Commits

Author SHA1 Message Date
Javier Goizueta
0206cc6c44 Update documentation 2016-03-10 19:13:46 +01:00
Rafa de la Torre
b754ffe42a Add info about python dependencies 2016-03-10 18:06:21 +01:00
Javier Goizueta
0056f411b5 Set the path to virtualenvs in the Makefile
Also, version the virtualenv
2016-03-09 19:04:21 +01:00
Javier Goizueta
1810f02242 Use SciPy from system package python-scipy 2016-03-09 15:03:17 +01:00
Javier Goizueta
8e972128eb Modify sql code to user the python virtualenv 2016-03-09 15:00:50 +01:00
Javier Goizueta
cdd2d9e722 Directory reorganization and sketch of new versioning procedure 2016-03-08 19:35:02 +01:00
55 changed files with 331 additions and 641 deletions

View File

@@ -1,84 +0,0 @@
# Contributing guide
## How to add new functions
Try to put as little logic in the SQL extension as possible and
just use it as a wrapper to the Python module functionality.
Once a function is defined it should never change its signature in subsequent
versions. To change a function's signature a new function with a different
name must be created.
### Version numbers
The version of both the SQL extension and the Python package shall
follow the [Semantic Versioning 2.0](http://semver.org/) guidelines:
* When backwards incompatibility is introduced the major number is incremented
* When functionality is added (in a backwards-compatible manner) the minor number
is incremented
* When only fixes are introduced (backwards-compatible) the patch number is
incremented
### Python Package
...
### SQL Extension
* Generate a **new subfolder version** for `sql` and `test` folders to define
the new functions and tests
- Use symlinks to avoid file duplication between versions that don't update them
- Add new files or modify copies of the old files to add new functions or
modify existing functions (remember to rename a function if the signature
changes)
- Add or modify the corresponding documentation files in the `doc` folder.
Since we expect to have highly technical functions here, an extensive
background explanation would be of great help to users of this extension.
- Create tests for the new functions/behaviour
* Generate the **upgrade and downgrade files** for the extension
* Update the control file and the Makefile to generate the complete SQL
file for the newly created version. After running `make` a new
file `crankshaft--X.Y.Z.sql` will be created for the current version.
Additional files for migrating to/from the previous version A.B.C should be
created:
- `crankshaft--X.Y.Z--A.B.C.sql`
- `crankshaft--A.B.C--X.Y.Z.sql`
All these new files must be added to git and pushed.
* Update the public docs! ;-)
## Conventions
# SQL
Use snake case (i.e. `snake_case` and not `CamelCase`) for all
functions. Prefix functions intended for public use with `cdb_`
and private functions (to be used only internally inside
the extension) with `_cdb_`.
# Python
...
## Testing
Running just the Python tests:
```
(cd python && make test)
```
Installing the Extension and running just the PostgreSQL tests:
```
(cd pg && sudo make install && PGUSER=postgres make installcheck)
```
Installing and testing everything:
```
sudo make install && PGUSER=postgres make testinstalled
```

View File

@@ -1,187 +0,0 @@
# PostgreSQL GIS stack
#
# This image includes the following tools
# - PostgreSQL 9.5
# - PostGIS 2.2 with raster, topology and sfcgal support
# - OGR Foreign Data Wrapper
# - PgRouting
# - PDAL master
# - PostgreSQL PointCloud version master
#
# Version 1.7
FROM phusion/baseimage
MAINTAINER Vincent Picavet, vincent.picavet@oslandia.com
# Set correct environment variables.
ENV HOME /root
# Regenerate SSH host keys. baseimage-docker does not contain any, so you
# have to do that yourself. You may also comment out this instruction; the
# init system will auto-generate one during boot.
RUN /etc/my_init.d/00_regen_ssh_host_keys.sh
# Use baseimage-docker's init system.
CMD ["/sbin/my_init"]
RUN apt-get update && apt-get install -y wget ca-certificates
# Use APT postgresql repositories for 9.5 version
RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ wheezy-pgdg main 9.5" > /etc/apt/sources.list.d/pgdg.list && wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
# packages needed for compilation
RUN apt-get update
RUN apt-get install -y autoconf build-essential cmake docbook-mathml docbook-xsl libboost-dev libboost-thread-dev libboost-filesystem-dev libboost-system-dev libboost-iostreams-dev libboost-program-options-dev libboost-timer-dev libcunit1-dev libgdal-dev libgeos++-dev libgeotiff-dev libgmp-dev libjson0-dev libjson-c-dev liblas-dev libmpfr-dev libopenscenegraph-dev libpq-dev libproj-dev libxml2-dev postgresql-server-dev-9.5 xsltproc git build-essential wget
RUN add-apt-repository ppa:fkrull/deadsnakes &&\
apt-get update &&\
apt-get install -y python3.2
# application packages
RUN apt-get install -y postgresql-9.5 postgresql-plpython-9.5
# Download and compile CGAL
RUN wget https://gforge.inria.fr/frs/download.php/file/32994/CGAL-4.3.tar.gz &&\
tar -xzf CGAL-4.3.tar.gz &&\
cd CGAL-4.3 &&\
mkdir build && cd build &&\
cmake .. &&\
make -j3 && make install
# orig sfcgal method
# download and compile SFCGAL
# RUN git clone https://github.com/Oslandia/SFCGAL.git
# RUN cd SFCGAL && cmake . && make -j3 && make install
# # cleanup
# RUN rm -Rf SFCGAL
# andrewxhill fix for stable sfcgal version
RUN wget https://github.com/Oslandia/SFCGAL/archive/v1.2.0.tar.gz
RUN tar -xzf v1.2.0.tar.gz
RUN cd SFCGAL-1.2.0 && cmake . && make -j 1 && make install
RUN rm -Rf v1.2.0.tar.gz SFCGAL-1.2.0
# download and install GEOS 3.5
RUN wget http://download.osgeo.org/geos/geos-3.5.0.tar.bz2 &&\
tar -xjf geos-3.5.0.tar.bz2 &&\
cd geos-3.5.0 &&\
./configure && make && make install &&\
cd .. && rm -Rf geos-3.5.0 geos-3.5.0.tar.bz2
# Download and compile PostGIS
RUN wget http://download.osgeo.org/postgis/source/postgis-2.2.0.tar.gz
RUN tar -xzf postgis-2.2.0.tar.gz
RUN cd postgis-2.2.0 && ./configure --with-sfcgal=/usr/local/bin/sfcgal-config --with-geos=/usr/local/bin/geos-config
RUN cd postgis-2.2.0 && make && make install
# cleanup
RUN rm -Rf postgis-2.2.0.tar.gz postgis-2.2.0
# Download and compile pgrouting
RUN git clone https://github.com/pgRouting/pgrouting.git &&\
cd pgrouting &&\
mkdir build && cd build &&\
cmake -DWITH_DOC=OFF -DWITH_DD=ON .. &&\
make -j3 && make install
# cleanup
RUN rm -Rf pgrouting
# Download and compile ogr_fdw
RUN git clone https://github.com/pramsey/pgsql-ogr-fdw.git &&\
cd pgsql-ogr-fdw &&\
make && make install &&\
cd .. && rm -Rf pgsql-ogr-fdw
# Compile PDAL
RUN git clone https://github.com/PDAL/PDAL.git pdal
RUN mkdir PDAL-build && \
cd PDAL-build && \
cmake ../pdal && \
make -j3 && \
make install
# cleanup
RUN rm -Rf pdal && rm -Rf PDAL-build
# Compile PointCloud
RUN git clone https://github.com/pramsey/pointcloud.git
RUN cd pointcloud && ./autogen.sh && ./configure && make -j3 && make install
# cleanup
RUN rm -Rf pointcloud
RUN git clone https://github.com/CartoDB/cartodb-postgresql.git &&\
cd cartodb-postgresql &&\
make all install &&\
cd .. && rm -Rf cartodb-postgresql
# install pip
RUN apt-get -y install python-dev python-pip liblapack-dev gfortran libyaml-dev
RUN pip install numpy pandas scipy theano keras sklearn
RUN pip install pysal
# get compiled libraries recognized
RUN ldconfig
# clean packages
# all -dev packages
# RUN apt-get remove -y --purge autotools-dev libgeos-dev libgif-dev libgl1-mesa-dev libglu1-mesa-dev libgnutls-dev libgpg-error-dev libhdf4-alt-dev libhdf5-dev libicu-dev libidn11-dev libjasper-dev libjbig-dev libjpeg8-dev libjpeg-dev libjpeg-turbo8-dev libkrb5-dev libldap2-dev libltdl-dev liblzma-dev libmysqlclient-dev libnetcdf-dev libopenthreads-dev libp11-kit-dev libpng12-dev libpthread-stubs0-dev librtmp-dev libspatialite-dev libsqlite3-dev libssl-dev libstdc++-4.8-dev libtasn1-6-dev libtiff5-dev libwebp-dev libx11-dev libx11-xcb-dev libxau-dev libxcb1-dev libxcb-dri2-0-dev libxcb-dri3-dev libxcb-glx0-dev libxcb-present-dev libxcb-randr0-dev libxcb-render0-dev libxcb-shape0-dev libxcb-sync-dev libxcb-xfixes0-dev libxdamage-dev libxdmcp-dev libxerces-c-dev libxext-dev libxfixes-dev libxshmfence-dev libxxf86vm-dev linux-libc-dev manpages-dev mesa-common-dev libgcrypt11-dev unixodbc-dev uuid-dev x11proto-core-dev x11proto-damage-dev x11proto-dri2-dev x11proto-fixes-dev x11proto-gl-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev x11proto-xf86vidmode-dev xtrans-dev zlib1g-dev
# installed packages
# RUN apt-get remove -y --purge autoconf build-essential cmake docbook-mathml docbook-xsl libboost-dev libboost-filesystem-dev libboost-timer-dev libcgal-dev libcunit1-dev libgdal-dev libgeos++-dev libgeotiff-dev libgmp-dev libjson0-dev libjson-c-dev liblas-dev libmpfr-dev libopenscenegraph-dev libpq-dev libproj-dev libxml2-dev postgresql-server-dev-9.5 xsltproc git build-essential wget
# additional compilation packages
# RUN apt-get remove -y --purge automake m4 make
# ---------- SETUP --------------
# add a baseimage PostgreSQL init script
RUN mkdir /etc/service/postgresql
ADD postgresql.sh /etc/service/postgresql/run
# Adjust PostgreSQL configuration so that remote connections to the
# database are possible.
RUN echo "host all all 0.0.0.0/0 md5" >> /etc/postgresql/9.5/main/pg_hba.conf
# And add ``listen_addresses`` to ``/etc/postgresql/9.5/main/postgresql.conf``
RUN echo "listen_addresses='*'" >> /etc/postgresql/9.5/main/postgresql.conf
# Expose PostgreSQL
EXPOSE 5432
# Add VOLUMEs to allow backup of config, logs and databases
VOLUME ["/data", "/etc/postgresql", "/var/log/postgresql", "/var/lib/postgresql"]
# Add pip
# http://bugs.python.org/issue19846
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
ENV LANG C.UTF-8
# add database setup upon image start
ADD pgpass /root/.pgpass
RUN chmod 700 /root/.pgpass
RUN mkdir -p /etc/my_init.d
ADD init_db_script.sh /etc/my_init.d/init_db_script.sh
ADD init_db.sh /root/init_db.sh
ADD run_tests.sh /root/run_tests.sh
ADD run_server.sh /root/run_server.sh
# ---------- Final cleanup --------------
#
# Clean up APT when done.
# RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

View File

@@ -1,5 +1,5 @@
EXT_DIR = pg
PYP_DIR = python
EXT_DIR = src/pg
PYP_DIR = src/py
.PHONY: install
.PHONY: run_tests

124
README.md
View File

@@ -4,37 +4,99 @@ CartoDB Spatial Analysis extension for PostgreSQL.
## Code organization
* *pg* contains the PostgreSQL extension source code
* *python* Python module
## Running with Docker
Crankshaft comes with a Dockerfile to build and run a sandboxed machine for testing
and development.
First you have to build the docker container
docker build -t crankshaft .
To run the pg tests run
docker run -it --rm -v $(pwd):/crankshaft crankshaft /root/run_tests.sh
If there are failures, it will dump the reason to the screen.
To run a server you can develop on, run
docker run -it --rm -v $(pwd):/crankshaft -p $(docker-machine ip default):5432:5432 crankshaft /root/run_server.sh
and connect from your host using
psql -U pggis -h $(docker-machine ip default) -p 5432 -W
the password is pggis
* *doc* documentation
* *src* source code
* - *src/pg* contains the PostgreSQL extension source code
* - *src/py* Python module source code
* *release* released versions
## Requirements
* pip
* pip, virtualenv, PostgreSQL
* python-scipy system package (see src/py/README.md)
# Working Process
## Development
Work in `src/pg/sql`, `src/py/crankshaft`;
use a topic branch. See src/py/README.md
for the procedure to work with the Python local environment.
Take into account:
* Always remember to add tests for any new functionality.
* Add or modify the corresponding documentation files in the `doc` folder.
Since we expect to have highly technical functions here, an extensive
background explanation would be of great help to users of this extension.
* Convention: Use snake case (i.e. `snake_case` and not `CamelCase`) for all
functions. Prefix functions intended for public use with `cdb_`
and private functions (to be used only internally inside
the extension) with `_cdb_`.
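The naming convention above can be checked mechanically. The following is a minimal sketch, assuming function names are available as plain strings; `classify_function_name` and the two regular expressions are hypothetical helpers for illustration and are not part of crankshaft.

```python
import re

# Hypothetical helper (not part of crankshaft): classify a SQL function
# name according to the snake_case / cdb_ / _cdb_ convention.
PUBLIC_RE = re.compile(r"^cdb_[a-z0-9]+(_[a-z0-9]+)*$")
PRIVATE_RE = re.compile(r"^_cdb_[a-z0-9]+(_[a-z0-9]+)*$")


def classify_function_name(name):
    """Return 'public', 'private' or 'invalid' for a function name."""
    if PUBLIC_RE.match(name):
        return "public"
    if PRIVATE_RE.match(name):
        return "private"
    return "invalid"
```

A check like this could run in a test that lists the functions defined in `src/pg/sql`; how that list is obtained is left open here.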
Update local installation with `sudo make install`
(this will update the 'dev' version of the extension in 'src/pg/')
Run the tests with `PGUSER=postgres make test`
Update extension in working database with
* `ALTER EXTENSION crankshaft UPDATE TO 'current';`
`ALTER EXTENSION crankshaft UPDATE TO 'dev';`
Note: we always keep the current development version installed as 'dev';
we update through the 'current' alias to allow changing the extension
contents but not the version identifier. This will fail if the
changes involve incompatible function changes such as a different
return type; in that case the offending function (or the whole extension)
should be dropped manually before the update.
If the extension has not previously been installed in a database
we can:
* `CREATE EXTENSION crankshaft WITH VERSION 'dev';`
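As an illustration of the two cases above, here is a minimal sketch that returns the statements to run depending on whether the extension is already installed in the database. `dev_refresh_statements` is a hypothetical helper, not part of crankshaft; it uses PostgreSQL's `ALTER EXTENSION ... UPDATE TO` form for the update path.

```python
def dev_refresh_statements(extension="crankshaft", installed=True):
    """Return the SQL statements that bring a database to the 'dev' version.

    Hypothetical helper for illustration only.
    """
    if not installed:
        # First installation: create the extension directly at 'dev'.
        return ["CREATE EXTENSION {0} WITH VERSION 'dev';".format(extension)]
    # Already installed: go through the 'current' alias and back to 'dev'
    # so the extension contents are reloaded without changing the version id.
    return [
        "ALTER EXTENSION {0} UPDATE TO 'current';".format(extension),
        "ALTER EXTENSION {0} UPDATE TO 'dev';".format(extension),
    ]
```

The statements are returned as strings rather than executed, so they can be passed to `psql` or any database client.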
Once the tests pass, a new pull request can be created.
The CI tests must be checked to be successful.
Before merging a topic branch, peer review of the code is a must.
## Release
The release process of a new version of the extension
shall be performed by the designated *Release Manager*.
Note that we expect to gradually automate this process.
Having checked out the topic branch of the PR to be released:
The version number in `src/pg/crankshaft.control` must first be updated,
following [Semantic Versioning 2.0](http://semver.org/).
We will now explain the process for the case of backwards-compatible
releases (updating the minor or patch version numbers).
TODO: document the complex case of major releases.
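To make the version bump concrete, here is a minimal sketch of the Semantic Versioning rule for plain `X.Y.Z` version strings. `bump_version` is a hypothetical helper, not part of crankshaft, and it ignores pre-release and build metadata.

```python
def bump_version(version, change):
    """Return the next X.Y.Z version for a 'major', 'minor' or 'patch' change.

    Hypothetical helper for illustration only.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Backwards-incompatible change: reset minor and patch.
        return "{0}.0.0".format(major + 1)
    if change == "minor":
        # Backwards-compatible new functionality: reset patch.
        return "{0}.{1}.0".format(major, minor + 1)
    if change == "patch":
        # Backwards-compatible fixes only.
        return "{0}.{1}.{2}".format(major, minor, patch + 1)
    raise ValueError("unknown change type: {0!r}".format(change))
```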
The following command must be executed to produce the main installation
script for the new release, `release/crankshaft--X.Y.Z.sql`.
```
make release
```
Then, the release manager shall produce upgrade and downgrade scripts
to migrate to/from the previous release. In the case of minor/patch
releases this simply consists of extracting the functions that have changed
and placing them in the proper `release/crankshaft--X.Y.Z--A.B.C.sql`
file.
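The script names follow the PostgreSQL extension convention, where `name--old--new.sql` migrates from version `old` to version `new`. A minimal sketch that derives the three file names for a release follows; `release_script_names` is a hypothetical helper, not part of crankshaft.

```python
def release_script_names(new, previous, extension="crankshaft"):
    """Return the script file names needed to release version `new`.

    Hypothetical helper for illustration only.
    """
    return {
        # Installation from scratch at the new version.
        "install": "{0}--{1}.sql".format(extension, new),
        # Migration from the previous release to the new one.
        "upgrade": "{0}--{1}--{2}.sql".format(extension, previous, new),
        # Migration from the new release back to the previous one.
        "downgrade": "{0}--{1}--{2}.sql".format(extension, new, previous),
    }
```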
TODO: configure the local environment to be used by the release;
currently should be directory `src/py/X.Y.Z`, but this must be fixed;
a possibility to explore is to use the `cdb_conf` table.
TODO: testing procedure for the new release
TODO: push, merge, tag, deploy procedures.

View File

@@ -1,77 +0,0 @@
#!/bin/bash
# wait for pg server to be ready
echo "Waiting for PostgreSQL to run..."
sleep 1
while ! /usr/bin/pg_isready -q
do
sleep 1
echo -n "."
done
# PostgreSQL running
echo "PostgreSQL running, initializing database."
# PostgreSQL user
#
# create postgresql user pggis
/sbin/setuser postgres /usr/bin/psql -c "CREATE USER pggis with SUPERUSER PASSWORD 'pggis';"
/sbin/setuser postgres /usr/bin/psql -c "CREATE role publicuser;"
# == Auto restore dumps ==
#
# If we find some postgresql dumps in /data/restore, then we load it
# in new databases
shopt -s nullglob
for f in /data/restore/*.backup
do
echo "Found database dump to restore : $f"
DBNAME=$(basename -s ".backup" "$f")
echo "Creating a new database $DBNAME.."
/usr/bin/psql -U pggis -h localhost -c "CREATE DATABASE $DBNAME WITH OWNER = pggis ENCODING = 'UTF8' TEMPLATE = template0 CONNECTION LIMIT = -1;" postgres
/usr/bin/psql -U pggis -h localhost -w -c "CREATE EXTENSION citext; CREATE EXTENSION pg_trgm; CREATE EXTENSION btree_gist; CREATE EXTENSION hstore; CREATE EXTENSION fuzzystrmatch; CREATE EXTENSION unaccent; CREATE EXTENSION postgres_fdw; CREATE EXTENSION pgcrypto; CREATE EXTENSION plpythonu; CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology; CREATE EXTENSION pgrouting; CREATE EXTENSION pointcloud; CREATE EXTENSION pointcloud_postgis; CREATE EXTENSION postgis_sfcgal; drop type if exists texture; create type texture as (url text,uv float[][]);CREATE ROLE publicuser;" $DBNAME
# /usr/bin/psql -U pggis -h localhost -w -f /usr/share/postgresql/9.5/contrib/postgis-2.1/sfcgal.sql -d $DBNAME
echo "Restoring database $DBNAME.."
/usr/bin/pg_restore -U pggis -h localhost -d $DBNAME -w "$f"
echo "creating public user"
/usr/bin/psql -U pggis -h localhost -w -c "CREATE ROLE publicuser;"
echo "Restore done."
done
# == Auto restore SQL backups ==
#
# If we find some postgresql sql scripts /data/restore, then we load it
# in new databases
shopt -s nullglob
for f in /data/restore/*.sql
do
echo "Found database SQL dump to restore : $f"
DBNAME=$(basename -s ".sql" "$f")
echo "Creating a new database $DBNAME.."
/usr/bin/psql -U pggis -h localhost -c "CREATE DATABASE $DBNAME WITH OWNER = pggis ENCODING = 'UTF8' TEMPLATE = template0 CONNECTION LIMIT = -1;" postgres
/usr/bin/psql -U pggis -h localhost -w -c "CREATE EXTENSION citext; CREATE EXTENSION pg_trgm; CREATE EXTENSION btree_gist; CREATE EXTENSION hstore; CREATE EXTENSION fuzzystrmatch; CREATE EXTENSION unaccent; CREATE EXTENSION postgres_fdw; CREATE EXTENSION pgcrypto; CREATE EXTENSION plpythonu; CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology; CREATE EXTENSION postgis_sfcgal; CREATE EXTENSION pgrouting; CREATE EXTENSION pointcloud; CREATE EXTENSION pointcloud_postgis; drop type if exists texture; create type texture as (url text,uv float[][]);" $DBNAME
# /usr/bin/psql -U pggis -h localhost -w -f /usr/share/postgresql/9.5/contrib/postgis-2.1/sfcgal.sql -d $DBNAME
echo "Restoring database $DBNAME.."
/usr/bin/psql -U pggis -h localhost -d $DBNAME -w -f "$f"
echo "Restore done."
done
# == create new database pggis ==
echo "Creating a new empty database..."
# create user and main database
/usr/bin/psql -U pggis -h localhost -c "CREATE DATABASE pggis WITH OWNER = pggis ENCODING = 'UTF8' TEMPLATE = template0 CONNECTION LIMIT = -1;" postgres
# activate all needed extension in pggis database
/usr/bin/psql -U pggis -h localhost -w -c "CREATE EXTENSION citext; CREATE EXTENSION pg_trgm; CREATE EXTENSION btree_gist; CREATE EXTENSION hstore; CREATE EXTENSION fuzzystrmatch; CREATE EXTENSION unaccent; CREATE EXTENSION postgres_fdw; CREATE EXTENSION pgcrypto; CREATE EXTENSION plpythonu; CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology; CREATE EXTENSION postgis_sfcgal; CREATE EXTENSION pgrouting; CREATE EXTENSION pointcloud; CREATE EXTENSION pointcloud_postgis; drop type if exists texture;
create type texture as (url text,uv float[][]);" pggis
#/usr/bin/psql -U pggis -h localhost -w -f /usr/share/postgresql/9.5/contrib/postgis-2.1/sfcgal.sql -d pggis
echo "Database initialized. Connect from host with :"
echo "psql -h localhost -p <PORT> -U pggis -W pggis"
echo "Get <PORT> value with 'docker ps'"

View File

@@ -1,3 +0,0 @@
#!/bin/sh
# Script for my_init.d, so as to run database init without blocking
/root/init_db.sh &

3
pg/.gitignore vendored
View File

@@ -1,3 +0,0 @@
regression.diffs
regression.out
results/

View File

@@ -1,33 +0,0 @@
# Makefile to generate the extension out of separate sql source files.
# Once a version is released, it is not meant to be changed. E.g: once version 0.0.1 is out, it SHALL NOT be changed.
EXTENSION = crankshaft
EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
# The new version to be generated from templates
NEW_EXTENSION_ARTIFACT = $(EXTENSION)--$(EXTVERSION).sql
# DATA is a special variable used by postgres build infrastructure
# These are the files to be installed in the server shared dir,
# for installation from scratch, upgrades and downgrades.
# @see http://www.postgresql.org/docs/current/static/extend-pgxs.html
DATA = $(NEW_EXTENSION_ARTIFACT)
SOURCES_DATA_DIR = sql/$(EXTVERSION)
SOURCES_DATA = $(wildcard sql/$(EXTVERSION)/*.sql)
# The extension installation artifacts are stored in the base subdirectory
$(NEW_EXTENSION_ARTIFACT): $(SOURCES_DATA)
rm -f $@
cat $(SOURCES_DATA_DIR)/*.sql >> $@
REGRESS = $(notdir $(basename $(wildcard test/$(EXTVERSION)/sql/*test.sql)))
TEST_DIR = test/$(EXTVERSION)
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
# This seems to be needed at least for PG 9.3.11
all: $(DATA)

View File

@@ -1,148 +0,0 @@
--DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES
-- Complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit
-- Internal function.
-- Set the seeds of the RNGs (Random Number Generators)
-- used internally.
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;
-- Moran's I
CREATE OR REPLACE FUNCTION
cdb_moran_local (
t TEXT,
attr TEXT,
significance float DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
-- Moran's I Local Rate
CREATE OR REPLACE FUNCTION
cdb_moran_local_rate(t TEXT,
numerator TEXT,
denominator TEXT,
significance FLOAT DEFAULT 0.05,
num_ngbrs INT DEFAULT 5,
permutations INT DEFAULT 99,
geom_column TEXT DEFAULT 'the_geom',
id_col TEXT DEFAULT 'cartodb_id',
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
$$ LANGUAGE plpythonu;
-- Function by Stuart Lynn for a simple interpolation of a value
-- from a polygon table over an arbitrary polygon
-- (weighted by the area proportion overlapped)
-- Areal weighting is a very simple form of areal interpolation.
--
-- Parameters:
-- * geom a Polygon geometry which defines the area where a value will be
-- estimated as the area-weighted sum of a given table/column
-- * target_table_name table name of the table that provides the values
-- * target_column column name of the column that provides the values
-- * schema_name optional parameter to define the schema the target table
-- belongs to, which is necessary if it's not in the search_path.
-- Note that target_table_name should never include the schema in it.
-- Return value:
-- Areal-weighted interpolation of the column values over the geometry
CREATE OR REPLACE
FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL)
RETURNS numeric AS
$$
DECLARE
result numeric;
qualified_name text;
BEGIN
IF schema_name IS NULL THEN
qualified_name := Format('%I', target_table_name);
ELSE
qualified_name := Format('%I.%s', schema_name, target_table_name);
END IF;
EXECUTE Format('
SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom))
FROM %s AS a
WHERE $1 && a.the_geom
', target_column, qualified_name)
USING geom
INTO result;
RETURN result;
END;
$$ LANGUAGE plpgsql;
--
-- Creates N points randomly distributed around the polygon
--
-- @param geom - the geometry to be turned into points
--
-- @param no_points - the number of points to generate
--
-- @param max_iter_per_point - the function generates points in the polygon's bounding box
-- and discards points which don't lie in the polygon. max_iter_per_point specifies how many
-- misses per point the function accepts before giving up.
--
-- Returns: Multipoint with the requested points
CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000)
RETURNS GEOMETRY AS $$
DECLARE
extent GEOMETRY;
test_point Geometry;
width NUMERIC;
height NUMERIC;
x0 NUMERIC;
y0 NUMERIC;
xp NUMERIC;
yp NUMERIC;
no_left INTEGER;
remaining_iterations INTEGER;
points GEOMETRY[];
bbox_line GEOMETRY;
intersection_line GEOMETRY;
BEGIN
extent := ST_Envelope(geom);
width := ST_XMax(extent) - ST_XMIN(extent);
height := ST_YMax(extent) - ST_YMIN(extent);
x0 := ST_XMin(extent);
y0 := ST_YMin(extent);
no_left := no_points;
LOOP
if(no_left=0) THEN
EXIT;
END IF;
yp = y0 + height*random();
bbox_line = ST_MakeLine(
ST_SetSRID(ST_MakePoint(yp, x0),4326),
ST_SetSRID(ST_MakePoint(yp, x0+width),4326)
);
intersection_line = ST_Intersection(bbox_line,geom);
test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random());
points := points || test_point;
no_left = no_left - 1 ;
END LOOP;
RETURN ST_Collect(points);
END;
$$
LANGUAGE plpgsql VOLATILE;
-- Make sure by default there are no permissions for publicuser
-- NOTE: this happens at extension creation time, as part of an implicit transaction.
-- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE;
-- Grant permissions on the schema to publicuser (but just the schema)
GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser;
-- Revoke execute permissions on all functions in the schema by default
-- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser;

View File

@@ -1,6 +0,0 @@
-- Install dependencies
CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;

1
pgpass
View File

@@ -1 +0,0 @@
localhost:5432:*:pggis:pggis

View File

@@ -1,5 +0,0 @@
#!/bin/sh
# `/sbin/setuser postgres` runs the given command as the user `postgres`.
# If you omit that part, the command will be run as root.
rm -rf /etc/ssl/private-copy; mkdir /etc/ssl/private-copy; mv /etc/ssl/private/* /etc/ssl/private-copy/; rm -r /etc/ssl/private; mv /etc/ssl/private-copy /etc/ssl/private; chmod -R 0700 /etc/ssl/private; chown -R postgres /etc/ssl/private
exec /sbin/setuser postgres /usr/lib/postgresql/9.5/bin/postgres -D /var/lib/postgresql/9.5/main -c config_file=/etc/postgresql/9.5/main/postgresql.conf >> /var/log/postgresql.log 2>&1

View File

@@ -1,11 +0,0 @@
# Install the package (needs root privileges)
install:
pip install ./crankshaft --upgrade
# Test from source code
test:
(cd crankshaft && nosetests test/)
# Test currently installed package
testinstalled:
nosetests crankshaft/test/

View File

@@ -1,9 +0,0 @@
# Crankshaft Python Package
...
### Run the tests
```bash
cd crankshaft
nosetests test/
```

View File

@@ -1,14 +0,0 @@
#!/bin/bash
/sbin/my_init &
echo "Waiting for PostgreSQL to run..."
sleep 1
while ! /usr/bin/pg_isready -q
do
sleep 1
echo -n "."
done
cd /crankshaft/pg
make install
fg

View File

@@ -1,23 +0,0 @@
#!/bin/bash
/sbin/my_init &
echo "Waiting for PostgreSQL to run..."
sleep 1
while ! /usr/bin/pg_isready -q
do
sleep 1
echo -n "."
done
cd /crankshaft/pg
make install
PGUSER=pggis PGPASSWORD=pggis PGHOST=localhost make installcheck
if [ "$?" -eq "0" ]
then
echo "PASSED"
else
cat /crankshaft/pg/test/0.0.1/regression.diffs
fi

6
src/pg/.gitignore vendored Normal file
View File

@@ -0,0 +1,6 @@
regression.diffs
regression.out
results/
crankshaft--dev.sql
crankshaft--dev--current.sql
crankshaft--current--dev.sql

48
src/pg/Makefile Normal file
View File

@@ -0,0 +1,48 @@
# Generation of a new development version 'dev' (with an alias 'current' for
# updating easily by upgrading to 'current', then 'dev')
# sudo make install -- generate the 'dev' version from current source
# and make it available to PostgreSQL
# PGUSER=postgres make installcheck -- test the 'dev' extension
SED = sed
EXTENSION = crankshaft
DATA = $(EXTENSION)--dev.sql \
$(EXTENSION)--current--dev.sql \
$(EXTENSION)--dev--current.sql
SOURCES_DATA_DIR = sql
SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql)
VIRTUALENV_PATH = $(realpath ../py/)
ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH))
REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \
-e 's/@@VIRTUALENV_PATH@@/$(ESC_VIRVIRTUALENV_PATH)/g'
$(DATA): $(SOURCES_DATA)
$(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > $@
TEST_DIR = test
REGRESS = $(notdir $(basename $(wildcard $(TEST_DIR)/sql/*test.sql)))
REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)'
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
# This seems to be needed at least for PG 9.3.11
all: $(DATA)
# WIP: goals for releasing the extension...
EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
../release/$(EXTENSION).control: $(EXTENSION).control
cp $< $@
release: ../release/$(EXTENSION).control
cp $(EXTENSION)--dev.sql $(EXTENSION)--$(EXTVERSION).sql
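The `REPLACEMENTS` rule in this Makefile fills the `@@VERSION@@` and `@@VIRTUALENV_PATH@@` placeholders of the SQL sources with `sed`. The same substitution can be sketched in Python as follows; `render_sql` is a hypothetical helper for illustration, not part of the build.

```python
def render_sql(source, version, virtualenv_path):
    """Replace the build-time placeholders in a SQL source string.

    Hypothetical helper mirroring the sed replacements; illustration only.
    """
    return (source
            .replace("@@VERSION@@", version)
            .replace("@@VIRTUALENV_PATH@@", virtualenv_path))
```

Plain string replacement avoids the need to escape `/` in the path, which the Makefile has to do for `sed` through `ESC_VIRVIRTUALENV_PATH`.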

12
src/pg/sql/01_version.sql Normal file
View File

@@ -0,0 +1,12 @@
-- Version number of the extension release
CREATE OR REPLACE FUNCTION cdb_crankshaft_version()
RETURNS text AS $$
SELECT '@@VERSION@@'::text;
$$ language 'sql' IMMUTABLE STRICT;
-- Internal identifier of the installed extension instance
-- e.g. 'dev' for current development version
CREATE OR REPLACE FUNCTION cdb_crankshaft_internal_version()
RETURNS text AS $$
SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL;
$$ language 'sql' IMMUTABLE STRICT;

23
src/pg/sql/02_py.sql Normal file
View File

@@ -0,0 +1,23 @@
CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path()
RETURNS text
AS $$
BEGIN
-- RETURN '/opt/virtualenvs/crankshaft';
RETURN '@@VIRTUALENV_PATH@@';
END;
$$ language plpgsql IMMUTABLE STRICT;
-- Use the crankshaft python module
CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py()
RETURNS VOID
AS $$
import os
# plpy.notice('%',str(os.environ))
# activate virtualenv
crankshaft_version = plpy.execute('SELECT cdb_crankshaft.cdb_crankshaft_internal_version()')[0]['cdb_crankshaft_internal_version']
base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path']
default_venv_path = os.path.join(base_path, crankshaft_version)
venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path)
activate_path = venv_path + '/bin/activate_this.py'
exec(open(activate_path).read(), dict(__file__=activate_path))
$$ LANGUAGE plpythonu;
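The path resolution performed inside `_cdb_crankshaft_activate_py()` can be isolated and exercised outside the database. The following is a minimal sketch, assuming the base path and version are passed in as arguments instead of being queried through `plpy`; `resolve_activate_path` is a hypothetical helper, not part of crankshaft.

```python
import os


def resolve_activate_path(base_path, version, environ=None):
    """Return the path of the activate_this.py script to execute.

    Hypothetical helper for illustration only. The CRANKSHAFT_VENV
    environment variable, when set, overrides the default location
    <base_path>/<version>.
    """
    environ = os.environ if environ is None else environ
    default_venv_path = os.path.join(base_path, version)
    venv_path = environ.get("CRANKSHAFT_VENV", default_venv_path)
    return os.path.join(venv_path, "bin", "activate_this.py")
```

Passing `environ` explicitly keeps the function testable without touching the process environment.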

View File

@@ -4,6 +4,7 @@
CREATE OR REPLACE FUNCTION
_cdb_random_seeds (seed_value INTEGER) RETURNS VOID
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft import random_seeds
random_seeds.set_random_seeds(seed_value)
$$ LANGUAGE plpythonu;

View File

@@ -11,6 +11,7 @@ CREATE OR REPLACE FUNCTION
w_type TEXT DEFAULT 'knn')
RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local
# TODO: use named parameters or a dictionary
return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type)
@@ -29,6 +30,7 @@ CREATE OR REPLACE FUNCTION
w_type TEXT DEFAULT 'knn')
RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric)
AS $$
plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()')
from crankshaft.clustering import moran_local_rate
# TODO: use named parameters or a dictionary
return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type)

View File

@@ -3,4 +3,4 @@ CREATE EXTENSION plpythonu;
CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';

View File

@@ -4,4 +4,4 @@ CREATE EXTENSION postgis;
CREATE EXTENSION cartodb;
-- Install the extension
CREATE EXTENSION crankshaft;
CREATE EXTENSION crankshaft VERSION 'dev';

View File

@@ -1 +1,2 @@
*.pyc
dev/

9
src/py/Makefile Normal file
View File

@@ -0,0 +1,9 @@
# Install the package locally for development
install:
virtualenv --system-site-packages dev
./dev/bin/pip install -I ./crankshaft
./dev/bin/pip install -I nose
# Test development install
testinstalled:
./dev/bin/nosetests crankshaft/test/

src/py/README.md Normal file
@@ -0,0 +1,130 @@
# Crankshaft Python Package
...
### Run the tests
```bash
cd crankshaft
nosetests test/
```
## Notes about python dependencies
* This extension is targeted at production databases. Therefore certain restrictions must be assumed about the production environment vs other experimental environments.
* We're using `pip` and `virtualenv` to generate a suitable isolated environment for python code that has all the dependencies
* Every dependency should be:
- Added to the `setup.py` file
- Installed through it
- Tested, when it has a test suite
- Pinned to a fixed version in `requirements.txt`
* At present we use Python version 2.7.3
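
The rule that every dependency has a fixed version can be checked mechanically. The following is a minimal sketch, assuming requirement lines of the simple `name==version` form; `unpinned_requirements` is a hypothetical helper, not part of the package, and it does not handle extras, URLs or environment markers.

```python
import re

# Hypothetical helper for illustration only.
# Accepts only the simple "name==version" form.
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+==[A-Za-z0-9_.\-]+$")


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    result = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if not PINNED.match(line):
            result.append(line)
    return result


sample = ["pysal==1.9.1", "numpy", "# comment", "scipy>=0.9.0"]
print(unpinned_requirements(sample))  # prints ['numpy', 'scipy>=0.9.0']
```

To check a real file, the lines of `requirements.txt` could be passed in place of `sample`.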
---
We are considering two possible approaches to managing
the Python virtual environment: using a pure virtual environment,
or combining it with some system packages that provide the dependencies
for the *hard-to-compile* packages (which pins them to somewhat old versions).
### Alternative A: pure virtual environment
In this case we will install all the packages needed in the
virtual environment.
This involves, especially for the numerical packages, compiling
and linking code that uses a number of third-party libraries,
and requires having these dependencies resolved in the production
environments.
#### Create and use a virtual env
We'll use a virtual environment directory `dev`
under the `src/pg` directory.
# Create the virtual environment for python
$ virtualenv dev
# Activate the virtualenv
$ source dev/bin/activate
# Install all the requirements
# expect this to take a while, as it will trigger a few compilations
(dev) $ pip install -r requirements.txt
# Add a new pip to the party
(dev) $ pip install pandas
#### Test the libraries with that virtual env
##### Test numpy library dependency:
import numpy
numpy.test('full')
##### Run scipy tests
import scipy
scipy.test('full')
##### Testing pysal
See [http://pysal.readthedocs.org/en/latest/developers/testing.html]
This will require putting this into `dev/lib/python2.7/site-packages/setup.cfg`:
```
[nosetests]
ignore-files=collection
exclude-dir=pysal/contrib
[wheel]
universal=1
```
Some files must also be copied before executing the tests.
We'll run the tests from a temporary directory, because
some tests expect certain files in the current directory. Next must be executed
from
```
cp dev/lib/python2.7/site-packages/pysal/examples/geodanet/* dev/local/lib/python2.7/site-packages/pysal/examples
mkdir -p test_tmp && cd test_tmp && cp ../dev/lib/python2.7/site-packages/pysal/examples/geodanet/* ./
```
Then, execute the tests with:
import pysal
import nose
nose.runmodule('pysal')
### Alternative B: using some packaged modules
This option avoids troublesome compilation and linking, at the cost
of freezing some module versions to those available in system packages,
namely numpy 1.6.1 and scipy 0.9.0. In turn, this implies that
the most recent version of PySAL we can use is 1.9.1.
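
A quick way to confirm that an environment matches these frozen versions is to compare each module's reported version with the expected one. This is a sketch under stated assumptions: `installed_version` and `version_mismatches` are hypothetical helpers, and it assumes each module exposes a `__version__` attribute, which may not hold for every package (a missing attribute is reported as `None`).

```python
import importlib

# Versions quoted in the text for Alternative B.
EXPECTED = {"numpy": "1.6.1", "scipy": "0.9.0", "pysal": "1.9.1"}


def installed_version(name):
    """Return the module's __version__, or None if unavailable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Assumption: the module exposes __version__; fall back to None.
    return getattr(module, "__version__", None)


def version_mismatches(expected, lookup=installed_version):
    """Map module name -> (expected, found) for every module that differs."""
    mismatches = {}
    for name, wanted in sorted(expected.items()):
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(version_mismatches(EXPECTED).items()):
        print("%s: expected %s, found %s" % (name, wanted, found))
```

The `lookup` parameter exists only so the comparison can be exercised without the real packages installed.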
TODO: to use this alternative the python-scipy package must be
installed (this will have to be included in server provisioning)
```
apt-get install -y python-scipy
```
#### Create and use a virtual env
We'll use a `dev` environment as before, but will configure it to
also use system modules.
# Create the virtual environment for python
$ virtualenv --system-site-packages dev
# Activate the virtualenv
$ source dev/bin/activate
# Install all the requirements
# expect this to take a while, as it will trigger a few compilations
(dev) $ pip install -I ./crankshaft
Then we can proceed to testing as in Alternative A.

@@ -40,9 +40,9 @@ setup(
# The choice of component versions is dictated by what's
# provisioned in the production servers.
install_requires=['pysal==1.11.0','numpy==1.6.1','scipy==0.17.0'],
install_requires=['pysal==1.9.1'],
requires=['pysal', 'numpy'],
requires=['pysal', 'numpy' ],
test_suite='test'
)