Updates docs for postal code

Carla Iriberri
2015-10-19 17:16:16 +02:00
parent ece434972d
commit 1cfaf26406
3 changed files with 47 additions and 24 deletions

@@ -2,11 +2,14 @@ Postal code geocoder
===============
This section covers geocoding postal codes as points and as polygons. Each option has its own data sources and functions, described below.
# Postal code geocoder: Polygons
## Function
Following the steps below populates a table with zipcodes from Australia, Canada, the USA and France (identified by their ISO3 code), each linked to its spatial location as a polygon.
## Usage example
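A minimal sketch of how a client might call the polygon geocoder, assuming a DB-API driver with `%s` placeholders (such as psycopg2) and the `geocode_postalcode_polygons` signatures listed in this document. The helper name `build_polygon_query` and the sample values are hypothetical, and the form the country argument takes (name or ISO code) is not stated here. The explicit casts are there because the function is overloaded for `text` and `text[]` country arguments.

```python
def build_polygon_query(codes, country=None):
    """Return (sql, params) for a call to geocode_postalcode_polygons.

    Hypothetical helper: it only builds the statement, it does not run it.
    """
    codes = list(codes)
    if country is None:
        # Matches the signature: geocode_postalcode_polygons(code text[])
        sql = "SELECT * FROM geocode_postalcode_polygons(%s::text[])"
        return sql, (codes,)
    # Matches the signature: geocode_postalcode_polygons(code text[], inputcountry text)
    sql = "SELECT * FROM geocode_postalcode_polygons(%s::text[], %s::text)"
    return sql, (codes, country)


# Sample values, chosen only for illustration.
sql, params = build_polygon_query(["10013", "11201"], "USA")
print(sql)
print(params)
```

The resulting pair can be passed to a cursor's `execute(sql, params)`; each returned row should carry the fields of the response type for that signature.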
## Creation steps
1. Import the four files listed in the "Data Sources" section for Australia (`doc` table), Canada (`gfsa000a11a_e` table), USA (`tl_2013_us_zcta510` table) and France (`codes_postaux` table).
@@ -18,15 +21,15 @@ By following the next steps a table is populated with zipcodes from Australia, C
#### Table structure
````
Table "public.postal_code_polygons"
Column | Type | Modifiers | Storage | Stats target | Description
----------------------+--------------------------+-----------------------------------------------------------------------+----------+--------------+-------------
cartodb_id | integer | not null default nextval('untitled_table_2_cartodb_id_seq'::regclass) | plain | |
postal_code | text | | extended | |
adm0_a3 | text | | extended | |
created_at | timestamp with time zone | not null default now() | plain | |
updated_at | timestamp with time zone | not null default now() | plain | |
the_geom | geometry(Geometry,4326) | | main | |
the_geom_webmercator | geometry(Geometry,3857) | | main | |
````
@@ -50,7 +53,7 @@ Indexes:
public | geocode_postalcode_polygons | SETOF geocode_namedplace_v1 | code text[] | normal
public | geocode_postalcode_polygons | SETOF geocode_namedplace_country_v1 | code text[], inputcountries text[] | normal
public | geocode_postalcode_polygons | SETOF geocode_namedplace_v1 | code text[], inputcountry text | normal
````
### test_geocode_postalcode_polygons
````
@@ -58,7 +61,7 @@ Indexes:
--------+----------------------------------+-------------------------------------+------------------------------------+--------
public | test_geocode_postalcode_polygons | SETOF geocode_namedplace_country_v1 | code text[], inputcountries text[] | normal
public | test_geocode_postalcode_polygons | SETOF geocode_namedplace_v1 | code text[], inputcountry text | normal
````
## Response data types
* geocode_namedplace_country_v1:
@@ -71,12 +74,20 @@ Indexes:
## Data Sources
* Australian polygons - http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/2033.0.55.0012011?OpenDocument - Download the KMZ for *Postal Area IRSD, SEIFA 2011*. Unzip and upload the kmz
- Coverage: AUS
- Geometry type: polygon
* Canadian polygons - http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/bound-limit-2011-eng.cfm - Download ESRI Shp, Forward Sortation Areas, Digital Boundary
- Coverage: CAN
- Geometry type: polygon
* USA polygons - http://www2.census.gov/geo/tiger/TIGER2013/ZCTA5/tl_2013_us_zcta510.zip
- Coverage: USA
- Geometry type: polygon
* French polygons - http://www.data.gouv.fr/dataset/fond-de-carte-des-codes-postaux
- Coverage: FRA
- Geometry type: polygon
# Postal code geocoder: Points
@@ -93,6 +104,8 @@ IT, LU, SK, LI, PR, IM, NO, PT, PL, FI, JP, CA, DE, HU, PH, SE, VA, YT, MK, FR,
MH, RO, FO, GF, AD, HR, DZ, GT, AU, AS, BE, AT
````
## Usage example
## Creation steps
1. Download the allCountries.zip file from [GeoNames](http://www.geonames.org). Import it and rename the table to tmp_zipcode_points. Alternatively, follow the manual process explained below.
@@ -121,15 +134,15 @@ Import the file ignoring step 2.
#### Table structure
````
Table "public.postal_code_points"
Column | Type | Modifiers | Storage | Stats target | Description
----------------------+--------------------------+------------------------------------------------------------------------+----------+--------------+-------------
cartodb_id | integer | not null default nextval('untitled_table_2_cartodb_id_seq2'::regclass) | plain | |
adm0_a3 | text | | extended | |
postal_code | text | | extended | |
created_at | timestamp with time zone | not null default now() | plain | |
updated_at | timestamp with time zone | not null default now() | plain | |
the_geom | geometry(Geometry,4326) | | main | |
the_geom_webmercator | geometry(Geometry,3857) | | main | |
````
#### Current indexes
````
@@ -152,6 +165,8 @@ Import the file ignoring step 2.
## Data Sources
* All countries points [GeoNames](http://www.geonames.org) - http://download.geonames.org/export/zip/allCountries.zip
- Coverage: see the details in the function description
- Geometry type: point
# Geocoder coverage map
![Map](https://camo.githubusercontent.com/483eae203445096ffa8bf0fe3d92a99fd9367a01/68747470733a2f2f646c2e64726f70626f7875736572636f6e74656e742e636f6d2f752f323837393330382f53637265656e25323053686f74253230323031352d30362d3239253230617425323031342e30332e34342e706e67)
@@ -160,22 +175,24 @@ Import the file ignoring step 2.
# Known deficiencies of the service
* The USA polygon zipcode service uses ZIP Code Tabulation Areas (ZCTAs), which [don't correspond to actual zipcode regions](http://web.archive.org/web/20050209030255/http://www.manifold.net/cases/zip_codes/zip_codes.html).
* The point geocoder service is built on GeoNames data, and we've detected that the accuracy for a large share of zipcodes is lower than intended, because GeoNames interpolates zipcode locations from populated place information. For example, all the zipcodes belonging to the city of Madrid, Spain, are geocoded at the centroid of the city itself.
This issue can be spotted easily by looking for intersecting zipcode points, that is, points shared by more than one zipcode:
`SELECT the_geom, the_geom_webmercator, COUNT(*), adm0_a3 FROM postal_code_points GROUP BY the_geom, the_geom_webmercator, adm0_a3 HAVING COUNT(*) > 1 ORDER BY COUNT(*)`
From this query we conclude that the most affected countries are Portugal, Mexico, Spain, the Netherlands, the Czech Republic and Slovakia, whereas Brazil doesn't show intersecting values.
The visual result of intersecting zipcodes is demonstrated in the following figure:
![Duplicates](https://camo.githubusercontent.com/1dbd4874830b0654b2fc2e11cd2a650d498f6bc9/68747470733a2f2f646c2e64726f70626f7875736572636f6e74656e742e636f6d2f752f323837393330382f53637265656e25323053686f74253230323031352d30362d3239253230617425323031322e35362e30332e706e67)
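The grouping behind that check can be sketched outside the database as well. The snippet below is only an illustration under stated assumptions: the sample rows and their coordinates are made up (not taken from GeoNames), and `shared_locations` is a hypothetical helper that mirrors the `GROUP BY ... HAVING COUNT(*) > 1` logic on plain tuples.

```python
from collections import Counter

# Hypothetical sample rows: (adm0_a3, postal_code, longitude, latitude).
# Coordinates are invented for illustration only.
rows = [
    ("ESP", "28001", -3.70, 40.42),
    ("ESP", "28002", -3.70, 40.42),
    ("ESP", "28003", -3.70, 40.42),
    ("BRA", "01001", -46.63, -23.55),
    ("BRA", "01002", -46.64, -23.54),
]


def shared_locations(rows):
    """Return {(adm0_a3, lon, lat): count} for locations used by more than one row."""
    counts = Counter((iso3, lon, lat) for iso3, _code, lon, lat in rows)
    return {key: n for key, n in counts.items() if n > 1}


duplicates = shared_locations(rows)
print(duplicates)
```

With this sample data, only the location shared by the three Spanish rows is reported; the two Brazilian rows have distinct coordinates and are left out.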
# History:
* [19/10/2015]:
* Updates readme with usage examples and setup scripts
* [08/10/2015]:
* Adds response data types
* [15/07/2015]:
* Adds basic tests for postal codes polygons
* [02/07/2015]:
* Adds known deficiencies and coverage map
* [24/06/2015]:
* Updated readme.md: added information about tables, function, indexes and the known issues section.
* Review and [upload functions](https://github.com/CartoDB/data-services/pull/152)

@@ -0,0 +1,5 @@
-- Response types for postal codes geocoder
CREATE TYPE geocode_namedplace_v1 AS (q TEXT, geom GEOMETRY, success BOOLEAN);
CREATE TYPE geocode_place_country_iso_v1 AS (iso3 TEXT, c TEXT, q TEXT, geom GEOMETRY, success BOOLEAN);
CREATE TYPE geocode_namedplace_country_v1 AS (c TEXT, q TEXT, geom GEOMETRY, success BOOLEAN);
CREATE TYPE geocode_postalint_country_v1 AS (c TEXT, q INTEGER, geom GEOMETRY, success BOOLEAN);

@@ -0,0 +1 @@
-- Triggers for postal codes geocoder