Compare commits

...

57 Commits

Author SHA1 Message Date
Rafa de la Torre
4c93258cd2 Prepare release of 0.28.1 2019-07-04 16:09:53 +02:00
Javier Goizueta
bf140890d8 Merge pull request #365 from CartoDB/table-syncer-notmps
Sync tables without using temporary hash tables
2019-07-04 16:05:21 +02:00
Rafa de la Torre
29a31d4c40 Merge pull request #362 from CartoDB/fdw-affected-tables
Make CDB_Get_Foreign_Updated_At robust to missing CDB_TableMetadata
2019-07-04 16:02:46 +02:00
Javier Goizueta
dbd403a2f6 Fix cases of double-quoting identifiers 2019-07-04 12:47:58 +02:00
Javier Goizueta
cb353ec6a8 Add tests for quoted table & column names 2019-07-04 12:47:01 +02:00
Javier Goizueta
2beabfced6 Add test for double schema-quoting 2019-07-03 18:48:49 +02:00
Javier Goizueta
7bdee5c13e Avoid double-quoting
Since dst_schema is a REGNAMESPACE, it is automatically quoted when casted to TEXT
2019-07-03 18:34:54 +02:00
Javier Goizueta
c07784566a NEWS for next release 2019-07-03 17:02:36 +02:00
Javier Goizueta
2e1fe2933c Fix whitespace 2019-07-03 16:56:29 +02:00
Rafa de la Torre
d378ca6fe0 Test for missing foreign CDB_TableMetadata 2019-07-03 16:25:16 +02:00
Rafa de la Torre
446f4113d9 Return NULL instead of NOW()
The absence of foreign CDB_TableMetadata actually means that we cannot
really tell when a remote table was modified.

Therefore we're using NULL with the meaning of "I don't know when it
was last modified".

To be taken in other caching layers, to adjust headers accordingly.
2019-07-03 16:19:46 +02:00
Javier Goizueta
5963c67376 Order the columns of a cartodbfied table consistently
The final order of the columns of a cartodbfied table wasn't uniquely specified, so could vary across PG versions.
This was a problem in particular for having deterministic test results.
2019-07-03 16:16:37 +02:00
Javier Goizueta
f5f18ca57c Fix the order of the columns 2019-07-02 18:14:18 +02:00
Javier Goizueta
d8c840d126 Quote table and column names when necessary
Also use type NAME when appropriate. Note that quoted column names are not NAMES (may be longer).
2019-07-02 18:14:04 +02:00
Javier Goizueta
65483743b4 Remove unused vars 2019-07-02 17:44:06 +02:00
Javier Goizueta
4651883454 Fix order of test results 2019-07-02 17:43:13 +02:00
Javier Goizueta
470bae6268 Merge branch 'master' into table-syncer-notmps
# Conflicts:
#	scripts-available/CDB_SyncTable.sql
2019-07-02 12:21:49 +02:00
Javier Goizueta
75ba13b834 Merge pull request #364 from CartoDB/synctable-doc
Document CDB_SyncTable
2019-07-02 10:20:24 +02:00
Javier Goizueta
057695361d Fix typo 2019-07-01 17:49:42 +02:00
Javier Goizueta
0e1aeb0a76 Minor copy edit 2019-07-01 17:00:22 +02:00
Javier Goizueta
f2dae651b3 More complete sync table example 2019-07-01 16:58:24 +02:00
Javier Goizueta
9f414938ba Document CDB_SyncTable 2019-07-01 16:41:40 +02:00
Javier Goizueta
4f7b07f922 Merge pull request #363 from CartoDB/fix-sync-schema
Fix cartodb schema references
2019-07-01 16:15:25 +02:00
Javier Goizueta
69cc56b589 Remove schema from public function examples 2019-07-01 15:13:48 +02:00
Javier Goizueta
1028b24333 Fix cartodb schema references 2019-07-01 14:48:42 +02:00
Javier Goizueta
4d1c4f6a22 Release 0.28.0 2019-07-01 10:56:06 +02:00
Javier Goizueta
83ab128a01 Merge pull request #355 from CartoDB/table-syncer
[WIP] Table syncer
2019-07-01 10:49:06 +02:00
Rafa de la Torre
b0b4a92240 Add a test case for missing foreign CDB_Tablementadata 2019-06-28 16:19:57 +02:00
Rafa de la Torre
c06d24aa19 Make CDB_Get_Foreign_Updated_At robust to missing CDB_TableMetadata
This may happen with non-carto DB's, when checking the updated_at
times and not finding the corresponding remote.cdb_tablemetadata
imported from the foreign non-carto DB.

Instead of failing, return a NOW() timestampt, so that caching logic
just assumes there may have been changes.

This makes it work today, and leaves open the possibility of adding
the required carto metadata for homogeneous caching in the future.
2019-06-28 16:25:52 +02:00
Javier Goizueta
4f4df2de8d Merge branch 'table-syncer' into table-syncer-notmps 2019-06-28 15:28:06 +02:00
Javier Goizueta
b5f36902c5 Merge branch 'master' into table-syncer 2019-06-28 13:50:46 +02:00
Javier Goizueta
0bcbf6708a Remove usage of temporary tables 2019-06-28 13:49:15 +02:00
Gonzalo Riestra
2e665a56b4 release 0.27.2 2019-06-21 09:19:04 +02:00
Gonzalo Riestra
1e8ef2e1d9 Merge pull request #360 from CartoDB/fix-ghost-tables-trigger
Improve Ghost Tables functions
2019-06-21 09:15:20 +02:00
Gonzalo Riestra
e6ecde4346 add CREATE/ALTER/DROP FOREIGN TABLE to ddl commands 2019-06-20 11:20:48 +02:00
Gonzalo Riestra
2d42e6197a use bigint to store values from txid_current() 2019-06-20 11:20:03 +02:00
Gonzalo Riestra
5605fdd9b2 fix event trigger 2019-06-19 15:27:28 +02:00
Sergio Conde Gómez
ab7082d4c3 Merge pull request #359 from CartoDB/fix_carto-package
Fix carto-package requirements to be SemVer compliant
2019-06-17 13:40:35 +02:00
Sergio Conde Gomez
f9ac627e0e Fix carto-package requirements to be SemVer compliant 2019-06-17 13:33:24 +02:00
Raul Marin
cc2066ea89 Release 0.27.1 2019-06-03 17:57:42 +02:00
Raúl Marín
aa380171b4 Merge pull request #358 from Algunenano/missing
Add missing qualifications
2019-06-03 16:26:10 +02:00
Raul Marin
a43abb37e0 Add missing qualifications 2019-06-03 15:22:52 +02:00
Rafa de la Torre
42dc03d77b Qualify functions with extension schema
This avoids some issues with search_path and scripting black magic.
2019-05-30 16:56:12 +02:00
Rafa de la Torre
9254723719 Make format expression more readable
Make an EXECUTE format('', param1, ..., paramN) more readable by
adding the index of the argument to print (`position`) in the format
specifier.
2019-05-29 13:13:30 +02:00
Rafa de la Torre
26ad966ab6 Add some timing info 2019-05-28 16:40:01 +02:00
Rafa de la Torre
a2723a3c90 Exclude certain columns from sync if instructed to do so
For the Geocoding (and in general for LDS use cases) it may come in
handy to exclude geometry columns from the list of stuff to
syncrhonize. Otherwise they may be lost, overwritten with NULL values.
2019-05-28 16:11:56 +02:00
Rafa de la Torre
2f8ea7e4ea Avoid tables name clashing when executing within same transaction
Generate more unique temp table names when the CDB_SyncTable function
is executed multiple times within the same transaction.

When executed in isolation, there will be always an implicit
surrounding transaction.

But when executed several times within the same transaction it can
give an `ERROR:  relation "src_sync_718794" already exists`.

E.g:

```
BEGIN;
SELECT cartodb.CDB_SyncTable('source1', 'public', 'dest1');
SELECT cartodb.CDB_SyncTable('source12, 'public', 'dest2');
COMMIT;
```
2019-05-28 15:39:02 +02:00
Rafa de la Torre
45c21d7f42 Make other tests resilient to CDB_SyncTable test failures
Use a transaction to avoid leaving stuff after our tests, that affect
Quota and User Tables tests.
2019-05-28 15:33:55 +02:00
Rafa de la Torre
f442c21fa4 Fix the test/CDB_AnalysisCheckTest.sql
These tests are failing in PG11:

```
*** /home/travis/build/CartoDB/cartodb-postgresql/expected/test/CDB_AnalysisCheckTest.out	2019-05-27 16:09:45.063543994 +0000
--- /home/travis/build/CartoDB/cartodb-postgresql/results/test/CDB_AnalysisCheckTest.out	2019-05-27 16:12:39.770847666 +0000
***************
*** 5,12 ****
  CREATE TABLE
  CREATE TABLE
  CREATE TABLE
- (analysis_2f13a3dbd7_41bd92976fc6dd97072afe4ee450054f4c0715d5,public,analysis_2f13a3dbd7_41bd92976fc6dd97072afe4ee450054f4c0715d5)
  (analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94,public,analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94)
  0

  1
--- 5,12 ----
  CREATE TABLE
  CREATE TABLE
  CREATE TABLE
  (analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94,public,analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94)
+ (analysis_2f13a3dbd7_41bd92976fc6dd97072afe4ee450054f4c0715d5,public,analysis_2f13a3dbd7_41bd92976fc6dd97072afe4ee450054f4c0715d5)
  0

  1
```

The reason for that is that they rely on row ordering that cannot be
guaranteed as per SQL Standard. Forcing that assumed ordering is
enough to get it working again.
2019-05-28 09:34:10 +02:00
Rafa de la Torre
ee9d08a2be Some functional E2E tests for the CDB_SyncTable() function 2019-05-27 17:56:36 +02:00
Rafa de la Torre
7606585672 Perf optimization: use EXCEPT instead of NOT IN
With javitonino's help, greatly reduce the processing time by using
EXCEPT instead of NOT IN, which causes it to use a `HashSetOp Except`
plan on the subqueries rather than a `Seq Scan` on `Materialize`'d
subtables.
2019-05-27 15:20:06 +02:00
Rafa de la Torre
81d0f338cf Create HASH indices on the temp tables 2019-05-27 15:18:13 +02:00
Rafa de la Torre
951f257654 Simplify code by using ON COMMIT DROP
Simplify code by relying on automatic removal of temp tables with ON
COMMIT DROP, so that we avoid the messy EXCEPTION management.
2019-05-27 12:02:12 +02:00
Rafa de la Torre
f461faf0b6 Simplify udpate: use * instead of column list 2019-05-27 11:56:41 +02:00
Rafa de la Torre
a8d57abda6 Update changed rows 2019-05-27 11:54:38 +02:00
Rafa de la Torre
982ddfdeff Helper function to generate UPDATE SET clause 2019-05-27 11:31:07 +02:00
Rafa de la Torre
da4331ac78 WIP: A function to sync tables
It assumes there's a cartodb_id column in both source and target. It
does not perform unnecessary actions. It respects augmented columns in
target table, if they exist. It is meant to be efficient.
2019-05-24 18:38:28 +02:00
15 changed files with 451 additions and 10 deletions

View File

@@ -1,7 +1,7 @@
# cartodb/Makefile
EXTENSION = cartodb
EXTVERSION = 0.27.0
EXTVERSION = 0.28.1
SED = sed
AWK = awk
@@ -97,6 +97,10 @@ UPGRADABLE = \
0.26.0 \
0.26.1 \
0.27.0 \
0.27.1 \
0.27.2 \
0.28.0 \
0.28.1 \
$(EXTVERSION)dev \
$(EXTVERSION)next \
$(END)

13
NEWS.md
View File

@@ -1,3 +1,16 @@
0.28.1 (2019-07-04)
* Avoid temporary tables creation in CDB_SyncTable (#366)
* Make CDB_Get_Foreign_Updated_At robust to missing CDB_TableMetadata (#362)
0.28.0 (2019-07-01)
* New function CDB_SyncTable (#355)
0.27.2 (2019-06-21)
* Improvements and fixes in Ghost tables functions (#360)
0.27.1 (2019-06-03)
* Add some qualifications that were left in the previous release.
0.27.0 (2019-06-03)
* Fully qualify function calls
* Several improvements to bash tests.

View File

@@ -2,7 +2,7 @@
"name": "carto_postgresql_ext",
"current_version": {
"requires": {
"postgresql": ">=10.0",
"postgresql": ">=10.0.0",
"postgis": ">=2.4.0.0"
},
"works_with": {

56
doc/CDB_SyncTable.md Normal file
View File

@@ -0,0 +1,56 @@
Synchronize two tables. This function will synchronize a *destination* table with a *source* table.
The idea is that the *destination* is a replica of *source* and *source* has been subject to
modifications that are to be applied to *destination*.
This will be achieved by deleting the rows in the destination not present
in the source, inserting rows of the source not in the destination and updating modified rows.
If the destination table does not exist it will be created and all the rows of the source inserted into it.
Both tables must have a consistent `cartodb_id` primary key column which will be used to match
the source and destination rows.
Note that both tables do not necessarily become identical after the synchronization, since additional columns
may have been added to the destination; those columns will not be altered by the synchronization.
In addition some source columns may be skipped by listing them in the optional last argument; such columns
will not be updated in the destination, so if they are present in it their values won't be altered.
#### Using the function
Import some data using COPY FROM into a temporary table, then synchronize a table with the data and
finally delete the temporary table. This could be used import and update some data periodically while
allowing to add columns to the data that will be preserved across updates.
```sql
CREATE tmp_pois(cartodb_id int, name text, type text, longitude double precision, latitude double precision, rank int);
COPY tmp_pois FROM '/tmp/pois.csv';
SELECT CDB_SyncTable('tmp_pois', 'public', 'pois');
DROP TABLE tmp_pois;
```
Now we could perform some changes to the `pois` to maintain our own ranking:
```sql
UPDATE pois SET rank = random()*4 + 1;
```
Then, if the source were updated at `/tmp/pois.csv` we could synchronize with it while preserving our `rank` values with:
```sql
CREATE tmp_pois(cartodb_id int, name text, type text, longitude double precision, latitude double precision, rank int);
COPY tmp_pois FROM '/tmp/pois.csv';
SELECT CDB_SyncTable('tmp_pois', 'public', 'pois', '{rank}');
DROP TABLE tmp_pois;
```
#### Arguments
```
CDB_SyncTable(src_table, dst_schema, dst_table, skip_cols)
```
* **src_table** REGCLASS the source data for the synchronization
* **dst_scgena** REGNAMESPACE the destination schema
* **dst_table** NAME the destination table to be updated
* **skip_cols** NAME[] an array of column names, empty by default, which will be skipped

View File

@@ -1071,7 +1071,7 @@ BEGIN
-- by selecting their names into an array and
-- joining the array with a comma
SELECT
',' || array_to_string(array_agg(Format('%I',a.attname)),',') AS column_name_sql,
',' || array_to_string(array_agg(Format('%I',a.attname) ORDER BY a.attnum),',') AS column_name_sql,
Count(*) AS count
INTO rec
FROM pg_class c

View File

@@ -125,7 +125,14 @@ BEGIN
-- We assume that the remote cdb_tablemetadata is called cdb_tablemetadata and is on the same schema as the queried table.
SELECT nspname FROM pg_class c, pg_namespace n WHERE c.oid=foreign_table AND c.relnamespace = n.oid INTO fdw_schema_name;
EXECUTE FORMAT('SELECT updated_at FROM %I.cdb_tablemetadata WHERE tabname=%L ORDER BY updated_at DESC LIMIT 1', fdw_schema_name, remote_table_name) INTO time;
BEGIN
EXECUTE FORMAT('SELECT updated_at FROM %I.cdb_tablemetadata WHERE tabname=%L ORDER BY updated_at DESC LIMIT 1', fdw_schema_name, remote_table_name) INTO time;
EXCEPTION
WHEN undefined_table THEN
-- If you add a GET STACKED DIAGNOSTICS text_var = RETURNED_SQLSTATE
-- you get a code 42P01 which corresponds to undefined_table
RAISE NOTICE 'CDB_Get_Foreign_Updated_At: could not find %.cdb_tablemetadata while checking % updated_at, returning NULL timestamp', fdw_schema_name, foreign_table;
END;
RETURN time;
END
$$

View File

@@ -83,7 +83,7 @@ CREATE OR REPLACE FUNCTION @extschema@.CDB_SaveDDLTransaction()
RETURNS event_trigger
AS $$
BEGIN
INSERT INTO @extschema@.cdb_ddl_execution VALUES (txid_current(), tg_tag) ON CONFLICT (txid) DO NOTHING;
INSERT INTO @extschema@.cdb_ddl_execution VALUES (txid_current(), tg_tag) ON CONFLICT ON CONSTRAINT cdb_ddl_execution_pkey DO NOTHING;
END;
$$ LANGUAGE plpgsql VOLATILE PARALLEL UNSAFE SECURITY DEFINER;
@@ -96,7 +96,7 @@ AS $$
DROP TRIGGER IF EXISTS check_ddl_update ON @extschema@.cdb_ddl_execution;
-- Table to store the transaction id from DDL events to avoid multiple executions
CREATE TABLE IF NOT EXISTS @extschema@.cdb_ddl_execution(txid integer PRIMARY KEY, tag text);
CREATE TABLE IF NOT EXISTS @extschema@.cdb_ddl_execution(txid bigint PRIMARY KEY, tag text);
CREATE CONSTRAINT TRIGGER check_ddl_update
AFTER INSERT ON @extschema@.cdb_ddl_execution
@@ -106,7 +106,7 @@ AS $$
CREATE EVENT TRIGGER link_ghost_tables
ON ddl_command_end
WHEN TAG IN ('CREATE TABLE', 'SELECT INTO', 'DROP TABLE', 'ALTER TABLE', 'CREATE TRIGGER', 'DROP TRIGGER', 'CREATE VIEW', 'DROP VIEW', 'ALTER VIEW')
WHEN TAG IN ('CREATE TABLE', 'SELECT INTO', 'DROP TABLE', 'ALTER TABLE', 'CREATE TRIGGER', 'DROP TRIGGER', 'CREATE VIEW', 'DROP VIEW', 'ALTER VIEW', 'CREATE FOREIGN TABLE', 'ALTER FOREIGN TABLE', 'DROP FOREIGN TABLE')
EXECUTE PROCEDURE @extschema@.CDB_SaveDDLTransaction();
END;
$$ LANGUAGE plpgsql VOLATILE PARALLEL UNSAFE;

View File

@@ -66,7 +66,7 @@ AS $$
FROM
@extschema@._CDB_UserTablesInSchema(), unnest(tables) base_table
WHERE
schema_name = _cdb_schema_name(base_table)
schema_name = @extschema@._cdb_schema_name(base_table)
AND @extschema@._CDB_IsOverviewTableOf((SELECT relname FROM pg_class WHERE oid=base_table), table_name)
ORDER BY base_table, z;
$$ LANGUAGE SQL STABLE PARALLEL SAFE;

View File

@@ -0,0 +1,167 @@
/*
Gets the column names of a given table.
Sample usage:
SELECT @extschema@._CDB_GetColumns('public.films');
*/
CREATE OR REPLACE FUNCTION @extschema@._CDB_GetColumns(src_table REGCLASS)
RETURNS SETOF NAME
AS $$
SELECT
a.attname as "colname"
FROM
pg_catalog.pg_attribute a
WHERE
a.attnum > 0
AND NOT a.attisdropped
AND a.attrelid = (
SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = src_table::oid
AND pg_catalog.pg_table_is_visible(c.oid)
)
ORDER BY a.attnum;
$$ LANGUAGE sql STABLE PARALLEL UNSAFE;
/*
Given an array of quoted column names, it generates an UPDATE SET
clause with the following form:
the_geom = changed.the_geom,
id = changed.id,
elevation = changed.elevation
Example of usage:
SELECT @extschema@.__CDB_GetUpdateSetClause('{the_geom, id, elevation}', 'changed');
*/
CREATE OR REPLACE FUNCTION @extschema@.__CDB_GetUpdateSetClause(colnames TEXT[], update_source TEXT)
RETURNS TEXT
AS $$
DECLARE
set_clause_list TEXT[];
col TEXT;
BEGIN
FOREACH col IN ARRAY colnames
LOOP
set_clause_list := array_append(set_clause_list, format('%1$s = %2$s.%1$s', col, update_source));
END lOOP;
RETURN array_to_string(set_clause_list, ', ');
END;
$$ LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE;
/*
Given a prefix, generate a safe unique NAME for a temp table.
Example of usage:
SELECT @extschema@.__CDB_GenerateUniqueName('src_sync'); --> src_sync_718794_120106
*/
CREATE OR REPLACE FUNCTION @extschema@.__CDB_GenerateUniqueName(prefix TEXT)
RETURNS NAME
AS $$
SELECT format('%s_%s_%s', prefix, txid_current(), (random()*1000000)::int)::NAME;
$$ LANGUAGE sql VOLATILE PARALLEL UNSAFE;
/*
Given a table name and an array of column names,
return array of column names qualified with the table name and quoted when necessary
tablename and colnames should be properly quoted, and for this reason the type NAME is not
used for them (with quotes they could exceed the maximum identifier length)
Example of usage:
SELECT @extschema@.__CDB_QualifyColumns('t', ARRAY['a','"b-1"']); --> ARRAY['t.a','t."b-1"']
*/
CREATE OR REPLACE FUNCTION @extschema@.__CDB_QualifyColumns(tablename NAME, colnames NAME[]) RETURNS TEXT[] AS
$$
SELECT array_agg(tablename || '.' || _colname) from unnest(colnames) _colname;
$$ LANGUAGE sql IMMUTABLE PARALLEL SAFE;
/*
A Table Syncer
Assumptions:
- Both tables contain a consistent cartodb_id column
- Destination table has all columns of the source or does not exist
Sample usage:
SELECT CDB_SyncTable('radar_stations', 'public', 'syncdest');
SELECT CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest', '{the_geom, the_geom_webmercator}');
*/
CREATE OR REPLACE FUNCTION @extschema@.CDB_SyncTable(src_table REGCLASS, dst_schema REGNAMESPACE, dst_table NAME, skip_cols NAME[] = '{}')
RETURNS void
AS $$
DECLARE
fq_dest_table TEXT;
colnames TEXT[];
dst_colnames TEXT;
src_colnames TEXT;
update_set_clause TEXT;
num_rows BIGINT;
err_context text;
t timestamptz;
BEGIN
-- If the destination table does not exist, just copy the source table
fq_dest_table := format('%s.%I', dst_schema, dst_table);
EXECUTE format('CREATE TABLE IF NOT EXISTS %s as TABLE %s', fq_dest_table, src_table);
GET DIAGNOSTICS num_rows = ROW_COUNT;
IF num_rows > 0 THEN
RAISE NOTICE 'INSERTED % row(s)', num_rows;
RETURN;
END IF;
skip_cols := skip_cols || '{cartodb_id}';
-- Get the list of columns from the source table, excluding skip_cols
SELECT ARRAY(SELECT quote_ident(c) FROM @extschema@._CDB_GetColumns(src_table) as c EXCEPT SELECT unnest(skip_cols)) INTO colnames;
-- Deal with deleted rows: ids in dest but not in source
t := clock_timestamp();
EXECUTE format(
'DELETE FROM %1$s _dst WHERE NOT EXISTS (SELECT * FROM %2$s _src WHERE _src.cartodb_id=_dst.cartodb_id)',
fq_dest_table, src_table);
GET DIAGNOSTICS num_rows = ROW_COUNT;
RAISE NOTICE 'DELETED % row(s)', num_rows;
RAISE DEBUG 'DELETE time (s): %', clock_timestamp() - t;
-- Deal with inserted rows: ids in source but not in dest
t := clock_timestamp();
EXECUTE format('
INSERT INTO %1$s(cartodb_id, %2$s)
SELECT cartodb_id, %2$s FROM %3$s _src WHERE NOT EXISTS (SELECT * FROM %1$s _dst WHERE _src.cartodb_id=_dst.cartodb_id)
', fq_dest_table, array_to_string(colnames, ','), src_table);
GET DIAGNOSTICS num_rows = ROW_COUNT;
RAISE NOTICE 'INSERTED % row(s)', num_rows;
RAISE DEBUG 'INSERT time (s): %', clock_timestamp() - t;
-- Deal with modified rows: ids in source and dest but different hashes
t := clock_timestamp();
update_set_clause := @extschema@.__CDB_GetUpdateSetClause(colnames, '_changed');
dst_colnames := array_to_string(@extschema@.__CDB_QualifyColumns('_dst', colnames), ',');
src_colnames := array_to_string(@extschema@.__CDB_QualifyColumns('_src', colnames), ',');
EXECUTE format('
UPDATE %1$s _update SET %2$s
FROM (
SELECT _src.* FROM %3$s _src JOIN %1$s _dst ON (_dst.cartodb_id = _src.cartodb_id)
WHERE md5(ROW(%4$s)::text) <> md5(ROW(%5$s)::text)
) _changed
WHERE _update.cartodb_id = _changed.cartodb_id;
', fq_dest_table, update_set_clause, src_table, dst_colnames, src_colnames);
GET DIAGNOSTICS num_rows = ROW_COUNT;
RAISE NOTICE 'MODIFIED % row(s)', num_rows;
RAISE DEBUG 'UPDATE time (s): %', clock_timestamp() - t;
END;
$$ LANGUAGE plpgsql VOLATILE PARALLEL UNSAFE;

View File

@@ -124,7 +124,7 @@ DROP TRIGGER IF EXISTS table_modified ON @extschema@.CDB_TableMetadata;
-- oid (regclass) to its name
CREATE TRIGGER table_modified AFTER INSERT OR UPDATE
ON @extschema@.CDB_TableMetadata FOR EACH ROW EXECUTE PROCEDURE
_CDB_TableMetadata_Updated();
@extschema@._CDB_TableMetadata_Updated();
-- similar to TOUCH(1) in unix filesystems but for table in cdb_tablemetadata

View File

@@ -0,0 +1 @@
../scripts-available/CDB_SyncTable.sql

View File

@@ -7,7 +7,7 @@ SELECT _CDB_AnalysisDataSize('public');
CREATE TABLE analysis_2f13a3dbd7_41bd92976fc6dd97072afe4ee450054f4c0715d5(id int);
CREATE TABLE analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94(id int);
CREATE TABLE analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da9(id int);
SELECT _CDB_AnalysisTablesInSchema('public');
SELECT _CDB_AnalysisTablesInSchema('public') t ORDER BY t;
SELECT _CDB_AnalysisDataSize('public');
SELECT CDB_CheckAnalysisQuota('analysis_2f13a3dbd7_f00cee44e9e6152b450bde3a92eb9ae0d099da94');
SELECT CDB_SetUserQuotaInBytes(1);

View File

@@ -0,0 +1,94 @@
-- Setup: create and populate a table to test the syncs
\set QUIET on
BEGIN;
SET client_min_messages TO error;
CREATE TABLE test_sync_source (
cartodb_id bigint,
lat double precision,
lon double precision,
name text
);
INSERT INTO test_sync_source VALUES
(1, 1.0, 1.0, 'foo'),
(2, 2.0, 2.0, 'bar'),
(3, 3.0, 3.0, 'patata'),
(4, 4.0, 4.0, 'melon');
SET client_min_messages TO notice;
\set QUIET off
\echo 'First table sync: it should be simply just copied to the destination'
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest');
\echo 'Next table sync: there shall be no changes'
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest');
\echo 'Remove a row from the source and check it is deleted from the dest table'
DELETE FROM test_sync_source WHERE cartodb_id = 3;
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest');
\echo 'Insert a new row and check that it is inserted in the dest table'
INSERT INTO test_sync_source VALUES (5, 5.0, 5.0, 'sandia');
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest');
\echo 'Modify row and check that it is modified in the dest table'
UPDATE test_sync_source SET name = 'cantaloupe' WHERE cartodb_id = 4;
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest');
\echo 'Sanity check: the end result is the same source table'
SELECT * FROM test_sync_source ORDER BY cartodb_id;
SELECT * FROM test_sync_dest ORDER BY cartodb_id;
\echo 'It shall exclude geom columns if instructed to do so'
\set QUIET on
SET client_min_messages TO error;
SELECT cartodb.CDB_SetUserQuotaInBytes(0); -- Set user quota to infinite
SELECT cartodb.CDB_CartodbfyTable('test_sync_source');
SELECT cartodb.CDB_CartodbfyTable('test_sync_dest');
UPDATE test_sync_dest SET the_geom = cartodb.CDB_LatLng(lat, lon); -- A "gecoding"
\set QUIET off
SET client_min_messages TO notice;
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest', '{the_geom, the_geom_webmercator}');
SELECT * FROM test_sync_source ORDER BY cartodb_id;
SELECT * FROM test_sync_dest ORDER BY cartodb_id;
\echo 'It will work with schemas that need quoting'
\set QUIET on
SET client_min_messages TO error;
CREATE SCHEMA "sch-ema";
CREATE TABLE "test_sync_source2" AS SELECT * FROM test_sync_source;
\set QUIET off
SELECT cartodb.CDB_SyncTable('test_sync_source2', 'sch-ema', 'test_sync_dest');
INSERT INTO test_sync_source2(cartodb_id, lat, lon, name) VALUES (6, 6.0, 6.0, 'papaya');
DELETE FROM test_sync_source2 WHERE cartodb_id = 4;
UPDATE test_sync_source2 SET lat = 2.5 WHERE cartodb_id = 2;
SET client_min_messages TO notice;
SELECT cartodb.CDB_SyncTable('test_sync_source2', 'sch-ema', 'test_sync_dest');
\echo 'It will work with table names that need quoting'
\set QUIET on
SET client_min_messages TO error;
CREATE TABLE "test-sync-source" AS SELECT * FROM test_sync_source;
\set QUIET off
SELECT cartodb.CDB_SyncTable('test-sync-source', 'public', 'test-sync-dest');
INSERT INTO "test-sync-source"(cartodb_id, lat, lon, name) VALUES (6, 6.0, 6.0, 'papaya');
DELETE FROM "test-sync-source" WHERE cartodb_id = 4;
UPDATE "test-sync-source" SET lat = 2.5 WHERE cartodb_id = 2;
SET client_min_messages TO notice;
SELECT cartodb.CDB_SyncTable('test-sync-source', 'public', 'test-sync-dest');
\echo 'It will work with column names that need quoting'
\set QUIET on
SET client_min_messages TO error;
ALTER TABLE test_sync_source ADD COLUMN "a-column" int;
\set QUIET off
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest2');
INSERT INTO test_sync_source(cartodb_id, lat, lon, name) VALUES (6, 6.0, 6.0, 'papaya');
DELETE FROM test_sync_source WHERE cartodb_id = 4;
UPDATE test_sync_source SET lat = 2.5 WHERE cartodb_id = 2;
SET client_min_messages TO notice;
SELECT cartodb.CDB_SyncTable('test_sync_source', 'public', 'test_sync_dest2');
-- Cleanup
ROLLBACK;

View File

@@ -0,0 +1,94 @@
First table sync: it should be simply just copied to the destination
NOTICE: INSERTED 4 row(s)
Next table sync: there shall be no changes
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: DELETED 0 row(s)
NOTICE: INSERTED 0 row(s)
NOTICE: MODIFIED 0 row(s)
Remove a row from the source and check it is deleted from the dest table
DELETE 1
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: DELETED 1 row(s)
NOTICE: INSERTED 0 row(s)
NOTICE: MODIFIED 0 row(s)
Insert a new row and check that it is inserted in the dest table
INSERT 0 1
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: DELETED 0 row(s)
NOTICE: INSERTED 1 row(s)
NOTICE: MODIFIED 0 row(s)
Modify row and check that it is modified in the dest table
UPDATE 1
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: DELETED 0 row(s)
NOTICE: INSERTED 0 row(s)
NOTICE: MODIFIED 1 row(s)
Sanity check: the end result is the same source table
1|1|1|foo
2|2|2|bar
4|4|4|cantaloupe
5|5|5|sandia
1|1|1|foo
2|2|2|bar
4|4|4|cantaloupe
5|5|5|sandia
It shall exclude geom columns if instructed to do so
0
test_sync_source
test_sync_dest
SET
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: cdb_invalidate_varnish(public.test_sync_dest) called
NOTICE: DELETED 0 row(s)
NOTICE: cdb_invalidate_varnish(public.test_sync_dest) called
NOTICE: INSERTED 0 row(s)
NOTICE: cdb_invalidate_varnish(public.test_sync_dest) called
NOTICE: MODIFIED 0 row(s)
1|||1|1|foo
2|||2|2|bar
4|||4|4|cantaloupe
5|||5|5|sandia
1|0101000020E6100000000000000000F03F000000000000F03F|0101000020110F0000DB0B4ADA772DFB402B432E49D22DFB40|1|1|foo
2|0101000020E610000000000000000000400000000000000040|0101000020110F00003C0C4ADA772D0B4177F404ABE12E0B41|2|2|bar
4|0101000020E610000000000000000010400000000000001040|0101000020110F00003C0C4ADA772D1B4160AB497020331B41|4|4|cantaloupe
5|0101000020E610000000000000000014400000000000001440|0101000020110F000099476EE86AFC20413E7EB983F2012141|5|5|sandia
It will work with schemas that need quoting
INSERT 0 1
DELETE 1
UPDATE 1
SET
NOTICE: relation "test_sync_dest" already exists, skipping
NOTICE: DELETED 1 row(s)
NOTICE: INSERTED 1 row(s)
NOTICE: MODIFIED 1 row(s)
It will work with table names that need quoting
INSERT 0 1
DELETE 1
UPDATE 1
SET
NOTICE: relation "test-sync-dest" already exists, skipping
NOTICE: DELETED 1 row(s)
NOTICE: INSERTED 1 row(s)
NOTICE: MODIFIED 1 row(s)
It will work with column names that need quoting
INSERT 0 1
DELETE 1
UPDATE 1
SET
NOTICE: relation "test_sync_dest2" already exists, skipping
NOTICE: DELETED 1 row(s)
NOTICE: INSERTED 1 row(s)
NOTICE: MODIFIED 1 row(s)
ROLLBACK

View File

@@ -586,6 +586,11 @@ test_extension|public|"local-table-with-dashes"'
sql postgres "SELECT cartodb.CDB_Last_Updated_Time(ARRAY['test_extension.public.\"local-table-with-dashes\"']::text[]) < now()" should 't'
sql postgres "SELECT cartodb.CDB_Last_Updated_Time(ARRAY['test_extension.public.\"local-table-with-dashes\"']::text[]) > (now() - interval '1 minute')" should 't'
# Check CDB_Get_Foreign_Updated_At is robust to unimported CDB_TableMetadata
sql postgres "DROP FOREIGN TABLE IF EXISTS test_fdw.cdb_tablemetadata;"
sql postgres "SELECT cartodb.CDB_Get_Foreign_Updated_At('test_fdw.foo') IS NULL" should 't'
# Teardown
DATABASE=fdw_target sql postgres 'REVOKE USAGE ON SCHEMA test_fdw FROM fdw_user;'
DATABASE=fdw_target sql postgres 'REVOKE SELECT ON test_fdw.foo FROM fdw_user;'
DATABASE=fdw_target sql postgres 'REVOKE SELECT ON test_fdw.foo2 FROM fdw_user;'