Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. First double check if Dictionary has duplicates by executing next SQL, don’t to forget to provide actual Dictionary id instead of <dictionary_oid>:

    Code Block
    languagesql
    SELECT a.*
    FROM tw_term a
             JOIN (SELECT term_id, dictionary_oid, COUNT(*)
                   FROM tw_term
                   GROUP BY term_id, dictionary_oid
                   HAVING count(term_id) > 1
                      and dictionary_oid = '<dictionary_oid>') b
                  ON a.term_id = b.term_id
                      AND a.dictionary_oid = b.dictionary_oid
    ORDER BY a.term_id
  2. If you see duplicate records and want to continue then go to step 3

  3. Stop Tomcat and backup database

  4. To update duplicate records, we need to launch first script which will generate update scripts, don’t to forget to provide actual Dictionary id instead of <dictionary_oid>:

    Code Block
    languagesql
    SELECT (CONCAT('UPDATE tw_term AS a SET a.term_id = CONCAT(\'', CONCAT(REGEXP_REPLACE(a.term_id, '(\\d+-)\\d+', '$1')),
                   '\', (SELECT MAX(CONVERT(REGEXP_REPLACE(d.term_id, ''\\\\d+-(\\\\d+)'', ''$1'') , UNSIGNED INTEGER)) + 1 FROM (SELECT b.* FROM (SELECT c.term_id FROM tw_term AS c WHERE c.concept_oid = \'',
                   a.concept_oid, '\') AS b) AS d)) WHERE a.oid = \'', a.oid, '\';')) AS script
    FROM tw_term a
             JOIN (SELECT term_id, dictionary_oid
                   FROM tw_term
                   GROUP BY term_id, dictionary_oid
                   HAVING count(term_id) > 1
                      and dictionary_oid = '<dictionary_oid>') b
                  ON a.term_id = b.term_id
                      AND a.dictionary_oid = b.dictionary_oid
    ORDER BY a.term_id;

    In case when there is a lot of records you may save results to file via mysql cli:

    Code Block
    languagebash
    mysql -h localhost -u user -p --skip-column-names -e "SELECT (CONCAT('UPDATE tw_term AS a SET a.term_id = CONCAT(\'', CONCAT(REGEXP_REPLACE(a.term_id, '(\\\\d+-)\\\\d+', '\$1')), '\', (SELECT MAX(CONVERT(REGEXP_REPLACE(d.term_id, ''\\\\d+-(\\\\d+)'', ''\$1'') , UNSIGNED INTEGER)) + 1 FROM (SELECT b.* FROM (SELECT c.term_id FROM tw_term AS c WHERE c.concept_oid = \'', a.concept_oid, '\') AS b) AS d)) WHERE a.oid = \'', a.oid, '\';')) AS script FROM tw_term a JOIN (SELECT term_id, dictionary_oid FROM tw_term GROUP BY term_id, dictionary_oid HAVING count(term_id) > 1 and dictionary_oid = '<dictionary_oid>') b ON a.term_id = b.term_id  AND a.dictionary_oid = b.dictionary_oid ORDER BY a.term_id;" <database_name> > scripts.sql

    Don’t forget to change <dictionary_oid> and <database_name>. Please note that for bash script her has been added extra escape symbols \.

  5. Copy generated scripts and launch it in your favorite SQL editor program.
    In case of script file you may run it, like this: mysql -h localhost -u user -p <database_name> < scriptscripts.sql

  6. Start Tomcat and verify results.

...