pg_dump serial datatype issues

孤人 提交于 2019-12-10 09:33:36

问题


Could someone explain to me why a PostgreSQL table created with the following scripts:

CREATE TABLE users
(
  "id" serial NOT NULL,
  "name" character varying(150) NOT NULL,
  "surname" character varying (250) NOT NULL,
  "dept_id" integer NOT NULL,
  CONSTRAINT users_pkey PRIMARY KEY ("id")
)

gets dumped by pg_dump in the following format:

CREATE TABLE users(
      "id" integer NOT NULL,
      "name" character varying(150) NOT NULL,
      "surname" character varying (250) NOT NULL,
      "dept_id" integer NOT NULL
    );

ALTER TABLE users OWNER TO postgres;

CREATE SEQUENCE "users_id_seq"
        START WITH 1
        INCREMENT BY 1
        NO MINVALUE
        NO MAXVALUE
        CACHE 1;

ALTER TABLE "users_id_seq" OWNER TO postgres;
ALTER SEQUENCE "users_id_seq" OWNED BY users."id";
ALTER TABLE ONLY users
ADD CONSTRAINT users_pkey PRIMARY KEY ("id");

Obviously the above is only a small extract from the dump file.

Why does a pg_dump convert the datatypes serial to integer? When I restore the database from the dumped SQL file, it pretty much becomes useless because the autoincrementation stops working, and when adding new records from a front-end form, it fails with message along the lines 'id field cannot be empty', obviously because it is a primary key set to not null, but auto-incrementation should kick in and populate the field with the next value in the sequence.

Am I missing something here?


回答1:


From docs:

The data types smallserial, serial and bigserial are not true types, but merely a notational convenience for creating unique identifier columns (similar to the AUTO_INCREMENT property supported by some other databases). In the current implementation, specifying:

CREATE TABLE tablename (
    colname SERIAL
);

is equivalent to specifying:

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

Thus, we have created an integer column and arranged for its default values to be assigned from a sequence generator. A NOT NULL constraint is applied to ensure that a null value cannot be inserted. (In most cases you would also want to attach a UNIQUE or PRIMARY KEY constraint to prevent duplicate values from being inserted by accident, but this is not automatic.) Lastly, the sequence is marked as "owned by" the column, so that it will be dropped if the column or table is dropped.




回答2:


Well, it works here. Test snippet:

DROP SCHEMA tmpdump ;
CREATE SCHEMA tmpdump ;
set search_path = tmpdump ;

-- SELECT version();
DROP TABLE lusers;
CREATE TABLE lusers
(
  "id" serial NOT NULL
  , "name" character varying(150) NOT NULL
  , "surname" character varying (250) NOT NULL
  , "dept_id" integer NOT NULL
  , CONSTRAINT lusers_pkey PRIMARY KEY ("id")
);
INSERT INTO lusers ("name", "surname", "dept_id") VALUES
         ('Joop', 'Zoetemelk', 2) , ('Jan', 'Jansen', 3)
         , ('Ard', 'Schenk', 4) , ('Kees', 'Verkerk', 5);

Dump only the tmpdump schema with:

pg_dump -U someusername yourdbname -n tmpdump -F p --inserts | less



回答3:


here is a script I've done to export db as what I would write by hand:

#!/usr/bin/env python2
from __future__ import print_function

import re
import sys
from subprocess import check_output

re_seq_name = re.compile(r'OWNED BY .*\.(.+);')
re_pk = re.compile(r'\s*ADD (CONSTRAINT [\w_]+ PRIMARY KEY .*);')
re_fk = re.compile(
    r'\s*ADD (CONSTRAINT [\w_]+ FOREIGN KEY .*);')
re_fk_tbl = re.compile(r'FOREIGN KEY .* REFERENCES ([\w_]+)\s*\([\w_]+\)')
re_unique = re.compile(r'\s*ADD (CONSTRAINT [\w_]+ UNIQUE .*);')
re_tbl = re.compile(r'\s*CREATE TABLE IF NOT EXISTS ([\w_]+)')

env = {"PGHOST": "127.0.0.1",
       "PGDATABASE": "kadir",
       "PGUSER": "postgres",
       "PGPASSWORD": "password"}


def main():
    result = check_output(
        ["psql", "-tA", "-F,", "-c", r"\dt public.*"], env=env)

    table_names = []

    for line in result.split('\n'):
        if not line:
            continue
        table_name = line.split(",")[1]
        table_names.append(table_name)

    create_stmts = {}  # record all create table statement for topological sort
    tables_deps = []

    for table_name in table_names:
        result = check_output(["pg_dump", "-sx", "-t", table_name], env=env)

        sequences = {}
        constraints = []
        indexes = []
        lines = []
        for line in result.split("\n"):
            # remove unnecessary lines
            if not line:
                continue
            if line.startswith("--"):
                continue
            if line.startswith("SET"):
                continue
            if line.startswith("SELECT"):
                continue
            line = line.replace("CREATE TABLE", "CREATE TABLE IF NOT EXISTS")
            line = line.replace("public.", "")
            line = line.replace("timestamp without time zone", "timestamp")
            line = line.replace("character varying", "varchar")
            lines.append(line)

        # record sequences, constraints, indexes and fk deps
        deps = []
        for line in lines:
            if line.strip().startswith("CREATE SEQUENCE"):
                seq_name = line.replace("CREATE SEQUENCE", "").strip()
                seq_col_line = [
                    l for l in lines if seq_name in l and "ALTER SEQUENCE" in l]
                if len(seq_col_line) != 1:
                    raise Exception("expect one element")
                seq_col = re_seq_name.findall(seq_col_line[0])[0]
                sequences[seq_name] = seq_col
            if "PRIMARY KEY" in line:
                constraints.append(re_pk.findall(line)[0])
            if "FOREIGN KEY" in line:
                constraints.append(re_fk.findall(line)[0])
                deps.append(re_fk_tbl.findall(line)[0])
            if "UNIQUE" in line:
                constraints.append(re_unique.findall(line)[0])
            if line.strip().startswith("CREATE INDEX"):
                line = line.replace("USING btree ", "")
                line = line.replace(
                    "CREATE INDEX", "CREATE INDEX IF NOT EXISTS")
                indexes.append(line)

        tables_deps.append((table_name, deps))

        # extract create table statement
        start_index = (i for i, s in enumerate(lines)
                       if "CREATE TABLE" in s).next()
        end_index = (i for i, s in enumerate(lines)
                     if s.strip().startswith(");")).next()
        create_stmt = lines[start_index:end_index+1]

        # populate sequences
        for seq, col in sequences.items():
            (index, line) = ((i, s) for i, s in enumerate(create_stmt)
                             if s.strip().startswith(col)).next()
            if "bigint" in line:
                line = line.replace("bigint", "bigserial")
            elif "integer" in line:
                line = line.replace("integer", "serial")
            create_stmt[index] = line

        # insert constraints
        constraints = ["    "+c.strip() for c in constraints]
        constraints[:-1] = [c+"," for c in constraints[:-1]]

        create_stmt[end_index-1] = create_stmt[end_index-1]+",\n"
        create_stmt[end_index:end_index] = constraints
        create_stmt.extend(indexes)

        create_stmts[table_name] = create_stmt

    result = topological_sort(tables_deps)

    for table_name in result:
        for line in create_stmts[table_name]:
            print(line)
        print("\n")


def topological_sort(items):
    """shape of items: [(item1, (item2, item3))]"""
    result = []
    provided = set()
    remaining_items = list(items)
    all_items = set([i[0] for i in items])
    while remaining_items:
        emitted = False
        for i in remaining_items:
            item, dependencies = i
            dependencies = set(dependencies)
            if dependencies.issubset(provided):
                result.append(item)
                remaining_items.remove(i)
                provided.add(item)
                emitted = True
                break
        if not emitted:
            print("[Error]Dependency not found or cyclic dependency found:")
            # print("Found dependencies:", ", ".join(provided))
            for i in remaining_items:
                item, dependencies = i
                not_met = set(dependencies).difference(all_items)
                if not_met:
                    print("  ", item, "depends on", ", ".join(dependencies))
                    print("  dependency not met:", ", ".join(not_met))
            sys.exit(1)

    return result


if __name__ == '__main__':
    main()

Example output:

CREATE TABLE IF NOT EXISTS competency (
    id bigserial NOT NULL,
    competency_title varchar(64) DEFAULT NULL::varchar,
    competency_description varchar(2048),
    competency_category_id bigint,
    created_by bigint,
    created_date timestamp DEFAULT now(),
    changed_date timestamp DEFAULT now(),
    deleted_date timestamp,

    CONSTRAINT competency_competency_title_unique UNIQUE (competency_title),
    CONSTRAINT competency_pk PRIMARY KEY (id),
    CONSTRAINT competency_competency_category_id_fk FOREIGN KEY (competency_category_id) REFERENCES competency_category(id),
    CONSTRAINT competency_created_by_fk FOREIGN KEY (created_by) REFERENCES user_profile(user_id)
);
CREATE INDEX competency_competency_title_index ON competency (competency_title);


来源:https://stackoverflow.com/questions/33038651/pg-dump-serial-datatype-issues

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!