最近在折腾greenplum,遇到一个蛋疼的问题,那就是获取表结构,也就是建表语句。大家都知道在MySQL里面是非常easy的,show create table table_name 就搞定了,在gpdb里面就没这么容易,在查询大量资料以后终于找到了方法。那就是自己定义一个函数去获取,函数中可以嵌套python代码,非常的方便。但是资料中的代码有大量错误,在经过几番调试以后终于可以使用了。
如果没有这个函数其实也可以获取表结构,那就是只能导出这个表的结构进行查看了。导出表结构的命令是:
pg_dump -s --table=tb1_partition_range_yyyymmdd testdb > tb1_partition_range_yyyymmdd.sql
查看表结构:
[gpadmin@mdw ~]$ cat tb1_partition_range_yyyymmdd.sql -- -- Greenplum Database database dump -- SET statement_timeout = 0; SET client_encoding = 'UTF8'; SET standard_conforming_strings = off; SET check_function_bodies = false; SET client_min_messages = warning; SET escape_string_warning = off; SET search_path = public, pg_catalog; SET default_tablespace = ''; SET default_with_oids = false; -- -- Name: tb1_partition_range_yyyymmdd; Type: TABLE; Schema: public; Owner: gpadmin; Tablespace: -- CREATE TABLE tb1_partition_range_yyyymmdd ( id numeric, yyyymmdd date ) WITH (appendonly=true, compresslevel=5) DISTRIBUTED BY (id) PARTITION BY RANGE(yyyymmdd) ( PARTITION p20120811 START ('2012-08-11'::date) END ('2012-08-12'::date) WITH (tablename='tb1_partition_range_yyyymmdd_1_prt_p20120811', orientation=row , appendonly=true, compresslevel=5 ), PARTITION p20120812 START ('2012-08-12'::date) END ('2012-08-13'::date) WITH (tablename='tb1_partition_range_yyyymmdd_1_prt_p20120812', orientation=row , appendonly=true, compresslevel=5 ) ); ALTER TABLE public.tb1_partition_range_yyyymmdd OWNER TO gpadmin; -- -- Name: idx_yyyymmdd; Type: INDEX; Schema: public; Owner: gpadmin; Tablespace: -- CREATE INDEX idx_yyyymmdd ON tb1_partition_range_yyyymmdd USING btree (yyyymmdd); -- -- Greenplum Database database dump complete --
还有一种方法就是d table_name查看。这个只能查看字段信息啥的。
testdb=# d+ tb1_partition_range_yyyymmdd Append-Only Table "public.tb1_partition_range_yyyymmdd" Column | Type | Modifiers | Storage | Description ----------+---------+-----------+---------+------------- id | numeric | | main | yyyymmdd | date | | plain | Compression Type: zlib Compression Level: 5 Block Size: 32768 Checksum: t Indexes: "idx_yyyymmdd" btree (yyyymmdd) Child tables: tb1_partition_range_yyyymmdd_1_prt_p20120811, tb1_partition_range_yyyymmdd_1_prt_p20120812 Has OIDs: no Options: appendonly=true, compresslevel=5 Distributed by: (id) Partition by: (yyyymmdd) testdb=#
总体来说还是不太方便,下面创建一个自定义的函数进行查看:
1. 创建语言
CREATE PROCEDURAL LANGUAGE plpythonu;
2. 创建函数(代码如下):
vim get_table_structure.sql

create or replace function get_table_structure(tablename text) returns text as $$ try: table_name = tablename.lower().split('.')[1] talbe_schema=tablename.lower().split('.')[0] except (IndexError): return 'Please in put "tableschema.table_name"' get_table_oid="select oid,reloptions,relkind from pg_class where oid='%s'::regclass"%(tablename) try: rv_oid=plpy.execute(get_table_oid,5) if not rv_oid: return 'Did not find any relation named"'+tablename +'".' except (Error): return 'Did not find any relation named"'+tablename +'".' table_oid=rv_oid[0]['oid'] rv_reloptions=rv_oid[0]['reloptions'] rv_relkind=rv_oid[0]['relkind'] create_sql=""; table_kind='table'; if rv_relkind !='r' and rv_relkind !='v': plpy.error('%s is not table or view'%(tablename)); elif rv_relkind=='v': get_view_def="select pg_get_viewdef(%s,'t') as viewdef;" % (table_oid) rv_viewdef=plpy.execute(get_view_def); create_sql='create view %s as ' % (tablename) create_sql += rv_viewdef[0]['viewdef']+' '; table_kind='view' else: get_columns="select a.attname,pg_catalog.format_type(a.atttypid,a.atttypmod), (select substring(pg_catalog.pg_get_expr(d.adbin,d.adrelid) for 128) from pg_catalog.pg_attrdef d where d.adrelid=a.attrelid and d.adnum=a.attnum and a.atthasdef) as default,a.attnotnull as isnull from pg_catalog.pg_attribute a where a.attrelid= %s and a.attnum >0 and not a.attisdropped order by a.attnum;" % (table_oid); rv_columns=plpy.execute(get_columns) get_table_distribution1="select attrnums from pg_catalog.gp_distribution_policy t where localoid = '" + table_oid + "' " rv_distribution1=plpy.execute(get_table_distribution1,500) rv_distribution2='' if rv_distribution1 and rv_distribution1[0]['attrnums']: get_table_distribution2="select attname from pg_attribute where attrelid='"+table_oid+"' and attnum in (" + str(rv_distribution1[0]['attrnums']).strip('{').strip('}').strip('[').strip(']')+")" rv_distribution2=plpy.execute(get_table_distribution2,500) create_sql='create table %s ( ' % (tablename) get_index="select pg_get_indexdef(indexrelid) as indexdef from pg_index where indrelid=%s" % (table_oid); rv_index=plpy.execute(get_index); get_parinfo1="select attname as columnname from pg_attribute where attnum =(select paratts[0] from pg_partition where parrelid=%s) and attrelid=%s;"%(table_oid,table_oid); get_parinfo2=""" select pp.parrelid,prl.parchildrelid,case when pp.parkind='h'::"char" then 'hash'::text when pp.parkind='r'::"char" then 'range'::text when pp.parkind='l'::"char" then 'list'::text else null::text end as partitiontype,pg_get_partition_rule_def(prl.oid,true) as partitionboundary from pg_partition pp,pg_partition_rule prl where pp.paristemplate=false and pp.parrelid = %s and prl.paroid = pp.oid order by prl.parname; """ % (table_oid) v_par_parent=plpy.execute(get_parinfo1); v_par_info=plpy.execute(get_parinfo2); max_column_len=10 max_type_len=4 max_modifiers_len=4 max_default_len=4 for i in rv_columns: if i['attname']: if max_column_len < i['attname'].__len__(): max_column_len=i['attname'].__len__() if i['format_type']: if max_type_len < i['format_type'].__len__(): max_type_len=i['format_type'].__len__() if i['default']: if max_type_len < i['default'].__len__(): max_default_len=i['default'].__len__() first=True for i in rv_columns: if first==True: split_char=' '; first=False else: split_char=','; if i['attname']: create_sql += " " + split_char + i['attname'].ljust(max_column_len+6)+'' else: create_sql += "" + split_char + ' '.ljust(max_column_len+6) if i['format_type']: create_sql += ' ' + i['format_type'].ljust(max_type_len +2) else: create_sql += ' ' + ' '.ljust(max_type_len+2) if i['isnull'] and i['isnull']: create_sql += ' ' + ' not null '.ljust(8) if i['default']: create_sql += ' default ' + i['default'].ljust(max_default_len+6) create_sql += " " create_sql += ")" if rv_reloptions: create_sql +=" with ("+str(rv_reloptions).strip('{').strip('}').strip('[').strip(']') +") " create_sql = create_sql.replace("'",'') if rv_distribution2: create_sql += 'Distributed by (' for i in rv_distribution2: create_sql += i['attname'] + ',' create_sql =create_sql.strip(',')+')' elif rv_distribution1: create_sql += 'Distributed randomly ' if v_par_parent: partitiontype=v_par_info[0]['partitiontype']; create_sql +=' PARTITION BY '+ partitiontype + "("+v_par_parent[0]['columnname']+") ( "; for i in v_par_info: create_sql +=" " +i['partitionboundary']+', '; create_sql=create_sql.strip(', '); create_sql+=" )" create_sql+="; " for i in rv_index: create_sql += i['indexdef']+'; ' get_table_comment="select 'comment on %s %s is '''|| COALESCE (description,'')|| '''' as comment from pg_description where objoid=%s and objsubid=0;" % (table_kind,tablename,table_oid) get_column_comment="select 'comment on column %s.'||b.attname ||' is ''' || COALESCE(a.description,'')|| ''' ' as comment from pg_catalog.pg_description a,pg_catalog.pg_attribute b where objoid=%s and a.objoid=b.attrelid and a.objsubid=b.attnum;" % (tablename,table_oid) rv_table_comment=plpy.execute(get_table_comment); rv_column_comment=plpy.execute(get_column_comment); for i in rv_table_comment: create_sql += i['comment']+'; ' for i in rv_column_comment: create_sql +=i['comment']+'; ' return create_sql; $$ LANGUAGE plpythonu;
3. 进行测试
testdb=# SELECT get_table_structure('public.tb1_partition_range_yyyymmdd'); get_table_structure ------------------------------------------------------------------------------------------------------------------- create table public.tb1_partition_range_yyyymmdd ( id numeric ,yyyymmdd date ) with (appendonly=true, compresslevel=5) Distributed by (id) PARTITION BY range(yyyymmdd) ( PARTITION p20120811 START ('2012-08-11'::date) END ('2012-08-12'::date) WITH (appendonly=true, compresslevel=5), PARTITION p20120812 START ('2012-08-12'::date) END ('2012-08-13'::date) WITH (appendonly=true, compresslevel=5) ); CREATE INDEX idx_yyyymmdd ON tb1_partition_range_yyyymmdd USING btree (yyyymmdd); (1 row)
参考资料:
《greenplum 企业应用实战》