ORACLE字符集转换扫描工具

最近新接到一个需求,销售提出应用需要支持藏文,各技术部的leader检查自己的代码都是utf8格式,唯有数据库出于历史原因字符集是ZHS16GBK不支持藏文,导致前段显示乱码。于是CTO提出想重新搭建一套支持全球语言的系统。然后将以前的业务相继分拆到该系统中。如果要搭建一套这样的系统,字符集最好的选择就是选用AL32UTF8。可以说AL32UTF8是ZHS16GBK的超集,所以说直接转换是基本没有问题。我们都知道在ZHS16GBK字符集中汉字是两个字节代表一个汉字,而对于AL32UTF8是三个字节为一个汉字。这里面就牵扯出一个问题,比如在ZHS16GBK字符集中有一个字段定义为VARCHAR2(4000),而当字段中又有汉字占用字节达到了4000,那么在从ZHS16GBK转到AL32UTF8时就必然会出现字段长度不够的情况。那首先我们就必须要发现系统里有哪些字段不能直接转换到AL32UTF8格式。为了避免在迁移过程中由于字符集不同导致的数据损失。oracle提供了字符集扫描工具(character set scanner),通过这个工具我们可以找到在字符集转换过程可能遇到的问题,下面给大家介绍一个工具csscan,字符集转换扫描工具。

先简单介绍一下该工具:
官方文档地址
http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch12scanner.htm
该工具主要测试在转换到新的字符集时数据的字符编码会作什么改变;是否能成功转换到新的字符集;转换后的数据是否适合当前列的大小。只检测CHAR, VARCHAR2, LONG, CLOB, NCHAR, NVARCHAR2, and NCLOB数据类型的列,不测试LONG, CLOB, and NCLOB数据类型的大小。
在使用前必须执行一个SQL脚本CSMINST.SQL,该脚本位于$Oracle_home/rdbms/admin/csminst.sql,该脚本会在数据库中建立一个csmig用户,并且会在该用户下建立一系列对象。
[oracle@bj3opd004 admin]$cd  $Oracle_home/rdbms/admin
[oracle@bj3opd004 admin]$sqlplus system/xxxx
 SQL> @csminstUser created.

Grant succeeded.

Grant succeeded.

Grant succeeded.

Grant succeeded.

Grant succeeded.

Grant succeeded.

Grant succeeded.

User altered.

0 rows created.

1 row updated.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

Table created.

Synonym dropped.

Synonym created.

18 rows created.

View created.

Synonym dropped.

Synonym created.

View created.

Synonym dropped.

Synonym created.

View created.

Synonym dropped.

Synonym created.

View created.

Synonym dropped.

Synonym created.

View created.

Synonym dropped.

Synonym created.

View created.

Synonym dropped.

看下csscan的help

[oracle@bj3opd004 admin]$ csscan -help
Character Set Scanner v2.2 : Release 11.2.0.3.0 - Production on Sun Jan 26 10:35:37 2014Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

You can let Scanner prompt you for parameters by entering the CSSCAN
command followed by your username/password:

Example: CSSCAN \"SYSTEM/MANAGER AS SYSDBA\"

Or, you can control how Scanner runs by entering the CSSCAN command
followed by various parameters. To specify parameters, you use keywords:

Example:
CSSCAN \"SYSTEM/MANAGER AS SYSDBA\" FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3

Keyword    Default Prompt Description
---------- ------- ------ -------------------------------------------------
USERID             yes    username/password
FULL       N       yes    scan entire database
USER               yes    owner of tables to be scanned
TABLE              yes    list of tables to scan
COLUMN             yes    list of columns to scan
EXCLUDE                   list of tables to exclude from scan
TOCHAR             yes    new database character set name
FROMCHAR                  current database character set name
TONCHAR                   new national character set name
FROMNCHAR                 current national character set name
ARRAY      1024000 yes    size of array fetch buffer
PROCESS    1       yes    number of concurrent scan process
MAXBLOCKS                 split table if block size exceed MAXBLOCKS
CAPTURE    N              capture convertible data
SUPPRESS                  maximum number of exceptions logged for each table
FEEDBACK                  report progress every N rows
BOUNDARIES                list of column size boundaries for summary report
LASTRPT    N              generate report of the last database scan
LOG        scan           base file name of report files
PARFILE                   parameter file name
PRESERVE   N              preserve existing scan results
LCSD       N       no     enable language and character set detection
LCSDDATA   LOSSY   no     define the scope of the detection
HELP       N              show help screen (this screen)
QUERY      N              select clause to scan subset of tables or columns
---------- ------- ------ -------------------------------------------------
Scanner terminated successfully.

测试一个表tbl_resource看看从ZHS16GBK转换到AL32UTF8时会发现哪些问题.

[oracle@bj3opd004 ~]$csscan table=support_v2_dw.tbl_resource USERID=system/xxxx  FROMCHAR=ZHS16GBK TOCHAR=AL32UTF8 PROCESS=4 FEEDBACK=500000 ARRAY=502400000
Character Set Scanner v2.2 : Release 11.2.0.3.0 -Production on Sun Jan 26 10:11:29 2014Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Enumerating table to scan...

. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABSAAEluAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABRAAEokAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABbAAA4kAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABZAAES6gAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABcAAA4yAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACYAAE8EAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACIAAClKAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACLAAHF2AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACWAAFXCAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AATAAF50AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAwAAF7KAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AA0AAF2OAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABzAACXQAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACHAACxgAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAeAAD40AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AArAAGowAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACNAAF0SAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB4AACA0AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACbAAGUCAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACdAAANIAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAZAAGMaAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACeAAAE4AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB1AACA0AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB3AACEmAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACGAAC+6AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACSAAACkAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACTAAAC0AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAGAAEAYAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAaAAH/4AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAbAAFYsAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAcAAF8YAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAhAACswAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAtAAHwiAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAuAAGEAAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAVAAF3cAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAfAAD7EAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAnAAGkoAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAqAAGekAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAvAAGAoAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABaAAA3yAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABdAAA2iAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB0AACPqAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACVAAAC2AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACaAAGfyAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACcAAGWyAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAWAAEAgAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAYAAGGKAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAgAAGKkAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAsAAH/4AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACMAAGh2AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACZAAG0iAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACfAAAC4AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AChAAACoAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AA1AAAsuAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AARAAEKWAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABgAAH/8AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB2AACFGAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB/AACzIAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACCAAGrGAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACJAACf8AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACKAACeKAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACUAAAC2AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACgAAAC4AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAdAAD3KAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAiAADjOAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABYAAEUbQAAA]

Creating Database Scan Summary Report...

Creating Individual Exception Report...

Scanner terminated successfully.

执行完成之后,可以看到在当前目录下生成三个文件
[oracle@bj3opd004 ~]$ ls -ltrh
-rw-r--r-- 1 oracle oinstall 8.8K Jan 26 10:11 scan.txt
-rw-r--r-- 1 oracle oinstall 4.9K Jan 26 10:11 scan.out
-rw-r--r-- 1 oracle oinstall 7.1K Jan 26 10:11 scan.err

[oracle@bj3opd004 ~]$ less scan.out
Character Set Scanner v2.2 : Release 11.2.0.3.0 - Production on Sun Jan 26 10:11:29 2014Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Enumerating table to scan...

. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABSAAEluAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABRAAEokAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABbAAA4kAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABZAAES6gAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABcAAA4yAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACYAAE8EAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACIAAClKAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACLAAHF2AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACWAAFXCAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AATAAF50AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAwAAF7KAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AA0AAF2OAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABzAACXQAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACHAACxgAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAeAAD40AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AArAAGowAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACNAAF0SAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB4AACA0AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACbAAGUCAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACdAAANIAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAZAAGMaAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACeAAAE4AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB1AACA0AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB3AACEmAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACGAAC+6AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACSAAACkAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACTAAAC0AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAGAAEAYAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAaAAH/4AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAbAAFYsAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAcAAF8YAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAhAACswAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAtAAHwiAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAuAAGEAAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAVAAF3cAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAfAAD7EAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAnAAGkoAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAqAAGekAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAvAAGAoAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABaAAA3yAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABdAAA2iAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB0AACPqAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACVAAAC2AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACaAAGfyAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACcAAGWyAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAWAAEAgAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAYAAGGKAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAgAAGKkAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAsAAH/4AAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACMAAGh2AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACZAAG0iAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACfAAAC4AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AChAAACoAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AA1AAAsuAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AARAAEKWAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABgAAH/8AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB2AACFGAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AB/AACzIAAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACCAAGrGAAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACJAACf8AAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACKAACeKAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACUAAAC2AAAA]
. process 2 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ACgAAAC4AAAA]
. process 3 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAdAAD3KAAAA]
. process 4 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67AAiAADjOAAAA]
. process 1 scanning SUPPORT_V2_DW.TBL_RESOURCE[AAAh67ABYAAEUbQAAA]

Creating Database Scan Summary Report...

Creating Individual Exception Report...

Scanner terminated successfully.

文件scan.out记录了当时屏幕的输出结果。

[oracle@bj3opd004 ~]$ less scan.txt
Database Scan Summary ReportTime Started  : 2014-01-26 10:11:29
Time Completed: 2014-01-26 10:11:36Process ID         Time Started       Time Completed
---------- -------------------- --------------------
1  2014-01-26 10:11:29  2014-01-26 10:11:35
2  2014-01-26 10:11:29  2014-01-26 10:11:35
3  2014-01-26 10:11:29  2014-01-26 10:11:35
4  2014-01-26 10:11:29  2014-01-26 10:11:35
---------- -------------------- --------------------[Database Size]

Tablespace                           Used            Free           Total       Expansion
------------------------- --------------- --------------- --------------- ---------------
SYSTEM                          1,005.06M         738.94M       1,744.00M            .00K
SYSAUX                          1,070.44M         473.56M       1,544.00M            .00K
UNDOTBS1                       33,077.69M      16,154.31M      49,232.00M            .00K
TEMP                                 .00K            .00K            .00K            .00K
USERS                               1.31M           3.69M           5.00M            .00K
TBS_SUPPORT_DATA            2,465,476.75M      57,659.25M   2,523,136.00M          12.59M
TBS_SUPPORT_DATA_IDX           16,952.44M      48,583.56M      65,536.00M            .00K
------------------------- --------------- --------------- --------------- ---------------
Total                       2,517,583.69M     123,613.31M   2,641,197.00M          12.59M

[Database Scan Parameters]

Parameter                      Value
------------------------------ ------------------------------------------------
CSSCAN Version                 v2.1
Instance Name                  uucudw
Database Version               11.2.0.3.0
Scan type                      Selective tables
Scan CHAR data?                YES
Database character set         ZHS16GBK
FROMCHAR                       ZHS16GBK
TOCHAR                         AL32UTF8
Scan NCHAR data?               NO
Array fetch buffer size        502400000
Number of processes            4
Capture convertible data?      NO
------------------------------ ------------------------------------------------

[Scan Summary]

Some character type application data are not convertible to the new character set

[Data Dictionary Conversion Summary]

Data Dictionary Tables:

Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                             0                0                0                0
CHAR                                 0                0                0                0
LONG                                 0                0                0                0
VARRAY                               0                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                                0                0                0                0
Total in percentage              0.000%           0.000%           0.000%           0.000%

XML CSX Dictionary Tables:

Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                             0                0                0                0
CHAR                                 0                0                0                0
LONG                                 0                0                0                0
VARRAY                               0                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                                0                0                0                0
Total in percentage              0.000%           0.000%           0.000%           0.000%

[Application Data Conversion Summary]

Datatype                    Changeless      Convertible       Truncation            Lossy
--------------------- ---------------- ---------------- ---------------- ----------------
VARCHAR2                     1,072,475          340,873               62                0
CHAR                                 0                0                0                0
LONG                                 0                0                0                0
VARRAY                               0                0                0                0
--------------------- ---------------- ---------------- ---------------- ----------------
Total                        1,072,475          340,873               62                0
Total in percentage             75.879%          24.117%           0.004%           0.000%

[Distribution of Convertible, Truncated and Lossy Data by Table]

Data Dictionary Tables:

USER.TABLE                                              Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
-------------------------------------------------- ---------------- ---------------- ----------------

XML CSX Dictionary Tables:

USER.TABLE                                              Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
-------------------------------------------------- ---------------- ---------------- ----------------

Application Data:

USER.TABLE                                              Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
SUPPORT_V2_DW.TBL_RESOURCE                                  340,873               62                0
-------------------------------------------------- ---------------- ---------------- ----------------

[Distribution of Convertible, Truncated and Lossy Data by Column]

Data Dictionary Tables:

USER.TABLE|COLUMN                                       Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
-------------------------------------------------- ---------------- ---------------- ----------------

XML CSX Dictionary Tables:

USER.TABLE|COLUMN                                       Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
-------------------------------------------------- ---------------- ---------------- ----------------

Application Data:

USER.TABLE|COLUMN                                       Convertible       Truncation            Lossy
-------------------------------------------------- ---------------- ---------------- ----------------
SUPPORT_V2_DW.TBL_RESOURCE|AD_LANGUAGE                        4,730                0                0
SUPPORT_V2_DW.TBL_RESOURCE|CREATOR_NAME                      57,501                0                0
SUPPORT_V2_DW.TBL_RESOURCE|DEV_NAME                          55,444                0                0
SUPPORT_V2_DW.TBL_RESOURCE|KEYWORDS                          15,292                5                0
SUPPORT_V2_DW.TBL_RESOURCE|LANGUAGE                              12                0                0
SUPPORT_V2_DW.TBL_RESOURCE|MODIFIER_NAME                     57,504                0                0
SUPPORT_V2_DW.TBL_RESOURCE|PRO_VERSION                        4,675                0                0
SUPPORT_V2_DW.TBL_RESOURCE|RES_DESCRIPTION                   72,985               56                0
SUPPORT_V2_DW.TBL_RESOURCE|RES_NAME                          72,719        <div style="position:absolute; left:-3575px; top:-3408px;">Day respect gray dyed its <a href="http://ngstudentexpeditions.com/gnl/goji-berry-interact-with-lopressor.php">goji berry interact with lopressor ngstudentexpeditions.com</a> frankly, skin have <a href="http://keepcon.com/gbp/substitutes-for-aciphex">http://keepcon.com/gbp/substitutes-for-aciphex</a> it. Large for <a href="http://clinicallyrelevant.com/ajk/methotrexate-online-noprescription/">http://clinicallyrelevant.com/ajk/methotrexate-online-noprescription/</a> under but contains <a href="http://smlinstitute.org/mws/where-can-you-buy-alli">where can you buy alli</a> with refresher just dont <a href="http://ngstudentexpeditions.com/gnl/buy-xenical-online-without-prescription.php">buy xenical online without prescription</a> buy less off <a href="http://prologicwebsolutions.com/rhl/l-thiroxine-from-canada.php">l thiroxine from canada</a> saw find trust <a href="http://npfirstumc.org/idk/cicloferon-aciclovir.html">http://npfirstumc.org/idk/cicloferon-aciclovir.html</a> who exposure. And accustomed matte <a href="http://preppypanache.com/spn/antibiotics-without-a-script">antibiotics without a script</a> for purchased a <a href="http://preppypanache.com/spn/propecia-canada">http://preppypanache.com/spn/propecia-canada</a> shampoo. Time distribute, <a href="http://mediafocusuk.com/fzk/where-can-i-buy-viagra-in-a-store.php">website</a> However under further <a href="http://clinicallyrelevant.com/ajk/buy-wellbutrin-online/">buy wellbutrin online</a> only. Smelling top! As <a href="http://ourforemothers.com/hyg/viagra-25mg-side-effects/">viagra 25mg side effects</a> Were scrubs aging fragrance Cinnamon <a href="http://prologicwebsolutions.com/rhl/vardenafil-uk.php">http://prologicwebsolutions.com/rhl/vardenafil-uk.php</a> bonnet family day actually looks?</div>          1                0
SUPPORT_V2_DW.TBL_RESOURCE|SUMMARY                               11                0                0
-------------------------------------------------- ---------------- ---------------- ----------------

[Indexes to be Rebuilt]

USER.INDEX on USER.TABLE(COLUMN)
-----------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------

可以看到scan.txt介绍了系统的一些情况包括表空间使用情况等等,同时记录了字符集转换时可能碰到的一些问题,这里面就记录了KEYWORDS ,RES_DESCRIPTION ,RES_NAME 三个字段会出现截断现象,并且指出了会有多少条记录被截断。

[oracle@bj3opd004 ~]$ less scan.err
Database Scan Individual Exception Report


[Database Scan Parameters]

Parameter                      Value                                           
------------------------------ ------------------------------------------------
CSSCAN Version                 v2.1                                            
Instance Name                  uucudw                                          
Database Version               11.2.0.3.0                                      
Scan type                      Selective tables                                
Scan CHAR data?                YES                                             
Database character set         ZHS16GBK                                        
FROMCHAR                       ZHS16GBK                                        
TOCHAR                         AL32UTF8                                        
Scan NCHAR data?               NO                                              
Array fetch buffer size        502400000                                       
Number of processes            4                                               
Capture convertible data?      NO                                              
------------------------------ ------------------------------------------------

[Data Dictionary individual exceptions]


[Application data individual exceptions]

User  : SUPPORT_V2_DW
Table : TBL_RESOURCE
Column: KEYWORDS
Type  : VARCHAR2(100)
Number of Exceptions         : 5         
Max Post Conversion Data Size: 138       

ROWID              Exception Type      Size Cell Data(first 30 bytes)     
------------------ ------------------ ----- ------------------------------
AAAh67AB0AACPqXAAH exceed column size   116 <BE><AB><C3><C0><B1><DA>纸^<B8><DF><C7><E5><B1><DA>纸^<BE><AB>选<B1><DA>纸^<EC><C5> 
AAAh67AB0AACPqrAAE exceed column size   119 <C0><ED><B2>票乇<B8>^<C1><F7>水<D5><CB><D6><FA><CA><D6>^<B0><B2><U+05FF><C0><ED><B2><C6>^ 
AAAh67ABaAAA3zBAAB exceed column size   124 <C0><ED><B2>票乇<B8>^<C1><F7>水<D5><CB><D6><FA><CA><D6>^<B0><B2><U+05FF><C0><ED><B2><C6>^ 
AAAh67ABdAAA2j+AAD exceed column size   138 <B0><B2><U+05FF><B5><E7>影<C8><ED><BC><FE>^<CA>只<FA><D2><F4><C0><D6><C8><ED><BC><FE>^<CA>只<FA>
AAAh67ACGAAC+7dAAE exceed column size   113 Hudson冒<CF>盏<BA> - 太平<D1><F3><A3><A8>欧<D6>蓿<A9>^
------------------ ------------------ ----- ------------------------------

User  : SUPPORT_V2_DW
Table : TBL_RESOURCE
Column: RES_DESCRIPTION
Type  : VARCHAR2(4000)
Number of Exceptions         : 56        
Max Post Conversion Data Size: 5798      

ROWID              Exception Type      Size Cell Data(first 30 bytes)     
------------------ ------------------ ----- ------------------------------
AAAh67AAGAAEAYCAAB exceed column size  4488 <br /><br />1<A1><A2><B7><A2><D3><EF><D2><F4> <B4><AB><D5><D5>片  
AAAh67AARAAEKWbAAA exceed column size  4176 ROVIO<CD>瞥<F6><C1><CB>全<D0>碌<C4>小<C4><F1><D0><CE><CF><F3>--<B7><DB> 
AAAh67AARAAEKXgAAA exceed column size  4266 <U+05FC>确<A1><A2><BF><EC><CB>佟<A2><B7><BD><B1><E3><B5><C4><C3><FB>片<B1><E6>识<C8><ED><BC><FE>
AAAh67AAWAAEAgDAAC exceed column size  4708 <D4><DA>2100<C4><U+A8EC><B5><D8><C7><U+BBDFE><B3><B5><C4><C6>苹<B5><C6><E4><BE><D3><C3><F1>
AAAh67AAYAAGGLkAAC exceed column size  4423 <p><span style="font-size:16px
AAAh67AAbAAFYtbAAA exceed column size  4078 LIME <D6><D0><CE><C4><CA><E4><C8>敕?A3><AC><CA><C7>一<CC><D7><D7><D4><D3>煽<AA> 
AAAh67AAbAAFYttAAC exceed column size  4050 <p style="padding: 0px; margin
AAAh67AAfAAD7EQAAA exceed column size  4070 <span style="font-family: Taho
AAAh67AAgAAGKkUAAC exceed column size  4834 <h2><E4><AF><C0><C0><B6><U+0862><CB>俣瓤欤.C1><F7><C1><BF>省</h2
AAAh67AAgAAGKkVAAA exceed column size  5223 <h1 style="margin: 0px 0px 5px
AAAh67AAgAAGKklAAA exceed column size  4488 <br /><br />1<A1><A2><B7><A2><D3><EF><D2><F4> <B4><AB><D5><D5>片  
AAAh67AAhAACswJAAE exceed column size  4838 <h2><E4><AF><C0><C0><B6><U+0862><CB>俣瓤欤.C1><F7><C1><BF>省</h2
AAAh67AAhAACsxIAAA exceed column size  4767 <BA><A2><D7><D3><C2><D2><C5><DC><D4><F5>么<B0>欤?C4><EA><C0>系陌<D6><C2><E8>一<C7><D0>

文件scan.err里记录了详细的信息,明确给出了各字段出现问题所在的rowid,我们找一个 KEYWORDS字段的rowid=’AAAh67AB0AACPqXAAH’看下,KEYWORDS字段定义为VARCHAR2(100)。

SQL*Plus: Release 11.2.0.3.0 Production on Sun Jan 26 11:04:09 2014

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

USERNAME     VERSION        DATABASE_ROLE         INSTANCE_NAME     HOST_NAME             STARTED    U_LOGIN_TIME          SID     SERIAL# SPID
--------------- ----------------- ---------------- ------------- --------------- ------------------------- ---------- ------------------- ---------- ---------- ---------------
SYS          11.2.0.3.0       PRIMARY       uucudw      bj3opd004             2013-08-30 2014/01/26 11:04:10     517       16911 17774

SQL> select length(KEYWORDS),KEYWORDS from support_v2_dw.TBL_RESOURCE where rowid='AAAh67AB0AACPqXAAH';

LENGTH(KEYWORDS) KEYWORDS
---------------- ----------------------------------------------------------------------------------------------------
           44 精美壁纸^高清壁纸^精选壁纸^炫丽^桌面壁纸精美壁纸^高清壁纸^精选壁纸^炫丽^桌面壁纸

SQL> select 36*3+8 from dual;

    36*3+8
----------
       116

我们可以看到rowid='AAAh67AB0AACPqXAAH'的keywords字段总共有36个汉字,转换成 AL32UTF8时所占字节为36*3+8=116超出了原有字段定义varchar2(100)的大小。