Thank you for your response. The triple size will work in most cases.
However, we have noticed the following scenario -
I have a varchar(255) in Oracle. For a particular data item on Oracle,
I apply the Oracle function for length and get 86 character back. I
also apply the Oracle function for byte size and get back 238.
Inserting this same data into DB2 varchar(255) rejects. Inserting
into DB2 varchar(2000) works. DB2 length function returns 466 (this
should be bytes, correct ?). How could this be ?
Phil Sherman - 12 Apr 2006 19:14 GMT
I don't know how Oracle measures the length of a UTF-8 varchar field or
how it physically stores it but UDB stores UTF-8 data as 1-4 bytes for
each character. Characters in pictographic languages, especially
Japanese, Chinese, and other languages used in that part of the world,
are more often the longer three and four byte ones.
A varchar definition specifies the number of bytes, not the number of
characters to be stored. The data capacity of the column can't be
precisely determined because each character may occupy a different
number of bytes.
A single UTF-8 character occupies a maximum of four bytes. Specify a
varchar length four times the number of characters you expect to store
and you'll always have enough space. Unfortunately, this will NOT
prevent storing more characters than you want if each character is
shorter than the 4/character worst case.
Philip Sherman
> Thank you for your response. The triple size will work in most cases.
> However, we have noticed the following scenario -
[quoted text clipped - 6 lines]
> into DB2 varchar(2000) works. DB2 length function returns 466 (this
> should be bytes, correct ?). How could this be ?
the_bandit - 12 Apr 2006 19:28 GMT
Is there any way in which you can change the character semantics for
the field? In particular, I know in oracle you can define a fields data
type to either use byte or character length semantics:
A) varchar2(120 BYTE)
B) varchar2(120 CHAR)
Where A is 120 bytes and B is 120 characters?
Also, just out of curiosity if I were to attempt to load a UTF-16
character into a field in a UTF-8 database should I expect DB2 to use 8
bytes for this?
socrates73@yahoo.com - 12 Apr 2006 19:38 GMT
Is there any way in which you can change the character semantics for
the field? In particular, I know in oracle you can define a fields data
type to either use byte or character length semantics:
A) varchar2(120 BYTE)
B) varchar2(120 CHAR)
Where A is 120 bytes and B is 120 characters?
Also, just out of curiosity if I were to attempt to load a UTF-16
character into a field in a UTF-8 database should I expect DB2 to use 8
bytes for this?
Mark A - 12 Apr 2006 23:54 GMT
> Thank you for your response. The triple size will work in most cases.
> However, we have noticed the following scenario -
[quoted text clipped - 6 lines]
> into DB2 varchar(2000) works. DB2 length function returns 466 (this
> should be bytes, correct ?). How could this be ?
You obviously have too much time on your hands.
rAinDeEr - 13 Apr 2006 06:29 GMT
Hi..
I have a table which I use to store names and other
information(English). Now, the requirement is that the table needs to
store all local language like Japanese, Chinese, Hindhi etc.
What all things do I have to change in the Database.The databse is a
DB2 UTF-8 database.
Also, how can I test if the data is being Inserted and how to retreive
the data.
I dont have a clue. Can some one guide me..
Thanks in advance
RaInDeEr.
Serge Rielau - 13 Apr 2006 08:05 GMT
> Thank you for your response. The triple size will work in most cases.
> However, we have noticed the following scenario -
[quoted text clipped - 6 lines]
> into DB2 varchar(2000) works. DB2 length function returns 466 (this
> should be bytes, correct ?). How could this be ?
The following functions have been introduced in Viper
-------
* CHARACTER_LENGTH
* OCTET_LENGTH
* POSITION
* SUBSTRING
The modified existing functions include:
* LENGTH
* LOCATE
These functions process strings along character boundaries rather than
along byte or double-byte boundaries. Each function (except
OCTET_LENGTH) accepts an argument specifying the code unit, or string
length unit of the result:
* CODEUNITS16 specifies that the result is to be expressed in
16-bit UTF-16 code units
* CODEUNITS32 specifies that the result is to be expressed in
32-bit UTF-32 code units
* OCTETS specifies that the result is to be expressed in bytes
This argument is optional for the existing functions.
----------
More information is needed to figure out the issue on bind-in.
But the fact that the byte length is notes > 400 gives some hint.
Unicode has an n-m relationship between glyphs and code points.
It could be that in your DB2 App you end up with longer code points for
the same glyph.
Just a wild guess. You could compare the HEX() values between the two.
Cheers
Serge

Signature
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
mike_dba - 20 Apr 2006 17:04 GMT
After accepting the fact that for a table with a varchar(255) in
Oracle, I will need to define a DB2 table as varchar (3*255), I find
that I exceed the 32K page size (as someone pointed out earlier) . So
how come Oracle can have a table defined exactly the same as DB2 and
yet fit the data in a 32K page and DB2 cannot ?
mike_dba - 27 Apr 2006 16:17 GMT
This was something missed. By setting the DB2CODEPAGE=1208 variable on
the client side, The data loads correctly. Apparently, the client was
not picking up the locale from the OS. There was no need for me to
expand the table by three.
As an FYI - I have found that Oracle permits page chaining (ie Oracle
can have a row span multiple data pages).
mike_dba - 20 Apr 2006 17:04 GMT
After accepting the fact that for a table with a varchar(255) in
Oracle, I will need to define a DB2 table as varchar (3*255), I find
that I exceed the 32K page size (as someone pointed out earlier) . So
how come Oracle can have a table defined exactly the same as DB2 and
yet fit the data in a 32K page and DB2 cannot ?
mike_dba - 20 Apr 2006 17:08 GMT
Please disregard my last post. As I was misinformed. The Oracle table
has fewer columns.
mike_dba - 20 Apr 2006 17:08 GMT
Please disregard my last post. As I was misinformed. The Oracle table
has fewer columns.