xsl transformation encoding problem with unix

Engin Ertilav 12 January 2005 18:52:16
Hi,

We are using the code below to perform a transform (it is adapted from the
sample code UseStylesheetPI).

It simply gets the stylesheet associated with the XML file and performs the
transform. In addition, because we are working with Turkish files, it sets
the charset encoding for the input and output XML files.

This works fine on Windows 2000, but on Solaris Unix it ignores the new
charset and the result XML file contains character references like &#287;
and &#350; instead of the characters ğ and Ş.

Is there a solution for that? Or is it a bug? The Xalan version is 2.6.0.

Thanks in advance.



    public static void main(String[] args)
            throws TransformerException, TransformerConfigurationException
    {
        String media = null, title = null, charset = null;
        try
        {
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Source stylesheet = tFactory.getAssociatedStylesheet(
                    new StreamSource("x.xml"), media, title, charset);

            Transformer transformer = tFactory.newTransformer(stylesheet);

            // create input stream with special encoding
            FileInputStream fi = new FileInputStream("x.xml");
            InputStreamReader i = new InputStreamReader(fi, "ISO8859_9");
            StreamSource so = new StreamSource(i);

            // create output stream with special encoding
            FileOutputStream f = new FileOutputStream("xout.xml");
            OutputStreamWriter o = new OutputStreamWriter(f, "ISO8859_9");
            StreamResult s = new StreamResult(o);

            transformer.transform(so, s);

            fi.close();
            i.close();
            o.close();
            f.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }



This is my XSL header:

    <?xml version="1.0" encoding="ISO-8859-9"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xalan="http://xml.apache.org/xslt">
    <xsl:output method="xml" indent="yes" encoding="ISO-8859-9"
        xalan:indent-amount="3"/>

And this is my XML header:

    <?xml version="1.0" encoding="ISO-8859-9"?>
    <?xml-stylesheet type="text/xsl" href="myxsl.xsl"?>



Brian Minchau 13 January 2005 08:30:52
Engin,

I reproduced your problem, but with some differences. Your call to
    Source stylesheet = tFactory.getAssociatedStylesheet(
            new StreamSource("x.xml"), media, title, charset);
gave me a null, so I changed your code to this:

    package jan12;

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;

    import javax.xml.transform.Source;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerConfigurationException;
    import javax.xml.transform.TransformerException;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class Jan12 {
        public static void main(String[] args)
                throws TransformerException, TransformerConfigurationException {

            String media = null, title = null, charset = null;

            try {
                TransformerFactory tFactory = TransformerFactory.newInstance();
                StreamSource ss = new StreamSource("jan12/x.xsl");

                final Transformer transformer;
                transformer = tFactory.newTransformer(ss);

                // create input stream with special encoding
                FileInputStream fi = new FileInputStream("jan12/x.xml");
                InputStreamReader i = new InputStreamReader(fi, "ISO8859_9");
                StreamSource so = new StreamSource(i);

                // create output stream with special encoding
                FileOutputStream f = new FileOutputStream("xout.xml");
                OutputStreamWriter o = new OutputStreamWriter(f, "ISO8859_9");
                StreamResult s = new StreamResult(o);

                transformer.transform(so, s);

                fi.close();
                i.close();
                o.close();
                f.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }


The input x.xml was irrelevant, because I used this stylesheet for x.xsl:

    <?xml version="1.0" encoding="ISO-8859-9"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" encoding="ISO-8859-9" />

    <xsl:template match="/">
    <out>char 287:&#287; char 350:&#350; Dotted capital I char 304 &#304;</out>
    </xsl:template>

    </xsl:stylesheet>


The behavior differs depending on whether the Java class
sun.io.CharToByteConverter is available or not.

I suspect that when you run on Windows the class is there, but on your
UNIX system the JRE is different and the class is not available. You can
add this to your Java code:
    Class clazz = Class.forName("sun.io.CharToByteConverter");
and test whether it throws a ClassNotFoundException in one environment but
not the other (Class.forName does not return null for a missing class). I
suspect that when this class is available you get the correct output.
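
A minimal sketch of that availability check, assuming nothing beyond the standard class loader. Class.forName reports a missing class by throwing ClassNotFoundException rather than returning null, so the check is written as a try/catch. The class name ConverterCheck and the helper isClassAvailable are made up for illustration.

```java
public class ConverterCheck {

    // Hypothetical helper: returns true if the named class can be loaded
    // by this JRE, false if loading fails.
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing (or cannot be linked) in this runtime.
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "sun.io.CharToByteConverter";
        System.out.println(name + " available: " + isClassAvailable(name));
    }
}
```

Running this on both the Windows and the Solaris machine would show whether the two JREs differ on this point.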

When this class is not available, it looks like it exposes a configuration
error in Xalan in its Encodings.properties file in the
org.apache.xml.serializer package. It has information for the Turkish
characters in lines like this:
    ISO8859_9 ISO-8859-9 0x00FF
    ISO8859-9 ISO-8859-9 0x00FF
The third word on the line, 0x00FF, indicates the code point of the highest
value used in the character set. In base 10 this value is 255. But these
Turkish characters are 287, 350 and 304, which are bigger than 255. When
writing the characters to the output file, the serializer thinks the
Unicode characters are out of range because they are larger than the
supposed maximum code point value. So the serializer converts them to
numeric character references, e.g. the six characters &#304; rather than
the single Unicode character with a code point of 304.
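
The effect of such a maximum code point can be illustrated with a small sketch. This is not Xalan's serializer code; it is a simplified stand-in that assumes the only rule is "anything above the configured maximum becomes a numeric character reference". The class name MaxCodePointSketch and the method serialize are hypothetical.

```java
public class MaxCodePointSketch {

    // Hypothetical helper: writes each character as-is if its code point is
    // at most maxCodePoint, otherwise as a decimal character reference.
    public static String serialize(String text, int maxCodePoint) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > maxCodePoint) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+011F (287), U+015E (350), U+0130 (304) followed by plain ASCII.
        String turkish = "\u011F\u015E\u0130 abc";
        // With a maximum of 0x00FF every Turkish character above is escaped.
        System.out.println(serialize(turkish, 0x00FF)); // prints "&#287;&#350;&#304; abc"
    }
}
```

With a limit of 0x00FF the three Turkish characters are escaped even though ISO-8859-9 can represent them, which matches the output seen in the result file.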

At this point I'm not sure what the correct maximum code point value is for
this character set, but I think that getting the value right might fix your
problem.

Please open a defect in JIRA ( http://issues.apache.org/jira/ ) against
XalanJ2.





----------
Brian Minchau
XSLT Development, IBM Toronto
e-mail: minchau@ca.ibm.com


Engin Ertilav 13 January 2005 10:40:47
Hi,

My problem is solved. I found out that my input XML contains Turkish characters with these codes:

Ş : 222
İ : 221 ...

All my Turkish characters are in the range 0-255, so I tried giving the encoding ISO8859_1 for both the input and output streams. (It also works without specifying them, because that is the default for streams on Unix...) Now my file is correct.

I also checked sun.io.CharToByteConverter, and it is available.

In my opinion, when I used ISO8859_9 it converted my Turkish characters to the Unicode code points that ISO8859_9 maps them to (350, 304...).
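
The difference between the two encodings can be checked directly by decoding the same byte values with each charset. The sketch below assumes the JRE ships the ISO-8859-9 charset (it is not one of the charsets every Java platform must support, though common JDKs include it). The class name TurkishBytes and the helper decodeSingleByte are made up for illustration.

```java
import java.nio.charset.Charset;

public class TurkishBytes {

    // Hypothetical helper: decodes one byte with the named charset and
    // returns the Unicode code point of the resulting character.
    public static int decodeSingleByte(int byteValue, String charsetName) {
        byte[] raw = { (byte) byteValue };
        return new String(raw, Charset.forName(charsetName)).codePointAt(0);
    }

    public static void main(String[] args) {
        int[] bytes = { 222, 221 };
        for (int b : bytes) {
            System.out.println("byte " + b
                    + " -> ISO-8859-9: " + decodeSingleByte(b, "ISO-8859-9")
                    + ", ISO-8859-1: " + decodeSingleByte(b, "ISO-8859-1"));
        }
        // prints:
        // byte 222 -> ISO-8859-9: 350, ISO-8859-1: 222
        // byte 221 -> ISO-8859-9: 304, ISO-8859-1: 221
    }
}
```

Reading with ISO-8859-1 leaves every byte value unchanged as a code point in the range 0-255, so the serializer never sees a value above its 0x00FF limit. The bytes then pass through untouched, but inside the transform the characters are no longer the real Turkish letters.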

It is interesting that when I perform the transform with this piece of code:

    transformer.transform(new StreamSource("xin.xml"),
            new StreamResult(new java.io.FileOutputStream("xout.xml")));

it does not work, and it again produces the codes 350 and 304 for some characters. Maybe it uses the stylesheet encoding.
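
When the result is a byte stream, as in the call above, the JAXP serializer picks the output encoding itself, and that encoding can also be set from Java with the standard OutputKeys.ENCODING property. The sketch below is an illustration only, not the code from this thread: it uses an identity transform, in-memory streams, and made-up names (OutputEncodingSketch, transformTo, rootText). Whether a given serializer writes the Turkish letters as raw ISO-8859-9 bytes or as character references depends on the implementation, so the sketch only checks that the parsed result is the same either way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OutputEncodingSketch {

    // Hypothetical helper: identity transform into a byte stream, asking the
    // serializer to use the given output encoding.
    public static byte[] transformTo(String xml, String encoding) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, encoding);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(bytes));
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: parses the bytes back and returns the text of
    // the root element, with character references resolved by the parser.
    public static String rootText(byte[] xmlBytes) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u015E\u0130"; // U+015E and U+0130
        byte[] out = transformTo("<out>" + text + "</out>", "ISO-8859-9");
        System.out.println("round trip ok: " + rootText(out).equals(text));
    }
}
```

A file containing &#350; and a file containing the raw byte 222 under an ISO-8859-9 declaration are equivalent to an XML parser, so the references are only a problem if something downstream reads the file as plain text.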

Thanks for help and quick response.

Regards.

Engin ERTİLAV

