Monday, 22 December 2014 12:45

non-latin chars in urls - parsing error in google


Postby yandos » Wed Sep 24, 2008 4:26 pm

Hi everyone,
I'm relatively new to Xmap, and I hope my question is not silly...
I use Joomla 1.5.7 and Xmap 1.2.
I submitted my XML sitemap to Google, and I got a message that it contains parsing errors.
I looked for an answer, and it might be because I have non-Latin UTF-8 letters in my URLs. I found a doc that said it's impossible to have those in XML sitemaps.

I use sh404sef to create those URLs...

Could this be the problem?

That doc said I should replace the Unicode chars with their escape codes, e.g. &blabla.

Is it possible to overcome this?
*edit: I tried to validate the sitemap with Google's XML sitemap validation and got the following error:
Low-level XML well-formedness and/or validity processing output

Error: Input error: Illegal UTF-8 start byte <0x95> at file offset 469
in unnamed entity at line 5 char 60

I tried to open the XML sitemap in my browser but couldn't; I got this message:
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.


--------------------------------------------------------------------------------

An invalid character was found in text content. Error processing resource 'http://www.mysite.com/index.php/component/opt...

<loc>http://www.mysite.com/index.php/&#215;&#166;&#215;

thanks again,
Yandos.
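
A note on the truncated <loc> line in the error above: 215 and 166 are the decimal values of the bytes 0xD7 and 0xA6, which together form the UTF-8 encoding of a single Hebrew letter (U+05E6). That suggests, though nothing in the thread confirms it, that the URL's UTF-8 bytes were written out as one numeric character reference per byte. An XML parser treats each reference as a separate code point, so it would not see the Hebrew letter. A minimal PHP sketch of that reading; refsToBytes is a hypothetical helper and not part of Xmap:

```php
<?php
// Hypothetical helper: read each numeric character reference as one raw byte.
function refsToBytes($refs)
{
    preg_match_all('/&#(\d+);/', $refs, $m);
    return implode('', array_map('chr', array_map('intval', $m[1])));
}

$refs  = '&#215;&#166;';   // first two references from the <loc> line
$bytes = refsToBytes($refs);

echo bin2hex($bytes), "\n";   // d7a6
// An empty pattern with the /u flag only matches if the subject is valid UTF-8.
echo preg_match('//u', $bytes) === 1 ? "valid UTF-8\n" : "not valid UTF-8\n";
// What a parser makes of the same references: two separate characters.
echo html_entity_decode($refs, ENT_QUOTES, 'UTF-8'), "\n";
```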




yandos
Fresh Boarder
Posts: 2
Joined: Wed Sep 24, 2008 4:16 pm

Re: non-latin chars in urls - parsing error in google

Postby guilleva » Thu Sep 25, 2008 1:51 pm

Hi, please try the following: in the file components/com_xmap/xmap.xml.php, replace
echo '<loc>', $this->escapeURL($link) ,'</loc>'."\n";


with
echo '<loc>', $link ,'</loc>'."\n";
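
For anyone who would rather keep some escaping than drop it entirely: the sitemaps.org protocol asks for URLs in <loc> to be both URL-escaped (percent-encoded) and entity-escaped. Below is a minimal sketch of that idea, not the Xmap implementation. sitemapLoc is a hypothetical helper, the sample URL is made up from the one in the first post, and the code assumes PHP 5.4+ (for ENT_XML1) and a raw link whose ampersands are not already written as &amp;.

```php
<?php
/**
 * Hypothetical helper, not part of Xmap: percent-encode every byte outside
 * printable ASCII, then escape the XML special characters.
 */
function sitemapLoc($url)
{
    // No /u flag: the pattern matches single bytes, so each byte of a
    // multi-byte UTF-8 character becomes its own %XX triplet.
    // Existing %XX sequences are left alone because '%' is printable ASCII.
    $encoded = preg_replace_callback(
        '/[^\x21-\x7E]/',
        function ($m) { return rawurlencode($m[0]); },
        $url
    );

    // Escape &, <, >, " and ' so the value is safe inside <loc>.
    return htmlspecialchars($encoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Sample URL (assumed): one Hebrew letter in the path plus a query string.
echo '<loc>', sitemapLoc("http://www.mysite.com/index.php/\xD7\xA6?a=1&b=2"), '</loc>', "\n";
```

With the sample URL this prints <loc>http://www.mysite.com/index.php/%D7%A6?a=1&amp;b=2</loc>, which contains only ASCII and so cannot trigger a UTF-8 byte error in the parser.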
guilleva
Administrator
Posts: 1527
Joined: Wed Sep 12, 2007 3:10 am
Location: San José, Costa Rica

Re: non-latin chars in urls - parsing error in google

Postby yandos » Fri Sep 26, 2008 12:01 am

Hey guilleva,
That did the trick right away...
The sitemap is readable, it validates, and most importantly, Google doesn't yell about errors...
Thanks a bunch for the quick and professional help,
yandos.

Re: non-latin chars in urls - parsing error in google

Postby studio64 » Sun Nov 09, 2008 2:01 pm

Hello,
I have the same problem.
I made the change in xmap.xml.php and the problem is still the same!

Look here: http://www.antiquitesmarsault.fr/index. ... &no_html=1

Do you have any idea?

Thanks,
Laurent
studio64
Fresh Boarder
Posts: 6
Joined: Sun Nov 09, 2008 1:53 pm

Re: non-latin chars in urls - parsing error in google

Postby HADEE_16 » Wed Nov 19, 2008 6:37 pm

http://www.118cd.com/index2.php?option= ... =component
Please visit this address; you can see the error there.
http://www.118cd.com/index2.php?option= ... =component
This is the true URL.
How can I fix this problem?
Thanks
HADEE_16
Fresh Boarder
Posts: 4
Joined: Fri Oct 24, 2008 10:16 am

Re: non-latin chars in urls - parsing error in google

Postby guilleva » Wed Nov 19, 2008 6:42 pm

Hi, your sitemap url is
index.php?option=com_xmap&view=xml

and not that one.

Which version of Xmap/Joomla are you using?

Re: non-latin chars in urls - parsing error in google

Postby jcsarmento » Mon Jun 18, 2012 10:45 pm

Hello,

I have this issue too, but I have the latest versions of sh404sef and Xmap installed.
Somehow the XML has errors on Latin characters, and this causes the sitemap to fail in Google Sitemaps...

It's at http://www.lojanautica.pt/Mapa-do-site-1

Any suggestions?

Many thanks,
jsarmento
jcsarmento
Fresh Boarder
Posts: 1
Joined: Mon Jun 18, 2012 10:32 pm

