Arabic user data captured with our web-to-lead form occasionally ends up as mojibake in our lead table. A user would type something like:
الإعلان العالمى لحقوق الإنسان
When we retrieve the message from the database, it reads:
الإعلان العالمى Ù„Øقوق الإنسان
The form is in an embedded iframe page with these tags:
<!DOCTYPE HTML>
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
<!-- other header elements -->
</head>
<body>
<form accept-charset="utf-8" action="https://www.salesforce.com/servlet/servlet.WebToLead?encoding=UTF-8" method="post">
<!-- other body elements -->
</form>
</body>
</html>
After first noticing the problem, I added the accept-charset attribute to the form tag. Is there anything more I can do in the page markup to prevent the problem?
Since the character scramble only happens occasionally, what is the best way to replicate and isolate the problem? We have several triggers running after a lead insert, and logging all web-to-lead activity generates a huge amount of raw data to wade through in search of the occasional culprit. I am not sure what to look for anyway. The user agent signature? Characters falling inside or outside a certain Unicode range?
Thanks!
Attribution to: JannieT
Possible Suggestion/Solution #1
JannieT, I think the issue may be due to the character encoding specified on the pages where your Web-to-Lead form is placed. For example, a Web-to-Lead form on a page that uses UTF-8 encoding will produce entirely different data in your org than the same form placed on a page using ISO 8859-1 encoding.
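To illustrate the kind of mismatch I mean, here is a minimal Python sketch that encodes two of the affected letters as UTF-8 and then reads the bytes back with a single-byte codec. It assumes Windows-1252 as the wrong codec (browsers generally treat an ISO 8859-1 label as Windows-1252), and the function names garble and ungarble are made up for this example. It reproduces the characters seen in your sample, but it does not explain why only part of a string would be affected.

```python
def garble(text, wrong_codec="cp1252"):
    """Encode as UTF-8, then decode the bytes with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_codec)


def ungarble(text, wrong_codec="cp1252"):
    """Reverse garble(); raises if the text is not cleanly reversible."""
    return text.encode(wrong_codec).decode("utf-8")


original = "\u0644\u062d"  # Arabic letters lam and hah
garbled = garble(original)

# lam is bytes D9 84 and hah is bytes D8 AD in UTF-8; in Windows-1252 those
# four bytes are U+00D9, U+201E, U+00D8 and U+00AD (an invisible soft hyphen).
assert garbled == "\u00d9\u201e\u00d8\u00ad"
assert ungarble(garbled) == original
```

If a stored lead value survives ungarble() and comes back as readable Arabic, that would point to this kind of double decoding somewhere between the browser and the database.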
Can you see whether this is the case with the pages that are producing problematic leads? Take a closer look at the meta elements on your W2L pages, and check whether charset is being set to UTF-8.
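To find the problematic leads in the first place, a rough approach is to scan exported field values for the pattern this kind of garbling leaves behind. The sketch below is only that, a sketch: it assumes the values have already been exported as Python strings, the function names are hypothetical, and it only looks for Arabic letters (U+0600 to U+06FF) whose two UTF-8 bytes were read as Windows-1252 or ISO 8859-1.

```python
# Arabic letters (U+0600-U+06FF) are encoded in UTF-8 as a lead byte 0xD8-0xDB
# plus one continuation byte 0x80-0xBF. Read with a single-byte Western codec,
# the lead byte shows up as one of these four Latin letters.
LEAD_CHARS = "\u00d8\u00d9\u00da\u00db"


def _as_single_byte(ch):
    """Return the byte ch stands for in cp1252 or latin-1, or None if neither fits."""
    for codec in ("cp1252", "latin-1"):
        try:
            return ch.encode(codec)[0]
        except UnicodeEncodeError:
            continue
    return None


def find_mojibake(text):
    """Return (index, garbled_pair, probable_original) for each suspicious pair."""
    hits = []
    for i in range(len(text) - 1):
        if text[i] not in LEAD_CHARS:
            continue
        lead = _as_single_byte(text[i])
        trail = _as_single_byte(text[i + 1])
        if trail is None or not 0x80 <= trail <= 0xBF:
            continue
        # A lead byte in 0xD8-0xDB with a continuation byte is always valid UTF-8.
        original = bytes([lead, trail]).decode("utf-8")
        hits.append((i, text[i:i + 2], original))
    return hits


# One intact Arabic letter, a space, two garbled letters, one intact letter.
sample = "\u0627 \u00d9\u201e\u00d8\u00ad\u0642"
for index, pair, original in find_mojibake(sample):
    print(index, ascii(pair), ascii(original))
```

Running something like this over the lead fields would let you pull out only the affected records and then compare their user agents, referring pages, and submission times, instead of wading through the full log.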
Attribution to: Marty C.
This content is remixed from stackoverflow or stackexchange. Please visit https://salesforce.stackexchange.com/questions/33933