Unicode support

Showing 1-25 of 26 posts in this discussion
Initial post: Oct 26, 2009 9:06:24 AM PDT
Come on Amazon !
You are selling the Kindle worldwide and I can't even read a book in French !
The Nook supports it !
Could you please add Unicode support ?

Posted on Apr 30, 2010 2:18:31 AM PDT
Frozen River says:
It would be great to have unicode support on kindle. And it does not seem to be a big deal to build it in the system.

In reply to an earlier post on Apr 30, 2010 4:57:06 AM PDT
Last edited by the author on Apr 30, 2010 4:59:56 AM PDT
Dragi Raos says:
You *can* read any book in any Latin version, including, say, Croatian Latin digraphs used only when transcribing Serbian Cyrillic and wanting to keep the character count the same (Latin Extended-B; we never use those in Croatian). However, characters not in basic Latin (ISO 8859-1 or even less) should be encoded as HTML "entities", not simply as a Unicode character encoded in UTF-8.

Edit: If I find some time this weekend, I will try to demonstrate this.

In reply to an earlier post on Apr 30, 2010 5:33:46 AM PDT
Paxton says:
I have several books in French on my Kindle. Also German and Spanish (and I believe Italian and Portuguese are available). They use the roman alphabet, and the text is searchable and typeable via the included keyboard.

Other languages (Cyrillic, Arabic, etc) use totally different symbols and unicode programming would mean using two "bytes" of information for each character instead of one. That can significantly change the programming required to handle such text for retrieval, searching and display, not to mention figuring out how to type in search terms, etc, given the current roman keyboard (adding an entire new alphabet to the special symbols page would work, but would be agonizing to use for every character).

In reply to an earlier post on Apr 30, 2010 6:23:31 AM PDT
Dragi Raos says:
PaxtonReader, it is not as simple as "one byte vs. multibyte" (Kindle currently supports a lot more than 256 characters), but I agree with your concern about annotation, search etc. However, I think that native readers of Russian Cyrillic, Traditional Chinese or Cherokee would prefer support for reading their scripts without search capability to no support at all.
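To illustrate the "one byte vs. multibyte" point, here is a minimal Python sketch that prints how many bytes UTF-8 uses for a few sample characters. The sample characters and the helper name `utf8_len` are chosen for illustration only; nothing here reflects how the Kindle firmware stores text internally.

```python
# Sketch: UTF-8 is a variable-width encoding, so the byte count
# depends on the code point, not on a fixed "one or two bytes" rule.
SAMPLES = {
    "Latin e": "e",
    "Latin e with acute": "\u00e9",
    "Cyrillic ya": "\u044f",
    "CJK ideograph": "\u66f8",
    "Cherokee letter A": "\u13a0",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(f"{label}: U+{ord(ch):04X} -> {utf8_len(ch)} byte(s)")
```

Plain ASCII stays at one byte, accented Latin and Cyrillic take two, and CJK or Cherokee take three.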

In reply to an earlier post on Apr 30, 2010 6:28:17 AM PDT
sabst79 says:
I agree. Would buy two more Kindles if they added Russian Cyrillic to it. My grandparents would totally love that. They're starting to have a hard time reading print books because of small text.

In reply to an earlier post on Apr 30, 2010 6:32:07 AM PDT
Dragi Raos says:
Yes, and installing the hack won't help, because your grandparents don't need Cyrillic for their own documents, but for books. Until Amazon supports it officially, there will be no Cyrillic books for Kindle (apart, perhaps, for several classics scanned, proof-read and set by enthusiasts).

In reply to an earlier post on Apr 30, 2010 6:42:55 AM PDT
sabst79 says:
Yes, I've tried searching around for some books, but haven't had much luck finding things. Granted, I didn't spend too much time on it yet. If I do have more luck, I may consider a hack. But you're right, they don't need it for documents, they need it for books.

In reply to an earlier post on Apr 30, 2010 7:00:59 AM PDT
Last edited by the author on Apr 30, 2010 7:04:47 AM PDT
V. Jacobs says: "The Nook supports it !"

The Nook has unicode support because it has multiple fonts (the Helvetica and Amasis fonts are the unicode ones). The Kindle doesn't have multiple fonts, just font sizes. If you use a font hack you can change the font in the Kindle to a unicode font.

Posted on Apr 30, 2010 7:10:21 AM PDT
Last edited by the author on Apr 30, 2010 7:14:23 AM PDT
Amazon would just be buying itself a lot of grief to support languages before they have the buy-in of publishers of books in those languages. People would just switch from, "Why can't the Kindle support Russian?," to "Why does it support Russian but there're no Russian books?"

In general it's a waste of time for U.S. companies to pursue evangelizing content providers in marginal markets. They just have to come around of their own accord. I live in Japan and I know how it is. The publishing (and music) culture and business just doesn't "get" it yet.* Things will eventually move when the influence from abroad on business and consumers reaches a certain threshold, but that's beyond the power of Amazon alone to influence.

[*In Japan several major labels are not on iTunes yet, and some labels release "CDs" that don't play on computers, so they are not even officially "CDs" according to the standards.]

Posted on Apr 30, 2010 7:28:06 AM PDT
I think we'll see this in the next generation of the Kindle. While it wouldn't be impossible to "add" it to the current code (it's much more work than just adding it), it would require significant effort since it wasn't included in the initial code effort.

Posted on Apr 30, 2010 7:34:27 AM PDT
Merkin, it appears that the Kindle DOES have internal unicode support - otherwise a Unicode hack wouldn't be possible. However, that doesn't mean that the book formats support unicode; if they don't, adding unicode across the ecosystem would be a very major effort.

In reply to an earlier post on Apr 30, 2010 7:44:46 AM PDT
Dragi Raos says:
David, Kindle *does* support Unicode, and mobi (a.k.a. azw) format does, too*. It is just that Kindle *fonts* support only Latin with almost all of its extensions, some obscure, some not, and Greek. Adding Cyrillic would be almost trivial, once a source of good, not too expensive font is found.

*) However, as far as I understand, mobi support for non-Latin 1 (or even everything other than base Unicode Latin, which is more or less old 7-bit ASCII) requires encoding characters as HTML "entities", at least in the source. While it is very simple to make a script, editor macro, whatever to do required substitution, one ends up with practically non human readable HTML. So, another step in preparing the text is introduced.
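The substitution script mentioned above could look roughly like this in Python. It is only a sketch: it escapes every non-ASCII character as a numeric character reference, which is the strictest reading of the requirement, and the function name `to_entities` is made up for this example. Whether a given Mobipocket tool actually needs this step is not confirmed here.

```python
def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity.

    ASCII characters, including markup such as < and >, pass through
    unchanged, so this can be run over a whole HTML source file.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    source = "<p>\u010covjek, \u044f, \u00e9</p>"
    print(to_entities(source))
```

The output shows the readability problem: a single Cyrillic letter turns into something like `&#1103;`, so a page of Russian becomes a wall of numbers in the source.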

In reply to an earlier post on Apr 30, 2010 7:49:01 AM PDT
Dragi Raos says:
"In general it's a waste of time for U.S. companies to pursue evangelizing content providers in marginal markets. "

You think that the Russian (plus Ukrainian, Belarusian etc.) market is marginal? Or Chinese? In this case, it is clear what comes first, chicken or egg: until official Cyrillic (and CJK Unified, Arabic...) support is there, there will be no content, quite obviously.

Posted on Apr 30, 2010 7:49:38 AM PDT
Last edited by the author on Apr 30, 2010 7:57:38 AM PDT
Dragi, which means that supporting non Latin (which is what most of the Unicode proponents here want), really should be done via a .mobi/.azw file format update so that Unicode is a native option rather than using HTML entities (which I agree would be messy, and which would increase the size of books perhaps fivefold).
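The size estimate can be checked with a short Python sketch that compares the byte count of raw UTF-8 against the same text written as numeric entities. This measures source text only and ignores any compression inside the book container, so it is an upper bound on the overhead, not a measurement of real .mobi files. The helper name `entity_overhead` is hypothetical.

```python
def entity_overhead(text: str) -> float:
    """Return (size with numeric entities) / (size as raw UTF-8)."""
    raw = len(text.encode("utf-8"))
    escaped = len(text.encode("ascii", "xmlcharrefreplace"))
    return escaped / raw


if __name__ == "__main__":
    # One Cyrillic letter: 2 bytes raw, 7 bytes as "&#1103;".
    print(entity_overhead("\u044f"))
    # One CJK ideograph: 3 bytes raw, 8 bytes as "&#26360;".
    print(entity_overhead("\u66f8"))
    # ASCII text is unaffected.
    print(entity_overhead("abc"))
```

Under these assumptions the growth for pure Cyrillic is about 3.5 times relative to UTF-8, and spaces and punctuation pull the real figure lower.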

My larger point, which may have not been clear, is that Amazon may not be particularly interested in adding Unicode only for personal documents, but rather, in doing Unicode across the ecosystem including all of their internal tools, so that they can sell to markets they are not selling to now. That doesn't mean it won't happen, but it's a larger effort.

EDIT: By "sell to markets", I mean "sell E-BOOKS to markets". The value in adding Unicode to them is to open up further retail, not to satisfy a segment who are interested in personal documents.

In reply to an earlier post on Apr 30, 2010 7:52:00 AM PDT
Dragi Raos says:
Aha, I see your point now. I agree.

Posted on Apr 30, 2010 11:37:44 AM PDT
Using Kindlegen, at least, there's an option for forcing Unicode in the output format, and I've successfully used UTF-8 source files with raw special characters (curly quotes, em-dashes, and diacritics as-is; no entity-escaping). The resulting .mobi comes out just fine and displays without problems on my Kindle.

However, you do have to put an XML declaration with the encoding at the top of the HTML sources to get Kindlegen to see them as Unicode *input*, whether or not you also use an OPF. Otherwise the text converts as garbage, even if you add a meta http-equiv or x-metadata/output encoding line in the HTML or OPF.
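As a rough sketch of that preparation step, the Python helper below makes sure an HTML source string starts with a UTF-8 XML declaration. The function name `ensure_xml_declaration` is invented for this example, and the sketch assumes the file content really is UTF-8; it does not call Kindlegen or rely on any of its options.

```python
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>\n'


def ensure_xml_declaration(html: str) -> str:
    """Return html with an XML declaration as the very first thing.

    A leading byte-order mark and leading whitespace are dropped,
    because the declaration must come first. An existing declaration
    is left as it is and its encoding is not checked.
    """
    body = html.lstrip("\ufeff").lstrip()
    if body.startswith("<?xml"):
        return body
    return XML_DECL + body


if __name__ == "__main__":
    print(ensure_xml_declaration("<html><body>\u044f</body></html>"))
```

When writing the result back to disk, open the file with `encoding="utf-8"` so the bytes match what the declaration claims.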

Before that, I used to use the mobiperl tools and they indeed had problems with unescaped Unicode characters, which the developer once mentioned was because adding proper Unicode support in Perl was proving so complicated that he just didn't want to bother anymore.

I wish Amazon would put out tools to make Topaz-format books, or at least just support un-DRM ePub with font-embedding. Sure, you'd keep losing your page place and it would probably lock up your Kindle in the bargain, but either would make the whole multilingual script situation a whole lot easier.

In reply to an earlier post on Apr 30, 2010 12:06:22 PM PDT
Dragi Raos says:
Hmm, interesting... HTML "entities" I was talking about are necessary (I think) in Mobipocket eBook Creator HTML source - it is quite possible that the resulting .prc (or is it .mobi?) uses "raw" UTF-8 - that would be a reasonable thing to do. I will try a bit of reverse engineering when I catch some spare time.

In reply to an earlier post on Apr 30, 2010 12:27:35 PM PDT
I read up about this issue about a month ago when I started converting my own .mobi files. I think I recall reading that the default charset for the Mobipocket format is actually Windows-1252, not Latin-1, and thus that's why stuff has to be entitied in the first place, and you have to specify Unicode if you want it.
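The difference between Windows-1252 and Latin-1 is easy to see with Python's built-in codecs. This sketch only tests which characters each codec can represent; the helper name `encodable` is made up for the example, and it says nothing certain about what any particular Mobipocket tool does with them.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the character can be encoded with the codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    samples = {
        "e with acute": "\u00e9",
        "left curly quote": "\u201c",
        "em dash": "\u2014",
        "Cyrillic ya": "\u044f",
    }
    for label, ch in samples.items():
        print(
            f"{label}: latin-1={encodable(ch, 'latin-1')}, "
            f"cp1252={encodable(ch, 'cp1252')}"
        )
```

Curly quotes and long dashes exist in Windows-1252 but not in Latin-1, and Cyrillic exists in neither, so it needs either entities or a Unicode output setting.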

I have a Mac, so I couldn't use the official Mobipocket software (apparently not Wine-compatible), and when I used the mobiperl tools, yeah, I had to search and replace every non-ASCII character, otherwise it would come out wrong.

Incidentally, if anyone thought of trying the suggested hack to edit one of the mobiperl support files to hopefully enable Unicode as the default charset: it doesn't work.

Also, if you don't remember to switch it back to Windows-1252, even if you only ever convert ASCII files again, you'll end up with .mobi where everything mostly seems okay, but at certain points a word or two will be totally replaced with seeming typos or random garbage characters when you read them on the Kindle device, which you won't be able to track down in the original source or even if you convert back out to try to debug.

If you've ever wondered why this mysteriously happened to a file you converted with mobiperl, well, that's why.

Posted on Jun 19, 2012 2:10:17 AM PDT
LI LI says:
I've looked into a lot of hacks, but I'm afraid of turning my Kindle into a brick. Can Amazon please add built-in Unicode support? I once owned a Kindle 3 WiFi/free 3G, which supported Unicode quite well.
It is sad that an end user has to find ways to crack the official build in order to read a Unicode book....

In reply to an earlier post on Jun 19, 2012 2:25:41 AM PDT
@LI LI: If you want to be certain that someone from Amazon hears your feedback, please send an email to - we are mainly Kindle owners here, not from Amazon.

In reply to an earlier post on Jun 19, 2012 5:24:48 AM PDT
It would appear that the Kindle Touch is fine with unicode. I just downloaded a book in Chinese from Gutenberg and copied it to my kindle. I see lots of nice Chinese characters, that I can't read, but they look fine.

Here is the book I tried: (Now realize that I know NOTHING about this book, it might be a technical manual, or Ming Dynasty erotica for all I know.)

In reply to an earlier post on Jun 19, 2012 6:46:47 AM PDT
Dragi Raos says:
K4PC looks OK with that, too...

Posted on Sep 20, 2013 4:44:35 AM PDT
I need some advice as to how to proceed in creating a Russian language textbook that requires accent marks over Cyrillic vowels, for example "я́". Cyrillic Unicode is now supported on Kindles.

Accented Russian vowels in Unicode require the vowel letter plus U+0301, a combining acute accent mark.
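A small Python sketch shows why the combining mark cannot be avoided for Cyrillic. Unicode normalization (NFC) merges a Latin vowel plus U+0301 into one precomposed character, but no precomposed code points exist for stressed Russian vowels, so the pair stays as two code points. The helper name `with_acute` is invented for the example.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def with_acute(letter: str) -> str:
    """Append a combining acute accent and normalize to NFC."""
    return unicodedata.normalize("NFC", letter + ACUTE)


if __name__ == "__main__":
    latin = with_acute("e")          # composes to the single code point U+00E9
    cyrillic = with_acute("\u044f")  # stays as U+044F followed by U+0301
    for label, s in (("Latin", latin), ("Cyrillic", cyrillic)):
        points = " ".join(f"U+{ord(c):04X}" for c in s)
        print(f"{label}: {len(s)} code point(s): {points}")
```

Every reading device therefore has to position the accent itself at render time, which is where the differences between devices come from.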

But different Kindle devices handle U+0301 in different ways. The monochrome Kindle devices display the accent mark, although on some devices (like my $69 Kindle), the accent mark is off-center, slightly to the right. (It shows up correctly in Kindle Previewer.)

But in Kindle Fire (according to the results from Kindle Previewer), the accented vowels don't show up in any readable way. The entire word has letters superimposed one on another. There is a workaround for the Kindle Fires: I can remove the Unicode accent mark and replace it with a character from a special accent font (made up of Russian accented vowels). Then all I have to do is embed the font.
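One way such a workaround might be scripted is sketched below in Python. It rests on assumptions that are not in the post: that the accent font draws an accented glyph for each plain vowel code point, and that the vowel is selected with a span whose class name (`stress` here) is tied to the embedded font in the book's CSS. The class name, the vowel list and the function name are all hypothetical.

```python
import re

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
# Russian vowels, lower and upper case (assumed list for this sketch).
VOWELS = "аеиоуыэюяАЕИОУЫЭЮЯ"

# A vowel immediately followed by the combining acute accent.
_STRESSED = re.compile(f"([{VOWELS}]){ACUTE}")


def to_accent_font(text: str) -> str:
    """Drop U+0301 and wrap the stressed vowel in a span.

    The span class is hypothetical; the stylesheet would have to map it
    to the embedded accent font.
    """
    return _STRESSED.sub(r'<span class="stress">\1</span>', text)


if __name__ == "__main__":
    print(to_accent_font("мо\u0301ре"))
```

Keeping the plain vowel in the text means the words stay searchable, at the cost of depending on font embedding working on the target device.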

So here are my questions for those in the know:

1. Kindle promises that font embedding will become universal on all Kindles. In the near future?
2. Will old Kindles be updatable (software? firmware?)
3. Am I best off preparing two separate mobi files and distributing the textbook in separate versions: one for color devices and one for monochrome?
4. Any other advice as to how to combine an acute accent mark and Cyrillic Unicode vowels.

In reply to an earlier post on Sep 20, 2013 4:58:04 AM PDT
Dragi Raos says:
Richard, I might be wrong, but I believe that the K8 format supports embedded fonts. I think that devices from K3/KK up have been upgraded to support the new format, but the quality of support might vary.

I think you are more likely to get a well informed answer at these places:
This discussion

Discussion in:  Kindle forum
Participants:  14
Total posts:  26
Initial post:  Oct 26, 2009
Latest post:  Sep 22, 2013
