Text

A long time ago
In a galaxy not far away at all
3200 BCE 𒁹 𒈫 𒐍 𒀭 𒅅 𒊓 𒋟
2000 BCE 𓇍𓅓𓊵 𓃾 𓉐 𓌙 𓇯 𓀠 𓏲 𓏭 𓈈 𓄤 𓂝 𓂧 𓌅 𓈖 𓆓 𓊽 𓁹 𓂋 𓇑 𓃻 𓁶 𓌓 𓏴
1200 BCE 目 人 木 魚 马 马 豕 丹佛你好
1100 BCE 𐤕 𐤔 𐤓 𐤒 𐤑 𐤐 𐤏 𐤎 𐤍 𐤌 𐤋 𐤊 𐤉 𐤈 𐤇 𐤆 𐤅 𐤄 𐤃 𐤂 𐤁 𐤀
800 BCE 𐡕 𐡔 𐡓 𐡒 𐡑 𐡐 𐡏 𐡎 𐡍 𐡌 𐡋 𐡊 𐡉 𐡈 𐡇 𐡆 𐡃 𐡄 𐡃 𐡂 𐡁 𐡀
Α Β Γ Δ Ε Ϝ Ζ Η Θ Ι Κ Λ 𐌌 Ν Ξ Ο Π Ϻ Ϙ Ρ Σ Τ Ͳ Υ Φ Χ Ψ Ω
700 BCE 𐌀 𐌂 𐌄 𐌅 𐌆 𐌇 𐌈 𐌉 𐌊 𐌋 𐌌 𐌍 𐌐 𐌑 𐌒 𐌓 𐌔 𐌕 𐌖 𐌗 𐌘 𐌙 𐌚
A B C D E F Z H I K L M N O P Q R S T V X
200 BCE 𑀓 𑀔 𑀕 𑀖 𑀗 𑀘 𑀙 𑀚 𑀛 𑀜 𑀝 𑀞 𑀟 𑀠 𑀡 𑀢 𑀣 𑀤 𑀥 𑀦 𑀧 𑀨 𑀩 𑀪 𑀫 𑀬 𑀭 𑀮 𑀯 𑀰 𑀱 𑀲 𑀳
𝋠 𝋡 𝋢 𝋣 𝋤 𝋥 𝋦 𝋧 𝋨 𝋩 𝋪 𝋫 𝋬 𝋭 𝋮 𝋯 𝋰 𝋱 𝋲 𝋳
200 CE ᚠ ᚢ ᚦ ᚨ ᚱ ᚲ ᚷ ᚹ ᚻ ᚾ ᛁ ᛃ ᛇ ᛈ ᛉ ᛋ ᛏ ᛒ ᛖ ᛗ ᛚ ᛜ ᛝ ᛟ ᛞ
400 CE a b c d e f h i j k l m n o p q r s t v x z
السلام عليكم
600 CE 挨拶
ༀ ༁ ༂ ༃ ༄ ༅ ༆ ༇ ༈ ༉ ༊ ༑ ༒ ༓ ༔ ༕ ༖ ༗ ࿐ ࿑ ࿒ ࿓ ࿉ ࿊ ࿋ ࿌
900 CE こんにちは
А В Г Д Е Ж Ѕ Ꙁ И І К Л М О П Р С Т ОУ Ф Х Ѡ Ц Щ ЪІ Ѣ Ꙗ Ѥ Ю Ѫ Ѭ Ѧ Ѩ Ѯ Ѱ Ѳ Ѵ Ҁ
ハロー
नमस्ते, दुनिया
1400 CE 안녕하십니까
3200 BCE □□□□□□□□□□□□□□□□□□□□□□□□□
2000 BCE ð“‡ð“…“𓊵 𓃾 𓉠𓌙 𓇯 ð“€  𓲠𓭠𓈈ˆ– 𓆠𓂋 𓇑 𓃻 𓶠𓌓 ð“´
1200 BCE 目 人 木 魚 马 马 豕 丹佛你好
1100 BCE ?% ?% ?# ?%? ?" ?" ?% "e% R ?キ?Q ?%S ?%T O%P ?%U ?%W ?%X
800 BCE ð¡• ð¡” ð¡“ ð¡’ ð¡‘ ð¡ ð¡ ð¡Ž ð¡ 𡌠𡋠𡊠𡉠𡈠𡇠𡆠𡃠𡄠𡃠𡂠ð¡ ð¡€
Α Î’ Γ Δ Ε Ïœ Ζ Η Θ Ι Κ Λ ðŒŒ Î Îž Ο Π Ϻ Ϙ Ρ Σ Τ Ͳ Î¥ Φ Χ Ψ Ω
700 BCE ðŒ€ ðŒ‚ ðŒ„ ðŒ… ðŒ† ðŒ‡ ðŒˆ ðŒ‰ ðŒŠ ðŒ‹ ðŒŒ ðŒ ðŒ ðŒ˜ ðŒ™ ðŒš
A B C D E F Z H I K L M N O P Q R S T V X
200 BCE 𑀓 ð‘€” 𑀕 ð‘€– ð‘€— 𑀘 ð‘€™ 𑀚 ð‘€› 𑀜 𑀠𑀞 𑀟 ð€¯ ð‘€° ð‘€± ð‘€² ð‘€³
ð‹ð‹¡ ð‹¢ ð‹£ ð‹¤ ð‹¥ ð‹¦ ð‹§ ð‹¨ ð‹© ð‹ª ð‹« ð‹¬ ð‹­ ð‹® ð‹¯ ð‹° ð‹± ð‹² ð‹³
200 CE áš  ᚢ ᚦ ᚨ áš± áš² áš· áš¹ áš» áš¾ ᛠᛃ ᛇ ᛈ ᛉ ᛋ á› á›’ á›– á›— ᛚ ᛜ á› á
400 CE a b c d e f h i j k l m n o p q r s t v x z
السلام عليكم
600 CE 挨拶
ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?? ァ。?ィ「ゥ ァ。?ィィ ァ。?ィェ ァ。?? ァ。?? ァ。?? ァ。?「
900 CE ã“ã‚“ã«ã¡ã¯
Ð Ð’ Г Д Е Ж Ð… Ꙁ И І К Л Ðœ РО П Р ÐП ¡¦ Ѩ Ñ® Ñ° Ѳ Ñ´ Ò€
ãƒãƒ­ãƒ¼
नमसà¥à¤¤à¥‡, दà¥à¤¨à¤¿à¤¯à¤¾
1400 CE 안녕하십니까

Text is complicated

What do you mean when you say nothing at all?
- Kate Gregory

What do you mean?
- Kevlin Henney

What do you mean when you say æ–‡å—化け?

Text is just semantics

No need to care about text...

... unless you have users

  • Communication
  • Payroll
  • Trading
  • Commerce
  • UI
  • User-facing i/o
  • CppCon Badges

ASCII?

ASCII

Seriously Can't Interchange Information

  • ångströms
  • café
  • resumé
  • piñata
  • Beyoncé
  • naïve
  • façade
  • hāngi
  • belovèd
  • latté
0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | }Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě Ĝ ĝ Ğ ğ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į İ ı IJ ij Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ ŀ Ł ł Ń ń Ņ ņ Ň ň ʼn Ŋ ŋ Ō ō Ŏ ŏ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş Š š Ţ ţ Ť ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ų ų Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž ſ ƀ Ɓ Ƃ ƃ Ƅ ƅ Ɔ Ƈ ƈ Ɖ Ɗ Ƌ ƌ ƍ Ǝ Ə Ɛ Ƒ ƒ Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ɵ Ơ ơ Ƣ ƣ Ƥ ƥ Ʀ Ƨ ƨ Ʃ ƪ ƫ Ƭ ƭ Ʈ Ư ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ ƹ ƺ ƻ Ƽ ƽ ƾ ƿ ǀ ǁ ǂ ǃ DŽ Dž dž LJ Lj lj NJ Nj nj Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ Ǡ ǡ Ǣ ǣ Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ ǰ DZ Dz dz Ǵ ǵ Ƕ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ ȉ Ȋ ȋ Ȍ ȍ
Latin: more than 1,000 characters
Ȏ ȏ Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ Ƞ ȡ Ȣ ȣ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ ȯ Ȱ ȱ Ȳ ȳ ȴ ȵ ȶ ȷ ȸ ȹ Ⱥ Ȼ ȼ Ƚ Ⱦ ȿ ɀ Ɂ ɂ Ƀ Ʉ Ʌ Ɇ ɇ Ɉ ɉ Ɋ ɋ Ɍ ɍ Ɏ ɏ Ḁ ḁ Ḃ ḃ Ḅ ḅ Ḇ ḇ Ḉ ḉ Ḋ ḋ Ḍ ḍ Ḏ ḏ Ḑ ḑ Ḓ ḓ Ḕ ḕ Ḗ ḗ Ḙ ḙ Ḛ ḛ Ḝ ḝ Ḟ ḟ Ḡ ḡ Ḣ ḣ Ḥ ḥ Ḧ ḧ Ḩ ḩ Ḫ ḫ Ḭ ḭ Ḯ ḯ Ḱ ḱ Ḳ ḳ Ḵ ḵ Ḷ ḷ Ḹ ḹ Ḻ ḻ Ḽ ḽ Ḿ ḿ Ṁ ṁ Ṃ ṃ Ṅ ṅ Ṇ ṇ Ṉ ṉ Ṋ ṋ Ṍ ṍ Ṏ ṏ Ṑ ṑ Ṓ ṓ Ṕ ṕ Ṗ ṗ Ṙ ṙ Ṛ ṛ Ṝ ṝ Ṟ ṟ Ṡ ṡ Ṣ ṣ Ṥ ṥ Ṧ ṧ Ṩ ṩ Ṫ ṫ Ṭ ṭ Ṯ ṯ Ṱ ṱ Ṳ ṳ Ṵ ṵ Ṷ ṷ Ṹ ṹ Ṻ ṻ Ṽ ṽ Ṿ ṿ Ẁ ẁ Ẃ ẃ Ẅ ẅ Ẇ ẇ Ẉ ẉ Ẋ ẋ Ẍ ẍ Ẏ ẏ Ẑ ẑ Ẓ ẓ Ẕ ẕ ẖ ẗ ẘ ẙ ẚ ẛ ẜ ẝ ẞ ẟ Ạ ạ Ả ả Ấ ấ Ầ ầ Ẩ ẩ Ẫ ẫ Ậ ậ Ắ ắ Ằ ằ Ẳ ẳ Ẵ ẵ Ặ ặ Ẹ ẹ Ẻ ẻ Ẽ ẽ Ế ế Ề ề Ể ể Ễ ễ Ệ ệ Ỉ ỉ Ị ị Ọ ọ Ỏ ỏ Ố ố Ồ ồ Ổ ổ Ỗ ỗ Ộ ộ Ớ ớ Ờ ờ Ở ở Ỡ ỡ Ợ ợ Ụ ụ Ủ ủ Ứ ứ Ừ ừ Ử ử Ữ ữ Ự ự Ỳ ỳ Ỵ ỵ Ỷ ỷ Ỹ ỹ Ỻ ỻ Ỽ ỽ Ỿ ỿ ꜠ ꜡ Ꜣ ꜣ Ꜥ ꜥ Ꜧ ꜧ Ꜩ ꜩ Ꜫ ꜫ Ꜭ ꜭ Ꜯ ꜯ ꜰ ꜱ Ꜳ ꜳ Ꜵ ꜵ Ꜷ ꜷ Ꜹ ꜹ Ꜻ ꜻ Ꜽ ꜽ Ꜿ ꜿ Ꝁ ꝁ Ꝃ ꝃ Ꝅ ꝅ Ꝇ ꝇ Ꝉ ꝉ Ꝋ ꝋ Ꝍ ꝍ Ꝏ ꝏ Ꝑ ꝑ Ꝓ ꝓ Ꝕ ꝕ Ꝗ ꝗ Ꝙ ꝙ Ꝛ ꝛ Ꝝ ꝝ Ꝟ ꝟ Ꝡ ꝡ Ꝣ ꝣ Ꝥ ꝥ Ꝧ ꝧ Ꝩ ꝩ Ꝫ ꝫ Ꝭ ꝭ Ꝯ ꝯ ꝰ ꝱ ꝲ ꝳ ꝴ ꝵ ꝶ ꝷ ꝸ Ꝺ ꝺ Ꝼ ꝼ Ᵹ Ꝿ ꝿ Ꞁ ꞁ Ꞃ ꞃ Ꞅ ꞅ Ꞇ ꞇ ꞈ ꞉ ꞊ Ꞌ ꞌ Ɥ ꞎ ꞏ Ꞑ ꞑ Ꞓ ꞓ ꞔ ꞕ Ꞗ ꞗ Ꞙ ꞙ Ꞛ ꞛ Ꞝ ꞝ Ꞟ ꞟ Ꞡ ꞡ Ꞣ ꞣ Ꞥ ꞥ Ꞧ ꞧ Ꞩ ꞩ Ɦ Ɜ Ɡ Ɬ Ɪ ꞯ Ʞ Ʇ Ʝ Ꭓ Ꞵ ꞵ Ꞷ ꞷ ꟷ ꟸ ꟹ ꟺ ꟻ ꟼ ꟽ ꟾ ꟿ ꬰ ꬱ ꬲ ꬳ ꬴ ꬵ ꬶ ꬷ ꬸ ꬹ ꬺ ꬻ ꬼ ꬽ ꬾ ꬿ ꭀ ꭁ ꭂ ꭃ ꭄ ꭅ ꭆ ꭇ ꭈ ꭉ ꭊ ꭋ ꭌ ꭍ ꭎ ꭏ ꭐ ꭑ ꭒ ꭓ ꭔ ꭕ ꭖ ꭗ ꭘ ꭙ ꭚ ꭛ ꭜ ꭝ ꭞ ꭟ ꭠ ꭡ ꭢ ꭣ ꭤ ꭥ

Latine non est responsum (Latin is not the answer)
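To make the ASCII limitation concrete, here is a minimal sketch of a 7-bit check. The helper name `is_ascii` is hypothetical, and the input is assumed to be a UTF-8 byte string.

```cpp
#include <string>

// Hypothetical helper: returns true when every byte fits in 7 bits,
// i.e. when the text is pure ASCII.
bool is_ascii(const std::string& text) {
    for (unsigned char byte : text) {
        if (byte > 0x7F) {
            return false;  // any byte with the high bit set is outside ASCII
        }
    }
    return true;
}
```

Under these assumptions, `is_ascii("cafe")` holds, while `is_ascii("caf\xC3\xA9")` (the UTF-8 bytes of "café") does not.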

wchar_t

Unicode

&

UTF-8
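As an illustration of how UTF-8 relates to Unicode, the sketch below encodes a single code point into 1 to 4 bytes. The function name `encode_utf8` is hypothetical, and returning an empty string for invalid input is a choice made for this example only.

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8.
// Returns an empty string for surrogates and values above U+10FFFF.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx (identical to ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out;  // surrogates are not valid scalar values
        }
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, U+0041 (A) stays a single byte, U+00E9 (é) takes two bytes, and U+1F642 (🙂) takes four.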

Unicode

1988

1996

Characters per character set

Writing systems are legacy
 

Writing systems are heritage (patrimoine), not legacy

All programming languages supporting Unicode...

  • Swift
  • Java/JavaScript
  • Python 3
  • Safe Rust
  • D
  • Perl
  • PHP
  • C#

 

... Except

  • C
  • C++

Libraries

  • ICU
  • Qt
  • Boost.Text
  • std::locale

SG-16 is working on it

  • C++20: char8_t
  • C++23: Many proposals

We meet every couple of weeks to talk about cafés and ıstanbul Istanbul istanbul Constantinople
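Why does Istanbul keep coming up? A sketch of byte-wise, ASCII-only uppercasing shows the problem. The helper `ascii_upper` is hypothetical and is not what any particular library does. It leaves the dotless ı (U+0131, UTF-8 bytes C4 B1) untouched, and it maps i to I, whereas Turkish expects İ (U+0130).

```cpp
#include <string>

// Hypothetical helper: uppercases ASCII letters one byte at a time.
// Every other byte, including each byte of a multi-byte UTF-8
// sequence, is copied unchanged.
std::string ascii_upper(std::string text) {
    for (char& c : text) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - 'a' + 'A');
        }
    }
    return text;
}
```

With this sketch, "istanbul" becomes "ISTANBUL", but "ıstanbul" keeps its lowercase ı. Correct case mapping needs both Unicode data and knowledge of the language.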

Maybe Text is too complicated

Let's go back to hieroglyphs 2.0

𓁶 𓁃 𓁹 𓀠 𓆛 𓅓 𓂝 𓆉
🙂 👨‍🌾 👁️ 🤷 🐟 🦉 💪 🐢
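Emoji are not simple either. The farmer above is one glyph on screen but three code points: U+1F468 (man), U+200D (zero width joiner), U+1F33E (ear of rice). A minimal decoding sketch makes this visible. The name `decode_utf8` is hypothetical, and the code assumes well-formed UTF-8 with no validation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes UTF-8 bytes into code points.
// Assumes well-formed input; continuation bytes and overlong forms
// are not validated. Decoding stops at a truncated trailing sequence.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 1;
        char32_t cp = lead;
        if (lead >= 0xF0) {
            len = 4;
            cp = lead & 0x07;
        } else if (lead >= 0xE0) {
            len = 3;
            cp = lead & 0x0F;
        } else if (lead >= 0xC0) {
            len = 2;
            cp = lead & 0x1F;
        }
        if (i + len > bytes.size()) {
            break;  // truncated sequence: do not read past the end
        }
        for (std::size_t k = 1; k < len; ++k) {
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

Feeding it the 11 bytes of the farmer emoji yields three code points, so neither the byte count nor the code point count matches what a user would call one character.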

7 bits of information

  • Unicode consortium
  • Google□s Noto project
  • Windows 1903
    (please update all Windows systems by the next WG21 meeting)
  • Peter Bindels' Unicode talk
  • JeanHeyd Meneide's encoding talk
  • Hana Dusァ?\kovァ?Q's CTRE
  • Michaャヨヲ」 Dominiak, Martin Hoャヨ。テeャヨヲィovskャ?k , ?ミ□?ミ□?ミ□ミーミイ, ?犂ワコ?省???祥ナ?, ...