/r/Unicode

Discussing Unicode, Unicode characters and Unicode-related tools.

This subreddit is for clever uses of Unicode: creating art, emoticons, or anything else that can be made with the characters Unicode allows.

5,635 Subscribers

3

Why is UTF-8 so sparse? Why have overlong sequences?

UTF-8 could avoid overlong encodings and be more efficient by indexing from some offset in sequences that consist of multiple bytes instead of starting from 0.

For example:

If the sequence is 2 bytes long then those bytes will be 110abcde 10fghijk and the codepoint will be abcdefghijk (where each variable is a bit and is concatenated, not multiplied).

But why not make it so that instead the codepoint is equal to abcdefghijk + 10000000 (in binary)? Adding 128 would get rid of overlong sequences of 2 bytes and would make 128 characters 2 bytes long instead of 3 bytes long.

For example, with this encoding 11000000 10100000 would not be an overlong space (codepoint 32), but instead would refer to codepoint 32+128, that is, 160.

In general, if a sequence is n bytes then we would add one more than the highest code point representable with n-1 bytes (e.g., with two bytes add 128 because the highest code point of 1 byte is 127 and one more than that is 128).
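To make the idea concrete, here is a minimal Python sketch of the offset scheme described above. It is not real UTF-8 and not any existing standard; the names `PAYLOAD_BITS`, `OFFSETS`, `offset_encode` and `offset_decode` are made up for illustration, and the sketch assumes the usual UTF-8 lead/continuation byte layout for sequences of up to 4 bytes.

```python
# Hypothetical offset-based variant of UTF-8 (illustration only, not a standard).
# Payload bits carried by a 1-, 2-, 3- and 4-byte sequence in the UTF-8 layout.
PAYLOAD_BITS = [7, 11, 16, 21]
LEAD_MARKERS = [0x00, 0xC0, 0xE0, 0xF0]

# OFFSETS[n - 1] is the first code point that an n-byte sequence represents:
# one more than the highest code point reachable with n - 1 bytes.
OFFSETS = [0]
for bits in PAYLOAD_BITS[:-1]:
    OFFSETS.append(OFFSETS[-1] + (1 << bits))
# OFFSETS is now [0, 128, 2176, 67712]


def offset_encode(cp: int) -> bytes:
    """Encode a code point using the shortest (and only) possible sequence."""
    if cp < 0:
        raise ValueError("code point must be non-negative")
    # Pick the longest length whose offset does not exceed the code point.
    n = max(i + 1 for i, off in enumerate(OFFSETS) if cp >= off)
    value = cp - OFFSETS[n - 1]
    if value >= (1 << PAYLOAD_BITS[n - 1]):
        raise ValueError("code point too large for 4 bytes")
    if n == 1:
        return bytes([value])
    out = []
    for _ in range(n - 1):
        out.append(0x80 | (value & 0x3F))  # continuation byte 10xxxxxx
        value >>= 6
    out.append(LEAD_MARKERS[n - 1] | value)  # lead byte with remaining bits
    return bytes(reversed(out))


def offset_decode(data: bytes) -> int:
    """Decode exactly one sequence produced by offset_encode."""
    first = data[0]
    if first < 0x80:
        n, value = 1, first
    elif first >> 5 == 0b110:
        n, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        n, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        n, value = 4, first & 0x07
    else:
        raise ValueError("invalid lead byte")
    if len(data) != n:
        raise ValueError("wrong sequence length")
    for b in data[1:]:
        if b >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (b & 0x3F)
    return value + OFFSETS[n - 1]
```

With this sketch, `offset_decode(bytes([0b11000000, 0b10100000]))` gives 160 rather than an overlong 32, and code point 2175 takes 2 bytes where standard UTF-8 needs 3. Every byte sequence maps to a different code point, so overlong forms cannot exist by construction.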

I hope you get what I mean. I find it difficult to explain, and I find it even more difficult to understand why UTF-8 was not made more efficient and secure like this.

7 Comments
2024/12/20
17:50 UTC

1

Is there any flipped Ɥ?

Ok guys don't lie Ɥ but flipped looks cool

5 Comments
2024/12/20
06:51 UTC

1

1 Comment
2024/12/18
19:03 UTC

0

I want a blank name on this game called Mine-Craft .io

yeah so everything ingame shows as a "?" so can someone find me a symbol that works? ty

12 Comments
2024/12/18
05:53 UTC

29

Why does 𓂺 render as a box on Windows, but render normally when preceded or followed by another glyph, as in 𓂺𓂻/𓂻𓂺?

I spent an afternoon learning about byte encoding and codemaps, and I still can't figure out exactly why the former doesn't work but the latter does. It's just morbid curiosity at this point >_>

7 Comments
2024/12/16
17:16 UTC

4

Why is there no jelly/jam emoji?

I looked it up on unicode.org and there were two requests for jelly and jam emojis, but they were both rejected. I think it's very silly that they have so many other niche emojis, but not one for a very common food item. What are your thoughts on this?

2 Comments
2024/12/16
03:56 UTC

1

Are there any invisible characters which work analogously to this fellow: _

I'm trying to make some funky looking text for a YouTube video, but I'm working with a video editor that isn't very friendly and won't let me move text boxen around when it's doing a specific effect. I very much want to have the text boxen do that effect in a different place, so I'm pushing the letters around with zero-width characters, but they don't line up correctly with the visible characters because the visible characters include an underscore. I might also need an invisible character which is read as *not* being an underscore-like fellow, because the editor only lets me put underscores in the right places if I put non-underscore characters in as well, and I would like those to be invisible too.
What an odd life it is sometimes, no?

1 Comment
2024/12/14
21:05 UTC

2

Why, when I search for "achive L2/06-369", can I only see an HTML page and not the PDF of the document request?

I've been having problems with this because most of the old Unicode proposals from the year 1993 are not online, or I guess you need to pay money to see the PDFs.

1 Comment
2024/12/13
23:29 UTC

4

Why have surrogate characters and UTF-16?

I know how surrogates work, but I do not understand why UTF-16 is made to require them, and why Unicode bends over backwards to support it. Unicode wastes space with those surrogate characters, which are useless in general because they are only used by one specific encoding.

Why not make UTF-16 more like UTF-8, so that it uses 2 bytes for characters that need up to 15 bits, and for other characters sets the first bit of the first byte to 1, followed by a run of 1s and then a 0 to indicate how many extra bytes are needed? This encoding could still be more efficient than UTF-8 for characters that need between 12 and 15 bits, and it would not require Unicode to waste space on surrogate characters.

So why does Unicode waste space for generally unusable surrogate characters? Or are they actually not a waste and more useful than I think?
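For reference, the surrogate mechanism being questioned can be sketched in a few lines of Python. This is just the standard UTF-16 arithmetic written out by hand; the function names `to_surrogates` and `from_surrogates` are made up for illustration.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    v = cp - 0x10000              # 20-bit value
    high = 0xD800 | (v >> 10)     # top 10 bits go in U+D800..U+DBFF
    low = 0xDC00 | (v & 0x3FF)    # bottom 10 bits go in U+DC00..U+DFFF
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, `to_surrogates(0x1F600)` returns `(0xD83D, 0xDE00)`. The 2,048 code points U+D800..U+DFFF are the space reserved for this purpose.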

8 Comments
2024/12/13
11:34 UTC

2

Proposed Blocks Idea

These are all the currently proposed blocks. In the future, I will be accepting more blocks for later versions.

Unicode 17.0

  • Sidetic
  • Sharada Supplement
  • Tolong Siki
  • Archaic Cuneiform Numerals (Will be Accepted in January 2025)
  • Chisoi
  • Beria Erfe
  • Tangut Components Supplement
  • Miscellaneous Symbols Supplement
  • Tai Yo
  • CJK Unified Ideographs Extension J

Provisionally Assigned

  • Proto-Sinaitic (Provisionally Assigned in April 2025)
  • Landa (Provisionally Assigned in April 2025)
  • Mwangwego (Provisionally Assigned in January 2025)
  • Jurchen
  • Jurchen Radicals
  • Musical Symbols Supplement
  • Western Cham (or Cham Supplement) (Provisionally Assigned in January 2025)
  • Persian Siyaq Numbers (Provisionally Assigned in July 2025)

4 Comments
2024/12/09
23:55 UTC

2

I need help finding Unicode proposal requests

Does anybody know of a page which shows all the Unicode proposal requests? A lot of people, like Kirk Miller, send new proposals to the ISO (I guess) to propose encoding new characters, but I can't find a real-time page which shows all requests for all Unicode characters. A similar page I found is https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=xztdezls8h , but that page only shows encoded and accepted characters. (Sorry for my bad English lol)

2 Comments
2024/12/09
19:31 UTC

1

Looking for a unicode symbol that looks like the Elder Sign from the Lovecraft mythos

Hi, I'm looking for a unicode symbol that looks like the "twig" variant of the elder sign. The actual sign isn't available, so I'm just looking for the closest possible symbol. All recommendations are appreciated. Thank you in advance.

2 Comments
2024/12/09
03:15 UTC

2

How can I use the "Private Use Area" block in an HTML page if I want to show my own characters?

I have seen some encoded Old Norse scripts which use unencoded characters like "LATIN SMALL LIGATURE PP". You can find it in this Unicode request (I think):
https://ftp.yz.yamagata-u.ac.jp/pub/CTAN/macros/latex2e/contrib/unicode-alphabets/docs/specimen.pdf

and on this page, which shows some encoded Old Norse scripts:
https://skaldic.org/m.php?p=menotatextpage&i=13116&v=33r

12 Comments
2024/12/09
02:09 UTC

0

Idk

What are the steps to propose a new Unicode character?

1 Comment
2024/12/09
01:49 UTC

2

Need help finding a unicode

It looks like an I with a : underneath, with a small T to the left of the colon.

8 Comments
2024/12/04
09:07 UTC

4

Unicode Segmented Display.

What is the smallest number of segments needed to display all Unicode characters so that they are still recognisable? While the question is extremely vague, I'm still curious to discuss it.

10 Comments
2024/12/04
08:05 UTC

1

skier unicode?

I remember seeing a skier unicode character (might’ve been snowboard or ice skates, idk though) copied in lots of roblox group walls, around 2019. when put on a webpage, they would create 2 lines all the way to the bottom (representing the trail of the skier i guess) and now i can’t find it. is this real?

1 Comment
2024/12/03
03:28 UTC

4

unicodes appear as boxes

on my old pc I resolved this issue by downloading 3 fonts; I was able to see 99% of all unicodes. I sold the pc and now I can't find the fonts. does anyone know which fonts I can download to be able to see unicodes as they are instead of weird boxes?

9 Comments
2024/11/22
21:51 UTC

2

Need help deciphering a message that I think is in Unicode?

My friend changed his name on discord to

"╬┿╣⟱◣❒▽║❐〒⊯◠〶⍍⌫⍯⍱⌘⌒⋡❋∰〠✇▒⊈◯⫸╳╘┡㈥"

and i'm trying to figure out what it means.

1 Comment
2024/11/20
00:38 UTC

2

Can someone use small caps X and Q in a comment to this?

I can carry them to other tabs without having to copy-paste.

4 Comments
2024/11/19
22:49 UTC

5

i just found some cool bracket unicodes

1 Comment
2024/11/19
18:48 UTC

3

why does ✿ look like ✿𝆬

sometimes on diff devices ✿ looks like ✿𝆬 just without the little tiny thing. WHYYY

6 Comments
2024/11/19
09:28 UTC

2

You remember those Unicode blocks U+EF11 and U+EF0C?

What are the links to them?

3 Comments
2024/11/18
17:54 UTC

0

Unicode 19.0

Liit

Piduwi

Urqee

Egyptian Hieroglyphs Extended-C

Gunutar-Munutar

Hinay

Kana Extended-D

Siesie

Tebeha

Guariwt

Grantha Extended-A

0 Comments
2024/11/16
13:05 UTC

3

Need to create a custom character

Hi. As the title says, I want to create a custom character: the circled copyright symbol, but with a 1 beside the c inside the circle, as in here. I tried Windows eudcedit but had no luck. Any help? Thanks in advance.

3 Comments
2024/11/14
09:59 UTC

0

how do i type unicode?

i am new and i am using the numpad. i do everything correctly, but it doesn't work? (btw, windows isn't activated, if that has something to do with it)

2 Comments
2024/11/14
08:32 UTC

2

Is there a character that looks like " Ͱ " but mirrored to the left?

Is there a character that looks like " Ͱ " but mirrored to the left? I'm trying to find characters that resemble letters of my conlang

8 Comments
2024/11/13
07:47 UTC

12

If you could propose any symbol and it would be added directly to Unicode, which symbol would you add?

No matter if the symbol is already used by some people or you just made it up yourself. If you could add one thing to Unicode what would it be?

26 Comments
2024/11/11
17:43 UTC

2

Looking for circled c1 character

Hi, I’m looking for the Unicode code point, if it exists, of the copyright character © but with a 1 beside the c inside the circle, like the photo above. Thanks in advance.

3 Comments
2024/11/10
08:58 UTC
