/r/Unicode
Discussing Unicode, Unicode characters and Unicode-related tools.
This reddit is for the clever use of unicode in the creation of art, emoticons, or other clever uses of the characters allowed in unicode.
UTF-8 could avoid overlong encodings and be more efficient by indexing from some offset in sequences that consist of multiple bytes instead of starting from 0.
For example:
If the sequence is 2 bytes long then those bytes will be 110abcde 10fghijk and the codepoint will be abcdefghijk (where each variable is a bit and is concatenated, not multiplied).
But why not make it so that instead the codepoint is equal to abcdefghijk + 10000000 (in binary)? Adding 128 would get rid of overlong sequences of 2 bytes and would make 128 characters 2 bytes long instead of 3 bytes long.
For example, with this encoding 11000000 10100000 would not be an overlong space (codepoint 32), but instead would refer to codepoint 32+128, that is, 160.
In general, if a sequence is n bytes then we would add one more than the highest code point representable with n-1 bytes (e.g., with two bytes add 128 because the highest code point of 1 byte is 127 and one more than that is 128).
I hope you get what I mean. I find it difficult to explain, and I find it even more difficult to understand why UTF-8 was not made more efficient and secure like this.
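To make the proposal above concrete, here is a minimal Python sketch of the offset scheme as I read it from the post. This is not real UTF-8: the names `OFFSET`, `encode`, and `decode` are made up for illustration, the bit layout of the lead and continuation bytes is assumed to stay the same as in UTF-8, and `decode` does no validation of its input.

```python
# Hypothetical "offset UTF-8" from the post above. NOT real UTF-8.
# Payload bits per sequence length, using the usual UTF-8 bit layout.
PAYLOAD_BITS = {1: 7, 2: 11, 3: 16, 4: 21}
LEAD_PREFIX = {2: 0xC0, 3: 0xE0, 4: 0xF0}

# OFFSET[n] is one more than the highest code point that n-1 bytes can reach.
OFFSET = {}
_base = 0
for _n in (1, 2, 3, 4):
    OFFSET[_n] = _base
    _base += 1 << PAYLOAD_BITS[_n]


def encode(cp):
    """Encode one code point with the offset scheme (assumed layout)."""
    for n in (1, 2, 3, 4):
        if 0 <= cp < OFFSET[n] + (1 << PAYLOAD_BITS[n]):
            break
    else:
        raise ValueError("code point out of range for this sketch")
    value = cp - OFFSET[n]  # subtract the offset before packing the bits
    if n == 1:
        return bytes([value])
    out = []
    for _ in range(n - 1):
        out.append(0x80 | (value & 0x3F))  # continuation byte 10xxxxxx
        value >>= 6
    out.append(LEAD_PREFIX[n] | value)     # lead byte carries the top bits
    return bytes(reversed(out))


def decode(seq):
    """Decode exactly one sequence; input is assumed to be well formed."""
    n = len(seq)
    if n == 1:
        return seq[0]
    value = seq[0] & (0xFF >> (n + 1))     # strip the lead-byte prefix
    for b in seq[1:]:
        value = (value << 6) | (b & 0x3F)
    return value + OFFSET[n]               # add the offset back


print(encode(160).hex())         # bytes for code point 160 under this scheme
print(decode(b"\xc0\xa0"))       # the example sequence from the post
```

With these offsets every byte sequence maps to a different code point, so there is nothing left over to be "overlong", and code points 2048 to 2175 fit in 2 bytes where real UTF-8 needs 3.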
Ok guys don't lie Ɥ but flipped looks cool
yeah so everything ingame shows as a "?" so can someone find me a symbol that works? ty
I spent an afternoon learning about byte encoding and codemaps, and I still can't figure out exactly why the former doesn't work but the latter does. It's just morbid curiosity at this point >_>
I looked it up on unicode.org and there were two requests for jelly and jam emojis, but they were both rejected. I think it's very silly that they have so many other niche emojis, but not one for a very common food item. What are your thoughts on this?
I'm trying to make some funky looking text for a YouTube video, but I'm working with a video editor that isn't very friendly and won't let me move text boxen around when it's doing a specific effect, and I very much want to have the text boxen do that effect in a different place. So I'm pushing around the letters with zero width characters, but they're not formatting correctly in line with the visible characters because the visible characters include an underscore. Actually, I might also need to find an invisible character which is read as *not* being an underscore-like fellow, because it's only allowing me to put underscores in the right places by putting non-underscore characters in, and I would like those to be invisible as well.
What an odd life it is sometimes, no?
I've been having problems with this, because most of the old unicode proposals from the year 1993 are not online, or I guess you need to pay money to see the pdfs
I know how surrogates work, but I do not understand why UTF-16 is made to require them, and why Unicode bends over backwards to support it. Unicode wastes space with those surrogate characters that are useless in general because they are only used by one specific encoding.
Why not make UTF-16 more like UTF-8, so that it uses 2 bytes for characters that need up to 15 bits, and for other characters sets the first bit of the first byte to 1, and then has a bunch of 1s followed by a 0 to indicate how many extra bytes are needed? This encoding could still be more efficient than UTF-8 for characters that need between 12 and 15 bits, and it would not require Unicode to waste space with surrogate characters.
So why does Unicode waste space for generally unusable surrogate characters? Or are they actually not a waste and more useful than I think?
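For anyone following along, this is the arithmetic that the reserved range exists for, as a small Python sketch. The function names `to_surrogates` and `from_surrogates` are made up for illustration; the constants (0x10000, 0xD800, 0xDC00, 10 bits per half) are the ones UTF-16 uses.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need a pair")
    v = cp - 0x10000               # 20 bits remain after the subtraction
    high = 0xD800 | (v >> 10)      # top 10 bits go in D800..DBFF
    low = 0xDC00 | (v & 0x3FF)     # bottom 10 bits go in DC00..DFFF
    return high, low


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))
print(hex(from_surrogates(high, low)))
```

Because D800..DFFF never appear as ordinary characters, a 16-bit unit in that range can only be half of a pair, which is what lets a decoder tell the two cases apart without extra context.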
These are all the proposed blocks for future versions. I will be accepting more blocks in the future.
Does anybody know of a page which shows all the Unicode proposal requests? A lot of people, like Kirk Miller, send new proposals to the ISO (I guess) to propose encoding new characters, but I can't find a real-time page which shows all the requests for all Unicode characters. A similar page I found is https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=xztdezls8h , but that page only shows codified and accepted characters. (sorry for my bad english lol)
Hi, I'm looking for a unicode symbol that looks like the "twig" variant of the elder sign. The actual sign isn't available, so I'm just looking for the closest possible symbol. All recommendations are appreciated. Thank you in advance.
I have seen some Old Norse encoded scripts which use characters that are not encoded, like "LATIN SMALL LIGATURE PP". You can find it in this Unicode request (I think):
https://ftp.yz.yamagata-u.ac.jp/pub/CTAN/macros/latex2e/contrib/unicode-alphabets/docs/specimen.pdf
and on this page that shows some Old Norse encoded scripts:
https://skaldic.org/m.php?p=menotatextpage&i=13116&v=33r
What are the steps to propose a new Unicode character?
It looks like an I with a : underneath, with a small T to the left of the colon.
What is the smallest number of segments needed to display all Unicode characters so that they are still recognisable? While the question is extremely vague, I'm still curious for discussion.
I remember seeing a skier unicode character (might’ve been snowboard or ice skates, idk though) copied in lots of roblox group walls, around 2019. when put on a webpage, they would create 2 lines all the way to the bottom (representing the trail of the skier i guess) and now i can’t find it. is this real?
on my old pc I resolved this issue by downloading 3 fonts; I was able to see 99% of all unicodes. I sold the pc and now I can't find the fonts. does anyone know which fonts I can download to be able to see unicodes as they are instead of weird boxes?
My friend changed his name on discord to
and i'm trying to figure out what it means.
I can carry them to other tabs without having to copy-paste.
⎛
⎜
⎝
⎞
⎟
⎠
⎡
⎢
⎣
⎤
⎥
⎦
⎧
⎨
⎩
⎫
⎬
⎭
sometimes on diff devices ✿ looks like ✿𝆬 just without the little tiny thing. WHYYY
What are the links to them?
Liit
Piduwi
Urqee
Egyptian Hieroglyphs Extended-C
Gunutar-Munutar
Hinay
Kana Extended-D
Siesie
Tebeha
Guariwt
Grantha Extended-A
Hi. As the title says, I want to create a custom character: the circled copyright symbol, but with a 1 beside it inside the circle, as in here. I tried Windows eudcedit but with no luck. Any help? Thanks in advance.
i am new and i am using the numpad, i do everything correctly, but it doesnt work? (btw, windows isnt activated if it has something to do with that)
Is there a character that looks like " Ͱ " but mirrored to the left? I'm trying to find characters that resemble letters of my conlang
No matter if the symbol is already used by some people or you just made it up yourself. If you could add one thing to Unicode what would it be?
Hi, I’m looking for the Unicode character, if it exists, for the copyright symbol © but with a 1 beside the c inside the circle, like the photo above. Thanks in advance.