Is there a c function that can convert all characters in a string to upper or lower case...

Justin Hall

Is there a C function that can convert all characters in a string to upper or lower case? The string is encoded in UTF-8 and the language I'm using is C (I can use another language for this if there's no easy way to do it in C). Any library is OK as long as it's not GPL.

Attached: 1566539031940.jpg (720x960, 136K)

September 22, 2019 - 16:15

Adrian Kelly

who is that semen demon?

September 22, 2019 - 16:21

Hudson Morgan

why is the top half of that cola colored in?

September 22, 2019 - 16:22

Levi Edwards

there won't be an in-place one. You'll have to update every reference to the string.

September 22, 2019 - 16:22

Austin Gomez

first answer my question.
idk.

September 22, 2019 - 16:23

Cooper Cook

so you can't see my reflection ;)

September 22, 2019 - 16:23

Christopher Edwards

Search for yourself faggot

September 22, 2019 - 16:24

Dominic Sanders

Use toupper in a loop maybe?

September 22, 2019 - 16:25

Zachary Sanders

for unicode? i don't think so.

September 22, 2019 - 16:25

Bentley Cook

Idk about libraries but can't we convert it to ASCII and add 26 to the value, and convert it back to char?

September 22, 2019 - 16:25

Austin Peterson

Just loop through the array you lazy nigger

September 22, 2019 - 16:26

Cameron Howard

no
it doesn't work like that user

September 22, 2019 - 16:27

Nolan Smith

stackoverflow.com/a/36898054

September 22, 2019 - 16:34

Joseph Perez

bro, a unicode char can have more than 1 byte

September 22, 2019 - 16:41

Elijah Gray

yeah but if you only need to worry about ASCII latin characters, you can just range check. Any byte with the high bit set will be > 127, so it doesn't even fall into the "lower-case" range.
the high bit is set for all the bytes in a multi-byte char.
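
A minimal sketch of that range check, assuming the input is valid UTF-8 and that only the ASCII letters a-z need converting. The name `ascii_upper_utf8` is made up for illustration; it leaves every non-ASCII character (such as ä or б) exactly as it was.

```c
#include <assert.h>
#include <stddef.h>

/* Upper-case only ASCII a-z, in place.
 * Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can never
 * fall inside 'a'..'z' and is skipped without any decoding. */
void ascii_upper_utf8(char *s)
{
    assert(s != NULL);
    for (unsigned char *p = (unsigned char *)s; *p != 0; p++) {
        if (*p >= 'a' && *p <= 'z')
            *p -= 'a' - 'A';   /* the ASCII offset between cases is 32 */
    }
}
```

Because the string never changes length, this variant works in place, which a full Unicode conversion cannot promise.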

September 22, 2019 - 16:44

Luke Bailey

True but a utf-8 toupper that ignores something like äÄ is probably not that useful.

September 22, 2019 - 16:59

Benjamin Fisher

I have found something called ICU. Do you think I can use that?

Attached: 1568979628153.png (959x573, 126K)

September 22, 2019 - 17:03

Nolan Scott

If you dont mind having to include an entire unicode library: ssl.icu-project.org/apiref/icu4c/ustring_8h.html#a14740e3b734ffa82205d4762fcacb5e1

September 22, 2019 - 17:03

Sebastian Harris

I guess it's ok. Also, is there an ICU function that can give me the number of bytes a UTF-8 character consists of, given the first byte? I already have this function (I wrote it) but if I can use an ICU function for this why not?

September 22, 2019 - 17:07

Jonathan Robinson

Yeah, it's in the same page on the documentation. similar name to standard c function: ssl.icu-project.org/apiref/icu4c/group__ustring__ustrlen.html#gac4d8a5581fc5bde71d62ebd0a7a84ec5
But it might be more efficient to not calculate the size and instead pass an output buffer that is the size of the input buffer * 2 (i guess uppercase characters cant be more than 2 times longer than lowercase?), or on the safe side input buffer * 4.

September 22, 2019 - 17:10

Joseph Martinez

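#include <stdio.h> /* header name was stripped by the archive; <stdio.h> is assumed */

void toUp(char *name, int e)
{
    int i;
    for (i = 0; i < e - 1; ++i)
    {
        /* ASCII only: 97..122 is 'a'..'z', and upper case sits 32 below (65..90). */
        /* The rest of the original condition was lost in the archive. */
        if (name[i] >= 97 && name[i] <= 122)
            name[i] -= 32;
    }
}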

September 22, 2019 - 17:10

Brayden Jones

that's only ascii, OP asked for utf8

September 22, 2019 - 17:11

Jace Sullivan

Btw, you can also just use regular C strlen, as it works with UTF-8 as well (it counts bytes, though, not characters).
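
A small sketch of the bytes-versus-characters distinction, assuming valid UTF-8 input. `utf8_codepoints` is a hypothetical helper, not a standard or ICU function. For the two-byte string "б" (0xD0 0xB1), `strlen` reports 2 while this reports 1.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a NUL-terminated UTF-8 string.
 * Continuation bytes look like 10xxxxxx; every other byte starts a new
 * code point, so counting the non-continuation bytes is enough. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

The byte count from `strlen` is still the right number for sizing buffers; the code point count only matters when you care about characters.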

September 22, 2019 - 17:12

Luis Hill

Oh, I see, how do you even handle utf-8 then?

September 22, 2019 - 17:13

Christopher Brown

Hmm, are you sure this is it? My idea is something like this. For example, б is 0xd0b1. So I need
utf8_char_length(bee[0]);
to return 2.
Nope.
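
One way such a function could look, as a sketch only: the name `utf8_char_length` comes from the post above, but the body is an assumption, not the poster's code. It reads the length from the bit pattern of the lead byte and does not reject every invalid lead byte (0xC0, 0xC1 and 0xF5 and up are not legal in UTF-8). ICU's `unicode/utf8.h` also has macros for this kind of work, such as `U8_COUNT_TRAIL_BYTES`.

```c
/* Length in bytes of a UTF-8 sequence, judged from its first byte.
 * Returns 0 for a continuation byte (10xxxxxx) or an impossible lead byte. */
int utf8_char_length(unsigned char first)
{
    if (first < 0x80)           return 1;  /* 0xxxxxxx: ASCII      */
    if ((first & 0xE0) == 0xC0) return 2;  /* 110xxxxx             */
    if ((first & 0xF0) == 0xE0) return 3;  /* 1110xxxx             */
    if ((first & 0xF8) == 0xF0) return 4;  /* 11110xxx             */
    return 0;                              /* not a lead byte      */
}
```

For б the lead byte is 0xD0, which matches the 110xxxxx pattern, so the call returns 2.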

September 22, 2019 - 17:16

Gabriel Harris

One character can be more than one byte in utf-8.
It's a bit complicated. Depending on the locale, after you convert to uppercase you can end up with more characters than you started with. You need to check the unicode standard and map the lowercase characters to one or more uppercase characters.

September 22, 2019 - 17:16

Juan Howard

Notice that bee[0] is 0xd0.
Then it's not what I need.

September 22, 2019 - 17:17

Nolan Perry

Yeah there is ICU function for that if you look through the documentation. But why do you want to do that? you dont need that to uppercase a string. ICU can do everything you want with unicode

September 22, 2019 - 17:19

Landon Martin

What should I pass for locale btw?

September 22, 2019 - 17:20

Parker Butler

Bro.....

Attached: 2.jpg (490x736, 105K)

September 22, 2019 - 17:22

Kevin Gonzalez

The uppercase can change depending on the locale you pass, so you would want something that fits whatever you want to do. I believe you can pass an empty string and it should work as you expect.

September 22, 2019 - 17:23

Isaac Roberts

Just do a foor loop and sum the char difference between upper and lower

September 22, 2019 - 17:25

Jacob Morris

>Just do a foor loop and sum the char difference between upper and lower

Attached: 1567446943098.png (168x300, 7K)

September 22, 2019 - 17:28

Angel Clark

Foor loops are my favorite, waaay better than those shitty for loops.

September 22, 2019 - 17:41

Jeremiah Jenkins

I'd loop around those milkies.

September 22, 2019 - 17:42

Gabriel Nguyen

Attached: IMG_20190911_160636.jpg (1080x1290, 185K)

September 22, 2019 - 17:44

Gavin Bell

forr loops are proven to be superior to any other kind of loops

September 22, 2019 - 17:44

Angel Rogers

Since recursive functions can do that, and they are Turing complete, and C is also Turing complete, then you can conclude that such a C function does indeed exist. Q.E.D.

September 22, 2019 - 17:46

Leo Gray

So I started using ICU and how much memory should I allocate for the converted UChar* string? 4, 6 or 12 times more?

Attached: girlwithflowers.jpg (289x540, 27K)

September 22, 2019 - 17:47

Jose Gonzalez

>nobody in this thread understands strings beyond ASCII char arrays

Attached: caveman_cs.png (287x176, 8K)

September 22, 2019 - 17:49

Cooper Myers

for (i = 0; string[i]; i++) string[i] = tolower((unsigned char)string[i]);

September 22, 2019 - 17:50

Alexander Nguyen

They are the same with more bytes moron

September 22, 2019 - 17:52

Lucas Green

That's not fair, I understand UTF formats. I just fundamentally do not know what "Upper Case" and "Lower Case" means from the perspective of other languages and character sets. And I don't really want to know, the point of unicode libraries and culture helpers is it does that shit for me.

September 22, 2019 - 17:54

Caleb Perez

1. embed python3 interpreter
2. put your c string on the python stack
3. upper case it there
4. return it to c stack

Attached: 1547719731408.png (371x532, 281K)

September 22, 2019 - 17:56

Benjamin Lopez

awk '{print toupper($0)}'
You can use system to use it in a C program.
Why would we care about other strings? Utf8 is just superfluous complicated bloat that does nothing but allow foreigners to misuse our invention.

September 22, 2019 - 18:00

Isaac Sanchez

>Why would we care about other strings? Utf8 is just superfluous complicated bloat that does nothing but allow foreigners to misuse our invention.
I \xF0\x9F\x92\xA9 on your dumb ascii.

September 22, 2019 - 18:10

Robert Flores

That's fair. Most people on this thread are telling OP to just do an ASCII upper case conversion though

September 22, 2019 - 18:19

John Miller

s = 'op is gay'
print(s.lower(), s.upper())

C BTFO by the chad Python :^)

September 22, 2019 - 18:20

Noah Edwards

Use ctype, it exists for a reason. Calling printf every char isn't efficient.

September 22, 2019 - 18:23

Anthony Powell

>that does nothing but allow foreigners to misuse our invention.
based and redpilled

September 22, 2019 - 18:26

Jackson Perry

>string.upper in Lua
and anons keep praising C
my sides

September 22, 2019 - 18:26

Connor Edwards

How about you read the wikipedia article on UTF-8 and then get back to us

September 22, 2019 - 18:27

Aiden Green

based lua

September 22, 2019 - 18:28

Nolan Bell

Is there a specific alphabet you are working on?

September 22, 2019 - 18:29

Jeremiah Hughes

lua is basically a C API. based retards.

September 22, 2019 - 18:30

Jason White

>no dude this is not a list of characters, its a list of characters but bigger!

Wow you just have to add a if string < 128 wtf everything is different

September 22, 2019 - 18:31

Carter Wright

I have a possible solution, but fellow user, you should be able to figure this simple problem out for yourself.
That'll make you grow as a programmer and as a person.

September 22, 2019 - 18:32

Jaxson Davis

There are upper and lower-cased characters outside of the ASCII range...

September 22, 2019 - 18:33

Isaiah Fisher

Just omit those literally :^)

September 22, 2019 - 18:34

Matthew Moore

>t. doesn't know the answer either
cringe

September 22, 2019 - 18:36

Xavier Morales

See the below replies

September 22, 2019 - 18:37

Henry Walker

I have already figured it out. Using ICU. Now how to compile this with GCC? I get undefined references, it surely needs some additional -l flags?

September 22, 2019 - 18:39

Colton Harris

map toUpper x

Oh wait, you're using a shitty language. My bad.

September 22, 2019 - 18:41

Luke Bennett

Someone had to create the algorithm for you, so, it's you that's shitty.

September 22, 2019 - 18:42

Wyatt Gutierrez

god i wish that were me

September 22, 2019 - 18:43

Adrian Nguyen

C is a bare bones lower level programming language. You would have to build your own function or just find one from github. Alternately, use Golang.

September 22, 2019 - 18:46

Andrew Taylor

>iterate through string
>if >64 and if >96 and Ignore the rest

September 22, 2019 - 18:48

Henry Robinson

1. Convert utf8 to ascii
2. +- 32
3. Convert back to utf8

stackoverflow.com/questions/4607413/c-library-to-convert-unicode-code-points-to-utf8

September 22, 2019 - 18:48

Nicholas Price

all the utf8 latin upper/lowercase letters are offset from each other by 30 or something, just google the number and add it to the char
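
For what it's worth, the offset is 32 (0x20) for ASCII and also for the accented letters in the Latin-1 Supplement block (à..þ, encoded as 0xC3 0xA0..0xBE), but it does not hold in general: in Latin Extended-A the pairs sit next to each other (Ā is U+0100, ā is U+0101). A sketch limited to those two ranges, assuming valid UTF-8; `utf8_upper_latin1` is a made-up name, and ß and ÿ are deliberately left alone because their upper-case forms do not fit the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Upper-case ASCII a-z and Latin-1 Supplement letters (U+00E0..U+00FE,
 * except the division sign U+00F7) in place. Everything else is untouched. */
void utf8_upper_latin1(char *s)
{
    unsigned char *p = (unsigned char *)s;
    assert(s != NULL);
    while (*p != 0) {
        if (*p >= 'a' && *p <= 'z') {
            *p -= 0x20;
            p++;
        } else if (*p == 0xC3 && p[1] >= 0xA0 && p[1] <= 0xBE && p[1] != 0xB7) {
            p[1] -= 0x20;   /* e.g. ä (C3 A4) becomes Ä (C3 84) */
            p += 2;
        } else {
            p++;            /* other bytes, including continuation bytes */
        }
    }
}
```

This only works in place because both cases occupy the same number of bytes in these ranges; anything beyond them is better left to a Unicode library.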

September 22, 2019 - 18:48

Julian Sanchez

Also skip 1 char if you read the unicode special character marker

September 22, 2019 - 18:49

Grayson Rivera

Also its 32, not 26, my bad

September 22, 2019 - 18:50

Easton Rodriguez

>look me! me so clever!
Just use ctype, faggot.

September 22, 2019 - 18:51

Parker King

Poor solution. If he's using utf8 he's probably using non-ASCII letters.

September 22, 2019 - 18:52

Alexander Sullivan

If you don't mind having glib as a dependency there is a function for that. If you are using GTK+, glib is required anyway, so you might as well use it.

developer.gnome.org/glib/stable/glib-Unicode-Manipulation.html#g-utf8-strup

September 22, 2019 - 18:57

Evan Brooks

They're just memeing at that point. Nobody really can be this dumb.

September 22, 2019 - 19:02

Nathan Brown

Well, user?

September 22, 2019 - 19:34

Oliver Price

>is there a c function that can convert all characters in a string to upper or lower case?

Of course, and you can find it, and other amazing things, here:
justfuckinggoogleit.com/

September 22, 2019 - 19:45
