Dark Ariel7 Posted October 23, 2013 Hey guys, so basically I have a chunk of text and I need to figure out how many times every character shows up, and then list them by frequency. Any ideas on how to do this? I want to know which kanji pop up most often in a particular VN.
shcboomer Posted October 23, 2013 The easiest way is probably to extract the script and check, at each scene change, which characters appear.
Dark Ariel7 Posted October 23, 2013 So you mean to do it manually. Isn't there a way to check with a program?
shcboomer Posted October 23, 2013 Not necessarily manually; you can always code something that will do that. Depending on the script, it would most likely have to be custom-made for your specific VN/engine.
REtransInternational Posted October 23, 2013 For each character, increment its entry in a table, then sort the table and output it by rank. Not difficult if you know some programming.
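The table approach described above could look roughly like this in Python. This is only a sketch: the function name `char_frequencies` and the sample string are made up for illustration, and it counts every character in the string (punctuation and kana included), not just kanji.

```python
def char_frequencies(text):
    """Count each character, then return (character, count) pairs by rank."""
    table = {}
    for ch in text:
        # Increment this character's entry, starting from 0 if it is new.
        table[ch] = table.get(ch, 0) + 1
    # Sort by count, highest first; ties are broken by the character itself.
    return sorted(table.items(), key=lambda kv: (-kv[1], kv[0]))


# Placeholder text standing in for the real script.
ranked = char_frequencies("日本語の本と日本")
for rank, (ch, count) in enumerate(ranked, start=1):
    print(rank, ch, count)
```

The dictionary plays the role of the table, and the sort produces the ranking.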
Nayleen Posted October 23, 2013 Simple task for someone with programming knowledge, like shcboomer and ReTrans have said. Depending on the formatting it can get a little complex, but it basically just needs a definition of what to exclude. If you feel confident about grasping new concepts, have a look at Python, regular expressions and string splitting. Otherwise, provide your scripts in some way and someone could do it for you; we have quite a few people around who should be able and willing to help.
RusAnon Posted October 23, 2013 Umm, if you just want character frequencies, you don't need regexes or anything else.
>>> import collections
>>> d = collections.Counter()
>>> d.update(list(s))
where s is the text string.
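To get from the counter to a list ordered by frequency, `Counter.most_common()` returns the (character, count) pairs from most to least frequent. A short sketch, with a placeholder string standing in for `s`; note that a string can be passed to `Counter` directly, so the `list(s)` step is optional.

```python
import collections

# Placeholder text; in practice s would hold the whole script.
s = "日本語の本と日本"

# Counter iterates over the string one character at a time.
d = collections.Counter(s)

# most_common() with no argument returns every entry, highest count first.
for ch, count in d.most_common():
    print(ch, count)

# most_common(n) returns only the n most frequent entries.
top_two = d.most_common(2)
```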
Nayleen Posted October 23, 2013 I made a few assumptions about the "chunk of text" equaling "script", which can contain any and all kinds of characters unrelated to the text to be counted, so I didn't apply Occam's razor to that. So the above is the simplest solution, which should be fairly accurate in its own right.
Dark Ariel7 Posted October 23, 2013 No, I mean I literally have a txt file with much of the script from the VN pasted into it, and I want to know which kanji pop up the most. If someone would be willing to do that for me, I would appreciate it very much. I can send you the txt file any way you want.
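For a plain txt file where only the kanji matter, one option is to keep just the characters in the main CJK Unified Ideographs block (U+4E00 to U+9FFF) before counting. The sketch below rests on several assumptions: the function names are invented for illustration, the file encoding defaults to UTF-8 (a VN script dump may well be Shift-JIS instead, in which case `encoding="shift_jis"` would be needed), and the range leaves out rarer extension-block kanji and the repetition mark 々.

```python
import collections
import re

# Assumption: "kanji" means the main CJK Unified Ideographs block only.
KANJI = re.compile(r"[\u4e00-\u9fff]")


def count_kanji(text):
    """Count only the kanji in text, ignoring kana, punctuation and Latin."""
    return collections.Counter(KANJI.findall(text))


def count_kanji_in_file(path, encoding="utf-8"):
    """Read a whole text file and count its kanji (encoding is a guess)."""
    with open(path, encoding=encoding) as f:
        return count_kanji(f.read())


# Demo on a placeholder string instead of a real file.
sample = "彼女は「本を読む」と言った。\n本が好き。"
counts = count_kanji(sample)
for ch, n in counts.most_common():
    print(ch, n)
```

With a real file, something like `count_kanji_in_file("script.txt").most_common(50)` would list the 50 most frequent kanji, where `script.txt` is a hypothetical file name.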
Nayleen Posted October 23, 2013 Put it up wherever and I'll run the file through that script and provide you with the output.
Dark Ariel7 Posted October 24, 2013 Just realized the file came out messed up in my last post. How do I get the file to you?
Nayleen Posted October 24, 2013 Zip and mail it to me: [email protected]
Dark Ariel7 Posted October 24, 2013 Thanks. You have been most helpful.