Building Cross-Compatible Voicebanks with Mac

I originally wrote this as a help topic for an UTAU forum, then decided to make it available elsewhere. 

When I first started using UTAU-Synth, there really wasn't a lot out there when it came to help resources for Mac users. More to the point, there weren't many that addressed the hiccups and quirks specific to UTAU-Synth. So, being a Mac user, I decided to try and amend that a bit.

This resource is primarily written for people using Mac/UTAU-Synth, specifically users who want to be sure that the voicebanks they're making are cross-compatible with PC-based UTAU. This resource is also here to answer questions that PC users have about Mac-built UTAU voicebanks, and UTAU-Synth, since they may not be familiar with the program (or the differences between UTAU and UTAU-Synth) for obvious reasons. I'm not an expert by any means, but I hope that others will find this collection of notes and such to be helpful.

About 90% of this article was originally written as a response to a thread about making voicebanks on Mac, with a few additions and subtractions here and there for the sake of editing. All of it is derived from my own experience making voicebanks and getting them to work on non-Macintosh computers.

---

PART 1: The Configuration Samba
Let's talk about OTOs. What is an OTO? An OTO is the configuration file generated by UTAU / UTAU-Synth that determines how a voicebank functions. It's a vital part of a voicebank: without it, there is no voicebank -- just a folder full of sounds.

There are two types of OTO: "oto.ini" files, generated by UTAU, and "oto_ini.txt" files, generated by UTAU-Synth. What's the difference (aside from the file name and extension), you ask? Encoding formats.

UTAU generates its OTOs as "oto.ini" files, encoded in Shift-JIS format. UTAU-Synth, on the other hand, generates its OTOs as "oto_ini.txt" files, encoded in UTF-8.

UTAU has no idea what to do with Mac-generated "oto_ini.txt" files. This difference in file name and encoding (and UTAU's complete failure to deal accordingly) is the primary reason a Mac-built voicebank will fail to work in UTAU. Unless special steps are taken to create an "oto.ini" that UTAU can read, a Mac-built voicebank cannot function in a PC environment. If you simply copy and paste the contents of the "oto_ini.txt" file and name it "oto.ini," without ensuring that the file has been encoded properly in Shift-JIS, the OTO will be a mess of random symbols and such -- garbage characters, also known as mojibake. We'll be talking about that little phenomenon in detail later on.

Right now, let's talk about how to ensure your Mac-made OTO will translate to PC mojibake-free.

1.1 Converting Mac-Made OTOs
The good news is, converting a Mac-made OTO is fairly simple, though it takes a little bit of effort, and sometimes some trial and error.
  • Open the oto_ini.txt file in a text editor, preferably one that's made for coding. I use TextWrangler, and highly recommend it -- it's amazing, and also free (made by the good folks who made BBEdit.)
  • Create a new document, then copy & paste the contents of the "oto_ini.txt" file into the new document. Go ahead and remove the top line, "# Charset:UTF-8." (Editorial Note: There's not usually a space between the "#" and "Charset," but I had to put one there, due to DA trying to make it into a tag, or something.)
  • Save the new file as "oto.ini," but make sure the encoding type is set to "Shift-JIS." 
Important Note: If you use filenames or aliases that are in hiragana or katakana, there's a possibility that when you attempt to save the file in Shift-JIS, the program may refuse to do so, citing the presence of "incompatible characters." When I had this issue, the culprits were the "V" and "Vy" sounds, which were aliased in hiragana. I opted to rename the affected files in romaji or use katakana "B" and "By" as an alternative. (If I have learned anything from this experience, it's that "Find and Replace" is your friend.)
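
If you'd rather script the conversion than do it by hand in a text editor, the steps above can be sketched in Python. This is only a rough sketch under a few assumptions: the function names are made up for illustration, the sample line uses the usual "file=alias,numbers" layout, and writing Windows-style line endings is my guess at what is safest for PC, not something UTAU is confirmed to require. The second function lists any characters that Shift-JIS can't hold, which is how you'd track down culprits like the hiragana "V" sounds before saving.

```python
from pathlib import Path


def convert_oto_text(utf8_text):
    """Turn the text of an oto_ini.txt into Shift-JIS bytes for an oto.ini."""
    lines = utf8_text.splitlines()
    # Drop the charset header that UTAU-Synth writes on the first line.
    if lines and lines[0].replace(" ", "").lower().startswith("#charset"):
        lines = lines[1:]
    # Assumption: Windows-style line endings are used for the PC-side file.
    return ("\r\n".join(lines) + "\r\n").encode("shift_jis")


def find_unencodable(utf8_text):
    """Return (line_number, character) pairs that Shift-JIS cannot represent."""
    problems = []
    for number, line in enumerate(utf8_text.splitlines(), start=1):
        for char in line:
            try:
                char.encode("shift_jis")
            except UnicodeEncodeError:
                problems.append((number, char))
    return problems


def convert_file(folder):
    """Read folder/oto_ini.txt (UTF-8) and write folder/oto.ini (Shift-JIS)."""
    folder = Path(folder)
    # "utf-8-sig" also copes with a byte-order mark, if one is present.
    text = (folder / "oto_ini.txt").read_text(encoding="utf-8-sig")
    problems = find_unencodable(text)
    if problems:
        # Refuse to write a broken file; fix these aliases first.
        raise ValueError(f"Characters not available in Shift-JIS: {problems}")
    (folder / "oto.ini").write_bytes(convert_oto_text(text))
```

Calling convert_file on a voicebank folder leaves the original "oto_ini.txt" untouched and writes a new "oto.ini" beside it. If it raises an error, the message tells you which lines need a "Find and Replace" pass.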

PC users may be wondering, what about the flip side? Mac-built OTOs don't work in UTAU without some extra effort, so what about the reverse? Are special steps needed to be sure a PC-made voicebank is Mac compatible? The answer, happily, is no -- making a PC-built voicebank functional in UTAU-Synth requires exactly no special effort.

When a PC-built voicebank is loaded into UTAU-Synth, the program can and will load the existing "oto.ini" without issue. If changes are made to the OTO, the program automatically generates a duplicate OTO in the form of an "oto_ini.txt" file (encoded in UTF-8.) Any further updates or changes to the OTO will be applied to the newly generated txt file. The original "oto.ini" will be left untouched.

1.2 Configuring Multipitch Voicebanks
Let's talk about one of the most drastic differences between UTAU and UTAU-Synth: how they handle multipitch/multi-foldered voicebanks. 

Every voicebank requires an OTO, usually located in the main folder of the voicebank. However, UTAU requires that each folder inside a voicebank (subfolders) have its own local "oto.ini" for the files it contains. In short, every folder has to be configured as though it's an individual voicebank.

UTAU-Synth users who are unfamiliar with how UTAU runs will probably find this a completely alien concept, because UTAU-Synth has no such requirement. As far as UTAU-Synth is concerned, the only OTO it cares about is the primary OTO located in the voicebank's main folder. It ignores the subfolder OTOs completely.

Creating OTOs for subfolders can be a pain, but it's not much harder than creating an "oto.ini" from an "oto_ini.txt" file. You can either go the long route of creating a new OTO from scratch (and then converting it), or you can copy & paste what you need from the main file, edit the contents of the new OTO accordingly, and then be sure to save it as an "oto.ini" encoded in Shift-JIS. Be sure to exercise caution when deleting and editing. You'll find that tactical use of "Find and Replace" can and will save you a lot of time. And sanity.
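
The copy-and-edit route can also be roughed out in Python. Big assumption here: this sketch supposes that the main OTO lists subfolder files with a folder prefix, along the lines of "C4/ka.wav=...", and that the "editing" amounts to stripping that prefix. Check your own "oto_ini.txt" before trusting it, because the layout is my guess and the function name is made up for illustration.

```python
from collections import defaultdict


def split_by_subfolder(main_oto_text):
    """Group OTO lines by the folder named in front of each file name.

    Assumes entries look like "folder/file.wav=alias,...". Lines without a
    folder prefix are grouped under the empty string (the main folder).
    """
    groups = defaultdict(list)
    for line in main_oto_text.splitlines():
        # Skip blank lines and the "#Charset" header.
        if not line.strip() or line.startswith("#"):
            continue
        filename, equals, settings = line.partition("=")
        folder, _slash, wav = filename.rpartition("/")
        # Each subfolder OTO refers to its files by bare name.
        groups[folder].append(wav + equals + settings)
    return dict(groups)
```

Each group would then be saved as an "oto.ini" inside its own subfolder, encoded in Shift-JIS just like the main one.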

---

PART 2: Mojibake, Mo' Problems: Compression & Extraction Troubles
It happens to a lot of users, regardless of what operating system they use. You download a Japanese voicebank -- maybe Teto, maybe Ritsu -- only to discover upon extracting it that it won't work. The voicebank just won't sing. When you look at the contents of the voicebank folder, you discover that every file name (as well as the contents of every text file) is nothing but gibberish. 

Welcome to the woe of mojibake. "Mojibake" is what happens when a program or operating system tries to display a file that uses an encoding type it's not configured to read. 
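
To see mojibake happen on purpose, here is a tiny Python sketch. It writes text out in one encoding and reads it back in another, which is exactly the mismatch described above. The function name and the sample OTO line are made up for illustration.

```python
def garble(text, written_as, read_as):
    """Encode text one way, then decode the same bytes a different way."""
    raw_bytes = text.encode(written_as)
    # errors="replace" swaps undecodable bytes for a placeholder character
    # instead of stopping with an error, much like a text editor would.
    return raw_bytes.decode(read_as, errors="replace")


# A made-up OTO line with a hiragana file name and alias.
SAMPLE_LINE = "か.wav=か,100,80,-200,60,20"
```

Calling garble(SAMPLE_LINE, "utf-8", "shift_jis") imitates a program reading a UTF-8 "oto_ini.txt" as though it were Shift-JIS: the plain numbers and commas survive, but the kana turn into junk. When both encodings match, the text comes back unchanged.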

It's frustrating enough when you download a voicebank you really want to use, only to discover upon decompression that it's broken. It's spectacularly frustrating when you go through all the fun of converting your OTOs so they'll be PC compatible, only to find out that your OTOs broke in spite of your best efforts. 

So why does this happen? In the case of voicebanks that use hiragana and katakana aliasing, it can actually come down to the wrong choice of compression or extraction software.

Some programs simply can't handle non-Western characters at all. Others can generally handle Japanese characters, but are incapable of dealing with the characters that specifically contain dakuten and handakuten -- the little diacritics used in Japanese kana to indicate how a consonant is voiced. For example, see the little notation above the kana that differentiates the sound "ba" (ば) from "pa" (ぱ) and "ha" (は).
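
One possible reason the dakuten characters in particular cause trouble (this is an assumption, not something tested for this article): Unicode allows a kana like "ば" to be stored either as a single character or as "は" followed by a separate combining mark, and Mac file systems have traditionally favored the split form for file names. Shift-JIS has no way to write the separate combining mark. The Python sketch below, with made-up function names, shows the difference and how to glue the pieces back together.

```python
import unicodedata


def fits_shift_jis(text):
    """Return True if every character in text can be written in Shift-JIS."""
    try:
        text.encode("shift_jis")
        return True
    except UnicodeEncodeError:
        return False


def recompose(name):
    """Merge base kana and combining marks back into single characters (NFC)."""
    return unicodedata.normalize("NFC", name)


def decompose(name):
    """Split kana into base character plus combining mark (NFD)."""
    return unicodedata.normalize("NFD", name)
```

For example, decompose("ば") gives two characters that fits_shift_jis rejects, while running the result through recompose gives back the single "ば" that Shift-JIS accepts.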

For Mac users, finding a compression/decompression program that won't break your stuff can sometimes be difficult. Here are the two that I use.
  • Keka can handle a number of compression and extraction formats. Strangely enough, while Keka doesn't seem to be able to extract files containing non-Western characters without garbling file names and contents, it seems to work perfectly fine for compression. The only downside is that Keka doesn't offer RAR as an archiving format (at least, it does not as of this writing.) Considering that it's free and can archive a voicebank without breaking its contents, not being able to compress in RAR format is a small price to pay for an otherwise solid program.
  • The Unarchiver is an archive unpacker, which can not only handle a large number of archive formats, but is also perfectly content handling file names containing non-Western characters. Meaning, you can unstuff Teto, Ruko, or Ritsu (or any other Japanese-language voicebank) to your heart's content.
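
For the curious, here is roughly what goes wrong with ZIP file names, sketched with Python's built-in zipfile module. When an archive doesn't flag its names as UTF-8, zipfile reads them as an old Western encoding (CP437), so Shift-JIS names come out garbled. The sketch undoes that by turning the garbled name back into bytes and re-reading them as Shift-JIS. The function names are made up, and assuming Shift-JIS for every unflagged archive is just a guess that suits Japanese voicebanks zipped on PC.

```python
import zipfile

# Bit 11 of a ZIP entry's flags marks its file name as UTF-8.
UTF8_FLAG = 0x800


def recover_name(name_read_as_cp437, real_encoding="shift_jis"):
    """Undo zipfile's CP437 guess and decode the name with real_encoding."""
    return name_read_as_cp437.encode("cp437").decode(real_encoding)


def list_names(zip_file, real_encoding="shift_jis"):
    """Return the entry names of a ZIP, repairing names not marked UTF-8."""
    names = []
    with zipfile.ZipFile(zip_file) as archive:
        for info in archive.infolist():
            if info.flag_bits & UTF8_FLAG:
                # The archive says this name is UTF-8, so it is already fine.
                names.append(info.filename)
            else:
                names.append(recover_name(info.filename, real_encoding))
    return names
```

This only lists the repaired names. Actually extracting with them means reading each entry and writing it out under its recovered name, which is the job The Unarchiver already does for you.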
---

Well, that's about it. I hope this tutorial answered some questions! If anyone has any questions or troubleshooting issues regarding Mac/PC UTAU stuff that I didn't address here, feel free to leave a comment! I'll try to help, and depending on the question might add it to this tutorial as a future update. ^^ 

© 2017 - 2024 trelliah