That is, you get the files in some weird Latin-3 encoding and you need them in UTF-8. Except you don't know that they're in Latin-3; the text just looks like a bunch of garbage because you don't have fonts to render Greek on your box. What is one to do?
The two things you need, if you're dealing with Java, are iconv and native2ascii. native2ascii is the tool that converts from a native encoding (whichever one you name with -encoding, UTF-8 included) to an ASCII encoding with escapes, so you get \uXXXX wherever there should be a non-ASCII character. native2ascii comes with the JDK, so you've already got that. iconv or piconv are available under Cygwin or any Linux distro, and both work equally well. Remember, in the end, you're going to end up running the file through native2ascii before you try it in the app anyway, because the escaped ASCII form is the one Java reads without any guessing about encodings.
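To make the \uXXXX business concrete, here is a minimal Java sketch of the escaping step only. It is not the native2ascii source; the class and method names are made up for illustration, and it assumes the text has already been decoded into a Java String.

```java
// Hypothetical helper, not part of the JDK: mimics the escaping step
// of native2ascii on text that has already been decoded.
class AsciiEscapeSketch {

    // Leave plain ASCII alone; write every other char as a
    // backslash-u escape with four hex digits.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains one non-ASCII character (e-acute).
        System.out.println(escape("caf\u00e9"));
    }
}
```

One decoded non-ASCII character turns into exactly one escape, which is the property the rest of this trick leans on.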
Remember, when you're working on a modern Linux machine, vi is UTF-8 clean, and on Windows, Notepad is UTF-8 clean. So what you need to do is use iconv or piconv to convert the file from whatever encoding you guess it is to UTF-8, and then open it with vi or Notepad. If it does not look corrupt, you're in business. At that point, run native2ascii -encoding
The homebrew heuristic for detecting files that were run through native2ascii with the wrong -encoding switch, based only on experience, is as follows: Pick a file that has a sequence like this: [ascii character][non-ascii character][ascii character]. Run it through native2ascii with the source set to the encoding you think is right. If the output looks like [ascii character]\uXXXX[ascii character], then you've found the encoding. However, if there is more than one \uXXXX sequence between the ASCII characters, your encoding is wrong and you need to try again.
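The same check can be sketched in Java without shelling out to native2ascii. This is only a rough rendering of the heuristic above: the class and method names are hypothetical, and it assumes you feed it a small sample that you know contains a single non-ASCII character between two ASCII ones.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the "count the escapes" heuristic.
class EncodingGuessSketch {

    // Decode the raw bytes with the candidate charset and return the
    // longest run of non-ASCII chars enclosed by ASCII chars.
    static int longestNonAsciiRun(byte[] raw, Charset candidate) {
        String decoded = new String(raw, candidate);
        int longest = 0;
        int run = 0;
        boolean seenAscii = false;
        for (int i = 0; i < decoded.length(); i++) {
            char c = decoded.charAt(i);
            if (c <= 0x7F) {
                // An ASCII char closes the current run, if any.
                if (run > longest) {
                    longest = run;
                }
                run = 0;
                seenAscii = true;
            } else if (seenAscii) {
                run++;
            }
        }
        return longest;
    }

    // One char where one was expected: the guess is plausible.
    static boolean looksPlausible(byte[] raw, Charset candidate) {
        return longestNonAsciiRun(raw, candidate) <= 1;
    }

    public static void main(String[] args) {
        // 'a', e-acute encoded as UTF-8 (two bytes), 'b'
        byte[] sample = {'a', (byte) 0xC3, (byte) 0xA9, 'b'};
        System.out.println(looksPlausible(sample, StandardCharsets.UTF_8));
        System.out.println(looksPlausible(sample, StandardCharsets.ISO_8859_1));
    }
}
```

Like the manual version, this only catches the case where a multi-byte character gets read as several single-byte ones. Two different single-byte encodings will both pass, so you still have to eyeball the result.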
Of course, if you had a hex editor and a copy of the code pages on hand, you could determine this much less heuristically, but I can never find either of those when I need them.
The biggest challenge is not learning how to gesture or use your mouse, though; it's learning how to type. I have yet to switch over to the FingerWorks keyboard as my main device, just because my programming on it runs at about 1/3 the speed of my normal development, which personally makes me want to throw the computer out the window. But I am getting faster, and I can see the day when my Kinesis will go up on eBay and I'll be that much faster and more efficient when I work.
The exciting bit here is that this is actually real, and the integration can happen in such a fashion that things are nicely coherent and integrated as a tools platform, without having to spend a fortune on customizing the individual tools from vendors or put up with tools that don't meet your requirements exactly. Instead, just by being able to look inside the tools and work around issues, and around features that are getting in your way, you can very quickly assemble something that will rocket you ahead in your development effort.
The tools that I have glued together are jStateMachine (no longer publicly available), Hibernate and a collection of open source libs (dom4j, etc.), using Eclipse UML as a charting tool. This has given us a system that lets you draw UML statecharts that describe your app, run them immediately in the appserver and cleanly drop in DHTML renderers for your data that are decoupled from the backend logic.
However, it turns out that Hibernate (which I am looking to use) has an Oracle9 dialect that understands the pattern of using a nested select and the Oracle ROWNUM pseudo-column. Here's to the Hibernate developers for a clean solution.
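For anyone who hasn't seen the nested-select trick, here is a rough Java sketch of the general shape of the SQL, as it is typically used for paging. The helper name and the inlined row numbers are mine, purely for illustration; this is not Hibernate's API, and the exact text the dialect generates may differ.

```java
// Hypothetical helper showing the general ROWNUM paging pattern.
class RownumPagingSketch {

    // Wrap innerSql so that only rows firstRow+1 .. firstRow+maxRows
    // come back. ROWNUM is assigned as rows are produced, so the upper
    // bound goes on the middle select and the lower bound on the outer.
    static String page(String innerSql, int firstRow, int maxRows) {
        int lastRow = firstRow + maxRows;
        return "select * from ( select row_.*, rownum rownum_ from ( "
                + innerSql
                + " ) row_ where rownum <= " + lastRow
                + " ) where rownum_ > " + firstRow;
    }

    public static void main(String[] args) {
        // Table and column names here are made up.
        System.out.println(page("select id, name from customer order by name", 20, 10));
    }
}
```

With Hibernate you would not normally build this string yourself; you call setFirstResult and setMaxResults on the query and let the dialect do the wrapping.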
3 cheers for some real innovation.
Other useful blogs I like to read:
misbehaving
boingboing
k5
/dev/null
crazy bob
igor's lampost
vanity foul
James Strachan's Radio Weblog
Otaku