Allow users to put whatever they want in the field, since names are crazy and there are a lot of Unicode characters in the world; then treat whatever anyone puts in that field like it's radioactive.
Sanitize the inputs and let them enter whatever they want for a name. Deciding what is a valid name and what is not is probably way outside the scope of whatever you're doing, given that the range of potentially strange, yet legal, names is nearly infinite.
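To make "treat it like it's radioactive" concrete, here is a minimal sketch in Python, assuming the name ends up in an HTML page. The function name `render_greeting` is made up for illustration; `html.escape` is from the standard library. The name is stored as entered and escaped only at the point of output.

```python
import html


def render_greeting(name: str) -> str:
    """Return an HTML fragment greeting the user (hypothetical helper)."""
    # Escape at output time: &, <, > and both quote characters become
    # HTML entities, so the name cannot break out of the markup.
    safe_name = html.escape(name, quote=True)
    return f"<p>Hello, {safe_name}!</p>"


if __name__ == "__main__":
    print(render_greeting("Zoë O'Brien"))
    print(render_greeting("<script>alert(1)</script>"))
```

The escaping here is specific to HTML body text. A name that goes into a SQL query, a shell command, or a JavaScript string needs the escaping or parameter mechanism of that context instead.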
I would not put any constraints on a user name - it may even contain numbers; think of aristocratic names. No matter what regex you come up with, I can find a name somewhere in the world that will break it.
Do you know the rules for naming in those languages?
characters that you can be sure won't end up in a name. You could easily run a large name list through your expression when you're done and see what falls out (if anything).
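Running a name list through the expression could look roughly like the sketch below. The pattern `CANDIDATE` and the sample names are placeholders I made up, not a recommended regex; swap in your own expression and a real name list.

```python
import re
import unicodedata

# Hypothetical candidate pattern: runs of (roughly) Unicode letters joined
# by a space, apostrophe, period, or hyphen, with an optional final period.
# [^\W\d_] means "a word character that is neither a digit nor underscore".
CANDIDATE = re.compile(r"[^\W\d_]+(?:[ '.\-][^\W\d_]+)*\.?")


def rejected_names(names, pattern=CANDIDATE):
    """Return the names that the pattern fails to match in full."""
    rejected = []
    for name in names:
        # Normalize to NFC so accented letters are single code points.
        normalized = unicodedata.normalize("NFC", name)
        if not pattern.fullmatch(normalized):
            rejected.append(name)
    return rejected


if __name__ == "__main__":
    sample = [
        "Mary-Jane O'Neil",
        "José Núñez",
        "李小龍",
        "Martin Luther King Jr.",
        "John Smith 3rd",    # contains a digit
        "O\u2019Connor",     # typographic apostrophe (U+2019)
    ]
    for name in rejected_names(sample):
        print("rejected:", name)
```

Even this fairly permissive pattern drops the name with a digit and the one typed with a curly apostrophe, which is the kind of fallout the list test is meant to expose.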
Since people really can be named anything, to some extent nothing is safe.
I think you're answering your own question, Skliwz: you're not going to find a regex that covers all Unicode characters and prevents cross-site scripting.
However, if you're ever going to dynamically execute code from the db (such as with the exec() statement), then you have to be careful.
I think that the assumption that every website must accommodate every possible name is fallacious.
Imagine trying to authenticate a user named "Foo'or True Or'foo": no "dangerous" characters, but there goes your login scheme.
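For what it's worth, that name only breaks the login if the query is built by string concatenation. A minimal sketch with Python's standard `sqlite3` module follows; the `users` table and the helper names are assumptions for illustration, not anyone's actual schema. It builds the unsafe query text only to show it, and runs just the parameterized version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))


def unsafe_query(name: str) -> str:
    """Build the query by concatenation. Shown only for comparison."""
    return "SELECT name FROM users WHERE name = '" + name + "'"


def find_user(name: str):
    """Look the user up with a bound parameter; the name stays pure data."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    hostile = "Foo'or True Or'foo"
    # The concatenated text turns the name into SQL operators:
    print(unsafe_query(hostile))
    # The parameterized lookup matches nobody until that user exists:
    print(find_user(hostile))
    conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))
    print(find_user(hostile))
```

With bound parameters the odd name is just another row, so the login scheme can accept it without a character blacklist.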