How to tell whether a character is half-width or full-width?

Solution 1

If you just want to determine this for characters that come in hankaku-zenkaku (half-width/full-width) pairs (e.g. ｱ and ア, A and Ａ), there aren't many of them, and it shouldn't be too difficult to work out their ranges, as you have done.
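A range check along these lines might look like the sketch below. The class name `PairedWidth`, the `Width` enum, and the method names are hypothetical, and the ranges are a simplification: they cover printable ASCII, the full-width ASCII variants (U+FF01 to U+FF5E), the ideographic space (U+3000), half-width katakana and punctuation (U+FF61 to U+FF9F), and the main katakana range, some of whose letters have no single half-width counterpart.

```java
/** Range-based width check for characters that have a half-/full-width counterpart. */
class PairedWidth {
    enum Width { HALF, FULL, UNPAIRED }

    static Width classify(char c) {
        // Printable ASCII including space; counterparts are U+3000 and U+FF01..U+FF5E.
        if (c >= 0x0020 && c <= 0x007E) return Width.HALF;
        // Half-width CJK punctuation and half-width katakana.
        if (c >= 0xFF61 && c <= 0xFF9F) return Width.HALF;
        // Ideographic space and full-width ASCII variants.
        if (c == 0x3000 || (c >= 0xFF01 && c <= 0xFF5E)) return Width.FULL;
        // Full-width katakana (assumption: treated as paired, although a few
        // letters in this range have no single half-width counterpart).
        if (c >= 0x30A1 && c <= 0x30FC) return Width.FULL;
        // Everything else (kanji, hiragana, Hangul, ...) is outside the paired set.
        return Width.UNPAIRED;
    }

    /** Returns true if any character of s is classified as FULL. */
    static boolean containsFullWidth(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (classify(s.charAt(i)) == Width.FULL) return true;
        }
        return false;
    }
}
```

Characters outside the paired set are reported as `UNPAIRED` rather than guessed at, so the caller decides whether to accept or reject them.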

Another common, but less efficient, approach is to convert the characters to Shift JIS and count the number of bytes produced: 2 for full-width and 1 for half-width, e.g. "ア".getBytes("MS932").length
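As a rough sketch of the byte-counting idea, assuming the JDK in use ships the MS932 charset: the class and method names below are hypothetical. One caveat worth handling is that `getBytes` silently replaces characters that Shift JIS cannot represent with a single-byte `?`, which would make them look half-width, so the sketch checks `canEncode` first.

```java
import java.nio.charset.Charset;

/** Width check based on the Shift JIS (MS932) encoded length of a character. */
class SjisWidth {
    // Assumption: the MS932 charset is available in this JDK.
    private static final Charset MS932 = Charset.forName("MS932");

    /** Returns the encoded length (1 or 2), or -1 if MS932 cannot encode c. */
    static int sjisByteLength(char c) {
        if (!MS932.newEncoder().canEncode(c)) {
            return -1; // would otherwise be replaced by a one-byte '?'
        }
        return String.valueOf(c).getBytes(MS932).length;
    }

    static boolean isFullWidth(char c) {
        return sjisByteLength(c) == 2;
    }

    static boolean isHalfWidth(char c) {
        return sjisByteLength(c) == 1;
    }
}
```

With this sketch, a character that Shift JIS cannot represent is reported as neither half-width nor full-width.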

As is often the case, the purpose behind this kind of question is input validation (i.e. to restrict input to one form or convert it to the other). In such cases, the set of characters to deal with is naturally limited (you can't convert a character that has no counterpart), and there is no need to support the entire Unicode set.
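If the goal is conversion rather than detection, one option not mentioned above is Unicode NFKC normalization via the JDK's `java.text.Normalizer`, which maps full-width ASCII variants to ASCII and half-width katakana to ordinary katakana. The sketch below is only an illustration; the class and method names are hypothetical, and NFKC also folds other compatibility characters (ligatures, circled digits, and so on), which may be more than a validation step wants.

```java
import java.text.Normalizer;

/** Folds half-/full-width variants to their normal forms using NFKC. */
class WidthNormalizer {
    static String toCanonicalWidth(String s) {
        // NFKC applies compatibility decomposition followed by composition, so a
        // half-width katakana letter plus a half-width voiced mark becomes one letter.
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    /** True if normalization would change s, i.e. it contains a compatibility variant. */
    static boolean hasWidthVariant(String s) {
        return !Normalizer.isNormalized(s, Normalizer.Form.NFKC);
    }
}
```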

If, however, you do want to do this for the full Unicode range, you can read the UCharacter.EastAsianWidth property using the icu4j library. See this answer for how to go down this road: Analyzing full width or half width character in Java

Solution 2

If you work with the numeric character codes, you can use this code:

    /**
     * Converts the half-width characters in a string to their full-width counterparts.
     * 1. Half-width characters run from 33 to 126.
     * 2. The corresponding full-width characters run from 65281 to 65374.
     * 3. The half-width space is 32; the corresponding full-width space is 12288.
     * Apart from the space, each full-width character is its half-width
     * counterpart plus a fixed offset of 65248 (65281 - 33 = 65248).
     *
     * @param halfWidthStr the string to convert; may be null or empty
     * @return the string with half-width characters replaced by full-width ones,
     *         or "" if the input is null or empty
     */
    public String halfWidth2FullWidth(String halfWidthStr) {
        if (halfWidthStr == null || halfWidthStr.isEmpty()) {
            return "";
        }
        char[] arr = halfWidthStr.toCharArray();
        for (int i = 0; i < arr.length; ++i) {
            int charValue = arr[i];
            if (charValue >= 33 && charValue <= 126) {
                // Shift printable ASCII into the full-width block.
                arr[i] = (char) (charValue + 65248);
            } else if (charValue == 32) {
                // The space is the one exception to the fixed offset.
                arr[i] = (char) 12288;
            }
        }
        return new String(arr);
    }
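The same offset can be applied in the opposite direction. The following is a minimal sketch of the inverse conversion (full-width to half-width), using the constants from the comment above; the class name `WidthConverter` and the method name are hypothetical.

```java
/** Inverse of the conversion above: full-width ASCII variants back to half-width. */
class WidthConverter {
    private static final int OFFSET = 65248; // 65281 - 33

    static String fullWidthToHalfWidth(String s) {
        if (s == null || s.isEmpty()) {
            return "";
        }
        char[] arr = s.toCharArray();
        for (int i = 0; i < arr.length; ++i) {
            char c = arr[i];
            if (c >= 65281 && c <= 65374) {
                // Full-width '!' .. '~' map back by subtracting the fixed offset.
                arr[i] = (char) (c - OFFSET);
            } else if (c == 12288) {
                // The full-width (ideographic) space maps to the ASCII space.
                arr[i] = (char) 32;
            }
        }
        return new String(arr);
    }
}
```

Like the original method, this only handles the ASCII range and the space; katakana would need a separate mapping table.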
Author: Chan Myae Thu

Updated on June 04, 2022

Comments

  • Chan Myae Thu almost 2 years

    I would like to know whether the characters contained in a String are half-width or full-width.

    So I tested it like this:

    /**
     * Checks whether the entered password is valid.
     * Returns false if the password is shorter than 6 characters,
     * longer than 15 characters, or contains at least one
     * full-width character.
     * Returns true otherwise.
     * @param password the password to check
     * @return true if the password is valid
     */
    public boolean isValidNewPassword(String password) {
    
        if ((password.length() < 6)
                || (password.length() > 15) || (isContainFullWidth(password))) {
            return false;
        }
    
        return true;
    }
    
    /**
     * Checks whether the string contains a full-width character.
     * Returns true if at least one character is not half-width,
     * false otherwise.
     * @param cmdl the string to check
     * @return true if a full-width character is present
     */
    public boolean isContainFullWidth(String cmdl) {
        boolean isFullWidth = false;
        for (char c : cmdl.toCharArray()) {
            if(!isHalfWidth(c)) {
                isFullWidth = true;
                break;
            }
        }
    
        return isFullWidth;
    }
    
    /**
     * Checks whether a character is half-width.
     * Unicode ranges treated as half-width:
     * '\u0000' - '\u00FF'
     * '\uFF61' - '\uFFDC'
     * '\uFFE8' - '\uFFEE'
     * A character within one of these ranges
     * is treated as half-width.
     * @param c the character to check
     * @return true if the character is half-width
     */
    public boolean isHalfWidth(char c)
    {
        return '\u0000' <= c && c <= '\u00FF'
            || '\uFF61' <= c && c <= '\uFFDC'
            || '\uFFE8' <= c && c <= '\uFFEE' ;
    }
    

    But it does not work for all full-width and half-width characters.

    So, do you have any suggestions for this problem?

    Half-width and full-width characters are used in Asian languages, e.g. Japanese.

    Japanese characters can be written in two forms: full-width and half-width.

    half-width characters = ｱﾃﾞﾁｬｴｳｨｵﾌﾟ

    full-width characters = アｓｄファｓヂオｐｐ

    Thanks a lot!