Python: How to compare two unicode strings held in variables


Look, the letter ç (a character that is not in ASCII) can be represented either as a str object or as a unicode object (maybe you are a little confused about what unicode means).

Also, if you are trying to create a unicode object from a character that is not in the ASCII table, you must pass the encoding explicitly:

unicode('ç')

This will raise a UnicodeDecodeError, because Python 2 decodes with the ASCII codec by default and 'ç' is not in ASCII, but

unicode('ç', encoding='utf-8')

will work (provided the source file is saved as UTF-8), because the bytes of 'ç' are valid UTF-8 (as the bytes of your Arabic letters probably are).
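
The same behaviour can be checked without depending on how the source file is saved, by spelling the bytes out explicitly. This is only a minimal sketch: the names `raw`, `ascii_ok` and `text` are made up for illustration, and `.decode()` is used instead of `unicode()` so that the snippet also runs on Python 3.

```python
# UTF-8 encoding of the letter c-cedilla, written as explicit bytes
raw = b'\xc3\xa7'

# Decoding with the ASCII codec fails: 0xc3 is outside range(128)
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the codec that actually produced the bytes works
text = raw.decode('utf-8')
```

On Python 2, `raw.decode('utf-8')` and `unicode(raw, encoding='utf-8')` give the same unicode object.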

You can compare unicode objects with unicode objects in the same way you compare str objects with str objects, and both work fine.

Also, you can compare a str object with a unicode object, but this is error-prone when non-ASCII characters are involved: 'ç' as a UTF-8 encoded str is '\xc3\xa7', but as unicode it is u'\xe7', so the comparison returns False.
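
To make the mismatch concrete, here is a small sketch (variable names are illustrative only) showing that the two representations differ in length, and that they only match once both sides are the same type:

```python
raw = b'\xc3\xa7'   # byte string: the UTF-8 encoding, two bytes
text = u'\xe7'      # unicode string: a single code point

lengths = (len(raw), len(text))                    # (2, 1)

# Convert one side so that both are the same type before comparing
same_after_decode = raw.decode('utf-8') == text    # True
same_after_encode = text.encode('utf-8') == raw    # True
```

Comparing `raw == text` directly is the case that triggers the UnicodeWarning on Python 2 and evaluates to False.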

So @Karsa may well be right. The problem is with your 'variables' (in Python, a better word is objects). You must make sure that you are comparing only str objects or only unicode objects.

So, better code could be:

# -*- coding: utf-8 -*-

def compare_first_letter(phrase, compare_letter):
    # decode both byte strings (str) into unicode objects with the UTF-8 codec
    compare_letter = unicode(compare_letter, encoding='utf-8')
    phrase = unicode(phrase, encoding='utf-8')
    # take the first letter of each word in the phrase
    first_letters = [word[0] for word in phrase.split()]
    # compare each first letter with the letter you want
    for letter in first_letters:
        if letter != compare_letter:
            return False
    return True  # or call your reply function

letter = 'ç'
phrase_1 = "one two three four"
phrase_2 = "çarinha çapoca çamuca"

print(compare_first_letter(phrase_1, letter))  # False
print(compare_first_letter(phrase_2, letter))  # True
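
One caveat with the function above: on Python 2, calling `unicode(x, encoding='utf-8')` on something that is already a unicode object raises a TypeError. If the bot might hand you either type, a variant like the following may be safer. It is only a sketch under that assumption; `to_text` and `all_words_start_with` are hypothetical helper names, not part of any library, and the code is written so that it runs on both Python 2 and Python 3.

```python
def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding only if it is a byte string."""
    # On Python 2, bytes is an alias of str, so this catches plain str objects
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def all_words_start_with(phrase, letter, encoding='utf-8'):
    """True if every whitespace-separated word in phrase starts with letter."""
    phrase = to_text(phrase, encoding)
    letter = to_text(letter, encoding)
    # split() never yields empty strings, so word[0] is always valid
    return all(word[0] == letter for word in phrase.split())
```

For example, `all_words_start_with(b'\xc3\xa7arinha \xc3\xa7apoca', u'\xe7')` compares a byte-string phrase against a unicode letter and still returns True, because both are normalised to unicode first.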
Author by

KiDo

Updated on June 04, 2022

Comments

  • KiDo almost 2 years

    SOLVED

    I solved the problem, thanks all for your time.

    First of all, these are the requirements:

    1. The comparison MUST be between variables (compare 2 variables that contain unicode).
    2. The version of Python MUST be 2.x. I know version 3 has solved this problem, but unfortunately it won't work for me.

    Hello, I have a bot written in Python, and I would like it to compare 2 non-English letters (unicode).

    The problem I have is that the letters MUST be in variables, so I can't use:

    u'letter'

    Both letters I would like to compare MUST be within variables.

    I have tried:

    letter1 == letter2

    It shows this warning:

    E:\bots\KiDo\KiDo.py:23: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      import sys

    and it always returns False even when the 2 letters are the same. So I guess it means I'm comparing 2 unicode letters.

    I also tried:

    letter = unicode(letter)

    but it shows this error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xd9 in position 0: ordinal not in range(128)
    

    I have searched all over Google, but all I could find was the u' ' prefix, which won't work with variables.

    Thank you.

    Comparison Code:

    word1 = parameters.split()[0]
    word2 = parameters.split()[1]
    word3 = parameters.split()[2]
    word4 = parameters.split()[3]
    word5 = parameters.split()[4]
    if word1[0] == letter:
        if word2[0] == letter:
            if word3[0] == letter:
                if word4[0] == letter:
                    if word5[0] == letter:
                        reply(type, source,u'True')