Areteem: Intro to algorithm anagram function problem

Online Course Discussion Forum

Intro to algorithm anagram function problem

Hi, I am stuck with the function that determines if 2 words are anagrams. My code searches if there's a matching letters in the first word and second word, if there is one, then it immideately moves on to the next letter without "crossing it out". My code works fine with words that do not have repeated letters but does not work with words that do have repeated letters. I think I am missing a step that "crosses out" the checked letters after they had been found. Is there a way to do it so it works with repeated letters?

def anagram(a,b):

stillok=True;

if len(a)!=len(b):

return False

else:

for i in range(0,len(a)):

for j in range(0,len(b)):

if a[i]==b[j]:

stillok=True;

continue;

if a[i]!=b[j]:

stillok=False;

return stillok;

print(anagram("tooth","otter"))

For example:

heart and earth returns true

crocodile and apple returns false

tooth and otter returns false

tooth and oooth returns true

UPDATE: i revised it and realized it would be better to do it by using the while loop. It originally did not work because it's all either all true or all false for everything, however after I looked at the textbook and added "and stillok" to the 9th line everything worked out. I tried tooth, ootth and oooth and they all turned out to be the correct result.

Can someone please explain the function of "and stillok" in the 9th line?

def anagram(a,b):

stillok=True;

alist=list(a);

blist=list(b);

if len(a)!=len(b):

return False

else:

i=0;

while i<len(a) and stillok: #after I added and stillok here everything worked out. I'm not sure what's the use of this.

j=0;

found=False

while j<len(a) and found==False:

if alist[i]==blist[j]:

blist[j]=None;

stillok=True;

found=True;

break;

if alist[i]!=blist[j]:

stillok=False;

j=j+1;

i=i+1;

return stillok;

print(anagram("tooth","ootth"))

print(anagram('tooth','oooth'))

显示父帖

I'll focus most on the second code.

I would describe the general idea of the algorithm is as follows: Looping through the 1st string letter by letter, we look in the 2nd string for a matching letter. To be anagrams, all the letters in the 1st string need to match with a letter in the 2nd.

Key Difficulties:

1) What about repeated letters? A first approach might just look for matching letters, without worrying about a repeated letter in the 1st string only appearing once in the second. Without being careful, this is how you'd get "tooth" and "oooth" being considered anagrams.

Fix for 1): Once a matching letter in the second string, that position is replaced by None, so it can't be matched twice.

It seems like you are okay with this portion, but let us know if you have questions here too.

2) How can we know that we have a match for ALL the letters in the first word? If we start by checking that both strings have the same length, then it is enough to ensure that each letter in the 1st string has a matching letter in the 2nd. (Here I'm assuming 1) has already been dealt with, so here the first "o" matches with the first "o", etc.)

Note then that as soon as we have ONE letter that doesn't match, we already know that the answer is False. This leads us to:

Fix for 2): As soon as we find a letter that doesn't match (meaning we've finished the "inner loop") we can return the answer False. This is what the variable stillok accomplishes. As the outer loop proceeds to move on the next letter, we also check that the previous letter was okay. If it's not, there's no use continuing, so we just stop and return the answer.

Additional Notes:

i) I'm not quite of the role of the variable found. Basically you just can use the break command to exit the inner loop once you find the letter.

ii) For the stillok variable, it's somewhat unnecessary to set it to False everytime there's a non-match. In cases like this, it might be easiest to have it set to False once at the beginning (before the loop), and then it's only changed to True once there is a match. This is fine to do because of the break command, as we stop looking as soon as there is a match.

Hope this helps!

显示父帖