Sunday, June 9, 2013

PyTutorial: more practice with actions.

This is the 4th in a series of posts.  If you want to go to the start, go here.

In the previous section, you got some experience with loops and conditionals.  I can't overstate how important these concepts are to programming.  Since they are what programming actually is, I want to spend some more time on these topics, mostly by giving you more practice with them.
You know what they say: practice makes better!

Time to get real: databases

Databases?  Have you heard of them?  No? Too bad.  Yes? Great!  They're basically a great way to store and retrieve a whole bunch of different information.  So why are we talking about them here, and not in the section on information?  Well, to really take advantage of the way that databases work, for loops and conditionals are essential.  Let me explain:

Databases basically store lots of information, but in a structured way.  How structured?  Great question!  They contain different tables, and each table contains a bunch of similar items.  Sounds familiar?  In Python, it can be implemented as a list of dictionaries.  Say what?  Yup, now is a good time to go back to that old lesson and brush up on dictionaries.  But a list of dictionaries can be something like this:

>>> q = [ {"z": 3, "a": "hi"}, {3: "there", "what": "not"}]

Etc.  The above is a mess, not very structured at all.  In databases, all the dictionaries contain the same keys, and each key has a value of the same type (text, number, etc.)  For instance, a table containing car information may look like this:

>>> cars = [
    {"make": "toyota", "year": 1997, "color": "blue", "miles": 129732},
    {"make": "ford", "year": 2003, "color": "green", "miles": 83832},
    {"make": "toyota", "year": 2007, "color": "green", "miles": 27212}
]

or one containing people information may look like this:
>>> people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

I purposely wrote these as text so you can copy them into your code and play around with them.  Do you notice how they're structured?  For instance, in cars, all the makes are text, while all the years are numbers.  Also, in people, all the genders are "M" or "F", while all the heights are in numbers (and presumably in cm—if you're a crazy American, and you want the height to be in inches, well, I'll leave that for you as an exercise).

Do you remember how to access these dictionaries inside lists?  If not, you should probably go over the relevant section in that post, but a quick recap:
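Here's a minimal sketch of that recap, using just two records from the people table so it stays short.  The variable first_person is only a name I made up for illustration.  I'm also writing print with parentheses, assuming you may be on a newer Python where they're required; with a single value it behaves the same either way.

```python
# Two records from the people table, enough to practice on.
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
]

first_person = people[0]       # index 0 picks the first dictionary in the list
print(first_person["name"])    # the key then picks a value inside that dictionary
print(people[1]["height"])     # both steps at once: second record, "height" key
```

So the number in the first pair of brackets chooses the record, and the text in the second pair chooses the field inside it.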
Remember now?  Great!

You're probably getting the feeling by now how for loops can be quite useful with data organized this way.
For instance, to print all the people's names, you can do something like this:
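A minimal sketch (the people list is repeated so you can run this on its own):

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

# p takes the value of each dictionary (record) in turn
for p in people:
    print(p["name"])
```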
Neat, eh?
But you can do more, lots more.  Let's find out how!

Say you want to find out who the shortest person is.  What would you do?  You can look back at the example of finding the smallest number, and think about how you can make it fit in this problem.
Thought about it enough?  How about something like this:
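This is only a sketch, and it assumes the list has at least one record, since it uses the first person's height as the starting point.  The name shortest is my own choice:

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

shortest = people[0]["height"]     # start with the first person's height
for p in people:
    if p["height"] < shortest:     # found someone shorter? remember that height
        shortest = p["height"]
print(shortest)
```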

That would work, no?  At least it would tell you what the shortest height was.  Can you figure out how to display the name of the person with the shortest height?  In our case, it would be "Fred".
Assignment 4.1: try and figure out how to do that.
One way (out of many, of course), is to do it like this:
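A sketch that keeps two variables side by side, one for the height and one for the name (both variable names are just my picks, and it again assumes the list isn't empty):

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

shortest_height = people[0]["height"]
shortest_name = people[0]["name"]
for p in people:
    if p["height"] < shortest_height:
        # update both variables together so they always describe the same person
        shortest_height = p["height"]
        shortest_name = p["name"]
print(shortest_name)
```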

But I want to show you another way to do the same, a way that I find to be slightly more elegant:
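Roughly like this.  Treat it as a sketch: found_person is the name used in the rest of this section, but the exact shape of the if statements is my assumption.

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

found_person = None                    # we haven't found anyone yet
for p in people:
    if found_person is None:
        # nothing to compare against yet, so the first record wins by default
        found_person = p
    elif p["height"] < found_person["height"]:
        # p is shorter than the best we had so far, so keep the whole record
        found_person = p
print(found_person["name"])
```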

Although it's slightly longer, I find it to be more elegant.  Let's see how it works.

First, you probably noticed the new keyword "None".  To see what it is, you can do something like this:
>>> type(None)
Basically, Python will tell you that it's a new type of information, like integers and strings.  It's a "None" type. A None type can only be None.  Not very useful?  Well, it can be very useful in making the code clearer, because you can use it as a place-holder in situations in which you don't have information yet, but are planning on getting the info soon.  In our example, we start out without having found a person.  Only when we go through the list of dictionaries, or records in database-speak, do we start assigning found_person actual person data.  Of course, the first if, that compares found_person with None is very important.  What would happen without it?  Try it and find out!  Do you get the cryptic error:
TypeError: 'NoneType' object has no attribute '__getitem__'
Basically, the reason for this is that you tried doing found_person["height"] when in fact found_person was still None!  It's equivalent to having done None["height"], which, of course, makes no sense.  Not to humans, and apparently, not to Python either. :)  Therefore, our first comparison to None is used to make sure that before we do anything else with found_person, it will actually contain person information.  Now, what is this person information that we're talking about?  You can get a better idea by printing stuff that will help you figure out what the code actually does, or debug information in tech speak:
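One possible way to sprinkle in the debug prints (the wording of the messages is entirely up to you; these are just my guesses at something readable):

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

found_person = None
for p in people:
    # debug information: what do the variables hold at the start of each round?
    print("p is: " + str(p))
    print("found_person is: " + str(found_person))
    if found_person is None:
        found_person = p
    elif p["height"] < found_person["height"]:
        found_person = p
print("the shortest person is: " + found_person["name"])
```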

Now, when you run the code, you should see a lot of information on the screen.  Take a look at the output.  Can you see that found_person starts out as None, and then becomes a Dict that contains person information?  Can you also see that p gets a different dictionary for each item in the people list?

Now the reason that I find this more elegant is that you end up with a dictionary that contains all the relevant information of the person you want, with which you can do as you please in the end.  For instance, if I asked you to print the gender of the shortest person, all you have to do is add something like this to the last line of code:
print found_person["gender"]
 Pretty easy, no?  Try it!

I hope you get a feel for what you can do with database-structured information.  Now let's have some fun with it:
Assignment 4.2: find the youngest person in the database ("Fred")
Assignment 4.3: find the youngest female in the database ("Sally")
Assignment 4.4: print the names of all the males in the database (in separate lines, as you loop through the records)
Assignment 4.5: count the number of records in the database (4). HINT: although you can use len(people) to find out, for the sake of exercise, don't
Assignment 4.6: count the number of males in the database (2) Now you can't use len even if you wanted to! HAHAHA!
You can also play around with the cars information:
Assignment 4.7: find the year of the car with the least amount of miles (2007)
Assignment 4.8: print the miles of all the cars in the database (in separate lines, as you loop through the records)
Assignment 4.9: print the total number of miles for the cars in the database (240,776) 
Assignment 4.10: find the average number of miles for the cars in the database (80,258)
Assignment 4.11: find out how many cars are of the make "toyota" in the database (2).
Now here's a challenge:
Assignment 4.12: find the average number of miles for all the cars with make "toyota" in the database (78,472)
Did you have fun?  Great!  Not easy, but by now, you're getting pretty good with loops and conditionals, aren't you? :)  Fancy!

Let's twist things around a bit now.  One of the assignments asked you to print the names of all the males.  You may have done something like this:
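A minimal sketch:

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

for p in people:
    if p["gender"] == "M":     # only print the records that are marked male
        print(p["name"])
```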
But what if I asked you to do something more general?  For instance, I can ask you to actually create a list that contains all the people that are male.  In essence, I would be asking you to create a new database-like structure, one very similar to the one we've been using, but with different data.  This can be achieved somewhat like this:
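A sketch of the idea, with new_list as the list we build up:

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

new_list = []                  # start with an empty list
for p in people:
    if p["gender"] == "M":
        new_list.append(p)     # add the matching record to the end of new_list
print(new_list)
```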
Do you see how we created a new, empty list using the empty bracket command:
new_list = []
Then, as we found matching records in the people database, we added them to our new_list using the append command:
new_list.append(p)
Notice that this is structured like this:
list_name DOT append ( item_to_add ) 
This is a way to tell Python to append a new item, in our case, p, to an existing list, in our case, new_list.  We will learn more about these sorts of things later, but for now, the key idea here is that we used a loop to create a new database structure!  Neat, huh?  Now, we can use this concept to break up complicated tasks into smaller parts.  For instance, if I asked you to calculate the average height of all the males, you can have one loop used to gather all the males in a new list, and another to calculate the average height in the new list, something like this:
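Here's a sketch of the two-loop approach.  The names total_height and average_height are mine.  Depending on your Python version, the division may give you a whole number or a number with a decimal point.

```python
people = [
    {"name": "Sally", "gender": "F", "birth_year": 1972, "height": 175},
    {"name": "Ido", "gender": "M", "birth_year": 1923, "height": 143},
    {"name": "Fred", "gender": "M", "birth_year": 2010, "height": 63},
    {"name": "Molly", "gender": "F", "birth_year": 1948, "height": 163},
]

# first loop: gather all the males in a new list
new_list = []
for p in people:
    if p["gender"] == "M":
        new_list.append(p)

# only calculate an average if we found someone (avoids dividing by zero)
if len(new_list) > 0:
    # second loop: add up the heights, but only in the new list
    total_height = 0
    for m in new_list:
        total_height = total_height + m["height"]
    average_height = total_height / len(new_list)
    print(average_height)
```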

At first, we're creating a list of males.  After that, we check to see if we found any, and if we have, then we have another for loop that figures out the average height, but in the new list.
Assignment 4.13: use this technique to find the average height of everyone born after 1970.  (119)
Assignment 4.14: how about the average height of all the females born after 1970? (175) Use one for loop to create a new db of all the females born after 1970, and another for loop to calculate their average height.
Assignment 4.15: find the average height for everyone taller than "Ido".  (169)  Solve this by first writing a for loop to find Ido's height, then another for loop to find everyone that is taller than Ido, and finally, yet another for loop that calculates the average height in the newly created db.
Very nice.

Welcome to the 2nd dimension

Ok, so this section is going to be a lot more boring than you're probably imagining... :)  All I'm really talking about here is a list of lists.  In tech-speak, a list is usually called an array, and a list of lists is called a 2-dimensional array, and a list of lists of lists is called a 3-dimensional array, etc, etc.  We went over them in the section on playing with information.  I find them to be a lot less useful than database-like structures (lists of dictionaries), but as you will soon see, they can come in handy at times.  Lists of lists are considered 2-dimensional because you can use them to represent a flat space, kind of like a checkers board, or a simple city map.  In the checkers board example, each item in it may represent a little square on the board, or in the city map, each item may represent some small area of land.  You can do with it whatever you like.  For the sake of example, let's have some rows of baskets containing apples:

>>> apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

Although it's not a must, when dealing with 2-dimensional arrays each sub-list usually contains the same number of items.  In our apples example, each row contains 6 baskets, no more and no less.  Thus we can say that the above is a 3 x 6 array, that is, it contains three lists which contain 6 items each.  You can copy the above into your code and play with it.  Do you remember how to use lists of lists?  For instance, to get the number of apples in the 3rd basket of the 1st row, you do this:
>>> print apples[0][2]               # first get the first row, 0, and then the 3rd basket, 2

What will these do?
>>> type(apples)
>>> type(apples[1])
>>> type(apples[2][3])
>>> len(apples)
>>> len(apples[1])

Do you get a feel for it now?  Great!  Now let's have some fun with it!  How do you count the total number of apples in each row of baskets?  You can do something like this:
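A minimal sketch, using the names num_apples and count:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

for row in apples:
    num_apples = 0                         # start counting from zero for every row
    for count in row:
        num_apples = num_apples + count    # add this basket's apples to the row total
    print(num_apples)
```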
Can you see how this works?  If you're a bit lost, you can add debug information and find out, perhaps something like this:
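This is one guess at where debug prints could go; the messages are mine, so change them to whatever helps you most:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

for row in apples:
    print("looking at row: " + str(row))
    num_apples = 0
    for count in row:
        num_apples = num_apples + count
        # debug information: the basket we just saw and the running total
        print("  basket has " + str(count) + ", total so far: " + str(num_apples))
    print("row total: " + str(num_apples))
```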
This prints a lot of information to the screen.  But if you follow it, you can see step-by-step what the program does and how it figures things out.  Nice!  But to describe what it does in human-speak, the program goes through each row of baskets.  Then, for each row, it goes through all the baskets in the row, and sums up the number of apples in each basket.
Assignment 4.16: instead of printing the number of apples in each row, print the total number of apples (62)
BTW, one way to make the above code a little bit clearer (for some), is to change the line:
num_apples = num_apples + count
To be:
num_apples += count
Do you see the "+=" command?  Basically, it means: add the value of count to whatever value num_apples already has.  Thus:
>>> a = 7
>>> a += 3
>>> print a

What will you get? Good!  Some find this more confusing.  I personally like it, but it's a matter of personal preferences.  You can do other things, such as:
>>> a = 10
>>> a *= 3                     # multiply the value in a by 3
>>> a -= 4                     # subtract 4 from the value in a
>>> a /= 2                      # divide the value in a by 2
>>> a += 1                     # add 1 to the value in a
>>> print a

What will you get?  Can you figure it out in your head? Did you get 14?  Good!  I'll be using this style of programming from now on, just because I feel like it.

In a twist, can you figure out how to count the number of baskets that contain some number of apples:
Assignment 4.17: write a program that counts the number of baskets containing 2 apples. (4)
Assignment 4.18: can you write the program in a way that in the very start, you assign a number to the variable search_for, kind of like: "search_for = 3", and then have the program count the number of baskets that contain search_for apples.  
Now, say you wanted to display the row ID that is associated with each row.  For instance, the 1st row has ID 0, the second has ID 1, the 3rd 2, etc.  The ID is the number that you need to access the row directly:
>>> row_id = 1
>>> print apples[row_id]

See?  Well, how would we do that?  Here is one option:
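A sketch that keeps track of row_id by hand (the text of the printed message is my own):

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

row_id = 0                     # the first row has ID 0
for row in apples:
    num_apples = 0
    for count in row:
        num_apples += count
    print("row " + str(row_id) + " has " + str(num_apples) + " apples")
    row_id += 1                # move on to the next row's ID
```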
Do you see how this works?   Try it!
Another, more elegant way to achieve the same thing is to use the enumerate function in Python:
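Something along these lines (a sketch; the printed message is just for illustration):

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

# enumerate hands us the index and the item together on every round
for (row_id, row) in enumerate(apples):
    num_apples = 0
    for count in row:
        num_apples += count
    print("row " + str(row_id) + " has " + str(num_apples) + " apples")
```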
What enumerate does is that given a list (of anything, in our case, a list of lists), for each item in that list it returns a pair of values, in which the first contains the row id, or the index in tech-speak, of the item in the list, while the second value contains the item itself.   This lets you write cleaner code by avoiding having to keep track of what index you're on (as we did with row_id); it takes care of that for you.   Another aspect of enumerate is that this "pair" of values that it returns, instead of being a typical list of two items, is a 2-item tuple.  A tuple is basically just a list that you can't change anything in.  Once a tuple is created, items in it can't be added, removed, or replaced.  I personally think that it's a bit redundant, they could have just stuck with lists, but for some reason this is how some things work in Python.  This isn't very interesting, but just to get a feel for what I'm talking about, you can try these:
>>> t = (3, 4)                      # a tuple containing two items
>>> type(t)
>>> print t
>>> print t[0]
>>> print t[1]
>>> len(t)
>>> t = (6, 3, 2)                  # a tuple containing three items
>>> t = (4,)                         # a tuple containing one item.  NOTE: you need the comma in the end there

You get the hint.  For now, they're basically just like lists.   The main difference is that you can change the values in lists, while you can't change the values in tuples:
>>> listlist = [4, 3, 3]
>>> print listlist
>>> listlist[0] = 6
>>> print listlist
>>> tuptup = (4, 3, 3)
>>> print tuptup
>>> tuptup[0] = 6                          # ERROR! ERROR! ERROR!

Woohoo.  BTW, the advantage of using a tuple over a list, in some scenarios, is that because it cannot be modified, it can be used as the key in a dictionary.  For instance, you can do something like this:
>>> x = {}
>>> x[(5, 6)] = 'hi'

yet this will give you an error:
>>> x[[5, 6]] = 'hi'

That's probably one of the best things about tuples.

Getting back to our topic, enumerate returns a tuple with two items: the index and the original item in the given list.  Such is life.  Another way to see how it works, is to add debug information (as usual):
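A sketch of a debug version that first catches the whole tuple in one variable, tt, and only then splits it up (the messages are my own):

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

for tt in enumerate(apples):
    print("tt is: " + str(tt))             # the whole 2-item tuple
    (row_id, row) = tt                     # split the tuple into its two values
    print("row_id is: " + str(row_id))
    print("row is: " + str(row))
```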

Did you notice (I'm sure you have) that in the first example, I have "for (row_id, row) in enumerate..." whereas in the latter I have "for tt in enumerate...", and only later do I have "(row_id, row) = tt"?  What's going on here?  Here I'm taking advantage of another neat little trick in Python, in which you can assign more than one value at once!  For instance, doing something like:
>>> x, y = 1, 4
makes x equal 1, and y equal 4.

Another way of writing the same thing is:
>>> x, y = (1, 4)

or like this:
>>> (x, y) = (1, 4)

You can even do the same thing with lists:
>>> a, b, c = [6, 3, 0]

Of course, you'll get an error if the number of items doesn't match, such as in:
>>> a, b = (65, 3, 4)

This should help you understand the difference in the two examples above.  In the first, I'm assigning (row_id, row) directly to the 2-value tuple that enumerate returns.   It's a bit like writing:
>>> (row_id, row) = (2, [3, 4, 5])

Whereas in the 2nd example, I'm assigning the tuple that enumerate returns to just one variable, tt, which I later use to extract the two values.  Sort of like doing something like this:
>>> tt = (2, [3, 4, 5])
>>> (row_id, row) = tt

One last thing about tuples here...  If you want to have a tuple with only 1 value, you can't do it like this: (7).  Python will treat it as a mathematical expression, which is basically just the number 7.  If you want to make this an actual tuple, then you would need to add a comma after the one and only value, like so: (7,).
>>> (x,) = (7,)

WHEW!  Let's move on, please...

OK, so we figured out how to get the index of each row, and how to add up the items in each row.  Let's see what else we can do!

Can you find out the number of apples in the basket that has the most apples?  Try it!
Assignment 4.19: find the number of apples in the basket with the most apples (9)
Did you figure it out?   Basically, you need to go through each row of baskets, and then for each basket within, compare the number of apples with some variable that is the place-holder of the most apples that we saw so far.  Something like this:
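A sketch, with most_apples as my name for the place-holder.  Starting it at 0 assumes that a basket can never hold a negative number of apples:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

most_apples = 0                    # the most apples we saw so far
for row in apples:
    for count in row:
        if count > most_apples:    # a fuller basket? remember its count
            most_apples = count
print(most_apples)
```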

Do you see how this works?  Add debug information if you haven't.  Great! :)
Now, and you may have seen this coming, what if instead of wanting to find out the number of apples in the basket, we wanted to find out the indexes of the basket with the most apples?  The reason I'm using the plural, indexes, instead of the singular, index, is that in our case, we need two indexes to fully identify the basket: the index of the row, as well as the index of the basket within the row.  We can use enumerate to help us with this task, but can you figure out how?
Assignment 4.20: print the indexes of the basket with the most apples (row index = 1, basket index = 0)
Well, there are plenty of ways to achieve this, but here's one way:  (but seriously, it's tough, but try to figure this out yourself first)
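Here's a sketch with two separate variables, great_row_id and great_basket_id.  It starts out by assuming the very first basket is the fullest, and basket_id is my own name for the index inside a row:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

great_row_id = 0
great_basket_id = 0
for (row_id, row) in enumerate(apples):
    for (basket_id, count) in enumerate(row):
        # compare against the fullest basket we know about so far
        if count > apples[great_row_id][great_basket_id]:
            great_row_id = row_id
            great_basket_id = basket_id
print(great_row_id)
print(great_basket_id)
```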

Do you see what happens here?  Add debug info if you don't!  Instead of keeping two variables, great_row_id and great_basket_id separately, I can "connect" them by using a list (or a tuple, as a matter of fact) to store the two values together:
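A sketch of that, where great_index is a two-item list holding the row index first and the basket index second:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

great_index = [0, 0]               # [row index, basket index] of the fullest basket so far
for (row_id, row) in enumerate(apples):
    for (basket_id, count) in enumerate(row):
        if count > apples[great_index[0]][great_index[1]]:
            great_index = [row_id, basket_id]
print(great_index)
```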
Do you see what happens?  Did you figure out what this does:
apples[great_index[0]][great_index[1]]
Do you see how it gets the number of apples in the basket, based on the index values in great_index?  Add debug info if you don't!

Of course, I could have done the same using a tuple instead of a list:
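The same sketch again, with great_index as a tuple this time:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

great_index = (0, 0)               # (row index, basket index) of the fullest basket so far
for (row_id, row) in enumerate(apples):
    for (basket_id, count) in enumerate(row):
        if count > apples[great_index[0]][great_index[1]]:
            # we never change the old tuple, we just replace it with a new one
            great_index = (row_id, basket_id)
print(great_index)
```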
Do you see that the only difference is the use of "(" and ")" instead of "[" and "]"... Not a very big deal for not a very big difference.  What using a tuple instead of a list gives you is that once that tuple is created, it cannot be modified.  Amazing, I know.

But my favorite way to do this here is to actually use a dictionary.  Can you figure out how?  Here's what I mean:
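A sketch with great_index as a dictionary that has a "row" key and a "basket" key:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

great_index = {"row": 0, "basket": 0}
for (row_id, row) in enumerate(apples):
    for (basket_id, count) in enumerate(row):
        if count > apples[great_index["row"]][great_index["basket"]]:
            great_index = {"row": row_id, "basket": basket_id}
print(great_index["row"])
print(great_index["basket"])
```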

Do you see how I created the dictionary by using "{" and "}"?  The reason that I prefer this solution is that it makes the program a lot easier for someone to understand.  Compare the previous use of:
apples[great_index[0]][great_index[1]]
With the current:
apples[great_index["row"]][great_index["basket"]]
While it's easy to lose track of what "0" and "1" stand for, "row" and "basket" are pretty straightforward to understand.  In my eyes, it makes the program a lot more elegant.  I like it only slightly more than using two separate variables, such as great_row_id and great_basket_id as shown a few examples earlier.

Whew!  That was a lot of information.

Previously, you had an assignment to count, for a given number, the baskets that contain that many apples (Assignments 4.17 and 4.18).  Now, can you think of how to modify that program so that, instead of counting the baskets that contain that many apples, it actually produces a list that contains all the indexes of those baskets?

We can take advantage of the database structure that we learned in the previous section to have the for loop generate a list containing all the indexes of all the baskets containing some number of apples.  Good idea, huh? :) I'm glad you like it.

Can you think of how to do that?  Well, the records can be in the form that we just saw:
{ "row": row_id, "basket": basket_id }
Now we need to write the appropriate for loop that appends such records to some new list that we create.
Assignment 4.21: try to figure out how to do this.  Look at the code that we have above to generate a list of males in the people list.  Can you figure out how to make this work with the for loop within a for loop above?  You'll probably want to use enumerate.  Go! :)
If you couldn't figure it out, well, no worries, here's what I came up with:
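Treat this as one possible sketch.  The name found is mine, and I picked search_for = 2 just to have something to look for:

```python
apples = [
    [3, 8, 4, 0, 3, 4],
    [9, 3, 7, 2, 4, 2],
    [4, 0, 2, 4, 2, 1]
]

search_for = 2                     # the number of apples we're looking for
found = []                         # will hold one record per matching basket
for (row_id, row) in enumerate(apples):
    for (basket_id, count) in enumerate(row):
        if count == search_for:
            # store where the basket is, not how many apples it has
            found.append({"row": row_id, "basket": basket_id})
print(found)
```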

Can you figure it out?  Great!  :)

Now, it's your turn to work hard:
Assignment 4.22: count the number of baskets that have more than 3 apples (8)
Assignment 4.23: instead of a count, actually return the indexes (in db structure) of the baskets that have more than 3 apples.
Assignment 4.24: calculate the average number of apples in baskets that have more than 3 apples.  You can use the previous code to get the list of such baskets, and then write another for loop to calculate the average number of apples in them.
Assignment 4.25: find the smallest number of apples in a basket that has more than 3 apples.  You can do this in two ways: one using two for loops (one within another, and a more complicated conditional if statement that keeps track of the smallest number of apples that is more than 3), and another using three for loops, one within another to generate the indexes, and a third to find the minimal count within the new list.  Try to solve using both methods.  Which one do you prefer?  Why?

Super Challenge of the Day:

Imagine an organized pile of blocks. 4 blocks deep, 3 blocks wide, and 5 blocks high.  This gives you a total of 4 x 3 x 5 = 60 blocks.  Now, each block is actually a container that contains some number of marbles.  It may have no marbles at all, or many marbles.  I can describe it with a 3-dimensional matrix like this:

>>> blocks =  [
         [   [5, 3, 8, 3, 0],
             [4, 2, 4, 3, 2],
             [1, 4, 3, 9, 3]    ],
         [   [3, 4, 2, 4, 2],
             [0, 0, 0, 0, 0],
             [1, 1, 1, 1, 1]    ],
         [   [2, 3, 3, 4, 5],
             [5, 2, 1, 2, 1],
             [7, 8, 6, 7, 8]    ],
         [   [7, 5, 3, 2, 6],
             [3, 3, 6, 0, 9],
             [5, 4, 1, 2, 1]    ]
]

Assignment 4.26: count the number of blocks that have more than 3 marbles but fewer than 8 marbles.  HINT:  you'll need to use a loop within a loop within a loop!  Have fun!  (You should get 18 blocks)
Assignment 4.27: instead of a count, can you create a db type list of the relevant indexes?  A record should look something like this: { "depth": depth_id, "width": width_id, "height": height_id }  HINT: if you run into strange problems here, remember references from the lesson on information, it may help you out.  Or not.  Either way, good luck!
Cry if you have to, but if you can do the above, you have officially mastered loops and conditionals.

Hurrah for Minesweeper!

In this lesson you've learned about some ways to better organize information: databases and matrices.  Has anything you learned changed the way you want to solve the previous problems?
  • How would you represent a Minesweeper cell now?
  • How would you represent a Minesweeper grid of cells? Say, a grid that has 5 rows containing 4 cells each (for a total of 20 cells).
  • Can you write the code that checks to see whether or not the game has been lost? That is, has the player clicked on any cells in the grid that contain a mine?
  • Can you write the code that checks to see whether or not the game has been won? That is, has the player clicked on all the cells that don't have a mine, but not on any cell that does?

Final Note

Now you're ready to learn how to organize your program.  Although less algorithmic than loops and conditionals, organizing well is key to making a complicated program look simple!

We'll start with functions, which are a lot of fun (trust me), then talk about classes (which are loads of fun as well), and finally libraries (which are so-so fun, but incredibly useful).  Woohoo!

As always, give me feedback, people!  If you got stuck somewhere, I really want to hear about it—if I did my job well, you really should not have.  So let me know!!  Seriously, people. :)  

Riddle me this...

Here you have a list of strings. For each string, the character that occurs most often is the one you want. Put all those characters together to see the message....

For instance, if the string is "abcaba", then 'a' occurs 3 times, 'b' 2 times, and 'c' once. Thus 'a' is the hidden letter.  

Hint: To go through all the letters of a string, just as in a list, do:
for c in 'hello there':
    print c

Good luck!

   " z zz  z   z   zz  z z ",
   "tt=ttkttttkkktk==kttt=tkttk=kk=kk==t====" +
   "e%>%%>>%eee>;%;;;%>%e>%e>>%e;%e;e>>>%;;>" +
   " k@@  55JJJ@ kJ 55k    @@ k@Jk @k55@J 55" +
   "ll[n:[l[nlnl<<nn<[ln<:n[::<[[nnn<l<lln:<" +
   "y>>aa>>y>ax>>yxyay>xxyaayxyaayayaxaxyxxy" +
   "t()t))tsstti))ttt)))((si(is(st)t(st)i(is" +
   " KKK  KK KK  K K     KK KKKKK K  K   ",
   "DeVeDe/VeD/D//VeeV///VD/e/De/D//VDVDe/DV" +
   "ksk}}ssksks}}kk}}}}kkk}}kskk}k}}s}skksks" +
   "vssvvv)v))sv)s)vsssvs)ssvv))sssvss))vs)v" +
   "0 _ _0__ _0_  _   __ 0_0 0 000 0 ",
   "sLs-VLVLVV---sVs-LsVVL-VsssVVs--VsLsVV-L" +
   "3 1131 ++1+33++333 +1311 13+1 33    +3 +" +
   "  1++  3+ 1 1 11 3+1+133+ +3",
   "S[[SxSx[sshsx[xShShxhxShhh[s[s[hhSssssS[" +
   "tLL0L;L;;Lt00t0LL0;ttLL00;;t00tL;L0Lt;t0" +
   "Fg:F:gF::FgFFgFFFF:::F:ggF:gggF:::::g:gF" +
   "nn}PPnwnPPnw+P}Pw}wn}+++}nnP}w}nPn}+ww}}" +
   "c}+-}+}c+c-t++--c-t+--}---c-}t-}tctc+cc-" +
   "Qr5$rrr$$r5Q5QQ5$Q$Qr$5QQ5Qr$$5rrrQ$5rrr" +
   "YYnnYY.MM.Y..M..nn.YYMnY.MYnnn.MnYM...nY" +
   "G3wGoGo3ZZ3w3GoZoZ3G3ZZZ3G3ooZGZowoGGo3w" +
   ":mCmC|m||m:C:Cmm|m:mCmC:m:C:C||m||mm|C|:" +
   "hb{/bbMhMbhM/bh{{hb/{M{MMhM/h{b//h/{b{//" +
   "/3/3II/I@@@3I@@/I@I/@@@//I/@/3@3I/3/III3" +
   "SX,S,f6f66X6XSXXXSfS,6XS,ffff6,SX6XX,6,," +
   "tzztyyMzttMzMzMyztytyzyttMttzMytMyyyMzyz" +
   "oJdJooJooddJdJJJJJddoddJododJoJdJoJdoodo" +
   "jC9j9jaaCC//aa99Ca/Cjaj/jjC/aa9j/9aa9CC9" +
   "2l222ool2l22l222llloollloo2lool2lo2loooo" +
   "lrrblrbbarlaaabalbalrbllrllalllarbrabaaa" +
   "iOO}i}--}iZZZi--}}Zi}i}}ii--Zi}O--}-ZZiZ" +
   "siG88si8iiGiGisiis88GG8GisGssG8sisGi8G8i" +
   "tS#^rrr#t^#^S^^r#tS##t^ttStrrSt#rS#^t#tt" +
   "G4G4nt44tntn4G4nnnGnG4GtGGtntnnGn4nGGGt4" +
   "}}}s,0}s},}}00,ss0}00s}s,0s0,0s0,}0},,,s" +
   "uIuu{bbI{.II.I..{ubuI.{.{.b.uIbu.I{bb..b" +
   ".9hh9T.hhh.hhh9T9T99hT9.hT9..TT.h..h9.TT" +
   "{r-lll-r--l{-lrrrrrlrl-rl-r{-lll-l{-{{-r" +
