Kathryn's Blog

Friday, March 26, 2010

Matplotlib - adding patterns to bars

Matplotlib is an excellent library for building graphs in Python. I've used it before to create graphs for school and for my post on Jacks or Better. Today, I needed to create a bar chart for black and white printing, so the standard color scheme wouldn't work! I tried to find documentation online for how to do this, and found nothing. Finally, I figured it out. You can pass the hatch property as an optional keyword argument (kwargs) to create the patterns. For example, the code looks something like this:

 kwargs = {'hatch': '.'}
 rects = ax.bar(left, height, width, color='w', **kwargs)

This creates a white bar filled with a pattern of dots.

Here's all of the code:

 import numpy as np
 import matplotlib.pyplot as plt

 N = 2
 ind = np.arange(N)  # the x locations for the groups
 offset = 0.05
 width = 0.24        # the width of the bars

 fig = plt.figure()
 ax = fig.add_subplot(111)

 # align='edge' puts each bar's left edge at the given x position (newer
 # matplotlib versions center bars by default); edgecolor='k' outlines the bars
 baseline = [7.64, 3.89]
 baselineStd = [0.59, 0.06]
 kwargs = {"hatch": 'x'}
 rects1 = ax.bar(offset+ind, baseline, width, color='w', edgecolor='k', ecolor='k', yerr=baselineStd, align='edge', **kwargs)

 leaf = [7.06, 3.69]
 leafStd = [0.67, 0.12]
 kwargs = {"hatch": '.'}
 rects2 = ax.bar(offset+ind+width, leaf, width, color='w', edgecolor='k', ecolor='k', yerr=leafStd, align='edge', **kwargs)

 ultrapeer = [5.76, 3.25]
 ultrapeerStd = [0.32, 0.19]
 kwargs = {"hatch": '/'}
 rects3 = ax.bar(offset+ind+width+width, ultrapeer, width, color='w', edgecolor='k', ecolor='k', yerr=ultrapeerStd, align='edge', **kwargs)

 # add labels
 ax.set_ylabel('Estimated Lifetime (hours)')
 ax.set_xticks(offset+ind+width+width/2)  # center of the middle bar in each group
 ax.set_xticklabels(('Inactive Mac2', 'Active Mac2'))

 ax.legend((rects1[0], rects2[0], rects3[0]), ('Baseline', 'Leaf', 'Ultrapeer'))

 plt.show()

Thursday, March 18, 2010

Lazy Load - a jQuery plugin

While visiting Mashable today, I noticed that the images on the page were being loaded as I scrolled down. Very cool! It allows for a very quick page load time. I did some research and found that this can be accomplished using Lazy Load, a plugin for jQuery: http://www.appelsiini.net/projects/lazyload. I haven't tried it myself, so I can't really attest to its functionality, but I'd recommend looking into this plugin if you're creating a web page with heavy image content.

Saturday, March 6, 2010

Photo-bot

My latest project for school involved creating a photo editing and sharing website using Google App Engine. I was excited about the opportunity because (a) I love creating web apps and (b) I was interested in learning App Engine. I was pretty happy with the results of my assignment, so I thought I'd share my site with everyone: http://photo-bot.appspot.com/.

There was one pretty tricky use case I had to handle for the assignment. App Engine limits the size of Blobs saved in their datastore to 1MB. However, I had to allow users to upload photos greater than 1MB. Before saving to the datastore, the images needed to be resized. Simply keeping the image data in memory and calling images.resize() is not good enough, since the call to images.resize() actually saves the image data in the datastore, which throws an exception when the image is too large.

To solve the problem, I used the Blobstore. The Blobstore allows files up to 10MB in size. The images could then be retrieved from the Blobstore, resized and saved to the datastore. The Blobstore entry would then be deleted, since saving to the Blobstore costs $$. :)

Of course, this makes the code more complex. And, of course, the documentation on Google's App Engine pages is quite limited! Luckily, I found some excellent code posted by Benjamin Pearson. Thanks to Benjamin for posting this code; it helped immensely in trying to solve the problem! I thought I'd post my code, too, in order to get more information out there on the topic. Here's what my basic class looked like:

1:  class AddPhotos(blobstore_handlers.BlobstoreUploadHandler, GenericHandler):
2:    def get(self):
3:      uploadUrl = blobstore.create_upload_url('/addPhotos')  
4:      self.template_values['uploadUrl']=uploadUrl
5:      self.setTemplate('templates/addPhotos.html')
6:
7:    def post(self):
8:      upload_files = self.get_uploads()  
9:      blob_info = upload_files[0]  
10:      blob_key = blob_info.key()  
11:      photo = main.Photo(image=db.Blob(self.getImage(maxImageDimension, maxImageDimension, blob_key)), 
12:               dateAdded=datetime.today(),
13:               dateModified=datetime.today(),
14:               thumb=db.Blob(self.getImage(maxThumbWidthDimension, maxThumbHeightDimension, blob_key))
15:               )
16:      photo.put()
17:      blobstore.get(blob_key).delete() #delete image after use  
18:      self.redirect('/')
19:
20:    def getImage(self, maxWidthDimension, maxHeightDimension, blob_key):
21:      img = images.Image(blob_key=str(blob_key))   
22:      img.resize(width=maxWidthDimension, height=maxHeightDimension)
23:      image = img.execute_transforms(output_encoding=images.PNG) 
24:      return image

First, a disclaimer about the code: this is the most basic form of my code. I didn't include error checking, among other things. I only wanted to cover the basics, since even the basics can be tricky.

That being said, I want to point out the important pieces of the code. At line 1, I made my class a subclass of BlobstoreUploadHandler, and not the typical request handler. This is required when uploading to the Blobstore.

Next, I created an upload URL, passing it a path (line 3). App Engine redirects to a POST call on the path that you pass. This path should be in your call to the WSGIApplication constructor:

 application = webapp.WSGIApplication([
     # Main page
     ('/', handlers.MainHandler),

     # Add photos
     ('/addPhotos', handlers.AddPhotos),
     ],
     debug=True)
 util.run_wsgi_app(application)

In my case, App Engine redirects to a POST call on /addPhotos, which is handled by the AddPhotos class.

Now we have to handle the POST and retrieve the Blobstore upload. Behind the scenes, Google App Engine uploads the file to the Blobstore, and allows us to get access to the blob via the key. This is done in lines 8-10. First, the uploaded files are retrieved by calling self.get_uploads(). Next, the blob_info is obtained for the first file uploaded. Finally, the key is retrieved. (Note: this could probably be done in one line!)

The key gives access to the image data. In line 21, you can see that a new Image object is created by passing the blob key to the constructor. The Image object can be used to resize or transform the uploaded photo using any of the Image functions. After queuing up the transformations, make sure to apply them by calling execute_transforms() on the Image (line 23).

Once the image has been altered, you can save the image to the database as a blob (line 11). Pass the result of the execute_transforms() call to the db.Blob constructor and, as long as your image is now less than 1MB in size, you can save this blob to the datastore!
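Since my code above leaves out error checking, here is a minimal sketch of the kind of guard you could run on the resized image data before calling put(). The function name fits_in_datastore and the exact byte threshold are my own assumptions for illustration, not part of the App Engine API:

```python
# Assumed threshold: the 1MB datastore limit described above, taken here as
# 1024 * 1024 bytes. The real limit may differ slightly once entity overhead
# is counted, so treat this as a rough guard.
MAX_DATASTORE_BLOB_BYTES = 1024 * 1024


def fits_in_datastore(image_data, limit=MAX_DATASTORE_BLOB_BYTES):
    """Return True if the encoded image bytes are under the size limit.

    `image_data` is the byte string returned by execute_transforms().
    This helper is hypothetical; it only compares lengths.
    """
    return len(image_data) < limit
```

In post(), you could call this on the result of getImage() and show the user an error (or resize again with smaller dimensions) instead of letting the put() call throw.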

Finally, delete the blobstore object, since there is no need for it anymore (line 17).

One thing I should mention before I end: you need a billing account with Google before you can use the Blobstore in production (locally it will work fine). However, as long as your blobstore data usage is small, you won't get charged.

Good luck!

Saturday, February 13, 2010

Jacks or Better

I recently went to Las Vegas and became intrigued with the game Jacks or Better (JoB). This was partly due to the fact that my fiance and I were fairly successful playing the game: we won $13 overall! I was also interested in the game because there is the illusion that you have excellent odds of winning. A hand of Jacks or Better seems very easy to get.

For those who don't know JoB, the rules are pretty simple: any hand with a pair of Jacks or better is a winner. The game can cost any amount to play, but here I focus on the $1 game. Payouts for a $1 bet depend on the type of winning hand, but they typically follow this pattern:

Jacks or better: $1
2 pair: $2
3 of a kind: $3
Straight: $4
Flush: $6
Full house: $9
Four of a kind: $25
Straight flush: $50
Royal flush: $250
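To make the payout table concrete, here is a small sketch of how a five-card hand could be scored against it. This is not the code from my simulation; the card encoding (strings like 'AS' or 'TD') and the function names classify and payout are assumptions made for this example:

```python
from collections import Counter

# Payouts for a $1 bet, taken from the table above.
PAYOUTS = {
    "royal flush": 250,
    "straight flush": 50,
    "four of a kind": 25,
    "full house": 9,
    "flush": 6,
    "straight": 4,
    "three of a kind": 3,
    "two pair": 2,
    "jacks or better": 1,
}

RANKS = "23456789TJQKA"  # position in this string is the rank value; 'A' is highest


def classify(hand):
    """Return the name of the best paying hand, or None for a losing hand.

    `hand` is a list of five two-character strings (rank then suit),
    e.g. ['AS', 'TD', ...]. This encoding is an assumption for the sketch.
    """
    values = sorted(RANKS.index(card[0]) for card in hand)
    suits = {card[1] for card in hand}
    value_counts = Counter(values)
    counts = sorted(value_counts.values(), reverse=True)

    is_flush = len(suits) == 1
    # Five distinct consecutive values, or the A-2-3-4-5 straight.
    is_straight = len(value_counts) == 5 and (
        values[4] - values[0] == 4 or values == [0, 1, 2, 3, 12]
    )

    if is_straight and is_flush:
        return "royal flush" if values[0] == RANKS.index("T") else "straight flush"
    if counts[0] == 4:
        return "four of a kind"
    if counts == [3, 2]:
        return "full house"
    if is_flush:
        return "flush"
    if is_straight:
        return "straight"
    if counts[0] == 3:
        return "three of a kind"
    if counts == [2, 2, 1]:
        return "two pair"
    if counts[0] == 2:
        # A single pair only pays when it is jacks, queens, kings, or aces.
        pair_value = next(v for v, n in value_counts.items() if n == 2)
        if pair_value >= RANKS.index("J"):
            return "jacks or better"
    return None


def payout(hand):
    """Dollar payout for a $1 bet on `hand` (0 for a losing hand)."""
    return PAYOUTS.get(classify(hand), 0)
```

For example, payout(['JH', 'JD', '3C', '7S', '9D']) looks up the "jacks or better" row, while a pair of tens falls through to a payout of 0.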

There are many strategies already well documented online. I found the strategy below (I lost the link to the site with this strategy, but here's a link to a similar one). Basically, you work down the list of rules in order and hold the cards for the first rule that matches your hand.

Hold any winning hand of four cards or better
Hold any 4 cards to a royal flush (10, J, Q, K, A)
Hold any other winning hand
Hold any 4 cards to Straight Flush
Hold any 3 cards to a royal flush
Hold any 4 cards to a flush
Hold 2 of a kind
Hold any cards to an open straight
Hold any 2 high cards of the same suit
Hold any 3 cards to a straight flush
Hold a J, Q, and K of different suits
Hold any two high cards of different suits
Hold J, Q or K with a Ten of the same suit
Hold any single high card
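The "first matching rule wins" idea behind this list can be sketched as an ordered list of functions. To keep it short, this example implements only three of the fourteen rules above, and the rule functions, card encoding ('KH' for the king of hearts), and names are all made up for illustration rather than taken from my actual program:

```python
from collections import Counter

HIGH_RANKS = set("JQKA")  # ranks that can form a paying pair


def hold_four_to_flush(hand):
    """Indices of four cards sharing a suit, or None if there is no such set."""
    suit, count = Counter(card[1] for card in hand).most_common(1)[0]
    if count == 4:
        return [i for i, card in enumerate(hand) if card[1] == suit]
    return None


def hold_pair(hand):
    """Indices of the first rank that appears exactly twice, or None."""
    rank_counts = Counter(card[0] for card in hand)
    for rank, count in rank_counts.items():
        if count == 2:
            return [i for i, card in enumerate(hand) if card[0] == rank]
    return None


def hold_single_high_card(hand):
    """Index of the first J, Q, K, or A in the hand, or None."""
    for i, card in enumerate(hand):
        if card[0] in HIGH_RANKS:
            return [i]
    return None


# Rules in priority order; a full version would list all fourteen.
RULES = [hold_four_to_flush, hold_pair, hold_single_high_card]


def choose_holds(hand, rules=RULES):
    """Return the card indices to hold, using the first rule that matches."""
    for rule in rules:
        held = rule(hand)
        if held is not None:
            return held
    return []  # no rule matched: discard everything
```

Because the rules are just a list, changing the strategy means reordering or swapping entries in RULES, which makes it easy to experiment with variations.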

With this strategy in mind, I was curious to see what the winnings would be over time. I wrote a program in Python to simulate a person playing the game of JoB using the strategy listed above. I started the agent with an initial amount of $50. The agent played until he was below $50 or 50 times (which amounts to 50 plays). Over the course of play, I tracked how much the agent won or lost, and repeated this process 10,000 times. After 10,000 rounds, I plotted the results using matplotlib. The results are below:
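The overall shape of the simulation can be sketched like this. It is a simplified outline, not my actual code: play_hand is a hypothetical function that plays one complete hand and returns its payout, I'm assuming each play costs the bet up front and the payout is added back, and I'm reading the stopping rule as "50 plays or not enough money for another bet":

```python
import random


def run_session(play_hand, bankroll=50, max_plays=50, bet=1):
    """Play up to `max_plays` hands and return the net result in dollars.

    `play_hand` is a hypothetical callable that deals, holds, and draws one
    hand and returns the payout for the bet (0 on a losing hand).
    """
    start = bankroll
    for _ in range(max_plays):
        if bankroll < bet:  # cannot afford another hand
            break
        bankroll += play_hand() - bet
    return bankroll - start


def summarize(results):
    """Collect summary statistics over a list of per-session net results."""
    return {
        "average": sum(results) / len(results),
        "zero_or_above": sum(1 for r in results if r >= 0),
        "below_zero": sum(1 for r in results if r < 0),
        "best": max(results),
        "worst": min(results),
    }


if __name__ == "__main__":
    rng = random.Random(0)

    def placeholder_hand():
        # Stand-in for a real hand: these payouts are NOT real JoB odds.
        return rng.choice([0, 0, 1, 2])

    results = [run_session(placeholder_hand) for _ in range(10000)]
    print(summarize(results))
```

Swapping placeholder_hand for a function that actually deals cards, applies the hold strategy, and scores the final hand would turn this outline into the full experiment.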

As you can see, most of the runs fall between a gain of $50 and a loss of $50. The gains are much steeper than the losses. Here are some statistics I gathered during the run:

Total Winnings: -$1.46
Number of Runs Zero or Above: 3734
Highest amount earned during a run: $258
Number of Runs Below Zero: 5973
Highest amount lost during a run: $32

What's so interesting about this is that the total winnings are practically zero! After 10,000 times of playing, the agent loses less than $2. I was curious to see if this small loss would hold over time, so, to the dismay of my computer, I ran the code a couple more times. The next two times, the agent lost $1.85 and $1.36. With such small gains for the casino, I'm wondering how much a casino makes off this game. I suppose that a good number of players do not know the best strategy, so this might tip their earnings even higher.

Also interesting about these results is the 5:3 ratio of losing to not losing. This would suggest that the agent should lose more. However, the agent can win a lot more than it loses. Each play only costs a dollar, but a win could be up to $250. Even with these large earnings, it's not enough to overcome the losing trend. It might seem like you have a good opportunity to win with Jacks or Better. However, in the long run, you will still lose like any other game in Las Vegas. :)

I've posted my code here for anyone to review. Also, please feel free to offer any suggestions for enhancements to the strategy or the code!

Tuesday, February 9, 2010

Summer Enrichment Program Website Now Live!

I'm pleased to announce the launch of the Summer Enrichment Program (SEP) Website! As I mentioned in my previous post, the SEP is a program designed to get young women excited about computer science. It's being held this summer at the University of San Francisco, most likely during the week of June 28th - July 2nd.

Thanks to Anna Hurley for designing the logo! It's perfect!

Saturday, February 6, 2010

Minimizing the Gender Gap in Computer Science

I'm female. And I like computers and technology. I'm a minority. This has led me to try to understand the gender gap in technology. In this post, I take a look at some recent studies in the field and look at my own experience to develop some ideas that might help minimize this gap.

First, I should ask the question: why do we even need women in technology? The field seems to be progressing fine without them, right? A good reason to have women more involved in technology is that groups with both women and men tend to work better together. According to this article in Forbes, papers that have both men and women authors have 42% more citations than papers with a single gender. It seems that women have a very positive effect on any collaborative work!

Next comes the question: why are girls avoiding the tech industry? One study suggests that the environment itself prevents them from taking an interest. A group of scientists recently performed a study on students' interest in computer science. Students were asked to sit in 2 very different rooms and asked to fill out a survey about their interest in computer science. Given a room decorated with very geeky paraphernalia, the girls were much less likely to be interested in computers than if they were in a room with gender-neutral decor.

Also, as a child, I remember my own experiences. I liked playing computer games as a kid, and I liked programming my TI-85 as a high school student. But I never took a computer class in high school. In fact, I never even considered computers as a potential career path. How could this happen? One obvious reason was that computer classes were never required in elementary school, middle school, or high school. Another major factor was peer pressure. I'm afraid to say this, but only the geekiest, most unpopular kids were taking the computer courses at my high school. Taking such a course would be akin to "social suicide".

So, what can be done to get more girls in the industry? First, we should address the fact that the environment might have an effect on students' perception of computer science. We should make an effort to minimize any male-oriented objects in the environment and make the classrooms as gender neutral as possible. One study suggests that we should eliminate all boys from the room, since single-sex classrooms increase the likelihood that girls will take interest in computer science.

Another suggestion I have is based on my own experience. I think we should require students to learn how to use computers starting from elementary school all the way through high school. By exposing all students to computers, these classes would not be reserved for the geekiest, most unpopular students. Plus, it would allow all students to explore computers as a potential career. We already impose requirements for biology, math, and English; why not computers too? Computer engineering is a highly viable career option (maybe more than English). And, even if computer science isn't a student's main interest, knowing how to use computers and technology is integral to almost every professional career and would be useful for everyone.

Finally, I think we need to reach out to girls in more areas than just schools. There are many other media that can help. I'm so excited to see that Barbie is reaching out to young girls! Let's not stop here, though! TV programs for young children can do the same by including more girls using computers and studying computer science. Teen magazines can include articles about programming and majoring in computer engineering. Movies can include more female computer scientists.

With all these ideas in mind, we can start making a difference! I'm going to start this semester. I'm helping plan the Summer Enrichment Program for high school girls interested in computer science. The program is held by my grad school, USF. It runs for a week in the summer and exposes girls to a wide range of topics in computer science. Last year, we taught them about computer architecture, Python programming, and Java programming. The program is such a great event. Last year, the girls showed a marked increase in their interest in computer science, approximately 1.5 points on a scale of 5. I look forward to this year's program and hope to inspire the girls just as much!

If you are interested in minimizing the gender gap too, I urge you to get involved!

Tuesday, February 2, 2010

Forms with File Fields - Java

Background

The standard HttpServletRequest class is not capable of handling file input fields in an HTML form. If you use this class, you will only receive the file name as the parameter value and not the entire file itself. In this tutorial, I will show you how to create the HTML form and process the form data using the com.oreilly.servlet package.

Step 1: Creating the Form

To create a form with a file input field, you need to add two additional features to any form: an enctype attribute on the form tag and a file input field. Set the enctype to "multipart/form-data". Here is what the code would look like:

 <form action="upload" id="upload" name="upload" enctype="multipart/form-data" method="post">
 Name: <input type="text" name="name" id="name" />
 Attachment: <input type="file" name="attachment" id="attachment" />
 <input type="submit" value="submit" />
 </form>

Step 2: Processing the form data

As I mentioned earlier, the standard HttpServletRequest class is not capable of handling file input fields. Luckily, there are several easy-to-use libraries already created for capturing file data from multipart forms. I recommend using the com.oreilly.servlet package. The methods in this library use the same names as the HttpServletRequest methods, making it very easy to learn and use.

Download the jar file from the servlets.com website. Install the jar file to use with your project. (If you are using Eclipse, you can find instructions on how to install .jar files in your project here.)

Import the package at the top of your file.

 import java.io.File;           // return type of getFile, used below
 import com.oreilly.servlet.*;

Create a new MultipartRequest object. The constructor can accept several parameters (see the API). I like to use the constructor accepting the following parameters: the HttpServletRequest from the doPost method, the path to the directory where the file is to be saved, and the size limit of the upload in bytes. If you do not specify a size limit, a default of 1MB will be set.

 MultipartRequest mpRequest = new MultipartRequest(request, "/Users/bob/project/attachments", 3000000);

Once the MultipartRequest object has been initialized, you can retrieve form data. Use the getParameter method to retrieve form input field data for all fields other than the file field. Use this method with the input field name as a parameter to retrieve the value of the field.

 String name = mpRequest.getParameter("name");

For the file field, use the getFile method to retrieve the file uploaded in the file input field. Use the name of the file input field as a parameter. This method returns a file object, which is the file that was uploaded in the form. The file is saved in the directory specified when creating the MultipartRequest object.

 File file = mpRequest.getFile("attachment");

That's about it! Simple, right? There are several other methods that you can use in the com.oreilly.servlet package, which are discussed in the API.