Sunday, November 27, 2016

Transfer Learning with Satellite Imagery

Looking for new papers on Machine Learning this last week, I came across an interesting article. Titled “Combining Satellite Imagery and Machine Learning to Predict Poverty,” it hits on some topics from class, specifically the data clustering in linear spaces we have been discussing over the last week. The basic premise of the article is that by using various types of satellite imagery as training data, the authors could accurately predict the poverty level (or wealth) of areas in the developing world. This mattered to the authors because there are still large economic "data gaps" within Africa, the continent of interest. The gist of the problem is that for most African nations, surveying for income is cost-prohibitive. If the growing governments had more accurate reports about the finances of every area in their countries, it would help them immensely in distributing aid.

One of the most interesting things about this article is the use of transfer learning. Transfer learning leverages the fact that convolutional neural networks (CNNs) are layered: layers previously trained on one data set can be re-purposed for another. The article doesn’t go into the specific details of the algorithms used; it just gives a high-level overview of the process the authors followed. The authors’ first step in building their CNN was, strangely enough, training on simple images. A set of labeled images spanning 1000 different everyday categories gives the CNN the ability to discern basic properties of images. The paper gives "cat" as an example of a possible label in the data. The data could hardly have less to do with the wealth distribution of nations. At this point in the training process, the CNN is still general purpose.
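To make that first stage concrete, here is a minimal sketch: load a network already trained on a 1000-category labeled image set and keep its learned layers for later re-use. The framework and model choice (torchvision's ResNet-18) are my own illustrative assumptions, not the authors' actual setup.

```python
# Sketch only: a pretrained CNN re-used as the starting point for transfer learning.
from torchvision import models

# Load a CNN whose layers were already trained on a large labeled image set
# (the familiar 1000 everyday categories, "cat" among them).
cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the layers: the general-purpose filters they learned (edges, textures,
# simple shapes) are what gets carried over to the later, poverty-related steps.
for param in cnn.parameters():
    param.requires_grad = False
```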

The next step after general-purpose image detection was to re-tune the CNN, training it to predict nighttime light intensities from daytime satellite images. Essentially, the model within the CNN is now learning to break daytime images apart into a linear space and, via a linear mapping, quantify what the corresponding nighttime values would be. Fortunately, Google Maps has high-resolution data that was available to the researchers for this task. Note that this step is now starting to form the information used by the next training phase; it is building predictive clusters to be re-used.
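A rough sketch of what that re-tuning step might look like, continuing the illustration above: the pretrained network gets a new single-value output and is trained to map daytime tiles to nighttime light intensity. The random stand-in tensors are only there so the loop runs, and treating the target as a plain regression is a simplification on my part, not a detail from the paper.

```python
# Sketch only: re-tune a pretrained CNN to predict nightlight intensity from daytime tiles.
import torch
import torch.nn as nn
from torchvision import models

cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
cnn.fc = nn.Linear(cnn.fc.in_features, 1)           # new head: one intensity value per tile

# Stand-in data so the example runs; real inputs would be daytime satellite tiles
# paired with the measured nighttime light intensity at the same locations.
daytime_tiles = torch.rand(8, 3, 224, 224)
night_intensity = torch.rand(8)

optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for _ in range(3):                                  # a few illustrative training passes
    pred = cnn(daytime_tiles).squeeze(1)
    loss = loss_fn(pred, night_intensity)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```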

Lastly, the authors train the CNN one more time using what little data they had from the aforementioned wealth surveys. This re-correlates the daytime linear space formed from the satellite images with an actual poverty metric they possessed. The formed clusters now, somehow, become a mapping from a picture of a piece of land to wealth and asset holdings. The reasoning the authors give for this being more informative than simply using nighttime lights is interesting as well. Lights, by their nature, are nearly binary: is one on or off? Roads, on the other hand, are much more informative: how many are there, and how well are they maintained? Looking at actual physical features tells you far more. By using regression, the CNN is able to work out which features are the most dominant or, conversely, unimportant.
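One way this last step could look in code, sticking with the illustrative model from above: treat the tuned CNN as a feature extractor and fit a simple regression from those features to the survey-based wealth measure. The ridge regression and the random stand-in data are my assumptions for illustration, not details taken from the paper.

```python
# Sketch only: use the (re-tuned) CNN as a feature extractor, then regress wealth on features.
import numpy as np
import torch
from torchvision import models
from sklearn.linear_model import Ridge

cnn = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
cnn.fc = torch.nn.Identity()                       # keep the learned features, drop the head

# Stand-ins for tiles at surveyed locations and the corresponding survey wealth index.
survey_images = torch.rand(20, 3, 224, 224)
survey_wealth = np.random.rand(20)

with torch.no_grad():
    features = cnn(survey_images).numpy()          # one fixed-length summary per image

wealth_model = Ridge(alpha=1.0).fit(features, survey_wealth)
print(wealth_model.predict(features[:3]))          # predicted wealth for the first three tiles
```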

I have seen pictures of the lights on the East Coast at nighttime hundreds of times but never thought beyond the fact that they looked interesting. Seeing a group of people put that information to use reminds me that just because I haven't thought of anything interesting to do with a piece of data doesn't mean there isn't anything that can be done with it.

The article I am referring to in this post can be found here.

http://science.sciencemag.org/content/353/6301/790

Monday, November 21, 2016

Unsupervised Learning

Unsupervised learning is similar to supervised learning in that it takes certain inputs, images for example, and can assign labels or groupings to that input. Supervised learning, however, learns by example. It requires a human to show the program a bunch of images and categorize them for the program; the human must specify what is in each photograph. The program will then be able to use those previously labeled images to infer labels for new images it sees.
Unsupervised learning differs from supervised learning in that it doesn’t require previous inputs to be categorized or labeled. Unsupervised learning can take an image and analyze it without any previous examples. For instance, our current programming assignment is to take a given image and analyze it to find lines. The program doesn't have any previous examples of lines to look for. It instead uses the RANSAC algorithm and keeps finding lines in the image until no more lines can be found. This can be extended to full-resolution, full-color images. The program can be told to cluster the image into similar sections and colors, which can help it find the edges of objects in the picture. These edges can help create wire meshes for the image or slice the image into different sections.
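A minimal sketch of the RANSAC idea for finding a single line in a set of edge-pixel coordinates follows; the `points` input and the parameter values are hypothetical, and the actual assignment may differ. Repeating this search, removing each found line's inliers, keeps finding lines until none with enough support remain.

```python
# Sketch of RANSAC line fitting on an (N, 2) array of 2-D points.
import numpy as np

def ransac_line(points, iterations=200, threshold=2.0):
    best_inliers = np.array([], dtype=int)
    best_line = None
    for _ in range(iterations):
        # 1. Randomly sample two distinct points and form the line through them.
        i, j = np.random.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        direction = q - p
        length = np.linalg.norm(direction)
        if length == 0:
            continue
        # 2. Perpendicular distance from every point to that candidate line.
        normal = np.array([-direction[1], direction[0]]) / length
        distances = np.abs((points - p) @ normal)
        # 3. Keep the candidate with the most inliers inside the threshold.
        inliers = np.where(distances < threshold)[0]
        if len(inliers) > len(best_inliers):
            best_inliers, best_line = inliers, (p, q)
    return best_line, best_inliers

# Toy usage: noisy points along y = 2x + 1 plus a handful of outliers.
xs = np.linspace(0, 50, 40)
pts = np.column_stack([xs, 2 * xs + 1 + np.random.normal(0, 1, xs.shape)])
pts = np.vstack([pts, np.random.uniform(0, 100, (10, 2))])
line, inliers = ransac_line(pts)
print(len(inliers), "inliers found")
```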
Unsupervised learning can also be used with text instead of images. A program could look at a corpus of text and sort it into groups, i.e. cluster it. For example, a program could try to categorize a small corpus of posts on a blogging site. Following the RANSAC idea, the program could randomly pick a few words from random posts and find inliers for those words (posts containing similar words). The program would then continue to refit the model to the new inliers it picked up until the model stopped changing. This would create clusters of different categories across the site and would help people browse the categories they are interested in.
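For comparison, here is a sketch of a more conventional way to cluster posts: k-means on TF-IDF vectors, rather than the RANSAC-style word sampling described above. The tiny corpus and the number of clusters are made up for illustration.

```python
# Sketch of clustering blog posts by vocabulary with TF-IDF + k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

posts = [
    "training a neural network on labeled images",
    "convolutional layers and transfer learning",
    "my favorite hiking trails this fall",
    "camping gear and trail snacks to pack",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)   # posts that share vocabulary should land in the same cluster
```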

Yahoo does something similar in its attempt to index the whole web. Yahoo is primarily used to type in a search query and find specifically what you are looking for, but it also offers the option to choose from categories it has predefined, and you can then explore the web sites it has placed in that category. Yahoo can build this organization in a fashion similar to the blogging-site example: define a narrow topic and then use unsupervised learning to find all the sites that fall into it. These are only a few of the many uses of unsupervised learning.

Monday, November 14, 2016

Limits of Linear Regression


For this week's blog post, I have some thoughts on linear regression - in particular on the limits of using regression in the context of artificial intelligence.

To analyze the weaknesses of linear regression, it might be appropriate to talk about one of its strengths first. The best thing linear regression has going for it is simplicity. Anyone who has taken a course in algebra or basic statistics knows enough to understand the goal of the process: given a set of observed data, try to describe what function produced it. Even in more advanced settings such as fitting multivariate functions, the basic idea of finding a function of best fit remains the same.
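As a tiny example of that "function of best fit" idea (the numbers are made up), a least-squares line recovers the slope and intercept behind some noisy observations:

```python
# Fit a line to noisy observations with ordinary least squares.
import numpy as np

x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + np.random.normal(scale=2.0, size=x.shape)   # noisy data from y = 3x + 2

slope, intercept = np.polyfit(x, y, deg=1)        # degree-1 least-squares fit
print(f"estimated y ~ {slope:.2f} * x + {intercept:.2f}")
```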

This simplicity, however, is also the main weakness of regression: you need to be able to model the situation as a function. That isn't a hard task in some of the fields where regression dominates, where the data being gathered is easily discretized, but there are no guarantees that you are going to be operating in an area with nice data. Even with a large bag of functions to fit to (exponential growth/decay, power laws), you are still assuming that the underlying function is well behaved. Not only does the data have to be easy to dissect, it also has to be easy to digest.

In some ways, most of the algorithms we have talked about are the antithesis of linear regression. Where before we had been striving for optimal solutions, with linear regression we are only attempting to approximate the true solution, and the fit will never be exact. The solution's error at any given point might be negligible, but how do you know? This is a frustrating world we have stepped into: we can only hope to have a good approximation to the solution.

I don’t actually think overfitting is a huge problem with regression, since regression is in part a method for avoiding overfitting in the first place. Methods like Lagrange interpolation and cubic splines are much more aggressive. While overfitting is a problem, it is more operator error than a fault of the method; it comes from trying to throw all of the data at the equation without thinking about it.
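A small, made-up illustration of that point: fit the same noisy points with a plain line and with a polynomial aggressive enough to pass through every point, then check both against fresh points from the true underlying function.

```python
# Low-order regression vs. an interpolating polynomial on the same noisy data.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 12)
y_train = 3 * x_train + 2 + rng.normal(scale=0.3, size=x_train.shape)   # true function: y = 3x + 2

line = np.polyfit(x_train, y_train, deg=1)        # simple least-squares line
wiggly = np.polyfit(x_train, y_train, deg=11)     # passes through (essentially) every point

x_test = np.linspace(0.02, 0.98, 100)             # fresh points inside the observed range
y_test = 3 * x_test + 2
err_line = np.mean((np.polyval(line, x_test) - y_test) ** 2)
err_wiggly = np.mean((np.polyval(wiggly, x_test) - y_test) ** 2)
print(err_line, err_wiggly)   # the interpolating fit typically does far worse between the samples
```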

And I feel that is actually what people want to be able to do. Setting aside the inherent logistical problems of managing large amounts of data, figuring out some way to sift through the information and decide what's important is just as much of a challenge. So while a neural network might not be as easy to understand, the ability to simply feed it all of your data is awesome. If the hardest part of dealing with the information you have is figuring out where to start, regression might be the wrong tool.


Don’t get me wrong, I think linear regression definitely has its place. I would personally use it over a neural network whenever I could get away with it.

Tuesday, November 8, 2016

Handwriting Analysis

Have you ever struggled to read someone's handwriting? It seems to be a classic problem: if someone is writing only for themselves, their handwriting tends to become nearly illegible to anyone else, yet they can still read it perfectly fine. So if you were handed someone else's grocery list, would you be able to go get everything they need? Better still, what if you had an app that could make sure you got everything on their list? Think about being able to scan your notes into an editable format; how useful would that be? How about detecting when someone attempts to fake your signature? There are many interesting applications for handwriting recognition, and the technology would be genuinely useful to have.

Handwriting recognition comes in two flavors, online and offline. Online refers to handwriting being inputted directly into a system, while offline refers to the analysis of a photograph of handwriting. The grocery list from earlier would be offline, whereas signing the credit card terminal at the store would be an example of online recognition. The benefit of online recognition is that you have more situational data to analyze: length of contact with the screen, the amount of pressure applied, and so on. The obvious downside is that there is more to keep track of.

A straightforward way to tackle this problem is simplification: attempt to identify one character at a time. After that, simple learning techniques like k-nearest neighbors will work. It is important to specify what domain is being used for this technique, e.g. the Roman alphabet or Modern Standard Arabic; the more you are able to narrow the domain, the more success this technique enjoys.
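A small sketch of that one-character-at-a-time approach, using scikit-learn's built-in handwritten digits as a stand-in for whatever single-character domain is chosen:

```python
# k-nearest neighbors on small images of handwritten digits.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()                              # 8x8 grayscale characters, flattened to 64 values
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("accuracy:", knn.score(X_test, y_test))       # a narrow, clean domain makes this easy
```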

A more common way to analyze handwriting is to train a neural network on several handwriting characteristics. An important decision is what to represent in your feature vector: useful characteristics include things such as aspect ratio, curvature, and location relative to other letters. You can see how breaking a problem down into its fundamental aspects is a property of learning. How you choose to break down the information will determine the overall classification process and therefore your results. There are even some crazy people attempting to use unsupervised learning to classify handwriting, with varying degrees of success.
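A sketch of that feature-vector approach, again on the built-in digits: each image is reduced to a few hand-chosen numbers (the particular features below are my illustrative stand-ins, not a recommended set) and a small neural network is trained on them.

```python
# Hand-chosen features per character image, fed to a small neural network.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def features(image):
    # image is an 8x8 array of ink intensities; these are crude stand-ins for
    # characteristics like aspect ratio and where the ink sits.
    rows, cols = image.sum(axis=1), image.sum(axis=0)
    ink = image.sum()
    height = np.count_nonzero(rows)
    width = np.count_nonzero(cols)
    aspect = height / max(width, 1)
    top_heavy = rows[:4].sum() / max(ink, 1)        # share of ink in the upper half
    left_heavy = cols[:4].sum() / max(ink, 1)       # share of ink on the left side
    return [aspect, top_heavy, left_heavy, ink / 1000.0]

digits = load_digits()
X = np.array([features(img) for img in digits.images])
X_train, X_test, y_train, y_test = train_test_split(X, digits.target, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X_train, y_train)
print("accuracy:", mlp.score(X_test, y_test))       # crude features, so expect only modest accuracy
```

How the accuracy here compares with the pixel-based k-nearest-neighbors example above says a lot about how much the choice of features shapes the results.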

Something that is very cool is that there are actually competitions for this: whose algorithm can identify handwriting the best. There is a conference called ICFHR (International Conference on Frontiers in Handwriting Recognition) that holds an annual competition. The examples they give of pages that will be analyzed (the actual pages themselves are kept secret until the competition) look very impressive. I couldn't understand anything just trying to eyeball them, which shows that in practice it is possible for a computer to read handwriting better than a human.

Monday, November 7, 2016

Image Recognition

Image recognition is a large and vast field. Self-driving cars use it to distinguish road signs and markings, while Google Photos uses it to recognize objects in photos and group photos together by type. These image recognition tools use supervised learning to determine what they are seeing in a photo. Supervised learning requires a human to show the program a bunch of images and categorize them for the program; the human must specify what is in each photograph. The program will then be able to guess what objects it sees in new images based on the examples it was given by the human trainer.
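In miniature, that train-on-labeled-examples, guess-on-new-images loop looks something like this (the four-pixel "images" and their labels are fabricated purely for illustration):

```python
# Supervised learning in miniature: fit on human-labeled examples, predict on a new image.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Human-labeled examples: bright tiles labeled 1 ("sign"), dark tiles labeled 0 ("road").
images = np.array([[0.9, 0.8, 0.9, 0.7],
                   [0.8, 0.9, 0.9, 0.9],
                   [0.1, 0.2, 0.1, 0.1],
                   [0.2, 0.1, 0.0, 0.2]])
labels = np.array([1, 1, 0, 0])

model = LogisticRegression().fit(images, labels)    # learn from the labeled examples
print(model.predict([[0.85, 0.9, 0.8, 0.9]]))       # guess the label of an image it has never seen
```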
An example of this kind of image recognition being used in self-driving cars is the company Comma.ai. Comma.ai is a small company working on retrofitting current cars with additional hardware that allows limited self-driving on highways. The company adds a camera to the car that uses image recognition software to determine where to drive. Before Comma.ai puts this product into production, they need to train it with supervised learning, so they released a phone app that you can use while driving. The app records the road in front of your car and sends videos and pictures to the company, gathering lots of real-world training data for Comma.ai to use when training its algorithm.
Comma.ai also released another tool that lets ordinary people train its algorithm. They put up a web tool that displays an image received from their user base, and the user of the web tool can then label parts of the image. For example, the user can mark the sky, road, cars, signs, road markings, and other parts of the image. These users are acting as trainers for the algorithm, which takes all these examples and learns to determine what is in a given image on its own. This allows the algorithm to be deployed in a car, where the product can determine what is on the road ahead and drive the car accordingly.
This supervised learning is crucial in teaching the car what it is looking at on the road, but it is not the whole picture. The car must still know what to do with the data it has acquired from the photos. It can determine that there are lines on the road, but it still needs to know that it must drive on the right side of a double yellow line, for example. Other algorithms must come together with supervised learning to make self-driving technology work.