Editor’s note: We found an interesting series of articles, Learn 30 New Technologies in 30 Days, and are translating it with one update a day as a year-end gift package. Here’s day four.
Today is the 4th day of Learn 30 New Technologies in 30 Days. I’ve enjoyed it so far, and the response from developers has been great, which makes me more motivated to finish all 30 days. In this article, I’ll show you how to use PredictionIO from Java to build a simple blog recommendation engine. I haven’t found much documentation on using PredictionIO with Java, so this article may be useful for anyone looking for a complete tutorial. The contents of the “Learn 30 New Technologies in 30 Days” series can be found here.
What is PredictionIO?
PredictionIO is an open source machine learning server written in Scala that makes it easy to build a recommendation engine using REST APIs. It also provides client SDKs that wrap the REST APIs, available for Java, Python, Ruby, and PHP. At its core, PredictionIO uses Apache Mahout, a scalable machine learning library that provides a number of clustering, classification, and filtering algorithms. Apache Mahout can run these algorithms on a distributed Hadoop cluster.
As users, we don’t have to worry about these details. All we need to do is install PredictionIO and use it. Read the documentation for more details.
Why should I care about PredictionIO?
I decided to learn PredictionIO because I wanted a library that would help me add machine learning capabilities to my applications. PredictionIO helps you implement features such as recommending interesting content and finding similar content.
Install PredictionIO
There are many ways to install PredictionIO mentioned in the documentation. I use Vagrant so I don’t mess up my system and don’t have to configure everything myself.
- Download the latest version of Vagrant for your operating system: http://downloads.vagrantup.com/
- Download and install VirtualBox. Please refer to https://www.virtualbox.org/wiki/Downloads
- Download the latest Vagrant package that contains PredictionIO: https://github.com/PredictionIO/PredictionIO-Vagrant/releases
- Unzip predictionio-x.x.x.zip. It contains the scripts needed to set up PredictionIO. Open a command-line terminal and go to the predictionio-x.x.x directory.
The Vagrant script first downloads the Ubuntu Vagrant box and then installs the dependencies: MongoDB, Java, Hadoop, and the PredictionIO server. This takes a while, depending on your network speed. If your network connection is unstable, I suggest downloading the box with wget, which can resume an interrupted download. Download the precise64 box to a suitable location using the following command:
wget -c http://files.vagrantup.com/precise64.box
After the download is complete, open the Vagrantfile and change config.vm.box_url to point to the downloaded file, for example:
config.vm.box_url = "/Users/shekhargulati/tools/vagrant/precise64.box"
Now just run vagrant up to start the installation process. Depending on your Internet speed, this can take some time.
Next, create an administrator account as described in the documentation: http://docs.prediction.io/current/installation/install-predictionio-with-virtualbox-vagrant.html#create-an-administrator-account
The app can be accessed at http://localhost:9000/. Read the following documentation for details: http://docs.prediction.io/current/installation/install-predictionio-with-virtualbox-vagrant.html#accessing-predictionio-server-vm-from-the-host-machine The PredictionIO app asks you to log in. After logging in, you should see the dashboard shown below.
Create the PredictionIO application
To start, let’s create a blog recommendation application. Click the “Add an App” button and enter the application name “Blog-Recommender”.
Once the application is created, you can see it in Applications, shown below.
Then click Develop and you’ll see the details of the application. The important information is the App Key. You need this when you write your application.
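When you get to the Java code, you may prefer not to hard-code the App Key in the source. The following is a minimal sketch of one way to do that, reading the key from an environment variable. The variable name PIO_APP_KEY and the class AppKeyConfig are assumptions made up for this illustration; they are not part of PredictionIO.

```java
import java.util.Map;

class AppKeyConfig {

    // Hypothetical variable name, chosen only for this example.
    static final String ENV_VAR = "PIO_APP_KEY";

    /** Returns the trimmed App Key from the given environment, or fails with a clear message. */
    static String resolveAppKey(Map<String, String> env) {
        String key = env.get(ENV_VAR);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException(
                "Set " + ENV_VAR + " to the App Key shown on the Develop page");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        try {
            String appKey = resolveAppKey(System.getenv());
            // Print only the length so the key itself never ends up in logs.
            System.out.println("Loaded an App Key of length " + appKey.length());
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The resolved value can then be passed wherever the application expects the App Key. Taking the environment as a Map parameter keeps the lookup easy to test.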
The use case
The use case we’re about to implement is very similar to Amazon’s “Customers who bought this item also bought” feature. What we want to implement is a “readers who viewed this blog also viewed” feature.
Develop the blog recommendation Java application
Now that we’ve created the PredictionIO application, it’s time to write our Java application. We will use Eclipse to develop it. I’m using Eclipse Kepler, which has the M2Eclipse integration built in. Create a Maven-based project via File > New > Maven Project. Select maven-archetype-quickstart and enter the details of the Maven project. Replace pom.xml with the following.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.shekhar</groupId>
    <artifactId>blog-recommender</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>blog-recommender</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>io.prediction</groupId>
            <artifactId>client</artifactId>
            <version>0.6.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <!-- http://maven.apache.org/plugins/maven-compiler-plugin/ -->
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
The thing to note above is the Maven dependency on the PredictionIO Java API.
Now we’ll write a class that inserts data into PredictionIO. The class looks like this.
package com.shekhar.blog_recommender;

import io.prediction.Client;
import io.prediction.CreateItemRequestBuilder;

public class BlogDataInserter {

    // App Key copied from the Develop page of the PredictionIO application
    private static final String API_KEY = "wwoTLn0FR7vH6k51Op8KbU1z4tqeFGZyvBpSgafOaSSe40WqdMf90lEncOA0SB13";

    public static void main(String[] args) throws Exception {
        Client client = new Client(API_KEY);
        addUsers(client);
        addBlogs(client);
        userItemViews(client);
        client.close();
    }

    // Creates the two users; only the userId is required
    private static void addUsers(Client client) throws Exception {
        String[] users = { "shekhar", "rahul" };
        for (String user : users) {
            client.createUser(user);
            System.out.println("Added User " + user);
        }
    }

    // Creates ten blogs, each with an itemId and a single itemType
    private static void addBlogs(Client client) throws Exception {
        CreateItemRequestBuilder blog1 = client.getCreateItemRequestBuilder("blog1", new String[]{"machine-learning"});
        client.createItem(blog1);

        CreateItemRequestBuilder blog2 = client.getCreateItemRequestBuilder("blog2", new String[]{"javascript"});
        client.createItem(blog2);

        CreateItemRequestBuilder blog3 = client.getCreateItemRequestBuilder("blog3", new String[]{"scala"});
        client.createItem(blog3);

        CreateItemRequestBuilder blog4 = client.getCreateItemRequestBuilder("blog4", new String[]{"artificial-intelligence"});
        client.createItem(blog4);

        CreateItemRequestBuilder blog5 = client.getCreateItemRequestBuilder("blog5", new String[]{"statistics"});
        client.createItem(blog5);

        CreateItemRequestBuilder blog6 = client.getCreateItemRequestBuilder("blog6", new String[]{"python"});
        client.createItem(blog6);

        CreateItemRequestBuilder blog7 = client.getCreateItemRequestBuilder("blog7", new String[]{"web-development"});
        client.createItem(blog7);

        CreateItemRequestBuilder blog8 = client.getCreateItemRequestBuilder("blog8", new String[]{"security"});
        client.createItem(blog8);

        CreateItemRequestBuilder blog9 = client.getCreateItemRequestBuilder("blog9", new String[]{"ruby"});
        client.createItem(blog9);

        CreateItemRequestBuilder blog10 = client.getCreateItemRequestBuilder("blog10", new String[]{"openshift"});
        client.createItem(blog10);
    }

    // Records "view" actions for the currently identified user
    private static void userItemViews(Client client) throws Exception {
        client.identify("shekhar");
        client.userActionItem("view", "blog1");
        client.userActionItem("view", "blog4");
        client.userActionItem("view", "blog5");

        client.identify("rahul");
        client.userActionItem("view", "blog1");
        client.userActionItem("view", "blog4");
        client.userActionItem("view", "blog6");
        client.userActionItem("view", "blog7");
    }
}
The class shown above does the following:
- We create an instance of the Client class. The Client class wraps PredictionIO’s REST API. We need to provide it with the API_KEY of the PredictionIO Blog-Recommender application.
- We then create two users using the Client instance. These two users are created in the PredictionIO application. Only the userId is required.
- After that, we add 10 blogs using the Client instance. The blogs are also created in the PredictionIO application. When you create an item, you only have to pass two things: itemId and itemType. blog1 through blog10 are itemIds, and javascript, scala, and so on are itemTypes.
- Then we record some actions on the items we created. The user shekhar viewed blog1, blog4, and blog5, and the user rahul viewed blog1, blog4, blog6, and blog7.
- Finally, we close the Client instance.
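The ten near-identical statements in addBlogs could also be driven from a small table of itemId and itemType pairs. The following is only a sketch of that idea: the class name BlogSeedData is made up for this example, and the program just prints the pairs so that it runs without a PredictionIO server. The comment marks where the two client calls from addBlogs would go.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BlogSeedData {

    /** Returns itemId -> itemType for the ten sample blogs, in insertion order. */
    static Map<String, String> blogs() {
        String[] types = {
            "machine-learning", "javascript", "scala", "artificial-intelligence",
            "statistics", "python", "web-development", "security", "ruby", "openshift"
        };
        Map<String, String> blogs = new LinkedHashMap<String, String>();
        for (int i = 0; i < types.length; i++) {
            blogs.put("blog" + (i + 1), types[i]);
        }
        return blogs;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> blog : blogs().entrySet()) {
            // With the PredictionIO client on the classpath, the calls from addBlogs would go here:
            //   client.createItem(client.getCreateItemRequestBuilder(
            //       blog.getKey(), new String[]{ blog.getValue() }));
            System.out.println(blog.getKey() + " -> " + blog.getValue());
        }
    }
}
```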
Run this class as a Java application. It inserts the records into PredictionIO, which you can verify by looking at the dashboard.
Now that the data is in our PredictionIO application, we need to add an engine to the application. Click the Add an Engine button. Select the Item Similarity Engine as shown below.
Then create an Item Similarity Engine and enter Engine1 as the name.
Once you press the Create button, the Item Similarity Engine is created. You can change some of its configuration now, but we will use the defaults. Go to the Algorithms tab, and you’ll see that the engine is not running yet. Click Train Data Model Now to run the engine.
Wait a while. After the data model training is complete, you will see that the status has changed to RUNNING.
The problem we are trying to solve is recommending blogs to users based on the blogs they have viewed. In the following code, we get the items similar to blog1 for the userId shekhar.
import io.prediction.Client;

import java.util.Arrays;

public class BlogRecommender {

    public static void main(String[] args) throws Exception {
        Client client = new Client("wwoTLn0FR7vH6k51Op8KbU1z4tqeFGZyvBpSgafOaSSe40WqdMf90lEncOA0SB13");
        client.identify("shekhar");

        // Ask the engine named engine1 for up to 5 items similar to blog1
        String[] recommendedItems = client.getItemSimTopN("engine1", "blog1", 5);
        System.out.println(String.format("User %s is recommended %s", "shekhar", Arrays.toString(recommendedItems)));

        client.close();
    }
}
Run this Java program, and you’ll see the results: blog4, blog5, blog6, and blog7.
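To build some intuition for why these four blogs come back, here is a small sketch that counts co-views over the same view data that BlogDataInserter inserted. This is not PredictionIO’s or Mahout’s actual algorithm, only a plain-Java illustration of the “readers who viewed this blog also viewed” idea. The class CoViewSketch and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CoViewSketch {

    /** The "view" actions recorded by BlogDataInserter, grouped by user. */
    static Map<String, List<String>> sampleViews() {
        Map<String, List<String>> views = new LinkedHashMap<String, List<String>>();
        views.put("shekhar", Arrays.asList("blog1", "blog4", "blog5"));
        views.put("rahul", Arrays.asList("blog1", "blog4", "blog6", "blog7"));
        return views;
    }

    /** For each other item, counts the users who viewed it together with target.
     *  Assumes each user's list contains an item at most once. */
    static Map<String, Integer> coViewCounts(Map<String, List<String>> viewsByUser, String target) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (List<String> viewed : viewsByUser.values()) {
            if (!viewed.contains(target)) {
                continue; // this user never viewed the target item
            }
            for (String item : viewed) {
                if (item.equals(target)) {
                    continue;
                }
                Integer old = counts.get(item);
                counts.put(item, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    /** Returns up to n items, most co-viewed first; ties are broken by itemId. */
    static List<String> topN(Map<String, List<String>> viewsByUser, String target, int n) {
        final Map<String, Integer> counts = coViewCounts(viewsByUser, target);
        List<String> items = new ArrayList<String>(counts.keySet());
        Collections.sort(items, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                int byCount = counts.get(b).compareTo(counts.get(a));
                return byCount != 0 ? byCount : a.compareTo(b);
            }
        });
        return new ArrayList<String>(items.subList(0, Math.min(n, items.size())));
    }

    public static void main(String[] args) {
        System.out.println(topN(sampleViews(), "blog1", 5)); // prints [blog4, blog5, blog6, blog7]
    }
}
```

Both users viewed blog4 together with blog1, so it ranks first; blog5, blog6, and blog7 were each co-viewed once. A real engine uses more robust similarity measures, so its ordering and scores may differ on larger data sets.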
As you can see in the example above, it’s easy to add recommendations to your app. I’ll be using PredictionIO in my future projects, and I’ll be spending more time learning and using it.
That’s all for today. Please keep the feedback coming.
Day 4: PredictionIO - How to Build a Blog Recommender (SegmentFault)