Tutorial 1: Essentials

In this tutorial, you will write your first robot. You will learn how to:

The robot you will create in this tutorial will navigate to a page containing a table, extract the person data contained in that table, and output several PersonOutput objects.

Before proceeding, we recommend that you open your favorite browser, and navigate to http://www.kapowtech.com/tutorial/case1/index.html to take a look at the pages involved in this tutorial.

Let us begin by starting up RoboMaker and selecting “Create a new robot...”, which starts the New Robot Wizard as shown below.

This wizard will assist you in configuring the robot. Choose “Data collection robot” as the robot type and continue to the next step of the wizard.

As the URL to start from, enter "www.kapowtech.com/tutorial/case1/index.html".

In the next step of the wizard, add the object called “PersonOutput”. This object will be used to extract person data.

Click “Finish” to create the robot. The RoboMaker Main Window should now look like this:

As you can see, two steps have been inserted. The first step, called “Load Page”, loads the page using a Load Page action, and the second step, which is the current step, has not been configured yet.

Now let’s configure the second step of the robot, so make sure it is the current step. In the Browser View, we see the link “Go to Table” which leads to the page containing the table that we want to extract data from. To load this page, we choose Click as the action of the current step. In the “Action” tab in the Step View, select the Click action. (If you wish to read more about the Click action, click “More...”. This opens a page with the appropriate reference documentation in your browser. All step actions and data converters in RoboMaker have such a reference page.)

To select the link to be clicked by the Click action, click the “Go to Table” link in the Browser View. This will select the <a>-tag that defines the link (in the Tag Path View, you can see that the "a" is selected). Then, click the icon to configure the tag finders to find only that tag.

To load the page, click the icon. This causes two things to happen: First, it adds a new step after the current step. Then the new step becomes the current step. Changing the current step has some interesting effects: it always results in an update of the robot state shown in the State View, because the State View always shows the input state to the current step. The input state to the current step is always the output state of the previous step. To update the robot state in the State View, RoboMaker will execute as much of the robot as is needed to get the updated robot state. In our example, the output state of the previous step contains the loaded page.

An alternative and easier way to load the page would have been to simply right-click the link “Go to Table” and select “Click” in the pop-up menu. This would configure the current step to load the page referred to by the selected link, using the Click action, insert a new step after the current step and go to the new step.

The RoboMaker Main Window should now look like this:

You have now reached the page containing the content that we want to extract. Hence, the navigation part of the robot is over. However, before starting on the extraction part, let us try to change which step is the current step without adding a new one. You can make any step in the Robot View the current step by simply clicking it. Try clicking the first step in the Robot View. In the Step View, we can see that the Load Page action has been selected as the action, and that the URL the action loads from is the URL that we entered in the New Robot Wizard. Now try making the second step the current step. Note how the State View updates itself to appear exactly as it did when you finished configuring that step a moment ago.

The current step changed pretty fast, didn’t it? The reason for this is that RoboMaker caches (stores) the output robot states from selected steps in order to minimize the waiting time when the current step changes. The idea of caching is not unique to RoboMaker; your own browser also caches loaded pages so that you can quickly step back and forth between them. As in a normal browser, you sometimes want to refresh the cache. You can refresh the cache in RoboMaker by clicking the icon. Normally, however, it is not necessary to refresh the cache.
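To make the caching idea concrete, here is a minimal Python sketch. It is not RoboMaker's actual implementation: the StateCache class, its method names, and the string-based "robot states" are all hypothetical, and unlike RoboMaker (which caches the output states of selected steps) this sketch simply caches the output of every step.

```python
from typing import Callable, Dict, List


class StateCache:
    """Hypothetical cache of step output states (not RoboMaker's real code)."""

    def __init__(self, steps: List[Callable[[str], str]]) -> None:
        self.steps = steps
        self.cache: Dict[int, str] = {}
        self.executions = 0  # counts how many steps were actually executed

    def output_state(self, index: int) -> str:
        """Return the output state of step `index`, executing only uncached steps."""
        if index < 0:
            return ""  # empty input state before the first step
        if index not in self.cache:
            input_state = self.output_state(index - 1)
            self.cache[index] = self.steps[index](input_state)
            self.executions += 1
        return self.cache[index]

    def refresh(self) -> None:
        """Forget all cached states, like refreshing the cache."""
        self.cache.clear()


# Two made-up steps standing in for "Load Page" and "Click".
steps = [lambda state: state + "index.html", lambda state: state + " -> table page"]
cache = StateCache(steps)

cache.output_state(1)            # executes both steps
cache.output_state(0)            # served from the cache, nothing is executed
runs_before_refresh = cache.executions
cache.refresh()
cache.output_state(1)            # both steps are executed again
runs_after_refresh = cache.executions
```

Switching back and forth between already visited steps costs nothing in this sketch, which mirrors why changing the current step feels fast; only a refresh forces the steps to run again.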

Let us return to the extraction part. Before continuing, make sure that the last (third) step is the current step. Taking a look at the table on the web page, we discover that the table contains three columns (PersonId, Name, and Age) and four rows (not counting the headline row). Furthermore, the trained eye will discover an irregularity in the table: Bill has no age! (As you will discover when you begin to write your own robots, these kinds of irregularities are quite common on real-world web sites.) How do we deal with this irregularity? First, we need to decide whether we wish to extract a PersonOutput object when there is no age available for that person. This is an important question that you will probably encounter many times: How much information should, as a minimum, be available in a returned object? Fortunately, we can see the right answer to that question by looking at the “Output Objects” tab in the Objects View.

As you can see, the object attributes “personId” and “name” have a small red dot next to them. This means that these two attributes are mandatory and must be given a value before the PersonOutput object can be returned by the Return Object action. The “age” attribute has no red dot next to it. This means that the attribute is optional (i.e. not mandatory): it may be given a value before the PersonOutput object is returned, but it does not have to be.
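If you think in code, the distinction between mandatory and optional attributes can be sketched roughly as follows in Python. This is only an illustration: the class below is a hypothetical stand-in for the PersonOutput object, and the can_be_returned method is an invented name for the check that the Return Object action is described as performing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonOutput:
    """Hypothetical Python stand-in for the tutorial's PersonOutput object."""

    person_id: Optional[int] = None  # mandatory (red dot in the Objects View)
    name: Optional[str] = None       # mandatory (red dot in the Objects View)
    age: Optional[int] = None        # optional (no red dot)

    def can_be_returned(self) -> bool:
        # Only the mandatory attributes need a value; age may stay empty.
        return self.person_id is not None and self.name is not None


bill = PersonOutput(person_id=3, name="Bill")  # no age, still returnable
incomplete = PersonOutput(name="Bill")         # personId missing, not returnable
```

Under this reading, a row without an age can still produce a valid object, while a row without a person id or name cannot.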

So, we should extract four PersonOutput objects from the table. How do we do this? There are several approaches, but let us settle on one that uses the For Each Tag action to loop through (i.e. “do the same for”) each row in the table.

Select the For Each Tag action in the “Action” tab in the Step View. Enter "tr" in the Tag Name property to tell the For Each Tag action that it should loop through the <tr>-tags (the table rows) contained in some tag. Next, type "1" in the First Tag Number property to skip the headline row. We then need to identify which tag the For Each Tag action should look for the <tr>-tags in. Click the table in the Browser View, then look at the Tag Path View and select the innermost <tbody>-tag. Finally, click the icon to configure the tag finders to find this <tbody>-tag. (Another, much simpler way to do all this would have been to right-click the table, and, in the pop-up menu, select “Loops”, then “For Each Table Row”, and finally “Exclude First Row”.)
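For readers who prefer code, the configuration above behaves roughly like the following Python sketch. Everything in it is an assumption made for illustration: the for_each_tag function is invented, the table markup is a simplified guess at the tutorial page, and the person id and age shown for "Bob" are placeholders, since this tutorial does not state them.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed markup; the real tutorial page may differ.
SAMPLE_TABLE = """
<table>
  <tbody>
    <tr><th>PersonId</th><th>Name</th><th>Age</th></tr>
    <tr><td>0</td><td>Bob</td><td>40</td></tr>
    <tr><td>1</td><td>Ted</td><td>25</td></tr>
    <tr><td>2</td><td>Jim</td><td>72</td></tr>
    <tr><td>3</td><td>Bill</td><td></td></tr>
  </tbody>
</table>
"""


def for_each_tag(container, tag_name, first_tag_number=0):
    """Yield (iteration, tag) for each matching child tag of the container.

    Tags before first_tag_number are skipped; iterations are numbered from 1.
    """
    matching = container.findall(tag_name)
    for iteration, tag in enumerate(matching[first_tag_number:], start=1):
        yield iteration, tag


# The found tag is the <tbody>; the loop runs over its <tr> children.
tbody = ET.fromstring(SAMPLE_TABLE).find("tbody")
names = [(i, row[1].text) for i, row in for_each_tag(tbody, "tr", first_tag_number=1)]
```

With the First Tag Number set to 1, the headline row is never visited, so iteration 1 is the first row that contains person data.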

The RoboMaker Main Window should now look like this:

Click the icon to add a new step and make it the current step. The input to the current step is the output of the first iteration of the For Each Tag action (iteration 1). You can change the iteration number of the For Each Tag action by clicking the icon (decrease iteration number by one) or the icon (increase iteration number by one), or by typing the iteration number directly into the small text field between them and pressing Return. You can also go to the first or last iteration by clicking the corresponding icons. Try changing the iteration number to 3.

The RoboMaker Main Window should now look like this:

Note that the current tag is now the third data row in the table (not counting the headline row).

Let us extract the content of the current row. Right-click the PersonId "2" in the Browser View, select “Extraction” in the pop-up menu, then “Extract Number”, and finally “PersonOutput.personId”. Because we are extracting a number, the Extract Number Configuration window pops up. Select the “Convert to Integer” option, and click “OK”. The current step will now be configured to use an Extract action, with the Attribute property set to “PersonOutput.personId”, and an Extract Number data converter added to the list of data converters.

Do (more or less) the same for the Name "Jim": Right-click "Jim", select “Extraction”, then “Extract Text”, and finally “PersonOutput.name”.

Do (more or less) the same for the Age "72": Right-click "72", select “Extraction”, then “Extract Number”, and finally “PersonOutput.age”. Select the “Convert to Integer” option in the Extract Number Configuration window that pops up, and then click “OK” to close the window.

You have now extracted a PersonOutput object! Let us return it by selecting the Return Object action for the step. Remember to select “PersonOutput” in the drop-down box for the “Object” property in the Return Object action.

The RoboMaker Main Window should now look like this:

The robot now consists of seven steps: two steps concerned with navigation and five steps concerned with extraction. Let us have a closer look at how the objects change as the current step changes.

As you can see in the Objects View, you have extracted the “personId”, “name”, and “age” attributes of the PersonOutput object. Now, try to make the previous step (named "Extract Age") the current step by clicking on it. Note that this causes the value of the “age” attribute to become empty. The reason for this is that the Objects View shows the attribute values that are input to the current step; and as the attribute value for “age” has not yet been extracted, it is empty. Try clicking the previous step (named "Extract Name") and note that the value of the “name” attribute becomes empty. Finally, if you make the step named "Extract Person Id" the current step, then the value of the “personId” attribute also becomes empty.
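The behavior you just observed can be sketched in a few lines of Python. This is only a model of the rule "the Objects View shows the input to the current step", not RoboMaker code; the input_attributes function and the dictionaries are hypothetical, and the values are those of the row for "Jim".

```python
# Values extracted from the current row (the row for "Jim").
ROW = {"personId": 2, "name": "Jim", "age": 72}

# The extraction branch, in execution order.
STEPS = ["Extract Person Id", "Extract Name", "Extract Age", "Return Object"]

# Which attribute each extraction step fills in.
EXTRACTS = {
    "Extract Person Id": "personId",
    "Extract Name": "name",
    "Extract Age": "age",
}


def input_attributes(current_step):
    """Return the attribute values that are input to the given step."""
    attributes = {"personId": None, "name": None, "age": None}
    for step in STEPS:
        if step == current_step:
            break  # the current step itself has not been executed yet
        attribute = EXTRACTS.get(step)
        if attribute is not None:
            attributes[attribute] = ROW[attribute]
    return attributes


at_return = input_attributes("Return Object")      # all three values present
at_extract_age = input_attributes("Extract Age")   # age is still empty
```

Only the steps before the current step have contributed values, which is why each click on an earlier step empties one more attribute.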

Now, change the iteration number of the For Each Tag action to 1 by clicking the icon twice (or by clicking the icon). Then change the current step back to "Return Object" and note how the values of the attributes change to match those for the first data row (containing info on "Bob"), even though you created the extraction steps for the third data row (containing info on "Jim"). This is because the branch beyond the For Each Tag step is applied to every robot state output by the For Each Tag action. This is a general principle for all loop actions, and it is highly useful when you need to do the same thing more than once.

Try clicking the icon and notice how the PersonOutput object changes. Also, notice that there is only one PersonOutput object at any time, not one per person in the table; the same PersonOutput object is reused in different iterations.

Keep clicking the icon until the following message appears:

This error occurs because there is no age in the table row for Bill. This causes the tag finders in the step named "Extract Age" to fail. When you click "OK", this step will be made the current step.

How do we deal with this missing age attribute value? We will select this approach: Only extract an age attribute value if there is one. In other words, there are two cases: One in which there is an age value, and one in which there is not. We can represent these two cases by branching into two branches, each starting with a conditional step containing a Test Tag action. The first Test Tag action will only continue execution beyond the step if there is an age value, and the other Test Tag action only continues execution beyond the step if there is no age value.

Click the icon (and not the icon!) to insert a new step between the "Extract Name" and "Extract Age" steps. Click "3" or "Bill" in the Browser View, select "tr[4]" in the Tag Path View and click the icon to configure the tag finders to use the <tr>-tag as input. Select the Test Tag action in the “Action” tab in the Step View. We want this Test Tag action to continue execution beyond this step only if the row contains an age value. Enter the pattern ".*\d+" (which matches any text ending with one or more digits) in the Pattern property, set the “Action” property to “Continue if Pattern Matches Found Tag”, and select “Only Text” in the Match Against property. The “Only Text” option is selected because we want the pattern to be matched against just the text content of the found tag, without the tags.
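If you want to try the pattern outside RoboMaker, the following Python sketch shows how it separates rows with and without an age. It rests on two assumptions: that the pattern must match the whole text (which is why re.fullmatch is used here), and that the row text looks like the sample strings below, with no trailing whitespace. Python's regular expression dialect may also differ in details from the one RoboMaker uses, and the row_has_age function is an invented name.

```python
import re

# The pattern from the Test Tag step: any text ending in one or more digits.
AGE_PATTERN = re.compile(r".*\d+")


def row_has_age(row_text: str) -> bool:
    """Return True if the whole row text matches the pattern."""
    return AGE_PATTERN.fullmatch(row_text) is not None


# Assumed "Only Text" content of two table rows.
ted_has_age = row_has_age("1 Ted 25")   # ends with digits
bill_has_age = row_has_age("3 Bill")    # ends with a letter
```

Note that the person id at the start of the row does not cause a false match for "Bill": the pattern requires the digits to come at the end of the text.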

The RoboMaker Main Window should now look like this:

To verify that the Test Tag action works correctly, click the "Extract Age" step. This will cause the following message to appear:

The message says that the Test Tag action has stopped the execution. Click “OK” to dismiss the message. Change the iteration to 2 by clicking the icon twice. Then click the "Extract Age" step again. This time the Test Tag action will not stop the execution because the pattern matches the text — that is, the row contains an age value.

Now, let us create the branch for the case in which there is no age value. Make the "Extract Name" step the current step by clicking on it. Then, click the icon to add a new branch to the "Extract Name" step. The new branch contains a single step that becomes the current step. As before, click “1”, “Ted” or “25” in the Browser View, select "tr[2]" in the Tag Path View and click the icon to configure the tag finders. Select the Test Tag action in the “Action” tab in the Step View. This action should be configured so that it stops execution if the text contains an age value. Enter ".*\d+" in the Pattern property, set the “Action” property to “Stop if Pattern Matches Found Tag”, and select “Only Text” in the Match Against property.

The RoboMaker Main Window should now look like this:

Now, let us create a connection to the "Return Object" step. To do this, place the mouse cursor just to the right of the current step until a white arrowhead appears, then drag this arrow to the "Return Object" step. An alternative way to connect the two steps is to hold down the Ctrl key, and then click first the current step and then the “Return Object” step, to select both steps. Then, right-click the “Return Object” step to bring up the pop-up menu for the step, and choose “Add Connection” from the pop-up menu. (To remove a connection between two steps, either hold down Ctrl and click the connection, then click the icon, or right-click the connection to bring up the pop-up menu and select “Delete” from the menu.)

The RoboMaker Main Window should now look like this:

Verify that the Test Tag action works correctly by changing the iteration using the iteration icons and then clicking the "Return Object" step. You should only be allowed to execute beyond the current step (containing a Test Tag action) when the iteration is 4, which is exactly what we want. We have now achieved the desired behavior: The first (top-most) branch only allows iterations 1, 2, and 3 to continue (those with an age value). The second branch only allows iteration 4 to continue (which has no age value).
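The two branches can be summarized in a short Python sketch. It is a simplified model, not RoboMaker's execution engine: the run_branches function is invented, the rows are assumed sample data (the person id and age for "Bob" are placeholders), and Python's re.fullmatch stands in for the Test Tag pattern match.

```python
import re
from typing import List, Optional, Tuple

# Assumed cell texts for the four data rows; Bill's age cell is empty.
ROWS = [
    ["0", "Bob", "40"],
    ["1", "Ted", "25"],
    ["2", "Jim", "72"],
    ["3", "Bill", ""],
]

Person = Tuple[int, str, Optional[int]]


def run_branches(cells: List[str]) -> List[Person]:
    """Run both branches for one row and collect the returned objects."""
    row_text = " ".join(cell for cell in cells if cell)
    matches = re.fullmatch(r".*\d+", row_text) is not None
    returned: List[Person] = []

    # Branch 1: "Continue if Pattern Matches", then Extract Age, then Return Object.
    if matches:
        returned.append((int(cells[0]), cells[1], int(cells[2])))

    # Branch 2: "Stop if Pattern Matches", then straight to Return Object.
    if not matches:
        returned.append((int(cells[0]), cells[1], None))

    return returned


results = [run_branches(cells) for cells in ROWS]
```

Because the two tests are exact opposites, every row passes through exactly one branch, so each row yields exactly one returned object.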

Have you noticed that the connections between steps are sometimes black and sometimes dark gray? This brings up the concept of the execution path. The execution path is the path from the first step to an end step (a step with no step after it) that passes through the current step. As you change the current step repeatedly between the two “Test Tag” steps, notice how the execution path changes. You can use the execution path to see which of several branches was taken to reach a step. For example, if you make the “Return Object” step the current step, then the execution path will tell you which of the two branches was executed to reach that step.

That's it! Congratulations, you have now created your first robot!

Let us test the robot in RoboDebugger and verify that it extracts the PersonOutput objects we expect, namely four PersonOutput objects, one for each row in the table containing person data. Click the icon to open RoboDebugger. Then click the icon in the RoboDebugger Main Window to start the debugging process. As the debugging process runs, objects are returned and displayed.

When the debugging process completes, the RoboDebugger Main Window should look like this:

If your RoboDebugger returns the same objects as suggested by this screenshot, then your robot is working as expected. Return to the RoboMaker Main Window (by either closing the RoboDebugger Main Window or simply switching to the RoboMaker Main Window) and press the icon to save the robot for later use.

It might seem that you had to do a lot of work to “simply” extract some person data from a table. However, once you are practiced in using RoboMaker, you can create a simple robot like the one in this tutorial in one or two minutes! Also, the robot is rather robust; for example, it will still work correctly if persons are added to, or removed from, the table, or if the “age” attribute value is missing for any person, not just "Bill". So what you have is a robot that can be reused as the table content grows or shrinks, and that can handle some table irregularities. For many kinds of robot tasks, this flexibility is exactly what you want and need. For more on robot robustness, see How to Make Robots More Robust.

Before you proceed to the next tutorial, we recommend that you read the reference documentation entries for the step actions you have used so far. Also, you might want to experiment with the robot you have created:

Remember to test your modifications in RoboDebugger.