Turk Berserk: April 2008

Tuesday, April 8, 2008

Are These Items Different Improved

ver. 1.0.1

last updated: 04/08/08

Update 04/08/08 (ver. 1.0.1): Bug fixed. When you use a keyboard shortcut to make your selection with auto-submit off, you can now press Enter to submit the HIT.

Description:

Provides general cleanup and streamlining to the "Are these items different" HITs. Allows keyboard shortcut for radio buttons and provides option to auto-submit.

Screenshot of HIT with script installed.

On the standard Items HITs, the two items and their descriptions are arranged vertically with the radio buttons at the bottom. This requires a fair amount of scrolling for this quick-decision HIT. This script streamlines the HIT for quicker comparison and submission.

The Layout: The radio buttons are moved to the top of the HIT, and the items are placed side-by-side.

The blue box that describes the HIT has been streamlined to be as short as we could make it. (We removed the qualification information, and put the rest of the text on one line.)

The instructions are hidden using the show/hide scriptlet. Amazon has 6-8 different versions of the instructions, so it will take viewing quite a few HITs before you get all the instructions accepted into the routine. Once you've reviewed the instructions, click "Don't warn me about this change again."

View of the instructions warning.

Selecting radio buttons: You can choose a radio button with a keyboard shortcut. By default, "s" selects the "These items are exact duplicates" button and "x" selects the "These items are different." button. You can customize the shortcuts under the "User Script Commands" menu, using "Change hot keys." Enter your two selections separated by a space: the first is for "exact duplicates" and the second is for "different." You can choose letters or numbers, and combine them with ctrl or alt by joining them with a "+". For example, if you would like y for same and n for different, type: "y n". Or, if you'd like ctrl+y and alt+n, type: "ctrl+y alt+n". If you enter an invalid combination, the input dialog box will reappear.
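To make the hot key format concrete, here is a minimal sketch of how a "same different" string could be validated. It is an illustration only, not the script's actual code; the function name `parseHotKeys` and the shape of the returned object are hypothetical.

```javascript
// Hypothetical helper: parse a hot key string such as "y n" or
// "ctrl+y alt+n". The first entry is for "exact duplicates", the second
// for "different". Returns null for invalid input, so the caller can
// show the input dialog again.
function parseHotKeys(input) {
  const parts = input.trim().toLowerCase().split(/\s+/);
  if (parts.length !== 2) {
    return null;
  }

  const keys = parts.map(function (part) {
    // Optional "ctrl+" or "alt+" prefix, then exactly one letter or digit.
    const match = /^(?:(ctrl|alt)\+)?([a-z0-9])$/.exec(part);
    if (!match) {
      return null;
    }
    return {
      ctrl: match[1] === "ctrl",
      alt: match[1] === "alt",
      key: match[2]
    };
  });

  if (keys[0] === null || keys[1] === null) {
    return null;
  }
  return { same: keys[0], different: keys[1] };
}
```

With this sketch, `parseHotKeys("ctrl+y alt+n")` gives ctrl+y for "same" and alt+n for "different", while a single entry such as `"y"` is rejected.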

Auto-submit option: The ultimate in speed. User can choose to have the HIT automatically submitted upon pressing "s" or "x", or the user-defined keyboard shortcuts. Note: this is an advanced feature. You must be very confident in your selection abilities to use this feature successfully. When I first gave it a try, I made several errors before getting used to the selection process.

To turn the auto-submit on, right-click on the Greasemonkey icon, then choose the "User Script Commands" menu option, and choose "Toggle auto submit." The state of the auto-submit toggle is shown at the top of the HIT.

Note: When the auto-submit is on, manually selecting the radio button with a mouse click does not automatically submit the HIT.
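The submit rules described above can be summed up in a small decision function. This is a sketch under stated assumptions rather than the script's real logic: the name `shouldSubmit` and the three "source" labels are invented for the example.

```javascript
// Hypothetical decision helper; the "source" labels are assumptions made
// for this sketch, not names taken from the script.
//   "hotkey": the worker pressed one of the two selection hot keys
//   "mouse":  the worker clicked a radio button
//   "enter":  the worker pressed Enter
function shouldSubmit(source, autoSubmitOn, hasSelection) {
  if (source === "hotkey") {
    // A hot key submits the HIT only when auto-submit is switched on.
    return autoSubmitOn;
  }
  if (source === "enter") {
    // Enter submits once a radio button has been selected.
    return hasSelection;
  }
  // A mouse click on a radio button never submits by itself.
  return false;
}
```

So a hot key with auto-submit on submits immediately, while a mouse click leaves the HIT open even when auto-submit is on.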

Other Notes:

If you're really looking to speed up your Turking, open two tabs in one window with an active Items HIT running in each tab, with "Automatically accept next HIT" selected. While you are working on one HIT, the other is loading. With this script and the auto-submit on, you should be able to type, e.g.: x, ctrl+PageDown, x, ctrl+PageDown, s, ctrl+PageDown, etc. Both ctrl+tab and ctrl+PageDown move to the next tab in Firefox. Lickety split!

Instructions:

You must have Firefox, with the Greasemonkey extension installed.

You can have your selection submitted automatically by choosing "Toggle auto submit" under "User Script Commands." The state of the auto-submit is shown at the top of the HIT.

Keyboard shortcuts can be changed via the "Change hot keys" menu option. The two selections should be separated by a space. You can choose any letter or number, optionally prefixed with ctrl+ or alt+. Your new selection will be printed immediately after the radio button.

Disclaimer:
Our scripts are provided "as-is". We always aim to provide a well-tested and useful script that aids in your Turking and causes no adverse effects. Given the huge variety of configurations on which our scripts might be used, we can never guarantee that something won't go wrong. We take no responsibility for any inconvenience, increased rejection rate, blocking by a requester, loss of income, damage, or any other problem that use of our scripts might cause. We recommend that you only use HIT-specific scripts on HITs that you're very familiar with. When you use HIT-specific scripts, treat them as if you were starting a new type of HIT with a new Requester - try doing a few, then wait to be sure that they're getting accepted.

Show/Hide Scriptlet

The show/hide scriptlet is a small piece of script that we use in many HIT-specific Greasemonkey scripts. Because it is embedded, it is not a script that can be downloaded independently.

Description:

This scriptlet hides the instructions in a HIT. The instructions can be viewed by clicking on "show instructions." If the instructions change, a warning will appear, saying "Instructions appear to have changed. Click link above to view instructions." After the user views the changed instructions, they can click "Don't warn me about this change again" to accept the new set of instructions. After clicking this link, the changed-instructions warning won't appear again, unless the instructions change another time.
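One way to detect changed instructions is to keep a short fingerprint of every instruction text the user has accepted. The sketch below shows that idea only; it is not the scriptlet's code, and the function names and the simple hash are assumptions. In a Greasemonkey script the accepted list could presumably be kept between page loads with GM_getValue and GM_setValue, which is left out here.

```javascript
// Hypothetical fingerprint: a simple 32-bit rolling hash of the text.
function fingerprint(text) {
  // Collapse whitespace so layout-only differences do not trigger a warning.
  const normalized = text.replace(/\s+/g, " ").trim();
  let hash = 0;
  for (let i = 0; i < normalized.length; i++) {
    hash = (hash * 31 + normalized.charCodeAt(i)) >>> 0; // stay within 32 bits
  }
  return hash.toString(16);
}

// True when this instruction text has not been accepted before.
function needsWarning(instructionText, acceptedFingerprints) {
  return acceptedFingerprints.indexOf(fingerprint(instructionText)) === -1;
}

// Returns a new list that also contains this text's fingerprint
// (what clicking "Don't warn me about this change again" would record).
function acceptInstructions(instructionText, acceptedFingerprints) {
  const fp = fingerprint(instructionText);
  if (acceptedFingerprints.indexOf(fp) !== -1) {
    return acceptedFingerprints;
  }
  return acceptedFingerprints.concat([fp]);
}
```

Keeping a list rather than a single fingerprint means a requester who rotates through several versions of the instructions only triggers the warning once per version.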

Other notes:

We don't think instructions should ever be completely removed from a HIT. A worker needs the ability to review the instructions as necessary. However, after doing hundreds, if not thousands of HITs, a worker does not need to scroll through identical instructions hit after hit.

On first install, the user will receive the changed-instructions warning, and will have to click "Don't warn me about this change again."

GeorgeTag script update #3

We've released version 1.2 of GeorgeTag Image Tagging Improved.

Here's the latest improvement:

You can customize the keyboard shortcut! Don't like the Alt+w, or it doesn't work for you? Choose ctrl+k, or alt+z, or even ctrl+alt+q. Pretty much whatever combination you find most convenient.

From the User Script Commands menu in Greasemonkey, choose "Image Tagging change skip key." In the dialog box, you can type in your choice. The format is like the examples above... you can type ctrl or alt or both, then the "+" key, then a letter or number. (Any character that requires the shift key won't work, but capital letters are ok.) If you input an invalid choice, the dialog box will reappear. The new choice appears in the header of the HIT, and the default remains alt+w.
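For the curious, here is a rough sketch of how a skip key such as "ctrl+alt+q" could be parsed and then compared with a key press. It is not the code in the GeorgeTag script; `parseSkipKey` and `matchesSkipKey` are hypothetical names, and the event is assumed to expose the standard `ctrlKey`, `altKey` and `key` properties.

```javascript
// Hypothetical parser: "ctrl", "alt", or both, then "+", then one letter
// or digit. Capital letters are accepted by lower-casing the input.
// Returns null for anything else so the dialog can be shown again.
function parseSkipKey(input) {
  const pattern = /^((?:ctrl|alt)(?:\+(?:ctrl|alt))?)\+([a-z0-9])$/;
  const match = pattern.exec(input.trim().toLowerCase());
  if (!match) {
    return null;
  }
  const modifiers = match[1].split("+");
  if (modifiers.length === 2 && modifiers[0] === modifiers[1]) {
    return null; // rejects repeats such as "ctrl+ctrl+k"
  }
  return {
    ctrl: modifiers.indexOf("ctrl") !== -1,
    alt: modifiers.indexOf("alt") !== -1,
    key: match[2]
  };
}

// Compare a parsed combination with a keyboard event (or any object
// that has ctrlKey, altKey and key).
function matchesSkipKey(combo, event) {
  return event.ctrlKey === combo.ctrl &&
    event.altKey === combo.alt &&
    String(event.key).toLowerCase() === combo.key;
}
```

For example, `parseSkipKey("Alt+W")` yields the default alt+w combination, while a bare `"k"` or `"alt+!"` is rejected.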

Hopefully this improvement helps those of you who have non-English set-ups, or even the left-handed folks.

Sunday, April 6, 2008

Open HITs Warning

ver. 1.0

last updated: 04/06/08

Description:
Colors the "HITs" tab at the top of each mTurk page a discreet but noticeable shade of red when you have a HIT assigned to you. This provides a visual notification to help you avoid letting HITs expire.

Screenshot of mTurk with the script installed when a HIT is assigned.

This script checks whether you have any active HITs assigned to you (i.e. HITs which you've accepted but have not yet completed). If you do have active HITs assigned, it changes the color of the "HITs" tab at the top of the mTurk page to a discreet shade of red to provide a visual notification that you have work that needs completing. The tab will appear red no matter which mTurk page you are viewing.

It's all too easy to forget that you've accepted a HIT and allow it to expire as a result, hurting your qualifications score. With this script installed it will be much harder to forget!

Other Notes:
The tab color will only update when a page is refreshed or a new page opened - so if you have two mTurk windows open and accept a HIT in one, the "HITs" tab in the other won't turn red unless you refresh or open a new page in that window. This script will not warn you about HITs that have already expired (it's too late to do anything about them anyway!).

Instructions:
You must have Firefox, with the Greasemonkey extension installed. There are no options to set.

Saturday, April 5, 2008

Requesters: Getting High Quality Results, Part 2

As a follow-up to my original post, Requesters: Getting High Quality Results, there are several more strategies Requesters can use to keep their quality high. You might consider these to be advanced strategies.

Use groups of HITs to "grade" workers:
Rather than using a short qualification test, Requesters can use HITs to grade workers. To do this, you can run several batches of HITs at various times of the day, and analyze the returned data carefully. Then, assign the top workers a qualification value, and only allow workers above that threshold to work on your future HIT groups. Amazon Media Content recently tried this method for their advanced HITs. The downside of this grading method is that you only catch those workers who happened to find your HITs when you posted them. In addition, you might not have a large enough group of workers to draw on, and over time you can have dwindling numbers of workers completing your HITs. To rectify these problems, you will occasionally have to post open HITs to re-grade workers.
Use a sliding grade scale:
This is like using a dynamic qualification test. After setting a grade (either via a qualification test, or via an assigned value), Requesters can assign qualification points for every approved response, and take away points for every rejection. This way a worker is motivated to give the highest-quality responses. CastingWords transcription HITs work this way. Don't forget to tell workers this is your methodology!
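As a rough illustration of the sliding scale, the sketch below adds points for an approval and removes points for a rejection, keeping the score within fixed bounds. All the numbers and names here are made up for the example; a Requester would pick values to suit their own HITs.

```javascript
// Hypothetical scale: small reward per approval, larger penalty per
// rejection, score kept between min and max.
const SCALE = { reward: 1, penalty: 5, min: 0, max: 100 };

// Returns the worker's new qualification score after one reviewed response.
function updateScore(score, approved, scale) {
  const next = approved ? score + scale.reward : score - scale.penalty;
  return Math.min(scale.max, Math.max(scale.min, next));
}

// A worker may take the HITs only while the score stays at or above
// the Requester's threshold.
function canWork(score, threshold) {
  return score >= threshold;
}
```

With these example values a worker at 50 moves to 51 after an approval and to 45 after a rejection, so a string of rejections quickly drops them below the threshold.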
Include/exclude countries:
Besides qualifying workers based on their approval/rejection stats, Requesters can restrict or allow workers based on their location. Using a locale qualification is a blunt instrument, but sometimes it's the only instrument available.

At least one Requester has previously mentioned that upon analysis, they noticed poor quality work returned from particular countries, perhaps due to language issues. Unfortunately, there is no Amazon "system qualification" for language. When a worker looks at their locale qualification, it states:
The Location Qualification represents the location you specified with your mailing address. Some HITs may only be available to residents of particular countries, states, provinces or cities.
Unfortunately, there is no "OR" operator when you list locale codes. This means that, as stated in the AWS mTurk Requester documentation, you cannot, for example, allow workers from US or GB or AU or NZ or CA. The best you can do is to exclude workers from a list of countries. What's unfortunate yet again, is that Amazon restricts the number of qualifications for a HIT group to a maximum of 10. (See this document and search for "QualificationRequirement".)
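To show how quickly an exclusion list runs into the limit, here is a sketch that builds one "not equal to" locale requirement per excluded country and refuses to go past 10 requirements in total. The field names follow the QualificationRequirement structure as I understand it from the Requester documentation, and the Locale qualification ID is quoted from memory, so treat both as assumptions and check them against the current docs. The function name and the country codes in the usage note are placeholders.

```javascript
// Assumed ID of the system Locale qualification; verify before use.
const LOCALE_QUALIFICATION_ID = "00000000000000000071";
// Limit on qualifications per HIT group, as described in the post.
const MAX_QUALIFICATIONS = 10;

// Hypothetical helper: one NotEqualTo requirement per excluded country,
// appended to any requirements the HIT group already has.
function excludeCountries(countryCodes, otherRequirements) {
  const localeRequirements = countryCodes.map(function (code) {
    return {
      QualificationTypeId: LOCALE_QUALIFICATION_ID,
      Comparator: "NotEqualTo",
      LocaleValue: { Country: code }
    };
  });
  const all = (otherRequirements || []).concat(localeRequirements);
  if (all.length > MAX_QUALIFICATIONS) {
    throw new Error(
      "Too many qualifications: " + all.length + " (limit " + MAX_QUALIFICATIONS + ")"
    );
  }
  return all;
}
```

Calling `excludeCountries(["AA", "BB"], [])` with two placeholder codes yields two requirements. Since every excluded country uses up one slot, an approval-rating requirement plus ten excluded countries already exceeds the limit.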

Establishing HITs based on the location of the worker is fraught with political problems. Workers frequently complain on the message boards at Turker Nation when a Requester only allows workers from a particular country (especially the US). Using a locale qualification to effectively impose a language qualification will inevitably and unfairly block some workers who speak the language fluently. Be aware that some workers will be annoyed by this, but you might not have any other way to apply a fluency test.

As a Requester, you are well within your rights to include and exclude workers who don't meet your criteria to help you achieve the highest quality results. Nowhere in the terms of service does it state that you have to be "fair" in excluding workers. In fact, Amazon won't arbitrate in any Requester-Worker disputes, as stated in the Participation Agreement.

However, the more restrictions and complicated hoops you place in your process, the less likely a worker will work on those HITs. You should also be prepared to respond to many inquiries any time you exclude any group of people. If it appears that you aren't being fair, you could also get a bad reputation, which will scare off other workers.

Although you may have to test out different strategies, when implemented correctly these 11 strategies can help Requesters achieve a high quality return on investment. Workers who achieve the "elite" qualifications take pride in their status, and might well put your HIT groups at the top of their to-do list.

Friday, April 4, 2008

GeorgeTag: New hits script warning

The requester georgetag has posted a new type of HIT, people matching, and the script we've been using won't work on these new ones. They used the same title and instructions, though, so we're going to have to find some creative way to tell these HITs apart.

If you are doing these people-matching hits, it might be best to turn off the georgetag greasemonkey script, because we don't know yet if it has adverse effects.

To turn off the script, right click on the greasemonkey icon. If you have a georgetag hit open, you will find the script listed with a green check mark next to it. Select the name (left-click) and it should be listed with a red x next to it. Alternatively you can go into the manage user scripts dialog and un-check the "enable" box for this script.

Thursday, April 3, 2008

GeorgeTag script update #2

We have the latest and greatest version of the Georgetag script for you, ver. 1.1. This version corrects two bugs and adds two great new features (no more scrolling, and minimized clicking!):

Bug fix 1: If you happened to have first clicked in the "Enter ALL text found in image" text field before selecting a radio button, then selected the "no text" radio button, the box turned red and warned you that you needed to type at least one character. Now once "no text" is selected, the box does not give this warning.

Bug fix 2: If you had already selected "no text" then changed your mind and wanted to add text, previously you had to manually erase the "no text." Now, if you click "Contains Text", it will clear the text box, but only if it previously said "no text."

New Feature 1: Refocusing of text boxes when radio buttons are checked. When you click the "Contains Text" radio button, the cursor is automatically placed in the "Enter ALL text found in image" field. When you select "no text", the cursor is automatically placed in the "tags" text box. No more clicking around! Select and start typing! (Don't forget, the TAB will jump to the next selection as well.)

New Feature 2: Press ALT+w to jump to the next photo. (No more scrolling!) The ALT+w jump iterates through the photos, and once you get to the end, it will put you back at the beginning. Note: the first ALT+w takes you to the second image (since the first was already in view). However, if you forget and scroll the page yourself, the script will only jump to what it thinks is the next image. For example, if you do the first two images using ALT+w and then move on to the 4th by manually scrolling, the next ALT+w will take you to the 3rd image. It's hard to describe, so you will have to play around with the feature.
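The ALT+w behavior is easier to see in code than in words. Below is a minimal sketch of a counter that steps through the images and wraps around; it is an illustration under assumptions, not the script's source, and `makeImageCycler` is a hypothetical name. In a real page the returned index would presumably be used to scroll the matching image into view.

```javascript
// Hypothetical cycler: keeps its own position and ignores manual scrolling,
// which is why a jump can land on an image you have already passed.
function makeImageCycler(imageCount) {
  let current = 0; // the first image is already in view when the HIT loads
  return {
    // Called on each ALT+w press; returns the index of the image to show.
    next: function () {
      current = (current + 1) % imageCount;
      return current;
    }
  };
}
```

With three images, successive calls to `next()` give indexes 1, 2, 0, 1: the first press goes to the second image and the press after the last image wraps back to the first.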

If you can't wait to get your hands on the latest, greatest version 1.1, head over to the GeorgeTag Image Tagging Improved post, and install the new version. (You shouldn't need to un-install the old script, Greasemonkey should overwrite the old one.)

GeorgeTag script update

Georgetag has uploaded new Image Tagging HITs, with changed title and instructions. This caused ver. 1.0 of our script to break.

We have uploaded ver. 1.0.1 to the blog.

Visit GeorgeTag Image Tagging Improved for the latest version of the script. You shouldn't need to un-install the old script; Greasemonkey should overwrite it for you.

Wednesday, April 2, 2008

Requesters: Getting High Quality Results

Sounds like TagCow has had some great publicity and response to their enterprise! This means more work, but a few snags as well. Georgetag posted some questions to Turker Nation. These are problems that most Requesters will encounter, but particularly those with high-volume HITs. Although my response is particularly about the Image Tagging set, the tactics outlined below are general enough to be used by any Requester.

Quote:

Garbage tags (intentional and unintentional):
We filter out meaningless words (the, it, an, a, with) but we are getting some totally irrelevant tags
Vulgarity (jokesters are hijacking the program)
It looks like MTurk is doing some filtering and we are doing some filtering as well. (Can anyone confirm that?)
...
Incomplete taggings:
We have gotten some images tagged with "boy" where there is more that could obviously be said about the photo, like "boy", "playing", "trains"

There are several things you can do to keep the quality high. Here is a list of tactics implemented by other Requesters on mTurk. I'm certainly not suggesting you use all these ideas, but one or two might work well.

Have a good, representative list of examples, and a good description.
I know folks over at Turker Nation have already mentioned this, but your description is very vague. We aren't certain if you want us to make a list of everything we see in the picture, or just keep it as simple as possible. A separate webpage with lots of examples will go very far. We can then emulate these examples.

Warn workers what response will be rejected, and what behavior will get them banned.
A clear (but not overly-dramatic) warning might just be enough to scare off Roboform-type workers.

Require a minimum approval rating for workers.
Some recent HITs by Amazon's Media Content and the information extraction group required an approval rating greater than 90%. (Smart Travel Media also has this threshold.) This sounds about right to me. After doing 40k HITs, mine is 99.8%. In the forums, even those who do HITs with high rejections (like the items HITs) still seem to have above 90%. This will prevent some workers who are continually trying to "game" the system.

Offer a bonus for high-quality work.
Georgetag is already offering up a volume bonus based on approvals, which is excellent! Few requesters do this, and more should. Once you get a good verification workflow established, you could track workers' responses and reward those that have submitted the highest quality and volume. Money talks.

Include "gotchas."
Include some pictures that should have an obvious response, like a flower, bird, etc., and particularly ones with a word to include. Then you can start to weed out or ban workers who miss these images. The "Are these items different" HITs have a qualification set up and a large set of gotchas. When you miss one, your qualification goes down by 200 points. Basically, after getting 3 wrong, you get timed out for some length of time.

Ban very bad workers.
And ban them quickly. If you haven't already implemented it, check if a single worker is giving you the same response (or few responses) over and over again. I wouldn't be surprised if workers are bypassing the "Enter ALL text found in image" step. Luckily this can be automated. And don't forget to give them some rejections if they replicate their responses beyond some acceptable level.

Set up verification HITs.
After getting all the tags for a given image, you could then create a HIT where the worker verifies that the tags are relevant, and could let you know if any are vulgar or meaningless. An image with a row of checkboxes would be ideal, where you select any tags that are bad, and a comment field to let you know about anything unusual. Hopefully then you can get a great set of tags and you can identify the bad workers using another method.

Use a qualification test.
This one might be a pain to grade, but you could have 5 images that the worker has to successfully tag before being able to do your HITs. Some qualifications even have a quiz about the purpose of the HIT and whether an example is appropriate or not. The quiz-style could be automatically graded.
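The repeated-response check mentioned under "Ban very bad workers" is one of the easier tactics to automate. Here is a sketch of one possible approach: for each worker, measure what share of their responses is the single most common answer, and flag anyone above a cutoff. The function names, the data layout and the cutoff values are all assumptions for the example.

```javascript
// Share (0 to 1) of a worker's responses taken up by their most common
// answer, ignoring case and surrounding spaces.
function mostRepeatedShare(responses) {
  if (responses.length === 0) {
    return 0;
  }
  const counts = new Map();
  let highest = 0;
  responses.forEach(function (response) {
    const key = response.trim().toLowerCase();
    const count = (counts.get(key) || 0) + 1;
    counts.set(key, count);
    if (count > highest) {
      highest = count;
    }
  });
  return highest / responses.length;
}

// Hypothetical flagging rule: enough responses to judge, and more than
// maxShare of them identical. responsesByWorker maps worker ID to an
// array of response strings.
function flagWorkers(responsesByWorker, maxShare, minResponses) {
  return Object.keys(responsesByWorker).filter(function (workerId) {
    const responses = responsesByWorker[workerId];
    return responses.length >= minResponses &&
      mostRepeatedShare(responses) > maxShare;
  });
}
```

A flagged worker still deserves a manual look before any rejection or ban, since some image sets really do call for the same tag many times.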

Implementing 1, 2, and 3 is dead-easy. Making a nice page with a list of good and bad examples will go a long way to fixing some of your problems. I think some workers might be inadvertently giving you poor tags due to lack of understanding.

Politically, using any of the tactics 2-8 can be slightly tricky. (Item 1 should be done by all Requesters. Don't forget, the "description" field cannot be seen by workers once they are in the HIT.) You don't want to scare off your best workers, nor stop people from trying your HITs. Don't be too threatening, or too strict. Workers get very upset if they feel wrongly slighted, and will happily share with all on the Turker Nation forums.

To end on a positive note:
You'll find that most workers really do want to give you exactly the high-quality response that you require. When paid well, we are eager to perfect our responses, and love having discussions and feedback. Continue a good dialog, and you will have a group of willing, quality workers in no time!

GeorgeTag Image Tagging Improved

ver. 1.2.2

last updated: 06/25/08

Update 06/25/08 (ver 1.2.2): Fixed to conform to Georgetag's new name TagCow.

Update 4/8/08 (ver. 1.2): Added capability for user-defined skip image keyboard shortcut, through a "User Script Commands" Greasemonkey option.

Update 4/7/08 (ver. 1.1.2): Fixed to work with new Georgetag group, titled "Image Tagging - Describe what you see. Earn a volume BONUS! Click here to see how."

Update 4/4/08 (ver. 1.1): Fixes 2 bugs and adds 2 features. Click here to see new features.

Update 4/3/08 (ver. 1.0.1): Requester changed the instructions and title of the HIT. New version works with both the old and the new instructions.

Description:

Does 5 things to the "Image Tagging" HITs by georgetag:

Places the 3 input fields next to each image (rather than below)
When "Does Not Contain Text" radio button is selected, the "text found in image" field is automatically filled in with "no text"
All images can be scaled based on a user-supplied percentage, to optimize the layout on your screen
Once text/no text radio button is selected, cursor is placed in appropriate text box.
Scroll through images using ALT+w, or a user-defined keyboard shortcut.

Screenshot of HIT with script installed.

The requester georgetag has uploaded a massive dump of image tagging HITs. They are offering a bonus based on the number of approved HITs, so we wanted to speed up the throughput of working on these HITs.

This script increases the efficiency of the Turker workflow by rearranging the layout and removing the redundancy of the "no text" radio button/text field.

Rearranging the layout of the HIT: Currently in these HITs, each image and its associated 3 input fields are aligned vertically. We have created a 2x5 table, so that the 3 fields for a given image appear to the right of the image. This arrangement not only reduces the amount of scrolling, but it also allows you to see the tag text field and the image together on the screen.
Auto-filling of "no text": These Image Tagging HITs have a redundant input, which is a time-waster for workers. Even when you click the radio button for "Does Not Contain Text", you must also fill out the second text field with "no text." This script automatically fills in the "text found in image" field with "no text" when the "Does Not Contain Text" radio button is clicked.
Resizing of Images: If you don't like the standard size of the images, you can resize all the images by a given percentage. I find that 75% makes the height of the image the same as the input boxes, but you might have a different preference. To set the scale, right-click on the Greasemonkey face in the bottom toolbar and select "User Script Commands." You will see a menu pop out with an item that says "Image Tagging image size (currently 100%)". If you click that item, a pop-up window appears, and you can enter the image scale as a percentage. For example, to scale all images to 50%, enter "50" in this box. The image scale will take effect on the next HIT you view.
New Feature 1: Refocusing of text boxes when radio buttons are checked. When you click the "Contains Text" radio button, the cursor is automatically placed in the "Enter ALL text found in image" field. When you select "no text", the cursor is automatically placed in the "tags" text box. No more clicking around! Select and start typing! (Don't forget, the TAB will jump to the next selection as well.)
New Feature 2: Press ALT+w to jump to the next photo. (No more scrolling!) The ALT+w jump iterates through the photos, and once you get to the end, it will put you back at the beginning. Note: the first ALT+w takes you to the second image (since the first was already in view). However, if you forget and scroll the page yourself, the script will only jump to what it thinks is the next image. For example, if you do the first two images using ALT+w and then move on to the 4th by manually scrolling, the next ALT+w will take you to the 3rd image. It's hard to describe, so you will have to play around with the feature.
User-defined skip image keyboard shortcut. You can change the keyboard shortcut to be a more convenient combination. See Instructions below for directions.
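The image scaling in the list above comes down to a little arithmetic. The sketch below checks the percentage typed into the pop-up and applies it to an image's width and height. The function names are hypothetical and this is not the script's actual code.

```javascript
// Hypothetical input check: accepts "50" or "50%", rejects anything that
// is not a positive number. Returns the percentage or null.
function parseScale(input) {
  const percent = Number(String(input).trim().replace(/%$/, ""));
  if (!isFinite(percent) || percent <= 0) {
    return null;
  }
  return percent;
}

// Apply the percentage to both dimensions so the image keeps its shape.
function scaleSize(width, height, percent) {
  return {
    width: Math.round(width * percent / 100),
    height: Math.round(height * percent / 100)
  };
}
```

For example, a 400 by 300 image at 75% becomes 300 by 225.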

Other important notes:

The Image Zoom add-on is a helpful extension for these HITs. If you set a small image scale, you can use Image Zoom (on right-click) to zoom in and out of each image for inspection.

Please note that this script was made on-the-fly, and will cease working if the Requester changes the HIT. In addition, we have only tested the script on Firefox ver. 2.0.0.13 and Greasemonkey 0.7.20080121.0. If you find bugs, please let us know, and we'll try to sort out what's going on, time permitting.

So you are aware, we have already had HITs approved by the requester for which we used this script.

Instructions:

You must have Firefox, with the Greasemonkey extension installed.

Image scale can be set via the Greasemonkey "User Script Commands." The image scale defaults at 100%.

You can change the skip image keyboard shortcut. From the User Script Commands menu in Greasemonkey, choose "Image Tagging change skip key." In the dialog box, you can type in your choice. The format is "ctrl" or "alt" or both ("ctrl+alt"), then the "+" key, then a letter or number. (Any character that requires the shift key won't work, but capital letters are ok.) Examples are: ctrl+k, or alt+z, or ctrl+alt+q. If you input an invalid choice, the dialog box will reappear. The new choice appears in the header of the HIT, and the default remains alt+w.

Tuesday, April 1, 2008

mTurk is for the dogs

Oh dear! The latest post at the AWS blog finally reveals in print how Amazon feels about the Workers on mTurk. (Something we have known all along! Where's that Worker mTurk API?) The guys in the development section have developed a DCI - The Dog Computer Interface.

They train the office dog, Rufus, to do mTurk hits. And what would a dog do with his reward? They are using dog biscuits instead of the Amazon Payments system, clearly violating the Terms of Service. No wonder our rejection stats have been down the tubes on the items HITs! I am so furious!

mTurk is for poor, starving Workers, period. I highly doubt the ultra-cute Rhodesian Ridgeback would have the troubles I have begging for food. I can't even get strangers to scratch behind my ears! Leave those penny bear HITs for me.

...and their DCI post was published on April 1. As is this one. ;-)

Although, the funniest jokes are ones that have a hint of truth in them.
