Numbers (and symbols) as word shortcuts in other languages

When I first got my phone I remember starting to use “text speak”, or rather “txt speak”. This refers to the shortening of words to save on precious characters so you could say as much as you wanted to in 160 characters before being charged double for sending two messages. This usually involved substituting numbers for word segments, which are called shortcut words (like 4ever – forever, 2day – today, gr8 – great etc), or making acronyms (lol, omg, brb) or just simply removing the vowels (ths s hrd t rd).

Image showing numberplate reading "lolzomg"

Shortcut words are interesting because they require you to read letters and then translate numbers into the sound they make, which is a different process. In this way, shortcut words are different to simply r3plac1ng l2tt2r5 w1th num83r5. There you are just treating each digit as the letter it looks like, whereas with shortcut words you’re required to think of the digit as a number, thus switching between reading letters and digits.

Anyway, txt speak is still used today, despite the fact that we can send endless messages on devices for free, and I hadn’t really given it a second thought…UNTIL NOW.

I was browsing Facebook and noticed that a German friend of mine was presenting in Berlin at the “Lange Nacht der Museen” or “Long Night of the Museums”, a fantastic and fun event where the museums stay open all night and you race around the city trying to see as much as you can*. On the poster for the event, the title was written as “Lange n8 der museen”.

N8!! Now my first thought when seeing that was “nate”. But in German, 8 is pronounced “acht”, so when read aloud, that letter/digit combination makes “nacht”, the German word for night. Incredible. I’d never before considered that there would be shortcut words in languages other than English, but of course there are!

Cue a Twitter investigation. Here are some more examples of shortcut words using numbers (and symbols, because I don’t want to discriminate) in languages other than English.

German

n8 – nACHT – “night”

French

k7 – kaSEPTE (kassette) – “cassette” as in “get Now 38 on CD and cassette!”

a+ – a PLUS – “see you”. The friend that told me this one moved to Belgium from the UK, she said for a while she thought her texts were getting graded.

9 – NEUF (neuve) – “new”

Spanish

enreda2 -> enredaDOS -> “tangled”. This is just one example, many words ending in “-dos” can use this shortcut

Italian

c6 -> ciSEI -> “are you there?”

6 3mendo -> SEI – TREmendo -> “you are terrible”

qualc1 -> qualcUNO -> “someone”

nes1 -> nessUNO -> “no one”

Polish

3maj się -> TRZYmaj się -> “see you”
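The pattern behind all of these examples can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the SOUNDS table and the expand_shortcut function are names I made up, the table holds just the digits that appear in this post, and it swaps each digit for its spoken form without handling letters that are read by name (the “k” in k7 or the “c” in c6).

```python
# Spoken form of each digit or symbol, per language.
# Only the digits used in the examples from this post are listed.
SOUNDS = {
    "en": {"2": "to", "4": "for", "8": "eat"},
    "de": {"8": "acht"},
    "fr": {"7": "sept", "9": "neuf", "+": "plus"},
    "es": {"2": "dos"},
    "it": {"1": "uno", "3": "tre", "6": "sei"},
    "pl": {"3": "trzy"},
}


def expand_shortcut(word, language):
    """Replace every known digit in `word` with the sound it makes.

    Characters without an entry for that language are left untouched.
    """
    sounds = SOUNDS[language]
    return "".join(sounds.get(character, character) for character in word)


if __name__ == "__main__":
    for word, language in [("gr8", "en"), ("n8", "de"), ("enreda2", "es"),
                           ("qualc1", "it"), ("3maj", "pl")]:
        print(word, "->", expand_shortcut(word, language))
```

Note that the same digit expands differently depending on the language you tell it to read in, which is exactly why my first reading of n8 was “nate”.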

 

Those are all the ones I managed to collect so far. But this will be an eternal quest for me now. Let me know if you know of any other ones. And thank you to everyone who provided these examples.

(As one final thought, this investigation has led me to this paper, which looks at whether or not numbers used in shortcut words still activate numerical cues, or whether that is suppressed because they take on a word form. The paper is available here and is very interesting!)

 

*A few years ago this ended up with me and my sister stood in the Bauhaus museum at 2am trying to discuss architecture. She was far better at this than I was.

Social Anxiety and Amazon Mechanical Turk

I just finished listening to Planet Money’s episode #600, titled “The People Inside Your Machine”. Here’s a link to the podcast episode. The show briefly discusses the origin of the MTurk program (something I didn’t know!). Planet Money is an economics podcast and so the hosts are particularly interested in the amount of money that Turkers are earning whilst completing jobs. It turns out the answer is “not that much”. The show begins to question the ethics of this, highlighting how difficult it is to earn a reliable wage from working on MTurk. But the focus then quickly turns to whether or not Turkers are essentially helping to train computers to take the jobs of humans.

One particular aspect of the episode that I found interesting was the investigation into the Turkers themselves, aiming to understand who they are. In this show, the presenters hide a “secret message”, as they call it, in the HIT they place on MTurk. This message asks the Turker to contact Planet Money and have a chat. Through these chats the presenters find out that most of their Turkers are in the US (not surprising seeing as Mechanical Turk requires you to be based in the US; anyone outside is doing something clever). One other trait that the presenters on the episode highlight is that many of their Turkers have problems with social anxiety, making Turk work far preferable to them than work outside the home. As one Turker in the episode puts it: “It’s enough to say that my temperament and the job market don’t see eye to eye.”

In HCI research, particularly at CHI recently, there have been many studies run using Turkers as participants, rather than using the standard bring-people-into-the-lab method of recruitment. I’m a big fan of this approach: I personally have begun to use Citizen Science style experiments in my research, asking people to complete experiments and learn a bit about themselves as a reward. The HCI community is currently dealing with issues of ethics; the HIT system on Mechanical Turk currently doesn’t come close to matching the rate that participants normally receive. Just because that’s standard on the site, does that make it right?

One big perceived positive of recruiting through MTurk is the diversity of potential participants. Whereas many HCI, and indeed Psychology, studies suffer from recruiting only WEIRD (Western, Educated, Industrialized, Rich, and Democratic) participants, MTurk represents a broader sample of the population (though still requiring certain things like access to a computer and an internet connection). There have been studies aimed at understanding the demographics of Turkers, showing that your average Turker is likely to be younger than 34 and earning less than $10k a year.

It’s important to know this information about the Turkers. When running studies with the aim to collect performance data about humans in particular tasks, you need to know who you’re testing on, and that they are representative of a larger population. But is it enough to just know this demographic information? The discussion in this podcast got me thinking.

How does a participant’s social anxiety affect their performance in a task? For many studies within HCI this is unlikely to be a problem: there isn’t a clear reason why clicking targets on a screen would be affected by such anxiety. However, other studies certainly would be, for instance studies asking about Facebook behaviour, or trust. Although definitely not a rigorous analysis or in-depth study into Turkers, this podcast highlights the possibility that Turkers could be more likely to be socially anxious than average. This means any results from studies into areas that would be affected by such a condition would be seriously compromised, and unrepresentative of the rest of the population. The conclusions drawn might suggest that people were more fearful or angry about certain situations than is necessarily true.

These studies, when run with the usual participant recruitment methods, would have no reason to suspect their participants were not representative of average social anxiety levels. However, here we have some evidence that such an assumption cannot be made when using Turkers as participants.

It’s not a groundbreaking conclusion to come to: that more information needs to be known about Turkers as they are used in HCI research. But this podcast has highlighted one key area that genuinely could be confounding the results of current experiments.

How To Get the Arduino Time Library up and going

I’m working on a new Arduino project at the moment and, for it to work, I need the Arduino to be able to tell the time.

To make this work, I decided to use the Arduino Time Library (specifically the updated Time library). I was going to use the TimeSerial sketch. I expected this would be an easy task, and indeed it did look straightforward, with nice simple functions to call up the hours (hour()), minutes (minute()) and seconds (second()).

However, despite this, I really struggled in working out what the actual time was. And I’m not sure if it was just that it was late at night, and my brain wasn’t functioning at its fullest, but I really could not find an easy guide anywhere. After lots of trial and error I worked it out, and here, for you, is the tutorial of how to flipping get the flipping Time Library to work.

Know the software to use

I didn’t get this at first, but you will need to use both the Arduino IDE and Processing. For some reason I thought you used one or the other. The Arduino IDE talks to the Arduino, and Processing is responsible for getting the time.

Know the programs to run

You will need to run two programs: TimeSerial (in Arduino IDE) and SyncArduinoClock (in Processing). Get these both opened in their respective software.


Edit the Processing sketch

Now what’s going to happen is that Processing is kind of going to “jump start” the Arduino and give it the time at a certain time point. The Arduino is capable of keeping time (if it has power) once it knows what the time is initially. So you need to get Processing talking on the right port. When I opened the sketch, the port line read “public static final short portIndex = 0;”, and on my machine port 0 is a bluetooth port. When you run it, it will try to talk to bluetooth, which wasn’t useful to me. But what was useful was that in the terminal window at the bottom of Processing, it told me what was on each of the ports. Reading this I saw the line “[4] "/dev/tty.usbserial-A600bMH2"” and so changed the line of code to read “public static final short portIndex = 4;” so that it was now talking to the right port.
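The port hunt I did by eye can be sketched in code. This is a minimal Python illustration rather than anything from the Processing sketch itself: find_port_index is a hypothetical helper, and every port name in the example list apart from the usbserial one is made up.

```python
def find_port_index(port_names, marker="usbserial"):
    """Return the index of the first port whose name contains `marker`.

    Returns None when no port matches, so the caller can fall back to
    reading the list by eye.
    """
    for index, name in enumerate(port_names):
        if marker in name:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical port list; only the last entry comes from my machine.
    ports = [
        "/dev/tty.Bluetooth-Modem",
        "/dev/cu.Bluetooth-Modem",
        "/dev/tty.Bluetooth-PDA-Sync",
        "/dev/cu.Bluetooth-PDA-Sync",
        "/dev/tty.usbserial-A600bMH2",
    ]
    print("portIndex should be", find_port_index(ports))
```

The number it finds is the value to put into the portIndex line. The marker text is an assumption too, since the Arduino may show up under a different name on other machines.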


Start your engines (in the correct order)

This step got me for a while: you need to start your sketches running in the correct order. If you start Processing first then Arduino complains that it can’t get to the port it wants. So, start the TimeSerial Arduino sketch in the Arduino IDE. If you open the Serial Monitor, you’ll see a message that says “Waiting for sync message”. This is because it is waiting for a message from Processing.

At this point, start the SyncArduinoClock Processing sketch. A box will hopefully appear that says “Click to send time sync”. Click it.
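For the curious, here is roughly what I believe gets sent when you click. My understanding of the TimeSerial example is that the sync message is the letter T followed by the number of seconds since 1 January 1970, written as plain text. Treat that format as an assumption and check it against your copy of the sketch. The function names below are my own, and this is a Python sketch of the message only, not the Processing code.

```python
from datetime import datetime, timezone

TIME_HEADER = "T"  # assumed header character for a sync message


def build_sync_message(when):
    """Build the text a sync click would send for the moment `when`."""
    return f"{TIME_HEADER}{int(when.timestamp())}\n"


def parse_sync_message(message):
    """Return the seconds-since-1970 value, or None if the text is not a sync message."""
    if not message.startswith(TIME_HEADER):
        return None
    digits = message[len(TIME_HEADER):].strip()
    return int(digits) if digits.isdigit() else None


if __name__ == "__main__":
    message = build_sync_message(datetime.now(timezone.utc))
    print(repr(message), "->", parse_sync_message(message))
```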


You should now see in the Arduino serial monitor that there is a lot of weird text appearing. That is because it’s trying to count for itself, and also receiving sync messages from Processing. Go back to Processing, and stop that sketch running. And back to Arduino, and hopefully, you should be getting a nice neat print out of the date, and time every second. 
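Why does one sync keep working after Processing has stopped? The idea, as far as I understand it, is that the Arduino remembers the time it was given and adds on however long it has been running since. Here is a small Python sketch of that arithmetic. The function names and the printout layout are my own approximation of what the serial monitor shows, not the library's code.

```python
from datetime import datetime, timezone


def current_time(sync_epoch, millis_at_sync, millis_now):
    """Seconds since 1970 now: the synced time plus the whole seconds elapsed since."""
    elapsed_seconds = (millis_now - millis_at_sync) // 1000
    return sync_epoch + elapsed_seconds


def clock_display(epoch):
    """Format a time as hour:minute:second day month year (in UTC)."""
    t = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return f"{t.hour}:{t.minute:02d}:{t.second:02d} {t.day} {t.month} {t.year}"


if __name__ == "__main__":
    # Synced at midnight on 1 January 2013 when the board had been up for 1 second;
    # the board's uptime counter now reads 6.5 seconds.
    print(clock_display(current_time(1356998400, 1000, 6500)))
```

This is also why the clock is lost when the power goes: both the remembered time and the running counter live only in memory.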

You’ll never be late for anything again.

Hope that helped someone avoid the long hours of struggle trying to understand how to get the stupid thing to work!

Google Translating Numbers

I came across a post on reddit recently (in the wonderful subreddit “mildlyinteresting”, I’d recommend it if you like your internet entertainment mild) in which someone had put some numbers into the Google Translate engine and got some interesting results. It turns out that when you write a list of numbers, each with a full stop after it, and tell Google you’re translating from Spanish to English, some odd things happen.

1. translates as 1., but then 2. translates as Two. (the same happens for 3./Three.) but then, and here’s where things get really mildly interesting, 4. translates as April, and the next four numbers are translated as months too.


Go home Google Translate, you’re drunk etc. The interesting thing about this is the settings that are required to make this happen. There have to be dots after every number; without them Google is boring and translates the numbers to numbers. This tells us something about why Google is treating “5.” as May. Google has presumably learnt (through user feedback and various algorithms) that when there’s a dot after a number, it’s likely to be a date. BUT JUST IN SPANISH. When you set the source language to things other than Spanish, Google now decides that 4. is 4. Why on earth should this be the case for just Spanish text? I do not know the answer, do you?

Other languages do produce interesting results: translating from Armenian causes semicolons and parentheses to appear all over the place (link here). Then there are those that mix and match when words or digits are used. When translated in Catalan, the first five numbers translate as “First. Two. Three. Four. 5.”. A translation in Romanian will have the first 9 numbers translated as months, apart from 3.

A Belarusian translation makes all the numbers ordinal, and rather cryptically translates 3. as “The Third”. (WHAT IS SO SPECIAL ABOUT 3?!) Is this a result of Kings and Queens in Belarusian history? A quick look at the Wikipedia article shows there are indeed some people who are “the third” but there are equally some “the seconds” and “the fourths” out there. Mysterious.


This translation quirk highlights the fact that a single digit can represent a whole range of things, depending on context. It is only because these numbers have been taken out of context that it seems odd to us. In the date today 20.8.13, it’s obvious that 8 stands for August, and that 20 should be read as twentieth (if you’re reading the date in the UK). When we see a digit, we see a whole range of different things. And this, dear readers, is one of the reasons that studying number entry is cool and interesting and a reason to be friends with me.
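To make the point concrete, here is a small Python sketch that reads today’s date the way I would in the UK. The function names are my own, and it assumes day.month.year order and a two-digit year in the 2000s, which is exactly the kind of context a bare digit doesn’t carry.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(number):
    """Return the number with its English ordinal suffix, e.g. 20 -> '20th'."""
    if 10 <= number % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return f"{number}{suffix}"


def read_uk_date(text):
    """Read 'day.month.year' as a UK reader would say it."""
    day, month, year = (int(part) for part in text.split("."))
    return f"{ordinal(day)} {MONTHS[month - 1]} {2000 + year}"


if __name__ == "__main__":
    print(read_uk_date("20.8.13"))
```

The 8 only becomes August because of where it sits between the dots. Swap the order to month.day.year and the same digits stop making sense as a date at all.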

Controlling Number Entry Using Sifteo Cubes

As you have probably gathered, I am really interested in Number Entry. I’ve been thinking a lot about how we enter numbers – whether we think of them as a series of digits, or as full numbers. This has implications for the way we might ask people to enter numbers on an interface.

I was given an opportunity to test this out recently at the CHI+MED interface hack day. Last Friday (9/8/13) a group from CHI+MED came together at Swansea University to hack some interfaces to investigate novel number entry methods.

I got a chance to play with some Sifteo Cubes. Sifteo Cubes are fun little blocks with screens that are aware of which other blocks they are next to. This allows for some fun physical interaction, with the user picking up and moving the blocks around.

At a previous idea generation session, we came up with the idea that you might want to let a user enter numbers on the Sifteo blocks using both a digit and a number strategy. The difference between the two is that you could either control the number digit-by-digit (that is, incrementing the digit 9 in the number 659 would wrap it round to 0 and result in 650) or control the whole number (that is, incrementing the digit 9 in 659 would carry over and result in 660).
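The two strategies can be written down in a few lines. This is a Python sketch of the arithmetic only, not the C++ that ran on the cubes, and the function names are my own. Here place 0 means the units digit, place 1 the tens, and so on.

```python
def increment_digit(value, place):
    """Digit strategy: bump one digit on its own, wrapping 9 round to 0 with no carry."""
    unit = 10 ** place
    digit = (value // unit) % 10
    new_digit = (digit + 1) % 10
    return value + (new_digit - digit) * unit


def increment_number(value, place):
    """Number strategy: add one unit at that place and let the carry ripple left."""
    return value + 10 ** place


if __name__ == "__main__":
    print(increment_digit(659, 0))   # digit strategy on the 9
    print(increment_number(659, 0))  # number strategy on the 9
```

The two only disagree when the digit being changed is a 9. For any other digit they give the same answer, which is part of what makes the difference easy to overlook.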

Using one Sifteo block as the “controller” and the other two as the numbers, I created a system to allow the user to enter numbers using any strategy they like. See the video for an example. You can see that they can control each block separately. When the blocks are joined, the entire number can be incremented (by placing the control block to the left or right) or just the digit can be changed (by placing the control block above or below).

Programming the Sifteo blocks was an interesting challenge. You need to program in C++ (a language I haven’t looked at for a few years) and at first, reading the example code was a bit daunting. But after a while I managed to hack it together. The really strange thing about doing this was working with images. Unlike many programs where you can directly write text (or numbers) to the screen, when programming for the Sifteo cubes you are dealing with lots of static images, and swapping them in and out as you need. This means that in my application in the video, I have 10 different image files, one for each digit.
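The image swapping boils down to a lookup from digit to picture. Here is the idea as a Python sketch. The file names and the images_for_number function are hypothetical, and the real thing was written in C++ against the Sifteo tools, which I am not reproducing here.

```python
def images_for_number(value, width=2):
    """Return one image file name per cube for the digits of `value`.

    The number is padded with leading zeros to `width` digits, one digit per cube.
    The 'digitN.png' naming is made up for this sketch.
    """
    text = str(value).zfill(width)
    return [f"digit{character}.png" for character in text]


if __name__ == "__main__":
    print(images_for_number(42))
    print(images_for_number(7))
```

Changing the number on screen then just means working out the new list and swapping in whichever images differ.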

I enjoyed the experience of hacking and playing with the blocks. I think there are some interesting questions that could be explored using them. We will see!

Distribution of dates

Once more, Randall (from xkcd.com) and I are on the same wavelength.

In his latest comic, he analyses the frequency of dates used throughout all of the internet. Go and look at the comic now.

In months other than September, the 11th is mentioned substantially less often than any other date. It's been that way since long before 9/11 and I have no idea why.

It is very similar to my work on the distribution of numbers and digits used to program infusion pumps. See the poster here, full journal paper, and amusing web comic to follow.