Hi everybody! I’m glad you’re here, but I don’t keep this blog updated anymore.

I’ve moved all my blog content to www.gwilymlockwood.com. You’ll be able to find all the links with gwilymlockwood.wordpress.com there – just take the “.wordpress” out of the link! You’ll see all the new stuff I’ve been writing since September 2016 over there.


Quantifying three years of a long distance relationship

I read two really useful guides to processing text data recently; an analysis of Trump’s tweets to work out whether it’s him or an intern sending them, and a sentiment analysis of Pride and Prejudice. Three years of a long distance relationship means that I have a nice big corpus of Whatsapp messages between my girlfriend and me, so I did the romantic thing and quantified some of our interactions in R. Also, this required quite a bit of text munging in Excel first, which turned out to be far quicker and easier than using regex in this case.

First of all, let’s look at when we text each other throughout the day. We’re in different time zones, but only by an hour, and since texts are inherently dependent – one text is overwhelmingly likely to lead to another pretty soon after – I haven’t adjusted the times.

text no by hour of day.png

Our texting activity represents our general activity pretty well; nothing much going on until about 7am, then a slow start to the day, a bit of a post-lunch dip, and then an evening peak when we’re more likely to be out and about doing things.

We can also have a look at how many messages we send each other, and how that’s changed over time:

text no by date.png

We’ve sent each other a fairly similar number of texts per day throughout the long distance period, but it looks pretty bad on me that I have consistently sent fewer texts than her…

…or does it? When I plot the length of each text sent, I consistently write longer messages:

text length by date.png

So, there’s two distinct texting styles here; I write longer messages less frequently, she writes shorter messages more frequently. The other thing I like about the text length graph is that you can see the times when we’ve been together and not texted each other that much; three weeks in November 2014 when I was running experiments in London, three weeks around Christmas 2015, and a load of long weekends throughout. It’s not that we don’t text each other at all then, it’s more that those texts tend to be stuff like “have we got milk?”, or simply “pub?”.

Plotting log likelihood ratios of how much each of us uses each word in comparison to the other also captures our texting styles:

top 20 words each (no names).png

For example, we both use the word /ha/ to express laughter, but I spell it “ha” and she spells it “hah”. Likewise, “til” and “till” as abbreviations for “until”, and I seem to use “somebody” while she uses “someone”.

If we filter out equivalent words and proper names (like the pubs, supermarkets, and stations we go to most often), another difference in dialogue style appears:

top 10 words each (no proper names).png

I am apparently a lot more conversational; I write out interjections (hmm, oooh, hey, ohhh) and reactions (fuck’s comes from for fuck’s sake, hoera comes from the Dutch phrase hiep hiep hoera, and boourns comes from, erm, The Simpsons). Apart from hhmmm, she doesn’t write interjections or contextual replies at all. Apart from the interjections and replies, my main thing is adjectives; she tends towards nouns and verbs.

The next step is sentiment analysis. If I plot log likelihood bars for each sentiment, I seem to be an atrociously negative person:

sentiment error bars.png

…but this, I think, is more a problem with the way sentiment analysis works in the syuzhet and tidytext packages using NRC sentiment data. Each word in the NRC corpus has a given value, 0 or 1, for a range of sentiments, and this sentiment analysis style simply adds it up for each word in a given set.

Because of that, it doesn’t really capture the actual sentiment behind the way we’re using these words. Let’s look at the main words driving the differences in each sentiment:

sentiment log likelihood words.pngFor me, a lot of my disgust and anger is coming from the word damn. If I was texting damn! every time I stubbed my toe or something, perhaps that would be accurate; but in this case, a lot of the time I write damn is in sympathy, as in exchanges like:

“My computer crashed this afternoon and I lost all the work I’d done today”
“Damn, that’s horrible”

Meanwhile, the word coop is actually me talking about the coöp / co-op, where I get my groceries. I’m not talking about being trapped, either physically or mentally.

The same goes for my girlfriend being more positive. With words like engagement and ceremony, she’s not joyous or anticipatory about her own upcoming nuptials or anything; rather, several of her colleagues have got engaged and married recently, and most of her uses of the words engagement and ceremony are her complaining about how that’s the only topic of conversation at the office. As for assessment, council, and teacher, she works in education. These are generally neutral descriptions of what’s happened that day.

So, I was hoping to be able to plot some sentiment analyses to show our relationship over time, but either it doesn’t work for text messages, or we’re really fucking obtuse. I think it might be the former.

Instead, I’ll settle for showing how much we both swear over time:

expletives per month.png

Each dot represents the number of occurrences per month of a particular expletive. I’m clearly the more profane here, although I do waver a bit while she’s fairly consistent.

More importantly is how we talk about beer a similar amount:

beer per month.png

Since couples who drink together stay together (or in the words of this study, “concordant drinking couples reported decreased negative marital quality over time”), I think this bodes pretty well for us.


Synthesised size/sound sound symbolism: the MS Paint version

Earlier today, I presented my CogSci paper Synthesised size/sound sound symbolism. The six page paper is short and to the point, but hey, it technically counts as a publication, so I figure it’d be remiss of me not to render it badly in MS Paint.

We start with the Japanese ideophone learning study. Participants learned some Japanese ideophones, half with their real Dutch translations, half with their opposite translations, and it turned out that they were way better at remembering the ideophones with their real translations. Importantly, this didn’t happen when we did exactly the same thing with regular, arbitrary adjectives. You can read that paper here, and download some of the experiment materials for it here if you want to try it yourself. You can also read the replication study we did with EEG here, and download the data and analysis scripts for that here.

The results looked a bit like this:


But while we can see that there is an obvious effect, we don’t know how it works. Is it a learning boost that people get from the cross-modal correspondences in the real condition?

match boost lerning

Or is it a learning hindrance that people get from the cross-modal clashes in the opposite condition?

mismatch hindrance learning

(or indeed, is it both?)

Without adding a neutral condition where the words neither obviously match nor mismatch their meanings, we don’t really know.

So, we created some synthesised size/sound sound-symbolic pseudowords, which is easier said than done. It’s well known that people associate voiced consonants and low, back vowels with large size, and voiceless consonants and front, high vowels with small size. This is probably because of the mouth shape you make when saying those sounds:

a and i

We created big-sounding words (like badobado), small-sounding words (like kitikiti), and neutral-sounding words which were halfway in between (like depedepe).

A neutral condition could tell us if it’s a graded effect…

graded effect.png

a match boost effect…

match boost.png

or a mismatch hindrance effect:

mismatch hindrance.png

Turns out it’s a match boost effect. Participants learned the match pseudowords (e.g. badobado and big, kitikiti and small) better than the neutral pseudowords (e.g. kedekede and big, depedepe and small), but there wasn’t a difference in how well they learned the neutral and mismatch pseudowords (e.g. godagoda and small, tikutiku and big).

results n=30

For good measure, we did it again with double the original sample size, because it’s nice to check things.

first experiment and replication blog picture

…and, yes, we found exactly the same thing:

results n=60

So, it looks like it’s a special match boost effect from the cross-modal correspondences, not a graded effect reflecting all cross-modal information.

This is a nice paradigm which can be easily altered to try with different languages or different stimuli, e.g. using big and small shapes rather than the words “big” and “small” in order to rule out letter confounds. I’ll put up all the Presentation scripts, synthesised stimuli, and analysis scripts once I’ve tidied them up a lot (if you think my MS Paint scribbles are messy, you should see my code). It should be pretty straightforward for anybody to download and redo this, so it’d make a good project for a Bachelors/Masters intern. I’d love to see this get taken further and tried out with useful variations and changes. But good luck coming up with a more satisfying title!


Rolling form and how Arsenal lost the Premier League title

Is form taken seriously in the media discussion of football? Every game preview acknowledges the results from a team’s last five games, often with green, grey, and red blobs so you can see how much they’ve won, drawn, and lost recently… but it’s rarely quantified beyond that. This is kind of understandable, since you get three points for a win regardless of what form you or your opposition have been in, but I feel like a bit more of a deep dive into form data is needed.

I’ve been scraping a lot of game data from last season’s Premier League with the ultimate goal of making an adjusted table for rolling form. Which teams beat form teams most often? Which teams benefited most from playing teams at a low ebb more often than other teams? Is rolling form a better predictor of a match winner than a team’s position?

This has thrown up an absolute glut of data that can (and will) fill several blog posts. For now, though, I’d like to focus on Arsenal’s woes last season.

One of the main media narratives around Arsenal is that they’re flat track bullies; they swat aside mediocre teams with ease, looking fluent and impressive while doing so, but they come unstuck against the top teams. These criticisms extend to the individual players, with Olivier Giroud and Per Mertesacker often held up as examples of almost-but-not-quite players; good enough to beat most teams, but not in the elite level that would take Arsenal past the Chelseas, the Manchester Uniteds, the Manchester Citys (despite finishing above all of them last season, but whatever).

Thing is, that’s completely untrue.

Here’s a graph of Arsenal’s wins, losses, and draws last season. The dots and lines represent each game, and are positioned on the y-axis according to Arsenal’s and their opposition’s form over the last five games (in mean points per game) just before each game. This excludes the first five matches of the season, which are always a bit dicey and rarely reflect the final standings.

Arsenal form plot.png

What strikes me most clearly is that Arsenal lost all their games to teams in average to poor form. Indeed, three losses were against teams who’d been averaging less than a point per game, which, if applied to a whole season, would set them out as clear relegation candidates. Moreover, around half these losses have come when Arsenal have been in a good run of form themselves, averaging two or more points per game (which, if applied to a whole season, would have seen them score 76 points, five more than they actually managed). Meanwhile, a lot of Arsenal’s wins actually came against teams who were doing well, averaging 1.5 points per game or more. This suggests that Arsenal were winning the difficult games and losing the easy games.

This isn’t much use on its own, so let’s look at Leicester and Tottenham for comparison.

Not only did Leicester lose less often – obviously – their losses never came against teams averaging less than one point per game. The same goes for Tottenham, who lost their games to teams in a fairly decent run of form.

We can also look at how Arsenal did according to their league position and the league position of their opposition just before the game.

Arsenal position plot.png

This plot is also pretty illuminating. The bulk of Arsenal’s losses came against teams in the bottom half of the table, while a lot of their record against teams in the top five when they played them was actually pretty good, winning four, drawing three, and losing only one. There’s also a nice split between their wins and draws which shows that Arsenal generally beat the lower table teams and generally drew with the high mid-table teams when they played them.

Again, let’s compare that with Leicester and Tottenham.

Leicester only lost to teams in the top half of the table. Tottenham lost to teams in the relegation zone twice, but the rolling form plot showed that these relegation zone teams had been doing pretty well at the time.

To get a look at the whole league, we can plot the mean rolling form and the mean league position for the opposition for each team in losses. That’s a confusing sentence; another way of phrasing it is saying that this is looking at the average form and average league position of the teams that each team lost to last season.

form plot for losses plus lm.png

position plot for losses plus lm.png

Both these graphs show a slight relationship between how well a team did overall and the nature of their losses – the better performing teams tended to lose to better opposition, i.e., by losing to teams with higher points per game in the last five games and teams who were higher in the table at the time.

All except Arsenal. In fact, Arsenal were the worst team least season in terms of losing to teams in poor form. The teams that Arsenal lost to were in the worst average form and in the lowest average position compared with losses by any other team in the league.

In short, this contradicts the main narrative of Arsenal not being good enough against the top teams. Rather, Arsenal aren’t good enough against the bad teams, and lost out on the league by losing to teams in poor form in the relegation zone. Next time Arsenal play a poorly-performing bottom-table team, maybe a Hull or a Middlesborough in a bit of a rut, I’ll stick a tenner on Arsenal to lose; they’re only a little less likely to lose than they are to win.

wenger water bottle.gif


ERP graph competition!

A while back, I blogged about creating better ERP graphs. The data I used to generate those graphs has now been published, and all the data and analysis scripts from that paper are available to download here.

One of the things I find most fun about research is playing about with plotting graphs. There are so many different ways to visualise ERP data that it’s hard to pick one. I played around with a few different styles for my Collabra paper, and settled for plotting the two conditions with 95% confidence intervals. I was also tempted to plot the individual ERPs, as shown in the second plot, but felt the CIs were cleaner and more useful.

However, I also liked Guillaume’s approach of plotting the grand average difference wave and the difference waves for individual participants. I’ve done it with and without 95% CIs in plots 3 and 4. The difference in those two is that the green/orange distinction shows significance; the 320-784ms window was significant, the rest of the epoch wasn’t.

But I’d love to see how these can be improved! All my graphs and all the code needed are below. You can download eegdata.txt from my OSF page, but be careful not to click on the file name itself, otherwise your browser will probably freeze:

osf download instructions

After that, load it into R, and have at it. The only extra packages I use to make the graphs in this blog are ggplot2, ggthemes, tidyr, and dplyr.

If you think you can do better, email me your graphs and code to gwilym(dot)lockwood(at)mpi(dot)nl. I’ll post another blog in a few weeks with my favourite contributions… and I’ll buy the winner a beer :)


Parietal electrodes, title, all trials, lines and 95pc, capitals in legend (29-3-16 submission)

eegdata$condition <- gsub("real", "Real", eegdata$condition)
eegdata$condition <- gsub("opposite", "Opposite", eegdata$condition)
# create dataframe by measuring across participants
dfsmallarea <- filter(eegdata, smallarea == "parietal")
dfsmallarea <- aggregate(measurement ~ smallarea*time*condition, dfsmallarea,mean)
# work it out for all trials
parietaldf <- filter(eegdata, smallarea == "parietal")
parietaldf <- aggregate(measurement ~ time*participant*condition*electrode, parietaldf,mean)
std <- function(x)sd(x)/sqrt(length(x))
# i.e. std is a function to give you the standard error of the mean
# this is the standard deviation of a sample divided by the square root of the sample size
SD <- rep(NA,length(dfsmallarea$time))       # creates empty vector for standard deviation at each time point, which will be huge
SE <- rep(NA,length(dfsmallarea$time))       # creates empty vector for standard error at each time point
CIupper <- rep(NA,length(dfsmallarea$time))  # creates empty vector for upper 95% confidence limit at each time point
CIlower <- rep(NA,length(dfsmallarea$time))  # creates empty vector for lower 95% confidence limit at each time point
for (i in 1:length(dfsmallarea$time)){
  something <- subset(parietaldf,time==dfsmallarea$time[i] & condition==dfsmallarea$condition[i], select=measurement)
  SD[i] = sd(something$measurement)
  SE[i] = std(something$measurement)
  CIupper[i] = dfsmallarea$measurement[i] + (SE[i] * 1.96)
  CIlower[i] = dfsmallarea$measurement[i] - (SE[i] * 1.96)
dfsmallarea$CIL <- CIlower
dfsmallarea$CIU <- CIupper
# Now let's plot things, starting with all trials

colours <- c("#D55E00", "#009E73")  # colourblind friendly - red/orange for opposite, green for real
dfsmallarea$time <- as.integer(as.character(dfsmallarea$time))plot <- ggplot(dfsmallarea, aes(x=time, y=measurement, color=condition)) + 
  geom_line(size=1, alpha = 1)+
  scale_linetype_manual(values=c(1,1) )+  #, guide=FALSE)+
  scale_y_continuous(limits=c(-7, 7), breaks=seq(-7,7,by=1))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - all trials")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_bw() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)plot+ theme(plot.title=  element_text( face="bold"), axis.text = element_text(size=8)) +
  geom_smooth(aes(ymin = CIL, ymax = CIU, fill=condition), stat="identity", alpha = 0.3) + #95% CIs
  scale_fill_manual(values=colours)+ #, guide=FALSE) +
  scale_colour_manual(values=colours) #, guide=FALSE)

Created by Pretty R at inside-R.org


Parietal electrodes, all participants at once, all data (size 0.3, alpha 0.3) (lighter)

temp <- filter(eegdata, smallarea == "parietal")
temp <- select(temp, smallarea,time, measurement, condition, participant)
temp <- aggregate(measurement ~ smallarea*time*condition*participant, temp,mean)
temp2 <- aggregate(measurement ~ smallarea*time*condition, temp,mean)
temp2$participant <- "Grand Average"
temp2$GA <- "Grand Average"
temp$GA <- "individuals"
temp2 <- select(temp2, smallarea,time, measurement, condition, participant, GA)
temp3 <- rbind(temp, temp2)
temp3$conbypar = paste(temp3$condition, temp3$participant, sep="")
temp3$time <- as.integer(as.character(temp3$time))plot <- ggplot(temp3, aes(x=time, y=measurement,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3, aes(colour=condition))+
  geom_line(data = subset(temp3, GA == "Grand Average"), size=1.5, alpha=1, aes(colour=condition)) +
  scale_colour_manual(values=colours, guide=FALSE)+
  scale_y_continuous(limits=c(-15, 15), breaks=seq(-15,15,by=3))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - individual participants for all trials")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)plot+   theme(plot.title=  element_text( face="bold"),
              axis.text = element_text(size=8))

Created by Pretty R at inside-R.org


Parietal electrodes, diffwave and individual diffwaves

temp <- filter(eegdata, smallarea == "parietal")
temp <- select(temp, smallarea,time, measurement, condition, participant)
temp <- aggregate(measurement ~ smallarea*time*condition*participant, temp,mean)
temp2 <- aggregate(measurement ~ smallarea*time*condition, temp,mean)
temp2$participant <- "Grand Average"
temp2$GA <- "Grand Average"
temp$GA <- "individuals"
temp2 <- select(temp2, smallarea,time, measurement, condition, participant, GA)
temp3 <- rbind(temp, temp2)
temp3$conbypar = paste(temp3$condition, temp3$participant, sep="")
temp4 <- spread(temp2, condition, measurement) # unmelt to create diffwave calculation
temp5 <- spread(temp, condition, measurement) 
temp4$diffwave <- temp4$real - temp4$opposite
temp5$diffwave <- temp5$real - temp5$opposite
temp6 <- rbind(temp4, temp5)
temp6$conbypar = paste(temp6$condition, temp6$participant, sep="")
temp6$sig <- ifelse(temp6$time %in% c(320:784), "yes", "no")
temp6$time <- as.integer(as.character(temp6$time))
sigcolours <- c("#D55E00", "#D55E00", "#009E73")plot <- ggplot(temp6, aes(x=time, y=diffwave,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3)+
  geom_line(data = subset(temp6, GA == "Grand Average"), size=2, alpha=1, aes(colour=sig)) +
  scale_colour_manual(values=sigcolours, guide=FALSE)+
  scale_y_continuous(limits=c(-16, 16), breaks=seq(-16,16,by=2))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - difference wave and individual difference waves")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)plot+   theme(plot.title=  element_text( face="bold"),
              axis.text = element_text(size=8))

Created by Pretty R at inside-R.org


Parietal electrodes, diffwave and individual diffwaves plus CIs

for (i in 1:length(temp6$time)){
  something <- subset(temp6,time==temp6$time[i], select=diffwave)
  SD[i] = sd(something$diffwave)
  SE[i] = std(something$diffwave)
  SDupper[i] = temp6$diffwave[i] + SD[i]
  SDlower[i] = temp6$diffwave[i] - SD[i]
  SEupper[i] = temp6$diffwave[i] + SE[i]
  SElower[i] = temp6$diffwave[i] - SE[i]
  CIupper[i] = temp6$diffwave[i] + (SE[i] * 1.96)
  CIlower[i] = temp6$diffwave[i] - (SE[i] * 1.96)
temp6$CIL <- CIlower
temp6$CIU <- CIupper
temp6$sig <- ifelse(temp6$time %in% c(320:784), "yes",ifelse(temp6$time %in% c(-200:318), "no1",
temp6$time <- as.integer(as.character(temp6$time))
sigcolours <- c("#D55E00", "#D55E00", "#009E73")plot <- ggplot(temp6, aes(x=time, y=diffwave,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3)+
  geom_line(data = subset(temp6, GA == "Grand Average"), size=2, alpha=1, aes(colour=sig)) +
  scale_colour_manual(values=sigcolours, guide=FALSE)+
  scale_y_continuous(limits=c(-16, 16), breaks=seq(-16,16,by=2))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - difference wave and individual difference waves")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)plot+   theme(plot.title=  element_text( face="bold"),
              axis.text = element_text(size=8))plot+ theme(plot.title=  element_text( face="bold"), axis.text = element_text(size=8)) +
  geom_smooth(data = subset(temp6, GA == "Grand Average"), linetype=0, aes(ymin = CIL, ymax = CIU,group=sig, fill=sig), stat="identity", alpha = 0.5) + #95% CIs
  scale_fill_manual(values=sigcolours, guide=FALSE)

Created by Pretty R at inside-R.org


How iconicity helps people learn new words: the MS Paint version

I have a new paper out!

Gwilym Lockwood, Peter Hagoort, and Mark Dingemanse. “How Iconicity Helps People Learn New Words: Neural Correlates and Individual Differences in Sound-Symbolic Bootstrapping.” Collabra 2, no. 1 (July 6, 2016). doi:10.1525/collabra.42.

The paper can be read and downloaded right here:

…and because we’re doing the whole open thing properly, you can also download the original stimuli, data, and analysis scripts here. You probably have no intention of sifting through it all, but the point is that you can:

The experiment was pretty similar to my Sound symbolism boosts novel word learning paper from a few months ago, except that this time it was only with the ideophones, and I used EEG to measure participants’ brain activity while they learned them. People learned 19 ideophones with their real Dutch translations and 19 ideophones with their opposite Dutch translations. After I told them about that and said sorry for the deception, they heard all the ideophones again and had to guess what the real translation was from a choice of two antonyms.

The first important thing is that the results were almost identical. People got the answers right 86.7% of the time for the ideophones in the real condition, and 71.3% of the time for the ideophones in the opposite condition. When they had to guess, they got  it right 73% of the time. These figures replicate the first study very closely (86.7% to 86.1%, 71.3% to 71.1%, and 73% to 72.3%), which is excellent news.

accuracy for JEPLMC and Collabra experiments - jitterdots (0-100) 9by7 for APS presentation

All kinds of things can happen in scientific studies, so replicating a study is really important for showing that the effect is real and not just coincidental. Sadly, replications aren’t considered to be very glamorous, so a shocking amount of published science is either unreplicated or unreplicable:

first experiment and replication blog picture

The second part of this new paper is that I also measured people’s brain activity using EEG. Once you average all the trials together, you get a signal of changing activity over time in response to a thing, which is known as an event-related potential. It looks a bit like this, and isn’t that useful by itself:

erp picture single line

Instead, you have to compare two conditions. If they differ at a certain point, that’s what tells you about how the brain processes things:

erp picture two lines

In this experiment, I found a big P3 effect in the test round:

figure 8 - Parietal electrodes, title, all trials, lines and 95pc, capitals in legend (29-3-16 submission)

The P3 is linked to memory and learning, so it’s not surprising that it came up in a task involving memory and learning. A lower P3 is linked to things being more difficult to learn, so again, it’s not surprising that the ideophones in the opposite condition have a lower P3 when they were harder to learn.

But, it wasn’t that simple. If it was a straight up learning effect, you would expect a correlation between how well people did in a condition and that condition’s P3 amplitude; people with lower test round scores in the opposite condition should have a lower P3 amplitude in the opposite condition. But they don’t.

However, there was a correlation between how sensitive participants were to sound symbolism (as measured by their meaning guessing accuracy in the task after the test) and how big the ERP difference between conditions was. When I split the participants into two groups (people scoring above and below the 73% average in the guessing task) and plotted their ERPs separately, it turns out that the P3 effect is big for the people who are more sensitive to sound symbolism and barely there at all for the people who are less sensitive to sound symbolism:

het talige brein picture

You can also see that the P3 amplitude in the real condition was the same across both groups. What’s behind the difference is how the amplitude in the opposite condition changes. This suggests that most people can recognise cross-modal correspondences and exploit them in word learning, but that some people are more sensitive to sound symbolism and get put off by cross-modal clashes as well.

I reckon that the variation in how sensitive people are to sound symbolism goes a little like this:

model of ss and cross modal perception

…and we do have some preliminary data from a massive cross-modal perception and synaesthesia study that shows that synaesthetes are better at the Japanese ideophone guessing game than regular people, but that’s another blog for another time.


Portugal: the worst Euro finalists ever

I haven’t been impressed with Portugal this tournament (and not just because I’m a bitter Welshman). They haven’t been very good; never really impressing in any of their matches, winning without dominating in the knock outs and held to draws by Iceland and Hungary in the group stages.

I’m pretty sure they’re the worst European Championship finalists I’ve seen, and perhaps the worst ever. But how can you measure how underwhelming a team is?

Step forward Elo ratings. Far better than the joke that is the FIFA rankings, Elo ratings adjust after every single international match based on the teams’ previous Elo ratings. For example, before Iceland and England faced off last week, Iceland had an Elo rating of 1688 and England had an Elo rating of 1929. After Iceland beat England, they exchanged 40 points – Iceland’s Elo rating went up 40 points to 1728, and England’s Elo rating went down 40 points to 1889. This is calculated using the status of the match, the number of goals scored, the result, and the expected result (more on that here, and have a browse of some more examples here).

I’ve taken every finalist (not including the winner of France vs. Germany, which kicks off about an hour from the time of writing) of the Euros since 1984, which was the first tournament where there were knock-out matches (before that, there was a group stage round-robin and the top two teams played off in a final). I’ve calculated each finalist’s mean Elo score throughout the tournament – not including the final itself – as well as calculating the mean Elo score of all the teams they played along the way.

That’s plotted right here:

Elo graph annotated

The dot size and colour shows the difference in each finalist’s Elo score from the start of the tournament until after the semifinal;the larger and lighter the dot, the more the team improved on their way to the final.

France in 2000 are probably the best team to reach a final – not only were they an excellent team at the time, they also beat a lot of strong opposition to get there. Spain in 2012 were the strongest team, but the opposition they faced wasn’t as tough. Greece in 2004 were the weakest team, but they beat some really strong teams along the way, which makes their achievement really impressive.

Portugal this year are dropping off the graph on the bottom left. They aren’t a high quality side – barely better than Greece in 2004 or Denmark in 1992 – but they haven’t been beating any impressive opposition either. They’ve not been that good, and they’ve had the weakest path to the final of any Euros finalist.

So, yes, it looks like my hunch was right – Portugal are the most underwhelming team to reach the final of a European Championship ever since there’s been a proper knock-out round. Sorry, Ronnie.

sad Ronaldo


Visualising competition in the Premier League and La Liga

Apologies, it’s taken me far too long to get round to writing this.

A little while ago, I set up a quiz to see whether people could accurately guess whether a spatially-plotted league table was from the Premier League or La Liga. The idea was to see if the (British) stereotype of the Premier League being competitive and La Liga being Barcelona and Real Madrid plus eighteen others was born out by the league tables. I did this because I’m pretty bored of reading comments like this:

“Right, yeah, English teams are bobbins in the Champions League because the Premier League might not have the highest quality, but it’s the best league in the world because of the competition and how close all the teams are. You never know who’s going to win it! It’s not like in Spain, where it’s always going to be Barcelona or Real Madrid winning it by a mile, and all the other teams don’t even matter.”

According to this stereotype, a spatial dotplot of the Premier League would look like this:

premier league stereotype

…while a spatial dotplot of La Liga would look like this:

la liga stereotype

So, I did what any reasonable person would do; I scraped a load of data, visualised the league tables as dotplots in random order, and created an online quiz to see if people could guess which league was which. If the differences between the two leagues are obvious, it should be easy enough to tell which league is which from a dotplot. The rest of this blog is basically a love letter to ggplot2.

People saw graphs like this, and simply had to say whether it was La Liga – Premier League or Premier League – La Liga:

League positions no lines no leagues 2008 classic

In this case, La Liga is on the left, and the Premier League is on the right. It’s the 2008-09 season, when Barcelona finished nine points ahead of Real Madrid and Manchester United finished four points ahead of Liverpool. The position of the league on the x-axis was shuffled, so that each league was on the left side and the right side five times each, and the questions were presented to people in a random order.

172 people did the quiz, and scored an average of 62%.

The most well-answered question was the 2011-12 season, where 74% of people correctly answered that La Liga is on the left and the Premier League is on the right:

League positions no lines leagues 2011 classic

…and the least well-answered question was the 2006-07 season, where only 35% of people correctly answered that La Liga is on the left and the Premier League is on the right:

League positions no lines leagues 2006 classic

This wasn’t a perfect psychology experiment. People doing the quiz already knew what it was about and were aware that I was looking into the stereotype, but it is a useful demonstration that it’s not easy to guess which league is which. Perhaps the stereotype isn’t entirely accurate. Indeed, if we take the ten seasons in the quiz plus the season just finished and do some descriptive statistics, the mean number of points in the Premier League is 52.1 and the mean number of points in La Liga is 52.4. while the standard deviation of points is 17.0 in the Premier League and 16.3 in La Liga. This would suggest that there’s no real difference in how widely the points are spread across the teams.

However, there might be a general trend when we look at all of them put together. If we take the ten seasons in the quiz, plus the season just finished, and plot the number of points per position per season, we get this graph:

League positions no lines leagues last 11 years classic

This actually does suggest that there may be something in the stereotype of the top teams in La Liga pulling away from the rest of the league. The dots at the top of the league are stretched higher than they are in the Premier League.

Similarly, we can look at the mean number of points per position in the last 11 seasons:

Mean league points per position no lines leagues last 11 years classic

…and this also seems to show that the top teams in La Liga are spread out a bit further than the top teams in the Premier League.

Sure enough, when we look at the stats for the top four teams, it shows a bit more spread; the mean number of points for the top four teams is very similar – 78.4 in the Premier League, 78.1 in La Liga – but the standard deviation is 8.2 in the Premier League and 12.1 in La Liga.

One last thing we can look at plotting is goal difference. Here’s the goal difference per position for each of the last 11 seasons (a goal difference of 0 is shown by the horizontal black line):

GD no lines leagues last 11 years classic

Again, the top teams in Spain seem to be more spread out compared to the rest of the league than the top teams in England are. In fact, when we average it together for the last 11 seasons, we get this:

Mean gd per position no lines leagues last 11 years classic

…which shows that, on average, the second-placed team in La Liga generally has a better goal difference than the winner of the Premier League (57.6 to 51.4), but that the third-placed team in La Liga generally has the same goal difference as the fourth-placed team in the Premier League (30.8 and 30.2 respectively).

How about combining these graphs, with points up the y-axis and goal difference denoted by dot size?

exaggerated mean pts and gd per position no lines leagues last 11 years classic

It seems that, on average, there may be some truth in the stereotype. The top two teams in La Liga dominate their competitors, both in terms of points and goal difference, whereas there seems to be less separating the top teams in the Premier League.

…and just for bonus points, let’s do the same thing for the top five European leagues over the last 11 seasons. This time it’s done by points per game rather than points, as the Bundesliga only has 18 teams:

exaggerated mean pts per game and gd per position no lines all five leagues last 11 years classic

It looks like the Premiership does have the most competitive title race after all.


Dutch lessons for Brexit.

I don’t post political things. Not because I’m 26 and apathetic like much of my generation, but because most political shit I see is virtue-signalling rather than actual discussion, and I don’t want to contribute to that.

I’m making an exception today.

I’m British and I voted for the UK to stay in the EU. This wasn’t a difficult decision for me; while I do agree with some of the arguments about the problems of the EU, I don’t think that leaving the EU solves any of those problems. Thing is, though, this vote was never really about EU membership. It’s a domestic British revolution expressed through the wrong international forum.

It’s a strange experience to watch this develop from somewhere else. I live in the Netherlands, where everything is continuing as normal and I can’t get a sense of what the atmosphere in the UK is like. I’ve had a few pro-EU friends from home say things like “this fucking country, I’m so jealous you get to miss out on this” or “I bet you’re glad you’re outside the UK right now!”.

I’m not. I might be the kind of middle-class multilingual metropolitan lefty liberal prick that the anger of a lot of Brexit voters is directed against, but the UK is still my country, my culture, my people, and I hate seeing it tearing itself apart. I wish I was back there now. I think that people who say “I don’t like X, I’m moving to Y”, whether it’s Brits upping sticks because of Brexit, or Americans moving to Canada because of Trump, is sort of cowardly.

And funnily enough, this is something that living in the Netherlands for three years has taught me.

British people are fucking weird about being British. Vast swaths of the country are embarrassed about being British, and English doubly so. It’s not new either. George Orwell wrote about it in the 1940s:

England is perhaps the only great country whose intellectuals are ashamed of their own nationality. In left-wing circles it is always felt that there is something slightly disgraceful in being an Englishman and that it is a duty to snigger at every English institution, from horse racing to suet puddings. It is a strange fact, but it is unquestionably true that almost any English intellectual would feel more ashamed of standing to attention during “God Save the King” than of stealing from a poor box.

The longer I spend in the Netherlands, the more I realise how ridiculous this is. I’ve seen a lot of people writing a lot of things recently about how patriotism leads to division, how flying the British flag is the start of a slippery slope that leads to white people shouting at brown people to fuck off home. Three years ago, I’d have agreed, but it’s arse about face and it’s taken me moving to the Netherlands to see that.

We need small acts of patriotism, of banal nationalism, because it’s the small overt acts of patriotism that allow a country to positively define itself. More than that, the small communal acts of patriotism are actually tools for integration and inclusivity, not against; it’s a signal that says “hey, this is who we are, this is what we do, and taking part will make you welcomed and accepted”.

The Dutch get this. They’re a country comfortable in their own skin, or at least, far more so than the UK. On Koningsdag, the Dutch dress up in orange, sell second-hand shit in parks, and get drunk in an overt celebration of Dutchness in its boozy, thrifty, orange glory. Through British lenses, this should naturally develop into Dutch people laying out markers of what isn’t Dutch and excluding anybody who doesn’t fit, but it’s not, it doesn’t work like that. Small acts of patriotism through which Dutch identity and nationality is positively expressed allows people to integrate better. I stick on my two-stripe 1974 shirt and get drunk with Dutch people, and in doing so, this is me, an immigrant on benefits who doesn’t speak the language that well, taking the chance to show that I want to be part of this place, and in doing so, this is the Dutch accepting me. That’s not to say the Dutch don’t have their problems… but I’ve never seen a British Het land van.

Without this, how can an immigrant – especially an immigrant from a much more culturally distant background – integrate into the UK if we’re too embarrassed to show what being British actually means? If you’ve moved to a different country, you’re probably there because you want to be. Without a template for integration, a forum for showing your enthusiasm to be here, these tensions feel inevitable.

And it’s more than immigration. This atmosphere, this feeling, this communal gezelligheid of Dutchness, creates an atmosphere where Dutch people care about Dutch people more. I don’t think it’s a coincidence that the Netherlands has some of the lowest economic inequality in Europe. Meanwhile, think of the message that this embarrassment sends to a lot of British people whose lives are pretty shit – not only are we doing nothing to address your problems, we’re telling you that your national identity, which is pretty much all you have to hold on to, is something to be ashamed of.

I have a horrible feeling that we’re going to waste this crisis. Because it’s not just a crisis, it’s an opportunity, an opportunity for the real, far-reaching reform that we so desperately need. Many people who voted to leave the EU say that this is a fundamental issue of democracy. Brilliant. Let’s take that enthusiasm and establish a 21st century proportional voting system (fuck, I’d settle for a 20th century voting system, it’s still more up to date), and let’s get rid of the House of Lords. Watching everything unfold after this referendum, the fact that four million people voted UKIP last year and got one MP was a deafening sign of things to come. Many people who voted to leave the EU say that this is about reinvestment in our own people rather than supporting other countries. Not sure I agree, but fine, let’s fucking do that, let’s take this imaginary £350m a week and renationalise public services, build affordable council housing, extend the NHS, support British industry, and decentralise everything from London so that the rest of the country actually has stuff to do.

This is Caroline Lucas’ time. If she can run with the Green party’s current policies and acknowledge and celebrate British (and especially English) patriotism, then we might just make something of this.

Sadly, nothing will happen if we can’t work out who we are. And if we don’t, then the UK really won’t be anything to be proud of.