Hi everybody! I’m glad you’re here, but I don’t keep this blog updated anymore.

I’ve moved all my blog content to www.gwilymlockwood.com. You’ll be able to find all the links with gwilymlockwood.wordpress.com there – just take the “.wordpress” out of the link! You’ll see all the new stuff I’ve been writing since September 2016 over there.


Quantifying three years of a long distance relationship

I read two really useful guides to processing text data recently: an analysis of Trump’s tweets to work out whether it’s him or an intern sending them, and a sentiment analysis of Pride and Prejudice. Three years of a long distance relationship means that I have a nice big corpus of WhatsApp messages between my girlfriend and me, so I did the romantic thing and quantified some of our interactions in R. Also, this required quite a bit of text munging in Excel first, which turned out to be far quicker and easier than using regex in this case.

First of all, let’s look at when we text each other throughout the day. We’re in different time zones, but only by an hour, and since texts are inherently dependent – one text is overwhelmingly likely to lead to another pretty soon after – I haven’t adjusted the times.

text no by hour of day.png

Our texting activity represents our general activity pretty well; nothing much going on until about 7am, then a slow start to the day, a bit of a post-lunch dip, and then an evening peak when we’re more likely to be out and about doing things.

We can also have a look at how many messages we send each other, and how that’s changed over time:

text no by date.png

We’ve sent each other a fairly similar number of texts per day throughout the long distance period, but it looks pretty bad on me that I have consistently sent fewer texts than her…

…or does it? When I plot the length of each text sent, I consistently write longer messages:

text length by date.png

So, there are two distinct texting styles here: I write longer messages less frequently, she writes shorter messages more frequently. The other thing I like about the text length graph is that you can see the times when we’ve been together and not texted each other that much; three weeks in November 2014 when I was running experiments in London, three weeks around Christmas 2015, and a load of long weekends throughout. It’s not that we don’t text each other at all then, it’s more that those texts tend to be stuff like “have we got milk?”, or simply “pub?”.

Plotting log likelihood ratios of how much each of us uses each word in comparison to the other also captures our texting styles:

top 20 words each (no names).png

For example, we both use the word /ha/ to express laughter, but I spell it “ha” and she spells it “hah”. Likewise, “til” and “till” as abbreviations for “until”, and I seem to use “somebody” while she uses “someone”.
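As a rough illustration of what a log likelihood comparison involves, here's a minimal sketch using one common formulation from corpus linguistics. The word counts, the totals, and the function name log_likelihood are all made up for the example, and this isn't necessarily the exact calculation behind the plot above.

```r
# Toy word counts (invented for illustration, not taken from the real corpus).
counts <- data.frame(
  word = c("ha", "hah", "beer"),
  me   = c(120, 2, 40),
  her  = c(3, 150, 45)
)
total_me  <- 10000  # hypothetical total number of words I sent
total_her <- 12000  # hypothetical total number of words she sent

# Log likelihood (G2) for one word across two corpora, signed so that
# positive values mean the word is relatively more common in corpus a.
log_likelihood <- function(a, b, total_a, total_b) {
  expected_a <- total_a * (a + b) / (total_a + total_b)
  expected_b <- total_b * (a + b) / (total_a + total_b)
  # treat 0 * log(0) as 0 so a word one person never uses doesn't give NaN
  term <- function(obs, exp) ifelse(obs == 0, 0, obs * log(obs / exp))
  g2 <- 2 * (term(a, expected_a) + term(b, expected_b))
  ifelse(a / total_a >= b / total_b, g2, -g2)
}

counts$ll <- log_likelihood(counts$me, counts$her, total_me, total_her)
counts[order(-counts$ll), ]
```

With these toy numbers, "ha" gets a large positive score, "hah" a large negative one, and "beer" sits close to zero because both of us use it at a similar rate.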

If we filter out equivalent words and proper names (like the pubs, supermarkets, and stations we go to most often), another difference in dialogue style appears:

top 10 words each (no proper names).png

I am apparently a lot more conversational; I write out interjections (hmm, oooh, hey, ohhh) and reactions (fuck’s comes from for fuck’s sake, hoera comes from the Dutch phrase hiep hiep hoera, and boourns comes from, erm, The Simpsons). Apart from hhmmm, she doesn’t write interjections or contextual replies at all. Apart from the interjections and replies, my main thing is adjectives; she tends towards nouns and verbs.

The next step is sentiment analysis. If I plot log likelihood bars for each sentiment, I seem to be an atrociously negative person:

sentiment error bars.png

…but this, I think, is more a problem with the way sentiment analysis works in the syuzhet and tidytext packages using NRC sentiment data. Each word in the NRC corpus has a given value, 0 or 1, for a range of sentiments, and this sentiment analysis style simply adds it up for each word in a given set.
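To show what that adding-up looks like, here's a minimal base R sketch with a toy lexicon. The 0/1 flags and the example messages are invented for illustration and aren't copied from the NRC data, and the real packages do the matching for you.

```r
# Toy lexicon in the NRC style: each word is flagged 0 or 1 per sentiment.
# These flags are invented for illustration, not copied from the NRC data.
lexicon <- data.frame(
  word    = c("damn", "coop", "ceremony"),
  anger   = c(1, 1, 0),
  disgust = c(1, 0, 0),
  joy     = c(0, 0, 1)
)

messages <- c("Damn, that's horrible", "Going to the coop", "damn damn")

# Lower-case, strip punctuation, and split on whitespace.
words <- unlist(strsplit(gsub("[^a-z' ]", "", tolower(messages)), "\\s+"))

# Keep only words found in the lexicon, one row per occurrence.
hits <- lexicon[match(words, lexicon$word, nomatch = 0), ]

# Sum the 0/1 flags per sentiment across every matched word.
sentiment_totals <- colSums(hits[, c("anger", "disgust", "joy")])
sentiment_totals
```

Every "damn" adds one to anger and one to disgust no matter how it was meant, which is exactly the limitation with this style of scoring.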

Because of that, it doesn’t really capture the actual sentiment behind the way we’re using these words. Let’s look at the main words driving the differences in each sentiment:

sentiment log likelihood words.png

For me, a lot of my disgust and anger is coming from the word damn. If I was texting damn! every time I stubbed my toe or something, perhaps that would be accurate; but in this case, a lot of the time I write damn in sympathy, as in exchanges like:

“My computer crashed this afternoon and I lost all the work I’d done today”
“Damn, that’s horrible”

Meanwhile, the word coop is actually me talking about the coöp / co-op, where I get my groceries. I’m not talking about being trapped, either physically or mentally.

The same goes for my girlfriend being more positive. With words like engagement and ceremony, she’s not joyous or anticipatory about her own upcoming nuptials or anything; rather, several of her colleagues have got engaged and married recently, and most of her uses of the words engagement and ceremony are her complaining about how that’s the only topic of conversation at the office. As for assessment, council, and teacher, she works in education. These are generally neutral descriptions of what’s happened that day.

So, I was hoping to be able to plot some sentiment analyses to show our relationship over time, but either it doesn’t work for text messages, or we’re really fucking obtuse. I think it might be the former.

Instead, I’ll settle for showing how much we both swear over time:

expletives per month.png

Each dot represents the number of occurrences per month of a particular expletive. I’m clearly the more profane here, although I do waver a bit while she’s fairly consistent.

More important is that we talk about beer a similar amount:

beer per month.png

Since couples who drink together stay together (or in the words of this study, “concordant drinking couples reported decreased negative marital quality over time”), I think this bodes pretty well for us.


Synthesised size/sound sound symbolism: the MS Paint version

Earlier today, I presented my CogSci paper Synthesised size/sound sound symbolism. The six page paper is short and to the point, but hey, it technically counts as a publication, so I figure it’d be remiss of me not to render it badly in MS Paint.

We start with the Japanese ideophone learning study. Participants learned some Japanese ideophones, half with their real Dutch translations, half with their opposite translations, and it turned out that they were way better at remembering the ideophones with their real translations. Importantly, this didn’t happen when we did exactly the same thing with regular, arbitrary adjectives. You can read that paper here, and download some of the experiment materials for it here if you want to try it yourself. You can also read the replication study we did with EEG here, and download the data and analysis scripts for that here.

The results looked a bit like this:


But while we can see that there is an obvious effect, we don’t know how it works. Is it a learning boost that people get from the cross-modal correspondences in the real condition?

match boost learning

Or is it a learning hindrance that people get from the cross-modal clashes in the opposite condition?

mismatch hindrance learning

(or indeed, is it both?)

Without adding a neutral condition where the words neither obviously match nor mismatch their meanings, we don’t really know.

So, we created some synthesised size/sound sound-symbolic pseudowords, which is easier said than done. It’s well known that people associate voiced consonants and low, back vowels with large size, and voiceless consonants and front, high vowels with small size. This is probably because of the mouth shape you make when saying those sounds:

a and i

We created big-sounding words (like badobado), small-sounding words (like kitikiti), and neutral-sounding words which were halfway in between (like depedepe).

A neutral condition could tell us if it’s a graded effect…

graded effect.png

a match boost effect…

match boost.png

or a mismatch hindrance effect:

mismatch hindrance.png

Turns out it’s a match boost effect. Participants learned the match pseudowords (e.g. badobado and big, kitikiti and small) better than the neutral pseudowords (e.g. kedekede and big, depedepe and small), but there wasn’t a difference in how well they learned the neutral and mismatch pseudowords (e.g. godagoda and small, tikutiku and big).

results n=30

For good measure, we did it again with double the original sample size, because it’s nice to check things.

first experiment and replication blog picture

…and, yes, we found exactly the same thing:

results n=60

So, it looks like it’s a special match boost effect from the cross-modal correspondences, not a graded effect reflecting all cross-modal information.

This is a nice paradigm which can be easily altered to try with different languages or different stimuli, e.g. using big and small shapes rather than the words “big” and “small” in order to rule out letter confounds. I’ll put up all the Presentation scripts, synthesised stimuli, and analysis scripts once I’ve tidied them up a lot (if you think my MS Paint scribbles are messy, you should see my code). It should be pretty straightforward for anybody to download and redo this, so it’d make a good project for a Bachelors/Masters intern. I’d love to see this get taken further and tried out with useful variations and changes. But good luck coming up with a more satisfying title!


Rolling form and how Arsenal lost the Premier League title

Is form taken seriously in the media discussion of football? Every game preview acknowledges the results from a team’s last five games, often with green, grey, and red blobs so you can see how much they’ve won, drawn, and lost recently… but it’s rarely quantified beyond that. This is kind of understandable, since you get three points for a win regardless of what form you or your opposition have been in, but I feel like a bit more of a deep dive into form data is needed.

I’ve been scraping a lot of game data from last season’s Premier League with the ultimate goal of making an adjusted table for rolling form. Which teams beat form teams most often? Which teams benefited most from playing teams at a low ebb more often than other teams? Is rolling form a better predictor of a match winner than a team’s position?

This has thrown up an absolute glut of data that can (and will) fill several blog posts. For now, though, I’d like to focus on Arsenal’s woes last season.

One of the main media narratives around Arsenal is that they’re flat track bullies; they swat aside mediocre teams with ease, looking fluent and impressive while doing so, but they come unstuck against the top teams. These criticisms extend to the individual players, with Olivier Giroud and Per Mertesacker often held up as examples of almost-but-not-quite players; good enough to beat most teams, but not at the elite level that would take Arsenal past the Chelseas, the Manchester Uniteds, the Manchester Citys (despite finishing above all of them last season, but whatever).

Thing is, that’s completely untrue.

Here’s a graph of Arsenal’s wins, losses, and draws last season. The dots and lines represent each game, and are positioned on the y-axis according to Arsenal’s and their opposition’s form over the last five games (in mean points per game) just before each game. This excludes the first five matches of the season, which are always a bit dicey and rarely reflect the final standings.

Arsenal form plot.png

What strikes me most clearly is that Arsenal lost all their games to teams in average to poor form. Indeed, three losses were against teams who’d been averaging less than a point per game, which, if applied to a whole season, would set them out as clear relegation candidates. Moreover, around half these losses have come when Arsenal have been in a good run of form themselves, averaging two or more points per game (which, if applied to a whole season, would have seen them score 76 points, five more than they actually managed). Meanwhile, a lot of Arsenal’s wins actually came against teams who were doing well, averaging 1.5 points per game or more. This suggests that Arsenal were winning the difficult games and losing the easy games.
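For anyone wanting to reproduce the form measure, here's a minimal sketch of rolling form as mean points per game over the previous five matches. The sequence of results is invented for the example, and the variable names are just placeholders rather than anything from the scraping scripts.

```r
# Invented sequence of results for one team, in match order.
results <- c("W", "W", "D", "L", "W", "W", "L", "D", "W", "W")
points  <- c(W = 3, D = 1, L = 0)[results]

window <- 5
# Form going into match i = mean points over matches (i - 5) to (i - 1).
# The first five matches have no full window, so they stay NA.
form_before <- rep(NA_real_, length(points))
for (i in seq_along(points)) {
  if (i > window) {
    form_before[i] <- mean(points[(i - window):(i - 1)])
  }
}
form_before

# Projecting two points per game over a 38-game season.
2 * 38  # 76 points
```

Leaving the first five matches as NA mirrors the decision to exclude them from the plots.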

This isn’t much use on its own, so let’s look at Leicester and Tottenham for comparison.

Not only did Leicester lose less often – obviously – but their losses also never came against teams averaging less than one point per game. The same goes for Tottenham, who lost their games to teams in a fairly decent run of form.

We can also look at how Arsenal did according to their league position and the league position of their opposition just before the game.

Arsenal position plot.png

This plot is also pretty illuminating. The bulk of Arsenal’s losses came against teams in the bottom half of the table, while their record against teams in the top five when they played them was actually pretty good: winning four, drawing three, and losing only one. There’s also a nice split between their wins and draws which shows that Arsenal generally beat the lower table teams and generally drew with the high mid-table teams when they played them.

Again, let’s compare that with Leicester and Tottenham.

Leicester only lost to teams in the top half of the table. Tottenham lost to teams in the relegation zone twice, but the rolling form plot showed that these relegation zone teams had been doing pretty well at the time.

To get a look at the whole league, we can plot the mean rolling form and the mean league position for the opposition for each team in losses. That’s a confusing sentence; another way of phrasing it is saying that this is looking at the average form and average league position of the teams that each team lost to last season.

form plot for losses plus lm.png

position plot for losses plus lm.png

Both these graphs show a slight relationship between how well a team did overall and the nature of their losses – the better performing teams tended to lose to better opposition, i.e., to teams with higher points per game in the last five games and teams who were higher in the table at the time.

All except Arsenal. In fact, Arsenal were the worst team last season in terms of losing to teams in poor form. The teams that Arsenal lost to were in the worst average form and in the lowest average position compared with losses by any other team in the league.

In short, this contradicts the main narrative of Arsenal not being good enough against the top teams. Rather, Arsenal aren’t good enough against the bad teams, and lost out on the league by losing to teams in poor form in the relegation zone. Next time Arsenal play a poorly-performing bottom-table team, maybe a Hull or a Middlesbrough in a bit of a rut, I’ll stick a tenner on Arsenal to lose; they’re only a little less likely to lose than they are to win.

wenger water bottle.gif


ERP graph competition!

A while back, I blogged about creating better ERP graphs. The data I used to generate those graphs has now been published, and all the data and analysis scripts from that paper are available to download here.

One of the things I find most fun about research is playing about with plotting graphs. There are so many different ways to visualise ERP data that it’s hard to pick one. I played around with a few different styles for my Collabra paper, and settled for plotting the two conditions with 95% confidence intervals. I was also tempted to plot the individual ERPs, as shown in the second plot, but felt the CIs were cleaner and more useful.

However, I also liked Guillaume’s approach of plotting the grand average difference wave and the difference waves for individual participants. I’ve done it with and without 95% CIs in plots 3 and 4. The difference in those two is that the green/orange distinction shows significance; the 320-784ms window was significant, the rest of the epoch wasn’t.

But I’d love to see how these can be improved! All my graphs and all the code needed are below. You can download eegdata.txt from my OSF page, but be careful not to click on the file name itself, otherwise your browser will probably freeze:

osf download instructions

After that, load it into R, and have at it. The only extra packages I use to make the graphs in this blog are ggplot2, ggthemes, tidyr, and dplyr.

If you think you can do better, email me your graphs and code to gwilym(dot)lockwood(at)mpi(dot)nl. I’ll post another blog in a few weeks with my favourite contributions… and I’ll buy the winner a beer :)


Parietal electrodes, title, all trials, lines and 95pc, capitals in legend (29-3-16 submission)

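library(ggplot2)
library(ggthemes)
library(tidyr)
library(dplyr)
# assumes eegdata.txt has already been read into a data frame called eegdata
eegdata$condition <- gsub("real", "Real", eegdata$condition)
eegdata$condition <- gsub("opposite", "Opposite", eegdata$condition)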
# create dataframe by measuring across participants
dfsmallarea <- filter(eegdata, smallarea == "parietal")
dfsmallarea <- aggregate(measurement ~ smallarea*time*condition, dfsmallarea,mean)
# work it out for all trials
parietaldf <- filter(eegdata, smallarea == "parietal")
parietaldf <- aggregate(measurement ~ time*participant*condition*electrode, parietaldf,mean)
std <- function(x)sd(x)/sqrt(length(x))
# i.e. std is a function to give you the standard error of the mean
# this is the standard deviation of a sample divided by the square root of the sample size
SD <- rep(NA,length(dfsmallarea$time))       # creates empty vector for standard deviation at each time point, which will be huge
SE <- rep(NA,length(dfsmallarea$time))       # creates empty vector for standard error at each time point
CIupper <- rep(NA,length(dfsmallarea$time))  # creates empty vector for upper 95% confidence limit at each time point
CIlower <- rep(NA,length(dfsmallarea$time))  # creates empty vector for lower 95% confidence limit at each time point
for (i in 1:length(dfsmallarea$time)){
  something <- subset(parietaldf,time==dfsmallarea$time[i] & condition==dfsmallarea$condition[i], select=measurement)
  SD[i] = sd(something$measurement)
  SE[i] = std(something$measurement)
  CIupper[i] = dfsmallarea$measurement[i] + (SE[i] * 1.96)
  CIlower[i] = dfsmallarea$measurement[i] - (SE[i] * 1.96)
}
dfsmallarea$CIL <- CIlower
dfsmallarea$CIU <- CIupper
# Now let's plot things, starting with all trials

colours <- c("#D55E00", "#009E73")  # colourblind friendly - red/orange for opposite, green for real
dfsmallarea$time <- as.integer(as.character(dfsmallarea$time))
plot <- ggplot(dfsmallarea, aes(x=time, y=measurement, color=condition)) + 
  geom_line(size=1, alpha = 1)+
  scale_linetype_manual(values=c(1,1) )+  #, guide=FALSE)+
  scale_y_continuous(limits=c(-7, 7), breaks=seq(-7,7,by=1))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - all trials")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_bw() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)
plot + theme(plot.title = element_text(face="bold"), axis.text = element_text(size=8)) +
  geom_smooth(aes(ymin = CIL, ymax = CIU, fill=condition), stat="identity", alpha = 0.3) + #95% CIs
  scale_fill_manual(values=colours)+ #, guide=FALSE) +
  scale_colour_manual(values=colours) #, guide=FALSE)



Parietal electrodes, all participants at once, all data (size 0.3, alpha 0.3) (lighter)

temp <- filter(eegdata, smallarea == "parietal")
temp <- select(temp, smallarea,time, measurement, condition, participant)
temp <- aggregate(measurement ~ smallarea*time*condition*participant, temp,mean)
temp2 <- aggregate(measurement ~ smallarea*time*condition, temp,mean)
temp2$participant <- "Grand Average"
temp2$GA <- "Grand Average"
temp$GA <- "individuals"
temp2 <- select(temp2, smallarea,time, measurement, condition, participant, GA)
temp3 <- rbind(temp, temp2)
temp3$conbypar = paste(temp3$condition, temp3$participant, sep="")
temp3$time <- as.integer(as.character(temp3$time))
plot <- ggplot(temp3, aes(x=time, y=measurement,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3, aes(colour=condition))+
  geom_line(data = subset(temp3, GA == "Grand Average"), size=1.5, alpha=1, aes(colour=condition)) +
  scale_colour_manual(values=colours, guide=FALSE)+
  scale_y_continuous(limits=c(-15, 15), breaks=seq(-15,15,by=3))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - individual participants for all trials")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)
plot + theme(plot.title = element_text(face="bold"),
              axis.text = element_text(size=8))



Parietal electrodes, diffwave and individual diffwaves

temp <- filter(eegdata, smallarea == "parietal")
temp <- select(temp, smallarea,time, measurement, condition, participant)
temp <- aggregate(measurement ~ smallarea*time*condition*participant, temp,mean)
temp2 <- aggregate(measurement ~ smallarea*time*condition, temp,mean)
temp2$participant <- "Grand Average"
temp2$GA <- "Grand Average"
temp$GA <- "individuals"
temp2 <- select(temp2, smallarea,time, measurement, condition, participant, GA)
temp3 <- rbind(temp, temp2)
temp3$conbypar = paste(temp3$condition, temp3$participant, sep="")
temp4 <- spread(temp2, condition, measurement) # unmelt to create diffwave calculation
temp5 <- spread(temp, condition, measurement) 
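# column names come from the condition labels; use Real/Opposite instead
# if the gsub() recoding from the first plot has already been run on eegdata
temp4$diffwave <- temp4$real - temp4$opposite
temp5$diffwave <- temp5$real - temp5$opposite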
temp6 <- rbind(temp4, temp5)
temp6$conbypar = paste(temp6$condition, temp6$participant, sep="")
temp6$sig <- ifelse(temp6$time %in% c(320:784), "yes", "no")
temp6$time <- as.integer(as.character(temp6$time))
# named values so "no" is orange and "yes" is green whatever the level order
sigcolours <- c("no" = "#D55E00", "yes" = "#009E73")
plot <- ggplot(temp6, aes(x=time, y=diffwave,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3)+
  geom_line(data = subset(temp6, GA == "Grand Average"), size=2, alpha=1, aes(colour=sig)) +
  scale_colour_manual(values=sigcolours, guide=FALSE)+
  scale_y_continuous(limits=c(-16, 16), breaks=seq(-16,16,by=2))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - difference wave and individual difference waves")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)
plot + theme(plot.title = element_text(face="bold"),
              axis.text = element_text(size=8))



Parietal electrodes, diffwave and individual diffwaves plus CIs

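# fresh result vectors, one slot per row of temp6 (SDupper etc. were never initialised)
n <- nrow(temp6)
SD <- SE <- rep(NA, n)
SDupper <- SDlower <- rep(NA, n)
SEupper <- SElower <- rep(NA, n)
CIupper <- CIlower <- rep(NA, n)
for (i in 1:n){
  something <- subset(temp6,time==temp6$time[i], select=diffwave)
  SD[i] = sd(something$diffwave)
  SE[i] = std(something$diffwave)
  SDupper[i] = temp6$diffwave[i] + SD[i]
  SDlower[i] = temp6$diffwave[i] - SD[i]
  SEupper[i] = temp6$diffwave[i] + SE[i]
  SElower[i] = temp6$diffwave[i] - SE[i]
  CIupper[i] = temp6$diffwave[i] + (SE[i] * 1.96)
  CIlower[i] = temp6$diffwave[i] - (SE[i] * 1.96)
}
temp6$CIL <- CIlower
temp6$CIU <- CIupper
# "no2" for the stretch after 784ms is an assumed label: the original line was cut off here
temp6$sig <- ifelse(temp6$time %in% c(320:784), "yes",ifelse(temp6$time %in% c(-200:318), "no1", "no2"))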
temp6$time <- as.integer(as.character(temp6$time))
sigcolours <- c("#D55E00", "#D55E00", "#009E73")
plot <- ggplot(temp6, aes(x=time, y=diffwave,group=conbypar)) + 
  geom_line(size=0.3, alpha=0.3)+
  geom_line(data = subset(temp6, GA == "Grand Average"), size=2, alpha=1, aes(colour=sig)) +
  scale_colour_manual(values=sigcolours, guide=FALSE)+
  scale_y_continuous(limits=c(-16, 16), breaks=seq(-16,16,by=2))+ 
  scale_x_continuous(limits=c(-200,1000),breaks=seq(-200,1000,by=100),labels=c("-200", "-100","0","100","200","300","400","500","600","700","800","900","1000"))+
  ggtitle("Parietal electrodes - difference wave and individual difference waves")+
  ylab("Amplitude (µV)")+
  xlab("Time (ms)")+
  theme_few() +
  geom_vline(xintercept=0) +
  geom_hline(yintercept=0)
plot + theme(plot.title = element_text(face="bold"),
              axis.text = element_text(size=8))
# the same plot again, this time with the 95% CI ribbon added
plot + theme(plot.title = element_text(face="bold"), axis.text = element_text(size=8)) +
  geom_smooth(data = subset(temp6, GA == "Grand Average"), linetype=0, aes(ymin = CIL, ymax = CIU,group=sig, fill=sig), stat="identity", alpha = 0.5) + #95% CIs
  scale_fill_manual(values=sigcolours, guide=FALSE)



How iconicity helps people learn new words: the MS Paint version

I have a new paper out!

Gwilym Lockwood, Peter Hagoort, and Mark Dingemanse. “How Iconicity Helps People Learn New Words: Neural Correlates and Individual Differences in Sound-Symbolic Bootstrapping.” Collabra 2, no. 1 (July 6, 2016). doi:10.1525/collabra.42.

The paper can be read and downloaded right here:

…and because we’re doing the whole open thing properly, you can also download the original stimuli, data, and analysis scripts here. You probably have no intention of sifting through it all, but the point is that you can:

The experiment was pretty similar to my Sound symbolism boosts novel word learning paper from a few months ago, except that this time it was only with the ideophones, and I used EEG to measure participants’ brain activity while they learned them. People learned 19 ideophones with their real Dutch translations and 19 ideophones with their opposite Dutch translations. After I told them about that and said sorry for the deception, they heard all the ideophones again and had to guess what the real translation was from a choice of two antonyms.

The first important thing is that the results were almost identical. People got the answers right 86.7% of the time for the ideophones in the real condition, and 71.3% of the time for the ideophones in the opposite condition. When they had to guess, they got it right 73% of the time. These figures replicate the first study very closely (86.7% to 86.1%, 71.3% to 71.1%, and 73% to 72.3%), which is excellent news.

accuracy for JEPLMC and Collabra experiments - jitterdots (0-100) 9by7 for APS presentation

All kinds of things can happen in scientific studies, so replicating a study is really important for showing that the effect is real and not just coincidental. Sadly, replications aren’t considered to be very glamorous, so a shocking amount of published science is either unreplicated or unreplicable:

first experiment and replication blog picture

The second part of this new paper is that I also measured people’s brain activity using EEG. Once you average all the trials together, you get a signal of changing activity over time in response to a thing, which is known as an event-related potential. It looks a bit like this, and isn’t that useful by itself:

erp picture single line

Instead, you have to compare two conditions. If they differ at a certain point, that’s what tells you about how the brain processes things:

erp picture two lines

In this experiment, I found a big P3 effect in the test round:

figure 8 - Parietal electrodes, title, all trials, lines and 95pc, capitals in legend (29-3-16 submission)

The P3 is linked to memory and learning, so it’s not surprising that it came up in a task involving memory and learning. A lower P3 is linked to things being more difficult to learn, so again, it’s not surprising that the ideophones in the opposite condition have a lower P3 when they were harder to learn.

But, it wasn’t that simple. If it was a straight up learning effect, you would expect a correlation between how well people did in a condition and that condition’s P3 amplitude; people with lower test round scores in the opposite condition should have a lower P3 amplitude in the opposite condition. But they don’t.

However, there was a correlation between how sensitive participants were to sound symbolism (as measured by their meaning guessing accuracy in the task after the test) and how big the ERP difference between conditions was. When I split the participants into two groups (people scoring above and below the 73% average in the guessing task) and plotted their ERPs separately, it turns out that the P3 effect is big for the people who are more sensitive to sound symbolism and barely there at all for the people who are less sensitive to sound symbolism:

het talige brein picture

You can also see that the P3 amplitude in the real condition was the same across both groups. What’s behind the difference is how the amplitude in the opposite condition changes. This suggests that most people can recognise cross-modal correspondences and exploit them in word learning, but that some people are more sensitive to sound symbolism and get put off by cross-modal clashes as well.

I reckon that the variation in how sensitive people are to sound symbolism goes a little like this:

model of ss and cross modal perception

…and we do have some preliminary data from a massive cross-modal perception and synaesthesia study that shows that synaesthetes are better at the Japanese ideophone guessing game than regular people, but that’s another blog for another time.


Portugal: the worst Euro finalists ever

I haven’t been impressed with Portugal this tournament (and not just because I’m a bitter Welshman). They haven’t been very good; never really impressing in any of their matches, winning without dominating in the knockouts and held to draws by Iceland and Hungary in the group stages.

I’m pretty sure they’re the worst European Championship finalists I’ve seen, and perhaps the worst ever. But how can you measure how underwhelming a team is?

Step forward Elo ratings. Far better than the joke that is the FIFA rankings, Elo ratings adjust after every single international match based on the teams’ previous Elo ratings. For example, before Iceland and England faced off last week, Iceland had an Elo rating of 1688 and England had an Elo rating of 1929. After Iceland beat England, they exchanged 40 points – Iceland’s Elo rating went up 40 points to 1728, and England’s Elo rating went down 40 points to 1889. This is calculated using the status of the match, the number of goals scored, the result, and the expected result (more on that here, and have a browse of some more examples here).
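To make the 40-point swap concrete, here's a minimal sketch of an Elo update, assuming the commonly published World Football Elo formula: change = K × G × (result − expected result). The function names are made up for the example, and the values K = 50 (for a continental championship finals match) and G = 1 (for a one-goal margin) are assumptions rather than anything stated above. Home advantage is ignored because the match was at a neutral venue.

```r
# Expected result for team A, given the rating difference (A minus B).
expected_result <- function(rating_a, rating_b) {
  1 / (10^(-(rating_a - rating_b) / 400) + 1)
}

# Rating change for team A. `result` is 1 for a win, 0.5 for a draw, 0 for a loss.
# `k` is the match-status weight and `g` the goal-difference multiplier.
elo_change <- function(rating_a, rating_b, result, k, g = 1) {
  k * g * (result - expected_result(rating_a, rating_b))
}

# Iceland (1688) beat England (1929) by one goal.
# k = 50 and g = 1 are assumed values, as described in the lead-in.
change <- elo_change(1688, 1929, result = 1, k = 50, g = 1)
round(change)         # points Iceland gain and England lose
1688 + round(change)  # Iceland's new rating
1929 - round(change)  # England's new rating
```

Under those assumptions Iceland's expected result is only about 0.2, so an outright win moves a lot of points in one go.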

I’ve taken every finalist (not including the winner of France vs. Germany, which kicks off about an hour from the time of writing) of the Euros since 1984, which was the first tournament where there were knock-out matches (before that, there was a group stage round-robin and the top two teams played off in a final). I’ve calculated each finalist’s mean Elo score throughout the tournament – not including the final itself – as well as calculating the mean Elo score of all the teams they played along the way.

That’s plotted right here:

Elo graph annotated

The dot size and colour show the difference in each finalist’s Elo score from the start of the tournament until after the semifinal; the larger and lighter the dot, the more the team improved on their way to the final.

France in 2000 are probably the best team to reach a final – not only were they an excellent team at the time, they also beat a lot of strong opposition to get there. Spain in 2012 were the strongest team, but the opposition they faced wasn’t as tough. Greece in 2004 were the weakest team, but they beat some really strong teams along the way, which makes their achievement really impressive.

Portugal this year are dropping off the graph on the bottom left. They aren’t a high quality side – barely better than Greece in 2004 or Denmark in 1992 – but they haven’t been beating any impressive opposition either. They’ve not been that good, and they’ve had the weakest path to the final of any Euros finalist.

So, yes, it looks like my hunch was right – Portugal are the most underwhelming team to reach the final of a European Championship ever since there’s been a proper knock-out round. Sorry, Ronnie.

sad Ronaldo