Can the cortical substrates for the perception of face actions be distinguished when the superficial visual qualities of these actions are very similar? Two fMRI experiments are reported. Compared with watching the face at rest, observing silent speech was associated with bilateral activation in a number of temporal cortical regions, including the superior temporal sulcus (STS). Watching face movements of similar extent and duration that could not be construed as speech (gurning; Experiment 1b) was not associated with activation of superior temporal cortex to the same extent, especially in the left hemisphere. Instead, the peak focus of the largest cluster of activation was in the posterior part of the inferior temporal gyrus (right, BA 37). Observing silent speech, but not gurning faces, was also associated with bilateral activation of inferior frontal cortex (BA 44 and 45). In a second study, speechreading and observing gurning faces were compared within a single experiment, using stimuli that comprised the speaker’s face and torso (and hence a much smaller image of the speaker’s face and facial actions). There was again differential engagement of superior temporal cortex, which followed the pattern of Experiment 1. These findings suggest that superior temporal gyrus and neighbouring regions are activated bilaterally when subjects view face actions, at different scales, that can be interpreted as speech. This circuitry is not accessed to the same extent by visually similar but linguistically meaningless actions. However, some temporal regions, such as the posterior part of the right superior temporal sulcus, appear to be common sites for processing both seen speech and gurns.