Sometimes we might be working on a project but we don’t have the right data handy, or we cannot share the data we have to let others try an algorithm we are developing. In such situations, simulating a data set that shares the characteristics of real data can be very useful.
In this case, I was working on a dashboard for visualizing gaze data (see Figure 1 below), and I created a script to generate fixation data for multiple participants. Typically, such a data set should include some information about the participants (like name and sex), gaze coordinates (typically two-dimensional, mapped to the screen or stimulus coordinates), and a variable that tells whether a fixation falls within a predefined area of interest (AOI).
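To make the AOI variable concrete, here is a minimal sketch of how a fixation could be labeled with the AOI that contains it. It assumes rectangular AOIs defined in screen pixels; the AOI names, their bounds, and the helper `assign_aoi` are made up for illustration and are not taken from the actual script.

```r
# Hypothetical rectangular AOIs in screen pixels (names and bounds are made up)
aoi_defs <- data.frame(
  name = c("logo", "menu"),
  xmin = c(100, 600),
  xmax = c(400, 900),
  ymin = c(50, 300),
  ymax = c(250, 700),
  stringsAsFactors = FALSE
)

# Return the name of the first AOI containing each point, or NA if none does
assign_aoi <- function(x, y, aois) {
  hit <- rep(NA_character_, length(x))
  for (i in seq_len(nrow(aois))) {
    inside <- x >= aois$xmin[i] & x <= aois$xmax[i] &
      y >= aois$ymin[i] & y <= aois$ymax[i]
    # Only label points that have not already been claimed by an earlier AOI
    hit[is.na(hit) & inside] <- aois$name[i]
  }
  hit
}

# Three example fixations: inside "logo", inside "menu", and outside both
assign_aoi(c(150, 700, 10), c(100, 500, 10), aoi_defs)
```

If AOIs overlap, this sketch assigns the fixation to whichever AOI comes first in the table; real AOIs may also be non-rectangular, which would need a different containment test.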
The generated data set includes the following variables (see Table 1 below):
- recordingTimestamp: The time at which each fixation or event occurred
- participantName: The name of the participant
- recordingName: The name of the recording for that participant
- Sex: The sex of the participant, to group participants of the same sex
- Age: The age group of the participant, to test for age effects
- Favorite: The favorite AOI of the participant as reported by them
- gazePointX: The X coordinate of the fixation
- gazePointY: The Y coordinate of the fixation
- fixationDuration: The duration of the fixation, typically in milliseconds
- aois: The name of the AOI that contains the fixation
- event: The name of an action performed by the participant (e.g. a click)
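As a rough illustration of how a data frame with these variables could be simulated in R, here is a minimal sketch. It is not the script from GitHub: the screen size, AOI bounds, age groups, distributions, click probability, and the helper `simulate_participant` are all assumptions chosen only to produce plausible-looking values.

```r
set.seed(42)  # make the simulated values reproducible

# Hypothetical rectangular AOIs (names and pixel bounds are assumptions)
aoi_defs <- data.frame(
  name = c("aoi_1", "aoi_2"),
  xmin = c(200, 1100), xmax = c(800, 1700),
  ymin = c(200, 500),  ymax = c(600, 900),
  stringsAsFactors = FALSE
)

# Simulate the fixations of one participant (assumed 1920 x 1080 screen)
simulate_participant <- function(id, n_fix = 40, width = 1920, height = 1080) {
  x <- round(runif(n_fix, 0, width))
  y <- round(runif(n_fix, 0, height))
  # Right-skewed durations in ms, with an assumed 50 ms minimum
  duration <- pmax(50, round(rgamma(n_fix, shape = 2, scale = 150)))

  # Label each fixation with the AOI that contains it (NA if none)
  aoi <- rep(NA_character_, n_fix)
  for (i in seq_len(nrow(aoi_defs))) {
    inside <- x >= aoi_defs$xmin[i] & x <= aoi_defs$xmax[i] &
      y >= aoi_defs$ymin[i] & y <= aoi_defs$ymax[i]
    aoi[is.na(aoi) & inside] <- aoi_defs$name[i]
  }

  data.frame(
    # Start time of each fixation; gaps between fixations are ignored here
    recordingTimestamp = cumsum(duration) - duration,
    participantName = sprintf("P%02d", id),
    recordingName = sprintf("Rec%02d", id),
    Sex = sample(c("Female", "Male"), 1),
    Age = sample(c("18-30", "31-45", "46-60"), 1),
    Favorite = sample(aoi_defs$name, 1),
    gazePointX = x,
    gazePointY = y,
    fixationDuration = duration,
    aois = aoi,
    # Mark roughly 5% of rows as a click; all other rows have no event
    event = ifelse(runif(n_fix) < 0.05, "click", NA_character_),
    stringsAsFactors = FALSE
  )
}

# Stack 50 simulated participants into one data frame
all_data <- do.call(rbind, lapply(1:50, simulate_participant))
str(all_data)
```

Uniformly distributed gaze points are the simplest choice but not a realistic one; sampling more fixations near each participant's favorite AOI would be one way to build a detectable effect into the data.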
You can download the code to generate this data set from GitHub and try it out. If you run it in R, it should generate a data frame called “all_data”, which contains data for 50 participants. It should also plot the data with the AOIs.
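For readers who want a feel for such a plot before running the script, here is a minimal base R sketch that draws fixations on top of AOI rectangles. The fixation values, the AOI bounds, the assumed 1920 x 1080 screen, and the helper `plot_fixations` are all hypothetical; the actual script may plot the data differently.

```r
# Hypothetical inputs: a few fixations and two rectangular AOIs (all made up)
fixations <- data.frame(
  gazePointX = c(300, 650, 1200, 1500, 900),
  gazePointY = c(250, 400, 700, 820, 100),
  fixationDuration = c(180, 420, 260, 610, 150)
)
aoi_defs <- data.frame(
  name = c("aoi_1", "aoi_2"),
  xmin = c(200, 1100), xmax = c(800, 1700),
  ymin = c(200, 500),  ymax = c(600, 900),
  stringsAsFactors = FALSE
)

plot_fixations <- function(fix, aois, width = 1920, height = 1080) {
  # Flip the Y axis so that (0, 0) is the top-left corner, as on a screen
  plot(NULL, xlim = c(0, width), ylim = c(height, 0),
       xlab = "gazePointX", ylab = "gazePointY", main = "Simulated fixations")
  # Draw each AOI as a dashed rectangle with its name at the corner
  rect(aois$xmin, aois$ymin, aois$xmax, aois$ymax, border = "red", lty = 2)
  text(aois$xmin, aois$ymin, labels = aois$name, adj = c(0, 1.2), col = "red")
  # Scale point size by fixation duration
  points(fix$gazePointX, fix$gazePointY,
         cex = 3 * fix$fixationDuration / max(fix$fixationDuration),
         pch = 21, bg = "grey70")
  invisible(nrow(fix))  # return the number of plotted fixations
}

plot_fixations(fixations, aoi_defs)
```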
You can also check out the dashboard at this link. Feel free to write to me if you have any suggestions on either the data script or the dashboard. Also, if you need help with your data collection or analysis, drop me a line using this form or via the social media links below, and I’d be happy to help.