
\documentclass[conference,a4paper]{IEEEtran}

\usepackage{graphicx}   % for including figures
\usepackage{booktabs}   % for nicer tables
\usepackage{float}
\usepackage[utf8x]{inputenc}
\usepackage[margin=1in]{geometry} % Adjust margins
\usepackage{caption}
\PassOptionsToPackage{hyphens}{url} % allow breaking urls; must precede hyperref, which loads url
\usepackage{hyperref}
\usepackage{wrapfig}
\usepackage{subcaption}
\usepackage{parskip}

\usepackage[style=ieee, backend=biber, maxnames=1, minnames=1]{biblatex}
\addbibresource{report.bib}

\title{On-Screen Keyboard Layout Study}
\begin{document}
\maketitle

\section{Abstract}\label{abstract}
We evaluated three on-screen keyboard layouts: QWERTY, Dvorak, and Circle. Typing speed, measured in words per minute (WPM), showed a significant main effect of layout. Post-hoc comparisons revealed that QWERTY was significantly faster than both Dvorak and Circle, while no significant difference was observed between Dvorak and Circle. Total error rate (TER) did not differ significantly between layouts. Subjective workload ratings assessed via NASA-TLX were similar for Dvorak and Circle, but QWERTY was perceived as less demanding. These results indicate that QWERTY offers superior typing speed and lower perceived workload, whereas error rates are comparable across layouts.

\section{Introduction}\label{introduction}

\section{Keyboard Designs}\label{keyboard-designs}
Three on-screen keyboard layouts were evaluated in this study: QWERTY, Dvorak, and a custom-designed Circle layout.

\begin{enumerate}
\item QWERTY (Figure~\ref{fig:qwerty}): The standard layout commonly used in English typing, serving as a baseline for comparison.

\item Dvorak (Figure~\ref{fig:dvorak}): An alternative layout designed to increase typing efficiency on physical keyboards by placing frequently used letters on the home row, minimizing finger movements.

\item Circle (Figure~\ref{fig:circle}): A custom layout developed for this study, in which keys were arranged in a circular pattern. Letters that occur more frequently in English were positioned closer to the center and rendered larger to facilitate faster access. Less frequently used letters were placed toward the periphery and sized smaller, aiming to optimize ergonomic reach and visual salience.
\end{enumerate}

This design allowed us to investigate both established and novel layouts, comparing objective typing performance, error rates, and subjective workload.

\begin{figure}[H]
    \centering
    \includegraphics[width=0.45\textwidth]{images/qwerty-pic.png}
    \caption{QWERTY Keyboard Layout}
    \label{fig:qwerty}
\end{figure}


\begin{figure}[H]
    \centering
    \includegraphics[width=0.45\textwidth]{images/dvorak-pic.png}
    \caption{Dvorak Keyboard Layout}
    \label{fig:dvorak}
\end{figure}


\begin{figure}[H]
    \centering
    \includegraphics[width=0.3\textwidth]{images/circle-pic.png}
    \caption{Circle Keyboard Layout}
    \label{fig:circle}
\end{figure}


\section{Experiment}\label{experiment}
<<echo=FALSE, message=FALSE>>=
# load libraries here
library(knitr)
library(dplyr)
library(tidyr)

# Read the results CSV
results <- read.csv("../data/results.csv", sep=",", header=TRUE)
@
\subsection{Participants}\label{participants}
Our experiment was conducted with a small sample of \Sexpr{nrow(results)} participants.
All participants were students familiar with computers; they were predominantly male,
with an average age of \Sexpr{round(mean(results$age),digits=1)} years.

\subsection{Apparatus}\label{apparatus}
The main body of our experimental apparatus was our on-screen keyboard, implemented with Tauri and Angular.
It displays exactly one of the layouts described in \autoref{keyboard-designs} at a time.
Text-entry measures were collected using TextTest \cite{texttest}, with everything running on a desktop Windows 11 computer.

\subsection{Procedure}\label{procedure}
Each participant was first given an overview of the three keyboard layouts and their design rationale.
They were then presented with all three layouts in a counterbalanced order to mitigate common order effects
such as practice, fatigue, and boredom.
Each keyboard was evaluated using only lowercase letters and the space key; pressing Enter displayed the next sentence.
Because of our chosen time constraint of 30 minutes, each participant typed three practice sentences per keyboard,
followed by 10 recorded sentences.
After completing all three layouts, the participants were asked to fill out the NASA Task Load Index \cite{nasatlx}.

\section{Results}\label{results}

This section presents the experimental results comparing the three keyboard layouts: QWERTY, Dvorak, and Circle.
Performance was evaluated using typing speed, measured in words per minute (WPM), and accuracy, assessed through the total error rate (TER). In addition, subjective workload was collected using the NASA-TLX questionnaire.
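
Under the standard definitions of these text-entry measures, which we assume TextTest follows, the two objective measures can be written as follows. The symbols are notation introduced here for illustration.
\begin{equation}
\mathrm{WPM} = \frac{|T| - 1}{S} \times 60 \times \frac{1}{5}
\end{equation}
Here $|T|$ is the length of the transcribed string in characters and $S$ is the entry time in seconds; five characters count as one word.
\begin{equation}
\mathrm{TER} = \frac{\mathit{INF} + \mathit{IF}}{C + \mathit{INF} + \mathit{IF}}
\end{equation}
Here $C$ counts correct characters, $\mathit{INF}$ counts incorrect characters left in the transcribed text, and $\mathit{IF}$ counts incorrect characters that were corrected. As a worked example, a 26-character sentence entered in 30 seconds gives $(26 - 1)/30 \times 60 / 5 = 10$ WPM.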

\subsection{Descriptive Statistics}\label{descriptive-statistics}

This section reports descriptive statistics for the objective performance measures and the subjective workload assessments.

\subsubsection{Objective Measures}\label{objective-measures}

Objective typing performance was assessed using words per minute (WPM) and total error rate (TER), as shown in Tables~\ref{tab:wpm} and~\ref{tab:ter}. Descriptive statistics show that QWERTY clearly outperformed the alternative layouts in terms of typing speed. Participants achieved a mean speed of 17.28 WPM on QWERTY, whereas both Dvorak (8.27 WPM) and Circle (8.45 WPM) resulted in substantially lower average speeds. On average, participants thus typed more than twice as fast on the standard QWERTY layout as on either of the other two designs.

In contrast, accuracy differences between layouts were relatively small. Mean TER values were low across all conditions, with QWERTY showing a slightly higher average error rate (0.036) than Dvorak (0.0298) and Circle (0.0265). Overall, the objective results suggest that layout differences were more pronounced for speed than for accuracy.

<<echo=FALSE, message=FALSE>>=

ter_stats <- results %>%
  summarise(
    qwerty_min = min(qwerty_ter, na.rm = TRUE),
    qwerty_q1 = quantile(qwerty_ter, 0.25, na.rm = TRUE),
    qwerty_median = median(qwerty_ter, na.rm = TRUE),
    qwerty_mean = mean(qwerty_ter, na.rm = TRUE),
    qwerty_q3 = quantile(qwerty_ter, 0.75, na.rm = TRUE),
    qwerty_max = max(qwerty_ter, na.rm = TRUE),

    dvorak_min = min(dvorak_ter, na.rm = TRUE),
    dvorak_q1 = quantile(dvorak_ter, 0.25, na.rm = TRUE),
    dvorak_median = median(dvorak_ter, na.rm = TRUE),
    dvorak_mean = mean(dvorak_ter, na.rm = TRUE),
    dvorak_q3 = quantile(dvorak_ter, 0.75, na.rm = TRUE),
    dvorak_max = max(dvorak_ter, na.rm = TRUE),

    circle_min = min(circle_ter, na.rm = TRUE),
    circle_q1 = quantile(circle_ter, 0.25, na.rm = TRUE),
    circle_median = median(circle_ter, na.rm = TRUE),
    circle_mean = mean(circle_ter, na.rm = TRUE),
    circle_q3 = quantile(circle_ter, 0.75, na.rm = TRUE),
    circle_max = max(circle_ter, na.rm = TRUE)
  )
ter_tidy <- ter_stats %>%
  pivot_longer(
    cols = everything(),
    names_to = c("layout", ".value"),
    names_sep = "_"
  )
ter_tidy <- ter_tidy %>%
  select(layout, min, q1, median, mean, q3, max)

wpm_stats <- results %>%
  summarise(
    qwerty_min = min(qwerty_wpm, na.rm = TRUE),
    qwerty_q1 = quantile(qwerty_wpm, 0.25, na.rm = TRUE),
    qwerty_median = median(qwerty_wpm, na.rm = TRUE),
    qwerty_mean = mean(qwerty_wpm, na.rm = TRUE),
    qwerty_q3 = quantile(qwerty_wpm, 0.75, na.rm = TRUE),
    qwerty_max = max(qwerty_wpm, na.rm = TRUE),

    dvorak_min = min(dvorak_wpm, na.rm = TRUE),
    dvorak_q1 = quantile(dvorak_wpm, 0.25, na.rm = TRUE),
    dvorak_median = median(dvorak_wpm, na.rm = TRUE),
    dvorak_mean = mean(dvorak_wpm, na.rm = TRUE),
    dvorak_q3 = quantile(dvorak_wpm, 0.75, na.rm = TRUE),
    dvorak_max = max(dvorak_wpm, na.rm = TRUE),

    circle_min = min(circle_wpm, na.rm = TRUE),
    circle_q1 = quantile(circle_wpm, 0.25, na.rm = TRUE),
    circle_median = median(circle_wpm, na.rm = TRUE),
    circle_mean = mean(circle_wpm, na.rm = TRUE),
    circle_q3 = quantile(circle_wpm, 0.75, na.rm = TRUE),
    circle_max = max(circle_wpm, na.rm = TRUE)
  )

wpm_tidy <- wpm_stats %>%
  pivot_longer(
    cols = everything(),
    names_to = c("layout", ".value"),
    names_sep = "_"
  )

wpm_tidy <- wpm_tidy %>%
  select(layout, min, q1, median, mean, q3, max)

@


% WPM table
\begin{table}[H]
\centering
\caption{Summary of Words per Minute (WPM)}
\label{tab:wpm}
<<results='asis', echo=FALSE>>=
kable(wpm_tidy, format="latex", booktabs=TRUE)
@
\end{table}

% TER table
\begin{table}[H]
\centering
\caption{Summary of Total Error Rate (TER)}
\label{tab:ter}
<<results='asis', echo=FALSE>>=
kable(ter_tidy, format="latex", booktabs=TRUE)
@
\end{table}


<<echo=FALSE, results='hide'>>=
# Create figures directory if it doesn't exist
dir.create("../figures", showWarnings=FALSE)

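# Helper functions: mean +/- one standard deviation, and mean with 95% confidence interval
mean_sd <- function(x) {
  x <- x[!is.na(x)]  # drop missing values, matching na.rm = TRUE in the summary tables
  m <- mean(x)
  s <- sd(x)
  c(mean=m, lower=m-s, upper=m+s)
}

mean_ci <- function(x) {
  x <- x[!is.na(x)]  # drop missing values so length(x) is the effective sample size
  m <- mean(x)
  se <- sd(x)/sqrt(length(x))
  ci <- qt(0.975, df=length(x)-1)*se  # half-width of the 95% t-interval
  c(mean=m, lower=m-ci, upper=m+ci)
}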

# TER stats
ter_stats <- rbind(
  mean_ci(results$qwerty_ter),
  mean_ci(results$dvorak_ter),
  mean_ci(results$circle_ter)
)

# Save TER barplot as PDF using LaTeX-compatible fonts
suppressMessages(pdf("../figures/ter_plot.pdf"))
bar_pos <- barplot(
  ter_stats[,"mean"],
  names.arg=c("QWERTY","DVORAK","CIRCLE"),
  ylab="Total Error Rate (TER)",
  main="TER of layouts",
  ylim=c(0, max(ter_stats[,"upper"])*1.1)
)
# Add confidence intervals
arrows(
  x0=bar_pos, y0=ter_stats[,"lower"],
  x1=bar_pos, y1=ter_stats[,"upper"],
  angle=90, code=3, length=0.05
)
dev.off()

# WPM stats (mean +/- 1 SD, whereas the TER bars above use the 95% CI)
wpm_stats <- rbind(
  mean_sd(results$qwerty_wpm),
  mean_sd(results$dvorak_wpm),
  mean_sd(results$circle_wpm)
)

# Save WPM barplot as PDF using LaTeX-compatible fonts
suppressMessages(pdf("../figures/wpm_plot.pdf"))
bar_pos <- barplot(
  wpm_stats[,"mean"],
  names.arg=c("QWERTY","DVORAK","CIRCLE"),
  ylab="Words per minute (WPM)",
  main="WPM of layouts",
  ylim=c(0, max(wpm_stats[,"upper"])*1.1)
)
arrows(
  x0=bar_pos, y0=wpm_stats[,"lower"],
  x1=bar_pos, y1=wpm_stats[,"upper"],
  angle=90, code=3, length=0.05
)
dev.off()
@

%Include ter and wpm plot
\begin{figure}[H]
\centering

\includegraphics[width=0.7\columnwidth]{../figures/wpm_plot.pdf}

\vspace{-0.4cm}

\includegraphics[width=0.7\columnwidth]{../figures/ter_plot.pdf}

\caption{WPM (top) and TER (bottom) by keyboard layout. Error bars show $\pm 1$ standard deviation for WPM and the 95\% confidence interval for TER.}
\end{figure}
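
The error bars in these plots follow the helper functions in the plotting code. For a sample of $n$ values with mean $\bar{x}$ and standard deviation $s$ (notation introduced here for illustration), the TER bars span a 95\% confidence interval based on the $t$ distribution:
\begin{equation}
\bar{x} \pm t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}}
\end{equation}
The WPM bars instead span one standard deviation around the mean:
\begin{equation}
\bar{x} \pm s
\end{equation}
Because the two intervals measure different things, the bar lengths in the two plots should not be compared directly.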


\subsubsection{Subjective Measures}\label{subjective-measures}

Subjective workload was measured using the NASA-TLX dimensions of mental demand, physical demand, effort, frustration, and perceived performance (Figure~\ref{fig:nasa}). Across all workload categories, QWERTY was consistently rated as the most favorable layout, indicating lower perceived demand and higher user comfort.

Dvorak and Circle received generally similar subjective evaluations, with no major differences between them. However, Circle tended to be rated more favorably than Dvorak across the NASA-TLX dimensions, suggesting a modest subjective preference for the circular layout design. Overall, the subjective findings align with the objective performance trends, with QWERTY being clearly preferred by participants.

<<echo=FALSE, results='hide'>>=
# Read NASA-TLX data
nasa <- read.csv("../data/nasaTLX.csv")
nasa$layout <- factor(nasa$layout)

# Save boxplots as PDF using LaTeX-compatible fonts
suppressMessages(pdf("../figures/nasa_boxplots.pdf", width=10, height=20, pointsize = 24))
par(mfrow=c(3,2))
boxplot(mental_demand ~ layout, data=nasa, main="Mental Demand")
boxplot(physical_demand ~ layout, data=nasa, main="Physical Demand")
boxplot(performance ~ layout, data=nasa, main="Performance")
boxplot(effort ~ layout, data=nasa, main="Effort")
boxplot(frustration ~ layout, data=nasa, main="Frustration")
par(mfrow=c(1,1))
dev.off()
@

% Include NASA-TLX boxplots
\begin{figure}[H]
\centering
\includegraphics[width=\columnwidth]{../figures/nasa_boxplots.pdf}
\caption{NASA-TLX Scores by Keyboard Layout}
\label{fig:nasa}
\end{figure}

\subsection{Inferential Statistics}\label{inferential-statistics}
To examine the effect of keyboard layout on typing performance, repeated-measures ANOVAs were conducted separately for typing speed (WPM) and total error rate (TER).


\subsubsection{Typing speed (WPM)}
The repeated-measures ANOVA revealed a significant main effect of layout on typing speed, $F(2, 22) = 120.56$, $p < .001$ (Table~\ref{tab:anova_wpm}). Post-hoc pairwise comparisons with Bonferroni correction (Table~\ref{tab:posthoc}) indicated that QWERTY yielded significantly higher typing speeds than both Dvorak ($p < .001$) and Circle ($p < .001$).
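
As a consistency check on the reported degrees of freedom, consider a one-way repeated-measures design with $k$ layouts and $n$ participants, assuming complete data for every participant (the symbols are notation introduced here for illustration). The test statistic is
\begin{equation}
F = \frac{\mathit{MS}_{\mathrm{layout}}}{\mathit{MS}_{\mathrm{error}}}
\end{equation}
with degrees of freedom
\begin{equation}
df_1 = k - 1, \qquad df_2 = (k - 1)(n - 1).
\end{equation}
With $k = 3$, the reported $df_2 = 22$ corresponds to $n = 12$, since $(3 - 1)(12 - 1) = 22$.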


%Anova RM for WPM
<<echo=FALSE, results='hide'>>=
# dplyr and tidyr are already loaded in the first chunk of the Experiment section

# Add participant ID
results$id <- 1:nrow(results)

# --- WPM Long Format ---
wpm_long <- results %>%
  select(id, qwerty_wpm, dvorak_wpm, circle_wpm) %>%
  pivot_longer(
    cols = -id,
    names_to = "layout",
    values_to = "wpm"
  )

wpm_long$id <- factor(wpm_long$id)

wpm_long$layout <- factor(wpm_long$layout,
                         levels=c("qwerty_wpm","dvorak_wpm","circle_wpm"),
                         labels=c("QWERTY","DVORAK","CIRCLE"))

# --- RM ANOVA for WPM ---
anova_wpm <- aov(wpm ~ layout + Error(id/layout), data=wpm_long)
@

\begin{table}[H]
\centering
\caption{Repeated-Measures ANOVA for WPM}
\label{tab:anova_wpm}
<<results='asis', echo=FALSE>>=
wpm_tab <- summary(anova_wpm)[[2]][[1]]

wpm_effect <- wpm_tab["layout", , drop=FALSE]

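# Format the p-value from the fitted model instead of hard-coding it
wpm_p <- wpm_effect$`Pr(>F)`
wpm_effect$`Pr(>F)` <- ifelse(wpm_p < 0.001, "$< .001$", sprintf("%.3f", wpm_p))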

colnames(wpm_effect) <- c("Df", "Sum Sq", "Mean Sq", "F value", "p-value")

kable(wpm_effect,
      format="latex",
      booktabs=TRUE,
      escape=FALSE)
@
\end{table}

\subsubsection{Total Error Rate (TER)}
In contrast, the ANOVA for total error rate did not reveal a significant effect of layout, $F(2, 22) = 0.71$, $p = .505$ (Table~\ref{tab:anova_ter}). This suggests that accuracy was comparable across the QWERTY, Dvorak, and Circle layouts. Although QWERTY exhibited a slightly higher mean TER than the other layouts, these differences were not statistically significant. Therefore, while QWERTY facilitated faster typing, we found no evidence that it compromised accuracy.

%Anova RM for TER
<<echo=FALSE, results='hide'>>=

# --- TER Long Format ---
ter_long <- results %>%
  select(id, qwerty_ter, dvorak_ter, circle_ter) %>%
  pivot_longer(
    cols = -id,
    names_to = "layout",
    values_to = "ter"
  )

ter_long$id <- factor(ter_long$id)

ter_long$layout <- factor(ter_long$layout,
                         levels=c("qwerty_ter","dvorak_ter","circle_ter"),
                         labels=c("QWERTY","DVORAK","CIRCLE"))

# --- RM ANOVA for TER ---
anova_ter <- aov(ter ~ layout + Error(id/layout), data=ter_long)
@

\begin{table}[H]
\centering
\caption{Repeated-Measures ANOVA for TER}
\label{tab:anova_ter}
<<results='asis', echo=FALSE>>=
ter_tab <- summary(anova_ter)[[2]][[1]]

ter_effect <- ter_tab["layout", , drop=FALSE]

colnames(ter_effect) <- c("Df", "Sum Sq", "Mean Sq", "F value", "p-value")

kable(ter_effect,
      format="latex",
      booktabs=TRUE,
      escape=FALSE)
@
\end{table}


\subsubsection{Post-hoc Comparison for WPM}
Post-hoc pairwise comparisons with Bonferroni adjustment were conducted to further explore differences between keyboard layouts (Table~\ref{tab:posthoc}). QWERTY was significantly faster than Dvorak ($p < .001$) and Circle ($p < .001$), confirming the advantage of the standard layout. The difference between Dvorak and Circle was not significant ($p = 1.000$), indicating comparable performance between these alternative layouts. The main effect of layout on typing speed is therefore driven primarily by the superior performance of QWERTY, while the two non-standard layouts yield similar typing speeds.
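
The adjusted $p$-value of $1.000$ is a consequence of the correction itself. Assuming the usual Bonferroni rule, each unadjusted $p$-value is multiplied by the number of comparisons $m$ and capped at 1 (the symbols $m$ and $p_{\mathrm{adj}}$ are notation introduced here for illustration):
\begin{equation}
p_{\mathrm{adj}} = \min(1,\; m \cdot p)
\end{equation}
Three layouts give $m = 3$ pairwise comparisons, so any unadjusted $p \geq 1/3$ is reported as $1.000$.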

% Post-Hoc analysis with bonferroni correction for WPM
\begin{table}[H]
\centering
\caption{Post-hoc comparison of WPM between layouts with Bonferroni correction}
\label{tab:posthoc}
<<echo=FALSE, results='asis'>>=
suppressMessages(library(emmeans))

suppressMessages(emm_wpm <- emmeans(anova_wpm, ~ layout))

posthoc <- pairs(emm_wpm, adjust = "bonferroni")
posthoc_df <- as.data.frame(posthoc)

posthoc_df <- posthoc_df %>%
  mutate(p.value = ifelse(p.value < 0.001, "$<0.001$", sprintf("%.3f", p.value)))

kable(
  posthoc_df,
  format = "latex",
  booktabs = TRUE,
  digits = 3,
  float=FALSE,
  escape=FALSE
)
@
\end{table}
\section{Discussion}\label{discussion}

\printbibliography
\end{document}