The Journey of a Touch - Part I

The main goal behind the “Starting with the Basics” series was to show that, to deliver even the most basic functionality to your end users, a large number of systems need to work well, in concert. This is, of course, without even diving into Apple’s numerous frameworks. It was all meant to set the stage for what I believe to be a much more interesting topic: the mechanisms through which a touch event on a phone’s screen becomes an action in an app.


For each type of device they designed, Apple selected the most fitting operating system and the best suited input and output mechanisms, to deliver unique experiences, appropriate for the host device’s form factor. On an iPhone, you use iOS and you interact with applications through touch gestures. On a Mac, with macOS, you typically use a combination of keyboard and mouse/trackpad, as well as various other peripherals (decks, drawing tablets, etc.). On an Apple Watch, you use a combination of the Digital Crown and touch gestures to interact with applications running on watchOS. On the newer Vision Pro, with visionOS, you use a combination of eye tracking and hand gestures and, at times, various buttons. And finally, on an Apple TV, you use a remote control with a built-in trackpad to interact with tvOS. On top of those, you have a wide variety of additional accessories you can connect to your device, as well as voice commands.

Regardless of the OS and the exact input technology, the main mechanisms involved are generally the same:

  • An input device, which captures input and translates it into a binary data packet. Then, the binary data is transmitted to the device’s SoC, through dedicated lines on a dedicated bus;

  • A driver running in the host device’s operating system kernel, which manages the communication between the input component and the operating system. It instructs the operating system how to interact with the device and how to control it. It also converts the device-specific data into a more generic Human Interface Device (HID) packet (the IOKit IOHIDEvent);

  • An IO Manager, which aggregates and routes IO events from various connected devices to other software components. On iOS, this is backboardd;

  • An Application and Window Manager/Server, which receives IOHIDEvents from the IO Manager and routes them to the relevant application. The relevant application is generally determined by keeping a record of which applications are in the foreground and/or in focus, as well as the frames of their windows (their coordinates). This work is divided between the SpringBoard and backboardd processes on iOS, and handled by the WindowServer process on macOS;

  • Applications and Frameworks, which ultimately receive the input event and effect some change in their own state, based on the input. Here, the IOKit IOHIDEvent is converted to application framework specific formats, such as UITouch (UIKit) or an appropriate Gesture type (SwiftUI);

  • A Window Compositor and a Render Server, which render the images to be displayed on the screen(s), at the frequency dictated by the display with the highest refresh rate;

  • Optionally, depending on the design and technology, there can be various buffers and/or shared memory locations, on any of the previously mentioned components. Their purpose may vary, from ensuring that events are not lost in case of failures to various memory, speed or power consumption optimizations.
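To make these hand-offs a little more concrete, the sketch below models how the same physical touch is represented at each stage of the list above. The type names are invented for illustration; only IOHIDEvent, UITouch and the SwiftUI gesture values mentioned in the comments are real Apple types.

import CoreGraphics

// Illustrative only: the same physical touch is represented differently at each
// layer listed above. These cases are conceptual stand-ins, not Apple's real types.
enum TouchRepresentation {
    case controllerPacket(bytes: [UInt8])            // serialized by the touch controller
    case hidEvent(x: Double, y: Double, phase: Int)  // generic HID form (IOKit IOHIDEvent)
    case frameworkEvent(location: CGPoint)           // UITouch (UIKit) / gesture value (SwiftUI)
}

/// A very rough model of the hand-offs between the components listed above,
/// assuming the coordinates and phase have already been decoded from the packet.
func journey(x: Double, y: Double, phase: Int, serialized: [UInt8]) -> [TouchRepresentation] {
    [
        .controllerPacket(bytes: serialized),          // 1. produced by the digitizer
        .hidEvent(x: x, y: y, phase: phase),           // 2. produced by the kernel driver
        .frameworkEvent(location: CGPoint(x: x, y: y)) // 3. delivered inside the target app
    ]
}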

Throughout this article, we are going to follow the entire journey of a touch event, from the moment an end-user presses on a button in an application’s UI, to the moment the button’s effects are visible on the screen.

For reference, the application will render a single view, as shown in the screenshot below.

 

“The Journey of a Touch”. The initial state (left) and the final state, after the button had been pressed 10 times (right)

 

The following snippet represents the view’s code. We are going to explore SwiftUI in detail in the following sections. For now, the code is just for support, in case you would like to explore it further.

//   JourneyOfATouch.swift
//===================
//   Created by Samwise Prudent on 23.06.2025
//   Copyright (c) 2025 Prudent Leap Software SRL. All rights reserved.
//   


import SwiftUI

struct JourneyOfATouch: View {
    @State private var counter = 0
    var body: some View {
        Spacer()
            .frame(maxHeight: 40)
        Text("The journey of a touch")
            .font(.largeTitle)
        Spacer()
        Button(action: {
            counter += 1
        }, label: {
            Text("Please press the button")
        })
        .buttonStyle(.borderedProminent)
        Text("The button has been pressed \(counter) times")
        Spacer()
    }
}

#Preview {
    JourneyOfATouch()
}

The diagram below represents a very high level overview of the main elements that work together in order to capture, interpret and transform the act of touching an area on your screen into an action that effects change in an application.

 

Journey of a Touch - High Level Overview of the main components

 

Note how, in the diagram above, some elements exceed the outlines of their parent structures. This is intentional, because those elements (for example, the controllers) are used as interfaces with external components.

 

Understanding the devices

The announcement of the first generation iPhone, at Macworld 2007, is still considered one of the most impactful keynote presentations in recent history. With the original iPhone, Apple entered a mobile phone market dominated by two telecommunications giants, Nokia (Finland) and Motorola (United States of America), with Nokia leading by a large margin. Nokia’s flagship multimedia computer, the Nokia N95, was also looking to deliver a rich experience. Its design, however, was still heavily rooted in the classic, time-tested and resilient button-based form factor.

Apple’s original iPhone marked several shifts in the mobile phone market. First, it encouraged the release and adoption of a new generation of Operating Systems, primarily focused on touch-based mobile devices. At the time, the main Operating Systems were Nokia’s Symbian and Microsoft’s Windows Mobile, together with various proprietary Operating Systems, such as BlackBerry OS. Secondly, it solidified a design language around the modern smartphone, with a large screen and only a few hardware buttons. The latter was also made possible by the manufacturers’ move away from resistive (pressure sensitive) touchscreens to a different technology, the capacitive touch screen. This technology is described in Apple’s patents, US8243027 - Touch Screen Liquid Crystal Display, US7479949 - Touch Screen Device, Method, and Graphical User Interface for Determining Commands by Applying Heuristics, US8432371 - Touch Screen Liquid Crystal Display and US7663607 - Multipoint Touchscreen. The development of capacitive touch screen technology, combined with a dedicated, touch-focused User Interface, sparked a new generation of devices. This combination enabled a stylus-free experience, enriched by multitouch capabilities. If you experienced the transition, you may recall that some Android devices either didn’t support multitouch, or implemented it inconsistently. Apple’s multitouch was leaps and bounds ahead.

On an iPhone, iPad or Apple Watch, the screen assembly is both an input and an output device at the same time. Underneath the screen’s glass, you would find a conductive layer (usually made of Indium Tin Oxide - ITO) and the screen’s digitizer. These components make up the main assembly that acts as the input side of the phone’s screen. Underneath those, you would find the LCD or OLED assembly - the output part of the screen.

The screen’s input assembly - the digitizer - consists of an array of sensors and a touch controller (a Touch Application Specific Integrated Circuit, or Touch ASIC). As seen in Apple’s patents US8243027B2 and US8432371B2 (in the patent material, sheet 17, Fig. 25), Apple may have preferred digitizers that are compliant with the SPI specification, but this is not necessarily indicative of the actual implementation. SPI is slightly more complex than other similar specs (such as I2C), but it does ensure higher bandwidth.

The Touch IC itself usually resides on the phone’s main logic board and connects to the sensors array via a flat flexible cable. The main logic board is a multi-layered PCB assembly which also hosts the SoC, the modem and radio modules, the Input/Output connectors, the main memory modules, various security components and other modules.

In 2014, Apple filed a new patent application for an in-cell touch technology. As a result, the older technology became known as on-cell touch. In essence, in-cell technology eliminates the need to overlay the display and the sensing layers, by interweaving the touch sensors with the pixel cells of the display. Apple holds a patent for an implementation of the technology, under US20140225838A1 - In-Cell Touch for LED.

The iPhone hardware is controlled by its dedicated firmware and by the Apple iOS Operating System, which is based on Mac OS X (today’s macOS).

 

Apple does not typically publish much information about the internal components of their devices, nor do the manufacturers of these components. For this reason, the sections discussing the digitizer and its connection to the SoC are speculative. It is likely that Apple is using some variant of SPI for the bus between the digitizer and the SoC, it is likely they are using DMA, and it is possible that they use out-of-band hardware interrupts.

 

Capturing a touch event and generating an Interrupt (in a hypothetical SPI-based flow)

When it comes to events directly initiated by end users’ interaction with any device, from any manufacturer, the trigger is always some form of change in the state of an input device. In most cases, end users interact with smartphones by touching a region of the screen. Because the human body is a relatively good conductor, the touch disrupts the electrostatic field of the capacitive screen’s digitizer, causing a change in the state of its sensors array (affecting the mutual capacitance between the sensors). This change is detected by the touch controller, which eventually sends the change information to the SoC, for processing. The diagram below depicts a general implementation of the touch section of a touchscreen, as an input device. It abstracts away the data bus used for communication between the Peripheral Touch Controller (which controls the sensors array) and the Host Touch Controller (which orchestrates other similar peripheral controllers, on the same bus but with different purposes). You can find schematics for various devices online or in dedicated software and, with enough experience, you can infer some of the design choices, but this is out of scope for now.

 

Capturing a touch event and sending it to the SoC - Generic (in/out-cell, no bus specifications)

 

To conserve power while also maintaining the overall responsiveness of the device, the digitizer’s touch controller polls the sensors array at a fixed frequency, rather than reading it continuously. On ProMotion devices this scan rate is 120 Hz, independent of the display’s current refresh rate, while older devices scanned at 60 Hz. This means that, in modern devices, a ProMotion digitizer’s controller cycles through its detection routine (draw power from the battery, power up the sensors array, read the mutual capacitance of the array, then power it back down) 120 times every second. In other words, the controller takes a snapshot of the sensors array once every 8.33 milliseconds.

 

For comparison, the Apple Pencil samples its sensors at a frequency of 240 Hz, while gaming mice poll their controllers at much higher frequencies, up to 4000-8000 Hz. This variance in poll rates is handled by Apple’s Operating Systems and UI Frameworks, for example through coalesced UITouch events.
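If an app needs those higher-frequency samples, UIKit exposes them through coalesced touches. The UIView subclass below is a minimal sketch of how they can be read; the view itself and its drawing logic are hypothetical, but coalescedTouches(for:) is the real UIEvent API.

import UIKit

/// Minimal sketch: a view that collects every coalesced sample delivered
/// between two frames, instead of only the last one per display refresh.
final class StrokeCaptureView: UIView {
    private var samples: [CGPoint] = []

    override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let touch = touches.first else { return }
        // Coalesced touches contain the intermediate samples reported by the
        // digitizer (or an Apple Pencil) since the previous event delivery.
        let allTouches = event?.coalescedTouches(for: touch) ?? [touch]
        for sample in allTouches {
            samples.append(sample.location(in: self))
        }
        setNeedsDisplay() // redraw using the denser set of points
    }
}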

 

When the touch controller detects a change in characteristics (a signal), it performs an initial analysis, to determine if the change represents a valid touch event. It performs noise filtering, to ensure that the detected signal is not triggered by electrical noise (such as interference from other devices) and, in some cases, it may also periodically perform various calibration tasks (for example, to account for changes in temperature, which may also affect the characteristics of the sensors array).

If the signal matches the parameters of a valid touch event, the digitizer converts and serializes the event’s characteristics into a special data structure, which would later be processed by other components of the host device (in our case, an iPhone).

The serialized touch data structure - which includes the coordinates of the touched points, whether the touch gesture started or ended, and some other potentially useful information - is then persisted into a First-In First-Out (FIFO) queue, located in the controller’s memory. This allows the controller to queue up multiple touch events until they are processed by the host device, keeping the entire system responsive and ensuring no events are lost.
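As a rough illustration of this buffering stage, the sketch below models a hypothetical serialized touch report and the kind of small, fixed-capacity FIFO a controller might keep in local memory. The field layout and the capacity are invented for the example; they are not taken from any real Apple part.

/// Hypothetical serialized report, as a controller might store it (layout invented).
struct TouchReport {
    let x: UInt16          // sensor column where the touch was detected
    let y: UInt16          // sensor row where the touch was detected
    let phase: UInt8       // 0 = touch down, 1 = move, 2 = lift off
    let timestamp: UInt32  // controller tick count
}

/// Fixed-capacity FIFO, similar in spirit to the queue kept in the controller's memory.
struct TouchFIFO {
    private var storage: [TouchReport] = []
    let capacity = 16      // small and bounded, like on-chip memory would be

    /// Returns false when the queue is full; the oldest events are preserved.
    mutating func enqueue(_ report: TouchReport) -> Bool {
        guard storage.count < capacity else { return false }
        storage.append(report)
        return true
    }

    /// Drains everything currently queued, e.g. when the host issues a read.
    mutating func dequeueAll() -> [TouchReport] {
        defer { storage.removeAll() }
        return storage
    }
}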

With its limited processing capabilities and focus on power efficiency, the digitizer can’t do much more (nor should it). Therefore, after the touch information is persisted, it needs to be transferred to the SoC, for further processing.

There are many mechanisms Apple could use to transfer the touch information from the Touch Integrated Circuit to the SoC. Their initial patents mention I2C and SPI as possible buses to use for the touch screen assembly - and the schematics available online do indicate that Apple SoCs support both I2C and SPI. However, Apple is likely using a modified variant of an existing bus or a proprietary bus altogether.

As a hypothetical example, we are going to assume Apple uses an SPI bus with an out of band interrupt mechanism. While Apple didn’t publicly share the exact implementation details, it’s a plausible scenario, considering MacBook trackpads do use an SPI bus. Apple may prefer Message Signaled Interrupts, but it’s not necessarily the case. Both approaches have advantages and disadvantages and Apple can optimize the entire stack, from hardware to software and everything in-between.

Regardless of the exact implementation, though, the general mechanisms remain unchanged. The peripheral touch controller needs to communicate with its host controller on the SoC via some data bus and the SoC’s CPU requires a signal to stop its current execution and focus on processing the touch event.

The SPI Specification, originally developed by Motorola, defines a data bus used for synchronous, bi-directional (full-duplex) communication between a Main Node (the Host Controller) and one or multiple Peripheral Nodes. This data bus consists of four line types (wires):

  • CS (Chip Select) Line(s), used by the Main Node to select the peripheral it talks to. In the old specification, this was called the Slave Select line. Usually, there is a dedicated line for each peripheral.

  • SCLK (Serial Clock) Line, used by the Main Node to provide the clock signal that keeps data transfers synchronized with the peripheral it selects using the CS Line

  • PICO (Peripheral In, Controller Out) Line, used by the Main Node to send data to its selected Peripheral. In the old nomenclature, this line was MOSI

  • POCI (Peripheral Out, Controller In) Line, used by the peripheral to send data to the Main Node. In the old nomenclature, this line was MISO

In SPI, there is no dedicated message or packet format. Instead, the communication is governed by a command-response pattern, where the Host controller indicates what it needs to read (e.g. a status register, to know how many bytes are present in a buffer, or the contents of the buffer itself) and the Peripheral controller provides the response.
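Here is a sketch of what such a read sequence could look like from the host’s point of view. The command codes and the spiTransfer helper are hypothetical placeholders, since the actual protocol between Apple’s touch controller and the SoC is not public; a real implementation would program the host controller’s registers rather than call a Swift function.

// Hypothetical command codes; real touch controllers define their own register map.
enum SPICommand: UInt8 {
    case readStatus = 0x01  // "how many bytes are waiting in your FIFO?"
    case readFIFO   = 0x02  // "send me the contents of your FIFO"
    case clearIRQ   = 0x03  // "de-assert your interrupt line"
}

/// Stand-in for the full-duplex transfer primitive an SPI host controller provides.
/// In a real system this would assert CS, clock the command out on PICO and clock
/// `readLength` bytes in on POCI; here it simply returns zeroed bytes.
func spiTransfer(command: SPICommand, readLength: Int) -> [UInt8] {
    return [UInt8](repeating: 0, count: readLength)
}

/// The command-response pattern: first ask how much data is pending, then read it.
func readPendingTouchReports() -> [UInt8] {
    let status = spiTransfer(command: .readStatus, readLength: 1)
    let pendingBytes = Int(status.first ?? 0)
    guard pendingBytes > 0 else { return [] }
    let payload = spiTransfer(command: .readFIFO, readLength: pendingBytes)
    _ = spiTransfer(command: .clearIRQ, readLength: 0) // tell the peripheral to drop the line
    return payload
}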

By design, SPI is slightly more secure than I2C, another commonly used peripheral bus. SPI Hosts select a specific peripheral by pulling the dedicated Chip Select (CS) line (each peripheral has a dedicated pin on the host). While the data lines (PICO, POCI and SCLK) are still shared among devices, a peripheral only listens if its CS line is active, reducing the risk of unintended data exposure.

In contrast, I2C Hosts broadcast packets that include the address of the intended peripheral. By convention, only the controller that has the transmitted address would act on the message. Since the two I2C wires (Clock and Data) are shared across all peripherals on the bus, it’s much easier to eavesdrop on I2C traffic or spoof peripherals.

Regardless of interface, it’s good practice to secure the transmission, with various protection mechanisms. Some common approaches are encrypted packets, peripheral authentication via dedicated security chips and/or bus zone isolation.

Although SPI does not specifically include an Interrupt Line, one is commonly added when designing responsive, power-efficient devices. In most cases, it is a dedicated out-of-band line (not part of the standard SPI Bus specification).

 

Capturing a touch event and sending it to the SoC - SPI and Out-of-Band Hardware Interrupt

 

As an example scenario, let us assume an end-user touches the "Please press the button" button in the interface presented at the start of this post. As long as the user touches the screen for slightly more than 8 milliseconds, the Touch IC will detect a change in the mutual capacitance values of its sensors array. Having confirmed the readings indicate a valid touch, it will convert the analog readings into digital information and then store the details in its local memory.

Having persisted the touch information into its buffer, the peripheral touch controller sends a signal on its interrupt line output pin. The interrupt signal is characterized by the presence of a higher voltage (relative to the ground). In slightly more technical terms, the digitizer asserts the interrupt line, signaling that it encountered a scenario that needs to be handled (serviced) by the circuit on the other end of the line. This line is connected to a dedicated General Purpose Input/Output (GPIO) connector, which carries the signal further to the interrupt controller (the Apple Interrupt Controller, or AIC, in our case).

When the AIC receives the interrupt signal, it generates a request known as a Hardware Interrupt Request (IRQ). The purpose of an IRQ is to halt whatever task the processor may currently be doing and trigger the Interrupt Response process.

Based on system load and other criteria, the AIC then assigns a CPU Core to process the IRQ. At this moment, the AIC creates an interrupt mask, to ensure that subsequent IRQs are only directed to that core in specific circumstances (for example, higher priority IRQs of a different class). Otherwise, the Core is allowed to process the interrupt until it completes the task. This approach has two main advantages:

  • It helps prevent possible data corruption, while still ensuring that very high priority events (Kernel panics, for example) can be serviced even by a core that is currently engaged in an IRQ process, avoiding deadlocks;

  • It allows other cores to potentially pick up other interrupts from the same digitizer, increasing the system’s responsiveness

During this time, other cores may handle other interrupts from the same digitizer and/or perform other tasks. For this reason, access to shared memory areas is controlled through a very carefully set up system of mutually exclusive locks and semaphores. This guarantees events are handled in order and it ensures one core doesn’t accidentally overwrite the data another core needs.
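The sketch below illustrates the kind of lock-guarded, in-order access this implies, using NSLock in user-space Swift purely as an analogy for the kernel’s own locking primitives (which are considerably more nuanced).

import Foundation

/// User-space illustration of mutually exclusive access to a shared event queue.
/// The kernel uses its own spinlocks/mutexes; NSLock here is only an analogy.
final class SharedEventQueue {
    private let lock = NSLock()
    private var events: [[UInt8]] = []   // serialized touch reports

    /// Called by whichever core dequeued data from the digitizer.
    func append(_ event: [UInt8]) {
        lock.lock()
        defer { lock.unlock() }
        events.append(event)             // order is preserved under the lock
    }

    /// Called by the consumer (e.g. the driver's work loop) to drain events in order.
    func drain() -> [[UInt8]] {
        lock.lock()
        defer { lock.unlock() }
        let pending = events
        events.removeAll()
        return pending
    }
}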

Handling the (primary) interrupt

Compared to the previous model, where the digitizer polls its sensors at a set frequency, interrupts are handled as they occur. When a peripheral controller asserts its interrupt line, it maintains the voltage until its host controller signals it to de-assert the line. Since this signal is the initial trigger of any action on the CPU side, it is referred to as the primary interrupt. And since it is a physical signal (not software), it is a primary hardware interrupt.

The host controller is configured to instruct peripherals to clear the interrupt after the events are dequeued. This model allows the CPU Cores (which end up servicing the interrupts) to go into a deep sleep state. Requiring the CPU to poll every peripheral to determine if it should do anything would be highly inefficient.

 

Handling (servicing) the hardware (primary) interrupt

 

During normal operation, any given CPU Core is either in a deep sleep state or executing instructions from one of the many programs that might be running on the device. When an interrupt signal arrives, the core needs to temporarily suspend this work (or, if it is sleeping, wake up) and handle the interrupt request. This process is known as Exception Handling (IRQs are a subtype of Exception) and it generally follows a clear pattern. First, the core switches from Exception Level 0 (EL0, for Application execution) to Exception Level 1 (EL1, for Rich OS or Kernel execution), to gain access to privileged Kernel instructions. After it gains the elevated privileges, it also automatically saves the Current Program Status Register, the Program Counter (the next instruction it should execute when it resumes the interrupted task) and other special registers, in dedicated system registers. Next, the core jumps to a special area of Kernel memory, known as the Interrupt Vector Table (or, on x86, the Interrupt Descriptor Table). The IVT stores short branching instructions, generally known as trampolines or redirects, which instruct the CPU to jump to another area in memory, where the appropriate, longer-form, interrupt-specific Interrupt Service Routine (ISR) is stored. The ISR runs in a highly restricted kernel space, in an interrupt context, and its purpose is to identify the type of device that raised the interrupt, then schedule the more complex operations required to actually service the interrupt to a device-specific handler. Code that is executed in an interrupt context can only use specific memory addresses, cannot acquire locks, cannot create new memory structures and it completely blocks the CPU Core while it runs.

The ISR is also known as the top half of a driver, while the device-specific handler is known as the bottom half of the driver. The top half of the driver handles the primary (or direct) hardware interrupt.

In their Kernel Programming Guide and Handling Interrupts in IOKit guides, Apple describes how, in OS X and, by extension, in iOS, this ISR is a generic low level interrupt handling routine, which Apple programs and maintains. Its purpose is to hand off the direct interrupt to a device handler (in this case, the digitizer’s handler), schedule an indirect interrupt (an interrupt handler registered on an IOKit Work Loop, typically inside a Kernel Extension (kext)) and then immediately clear the interrupt bit in the Apple Interrupt Controller. By doing so, the Operating System Kernel (specifically the Mach Scheduler) gets the chance to further arrange the execution of other tasks (other IRQs or perhaps some other urgent work). As a result, the CPU Core can reload its previous state and continue execution within a reasonably short timeframe.
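As a purely conceptual illustration of that hand-off (real interrupt handling code lives in the kernel and is written in C/C++, not Swift), the sketch below keeps the “top half” to the bare minimum and defers the actual work to a scheduled handler, here modeled with a serial dispatch queue.

import Foundation

/// Conceptual model of the top-half / bottom-half split. Real interrupt handlers
/// run in kernel space with far stricter constraints than any Swift code.
final class TouchInterruptModel {
    // Stands in for the driver's scheduled work (the "bottom half").
    private let deferredWork = DispatchQueue(label: "digitizer.bottom-half")

    /// "Top half": runs in interrupt context, so it only identifies the source,
    /// schedules the heavy lifting and acknowledges the interrupt controller.
    func primaryInterrupt(acknowledge: () -> Void) {
        deferredWork.async { self.secondaryInterrupt() }  // schedule the bottom half
        acknowledge()   // stands in for clearing the interrupt bit in the AIC
    }

    /// "Bottom half": free to allocate, take locks and talk to the bus at length.
    private func secondaryInterrupt() {
        // e.g. issue the SPI read, parse the reports, build IOHIDEvents...
    }
}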

Handling the indirect (secondary) interrupt

Shortly after handling the primary interrupt, the system proceeds to execute the scheduled touch driver event handler’s IOKit WorkLoop. In the hypothetical case of an SPI bus, the driver could issue a request to the Main SPI Controller (the controller on the SoC) to retrieve events from the appropriate Subnode (Peripheral) Controller. The Peripheral SPI Controller would then dequeue the messages it has stored and transmit the binary (serialized) representation of every touch event it currently holds in its buffer, potentially in a batched touch events report.

 

Handling the secondary Interrupt

 

As Apple describes them, and as shown in “Exploring Apple’s drivers ecosystem”, IOKit WorkLoops are essentially gating mechanisms that ensure single-threaded access to the data structures used by the hardware. This is especially useful for driver constructs, which can be accessed concurrently by multiple threads for a variety of reasons (primary interrupts, timeout events and others). Device driver design and implementation are complex topics, which we are not going to explore further. For now, it is useful to know that IOKit WorkLoops run on dedicated kernel threads, in the normal kernel space, which allows them to allocate memory, acquire locks and perform more complex logic. When the event assigned to an IOKit WorkLoop is processed by the system, a more comprehensive analysis of the touch data is performed. The final task of the driver is to package the processed information into a standard IOHIDEvent, which represents the processed touch input in Apple’s HID Event System (you can find some examples here).

 

The mechanism could be slightly different on a Mac, where Kernel Extensions are deprecated and have largely been replaced by DriverKit System Extensions. System and Driver Extensions were announced at WWDC19, in a dedicated set of talks.

 

Improving the hypothetical SPI model

Another, more complex (and arguably more likely) model, still within the hypothetical SPI scenario, involves the use of DMA (Direct Memory Access) enabled SPI flows. In this scenario, the CPU handles two interrupt sources: the SPI Peripheral and the DMA Controller. The driver initialization phase, which occurs when the Operating System starts up, is slightly more complex, because it involves configuring the DMA context. This process requires loading the DMA controller’s drivers, setting up the controller configuration and initializing its memory-mapped registers, among other tasks.

 

Capturing and processing Touch Data with a DMA-enabled SPI interface

 

The trade-off is that DMA enables the transfer of data from the SPI peripheral to system memory (to be processed and encoded as an IOHIDEvent) with minimal CPU intervention. In contrast to the previous model, which requires the CPU to repeatedly read data from the controller’s dedicated MMIO register, DMA offloads most of this work by coordinating the transfer of data from the host controller to a dedicated system memory area. This allows the CPU to work on other tasks in the meantime.

In DMA-enabled flows, the CPU is responsible for initiating the DMA-supervised data transfer, and later for reading the data from the appropriate system memory buffer, once the DMA controller signals that the transfer is complete.

When the SPI Peripheral detects a valid touch event, it still buffers the data - and it still asserts an interrupt to the CPU. The generic ISR executes but, instead of issuing a read command to the SPI Host controller, the driver’s IOKit WorkLoop instructs the DMA controller to perform the transfer. After doing so, the WorkLoop waits for the DMA controller to signal completion.

Next, the SPI Host controller retrieves data from the peripheral controller and stores it in a dedicated RX_FIFO memory area.

The DMA controller then transfers data from the SPI Host controller’s RX_FIFO to a dedicated System Memory Buffer. Many Audio and Touch Screen implementations rely on ping-pong buffers. In these implementations, the DMA controller reads data from the peripheral, then writes to a buffer, usually named the Ping buffer. When the transfer for the Ping buffer is complete, the DMA controller raises an interrupt, signaling the CPU to read from that buffer. The driver’s interrupt handler responds by waking up the previously blocked WorkLoop, which had been put to sleep when the DMA transfer process was initiated.

The driver then reads the data from the system memory, processes it through a Driver Interface, then encodes it into an IOHIDEvent structure, which is then processed as a Human Interface Device Event.

While the CPU processes the Ping buffer, the DMA controller continues reading packets from the peripheral and writes them to another area in system memory, usually known as the Pong buffer.

When the transfer is complete, the DMA controller raises another interrupt, to signal the CPU that the Pong buffer is ready for processing. The IOKit WorkLoop wakes up, consumes and processes the message, then blocks again. This alternating process repeats for as long as there is data to be transferred.
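The alternation is easier to see in code. The sketch below simulates the ping-pong scheme with two plain Swift arrays and a flag; in a real system, the “producer” side is the DMA controller writing into physical memory, not Swift code.

/// Toy model of ping-pong (double) buffering. In hardware, the "producer" is the
/// DMA controller and the "consumer" is the driver running on a CPU core.
struct PingPongBuffers {
    private var ping: [UInt8] = []
    private var pong: [UInt8] = []
    private var dmaWritesToPing = true   // which buffer the DMA engine currently fills

    /// DMA side: append freshly transferred bytes to the active buffer.
    mutating func dmaWrite(_ bytes: [UInt8]) {
        if dmaWritesToPing { ping.append(contentsOf: bytes) }
        else { pong.append(contentsOf: bytes) }
    }

    /// "Transfer complete" interrupt: swap roles and hand the filled buffer to the CPU.
    mutating func transferComplete() -> [UInt8] {
        defer { dmaWritesToPing.toggle() }
        if dmaWritesToPing {
            let filled = ping
            ping.removeAll()
            return filled            // CPU parses this while DMA fills `pong`
        } else {
            let filled = pong
            pong.removeAll()
            return filled
        }
    }
}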


To Be Continued…
