The following is an alternate tutorial for installing and running Windows 10 on the Raspberry Pi 4. This version concentrates on running Windows from a single USB drive plugged into one of the rear USB 3.0 ports, which is both much faster than other methods and does not require the use of a micro SD card at all.
This guide is provided “AS IS”, with NO WARRANTY that it will work for your specific environment or even that you may not end up losing important data as a result. Therefore, by using this guide, you accept that the responsibility for any software or hardware damage is entirely with YOU.
Also, though perfectly legal (nothing in the licensing terms for Microsoft Windows prevents you from installing it on a Pi, and you can download the Windows 10 ARM64 installation files straight from Microsoft), this method of installing or running Windows on a Raspberry Pi is NOT endorsed by Microsoft or the Raspberry Pi Foundation. If you choose to follow this guide, you accept that Microsoft and the Raspberry Pi Foundation bear no liability with regard to the behaviour of Windows on the targeted platform.
By following any of the steps below, you implicitly acknowledge that you have read these conditions and have agreed to them.
A Raspberry Pi 4 whose EEPROM is recent enough to allow booting straight from USB. If you purchased a Pi 4 recently, this should already be the case; if not (i.e. if you find that your machine cannot boot from USB), you should download a recent version of a rpi-boot-eeprom-recovery archive from here, put all the files on an MBR-partitioned, FAT32-formatted SD card, and apply the update.
A fast USB 3.0 drive, with a capacity of at least 32 GB, such as a fast flash drive (please use a drive that has a write speed greater than 50 MB/s and that also has a sufficient random I/O speed, as your experience will be greatly diminished otherwise), a USB 3.0 SSD enclosure, etc.
Screen, keyboard, mouse & a powerful enough PSU.
As mentioned above, you will notice that no microSD card is used when following this guide (provided that your EEPROM is recent enough).
A Windows host machine to create the drive, since this guide uses Windows-only utilities.
A Windows 10 for ARM64 ISO or install.wim. At this stage, we recommend using the feature update to Windows 10, version 2004 (19041.330), as other releases, especially more recent ones, are known to break the ability to boot from USB. Because Microsoft does not yet publish retail ARM64 ISOs, as they do for x86 or x64, you need to use a third-party utility to create one, such as the one from https://uupdump.ml/. In this case, you want to use this direct link to download the script, allowing you to create the required 19041.330 installation media.
Plug in your target USB 3.0 drive. It is recommended that you unplug any other USB media, such as flash drives or USB HDDs, so that you don’t end up erasing them by mistake.
Run WoR.exe and select your language. Note that the language you select for WoR has no effect on the language Windows will be installed in, which depends on your source image.
Select your device in the dropdown (again, make sure you select the right device, as internal drives may be listed!) and select Raspberry Pi 4 [ARM64] for the other option.
On the Select your Windows on ARM image screen, pick the .iso/.wim/.esd/.ffu for the Windows 10 image you want to install, which you obtained in the Software Requirements section. If needed, wait for the image to be mounted, then select the edition you would like to install.
On the Select the drivers screen, choose the option that is suitable for you (most likely Use the latest package available on the server if you haven’t already downloaded the drivers).
On the Select the UEFI firmware screen, choose Use the latest firmware available on the server, as you will need the most up-to-date official UEFI firmware for USB boot.
On the Configuration screen, validate that everything is in order (don’t change the General configuration options unless you know what you are doing) and click Next. In particular, double-check that the Target device listed is really the drive you want to use on your Raspberry Pi, then press Install.
Wait for the installation to finish. Note that if the process takes more than 25 minutes to complete, the drive you are using is slow and will probably result in a poor Windows experience: the longer the drive takes to create, the more likely it is that Windows will perform poorly.
Remove the USB drive and plug it into one of the USB 3.0 ports of your Raspberry Pi 4 (make sure that it is one of the blue USB 3.0 ones). Windows should boot, go through the finalization stage of the installation process (it should reboot once), and let you log on after going through the various installation screens.
If you have a 4 GB or 8 GB model, you will find that the RAM is limited to 3 GB by default. To enable the whole RAM, you will need to go to the UEFI settings (Esc key during boot) and then go to Device Manager → Raspberry Pi Configuration → Advanced Configuration and set Limit RAM to 3 GB to <Disabled>. Then save your settings and reboot.
A recurring topic is Windows drivers for the Raspberry Pi 3 and 4 series.
MCCI DesignWare USB2 driver
This is for the front USB ports on Pi 3 and only the Type-C port on the Pi 4.
MCCI Corporation has made their TrueTask USB host stack available to the Raspberry Pi WoA community for non-commercial, evaluation purposes. MCCI did the original work for the 32-bit Windows IoT Core. It is available courtesy of Terrill Moore, CEO of MCCI, who graciously spent time in early 2019 to get it building and validated with the 64-bit Pi 3 UEFI.
Once you have downloaded everything above, you can proceed.
Open WoR and select the disk from the list – this will be your microSD card reader or USB storage device – then select Raspberry Pi 4 as the device you will use.
Select the build of Windows that WoR should use by pointing it to the correct install.wim file.
Use the latest drivers that the WoR server provides, and select the latest UEFI for the Raspberry Pi 4 in WoR.
Use the Advanced tab in WoR to limit memory to 1024 MB if you need the USB Type-C port; if you don’t need the Type-C port, RAM is still limited by the UEFI to 3 GB by default.
Edit the boot options in WoR if you need to (I always overclock, as my Pi has a fan and heatsink attached).
WoR will deploy Windows to the selected microSD card or USB storage device, which will take from 10 minutes to 3 hours depending on the speed of that device.
Safely remove the microSD card or USB storage device and move it to the Raspberry Pi.
This guide will most likely be updated if anything changes. The first boot will take between 6 minutes and 2 hours, depending on the speed of your microSD card or USB storage device. If there are issues during OOBE setup, pressing Shift + F10 and typing the appropriate command might help. If it doesn’t, you will need to test a different build of Windows 10 ARM64. Good luck!
Currently a lot of drivers are missing and not much development is happening. We are missing drivers for the audio jack, Wi-Fi and the GPU. Development of audio jack drivers is most likely to happen soon.
I frequently check the comments under this post, but for a faster response you can join the Discord server and ask for help there, as most people on that server will be able to answer your question. There you might also find a guide on how to customise a Windows 10 build to make it smaller and lighter.
In the past few days, there has been a lot of progress and a lot of publicity for this project, which shows the ecosystem’s desire and demand for lowering the barrier to entry on booting Arm SBCs, in this case the Raspberry Pi 4 of course.
Tweets, LinkedIn posts, CNX Software replies, and Hackster comments all tell the same story: allowing users to power on a single board computer, install the operating system of their choice using “normal” boot media, and proceed through an install process just as they are used to on a typical PC is a missing piece in the Arm ecosystem. Without the ability for “regular” users to start exploring Arm hardware and get up and running in a way they are used to, Arm servers will remain a niche product.
So, join us on the Discord Server, help contribute patches and code if you can, or simply spread awareness of the project on your Social Media channels!
One of the gaps between ServerReady and existing “Edge” SoCs is the latter’s general reliance on OS-coordinated device initialization and power state management. A typical BSP or port of an OS would involve clock source, GPIO, PoR and pin control/multiplexing drivers as prerequisites for any embedded devices, and usually I2C, SPI and voltage regulator drivers to be able to do anything “interesting” such as accessing sensors, doing storage I/O or driving graphics outputs.
While firmware could hypothetically pre-initialize everything into a working state, this presents a dilemma for devices meant to operate at the lowest possible power setting. Pre-initialization also means no device reconfiguration after OS boot.
ACPI has somewhat adapted to this space, but it has not abstracted away enough of the gory details. While ACPI does tidy up some platform device configuration via its interpreted AML byte code methods, it only appears to be marginally better than device tree for non-server systems. For example, ACPI’s notion of Generic Serial Bus (I2C, SPI, UART) and GPIO OpRegions lets device AML methods perform I/O without resorting to bit-banging MMIO addresses, but requires host operating system drivers to provide the underlying implementation. There are ACPI resource descriptors for tracking and describing GPIOs and pin configuration for devices, but these are again completely useless without appropriate drivers.
Great, so this basically reduces ACPI on non-server systems to obscure machine code coordinating a bunch of OS drivers, sourced from silicon providers and platform integrators. But maybe we can throw all these new vendor-specific OpRegions away, for compatibility’s sake, and code like it’s ACPI 4.0a?
Is writing AML really feasible?
Surely, device- and platform-specific configuration logic can be neatly limited to AML methods?
Well, that’s highly overrated.
Above is an excerpt of a TAD (Time and Alarm Device) for a memory-mapped RTC. It’s a good warning against doing anything beyond basic I/O and arithmetic in AML. Considering how straightforward RTCs are as a device class, this bit of code (roughly 1/4 of all the AML required) is unmaintainable and, without comments, would be completely incomprehensible. Yes, that’s a 5-second busy-wait in there, waiting on a completion from the device, but it might as well be looping forever. Note that AML methods in most implementations run under a global interpreter lock (which also means AML code is not reentrant – an OpRegion cannot be backed by AML).
AML is machine code. Really slow and limited machine code. ASL (the source compiled into AML) is about as expressive as 8048 assembly.
How about translating the following real-life SBC support code to AML? On the Raspberry Pi, the Arm cores don’t have access to device control blocks and must ask the VideoCore VPU (GPU) processor to act on their behalf via special “mailbox” requests in shared RAM. E.g.:
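As a rough illustration (not the exact routine referenced here), a C sketch of a VideoCore mailbox property request – the channel number, tag value and buffer layout follow the publicly documented property interface, but treat this as a simplified sketch:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative sketch of a VideoCore mailbox property call. The message
 * lives in shared RAM; the "doorbell" word written to the mailbox write
 * register packs the message address (bits [31:4]) and channel (bits
 * [3:0]) so the VPU knows where to find the request. */

#define MBOX_CHAN_PROP   8u           /* property-tags channel */
#define TAG_GET_FW_REV   0x00000001u  /* "get firmware revision" tag */

/* 16-byte aligned, as the mailbox interface requires */
static uint32_t __attribute__((aligned(16))) mbox_buf[8];

uint32_t mbox_build_fw_rev_request(void) {
    mbox_buf[0] = sizeof(mbox_buf);   /* total buffer size in bytes */
    mbox_buf[1] = 0;                  /* request code */
    mbox_buf[2] = TAG_GET_FW_REV;     /* tag id */
    mbox_buf[3] = 4;                  /* value buffer size */
    mbox_buf[4] = 0;                  /* request/response indicator */
    mbox_buf[5] = 0;                  /* value (filled in by the VPU) */
    mbox_buf[6] = 0;                  /* end tag */
    mbox_buf[7] = 0;                  /* padding */
    /* On real hardware this value is written to the mailbox write
     * register once the message is visible to the VPU. */
    return ((uint32_t)(uintptr_t)mbox_buf & ~0xFu) | MBOX_CHAN_PROP;
}
```

The real routine then polls the mailbox status register and performs cache maintenance around the shared buffer, which is exactly where the trouble starts.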
First of all, this is doing DMA…so for an AML implementation, you’ll need a chunk of memory. You can’t use an actual AML buffer for this, so you’ll have to carve some physical memory out in the UEFI memory map, mark it with the right memory attributes matching your coherency requirements, set up a SystemMemory OpRegion…
Oh, I’m having so much fun!
Well, Barbie, we are just getting started…
As you can see, the routine is a mix of MMIO and CPU operations, e.g. data cache cleaning and invalidation. This is problematic for compiling to AML, which doesn’t include any cache operations. You could do away with cache operations entirely by carving out a non-cache coherent memory chunk. Or perhaps you’re lucky and your device supports cache coherent DMA… Well, then it would have to look like this:
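A hypothetical C sketch of that coherent-DMA pattern, with portable atomic fences standing in for the DMB barriers Arm code would use (the buffer and register here are simplified stand-ins, not the original code):

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical sketch: a mailbox call over cache-coherent shared memory.
 * The cache clean/invalidate pairs are gone; ordering between filling the
 * message and ringing the doorbell is now enforced with barriers. */

static uint32_t msg[8];               /* message in coherent shared RAM */
static volatile uint32_t mbox_write;  /* stand-in for the MMIO write reg */

void mbox_call_coherent(uint32_t chan) {
    msg[0] = sizeof(msg);             /* ...fill in the rest of the request */
    /* Release barrier: make the message globally visible before the
     * doorbell write (a DMB on Arm). */
    atomic_thread_fence(memory_order_release);
    mbox_write = ((uint32_t)(uintptr_t)msg & ~0xFu) | (chan & 0xFu);
    /* ...poll for completion, then an acquire barrier before reading
     * the response out of the shared buffer. */
    atomic_thread_fence(memory_order_acquire);
}
```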
No more cache operations! But now we have barriers. AML doesn’t have memory barriers. Maybe you can play fast and loose? Well, expect problems when you rev up the cores to something a bit more OoO. Also, consider that an operating system’s AML interpreter is probably getting scheduled across multiple CPUs while running your method…
So, no AML for anything involving DMA. What a mess…
No one wants to write AML
Considering how problematic it is to write AML, Microsoft introduced their own notion of Platform Extension Plugins (PEPs) to stand in for entire AML methods.
PEPs are intended to be used for off-SoC power management methods. Since they are installable binaries, they can be updated on-the-fly as opposed to ACPI firmware which requires a firmware flash. … Power management was the original intent for PEPs, but they can be used to provide or override any arbitrary ACPI runtime method.
Providing power management using PEPs can be much easier to debug than code written for the ACPI firmware. …
PEPs can override any method and provide methods you didn’t know were necessary. The AML code now just provides a “skeleton” and some dummy method implementations. This makes ACPI system descriptions relying on PEPs completely useless, since PEPs are opaque platform knowledge, sourced from vendors and completely undocumented.
PEPs play no role in the construction of the ACPI namespace hierarchy because the namespace hierarchy must be provided in the firmware DSDT. When the ACPI driver evaluates a method at runtime, it will check against the PEP’s implemented methods for the device in question, and, if present, it will execute the PEP and ignore the firmware’s version. However, the device itself must be defined in the firmware.
Even worse, you can’t meaningfully tell which AML methods would be overridden by a PEP, or what kind of configuration data the PEP was meant to source. PEPs are even worse than device tree, because PEPs can fully hide configuration data that would otherwise be reported as properties through device tree.
This is a good explanation for why none of the Snapdragon laptops today can boot Linux using ACPI. And not just Linux – the HP Envy x2 cannot boot a “stock” Windows 10 image without Snapdragon customizations.
PEPs are not an answer.
Let’s put the BSP in TF-A
We know we don’t want to write in AML for a good number of reasons. In a few situations it is literally impossible to write safe and functional AML. We also want to avoid writing drivers for things OS vendors (and their customers) really don’t care about, like pin controllers and clock source management. And PEPs are not a standard interface, and their implementation is OS vendor specific.
It would appear that the best place to hide low-level platform drivers for device initialization and power state management is Trusted Firmware-A. TF-A is an industry-adopted TrustZone firmware usually used to implement PSCI or as a foundation for a Trusted Execution Environment (TEE). It’s a rich enough environment to contain complex code written in a high-level language. Also, TF-A likely already includes some of the component drivers. This way, TF-A becomes a software-based System Control Processor (SCP).
If the platform BSP entrails are squirreled away in TF-A, how would an OS interact with them? Via ACPI AML methods of course. Regardless of how the ACPI interface works, the actual calls to a firmware-based SCP would be via the well-standardized SMCCC specification.
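For a sense of what such calls look like at the interface level, here is a sketch of SMCCC function-ID encoding in C. The field positions come from the SMC Calling Convention; the specific SiP function number is made up for illustration:

```c
#include <stdint.h>

/* SMCCC function IDs pack the call type, calling convention, owning
 * entity and function number into one 32-bit value (per the SMC
 * Calling Convention). */
#define SMCCC_FAST_CALL   (1u << 31)            /* atomic "fast" call */
#define SMCCC_SMC64       (1u << 30)            /* 64-bit convention */
#define SMCCC_OWNER(o)    ((uint32_t)(o) << 24) /* bits [29:24] */
#define SMCCC_OWNER_SIP   2u                    /* silicon provider */
#define SMCCC_FN(n)       ((uint32_t)(n) & 0xFFFFu)

/* A hypothetical SiP service call asking a firmware-based SCP to set a
 * clock rate; the function number 0x10 is invented for this example. */
#define SIP_SET_CLOCK_RATE \
    (SMCCC_FAST_CALL | SMCCC_SMC64 | SMCCC_OWNER(SMCCC_OWNER_SIP) | SMCCC_FN(0x10))

/* On real hardware this ID goes into x0 of an SMC instruction, with
 * arguments in x1..x6; here we only show the encoding. */
```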
Isn’t putting stuff in TF-A bad? It’s not ideal, but putting it into AML or OS vendor-specific drivers is much worse. Other platforms such as IA-64 (SAL) and OpenPower (OPAL) rely on firmware interfaces to abstract some platform I/O and implementation-specific details.
Isn’t that cycle stealing? No, because we’re talking about operations done on behalf of the operating system requesting them.
Note: It’s not just generic off-the-shelf operating systems that would win from TF-A abstracting common SoC hardware and control interfaces. UEFI firmware itself could make good use of these, reducing the development and support effort for all platforms, and removing similar/duplicated functionality further reducing code size and bug counts.
But what about UEFI runtime services?
Runtime services are meant to abstract parts of the hardware implementation of the platform from the OS, but the interface is fairly limited in scope today. Here’s the current set of calls:
GetTime – Returns the current time, time context, and time keeping capabilities.
SetTime – Sets the current time and time context.
GetWakeupTime – Returns the current wakeup alarm settings.
SetWakeupTime – Sets the current wakeup alarm settings.
GetVariable – Returns the value of a named variable.
GetNextVariableName – Enumerates variable names.
SetVariable – Sets, and if needed creates, a variable.
SetVirtualAddressMap – Switches all runtime functions from physical to virtual addressing.
GetNextHighMonotonicCount – Subsumes the platform’s monotonic counter functionality.
ResetSystem – Resets all processors and devices and reboots the system.
UpdateCapsule – Passes capsules to the firmware with both virtual and physical mapping.
QueryVariableInfo – Returns information about the EFI variable store.
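As a concrete example, GetTime fills in an EFI_TIME structure. A simplified C rendition of that structure, with the field layout per the UEFI specification:

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified EFI_TIME, as returned by the GetTime runtime service
 * (field layout per the UEFI specification). */
typedef struct {
    uint16_t Year;        /* 1900 - 9999 */
    uint8_t  Month;       /* 1 - 12 */
    uint8_t  Day;         /* 1 - 31 */
    uint8_t  Hour;        /* 0 - 23 */
    uint8_t  Minute;      /* 0 - 59 */
    uint8_t  Second;      /* 0 - 59 */
    uint8_t  Pad1;
    uint32_t Nanosecond;  /* 0 - 999,999,999 */
    int16_t  TimeZone;    /* offset from UTC in minutes, or unspecified */
    uint8_t  Daylight;    /* daylight-saving flags */
    uint8_t  Pad2;
} EFI_TIME;               /* 16 bytes */

/* The OS calls it through the runtime services table, roughly:
 *     EFI_TIME t;
 *     gRT->GetTime(&t, NULL);
 * and that pointer must keep working even after SetVirtualAddressMap
 * has rewritten the firmware's address space. */
```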
But what if this list could be extended? Instead of moving the BSP into TF-A, let’s make it all a UEFI runtime service, and figure out how to perform RT calls from AML.
That’s not a very good idea:
RT is fragile, as services share the same privilege level as the calling operating system. Differences in the way different operating systems call services are a constant source of bugs across vendor implementations, e.g. flat addressing or translated, interrupts enabled or disabled, UART enabled or disabled.
RT requires an environment – memory ranges must be correctly mapped, enough stack, disabled FP traps, etc. A single SMC instruction for trapping to TF-A is hard to beat.
RT is fragile, as runtime services are provided by the same drivers that provide boot time services in most UEFI implementations, like Tiano. There’s no meaningful isolation between an RT driver and its (and other) BS components. The firmware programming model is just bad. It is extremely easy to make a seemingly benign change (new global variable, logging statement, driver dependency) that will break RT support for some users, but be very difficult to track down.
Limited facilities with no support for asynchronous implementation, e.g. you can’t take an exception on behalf of a service while in the OS. This may make some hardware hard to expose efficiently, or mean that certain workarounds are impossible, e.g. a device quirk that relies on handling external aborts on a system with firmware-first error handling.
Simply revising the UEFI specification with new RT services won’t do anything for existing code bases. Retrofitting will be a significant effort. In contrast, TF-A is a simpler code base that is easier to swap out.
UEFI implementations have a poor track record of being open-sourced by firmware vendors. TF-A implementations are done by silicon providers themselves, and have a better track record of being open source and auditable.
Tying SMCCC and ACPI together
We need a generic escape hatch mechanism from ACPI to the operating system, to be able to easily perform arbitrary SMCCC calls from AML device methods.
Escape Hatch #1 – FFH
The Functional Fixed Hardware (FFH) OpRegion type seems like a good fit.
Unfortunately, the ACPI specification gives no examples of FFH usage for anything outside of Register resource descriptors, as part of processor Low Power Idle States (_LPI) support.
I have never seen any examples of OpRegions declared as type FFixedHW, but the ACPICA compiler (iasl) didn’t barf at this quick draft:
The problem with Hatch #1 is the large amount of changes required to operating system ACPI support. Additionally, the syntax is obtuse and doesn’t fit the semantics of a method invocation well, and creating OpRegions and buffers has additional overhead.
Escape Hatch #2 – an OS-provided SMCCC method
To borrow a page from the PEP book, the OS ACPI interpreter could provide a method to perform SMCCC calls. Unlike PEPs, this could be a well-defined interface. Its presence could be negotiated using the standard _OSC (OS Capabilities) mechanism.
Escape Hatch #3 – PCC
The Platform Communication Channel (PCC) is a generic mechanism for OSPM to communicate with an entity in the platform, such as a BMC or an SCP (System Control Processor). PCC relies on a shared memory region and a doorbell register. Starting with ACPI 6.0, PCC also became a supported OpRegion type. Assuming the doorbell can be wired to an SMC, PCC could be used to communicate with a Trusted Firmware-based SCP.
PCC is a higher level protocol than invoking random SMC calls from ACPI or OS directly. The commands/messages all go via structured shared memory. There would be only one SMCCC call used – for the doorbell itself.
To use PCC with SMC via FFH, the Arm FFH specification would need to be amended to cover the PCC use case.
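The layout of a PCC subspace’s shared memory is simple enough to sketch in C. The header fields below follow the Generic Communications Channel Shared Memory Region described in the ACPI specification; the payload area is command-specific:

```c
#include <stdint.h>
#include <stddef.h>

/* PCC Generic Communications Channel shared memory region (per the
 * ACPI specification): a small fixed header, then the payload. */
struct pcc_shmem {
    uint32_t signature;    /* 0x50434300 ("PCC\0") OR'd with subspace id */
    uint16_t command;      /* command code written by OSPM */
    uint16_t status;       /* bit 0: command complete,
                              bit 1: platform interrupt, bit 2: error */
    uint8_t  comm_space[]; /* command-specific payload */
};

/* OSPM fills in command + payload, clears the status field, then rings
 * the doorbell - here, a single SMC into the Trusted Firmware SCP - and
 * waits for the command-complete bit. */
```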
Escape Hatch #4 – SCMI + FFH = 💖?
We can do better than raw PCC. Arm already has an expansive adaptation of PCC – the System Control and Management Interface (SCMI), which covers exactly the use case we are after – device control and power management. SCMI is a higher-level protocol with a concrete command set.
SCMI is pretty advanced, and even supports asynchronous (delayed response) commands.
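SCMI messages carry a small packed header; here is a sketch of its encoding in C. The field positions are from the SCMI specification, and the clock-protocol IDs shown are the spec-defined values:

```c
#include <stdint.h>

/* SCMI message header encoding (per the SCMI specification):
 *   bits [7:0]   message ID
 *   bits [9:8]   message type (0 = command)
 *   bits [17:10] protocol ID
 *   bits [27:18] token (sequence number)
 */
static inline uint32_t scmi_hdr(uint32_t proto, uint32_t msg,
                                uint32_t type, uint32_t token) {
    return (msg & 0xFFu) | ((type & 0x3u) << 8) |
           ((proto & 0xFFu) << 10) | ((token & 0x3FFu) << 18);
}

#define SCMI_PROTO_CLOCK     0x14u  /* clock management protocol */
#define SCMI_CLOCK_RATE_SET  0x05u  /* CLOCK_RATE_SET command */

/* A CLOCK_RATE_SET command header with token 0 encodes to 0x5005; the
 * header and payload are placed in the shared memory area before the
 * doorbell is rung. */
```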
If the doorbell is SMC or HVC based, it should follow the SMC Calling Convention [SMCCC]. The doorbell needs to provide the identifier of the Shared Memory area that contains the payload. The Shared Memory area containing the payload is updated with the SCMI return response when the call returns. The identifier of the Shared Memory area should be 32-bits and each identifier should map to a distinct Shared Memory area.
While SCMI supports SMC as a doorbell according to the spec, the details are unfortunately left out. Presumably SMC must be exposed via FFH, yet the Arm FFH specification doesn’t currently cover this scenario.
SCMI could be a reasonable interface for a Trusted Firmware-based SCP. SCMI supports a lot of the interfaces one would otherwise implement in an arbitrary fashion, such as sensors, clocks and power and reset control. And not just from AML – as a well defined specification, SCMI could be a generic (fallback?) implementation for OS drivers (GPIO, PCIe, I2C, etc).
There are a few areas for improvement:
Add commands for pin and GPIO control.
Add commands to abstract embedded controller I/O (I2C, SPI, SMBUS).
Add commands to abstract PCIe CAM for systems without standards-compliant ECAM.
Add a purely SMC-based FastChannel transport, for areas where asynchronous support is irrelevant and where latency is key, like PCIe CAM, pin control or GPIO.
We looked at a number of schemes, and the most baked-through appears to be exposing a Trusted Firmware-based SCP via SCMI and an SMC doorbell, although the Arm FFH support for SMC-based SCMI doorbells still needs to be figured out, and a few crucial categories of interfaces are missing, such as pin control and GPIO.
Of course, don’t forget that the Raspberry Pi 4, for example, has its GPU/VPU acting as the System Control Processor with its own SCMI-like mailbox. Could that be replaced with SCMI, avoiding a TF-A based SCMI interface entirely?
Finally, firmware-based SCPs aren’t just for Edge or Client devices. Even on servers, systems today can rely on GPIO-signaled Events instead of Interrupt-signaled Events. GPIO-signaled system events require a vendor-specific GPIO driver. Thus for servers, SCMI could mean never having to worry about vendor-specific GPIO drivers.