Why AI PCs Are Not for Developers


There is no compelling reason for developers to compile local AI models that take advantage of new AI processors installed on up-and-coming AI PCs.

There are multiple issues: the hardware isn’t capable, the models aren’t available, and the development tools are a headache to deploy.

I spent months testing so-called AI PCs running Windows 11 with specialized AI processors, in the hope of running a local LLM with the internet connection turned off. The laptops included chips from Intel and Qualcomm with neural processing units (NPUs), which are designed to accelerate AI workloads.

Microsoft has hyped its AI PCs as supporting lightweight AI models, or small language models (SLMs), such as Meta's Llama 2 and Microsoft's Phi Silica.

My attempts to load those models onto the PCs were a frustrating experience, with bumps at every step. Finding lightweight models compatible with the neural processors in the Qualcomm and Intel chips was the first challenge. Setting up the Jupyter notebooks and neural network runtimes needed to run those SLMs was another.

When I did get the models running, the SLMs were not using the specialized AI processors; they were instead falling back to the GPU or CPU.

The Hype

Microsoft announced Copilot+ PCs at its Build conference this year. The first Copilot+ PCs shipped with hardware that lets customers run inference on the device, saving a trip to the cloud.

The Copilot+ PCs had some minimum requirements, including NPU performance of at least 40 TOPS (trillion operations per second). The first AI PCs with Qualcomm's Snapdragon X Elite chips, whose NPUs are rated at 45 TOPS, met that requirement.

Microsoft CEO Satya Nadella said the company had 40-plus models available out of the box to run locally on Copilot+ PCs. One was Phi Silica, a 3.8-billion-parameter SLM.

DirectML and the ONNX Runtime allow users to run Phi-3 models on Windows devices, but Qualcomm support wasn't ready when the devices came out. Qualcomm provides its own list of AI models it supports on its Snapdragon X Elite chips, via an AI Development Hub.
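For what it's worth, checking whether a model is actually bound to DirectML, rather than silently falling back to the CPU, takes only a few lines of Python. A minimal sketch, assuming the onnxruntime-directml package is installed; the model file name is hypothetical:

```
import onnxruntime as ort

# Which execution providers does this build of ONNX Runtime offer?
# On a working DirectML setup, the list includes 'DmlExecutionProvider'.
print(ort.get_available_providers())

session = ort.InferenceSession(
    "phi-3-mini.onnx",          # hypothetical path to an exported ONNX model
    providers=[
        "DmlExecutionProvider",  # DirectML (GPU) if available
        "CPUExecutionProvider",  # CPU fallback
    ],
)
# Shows which providers were actually bound to the session.
print(session.get_providers())
```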

An early attempt to load Llama v2 wasn’t smooth and didn’t work for me. I sought Qualcomm’s help to load models, but there was no clear outcome.

Creating a Jupyter notebook using the tools Qualcomm recommends was confusing, and I couldn't load any AI model manually. Qualcomm's suggestion to download the ONNX Runtime to exploit the NPU only added to the confusion.
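For reference, the NPU path Qualcomm points to runs through ONNX Runtime's QNN execution provider. Here's a minimal sketch of what that is supposed to look like, assuming an ARM64 ONNX Runtime build with QNN support; the model file is hypothetical, and the backend option follows ONNX Runtime's QNN documentation:

```
import onnxruntime as ort

# Ask ONNX Runtime to target the Hexagon NPU through the QNN HTP backend.
# The model file is hypothetical; QNN generally expects a quantized model.
session = ort.InferenceSession(
    "llama-v2-quantized.onnx",
    providers=["QNNExecutionProvider"],
    provider_options=[{"backend_path": "QnnHtp.dll"}],  # HTP backend = the NPU
)
print(session.get_providers())  # confirms whether QNN was actually selected
```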

More recently, LM Studio has been providing a version of its AI software for Qualcomm chips.

I loaded the 8-billion-parameter Llama v3.1 model using LM Studio, but it only used the Snapdragon CPU, not the GPU or NPU. It dished out 17.34 tokens per second, but chewed up 87% of memory after just a few queries.
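LM Studio does at least make scripted testing easy: it can expose the loaded model through a local OpenAI-compatible server, on port 1234 by default. A minimal sketch; the model identifier is an assumption and depends on what your LM Studio instance reports:

```
import requests

# Query a model loaded in LM Studio through its local OpenAI-compatible
# server (port 1234 is LM Studio's default).
resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "llama-3.1-8b-instruct",  # hypothetical identifier
        "messages": [{"role": "user", "content": "Explain TOPS in one sentence."}],
        "max_tokens": 64,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```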

Meaningful models aren't yet ready to take advantage of Qualcomm's NPUs, which, like GPUs, are designed to speed up AI. And even if the NPUs worked, the Copilot+ PCs don't have the memory to run long queries, which would also drain the battery quickly.

Microsoft is providing the tooling for developers to integrate AI capabilities into desktop applications. It has little reason to help anyone load Llama v3.1 manually, since Copilot features already ship on these PCs.

Microsoft's Phi Silica support is aimed more at developers who want to bring large-language-model-style querying to Windows applications via the Windows App SDK.

Meteor Lake Failures

Intel got into the AI PC game late last year with a chip called Meteor Lake, which had a neural processing unit.

The chip is now a paperweight, and those who bought laptops with it for on-PC AI have been abandoned. There are no useful applications; the NPU was only ever utilized for basic AI models like TinyLlama.

To be sure, Intel's Meteor Lake chips don't qualify under Microsoft's minimum specs for an AI PC. Intel claimed 34 TOPS of AI performance for Meteor Lake, below the 40 TOPS Microsoft requires for Copilot+ PCs.

Meteor Lake opened to poor reviews. It was slower than the previous-generation laptop chip and offered no improvement in battery life.

About six months after releasing Meteor Lake, Intel shipped its next-generation AI PC chip, Lunar Lake, which has since made it into PCs and offers 120 TOPS of AI performance.

I tried manually running AI models locally on Meteor Lake PCs.

Loading a neural network to exploit the NPU involved installing OpenVINO 2024.2 and following the instructions on OpenVINO’s website.

The install provides the NPU plugin, which you'd expect to kick in when loading a model in a Jupyter notebook. Intel said I'd also need the right NPU driver and firmware.
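In principle, once the plugin and driver are in place, pointing OpenVINO at the NPU takes only a few lines of Python. A minimal sketch against the 2024.x API; the model file is hypothetical:

```
import openvino as ov

core = ov.Core()
# On a working setup, 'NPU' appears alongside 'CPU' and 'GPU'.
print(core.available_devices)

# Compile a model in OpenVINO IR format for the NPU; the file name is hypothetical.
model = core.read_model("tinyllama.xml")
compiled = core.compile_model(model, device_name="NPU")
```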

Installing the new NPU driver itself was a challenge: I had to uninstall the old driver in Windows' Device Manager and then scan for the new one. In the end, I just updated the driver using Device Manager's driver search.

I ran models like TinyLlama from the Jupyter notebook; they ran just fine but gave poor answers. And as on the Qualcomm machines, they didn't utilize the NPU.

A handful of models, like Stable Diffusion 1.4, do utilize the NPU, but only from directly within the GIMP interface.

Intel’s AI software development is largely focused on its server CPUs.

Back to Nvidia

Developers should stick with Nvidia GPUs to run Jupyter notebooks for any meaningful AI work on their PCs.

Buy AI PCs for productivity, but not for AI-related coding or experimentation. The chip makers' NPUs aren't friendly to developers. The problems start with initializing neural networks, and every chip maker has its own software stack. Still, on-device AI is an emerging field with plenty of opportunities for developers to experiment, such as optimizing models for PCs through quantization, as in the sketch below.
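Quantization shrinks a model's weights from 32-bit floats to 8-bit integers so it has a better chance of fitting in an AI PC's limited memory. A minimal sketch using ONNX Runtime's dynamic quantization; the file names are hypothetical:

```
from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamic quantization converts fp32 weights to int8, roughly quartering
# the model's on-disk and in-memory footprint. File names are hypothetical.
quantize_dynamic(
    model_input="llama-3.1-8b.onnx",
    model_output="llama-3.1-8b-int8.onnx",
    weight_type=QuantType.QInt8,
)
```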

For adventurous coders, the typical Windows challenges come into play: ensuring you have the right drivers and development toolkits. Qualcomm and Intel each have their own preferred tools for compiling and loading models.

Thankfully, the Windows command line and PowerShell make those command-line adventures fun.

Expect AI features that take advantage of NPUs to come pre-packaged in applications. Intel is working with software companies to take advantage of its NPUs, much as software has always been made compatible with specific chip architectures.

AI hardware is evolving quickly, and Intel is hyping up its latest Lunar Lake chips. Recent reviews have praised the chip for its excellent battery life. But don't buy it for development purposes: it doesn't have enough memory or bandwidth to run language models locally.
