Web LLM brings browser-based revolution to AI

A team of developers is working to bring language model chats directly to web browsers, operating entirely in-browser with WebGPU acceleration. The project acknowledges advancements in generative AI and language model development, thanks to open-source contributions like LLaMA, Alpaca, Vicuna, and Dolly. The goal is to create open-source language models and personal AI assistants integrated into web browsers, harnessing client-side computing power.
Web LLM is a groundbreaking innovation in AI and web development, enabling fine-tuned models to run natively within browser tabs without server-side inference. Because processing happens locally, prompts and responses stay on the user's device. This addresses privacy and security concerns, gives users control over personal information, and reduces the risk of data leaks, which matters especially for Chrome extensions and web applications that handle user input.
The team has opened a demo site for Web LLM, but at the time of writing the demo is down.

Overcoming challenges and optimizations
Two challenges stand out: the models have to run in a client-side environment that lacks the usual GPU-accelerated Python frameworks, and memory usage and weight compression must be optimized so that large language models fit on the user's device. The project aims to develop a workflow for efficient language model development and optimization using a Python-first approach and universal deployment.
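To see why weight compression matters for a browser tab, a rough back-of-the-envelope estimate helps. The sketch below assumes a hypothetical 7-billion-parameter model (the article does not state a parameter count) and counts only the raw weights, ignoring activations, caches, and quantization metadata such as scales.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Raw storage needed for the weights alone, in bytes."""
    return num_params * bits_per_weight // 8


# Hypothetical parameter count, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

for label, bits in [("fp32", 32), ("fp16", 16), ("int4", 4)]:
    gigabytes = weight_bytes(NUM_PARAMS, bits) / 1e9
    print(f"{label}: {gigabytes:.1f} GB")
```

Under these assumptions the weights shrink from 14.0 GB at fp16 to 3.5 GB at int4, which is the difference between not fitting and fitting on many consumer GPUs.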
The project employs machine learning compilation (MLC) with Apache TVM Unity, utilizing native dynamic shape support to optimize the language model's IRModule without padding. TensorIR programs are transformed and optimized for deployment across various environments, including JavaScript for web deployment.
The project compresses model weights with int4 quantization and uses static memory planning to reuse memory across multiple layers. A WebAssembly (wasm) port of the SentencePiece tokenizer handles tokenization. All optimizations are done in Python; only the application that connects the components is written in JavaScript.
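The general idea behind int4 weight compression can be sketched in a few lines. This is a minimal illustration, not the project's actual scheme: it assumes a single scale and offset for the whole list (real systems typically quantize in small groups), and the function names are made up for this example.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[bytes, float, float]:
    """Map floats to 4-bit codes (0..15) and pack two codes per byte."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # fall back to 1.0 if all weights are equal
    codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    if len(codes) % 2:
        codes.append(0)  # add a dummy code so the codes pair up evenly
    packed = bytes(
        codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2)
    )
    return packed, scale, lo


def dequantize_int4(packed: bytes, scale: float, lo: float, count: int) -> List[float]:
    """Unpack the 4-bit codes and map them back to approximate floats."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble holds the first code
        codes.append(byte >> 4)    # high nibble holds the second code
    return [lo + c * scale for c in codes[:count]]


weights = [-0.8, -0.1, 0.0, 0.3, 0.75]
packed, scale, lo = quantize_int4(weights)
restored = dequantize_int4(packed, scale, lo, len(weights))
print(len(packed), "bytes for", len(weights), "weights")
print([round(x, 3) for x in restored])
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy traded for storing each weight in four bits.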

The role of open-source ecosystems
The open-source ecosystem, specifically TVM Unity, facilitates a Python-centric development experience for optimizing and deploying language models on the web. TVM Unity's dynamic shape support addresses the dynamic nature of language models without padding, and tensor expressions enable partial-tensor computations.
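To illustrate what avoiding padding saves, the sketch below compares the number of token positions processed when every input is padded up to a fixed bucket size with the number processed when shapes are handled dynamically. The bucket size and sequence lengths are hypothetical, and the code only counts tokens; it does not use TVM.

```python
from typing import List


def padded_token_count(lengths: List[int], bucket: int) -> int:
    """Token positions processed when each sequence is padded up to a multiple of `bucket`."""
    return sum(-(-n // bucket) * bucket for n in lengths)  # ceiling division


def dynamic_token_count(lengths: List[int]) -> int:
    """Token positions processed when each sequence keeps its true length."""
    return sum(lengths)


lengths = [5, 17, 130]  # hypothetical prompt lengths in tokens
padded = padded_token_count(lengths, bucket=128)
actual = dynamic_token_count(lengths)
print(f"padded: {padded} positions, dynamic: {actual} positions")
```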
The project's source code is available on its GitHub page.
Comparing WebGPU and native GPU runtimes reveals performance limitations due to Chrome's WebGPU implementation. Workarounds like special flags can enhance execution speed, and forthcoming features such as fp16 extensions exhibit potential for substantial improvements.

How to run Web LLM in Chrome?
WebGPU recently debuted in Chrome and is currently in beta. The developers run their experiments in Chrome Canary, and you can also try the latest Chrome 113. Chrome versions ≤ 112 aren't supported; using one results in an error related to WebGPU device initialization and limits. According to the creators, the tests were performed on Windows and Mac and require a GPU with approximately 6.4GB of memory.
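The version requirement can be expressed as a small check. This is a sketch only: the helper names are invented for illustration, the version strings in the example are placeholders, and the check says nothing about whether the GPU has enough memory.

```python
import re

# From the article: Chrome 112 and below are not supported.
MIN_CHROME_MAJOR = 113


def chrome_major(version: str) -> int:
    """Extract the major version from a string such as '113.0.0.0'."""
    match = re.match(r"\s*(\d+)\.", version)
    if match is None:
        raise ValueError(f"unrecognized Chrome version: {version!r}")
    return int(match.group(1))


def supports_webgpu_demo(version: str) -> bool:
    """True if the Chrome major version meets the stated minimum."""
    return chrome_major(version) >= MIN_CHROME_MAJOR


for candidate in ["112.0.0.0", "113.0.0.0"]:  # placeholder version strings
    print(candidate, "->", supports_webgpu_demo(candidate))
```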
For Mac users, follow these instructions to run the chatbot demo locally in your browser:
- Install Chrome Canary, a developer version of Chrome that enables WebGPU usage
- Launch Chrome Canary. We recommend launching it from the terminal with this command (or replace Chrome Canary with Chrome):
  /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --enable-dawn-features=disable_robustness
  This command disables the robustness check in Chrome Canary that slows down generation. While not mandatory, we strongly advise using this command to start Chrome
- Enter your inputs and click "Send" to begin. The chatbot will first fetch model parameters into the local cache. The initial download may take a few minutes, but subsequent refreshes and runs will be faster.
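For those who prefer to script the launch step, the terminal command above could be wrapped roughly as follows. The application path and flag are taken from the command in the article; the function names and the optional `url` argument are assumptions made for this sketch, and the path only applies to a default macOS install of Chrome Canary.

```python
import subprocess
from typing import List

# Path and flag as given in the article's terminal command (macOS).
CANARY_PATH = "/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
DAWN_FLAG = "--enable-dawn-features=disable_robustness"


def build_launch_command(url: str = "") -> List[str]:
    """Build the argument list; no shell escaping is needed in list form."""
    command = [CANARY_PATH, DAWN_FLAG]
    if url:
        command.append(url)  # optional page to open on start (hypothetical parameter)
    return command


def launch(url: str = "") -> subprocess.Popen:
    """Start Chrome Canary with the robustness check disabled."""
    return subprocess.Popen(build_launch_command(url))


print(build_launch_command())
```

Calling `launch()` starts the browser; the script itself only prints the command so it can be inspected first.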
Missing from the “story”: Ukraine’s agreement to never use Starlink for military purposes. This is why.
Ghacks quality is AI driven and very poor these days since AI is really artificial stupidity.
“Elon Musk biographer Walter Isaacson forced to ‘clarify’ book’s account of Starlink incident in Ukraine War
“To clarify on the Starlink issue: the Ukrainians THOUGHT coverage was enabled all the way to Crimea, but it was not. They asked Musk to enable it for their drone sub attack on the Russian fleet. Musk did not enable it, because he thought, probably correctly, that would cause a major war.”
https://nypost.com/2023/09/11/elon-musk-biographer-walter-isaacson-corrects-detail-about-starlink-in-ukraine/
I posted above comment to:
https://www.ghacks.net/2023/09/08/elon-musk-turned-off-starlink-during-ukranian-offence/
Not to the following article about Geforce where I currently also can see it published:
https://www.ghacks.net/2023/08/29/how-to-fix-geforce-experience-error-code-0x0003/
Well, using Brave, I can see Llama 2 being decent, but it is still not great?
All these AI stuff seems more like a ‘toy’ than anything special, I mean, it is good for some stuff like translations or asking quick questions but not for asking anything important.
The problem is Brave made it mostly for summarizing websites and all that, but all these Big tech controlled stuff, won’t summarize articles it doesn’t agree with, so it is also useless in many situations where you just want it to give you a quick summarization, and then it starts throwing you little ‘speeches’ about how it doesn’t agree with it and then it never summarizes anything, but give you all the 30 paragraphs reasons why the article is wrong, like if I am asking it what it thinks.
SO all this AI is mostly a toy, but Facebook with all the power they have will be able to get so much data from people, it can ‘train’ or better say, write algorithms that will get better with time.
But It is not intelligence, it is really not intelligence all these AI technology.
Article Title: Tech leaders meet to discuss regulation of AI
Article URL: [https://www.ghacks.net/2023/09/14/artificial-intelligence-regulation-tech-leaders/]
—
The eternal problematic of regulating, here applied to AI. Should regulations (interventionism) have interfered in the course of mankind ever since Adam and Eve where would we be now? Should spirituality, morality, ethics never have interfered where would we be now? I truly have always believed that the only possible consensus between ethics and freedom is that of individuals’ own consciousness.
Off-topic : Musk’s beard looks like a wound, AI-Human hand-shake is a quite nice pic :)
Haha, oh dear, Tom.
I thought that the comments system issue where comments show up under a totally different article was fixed. But seeing your comment here, the “error” is clearly still active. Hopefully it is sorted as soon as possible.
Article Title: Tech leaders meet to discuss regulation of AI
Article URL: [https://www.ghacks.net/2023/09/14/artificial-intelligence-regulation-tech-leaders/]
—
Hi Karl :) Well, let’s remain positive and see the good sides : one’s comment appearing within different articles (the one it was written from and for, another unrelated one) brings ubiquity to that comment : say it once and it’s published twice, double your pleasure and double your fun (“with double-mint, double-mint gum”, an old ad!). Let’s forget the complications and inherited misunderstandings it leads to. Not sure the fun is worth the complications though. Which is why, with a few others here, I include Article Title & URL with comment, to ease a bit the pain.
This said, I’m trying to find a logic key which would explain the mic-mac. One thing is sure : comments appearing twice keep the same comment number.
For instance my comment to which you replied just above is originally :
[https://www.ghacks.net/2023/09/14/artificial-intelligence-regulation-tech-leaders/#comment-4573676]
It then got duplicated to :
[https://www.ghacks.net/2023/08/29/how-to-fix-geforce-experience-error-code-0x0003/#comment-4573676]
Same comment number, which lets me imagine comments are defined by their number as before but now dissociated in a way from their full path : that’s where something is broken, as I see it.
First amused me, then bothered, annoyed (I took some holidays to lower the pressure), then triggered curiosity.
I’m putting our best detectives on the affair, stay tuned.
Hehe, yes indeed, staying positive is what we should do. Good comes for those who wait, as the old saying goes. Hopefully true for this as well.
Interesting that the comments number stays the same, I noted that one thing is added to the duplicated comment in the URL, an error code, the following: “error-code-0x0003”.
Not useful for us, but hopefully for the developers (if there are any?), that perhaps will be able to sort this comments error out. Or our detectives, I hope they work hard on this as we speak ;).
Cheers and have a great weekend!
Whoops, my bad. I just now realized that the error I saw in your example URL (error-code-0x0003) was part of the linked article title and generated by Geforce! Oh dear! Why did I try to make it more confusing than it already is lol!
Original comment:
https://www.ghacks.net/2023/09/08/elon-musk-turned-off-starlink-during-ukranian-offence/#comment-4573788
Duplicate:
https://www.ghacks.net/2023/09/14/iphone-12-radiation-levels-are-too-high/#comment-4573788
Article Title: Tech leaders meet to discuss regulation of AI
Article URL: [https://www.ghacks.net/2023/09/14/artificial-intelligence-regulation-tech-leaders/]
—
@Karl, you write,
“I noted that one thing is added to the duplicated comment in the URL, an error code, the following: “error-code-0x0003”.”
I haven’t noticed that up to now but indeed brings an element to those who are actually trying to resolve the issue.
I do hope that Softonic engineers are working on fixing this issue, which may be more complicated than we can imagine. Anything to do with databases can become a nightmare, especially when the database remains accessed while being repaired, so to say.
P.S. My comment about remaining positive was, in this context, sarcastic. Your literal interpretation could mean you are, factually, more inclined to positiveness than I am myself : maybe a lesson of life for me :)
Have a nice, happy, sunny weekend as well :)
Correct: AI is certainly overhyped, it’s also advertised by some shady individuals. It’s can also be misused to write poor quality articles or fake your homework.
https://wordpress.com/support/post-vs-page/
https://wordpress.com/support/restore/
16 September 2023, this website is still experiencing issues with posts erroneously appearing in the wrong threads. There are even duplicates of the exact same post ID within the same page in some places.
Clerical error “[It] can also be misused …” you just can’t get the staff nowadays.
Obviously [#comment-4573795] was originally posted within [/2023/09/14/artificial-intelligence-regulation-tech-leaders/]. However, it has appeared misplaced within several threads.
Including the following:
[/2023/09/15/redmi-note-13-specs-release-date-and-more/]
[/2023/08/29/how-to-fix-geforce-experience-error-code-0x0003]
“How much radiation is dangerous?
Ionizing radiation, such as X-rays and gamma rays, is more energetic and potentially harmful. Exposure to doses greater than 1,000 millisieverts (mSv) in a short period can increase the risk of immediate health effects.
Above about 100 mSv, the risk of long-term health effects, such as cancer, increases with the dose.”
This ban is about NON-ionizing radiation limits, because there is too much radio wave power from the iphone. This has nothing to do with the much more dangerous ionizing radiations like X-rays, that are obviously not emitted at all by mobile phones. I invite you to correct your article.
“Aaro.mil makes history as the first official UFO website”
I wonder if it’s just smelly crowdsourcing for the spotting of chinese balloons or whatever paranoia they’re trying to instigate, or if they are also intentionally trying to look stupid enough to look for alien spaceships, for whatever reason. Maybe trying to look cute, instead of among the worst butchers of history ?
“The tech titan’s defense”
“Whether he provides a clear explanation or justifies his actions”
“the moral compass”
You take it for granted that this company should agree being a military communications provider on a war zone, and so directly so that his network would be used to control armed drones charged with explosives rushing to their targets.
You don’t need to repeat here everything you read in the mainstream press without thinking twice about it. You’re not just pointing interestingly that his company is more involved in the war that one may think at first and that this power is worrying, you’re also declaring your own support for a side in an imperialist killfest, blaming him for not participating enough in the bloodshed.
Now your article is unclear on how this company could be aware that its network is used for such military actions at a given time, which has implications of its own.
Reading other sources on that quickly, it seems that the company was: explicitly asked ; to extend its network geographically ; for a military attack ; at a time when there was no war but with the purpose of triggering it, if I understood well. You have to be joking if you’re crying about that not happening at that time. But today you have your war, be happy.
comments and article dont match
The writers were on strike?