Compress large language models for better performance with SparseGPT

Shaun
Jan 5, 2023
Apps, Software

On Twitter recently, a user named Jay Hack (@mathemagic1an) posted an excerpt of an abstract that deals with pruning massive language models to achieve better performance. The abstract comes from a paper by Elias Frantar and Dan Alistarh, researchers at the Institute of Science and Technology Austria (ISTA). In it, the researchers show how language models in the GPT family can be ‘pruned to 50% sparsity without any retraining.’

This is a significant breakthrough for the usability of large language models, particularly those in the GPT family. GPT stands for Generative Pre-trained Transformer, and you may be familiar with the most popular of these systems in the current artificial intelligence landscape: ChatGPT by OpenAI. As outlined in the abstract, the breakthrough was achieved with a new pruning method called SparseGPT. The method was designed specifically to work on truly large language models such as OPT-175B and BLOOM-176B, both of which are open source and among the largest openly available models of their kind.

Related: What are lawmakers and regulators doing about AI?

The wording of the abstract is difficult to follow if you don’t have a background in this particular field, so I’ll try to break it down into simpler terms. The language models in question are built on neural networks. Early researchers and pioneers chose that name because the end goal was to simulate how the human brain, with all of its billions of neurons, operates. A simple neural network always has an input, an output, and a lot of mathematical machinery in between, arranged in layers of data-handling systems.

Each ‘neuron’ holds a value, and neurons connect to other neurons in the system to form a network that computes output values. Within these hidden layers, parameters transform the input data into the eventual output. These parameters are like dials you can adjust to change how the network behaves, and we call them weights.
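To make that concrete, here is a minimal sketch of a tiny two-layer network in Python with NumPy. The shapes and random values are made up purely for illustration; the point is that the weight matrices W1 and W2 are the ‘dials’ that determine the output:

import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=3)            # input: three values fed into the network
W1 = rng.normal(size=(4, 3))      # weights connecting the input to a hidden layer
W2 = rng.normal(size=(2, 4))      # weights connecting the hidden layer to the output

hidden = np.maximum(0, W1 @ x)    # each hidden 'neuron' is a weighted sum, passed through ReLU
output = W2 @ hidden              # the output values are weighted sums of the hidden neurons

print(output)                     # determined entirely by the weights

Change any entry of W1 or W2 and the output changes; training a model is the process of tuning those entries, and in a model like OPT-175B there are billions of them.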

We can now accurately prune large language models down to size

Now, with these massive language models, you need to consider what kind of information the systems have to process. They need to recognize the patterns that make up what we inherently recognize as letters, numbers, words, phrases and abstract ideas. All of this computation happens through weights, and in large language models the weights number in the hundreds of billions; OPT-175B, for instance, has roughly 175 billion parameters.

Basically, this new method, SparseGPT, shows how, in some cases, more than 100 billion of these individual weights can be ignored while still producing largely accurate results. In even more basic terms, SparseGPT minimizes the number of computations that need to take place in those hidden layers of complex digital grey matter without meaningfully reducing the accuracy of the results we obtain.
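To show what ‘50% sparsity’ means in practice, here is a toy sketch of naive magnitude pruning, which simply zeroes out the half of the weights with the smallest absolute values. To be clear, this is not the SparseGPT algorithm itself, which selects and reconstructs weights far more carefully, layer by layer; it only illustrates the idea of dropping weights from a matrix:

import numpy as np

def prune_to_sparsity(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the given fraction of weights with the smallest magnitude.

    Naive magnitude pruning for illustration only; SparseGPT decides which
    weights to drop (and adjusts the survivors) much more cleverly.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0     # ties may zero a few extra entries
    return pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_sparse = prune_to_sparsity(W, 0.5)
print(f"nonzero before: {np.count_nonzero(W)}, after: {np.count_nonzero(W_sparse)}")

Every zeroed weight is a multiplication the hardware can skip, which is where the performance gain comes from; the hard part, and SparseGPT’s contribution, is choosing which weights to zero so the model’s answers barely change.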

Game-changing stuff, right? If you’re interested in artificial intelligence, machine learning, or neural networks, we have a host of other tantalizing news bits for you to read. And if this wasn’t really your cup of tea, or rather, what you like to read while drinking a cup of tea, we have plenty of other articles that may interest you. For instance, is it possible that Elon Musk’s Twitter verification crusade was all just clever marketing?
