You can now convert any LLM into a faster one without retraining from scratch.

NVIDIA just did this to their 30B model. Here's the trick:

1. Duplicate the model into two copies
2. Freeze one copy; it just reads the prompt and remembers context
3. Train the other copy to write chunks of text at once instead of one word at a time
4. Run them together

The frozen copy barely costs anything (it's already trained). The new copy only needed ~8% of the original training data to learn the new trick.

Result: 2.4x fa
103.4K views, 1 day ago
x.com, Lior Alexander
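
To see why writing chunks instead of single words can help, here is a rough sketch that only counts sequential generation steps. It is not NVIDIA's method or code: the function name `decode_steps`, the 256-token output length, and the chunk size of 4 are all assumptions made up for illustration, and step counts are not the same thing as wall-clock speed, so this does not reproduce the figure quoted in the post.

```python
import math


def decode_steps(num_tokens: int, chunk_size: int) -> int:
    """Count sequential generation steps needed to emit num_tokens
    when every step writes chunk_size tokens at once.

    Hypothetical helper for illustration only; it models step counts,
    not real latency.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # The last chunk may be partially filled, so round up.
    return math.ceil(num_tokens / chunk_size)


# Assumed example values, not taken from the post.
one_at_a_time = decode_steps(256, chunk_size=1)
chunked = decode_steps(256, chunk_size=4)

print(f"one token per step: {one_at_a_time} steps")
print(f"four tokens per step: {chunked} steps")
```

Fewer sequential steps is the intuition behind the trick. How much of that turns into real speedup depends on how expensive each chunked step is, which this sketch does not model.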