TheDinarian
FEDML Nexus AI Unlocks LLaMA-7B Pre-Training and Fine-tuning on Geo-distributed RTX4090s
👉 A Theta Network Partner
March 21, 2024

Since 2020, the machine learning (ML) community has seen large language model (LLM) sizes surge, from 175 billion to 10 trillion parameters in just three years. This rapid expansion has created significant bottlenecks for AI developers, notably limited GPU availability, escalating compute costs, and the long time needed to train and improve these models.

As a result, research on Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA has surged. These methods do not fine-tune all parameters of the pre-trained model; they update only a small set of task-specific parameters and freeze most of the remaining weights. They have been shown to maintain task performance while substantially reducing the parameter budget needed to fine-tune LLMs. However, they can’t be applied directly to full training (pre-training) of LLMs: training from scratch has been shown to require full-rank training at the beginning, until a low-rank subspace of the parameters forms and narrows the optimization landscape.
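To make the "small portion of the parameters" concrete, here is a minimal NumPy sketch of a LoRA-style adapter. The matrix sizes and rank are illustrative assumptions, not values from any real model, and the code only shows the frozen-weight-plus-low-rank-update idea rather than a full training setup.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 64, 48, 4  # illustrative sizes and rank (assumptions)

W = rng.standard_normal((m, n))         # frozen pre-trained weight
A = rng.standard_normal((r, n)) * 0.01  # trainable low-rank factor
B = np.zeros((m, r))                    # trainable, zero-initialized so the adapter starts as a no-op

def lora_forward(x):
    """Apply the frozen weight plus the low-rank update B @ A."""
    return x @ (W + B @ A).T

full = W.size                # parameters updated by full fine-tuning
trainable = A.size + B.size  # parameters updated by the adapter: r * (m + n)

x = rng.standard_normal((2, n))
print(lora_forward(x).shape)
print(f"trainable fraction: {trainable / full:.3f}")
```

Only A and B would receive gradient updates here; W stays fixed, which is why the trainable fraction is small.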

This barrier was broken two weeks ago when Zhao et al. introduced Gradient Low-Rank Projection (GaLore). Their key idea is to exploit the slowly changing low-rank structure of the weight matrix's gradient, rather than approximating the weight matrix itself as low rank, as LoRA does. Interestingly, this phenomenon had already been observed and used for gradient compression to reduce communication overhead in distributed training (e.g., GradiVeQ at NeurIPS’18).

According to the GaLore authors, 8-bit GaLore reduces optimizer-state memory by up to 82.5% and total training memory by 63.3% compared to a BF16 baseline, while maintaining training efficiency and model quality in pre-training and fine-tuning. GaLore’s central strategy is to keep in optimizer memory only a compact “core” of the gradient, expressed in a dynamically adjusted low-rank subspace. The low-rank subspace approximates the dominant trajectory of the gradient, which suits complex optimization tasks such as LLM pre-training and fine-tuning. To stay aligned with the full-rank outcome and prevent drift, the projection must be recomputed periodically. GaLore's low-rank subspaces act as a navigator, steering the descent toward the direction most likely to reach the optimal result.
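The idea can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: it projects with the top left singular vectors of the gradient, runs a plain Adam update in the projected space, and refreshes the projector every `refresh_every` steps. The function names, the toy quadratic objective, and all hyperparameter values are hypothetical, and the published algorithm may differ in details.

```python
import numpy as np

def top_left_singular_vectors(grad, rank):
    """Return an (m, rank) orthonormal basis for the gradient's dominant subspace."""
    U, _, _ = np.linalg.svd(grad, full_matrices=False)
    return U[:, :rank]

def galore_adam(W, grad_fn, rank=4, lr=0.01, steps=200, refresh_every=20,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam whose moment estimates live in a periodically refreshed low-rank subspace."""
    m, n = W.shape
    M = np.zeros((rank, n))  # first moment, stored at low rank
    V = np.zeros((rank, n))  # second moment, stored at low rank
    P = None
    for t in range(1, steps + 1):
        G = grad_fn(W)
        if (t - 1) % refresh_every == 0:
            P = top_left_singular_vectors(G, rank)  # periodic recalculation of the projection
        R = P.T @ G                                 # compact "core" of the gradient, shape (rank, n)
        M = beta1 * M + (1 - beta1) * R
        V = beta2 * V + (1 - beta2) * R ** 2
        M_hat = M / (1 - beta1 ** t)                # bias correction
        V_hat = V / (1 - beta2 ** t)
        N = M_hat / (np.sqrt(V_hat) + eps)
        W = W - lr * (P @ N)                        # project the update back to full size
    return W, M, V

# Toy objective (assumption): pull W toward a fixed random target matrix.
rng = np.random.default_rng(0)
target = rng.standard_normal((32, 24))
loss = lambda W: 0.5 * np.sum((W - target) ** 2)
grad = lambda W: W - target

W0 = np.zeros_like(target)
W1, M, V = galore_adam(W0, grad)
print(f"loss before: {loss(W0):.2f}, after: {loss(W1):.2f}")
print("optimizer state shape:", M.shape, "vs weight shape:", W1.shape)
```

The point to notice is that the moment matrices have shape (rank, n) rather than (m, n); that is where the optimizer-state savings in this sketch come from.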

By supporting the newly developed GaLore as a ready-to-launch job in FEDML Nexus AI, we have enabled the pre-training and fine-tuning of models like LLaMA 7B with a token batch size of 256 on a single RTX 4090, without additional memory optimization. 
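A rough back-of-the-envelope calculation shows why optimizer state is the pressure point on a 24GB card. The sketch below assumes 7 billion parameters, 2 bytes per value (BF16), and an Adam-style optimizer that keeps two values per parameter; it ignores gradients and activations. The per-matrix comparison uses hypothetical sizes (4096 x 4096, rank 128) and counts one m x r projector plus two r x n moment matrices for the low-rank case.

```python
def gigabytes(num_values, bytes_per_value=2):
    """Memory in GB (10^9 bytes) for num_values stored at bytes_per_value each."""
    return num_values * bytes_per_value / 1e9

params = 7e9                            # assumed parameter count for a 7B model
weights_gb = gigabytes(params)          # model weights in BF16
adam_states_gb = gigabytes(2 * params)  # two full-size Adam moments per parameter

def adam_state_count(m, n):
    """Values stored by Adam for one m x n weight matrix."""
    return 2 * m * n

def low_rank_state_count(m, n, r):
    """Values stored when moments are kept at rank r: projector (m x r) + two moments (r x n)."""
    return m * r + 2 * r * n

m, n, r = 4096, 4096, 128               # hypothetical layer size and rank
ratio = low_rank_state_count(m, n, r) / adam_state_count(m, n)

print(f"weights: {weights_gb:.1f} GB, full Adam states: {adam_states_gb:.1f} GB")
print(f"low-rank state is {ratio:.1%} of full Adam state for this matrix")
```

These figures are illustrative only; actual savings depend on the chosen rank, on which layers are projected, and on numeric precision.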

This implies that larger LLMs can be trained on pools of distributed GPUs that outnumber those available in data centers. But how can we achieve this? To that end, we further introduce FedLLM for federated learning on geo-distributed private data, and UnitedLLM for LLM pre-training and fine-tuning on decentralized community GPUs.

How to launch a GaLore-enabled training job in FEDML?

  1. Log into the FEDML Nexus AI Platform.

  2. Head to Launch Job Store > Train.

  3. Choose the “Memory-Efficient LLM Training with GaLore” job. Skim the Description tab, which shows basic usage of the code and references the original GaLore project's README. The Source Code and Configuration tab shows a more detailed layout and setup of the architecture.

  4. Hit the Launch button on the top right. You will be shown the default configuration for the job. You can either run it as is or customize the settings further. For example, instead of running the default settings on A100, you could switch to RTX 3090: under the Select Job section, click Add and set “resource_type” to “RTX-3090” in the computing section of the job YAML file.

  5. Once you've finalized the hyperparameters, hit the Create button, and you should be able to launch a full-scale GaLore + activation checkpointing pre-training run for the LLaMA 7B model with a batch size of 16.
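If you keep a local copy of the job YAML, the resource switch described above can also be scripted. The sketch below is a minimal illustration using only the Python standard library. The YAML fragment is hypothetical: only the `computing` section name, the `resource_type` key, and the “RTX-3090” value come from the steps above, and the `A100` default string is a placeholder for whatever the real job file contains.

```python
import re

# Hypothetical fragment of a job YAML file; a real file has more fields.
job_yaml = """\
computing:
  resource_type: A100
"""

def set_resource_type(yaml_text, new_type):
    """Replace the value of the first `resource_type:` key, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*resource_type:[ \t]*).*$", flags=re.MULTILINE)
    return pattern.sub(lambda mo: mo.group(1) + new_type, yaml_text, count=1)

updated_yaml = set_resource_type(job_yaml, "RTX-3090")
print(updated_yaml)
```

A text substitution is used here to avoid depending on a YAML library; for anything beyond a one-line change, a proper YAML parser would be the safer choice.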

Key run statistics showing the efficiency and scalability potential

To see your own run statistics with the experiment tracking in FEDML Nexus AI, go to Train > Run > Job Name and find the job you launched.

While training runs, check GPU memory utilization in the Hardware Monitoring tab on the job run page, and look for signs of reduced allocated GPU memory.

We ran GaLore + activation checkpointing pre-training experiments on the LLaMA 7B model with a batch size of 16 on two different single-GPU devices: an NVIDIA A100 GPU (80GB) and an NVIDIA GeForce RTX 3090 GPU (24GB).

The left figure shows training on the NVIDIA A100 GPU for 1100 steps over 12 hours, where the memory footprint stayed below 30% throughout training. By comparison, training on the NVIDIA GeForce RTX 3090 GPU for 200 steps over 10 hours showed 94% memory usage, which is expected given the limited memory available on a single enthusiast-class GPU.
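Converting these percentages into absolute numbers makes the two runs easier to compare. The short calculation below uses only the figures reported above; since the A100 footprint was reported as "less than 30%", its result is an upper bound. The dictionary layout and function name are illustrative.

```python
# Figures reported above; the A100 utilization is an upper bound ("less than 30%").
runs = {
    "A100": {"memory_gb": 80, "utilization": 0.30, "steps": 1100, "hours": 12},
    "RTX 3090": {"memory_gb": 24, "utilization": 0.94, "steps": 200, "hours": 10},
}

def summarize(run):
    """Return (GB of GPU memory in use, training steps per hour) for one run."""
    used_gb = run["memory_gb"] * run["utilization"]
    steps_per_hour = run["steps"] / run["hours"]
    return used_gb, steps_per_hour

for name, run in runs.items():
    used_gb, rate = summarize(run)
    print(f"{name}: about {used_gb:.2f} GB in use, {rate:.1f} steps/hour")
```

In absolute terms both runs occupy a similar amount of memory (under 24 GB on the A100, about 22.6 GB on the RTX 3090), while the A100 completes steps several times faster.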

Have fun training and fine-tuning models with GaLore on the FEDML Nexus AI platform.

 
