TheDinarian
News • Business • Investing & Finance
FEDML Nexus AI Unlocks LLaMA-7B Pre-Training and Fine-tuning on Geo-distributed RTX4090s
👉 A Theta Network Partner
March 21, 2024
post photo preview

Since 2020, the machine learning (ML) community has experienced an exponential surge in large language model (LLM) sizes, escalating from 175 billion to a remarkable 10 trillion parameters in just three years. This rapid expansion has led to significant bottlenecks for AI developers, notably in the availability of GPUs, escalating compute costs, and the extended duration needed for training and improving these models.

As such, there has been a surge of research innovations in Parameter Efficient Fine Tuning (PEFT) methods, such as LoRA, which do not require fine-tuning of all parameters of the pre-trained model but only update a small portion of the parameters (task-specific parameters) while freezing most of the remaining weights. These methods have been shown to maintain task performance while substantially reducing the parameter budget needed for fine-tuning LLMs. However, they can’t be directly applied for full training (or, pre-training) of LLMs, since it is shown that training from scratch requires full-rank model training at the beginning until a low-rank subspace of the parameters is formed to reduce the optimization landscape.

This barrier was broken two weeks ago when Zhao et. al. introduced Gradient Low-rank Projection (GaLore). Their key idea was to instead leverage the slow-changing low-rank structure of the gradient of the weight matrix, rather than trying to approximate the weight matrix itself as low rank in LoRA. Interestingly, this phenomenon was also discovered and leveraged before for gradient compression to reduce the communication overhead in distributed training (e.g., GradiVeq in NeurIPS’18).

Compared to LoRA, GaLore significantly reduces optimizer states and total memory footprint by up to 82.5% and 63.3%, respectively, while maintaining training efficiency and model quality in pre-training and fine-tuning. GaLore’s central strategy involves maintaining only a compact “core” of the gradient in a dynamically adjusted low-rank subspace in optimizer memory. This approach leverages low-rank subspaces to approximate the primary trajectory of the gradient, commonly employed for complex optimization tasks such as LLM pre-training and fine-tuning. To ensure the alignment with the optimal full-rank outcome and prevent deviation, periodic recalculation of the projection is necessary. GaLore's low-rank subspaces act as a navigator, steering the descents towards the most likely direction for achieving the optimal results. 

By supporting the newly developed GaLore as a ready-to-launch job in FEDML Nexus AI, we have enabled the pre-training and fine-tuning of models like LLaMA 7B with a token batch size of 256 on a single RTX 4090, without additional memory optimization. 

This implies that we can train larger LLMs on a larger number of distributed GPUs than in data centers. But how can we achieve this? We further introduce FedLLM for federated learning on geo-distributed private data, and UnitedLLM for LLM pre-training and fine-tuning on decentralized community GPUs.

How to launch a GaLore Enabled training job in FEDML?

  1. Choose “Memory-Efficient LLM Training with GaLore” Job. Do a quick scan of the Description tab, it shows some basic usage for the code, referencing the original GaLore project's README. In the Source Code and Configuration tab, you can examine a more detailed layout and setup of the architecture.

Head to Launch Job Store > Train

Log into FEDML Nexus AI Platform

  1. Hit the Launch button on the top right. You will be prompted to the default configuration for the job. You could either run it as it is or customize the settings further.

For example, instead of running the default settings on A100, you could switch to RTX 3090. Under the Select Job section, click Add and replace “resource_type” to “RTX-3090”  in the computing section of the job YAML file.

  1. Once you've finalized the hyperparameters, hit the Create button, and you should be able to launch a full-scale GaLore + Checkpointing Activation pre-training for the LLaMA 7B model with a batch size of 16.

Key run statistics showing the efficiency and scalability potential

To see your own run statistics in FEDML Nexus AI leveraging our advanced experiment tracking capabilities, go to Train Run Job Name to find the job you launched.

For the duration of the training, check out the GPU memory utilization in the Hardware Monitoring tab within the job run page. Look for the signs of reduced allocated GPU memory.

We ran experiments for GaLore + Checkpointing Activation pre-training on the LLaMA 7B model with a batch size of 16 against two different single-GPU devices: NVIDIA A100 GPU (80GB) and NVIDIA GeForce RTX 3090 GPU (24GB).

The left figure shows training on NVIDIA A100 GPU for 1100 steps over 12 hours. We saw less than 30% memory footprint throughout the course of the training. Compared to A100 GPU, training on the NVIDIA GeForce RTX 3090 GPU for 200 steps over 10 hours, we saw an expected 94% memory usage, which is reasonable given the stringent space provided on a single enthusiast-class GPU.

Have fun training and fine-tuning models with GaLore on FEDML Nexus AI platform.

 

Link

 

community logo
Join the TheDinarian Community
To read more articles like this, sign up and join my community today
0
What else you may like…
Videos
Podcasts
Posts
Articles
Ripple CEO on partnership with BNY to serve as custodian of stablecoin
00:01:12
Brad Garlinghouse In Washington 🚀

It’s time for a fair and open level playing field.

Under Gary Gensler it was quite the opposite.

  • Brad Garlinghouse
    July 9, 2025
00:01:56
More Of The Same...l

🚨 JUST IN: Patriot Tom Fitton, who has been fighting DOJ and FBI to release documents for years, has practically thrown in the towel.

👉 "The justice department and the FBI are irredeemably compromised and corrupted.
The leadership needs to understand that and act accordingly." ~Tom Fitton

00:01:30
👉 Coinbase just launched an AI agent for Crypto Trading

Custom AI assistants that print money in your sleep? 🔜

The future of Crypto x AI is about to go crazy.

👉 Here’s what you need to know:

💠 'Based Agent' enables creation of custom AI agents
💠 Users set up personalized agents in < 3 minutes
💠 Equipped w/ crypto wallet and on-chain functions
💠 Capable of completing trades, swaps, and staking
💠 Integrates with Coinbase’s SDK, OpenAI, & Replit

👉 What this means for the future of Crypto:

1. Open Access: Democratized access to advanced trading
2. Automated Txns: Complex trades + streamlined on-chain activity
3. AI Dominance: Est ~80% of crypto 👉txns done by AI agents by 2025

🚨 I personally wouldn't bet against Brian Armstrong and Jesse Pollak.

👉 Coinbase just launched an AI agent for Crypto Trading

🎁 As of July 8th there have been 84 VERI SmartMetal NFT Activations (1.3%). With shipments ramping up, we witness the corresponding jump in activations.

Need help getting started? Check out our knowledge base to get the info you need: https://veridao.freshdesk.com/support/solutions/articles/51000487052-what-are-the-nft-activation-steps

👉Interested in which NFTs have been activated? Check them out here:
https://basescan.org/token/0x4516a5d613c30a36d157d3b579813734cbb929a4

post photo preview

🚨BREAKING: The US House Committee on Financial Services says that next week the House will deliver on President Trump's call to make the US the "crypto capital of the world!

post photo preview
Brinc Launches Web3 Accelerator with Octopus, XDC & IDA

Brinc Launches Web3 Accelerator with Octopus, XDC & IDA to Transform Hong Kong’s Loyalty and Payment Systems.

Read more: https://www.brinc.io/blog/brinc-launches-octopus-backed-web3-accelerator-program-to-revolutionize-hong-kongs-retail-loyalty-and-payment-ecosystem-with-xdc-and-ida-as-key-web3-infrastructure-partners/

🔗 Startups can apply from July 10

📅 Launching Sept 8

Learn more about the Web3 Accelerator program and apply now: https://www.brinc.io/stablecoin-accelerator/

post photo preview
Musk Turns On Starlink to Save Iranians from Regime’s Internet Crackdown

Elon Musk, the world’s richest man and a visionary behind SpaceX, has flipped the switch on Starlink, delivering internet to Iranians amid a brutal regime crackdown.

This move comes on the heels of Israeli strikes targeting Iran’s nuclear facilities, as the Islamic Republic cuts off online access.

The former Department of Government Efficiency chief activated Starlink satellite internet service for Iranians on Saturday following the Islamic Republic's decision to impose nationwide internet restrictions.

As the Jerusalem Post reports, that the Islamic Republic’s Communications Ministry announced the move, stating, "In view of the special conditions of the country, temporary restrictions have been imposed on the country’s internet."

This action followed a series of Israeli attacks on Iranian targets.

Starlink, a SpaceX-developed satellite constellation, provides high-speed internet to regions with limited connectivity, such as remote areas or conflict zones.

Elizabeth MacDonald, a Fox News contributor, highlighted its impact, noting, "Elon Musk turning on Starlink for Iran in 2022 was a game changer. Starlink connects directly to SpaceX satellites, bypassing Iran’s ground infrastructure. That means even during government-imposed shutdowns or censorship, users can still get online, and reportedly more than 100,000 inside Iran are doing that."

During the 2022 "Woman, Life, Freedom" protests, Starlink enabled Iranians to communicate and share footage globally despite network blackouts," she added.

MacDonald also mentioned ongoing tests of "direct-to-cell" capabilities, which could allow smartphone connections without a dish, potentially expanding access and supporting free expression and protest coordination.

Musk confirmed the activation, noting on Saturday, "The beams are on."

This follows the regime’s internet shutdowns, which were triggered by Israeli military actions.

Adding to the tension, Israeli Prime Minister Benjamin Netanyahu addressed the Iranian people on Friday, urging resistance against the regime.

"Israel's fight is not against the Iranian people. Our fight is against the murderous Islamic regime that oppresses and impoverishes you,” he said.

Meanwhile, Reza Pahlavi, the exiled son of Iran’s last monarch, called on military and security forces to abandon the regime, accusing Supreme Leader Ayatollah Ali Khamenei in a Persian-language social media post of forcing Iranians into an unwanted war.

Starlink has been a beacon in other crises. Beyond Iran, Musk has leveraged Starlink to assist people during natural disasters and conflicts.

In the wake of hurricanes and earthquakes, Starlink has provided critical internet access to affected communities, enabling emergency communications and coordination.

Similarly, during the Ukraine-Russia conflict, Musk activated Starlink to support Ukrainian forces and civilians, ensuring they could maintain contact and access vital information under dire circumstances.

The genius entrepreneur, is throwing a lifeline to the oppressed in Iran, and the libs can’t stand it.

Conservative talk show host Mark Levin praised Musk’s action, reposting a message stating that Starlink would "reconnect the Iranian people with the internet and put the final nail in the coffin of the Iranian regime."

"God bless you, Elon. The Starlink beams are on in Iran!" Levin wrote.

Musk, who recently stepped down from leading the DOGE in the Trump administration, has apologized to President Trump for past criticisms, including his stance on the One Big Beautiful Bill.

Source

🙏 Donations Accepted 🙏

If you find value in my content, consider showing your support via:

💳 PayPal: 
1) Simply scan the QR code below 📲
2) or visit https://www.paypal.me/thedinarian

🔗 Crypto – Support via Coinbase Wallet to: [email protected]

Or Buy me a coffee: https://buymeacoffee.com/thedinarian

Your generosity keeps this mission alive, for all! Namasté 🙏 Crypto Michael ⚡  The Dinarian

Read full Article
post photo preview
GENIUS Act lets State banks conduct some business nationwide. Regulators object

The Senate passed the GENIUS Act for stablecoins last week, but significant work remains before it becomes law. The House has a different bill, the STABLE Act, with notable differences that must be reconciled. State banking regulators have raised strong objections to a provision in the GENIUS Act that would allow state banks to operate nationwide without authorization from host states or a federal regulator.

The controversial clause permits a state bank with a regulated stablecoin subsidiary to provide money transmitter and custodial services in any other state. While host states can impose consumer protection laws, they cannot require the usual authorization and oversight typically needed for out-of-state banking operations.

The Conference of State Bank Supervisors welcomed some changes in the GENIUS Act but remains adamantly opposed to this particular provision. In a statement, CSBS said:

“Critical changes must be made during House consideration of the legislation to prevent unintended consequences and further mitigate financial stability risks. CSBS remains concerned with the dramatic and unsupported expansion of the authority of uninsured banks to conduct money transmission or custody activities nationwide without the approval or oversight of host state supervisors (Sec. 16(d)).”

The National Conference of State Legislatures expressed similar concerns in early June, stating:

“We urge you to oppose Section 16(d) and support state authority to regulate financial services in a manner that reflects local conditions, priorities and risk tolerances. Preserving the dual banking system and respecting state autonomy is essential to the safety, soundness and diversity of our nation’s financial sector.”

Evolution of nationwide authorization

Section 16 addresses several issues beyond stablecoins, including preventing a recurrence of the SEC’s SAB 121, which forced crypto assets held in custody onto balance sheets. However, the nationwide authorization subsection was added after the legislation cleared the Senate Banking Committee, with two significant modifications since then.

Originally, the provision applied only to special bank charters like Wyoming’s Special Purpose Depository Institutions or Connecticut’s Innovation Banks. Examples include crypto-focused Custodia Bank and crypto exchange Kraken in Wyoming, plus traditional finance player Fnality US in Connecticut. Recently the scope was expanded to cover most state chartered banks with stablecoin subsidiaries, possibly due to concerns about competitive advantages.

Simultaneously, the clause was substantially tightened. The initial version allowed state chartered banks to provide money transmission and custody services nationwide for any type of asset, which would include cryptocurrencies. Now these activities can only be conducted by the stablecoin subsidiary, and while Section 16(d) doesn’t explicitly limit services to stablecoins, the GENIUS Act currently restricts issuers to stablecoin related activities.

However, the House STABLE Act takes a more permissive approach, allowing regulators to decide which non-stablecoin activities are permitted. If the House version prevails in reconciliation, it could result in a significant expansion of allowed nationwide banking activities beyond stablecoins.

Is it that bad?

As originally drafted, the clause seemed overly permissive.

The amended clause makes sense for stablecoin issuers. They want to have a single regulator and be able to provide the stablecoin services throughout the United States. But it also leans into the perception outside of crypto that this is just another form of regulatory arbitrage.

The controversy over Section 16(d) reflects concerns about creating a regulatory gap that allows banks to operate interstate without the oversight typically required from either federal or state authorities. As the two Congressional chambers work toward reconciliation, lawmakers must decide whether stablecoin legislation should include provisions that effectively reduce traditional banking oversight requirements.

Source

🙏 Donations Accepted 🙏

If you find value in my content, consider showing your support via:

💳 PayPal: 
1) Simply scan the QR code below 📲
2) or visit https://www.paypal.me/thedinarian

🔗 Crypto – Support via Coinbase Wallet to: [email protected]

Or Buy me a coffee: https://buymeacoffee.com/thedinarian

Your generosity keeps this mission alive, for all! Namasté 🙏 Crypto Michael ⚡  The Dinarian

Read full Article
post photo preview
Dubai regulator VARA classifies RWA issuance as licensed activity
Virtual Asset Regulatory Authority (VARA) leads global regulatory framework - makes RWA issuance licensed activity in Dubai.

Real-world assets (RWAs) issuance is now licensed activity in Dubai.

~ Actual law.
~ Not a legal gray zone.
~ Not a whitepaper fantasy.

RWA issuance and listing on secondary markets is defined under binding crypto regulation.

It’s execution by Dubai.

Irina Heaver explained:

“RWA issuance is no longer theoretical. It’s now a regulatory reality.”

VARA defined:

- RWAs are classified as Asset-Referenced Virtual Assets (ARVAs)

- Secondary market trading is permitted under VARA license

- Issuers need capital, audits, and legal disclosures

- Regulated broker-dealers and exchanges can now onboard and trade them

This closes the gap that killed STOs in 2018.

No more tokenization without venues.
No more assets without liquidity.

UAE is doing what Switzerland, Singapore, and Europe still haven’t:

Creating enforceable frameworks for RWA tokenization that actually work.

Matthew White, CEO of VARA, said it perfectly:

“Tokenization will redefine global finance in 2025.”

He’s not exaggerating.

$500B+ market predicted next year.

And the UAE just gave it legal rails.

~Real estate.
~Private credit.
~Shariah-compliant products.

Everything is in play.

This is how you turn hype into infrastructure.

What Dubai is doing now is 3 years ahead of everyone else.

Founders, investors, ecosystem builders:

You want to build real-world assets onchain.

Don’t waste another year waiting for clarity.

Come to Dubai.

It’s already here.

 

Source

🙏 Donations Accepted 🙏

If you find value in my content, consider showing your support via:

💳 PayPal: 
1) Simply scan the QR code below 📲
2) or visit https://www.paypal.me/thedinarian

🔗 Crypto – Support via Coinbase Wallet to: [email protected]

Or Buy me a coffee: https://buymeacoffee.com/thedinarian

Your generosity keeps this mission alive, for all! Namasté 🙏 Crypto Michael ⚡  The Dinarian

 

Read full Article
See More
Available on mobile and TV devices
google store google store app store app store
google store google store app tv store app tv store amazon store amazon store roku store roku store
Powered by Locals