TheDinarian
News • Business • Investing & Finance
FEDML Nexus AI Unlocks LLaMA-7B Pre-Training and Fine-tuning on Geo-distributed RTX4090s
👉 A Theta Network Partner
March 21, 2024

Since 2020, the machine learning (ML) community has experienced an exponential surge in large language model (LLM) sizes, escalating from 175 billion to a remarkable 10 trillion parameters in just three years. This rapid expansion has led to significant bottlenecks for AI developers, notably in the availability of GPUs, escalating compute costs, and the extended duration needed for training and improving these models.

As such, there has been a surge of research innovations in Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, which do not fine-tune all parameters of the pre-trained model but update only a small set of task-specific parameters while freezing most of the remaining weights. These methods have been shown to maintain task performance while substantially reducing the parameter budget needed for fine-tuning LLMs. However, they can’t be directly applied to full training (that is, pre-training) of LLMs, since it has been shown that training from scratch requires full-rank training at the beginning, until a low-rank subspace of the parameters forms and narrows the optimization landscape.
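To make the "small portion of the parameters" concrete, here is a minimal sketch of the parameter count for a single weight matrix under LoRA, where the frozen weight W (d_out x d_in) is adapted as W + BA with B of shape d_out x r and A of shape r x d_in. The 4096 x 4096 layer size and rank 8 are illustrative assumptions, not figures from this article.

```python
def full_finetune_params(d_out, d_in):
    """Parameters updated when the whole weight matrix is trained."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Parameters updated when only the LoRA factors B (d_out x r)
    and A (r x d_in) are trained and W itself stays frozen."""
    return rank * (d_out + d_in)


def trainable_fraction(d_out, d_in, rank):
    """Share of the layer's parameters that LoRA actually trains."""
    return lora_trainable_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # illustrative sizes (assumption)
    print("full fine-tuning:", full_finetune_params(d_out, d_in))
    print("LoRA trainable:  ", lora_trainable_params(d_out, d_in, rank))
    print("fraction:        ", trainable_fraction(d_out, d_in, rank))
```

For these assumed sizes, LoRA trains 65,536 parameters instead of 16,777,216, which is 1/256 of the layer.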

This barrier was broken two weeks ago when Zhao et al. introduced Gradient Low-Rank Projection (GaLore). Their key idea is to leverage the slowly changing low-rank structure of the gradient of the weight matrix, rather than approximating the weight matrix itself as low-rank, as LoRA does. Interestingly, this phenomenon had already been discovered and leveraged for gradient compression to reduce communication overhead in distributed training (e.g., GradiVeQ at NeurIPS’18).

According to the GaLore paper, 8-bit GaLore reduces optimizer-state memory by up to 82.5% and total training memory by up to 63.3% relative to a BF16 baseline, while maintaining training efficiency and model quality in pre-training and fine-tuning. GaLore’s central strategy is to keep only a compact “core” of the gradient, expressed in a dynamically adjusted low-rank subspace, in optimizer memory. The low-rank subspace approximates the primary trajectory of the gradient in complex optimization tasks such as LLM pre-training and fine-tuning. To stay aligned with the full-rank solution and prevent drift, the projection must be recomputed periodically. GaLore's low-rank subspaces act as a navigator, steering the descent toward the direction most likely to reach the optimum.
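The idea of keeping only a compact gradient "core" in optimizer memory can be sketched in a few lines of NumPy. This is a simplified illustration, not the official GaLore implementation: it always projects with the top left singular vectors, uses a plain Adam update, and the class name, default hyperparameters, and toy problem are all assumptions made for the example.

```python
import numpy as np


def top_r_left_subspace(grad, rank):
    """Top-`rank` left singular vectors of `grad`, shape (m, rank)."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]


class GaLoreAdamSketch:
    """Adam whose moment estimates live in a rank-r projection of the gradient."""

    def __init__(self, shape, rank, lr=0.01, betas=(0.9, 0.999), eps=1e-8,
                 update_gap=200):
        m, n = shape
        self.rank, self.lr, self.betas, self.eps = rank, lr, betas, eps
        self.update_gap = update_gap           # steps between projection refreshes
        self.proj = None                       # (m, r) projection matrix
        self.exp_avg = np.zeros((rank, n))     # first moment, compact (r, n)
        self.exp_avg_sq = np.zeros((rank, n))  # second moment, compact (r, n)
        self.steps = 0

    def step(self, weight, grad):
        # Periodically recompute the subspace from the current gradient.
        if self.steps % self.update_gap == 0:
            self.proj = top_r_left_subspace(grad, self.rank)
        self.steps += 1
        b1, b2 = self.betas
        compact = self.proj.T @ grad           # project down: (r, n)
        self.exp_avg = b1 * self.exp_avg + (1 - b1) * compact
        self.exp_avg_sq = b2 * self.exp_avg_sq + (1 - b2) * compact ** 2
        m_hat = self.exp_avg / (1 - b1 ** self.steps)
        v_hat = self.exp_avg_sq / (1 - b2 ** self.steps)
        update = m_hat / (np.sqrt(v_hat) + self.eps)
        return weight - self.lr * (self.proj @ update)  # project back: (m, n)


def optimizer_state_floats(m, n, rank=None):
    """Floats held by the optimizer: full Adam vs. the sketch above."""
    if rank is None:
        return 2 * m * n              # two full-size moments
    return m * rank + 2 * rank * n    # projection + two compact moments


def demo(steps=100):
    """Fit a toy least-squares target; returns (start loss, end loss, optimizer)."""
    rng = np.random.default_rng(0)
    target = rng.standard_normal((8, 6))
    weight = np.zeros((8, 6))
    opt = GaLoreAdamSketch(weight.shape, rank=2)

    def loss(w):
        return 0.5 * np.sum((w - target) ** 2)

    start = loss(weight)
    for _ in range(steps):
        weight = opt.step(weight, weight - target)  # gradient of the toy loss
    return start, loss(weight), opt
```

Calling `demo()` runs 100 steps with a single projection, since the assumed `update_gap` of 200 is never reached; lower it to exercise the periodic refresh. For an assumed 4096 x 4096 layer at rank 128, `optimizer_state_floats` gives 1,572,864 floats against 33,554,432 for full Adam, which is where the optimizer-memory savings come from.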

By supporting the newly developed GaLore as a ready-to-launch job in FEDML Nexus AI, we have enabled the pre-training and fine-tuning of models like LLaMA 7B with a token batch size of 256 on a single RTX 4090, without additional memory optimization. 

This implies that we can train larger LLMs on a larger pool of distributed GPUs than data centers alone offer. But how can we achieve this? To that end, we further introduce FedLLM for federated learning on geo-distributed private data, and UnitedLLM for LLM pre-training and fine-tuning on decentralized community GPUs.

How to launch a GaLore-enabled training job in FEDML?

  1. Log into the FEDML Nexus AI platform and head to Launch > Job Store > Train. Choose the “Memory-Efficient LLM Training with GaLore” job. Take a quick look at the Description tab, which shows basic usage of the code and references the original GaLore project's README. In the Source Code and Configuration tab, you can examine the layout and setup of the architecture in more detail.

  2. Hit the Launch button on the top right. You will be shown the default configuration for the job. You can either run it as is or customize the settings further.

For example, instead of running the default settings on an A100, you could switch to an RTX 3090. Under the Select Job section, click Add and change “resource_type” to “RTX-3090” in the computing section of the job YAML file.

  3. Once you've finalized the hyperparameters, hit the Create button, and you should be able to launch a full-scale GaLore + activation checkpointing pre-training run for the LLaMA 7B model with a batch size of 16.
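The resource switch described in the steps above amounts to changing one value in the job YAML. The sketch below shows that edit on a hypothetical YAML fragment: only the computing section, the `resource_type` key, and the `RTX-3090` value come from this article, while the `minimum_num_gpus` field, the `A100-80G` default, and the helper function are assumptions for illustration. On the platform itself you would make this change in the job editor.

```python
import re

# Hypothetical fragment of a job YAML file (field values are assumptions).
DEFAULT_JOB_YAML = """\
computing:
  minimum_num_gpus: 1
  resource_type: A100-80G
"""

# Matches the whole `resource_type:` line, keeping its indentation and key.
_RESOURCE_LINE = re.compile(r"^([ \t]*resource_type:[ \t]*).*$", flags=re.MULTILINE)


def set_resource_type(job_yaml, resource_type):
    """Return `job_yaml` with the value of its single `resource_type` key replaced."""
    updated, count = _RESOURCE_LINE.subn(
        lambda match: match.group(1) + resource_type, job_yaml
    )
    if count != 1:
        raise ValueError(f"expected exactly one resource_type entry, found {count}")
    return updated


if __name__ == "__main__":
    print(set_resource_type(DEFAULT_JOB_YAML, "RTX-3090"))
```

A text-level edit keeps the rest of the file, including comments and ordering, untouched; a YAML parser would be the safer choice for anything more than a single-key change.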

Key run statistics showing the efficiency and scalability potential

To see your own run statistics in FEDML Nexus AI, which provides advanced experiment tracking, go to Train > Run > Job Name and find the job you launched.

During training, check GPU memory utilization in the Hardware Monitoring tab on the job run page, and look for signs of reduced allocated GPU memory.

We ran GaLore + activation checkpointing pre-training experiments on the LLaMA 7B model with a batch size of 16 on two different single-GPU devices: an NVIDIA A100 GPU (80GB) and an NVIDIA GeForce RTX 3090 GPU (24GB).

The left figure shows training on the NVIDIA A100 GPU for 1100 steps over 12 hours, where memory utilization stayed below 30% throughout. By comparison, training on the NVIDIA GeForce RTX 3090 GPU ran 200 steps over 10 hours at an expected 94% memory usage, which is reasonable given the limited memory available on a single enthusiast-class GPU.
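As a quick sanity check on these numbers, the short script below converts the reported utilization percentages into gigabytes and the step counts into steps per hour. It uses only the figures quoted above; the A100 value is an upper bound because the text reports "less than 30%", and the dictionary layout and function name are just choices made for this example.

```python
# Figures quoted in the article; the A100 utilization is an upper bound.
RUNS = {
    "A100 (80GB)": {"memory_gb": 80, "utilization": 0.30, "steps": 1100, "hours": 12},
    "RTX 3090 (24GB)": {"memory_gb": 24, "utilization": 0.94, "steps": 200, "hours": 10},
}


def summarize(run):
    """Derive memory in use (GB) and training throughput (steps/hour)."""
    return {
        "used_gb": run["memory_gb"] * run["utilization"],
        "steps_per_hour": run["steps"] / run["hours"],
    }


if __name__ == "__main__":
    for name, run in RUNS.items():
        stats = summarize(run)
        print(f"{name}: ~{stats['used_gb']:.1f} GB in use, "
              f"{stats['steps_per_hour']:.1f} steps/hour")
```

Both runs land in the same range of absolute memory, at most about 24 GB on the A100 and about 22.6 GB on the RTX 3090, which is consistent with the same job fitting on either card. The A100 completes roughly 92 steps per hour against 20 on the RTX 3090.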

Have fun training and fine-tuning models with GaLore on the FEDML Nexus AI platform.

 
