✍️ Text Generation

Qwen3.6-35B-A3B-NVFP4

nvidia/Qwen3.6-35B-A3B-NVFP4

Downloads: 4.6M
Likes: 314
Tags: 18
Library: Model Optimizer
Model Details
Full Model ID: nvidia/Qwen3.6-35B-A3B-NVFP4
Pipeline / Task: text-generation
Library: Model Optimizer
Downloads (all-time): 4.6M
Likes: 314
Last Modified: 6/12/2026
Author / Org: nvidia
Private: No (public)
⚡ Quick Usage (Python)

This example uses the 🤗 Transformers library. Install it with: pip install transformers

from transformers import pipeline

# Load the model
pipe = pipeline("text-generation", model="nvidia/Qwen3.6-35B-A3B-NVFP4")

# Run inference
result = pipe("Your input here")
print(result)
🏷️ Tags
Model Optimizer, safetensors, qwen3_5_moe, nvidia, ModelOpt, Qwen3.6, quantized, FP4, fp4, text-generation, conversational, base_model:Qwen/Qwen3.6-35B-A3B, base_model:quantized:Qwen/Qwen3.6-35B-A3B, license:apache-2.0, 8-bit, modelopt, deploy:azure, region:us
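Several of the tags follow a key:value pattern (base_model, license, deploy, region). As a rough illustration of how such tags could be split into plain labels and key/value metadata, here is a minimal stdlib-only sketch. The helper name split_tags is hypothetical, and the tag list is copied from this page rather than fetched from any API.

```python
from collections import defaultdict

# Tags as listed on this page (18 in total).
TAGS = [
    "Model Optimizer", "safetensors", "qwen3_5_moe", "nvidia", "ModelOpt",
    "Qwen3.6", "quantized", "FP4", "fp4", "text-generation", "conversational",
    "base_model:Qwen/Qwen3.6-35B-A3B",
    "base_model:quantized:Qwen/Qwen3.6-35B-A3B",
    "license:apache-2.0", "8-bit", "modelopt", "deploy:azure", "region:us",
]


def split_tags(tags):
    """Separate plain tags from 'key:value' tags (hypothetical helper)."""
    plain = []
    namespaced = defaultdict(list)
    for tag in tags:
        # Split on the first colon only, so nested values such as
        # "quantized:Qwen/..." stay intact under the "base_model" key.
        key, sep, value = tag.partition(":")
        if sep:
            namespaced[key].append(value)
        else:
            plain.append(tag)
    return plain, dict(namespaced)


plain, namespaced = split_tags(TAGS)
print("license:", namespaced["license"])
print("base_model:", namespaced["base_model"])
print("plain tags:", len(plain))
```

Splitting on the first colon only is a deliberate choice here: it keeps the base-model relationship ("quantized") attached to the repository name instead of discarding it.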
More Text Generation Models
  • Qwen3-0.6B (Qwen/Qwen3-0.6B): 27.8M downloads, 1.4K likes
  • Qwen3-4B (Qwen/Qwen3-4B): 16.4M downloads, 641 likes
  • gpt2 (openai-community/gpt2): 13.3M downloads, 3.3K likes
✍️ Task: Text Generation

This model is designed for the Text Generation task.
🛠️ Requirements
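  • Install: pip install transformers (needed for the Quick Usage snippet). The Model Optimizer library is published on PyPI as nvidia-modelopt; "pip install Model Optimizer" is not a valid command.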
  • A currently supported Python 3 release is recommended; check the Transformers documentation for its minimum version.
  • GPU (CUDA) speeds up inference significantly.
  • The weights are already quantized to FP4, which is smaller than fp16; model.half() is a generic tip for unquantized checkpoints.
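To put the VRAM point in numbers, here is a back-of-the-envelope estimate of weight storage at different precisions. It is only a sketch under stated assumptions: the 35e9 parameter count is read off the "35B" in the model name, every weight is treated as stored at the given bit width, and quantization scale factors, any layers left unquantized, the KV cache, and activations are all ignored. Real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: total parameter count taken from the "35B" in the model name.
TOTAL_PARAMS = 35e9

fp16_gb = weight_memory_gb(TOTAL_PARAMS, 16)  # 16 bits per weight
fp4_gb = weight_memory_gb(TOTAL_PARAMS, 4)    # 4 bits per weight, no scale overhead

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{fp4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the fp16 footprint, which is why casting to fp16 is not a memory-saving step for a checkpoint that is already quantized.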