
Comparing changes

base repository: NVIDIA-NeMo/NeMo
base: main
head repository: intercom/NeMo
compare: main
  • 20 commits
  • 46 files changed
  • 5 contributors

Commits on Sep 25, 2025

  1. feat: Add Qwen3-Next hybrid attention architecture support

    Implement complete Qwen3-Next architecture integration with NeMo Megatron framework:
    
    - Add hybrid attention mechanism supporting both full and linear attention layers
    - Implement gated delta rule for efficient linear attention with O(n) complexity
    - Create modular layer specifications with dynamic attention type selection
    - Add custom transformer layer supporting mixed attention mechanisms
    - Integrate with existing NeMo model infrastructure via megatron_gpt_model.py
    
    Key features:
    - Linear attention with chunk-based and recurrent processing algorithms
    - L2 normalization and clipping mechanisms for numerical stability
    - Full tensor parallelism support for distributed training
    - Memory-efficient processing for long sequences (4K+ tokens)
    - Configurable layer type patterns for optimal performance/efficiency balance
    
    Architecture allows 3B total parameters with ~1B active during inference,
    providing substantial efficiency gains while maintaining model expressivity.
    
    Files added:
    - nemo/collections/nlp/models/language_modeling/megatron/qwen3_next/__init__.py
    - nemo/collections/nlp/models/language_modeling/megatron/qwen3_next/qwen3_next_modules.py
    - nemo/collections/nlp/models/language_modeling/megatron/qwen3_next/qwen3_next_spec.py
    - nemo/collections/nlp/models/language_modeling/megatron/qwen3_next/qwen3_next_layer.py
    - nemo/collections/nlp/models/language_modeling/megatron/qwen3_next/README.md
    
    Files modified:
    - nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py
    Intercom Engineering committed Sep 25, 2025
    6846fc2
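The gated delta rule named in the commit above can be illustrated with a minimal recurrent sketch. This is not the code from `qwen3_next_modules.py`; it assumes the commonly published formulation S_t = α_t·S_{t-1}·(I − β_t·k_t·k_tᵀ) + β_t·v_t·k_tᵀ with L2-normalized queries and keys, and the function names `l2_normalize` and `gated_delta_rule` are hypothetical. It makes one pass over the sequence, which is where the O(n) cost in sequence length comes from.

```python
import math


def l2_normalize(x, eps=1e-6):
    """Scale a vector to (approximately) unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(a * a for a in x) + eps)
    return [a / norm for a in x]


def gated_delta_rule(q, k, v, alpha, beta):
    """Recurrent gated delta rule over one sequence (illustrative, single head).

    q, k: lists of T vectors of size d_k; v: list of T vectors of size d_v.
    alpha: per-step decay gates, beta: per-step write strengths (both assumed in [0, 1]).
    Returns a list of T output vectors of size d_v.
    """
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_k for _ in range(d_v)]  # memory matrix S, shape (d_v, d_k)
    outputs = []
    for q_t, k_t, v_t, a_t, b_t in zip(q, k, v, alpha, beta):
        q_t, k_t = l2_normalize(q_t), l2_normalize(k_t)
        # 1) Gate: decay the existing memory.
        state = [[a_t * s for s in row] for row in state]
        # 2) Read what the decayed memory currently returns for this key.
        pred = [sum(s * kj for s, kj in zip(row, k_t)) for row in state]
        # 3) Delta update: write only the error between the target value and the prediction.
        state = [
            [s + b_t * (v_i - p_i) * kj for s, kj in zip(row, k_t)]
            for row, v_i, p_i in zip(state, v_t, pred)
        ]
        # 4) Read out with the query.
        outputs.append([sum(s * qj for s, qj in zip(row, q_t)) for row in state])
    return outputs
```

With `alpha = beta = 1`, writing a second value under the same key overwrites the first, which is the behaviour that distinguishes the delta rule from plain additive linear attention. A chunk-based variant computes the same result in blocks and is not shown here.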
  2. feat: Add FineWeb dataset preprocessing and streaming support

    Add comprehensive FineWeb dataset integration for NeMo pre-training:
    
    - Create preprocessing script to convert HuggingFace FineWeb to NeMo binary format
    - Implement streaming dataset support for large-scale web data processing
    - Add optimized data loading with configurable batch sizes and workers
    - Support subset selection (sample-10BT, sample-350BT) for experimentation
    - Integrate Qwen2 tokenizer for consistent text processing
    
    Key features:
    - Efficient streaming from HuggingFace datasets hub
    - Automatic text filtering and length validation
    - Parallel processing with configurable worker count
    - Memory-efficient JSONL intermediate format
    - Compatible with NeMo's standard indexed dataset pipeline
    
    Files added:
    - scripts/nlp_language_modeling/preprocess_fineweb_for_qwen3_next.py
    - nemo/collections/nlp/data/language_modeling/megatron/hf_streaming_dataset.py
    - nemo/collections/nlp/data/language_modeling/megatron/gpt_fineweb_dataset.py
    
    Enables training on web-scale datasets with proper tokenization and formatting
    for foundation model pre-training workflows.
    Intercom Engineering committed Sep 25, 2025
    10b6e04
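The filtering and JSONL steps described in the commit above might look roughly like the sketch below. It is not the contents of `preprocess_fineweb_for_qwen3_next.py`; the function names and the `min_chars`/`max_chars` thresholds are hypothetical, and the only assumption about the input is that each record is a dict with a `text` field. Generators keep memory use flat because records are processed one at a time.

```python
import json


def filter_records(records, min_chars=200, max_chars=100_000):
    """Yield cleaned records whose text length falls within [min_chars, max_chars].

    The thresholds are illustrative defaults, not values taken from the commit.
    """
    for rec in records:
        text = (rec.get("text") or "").strip()
        if min_chars <= len(text) <= max_chars:
            yield {"text": text}


def write_jsonl(records, path):
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Any iterable of dicts can feed `filter_records`, including a streaming iterator such as the one returned by Hugging Face `datasets.load_dataset(..., streaming=True)`. The resulting JSONL file would then be passed to the tokenization and binary-indexing step.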
  3. feat: Add complete Qwen3-Next training configuration and documentation

    Add production-ready training setup for 3B parameter Qwen3-Next model:
    
    - Complete training configuration for 8-GPU setup with optimal parallelism
    - Hybrid attention pattern optimized for 3B total/1B active parameters
    - BFloat16 mixed precision and gradient accumulation for memory efficiency
    - Integration with preprocessed FineWeb dataset using NeMo binary format
    - Comprehensive documentation with step-by-step training guide
    
    Training configuration features:
    - Tensor parallelism across 2 GPUs per model replica
    - Sequence parallel processing for 4K sequence lengths
    - Activation checkpointing for memory optimization
    - Cosine annealing scheduler with appropriate warmup
    - Validation and checkpointing with best model selection
    
    Documentation includes:
    - Complete preprocessing pipeline instructions
    - Training command examples and parameter explanations
    - Performance tuning guidelines for different GPU configurations
    - Troubleshooting section for common training issues
    - Architecture details and scaling recommendations
    
    Files added:
    - qwen3_next_3b_fineweb_training.yaml
    - QWEN3_NEXT_TRAINING_GUIDE.md
    
    Ready for production training of hybrid attention language models on web-scale data.
    jamesoneill12 committed Sep 25, 2025
    0e2bc0a
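The "cosine annealing scheduler with appropriate warmup" mentioned in the commit above can be sketched as a plain function of the step count. This is a generic formulation, not the scheduler configured in `qwen3_next_3b_fineweb_training.yaml`; the function name `lr_at_step` and all parameter names are hypothetical, and linear warmup is an assumption.

```python
import math


def lr_at_step(step, max_lr, min_lr, warmup_steps, total_steps):
    """Learning rate with linear warmup followed by cosine decay to min_lr.

    Steps are zero-based. Warmup reaches max_lr on the last warmup step.
    """
    if step < warmup_steps:
        # Linear ramp from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 past total_steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

A short warmup avoids large updates while optimizer statistics are still noisy, and the cosine tail lowers the rate smoothly toward `min_lr` instead of dropping it in steps.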

Commits on Sep 30, 2025

  1. Bump transformers from 4.51.3 to 4.53.0 in /requirements (#1)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Sep 30, 2025
    c4fae31

Commits on Oct 1, 2025

  1. Bump vllm from 0.8.5.post1 to 0.10.1.1 in /requirements (#2)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Oct 1, 2025
    470f9c4
  2. Bump vite from 6.3.5 to 6.3.6 in /examples/voice_agent/client (#3)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Oct 1, 2025
    6c69d36

Commits on Oct 8, 2025

  1. Bump vllm from 0.10.1.1 to 0.11.0 in /requirements (#4)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Oct 8, 2025
    a2153d8

Commits on Oct 21, 2025

  1. Bump vite from 6.3.6 to 6.4.1 in /examples/voice_agent/client (#5)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Oct 21, 2025
    54c2323

Commits on Mar 3, 2026

  1. Bump rollup from 4.43.0 to 4.59.0 in /examples/voice_agent/client (#6)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Mar 3, 2026
    314d361

Commits on Mar 26, 2026

  1. Bump picomatch from 4.0.2 to 4.0.4 in /examples/voice_agent/client (#7)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Mar 26, 2026
    ff60acd

Commits on Apr 8, 2026

  1. Bump vite from 6.4.1 to 6.4.2 in /examples/voice_agent/client (#8)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Apr 8, 2026
    d963cae

Commits on Apr 15, 2026

  1. Bump transformers from 4.53.0 to 5.0.0rc3 in /requirements (#9)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Apr 15, 2026
    36ea214

Commits on Apr 21, 2026

  1. Bump protobufjs from 7.5.3 to 7.5.5 in /examples/voice_agent/client (#10)
    
    Automatic merge of Dependabot PR
    dependabot[bot] authored Apr 21, 2026
    ba88b6a

Commits on Jun 3, 2026

  1. Disable GitHub Actions workflows (#15)

    Move .github/workflows/*.yml(.yaml) to .github/workflows-disabled/ to neutralize
    them ahead of the org-wide GitHub Actions enablement. Restore via the
    re-enable-workflow process when a workflow is intentionally re-enabled.
    
    Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    eredi93 and claude authored Jun 3, 2026
    a1c444b
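The move described in the commit above could be scripted along these lines. This is a sketch under stated assumptions, not the procedure the authors ran: the function name `disable_workflows` is hypothetical, only the two directory names come from the commit message, and a real change would more likely use `git mv` so the rename is tracked.

```python
import shutil
from pathlib import Path


def disable_workflows(repo_root):
    """Move *.yml and *.yaml files from .github/workflows to .github/workflows-disabled.

    Returns the sorted list of file names that were moved. Other files are left in place.
    """
    src = Path(repo_root) / ".github" / "workflows"
    dst = Path(repo_root) / ".github" / "workflows-disabled"
    if not src.is_dir():
        return []
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.iterdir()):
        if path.is_file() and path.suffix in {".yml", ".yaml"}:
            shutil.move(str(path), str(dst / path.name))
            moved.append(path.name)
    return moved
```

GitHub Actions only loads workflow files from `.github/workflows`, so files under any other directory are ignored until they are moved back.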

Commits on Jun 4, 2026

  1. Bump protobufjs from 7.5.5 to 7.6.1 in /examples/voice_agent/client (#14)
    
    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 4, 2026
    e26ede8

Commits on Jun 10, 2026

  1. Bump vllm from 0.11.0 to 0.20.0 in /requirements (#11)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 10, 2026
    501bbde
  2. Bump postcss from 8.5.6 to 8.5.15 in /examples/voice_agent/client (#16)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 10, 2026
    f2b5559

Commits on Jun 16, 2026

  1. Bump esbuild, @vitejs/plugin-react-swc and vite (#18)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 16, 2026
    5e6f605
  2. Bump vllm from 0.20.0 to 0.22.0 in /requirements (#17)

    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 16, 2026
    2427db1

Commits on Jun 17, 2026

  1. Bump protobufjs from 7.6.1 to 8.6.3 in /examples/voice_agent/client (#19)
    
    Automatic merge of Dependabot PR
    dependabot[bot] authored Jun 17, 2026
    2da1aac