Version
0.13.0
On which installation method(s) does this occur?
No response
Describe the issue
Title
[Bug] AttributeError: 'NoneType' object has no attribute 'patches_enabled' during import (Multi-GPU without megatron-core)
Description
When running inference on a multi-GPU system without distributed dependencies (e.g., megatron-core) installed, importing physicsnemo causes a fatal AttributeError.
The code correctly uses a try...except block to catch the ImportError for distributed libraries, falling back to setting ShardTensor = None. However, on line 223 of natten_patches.py, there is an unprotected module-level access to ShardTensor.patches_enabled, which inevitably crashes the application upon import.
Steps to Reproduce
- Use a multi-GPU environment (e.g., 2x NVIDIA RTX/A100).
- Install earth2studio and physicsnemo without megatron-core or apex.
- Attempt to import the module:
python -c "from physicsnemo.domain_parallel.shard_utils.natten_patches import ShardTensor"
- Observe the crash.
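Before reproducing, it may help to confirm which optional packages the environment can actually see. The following is a minimal sketch; the top-level import names `megatron` and `apex` are assumptions about how megatron-core and apex are exposed, and `is_importable` is a hypothetical helper, not part of physicsnemo.

```python
import importlib.util


def is_importable(top_level_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(top_level_name) is not None


# Assumed top-level import names; adjust if your installation differs.
status = {name: is_importable(name) for name in ("physicsnemo", "megatron", "apex")}
print(status)
```

If `physicsnemo` is reported as available while `megatron` and `apex` are not, the environment should match the conditions described in this report.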
Error Traceback
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../site-packages/physicsnemo/domain_parallel/shard_utils/natten_patches.py", line 223, in <module>
    "natten.functional", "na2d", enabled=ShardTensor.patches_enabled
AttributeError: 'NoneType' object has no attribute 'patches_enabled'
Root Cause
In physicsnemo/domain_parallel/shard_utils/natten_patches.py, the fallback logic sets ShardTensor to None if distributed dependencies are missing.
However, at line 223, the code executes at the module level:
patch_method(
    "natten.functional", "na2d", enabled=ShardTensor.patches_enabled
)
Since ShardTensor is None, accessing .patches_enabled raises an AttributeError at import time, making it a hard blocker for any non-distributed multi-GPU inference.
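The failure pattern can be reproduced in isolation, without physicsnemo installed. This is a minimal sketch of the pattern described above, not the library's actual code: the module name `_hypothetical_missing_distributed_pkg` is a placeholder for the optional distributed import, and `access_flag` is a hypothetical stand-in for the unguarded module-level access.

```python
try:
    # Placeholder for the optional distributed import (hypothetical name).
    from _hypothetical_missing_distributed_pkg import ShardTensor
except ImportError:
    # Same fallback the issue describes: the name exists but is None.
    ShardTensor = None


def access_flag():
    """Mimic the unguarded attribute access on the fallback value."""
    return ShardTensor.patches_enabled


try:
    access_flag()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The try...except only covers the import itself, so the later attribute access on None is left unprotected.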
Proposed Solution
Guard the access with an explicit None check, or use getattr with a default, so that the import succeeds for non-distributed users:
# Suggested fix for line 223:
patch_method(
    "natten.functional", "na2d", enabled=getattr(ShardTensor, 'patches_enabled', False)
)
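The behavior of the getattr fallback can be checked on its own. This is a sketch under stated assumptions: `_FakeShardTensor` and `resolve_enabled` are hypothetical names for illustration, and the sketch assumes patches_enabled is readable as a plain attribute on the class, which may not match the real ShardTensor.

```python
class _FakeShardTensor:
    """Hypothetical stand-in for the real class, used only for this demo."""

    patches_enabled = True


def resolve_enabled(shard_tensor_cls):
    """Return the patch flag, or False when the class is unavailable (None)."""
    return getattr(shard_tensor_cls, "patches_enabled", False)


# Distributed dependencies missing: the fallback value is None.
enabled_when_missing = resolve_enabled(None)
# Distributed dependencies present: the class attribute is used.
enabled_when_present = resolve_enabled(_FakeShardTensor)

print(enabled_when_missing, enabled_when_present)
```

With this guard, the patch is registered as disabled when ShardTensor is None, instead of raising during import.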
Environment
- OS: Linux
- Python: 3.11
- PyTorch: 2.5.1
- Hardware: Multi-GPU (e.g., 2 GPUs)