Classical Core Components:
All of these are embedded in a high-level language.
You've already seen and used this sort of thing: NumPy.
x.to("device_name")   # e.g. "cpu", "cuda", or "mps"
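As a minimal sketch (assuming a stock PyTorch install with the standard backend names), a common pattern is to pick whichever device is actually available and move tensors onto it:

import torch

# pick an accelerator if one is present, otherwise fall back to the CPU
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

x = torch.ones((1, 3, 4))
x = x.to(device)   # .to() returns a copy of the tensor on the chosen device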
When we manifest a node of the compute graph, we can either compute its value right away, as soon as the node is created (eager mode), or merely record the operation and defer the actual computation until the values are needed (lazy/graph mode).
This was the classic distinction between TensorFlow (which historically built the graph first) and PyTorch (which is eager by default).
import torch
x = torch.ones((1,3,4))
x = torch.ones(())
x
tensor(1.)
x = torch.ones(())
x.requires_grad = True
u = (x + 2)
y = u.square()  # y = (x + 2)^2, so dy/dx = 2 * (x + 2) = 2 * (1 + 2) = 6 at x = 1
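To actually see that gradient we can extend the snippet with a backward pass; the expected value of x.grad follows from the comment above:

y.backward()   # backpropagate through y = (x + 2)^2
print(x.grad)  # tensor(6.) -- i.e. 2 * (x + 2) evaluated at x = 1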
Advantages of eager mode (compute values & manifest graph at the same time):
Advantages of lazy mode/graph mode (manifest graph first, then compute values):
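One way to see the lazy/graph-mode side of this within PyTorch itself is torch.fx, which records a function's compute graph without evaluating it. A small illustrative sketch (the function f is just an example, not anything from the code above):

import torch
import torch.fx

def f(x):
    u = x + 2
    return u.square()

# symbolic_trace manifests the graph first, without computing any values ...
gm = torch.fx.symbolic_trace(f)
print(gm.graph)            # the recorded add -> square graph
# ... and the values are only computed when we run the traced graph
print(gm(torch.ones(())))  # tensor(9.)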
import time
N = 1024 * 16
N**2 / (1024**3)  # elements in an N x N matrix, in Gi-elements (x 4 bytes ~= 1 GiB per matrix in float32)
0.25
X = torch.randn(N,N)
Y = torch.randn(N,N)
begin = time.time()
Z = X + Y
print(Z[0,0])
end = time.time()
print(f"elapsed: {(end - begin) * 1000} ms")
tensor(-1.4525)
elapsed: 60.443878173828125 ms
X_mps = X.to("mps")
Y_mps = Y.to("mps")
begin = time.time()
Z_mps = X_mps + Y_mps
print(Z_mps[0,0])
end = time.time()
print(f"elapsed: {(end - begin) * 1000} ms")
tensor(-1.4525, device='mps:0')
elapsed: 56.93316459655762 ms
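A caveat on these numbers: MPS kernels are launched asynchronously, so a fairer timing explicitly waits for the GPU to finish before reading the clock. A hedged sketch, assuming a PyTorch build that exposes torch.mps.synchronize():

begin = time.time()
Z_mps = X_mps + Y_mps
torch.mps.synchronize()   # block until the GPU has actually finished the addition
end = time.time()
print(f"elapsed: {(end - begin) * 1000} ms")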
554.5499324798584 / 118.25203895568848
4.689559160055244
4210.312843322754 / 985.3289127349854
4.273002434929221
U = torch.ones((1))
U_mps = U.to("mps")
U
tensor([1.])
U_mps[0] = 3
U
tensor([1.])
U_mps
tensor([3.], device='mps:0')
U_mps.cpu()
tensor([3.])
U_mps.to("cpu")
tensor([3.])
U = torch.ones((5,5,5),device="mps")
(N ** 3) / (1024**3)  # elements in an N x N x N tensor, in Gi-elements
4096.0
U = torch.ones((N,N,N),device="meta")  # ~16 TiB in float32, fine on "meta" since no data is ever allocated
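The same trick works for whole models: because meta tensors carry shapes and dtypes but no storage, we can instantiate something far too large for this machine just to inspect its shapes and parameter count. A small sketch (the layer size is arbitrary):

import torch.nn as nn

# a ~10-billion-parameter linear layer, never actually allocated
layer = nn.Linear(100_000, 100_000, device="meta")
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)             # 10_000_100_000 parameters (~40 GB in float32)
print(layer.weight.device)  # meta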
torch.ones((10000,20000),device="meta") @ torch.ones((20000,30000),device="meta")
tensor(..., device='meta', size=(10000, 30000))
torch.ones((10000,20000),device="meta") @ torch.ones((30000,30000),device="meta")
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[116], line 1
----> 1 torch.ones((10000,20000),device="meta") @ torch.ones((30000,30000),device="meta")

File /opt/anaconda3/lib/python3.9/site-packages/torch/_meta_registrations.py:448, in meta_mm(a, b)
    446 N, M1 = a.shape
    447 M2, P = b.shape
--> 448 check(M1 == M2, lambda: "a and b must have same reduction dim")
    449 return a.new_empty(N, P)

File /opt/anaconda3/lib/python3.9/site-packages/torch/_prims_common/__init__.py:1563, in check(b, s, exc_type)
   1556 """
   1557 Helper function for raising an error_type (default: RuntimeError) if a boolean condition fails.
   1558 Error message is a callable producing a string (to avoid wasting time
   1559 string formatting in non-error case, and also to make it easier for torchdynamo
   1560 to trace.)
   1561 """
   1562 if not b:
-> 1563     raise exc_type(s())

RuntimeError: a and b must have same reduction dim

Even with no data, the meta kernel still validates that the inner dimensions match, which is exactly the check that fires here.
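That makes meta tensors a cheap way to dry-run the shape logic of a computation before paying for it. As a hedged sketch (check_shapes is a hypothetical helper, not a PyTorch API):

def check_shapes(f, *shapes):
    # run f on meta tensors: shapes are propagated and checked, but no data is touched
    args = [torch.ones(s, device="meta") for s in shapes]
    return f(*args).shape

check_shapes(lambda a, b: a @ b, (10000, 20000), (20000, 30000))
# torch.Size([10000, 30000])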
torch.ones((1000000,2000000),device="meta")
tensor(..., device='meta', size=(1000000, 2000000))
torch.ones((1000000,2000000)).to("meta")  # unlike the above, this would first materialize the ~8 TB tensor on the CPU before moving it
import torch
X = torch.ones((1000,2000),device="meta")
X.nonzero()
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[4], line 1
----> 1 X.nonzero()

NotImplementedError: Could not run 'aten::nonzero' with arguments from the 'Meta' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). 'aten::nonzero' is only available for these backends: [CPU, MPS, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMeta, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PythonDispatcher].

This fails because the output shape of nonzero depends on the tensor's values, which a meta tensor does not have: the shape cannot be inferred from shapes and dtypes alone.