# Building an autograd engine: tinytorch 01

Community blog post
Published January 21, 2024

This blog used to live on pythonstuff.com. This is the new home for this blog 🤗. I am gonna show you how to build your own tiny PyTorch. Yes, for real.

It's the weekend, and I am going to build my own autograd engine. I have done this once before, so it shouldn't be a problem. I started with an empty git repo, and I will keep committing without any rebase, so if someone goes back they can see everything.

This is a dev blog. I'll try to write explanations so you can recreate it. I'm not used to writing tutorials, so DM me at @shxf0072 on Twitter (X now) if I f something up. All code will be at github.com/joey00072/tinytorch. I'll leave commits in the blog, so do git checkout COMMIT_ID to go to that point.

## tinytorch

```python
import numpy as np

class Tensor:
    def __init__(self, data):
        # wrap anything array-like in a numpy array
        self.data = data if isinstance(data, np.ndarray) else np.array(data)

    def __add__(self, other):
        return Tensor(self.data + other.data)

    def __mul__(self, other):
        return Tensor(self.data * other.data)

    def __repr__(self):
        return f"tensor({self.data})"

if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x + y

    print(z)
```



My friends called me up for Valorant, so I'll be back in 2 hours or so (20:02 19-08-2023)

Back (23:02 19-08-2023)

Don't worry about the math, the code is easy.

The derivative of anything with respect to itself is 1. For $f(x) = x$, $\frac{df}{dx} = 1$.

$f(x) = x + 10$

Its derivative will be 1: $\frac{d(x)}{dx} = 1$, and the derivative of a constant is 0, so the derivative of 10 is $0$. So $1 + 0 = 1$.

For two variables: $f(x,y) = x + y$

with respect to $x$: $\frac{\partial f}{\partial x} = 1$

with respect to $y$: $\frac{\partial f}{\partial y} = 1$

So if $x = 10$ and $y = 20$, then $z = x + y = 30$ and both partial derivatives are 1.

Notice that adding gives an equal gradient back to both nodes. Since z has gradient 1, x and y both get gradient 1. This will be useful in residual connections in transformers.
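If you don't trust the math, you can check it numerically. This is a minimal sketch, not part of tinytorch: `numerical_grad` is a hypothetical helper that nudges each input by a small `eps` (a central difference) and watches how the output moves.

```python
def numerical_grad(f, x, y, eps=1e-6):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dfdx, dfdy

def add(x, y):
    return x + y

# same point as in the text: x = 10, y = 20
dz_dx, dz_dy = numerical_grad(add, 10.0, 20.0)
print(round(dz_dx, 4), round(dz_dy, 4))
```

Both numbers should come out very close to 1, which matches the rule that addition hands the same gradient to both inputs.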


#### MUL

Now, let's consider multiplication. If you have the function $g(x) = x \cdot 10$, its derivative will be 10. By the product rule it is $\frac{d(x)}{dx} \cdot 10 + x \cdot \frac{d(10)}{dx}$, where $\frac{d(x)}{dx} = 1$ and the derivative of 10 is 0. So $1 \cdot 10 + x \cdot 0 = 10$.

$f(x,y) = x \cdot y$

For two variables:

with respect to $x$: $\frac{\partial f}{\partial x} = y$

with respect to $y$: $\frac{\partial f}{\partial y} = x$

Notice that in this case the derivative with respect to x has the value of y (20), and the derivative with respect to y has the value of x (10).
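The swap (x gets y's value, y gets x's value) is easy to confirm with the same kind of numerical check. Again a sketch under assumptions: `numerical_grad` is a hypothetical helper, not something from tinytorch.

```python
def numerical_grad(f, x, y, eps=1e-6):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dfdx, dfdy

def mul(x, y):
    return x * y

x, y = 10.0, 20.0
dz_dx, dz_dy = numerical_grad(mul, x, y)

# analytic rule from the text: dz/dx = y and dz/dy = x
print(round(dz_dx, 4), "should be close to", y)
print(round(dz_dy, 4), "should be close to", x)
```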


Let's code

We will create `Add`, `Mul`, and `Function` classes, move the operation logic into the `forward` method of each class, and store the args in `Function.args` for the backward pass.


```python
class Function:
    def __init__(self, op, *args):
        self.op = op      # the operation class, e.g. Add or Mul
        self.args = args  # the input tensors, saved for backward

class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)  # z = x+y

    @staticmethod
    def backward(ctx):
        x, y = ctx.args
        return Tensor([1]), Tensor([1])  # dz/dx, dz/dy

class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)  # z = x*y

    @staticmethod
    def backward(ctx):
        x, y = ctx.args
        return Tensor(y.data), Tensor(x.data)  # dz/dx, dz/dy
```


The `Function` class stores the function/operation that we have applied. So if we add x = 10 and y = 20, the function object will have `fn.op = Add` and `fn.args = (x, y)`, the tensors holding 10 and 20.

We pass the function object as context to `backward`, so we get the original args back when we do the backward pass.
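Here is a small sketch of that round trip, done by hand for one multiplication. It assumes `backward` takes the context as its only argument. The classes are trimmed copies of the ones in this post so the snippet runs on its own.

```python
import numpy as np

class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)

    def __repr__(self):
        return f"tensor({self.data})"

class Function:
    def __init__(self, op, *args):
        self.op = op      # which operation ran, e.g. Mul
        self.args = args  # the input tensors, kept for the backward pass

class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)

    @staticmethod
    def backward(ctx):  # assumed signature: the context is the only argument
        x, y = ctx.args
        return Tensor(y.data), Tensor(x.data)  # dz/dx, dz/dy

x, y = Tensor([10]), Tensor([20])
fn = Function(Mul, x, y)            # record the op and its inputs
z = fn.op.forward(*fn.args)         # forward pass: 10 * 20
dz_dx, dz_dy = fn.op.backward(fn)   # backward reads x and y back out of fn.args
print(z, dz_dx, dz_dy)  # tensor([200]) tensor([20]) tensor([10])
```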


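```python
class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None  # Function that produced this tensor (None for leaf tensors)

    def __add__(self, other):
        fn = Function(Add, self, other)
        result = Add.forward(self, other)
        result._ctx = fn
        return result

    def __mul__(self, other):
        fn = Function(Mul, self, other)
        result = Mul.forward(self, other)
        result._ctx = fn
        return result

    def __repr__(self):
        return f"tensor({self.data})"
```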


So when you do some op:

1. First, store all the info related to that op in a Function object
2. Then do the op.forward
3. Store the Function object in the result node
4. Return the result
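The four steps above are the same for every op, so they can be sketched as one helper. `_apply` is a hypothetical name, not something in the tinytorch repo, and the op classes are trimmed to `forward` only to keep the snippet short.

```python
import numpy as np

class Function:
    def __init__(self, op, *args):
        self.op = op
        self.args = args

class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None

    def _apply(self, op, other):  # hypothetical helper, not in the post's code
        fn = Function(op, self, other)    # 1. store info about the op
        result = op.forward(self, other)  # 2. run the forward pass
        result._ctx = fn                  # 3. attach that info to the result
        return result                     # 4. return the result

    def __add__(self, other):
        return self._apply(Add, other)

    def __mul__(self, other):
        return self._apply(Mul, other)

    def __repr__(self):
        return f"tensor({self.data})"

class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)

class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)

x, y = Tensor([8]), Tensor([5])
z = x * y + x
print(z, z._ctx.op.__name__, z._ctx.args)
# tensor([48]) Add (tensor([40]), tensor([8]))
```

Leaf tensors like `x` and `y` keep `_ctx = None`, which is how a graph walk knows where to stop.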

If you want to see this graph, create a new visualize.py file.

```bash
pip install graphviz
sudo apt-get install -y graphviz # IDK what to do for Windows, I use WSL
```

```python
import graphviz
from tinytorch import *

G = graphviz.Digraph(format='png')
G.clear()

def visit_nodes(G: graphviz.Digraph, node: Tensor):
    uid = str(id(node))
    G.node(uid, f"Tensor: {str(node.data)} ")
    if node._ctx:
        # one extra node for the op that produced this tensor
        ctx_uid = str(id(node._ctx))
        G.node(ctx_uid, f"Context: {str(node._ctx.op.__name__)}")
        G.edge(uid, ctx_uid)
        for child in node._ctx.args:
            G.edge(ctx_uid, str(id(child)))
            visit_nodes(G, child)

if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x + y
    visit_nodes(G, z)
    G.render(directory="vis", view=True)
    print(z)

    print(len(G.body))
```
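If you can't get graphviz installed, the same walk works as plain text. This is a rough sketch under assumptions: `describe` is a hypothetical helper that follows `_ctx` the same way `visit_nodes` does, and the classes are trimmed copies so the snippet runs without tinytorch.py.

```python
import numpy as np

class Function:
    def __init__(self, op, *args):
        self.op = op
        self.args = args

class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)

class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)

class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None

    def _record(self, op, other):
        result = op.forward(self, other)
        result._ctx = Function(op, self, other)
        return result

    def __add__(self, other):
        return self._record(Add, other)

    def __mul__(self, other):
        return self._record(Mul, other)

def describe(node, depth=0):  # hypothetical helper
    """Return the graph under `node` as a list of indented text lines."""
    pad = "  " * depth
    lines = [f"{pad}Tensor: {node.data}"]
    if node._ctx is not None:
        lines.append(f"{pad}  Context: {node._ctx.op.__name__}")
        for child in node._ctx.args:
            lines.extend(describe(child, depth + 2))
    return lines

x, y = Tensor([8]), Tensor([5])
z = x * y + x
print("\n".join(describe(z)))
```

Each `Context` line sits under the tensor it produced, and its inputs are indented one level further.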

The full tinytorch.py at this point:

```python
import numpy as np

class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None  # Function that produced this tensor

    def __add__(self, other):
        fn = Function(Add, self, other)
        result = Add.forward(self, other)
        result._ctx = fn
        return result

    def __mul__(self, other):
        fn = Function(Mul, self, other)
        result = Mul.forward(self, other)
        result._ctx = fn
        return result

    def __repr__(self):
        return f"tensor({self.data})"

class Function:
    def __init__(self, op, *args):
        self.op = op
        self.args = args

class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)  # z = x+y

    @staticmethod
    def backward(ctx):
        x, y = ctx.args
        return Tensor([1]), Tensor([1])  # dz/dx, dz/dy

class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)  # z = x*y

    @staticmethod
    def backward(ctx):
        x, y = ctx.args
        return Tensor(y.data), Tensor(x.data)  # dz/dx, dz/dy

if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x * y
    print(z)
```



Code up to commit dc11629: https://github.com/joey00072/tinytorch

Sleeping now, backprop tomorrow.