Extend to matrices and high-order tensors #94

Open
jli05 wants to merge 56 commits into karpathy:master from brief-ds:master
Conversation

jli05 commented Apr 10, 2025

First clean-up:

  • accessory image files moved into assets/
  • demo notebooks moved into demos/
  • unit tests now use Python's built-in unittest package, which removes the dependency on pytest
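
For readers unfamiliar with unittest, a test module in that style might look like the sketch below. The `relu` function is a hypothetical stand-in for an engine operator, and the `run_tests` helper is only there to make the example self-contained; neither is taken from this PR.

```python
import unittest


def relu(x):
    """Hypothetical stand-in for an operator under test (not from the PR)."""
    return x if x > 0 else 0.0


class TestRelu(unittest.TestCase):
    def test_positive_passes_through(self):
        self.assertEqual(relu(2.5), 2.5)

    def test_negative_is_clamped(self):
        self.assertEqual(relu(-1.0), 0.0)


def run_tests():
    """Load and run the test case programmatically, returning the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRelu)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With files laid out under a tests/ directory, `python -m unittest discover -s tests` runs every test module it finds there without any third-party package.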

Next, I plan to extend the engine to matrices and higher-order tensors, and to make backward() more efficient so that it does not have to recompute the operator topology for each new piece of input data.
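
One way to avoid recomputing the topology is to build the topological order once and reuse it for every later forward and backward pass. The sketch below illustrates that idea on a simplified scalar `Value` class; the `_topo` cache and the `forward()` method are assumptions made for illustration and may differ from what the PR actually implements.

```python
class Value:
    """Simplified scalar autograd node (illustrative, not the PR's class)."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._prev = children
        self._forward = lambda: None   # recomputes self.data from children
        self._backward = lambda: None  # pushes self.grad to children
        self._topo = None              # cached topological order

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def forward():
            out.data = self.data + other.data

        def backward():
            self.grad += out.grad
            other.grad += out.grad

        out._forward, out._backward = forward, backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def forward():
            out.data = self.data * other.data

        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._forward, out._backward = forward, backward
        return out

    def _order(self):
        # Build the topological order only on first use, then reuse it.
        if self._topo is None:
            topo, visited = [], set()

            def build(v):
                if v not in visited:
                    visited.add(v)
                    for child in v._prev:
                        build(child)
                    topo.append(v)

            build(self)
            self._topo = topo
        return self._topo

    def forward(self):
        for v in self._order():
            v._forward()

    def backward(self):
        order = self._order()
        for v in order:
            v.grad = 0.0
        self.grad = 1.0
        for v in reversed(order):
            v._backward()


x, w, b = Value(2.0), Value(3.0), Value(1.0)
y = x * w + b
y.backward()          # first call builds and caches the order
first_w_grad = w.grad

x.data = 5.0          # new input data, same graph
y.forward()           # reuses the cached order
y.backward()
second_w_grad = w.grad
```

The trade-off is that the cached order is only valid while the graph structure stays fixed; a graph that changes shape between inputs would need the cache invalidated.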

jli05 force-pushed the master branch 2 times, most recently from 224da0a to e7f552f (April 21, 2025 13:10)
@dnparadice

Why do you keep making PRs? The repo is fine the way it is. If you re-organize it, it will not match the video demos that go along with it.

jli05 (author) commented Apr 21, 2025

This is an extension to matrices and higher-order tensors. The notebook demos still run.

To see what it looks like, check out https://github.com/brief-ds/micrograd @dnparadice

jli05 changed the title from "Rearrange the files before further development" to "Extend to matrices and high-order tensors" Apr 21, 2025
jli05 (author) commented Apr 21, 2025

@karpathy could you review? You can check it out at https://github.com/brief-ds/micrograd:

  • extend the Value class and backward derivation to matrices and higher-order tensors
  • add a minimal set of operators
  • split the tests into tests/test_engine.py, which tests deterministically with no dependency on torch, and tests/test_vs_torch.py

The Value class and tensordot function are exported at the package level:

from micrograd import Value, tensordot
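
As background on what a backward rule for a tensor contraction involves, the sketch below uses NumPy's `np.tensordot` (not micrograd's `tensordot`, whose signature is not shown in this thread) for the 2-D case, where the contraction is an ordinary matrix product. The helper names `tensordot_backward` and `numeric_grad` are hypothetical. The analytic gradients are checked against finite differences, which needs no torch.

```python
import numpy as np


def tensordot_backward(a, b, grad_out):
    """Gradients of c = np.tensordot(a, b, axes=1) for 2-D a and b (c = a @ b)."""
    grad_a = grad_out @ b.T
    grad_b = a.T @ grad_out
    return grad_a, grad_b


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of scalar f() with respect to array x."""
    g = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + eps
        hi = f()
        x[idx] = old - eps
        lo = f()
        x[idx] = old  # restore the original entry
        g[idx] = (hi - lo) / (2 * eps)
    return g


rng = np.random.default_rng(0)  # fixed seed keeps the check deterministic
a = rng.normal(size=(3, 4))
b = rng.normal(size=(4, 2))


def loss():
    return np.tensordot(a, b, axes=1).sum()


# For loss = sum(c), the upstream gradient d(loss)/dc is all ones.
grad_a, grad_b = tensordot_backward(a, b, np.ones((3, 2)))
```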

jli05 (author) commented Oct 1, 2025

@karpathy this version works with tensors.

As the core is just one 500-line Python file, micrograd/engine.py, the learning curve is almost zero.

The blog post "TensorFlow, Apple's MLX and our micrograd" explains that, in terms of install size and performance, micrograd is on par with MLX, while being extra easy to learn, play with and profile.

Would you consider merging?

jli05 (author) commented Feb 10, 2026

I made micrograd tensor-capable with the idea of trying to simplify the attention mechanism. Below is a proposal. Any thoughts? @karpathy

https://www.brief-ds.com/2026/02/10/roadmap-att.html

Only micrograd's characteristics (its simplicity, and the way it opens up the forward and backward methods) make this study possible. Feel free to merge this PR. :)

jli05 (author) commented Mar 7, 2026

@karpathy it seems the max() in softmax() should be mathematically differentiable, as in microgpt:

https://gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95
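
A related property worth keeping in mind here: softmax is unchanged when the same constant is subtracted from every logit, so the derivative of each output with respect to that shift is zero. The sketch below checks this numerically in plain Python. The `softmax` helper and its `shift` parameter are illustrative assumptions and are not copied from microgpt or from this PR.

```python
import math


def softmax(xs, shift):
    """Softmax of xs after subtracting a scalar shift from every entry."""
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 2.0, 3.5]

p_max = softmax(logits, max(logits))  # the usual numerically stable form
p_zero = softmax(logits, 0.0)         # no shift at all
p_other = softmax(logits, -7.0)       # an arbitrary shift

# Central finite difference of each output with respect to the shift.
eps = 1e-4
s = max(logits)
d_shift = [
    (hi - lo) / (2 * eps)
    for hi, lo in zip(softmax(logits, s + eps), softmax(logits, s - eps))
]
```

The three probability vectors agree up to rounding, and every entry of `d_shift` is numerically zero.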

jli05 force-pushed the master branch 2 times, most recently from 9dbca4b to 6d900fe (March 14, 2026 21:20)
2 participants