MGhanayim

Popular repositories

  1. qlora-finetuning-and-attention (Public)

    QLoRA fine-tuning of two small open LLMs (decoder-only and encoder-decoder) for audience-adaptive Q&A, plus from-scratch scaled dot-product attention in PyTorch.

    Jupyter Notebook 1

  2. customer-service-analytics-agent-mohammad-ghanayim (Public)

    Python

  3. k8s-distributed-llm-finetuning (Public)

    Multi-node PyTorch DDP fine-tuning of a causal LM on Nebius GPU Kubernetes, with SkyPilot workload orchestration — a 2-node torchrun job with verified NCCL collectives.

    Python

  4. observable-vllm-text2sql-agent (Public)

    Text-to-SQL agent on vLLM (Qwen3-30B-A3B) with a LangGraph verify→revise loop, instrumented on two observability planes for metric-grounded SLO diagnosis.

    Python

  5. gpu-cuda-inference-optimization (Public)

    Three measured notebooks on GPU inference optimization: roofline analysis, KV-cache decode optimization (4.21x), and CUDA-graph launch-overhead elimination (5.38x). Pure PyTorch.

    Jupyter Notebook