IAAR-Shanghai / ICSFSurvey

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

decoding self-improvement knowledge-distillation data-augmentation reasoning self-consistency preference-learning hallucination self-correction attention-head large-language-models chain-of-thought large-language-model internal-consistency self-feedback self-refine self-correct

Updated Dec 7, 2024
Jupyter Notebook

SuperBruceJia / Awesome-LLM-Self-Consistency

Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models

semantics reasoning self-consistency pretrained-language-model gpt-3 factual-consistency gpt-4 llms chain-of-thought chatgpt self-consistency-learning logical-consistency self-consistent-generation self-consistency-benchmark llms-reasoning semantics-preserving semantics-consistency hypothetical-consistency compositional-consistency

Updated Jul 20, 2025

CycloneBoy / csc_sql

CSC-SQL: Corrective Self-Consistency in Text-to-SQL via Reinforcement Learning

self-consistency text-to-sql grpo

Updated Aug 12, 2025
Python

Cohorte-ai / trustgate

Black-box AI reliability certification via self-consistency sampling and conformal calibration

python reliability certification ai-safety ai-agents self-consistency conformal-prediction llm-evaluation deployment-gating

Updated Mar 28, 2026
Python

Amirhosein-gh98 / Guided-by-Gut

The official PyTorch implementation of Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence

efficient tree-search gg prm self-consistency confidence dvts rl-training llm inference-time-compute grpo test-time-scaling guided-by-gut

Updated Jun 9, 2025
Python

Toronto-Condensed-Matter-Theory / MeanFieldToolkit.jl

Object-oriented package for solving self-consistent mean-field theory of interacting lattice systems.

magnetism lattice julia-package superconductivity self-consistency mean-field-theory condensed-matter-physics interacting-particle-system ground-state-energy

Updated May 26, 2026
Julia

ictup / Enhancing-QA-Systems-through-Integrated-Reasoning-over-Knowledge-Bases-and-Large-Language-Models

KG-RAG + ToT + multi-agent LLMs for evidence-grounded QA with Neo4j and fine-tuning; reproducible medical case study & evaluation.

neo4j knowledge-graph question-answering lora mindmap bm25 reasoning self-consistency autogen fine-tuning nli peft rag llm prompt-engineering tree-of-thoughts llm-ranking

Updated Aug 14, 2025
Python

snikol03 / Markov-Chain-Project

Perl implementation of a Markov chain for the course BIO331

Updated May 3, 2017
Perl

Toronto-Condensed-Matter-Theory / FixedPointToolkit.jl

Fixed-point solver for generic functions

physics fixed-point julia-package self-consistency condensed-matter-physics

Updated Oct 2, 2023
Julia

SuperBruceJia / GSM8K-Consistency

GSM8K-Consistency is a benchmark database for analyzing the consistency of Arithmetic Reasoning on GSM8K.

Updated Dec 31, 2023

haiyang5535 / smboost

Decoding-time harness that lifts small open-weight LLMs (Qwen 2.5 2B Q4) on verifier-friendly benchmarks via parallel self-consistency, per-sample program verifiers, and raw-anchored majority voting.

decoding benchmarks self-consistency on-device-ai llm qwen small-language-models langgraph

Updated Apr 28, 2026
Python

giselamarti / Electronic_Structure

Coursework for the Electronic Structure subject of my master's degree

python quantum-mechanics computational-physics self-consistency hydrogen-atom gaussian-orbitals ground-state-energy

Updated Apr 29, 2023
Python

cataluna84 / football-ntap

Tactical next-action + reasoning prediction on 348 football match contexts (Shipd Project Eris). 4-component ensemble with task-coupling: DeBERTa-v3-base / large, cross-encoder MCQ scorer, zero-shot NLI, and a three-pass Qwen3.5-35B-A3B-Int4 + Gemma-4-26B-A4B-it MoE fusion with PRM rerank. W&B-instrumented. Target combined ≥ 0.80.

nlp text-classification pytorch ensemble football gemma self-consistency mixture-of-experts weights-and-biases huggingface-transformers deberta-v3 qwen3 shipd-ai multiple-choice-qa mlm-pretraining

Updated Apr 21, 2026
Python

msmrexe / llm-math-reasoning-analysis

An evaluation of prompting techniques (Zero-Shot CoT, Few-Shot, Self-Consistency) on the Mistral-7B model for mathematical reasoning. This project systematically benchmarks 7 distinct methods on the GSM8K dataset.

Updated Nov 2, 2025
Python

TEJA4704 / prompt-engineering-toolkit

Advanced prompt engineering techniques: Chain-of-Thought, Tree-of-Thoughts, ReAct, Self-Consistency

python reasoning ai-agents self-consistency llm prompt-engineering chain-of-thought tree-of-thoughts react-agent langchain-alternative

Updated Jan 20, 2026
Python

sjain-stanford / SCM-PLL

Self-consistent model-based filter design for 3-phase PLLs.

matlab mathematical-modelling filter-design pll self-consistency simulink-model

Updated Jan 4, 2018
Makefile

jameswniu / self-hosted-llm-evals-lab

Evaluation framework for self-hosted LLMs. Systematic prompt ablation (baseline, CoT, few-shot, self-consistency voting) on Llama 3.1 8B via lm-evaluation-harness, with Wilson CI statistical analysis, determinism validation, and load testing under concurrency. Found that chain-of-thought degrades accuracy by 25 pp at small scale.

benchmark natural-language-processing load-testing self-hosted statistical-analysis llama self-consistency determinism ablation-study prompt-engineering chain-of-thought ollama llm-evaluation lm-eval-harness

Updated Mar 9, 2026
Python
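
The jameswniu / self-hosted-llm-evals-lab entry above mentions Wilson confidence intervals for accuracy. As a minimal sketch of the standard Wilson score interval for a binomial proportion (not code from that repository; the function name `wilson_interval` and its parameters are illustrative assumptions):

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `successes` correct answers out of `n` trials; `z` is the normal
    quantile (1.96 corresponds to roughly 95% coverage).
    Hypothetical helper for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    # Centre of the interval is pulled toward 0.5 relative to p_hat.
    centre = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(successes=50, n=100)
    print(f"accuracy 0.50, 95% interval: [{low:.3f}, {high:.3f}]")
```

Unlike the plain normal approximation, this interval stays inside [0, 1] and behaves sensibly when accuracy is near 0 or 1, which is one reason it is commonly chosen for small benchmark subsets.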

alexneilgreen / UCF-ComputerUnderstandingOfNaturalLanguage-LLMReliabilityEval

CAP6640-Spring2026: Benchmarks GPT-3.5, GPT-4, Claude Haiku, and Gemini on GSM8K and TruthfulQA, measuring accuracy, self-consistency, and confidence calibration.

python benchmark natural-language-processing gemini openai self-consistency ucf confidence-calibration anthropic gsm8k llm-evaluation truthfulqa cap6640

Updated May 1, 2026
Python

Dr-AneeshJoseph / blast-rag-firewall

A consistency-based firewall for high-stakes Retrieval Augmented Generation (RAG). Queries the model multiple times and incinerates the output if entropy is high (divergent answers), preferring silence over hallucination.

reliability-engineering self-consistency rag entropy-checking legal-ai medical-ai hallucination-firewall

Updated Dec 16, 2025
Python
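
The blast-rag-firewall entry above describes querying a model several times and suppressing the output when the answers diverge. A minimal sketch of that idea, assuming exact-match answers after simple normalization and a hypothetical `sample` callable standing in for the model call (the names `answer_entropy`, `gated_answer`, and the threshold value are illustrative, not taken from that repository):

```python
import math
from collections import Counter
from typing import Callable, Optional


def answer_entropy(answers: list[str]) -> float:
    """Shannon entropy (in bits) of the empirical answer distribution."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gated_answer(
    sample: Callable[[], str],   # hypothetical: one model call per invocation
    n: int = 5,                  # number of samples to draw (assumed default)
    max_entropy: float = 0.8,    # assumed threshold, in bits
) -> Optional[str]:
    """Return the majority answer, or None when the samples disagree too much."""
    # Normalize so trivial formatting differences do not count as disagreement.
    answers = [sample().strip().lower() for _ in range(n)]
    if answer_entropy(answers) > max_entropy:
        return None  # divergent answers: prefer silence
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    canned = iter(["Paris", "paris", "Paris ", "paris", "Lyon"])
    print(gated_answer(lambda: next(canned)))
```

Exact-match voting only suits short, canonical answers; free-form outputs would need some form of semantic grouping before the entropy is computed.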

Saharsh1005 / autonomous-prompting

Developing an autonomous system for prompt selection for Large Language Models (LLMs), enhancing performance across tasks by balancing generality and specificity. This project automates diverse, high-quality prompt creation and selection, reducing manual intervention and maximizing LLM utility across applications.

self-consistency cot large-language-models llm prompt-engineering chain-of-thought gsm8k

Updated Dec 10, 2024
Jupyter Notebook

self-consistency

Here are 29 public repositories matching this topic...
