feat(core): Add a new string index: n-gram #9463

Merged
mangalaman93 merged 29 commits into main from harshil-goel/shingles
Aug 21, 2025

@ghost ghost commented Jul 9, 2025

This PR adds ngram (aka shingles) indexing to string predicates.

Background

Ngram (also known as shingles) indexing is a technique for indexing and searching text that allows for flexible, partial-text matching. It works by breaking a string down into overlapping sequences of n consecutive tokens (words).

Implementation

The implementation supports basic text analysis (lower-casing and normalization), stop word removal, and language-specific word stemming, and adds support in both DQL and GraphQL.

DQL:

We introduce the ngram search directive and search function.

Schema

description: string @index(ngram) .

Query syntax

{
    me(func: ngram(description, "quick brown fox")) {
        uid
        description
    }
}

GraphQL

Schema

description: String @search(by: [ngram])

Query syntax

queryFoo(filter: { description: { ngram: $title } }) {
    title
}

Checklist

  • Code compiles correctly and linting passes locally
  • For all code changes, an entry added to the CHANGELOG.md file describing and linking to
    this PR
  • Tests added for new functionality, or regression tests for bug fixes added as applicable

@ghost ghost self-requested a review July 9, 2025 05:06
@github-actions github-actions Bot added area/querylang Issues related to the query language specification and implementation. area/core internal mechanisms go Pull requests that update Go code labels Jul 9, 2025

trunk-io Bot commented Jul 9, 2025

Failed test: TestBackupMinio
Failure summary: The backup test failed because it could not establish a connection to retrieve predicate information.


Comment thread tok/tok.go Outdated
@ghost ghost force-pushed the harshil-goel/shingles branch from 395cf24 to 2f906ed July 15, 2025 12:38
@github-actions github-actions Bot added the area/testing Testing related issues label Jul 15, 2025
fixed numgo stuff

added batching for mutations

@ghost ghost changed the title from "feat(core): Add a new string index: shingles" to "feat(core): Add a new string index: kterm" Aug 18, 2025
@ghost ghost changed the title from "feat(core): Add a new string index: kterm" to "feat(core): Add a new string index: k-gram" Aug 18, 2025
@ghost ghost changed the title from "feat(core): Add a new string index: k-gram" to "feat(core): Add a new string index: n-gram" Aug 18, 2025
Contributor

@mangalaman93 mangalaman93 left a comment

We should add a description stating why we felt the need to add this new index and keep it when we merge the PR.

Comment thread posting/index.go
Comment thread posting/index.go Outdated
Comment thread tok/tok.go Outdated
@github-actions github-actions Bot added the area/graphql Issues related to GraphQL support on Dgraph. label Aug 20, 2025
@mangalaman93 mangalaman93 enabled auto-merge (squash) August 20, 2025 23:51
@mangalaman93 mangalaman93 merged commit d7dfe3b into main Aug 21, 2025
14 of 15 checks passed
@mangalaman93 mangalaman93 deleted the harshil-goel/shingles branch August 21, 2025 13:14
Labels

area/core internal mechanisms area/graphql Issues related to GraphQL support on Dgraph. area/querylang Issues related to the query language specification and implementation. area/testing Testing related issues go Pull requests that update Go code

3 participants