This project's goal is to produce a higher level API for the Go bindings to libddwaf, DataDog's in-app WAF. It consists of two separate entities: the bindings for the calls to libddwaf, and the encoder, whose job is to convert any Go value to its libddwaf object representation.
An example usage would be:
import waf "github.com/DataDog/go-libddwaf/v5"

//go:embed
var ruleset []byte

func main() {
	var parsedRuleset any
	if err := json.Unmarshal(ruleset, &parsedRuleset); err != nil {
		panic(err)
	}
	// v2: NewBuilder no longer takes obfuscator regex parameters
	builder, err := waf.NewBuilder()
	if err != nil {
		panic(err)
	}
	_, err = builder.AddOrUpdateConfig("/rules", parsedRuleset)
	if err != nil {
		panic(err)
	}
	wafHandle, err := builder.Build()
	if err != nil {
		panic(err)
	}
	defer wafHandle.Close()
	wafCtx, err := wafHandle.NewContext(context.Background(), timer.WithUnlimitedBudget(), timer.WithComponent("waf", "rasp"))
	if err != nil {
		panic(err)
	}
	defer wafCtx.Close()
	// v2: Use Data field instead of Persistent
	result, err := wafCtx.Run(context.Background(), waf.RunAddressData{
		Data: map[string]any{
			"server.request.path_params": "/rfiinc.txt",
		},
		TimerKey: "waf",
	})
	// v2: For ephemeral data, use NewSubcontext
	subCtx, err := wafCtx.NewSubcontext(context.Background())
	if err != nil {
		panic(err)
	}
	defer subCtx.Close()
	result, err = subCtx.Run(context.Background(), waf.RunAddressData{
		Data: map[string]any{
			"server.request.body": "ephemeral data",
		},
	})
}

The API documentation details can be found on pkg.go.dev.
go-libddwaf v5 tracks libddwaf v2 and includes a few breaking API changes:
- NewBuilder() no longer takes obfuscator regex arguments; obfuscation now lives in builder config via AddOrUpdateConfig("obfuscator/config", ...)
- RunAddressData now uses a single Data field instead of Persistent and Ephemeral
- Ephemeral evaluation now goes through NewSubcontext()
- Context.Run, Handle.NewContext, and Context.NewSubcontext now require a context.Context
- Builder.Build() now returns (*Handle, error)
- WAFObject and WAFObjectKV are now type aliases to internal binding types (transparent, but importing internal/bindings is no longer required)
- The Encodable interface's Encode method is now Encode(enc *Encoder, obj *WAFObject, depth int) error instead of taking *bindings.WAFObject
- The internal depthOf function now takes a timer.Timer instead of relying on context.Background()
The v5 update introduces a more ergonomic and performant encoding API. Key changes include:
- Type changes: WAFObject and WAFObjectKV are now value types (type aliases to bindings). A WAFObject{} is a valid zero-value.
- Direct field access for KV: Use kv.Key.SetString(pinner, "...") and kv.Val.SetBool(true) directly. The kv.Key() and kv.Value() accessors have been removed.
- Pinner access: External Encodable implementers access *runtime.Pinner via enc.Config.Pinner (the internal/pin package has been removed).
- Truncations value type: The map[TruncationReason][]int has been replaced by a Truncations value type. Use t.StringTooLong etc. for direct access, or t.AsMap() for backward compatibility.
- Encoder helper: A bundle for Encodable implementers that provides WriteString, Map, Array, and Timeout helpers.
- MapBuilder / ArrayBuilder: Ergonomic builders that replace the manual slice juggling and SetMapData/SetArrayData pattern.
- Encodable interface change: The new signature is Encode(enc *Encoder, obj *WAFObject, depth int) error. Truncations now accumulate in enc.Truncations.
- Best-effort encoding philosophy: Errors should be self-recovered whenever possible. Only fatal conditions like ErrTimeout or ErrMaxDepthExceeded should propagate.
BEFORE (v4):
// BEFORE (v4) — manual slice juggling + truncation map merge dance
type Encodable struct {
	data []byte
	// ... fields elided
}

func (e *Encodable) Encode(config libddwaf.EncoderConfig, obj *libddwaf.WAFObject, remainingDepth int) (map[libddwaf.TruncationReason][]int, error) {
	truncations := map[libddwaf.TruncationReason][]int{}
	// ... manual JSON walk ...
	// For each map:
	var wafObjs []libddwaf.WAFObject
	var length int
	for /* each (key, value) */ {
		length++
		if config.Timer.Exhausted() {
			return truncations, waferrors.ErrTimeout
		}
		if len(wafObjs) >= config.MaxContainerSize {
			continue
		}
		wafObjs = append(wafObjs, libddwaf.WAFObject{})
		entryObj := &wafObjs[len(wafObjs)-1]
		// Manual key truncation
		if len(key) > config.MaxStringSize {
			truncations[libddwaf.StringTooLong] = append(
				truncations[libddwaf.StringTooLong], len(key))
			key = key[:config.MaxStringSize]
		}
		entryObj.SetMapKey(config.Pinner, key) // v4-only method
		// ... encode value into entryObj ...
		if err := encodeValue(entryObj, value, remainingDepth-1); err != nil {
			entryObj.SetInvalid()
			continue
		}
	}
	if len(wafObjs) >= config.MaxContainerSize {
		truncations[libddwaf.ContainerTooLarge] = append(
			truncations[libddwaf.ContainerTooLarge], length)
	}
	obj.SetMapData(config.Pinner, wafObjs)
	return truncations, nil
}

AFTER (v5):
// AFTER (v5) — MapBuilder + Encoder helpers handle truncation, capacity,
// key truncation, and finalization automatically.
type Encodable struct {
	data []byte
	// ... fields elided
}

func (e *Encodable) Encode(enc *libddwaf.Encoder, obj *libddwaf.WAFObject, depth int) error {
	if enc.Timeout() {
		return waferrors.ErrTimeout
	}
	if depth < 0 {
		enc.Truncations.Record(libddwaf.ObjectTooDeep, enc.Config.MaxObjectDepth-depth)
		return waferrors.ErrMaxDepthExceeded
	}
	// ... walk JSON ...
	// For each map:
	mb := enc.Map(obj)
	defer mb.Close()
	for /* each (key, value) */ {
		if enc.Timeout() {
			return waferrors.ErrTimeout
		}
		slot := mb.NextValue(key) // auto-truncates key, returns nil at cap
		if slot == nil {
			mb.Skip()
			continue
		}
		if err := encodeValue(slot, value, depth-1); err != nil {
			slot.SetInvalid() // best-effort: key preserved, value invalid
			if errors.Is(err, waferrors.ErrTimeout) {
				return err
			}
		}
	}
	return nil
}

Best-effort encoding philosophy: The WAF prefers a malformed payload to no payload at all (so at least some inspection happens). When implementing Encodable, treat encoding errors as recoverable: leave the object as the zero value (which is WAFInvalidType) or call obj.SetInvalid() and continue. Only return errors from Encode for fatal conditions: waferrors.ErrTimeout (when enc.Timeout() returns true) or waferrors.ErrMaxDepthExceeded (when the depth budget is exhausted). The MapBuilder preserves keys with invalid values on error; the ArrayBuilder lets you DropLast() to remove an entry that couldn't be encoded.
Note: For more detailed examples, see the planned migration_spike_test.go companion.
// v4
builder, _ := waf.NewBuilder("keyRegex", "valueRegex")
ctx.Run(waf.RunAddressData{Persistent: data, Ephemeral: ephemeral})
// v5
builder, err := waf.NewBuilder()
builder.AddOrUpdateConfig("obfuscator/config", map[string]any{
	"key_regex":   keyRegex,
	"value_regex": valueRegex,
})
wafHandle, err := builder.Build()
ctx.Run(context.Background(), waf.RunAddressData{Data: data})
subCtx, err := ctx.NewSubcontext(context.Background())
defer subCtx.Close()
subCtx.Run(context.Background(), waf.RunAddressData{Data: ephemeral})

For the upstream libddwaf v2 migration details and release notes, prefer the canonical docs in the libddwaf repository:
- https://github.com/DataDog/libddwaf/blob/master/docs/upgrading/UPGRADING-v2.0.md
- https://github.com/DataDog/libddwaf/blob/master/docs/changelog/CHANGELOG-v2.0.0.md
Context.SubContext(ctx) (*Context, error) has been renamed to Context.NewSubcontext(ctx) (*Subcontext, error).
The returned type is now *Subcontext instead of *Context. Subcontext has its own Run, Close, and Truncations methods.
Subcontext.NewSubcontext is not available — only a Context can spawn Subcontexts. To create a sibling subcontext, call parentContext.NewSubcontext(...).
// Before
subCtx, err := ctx.SubContext(context.Background())
// After
subCtx, err := ctx.NewSubcontext(context.Background())
defer subCtx.Close()

Originally this project only provided CGO wrappers for calls to libddwaf.
With the appearance of the ddwaf_object tree-like structure and the goal of building CGO-less bindings, it has grown into an integrated component of the DataDog tracer.
That made it necessary to document the project and keep it maintainable.
This library currently supports the following platform pairs:
| OS | Arch |
|---|---|
| Linux | amd64 |
| Linux | aarch64 |
| OSX | amd64 |
| OSX | arm64 |
This means that when the platform is not supported, top-level functions will return a WafDisabledError explaining why.
Note that:
- Linux support includes glibc and musl variants
- OSX under 10.9 is not supported
- A build tag named datadog.no_waf can be manually added to force the WAF to be disabled.
The WAF bindings have multiple moving parts that are necessary to understand:
- Builder: an object wrapper over the pointer to the C WAF Builder
- Handle: an object wrapper over the pointer to the C WAF Handle
- Context: an object wrapper over a pointer to the C WAF Context
- Encoder: its goal is to construct a tree of WAF Objects to send to the WAF
- Decoder: transforms WAF Objects returned from the WAF into ordinary Go values (e.g. maps, arrays, ...)
- Library: the low-level Go bindings to the C library, providing improved typing
flowchart LR
START:::hidden -->|NewBuilder| Builder -->|Build| Handle
Handle -->|NewContext| Context
Context -->|NewSubcontext| Subcontext
Context -->|Encode Inputs| Encoder
Subcontext -->|Encode Inputs| Encoder
Handle -->|Encode Ruleset| Encoder
Handle -->|Init WAF| Library
Context -->|Decode Result| Decoder
Subcontext -->|Decode Result| Decoder
Handle -->|Decode Init Errors| Decoder
Context -->|Run| Library
Subcontext -->|Run| Library
Encoder -->|Allocate Waf Objects| runtime.Pinner
Library -->|Call C code| libddwaf
classDef hidden display: none;
When passing Go values to the WAF, it is necessary to make sure that memory remains valid and does
not move until the WAF no longer has any pointers to it. We do this by using runtime.Pinner from the standard library.
Each call to Run() creates a new runtime.Pinner; pinners are collected per-Context (or per-Subcontext) and unpinned when the Context (or Subcontext) is closed.
Here is an example of the flow of operations on a simple call to Run():
- Create a runtime.Pinner for this call
- Encode input data into WAF Objects, pinning Go pointers via the pinner
- Lock the context mutex
- Call ddwaf_run
- Decode the matches and actions
- Unlock the mutex; append the pinner to the context's pinner list (unpinned on Close())
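The steps above can be sketched with the standard library alone. This is a simplified illustration, not the library's implementation: `wafContext`, `run`, and `close` are hypothetical names, the call to ddwaf_run and the decoding are replaced by a placeholder, and the pinner is appended while the mutex is still held to keep the sketch short.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// wafContext is a hypothetical stand-in for the library's Context; only the
// pinner bookkeeping is modelled here.
type wafContext struct {
	mu      sync.Mutex
	pinners []*runtime.Pinner
}

// run pins the input for one call, pretends to call the WAF while holding the
// mutex, and keeps the pinner alive until close is called.
func (c *wafContext) run(input *[]byte) int {
	pinner := new(runtime.Pinner) // one pinner per call
	pinner.Pin(input)             // "encode": pin the Go pointer that C would see

	c.mu.Lock() // serialize access to the native context
	defer c.mu.Unlock()

	n := len(*input) // placeholder for ddwaf_run and result decoding

	// Keep the pinner so the memory stays pinned until close.
	c.pinners = append(c.pinners, pinner)
	return n
}

// close unpins everything that was pinned by earlier calls to run.
func (c *wafContext) close() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, p := range c.pinners {
		p.Unpin()
	}
	c.pinners = nil
}

func main() {
	ctx := &wafContext{}
	data := []byte("/rfiinc.txt")
	fmt.Println("bytes seen:", ctx.run(&data))
	fmt.Println("pinners held:", len(ctx.pinners))
	ctx.close()
	fmt.Println("pinners after close:", len(ctx.pinners))
}
```

Pinners are stored as pointers because a runtime.Pinner must not be copied once it has pinned something, and every pinner has to be unpinned before it becomes unreachable.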
This library uses purego to implement C bindings without requiring use of CGO at compilation time. The high-level workflow
is to embed the C shared library using go:embed, dump it into a file, open the library using dlopen, load the
symbols using dlsym, and finally call them. On Linux systems, using memfd_create(2) enables the library to be loaded without
writing to the filesystem.
libddwaf also requires an FHS-compliant filesystem on your machine and, on Linux, the dynamic libraries libc.so.6,
libpthread.so.0, and libdl.so.2.
⚠️ Keep in mind that purego only works on linux/darwin for amd64/arm64 and so does go-libddwaf.
- Cannot dlopen twice in the app lifetime on OSX. It messes with Thread Local Storage and usually finishes with a std::bad_alloc()
- keepAlive() calls are here to prevent the GC from destroying objects too early
- Since there is a stack switch between the Go code and the C code, usually the only C stacktrace you will ever get is from GDB
- If a segfault happens during a call to the C code, the goroutine stacktrace which has done the call is the one annotated with [syscall]
- GoLand does not support CGO_ENABLED=0 (as of June 2023)
- Keep in mind that we fully escape the type system. If you send the wrong data it will segfault in the best cases but not always!
- The structs in ctypes.go are here to reproduce the memory layout of the structs in include/ddwaf.h because pointers to these structs will be passed directly
- Do not use uintptr as function arguments or results types, coming from unsafe.Pointer casts of Go values, because they escape the pointer analysis which can create wrongly optimized code and crash. Pointer arithmetic is of course necessary in such a library but must be kept in the same function scope.
- GDB is available on arm64 but is not officially supported so it usually crashes pretty fast (as of June 2023)
- No pointer to variables on the stack shall be sent to the C code because Go stacks can be moved during the C call. More on this here
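As an illustration of the uintptr rule in the list above, the following sketch keeps pointer arithmetic inside a single function and never hands an address around as a uintptr. The helper `elemAt` is hypothetical and not part of this library; it only relies on `unsafe.Add` and `unsafe.Sizeof` from the standard library.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemAt returns a pointer to the i-th element of a contiguous array of T that
// starts at base. The offset is applied in one expression with unsafe.Add, so
// the address is never stored in, passed as, or returned as a uintptr.
// The caller is responsible for keeping i within the bounds of the array.
func elemAt[T any](base *T, i int) *T {
	return (*T)(unsafe.Add(unsafe.Pointer(base), uintptr(i)*unsafe.Sizeof(*base)))
}

func main() {
	values := [4]uint32{10, 20, 30, 40}
	third := elemAt(&values[0], 2)
	fmt.Println(*third)
}
```

Because both the argument and the result are typed pointers, the compiler and the garbage collector can still track the memory they refer to.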
Debug logging for the underlying C/C++ library can be enabled, when building or testing, by setting the
DD_APPSEC_WAF_LOG_LEVEL environment variable to one of: trace, debug, info, warn (or
warning), error, off (the default, which logs nothing).
The DD_APPSEC_WAF_LOG_FILTER environment variable can be set to a valid (per the regexp package)
regular expression to limit logging to only messages that match the regular expression.
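Since a mistyped level or an invalid regular expression is easy to miss, a small pre-flight check can be useful in test setups. The sketch below is an assumption-laden illustration, not library code: `logSettings` is a hypothetical helper, and falling back to off when the level is unset mirrors the default described above. How the library itself reacts to invalid values is not covered here.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// validLevels lists the level names mentioned in the text above.
var validLevels = map[string]bool{
	"trace": true, "debug": true, "info": true,
	"warn": true, "warning": true, "error": true, "off": true,
}

// logSettings is a hypothetical helper (not part of go-libddwaf). It reads the
// two environment variables through getenv, validates the level, and compiles
// the optional filter with the regexp package.
func logSettings(getenv func(string) string) (level string, filter *regexp.Regexp, err error) {
	level = getenv("DD_APPSEC_WAF_LOG_LEVEL")
	if level == "" {
		level = "off" // unset behaves like the default
	}
	if !validLevels[level] {
		return "", nil, fmt.Errorf("unknown log level %q", level)
	}
	if expr := getenv("DD_APPSEC_WAF_LOG_FILTER"); expr != "" {
		if filter, err = regexp.Compile(expr); err != nil {
			return "", nil, fmt.Errorf("invalid log filter: %w", err)
		}
	}
	return level, filter, nil
}

func main() {
	level, filter, err := logSettings(os.Getenv)
	if err != nil {
		fmt.Println("configuration problem:", err)
		return
	}
	fmt.Println("level:", level, "filter set:", filter != nil)
}
```

Passing `os.Getenv` as a function value keeps the helper easy to exercise with a fake environment in tests.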