JSON Parsing and Regular Expressions
When dealing with Web Proofs, the ability to parse JSON data is essential. Similarly, finding specific strings or patterns in the subject or body of an email is crucial for Email Proofs.
To support these needs, we provide helpers for parsing text using regular expressions and extracting data from JSON directly within vlayer Prover contracts.
JSON Parsing
We provide four functions to extract data from JSON based on the field type:
- jsonGetInt(json, path): extracts an integer value and returns int256;
- jsonGetBool(json, path): extracts a boolean value and returns bool;
- jsonGetString(json, path): extracts a string value and returns string memory;
- jsonGetFloatAsInt(json, path, precision): extracts a decimal number from JSON, moves its decimal point right by the specified precision, and returns it as a truncated int256. If precision is greater than the number of decimal digits, the result is padded with zeros. For example, reading 1.234 at precision 2 yields 123, and at precision 4 yields 12340. This approach is used because Solidity does not support floating-point numbers.
import {Proof} from "vlayer/Proof.sol";
import {Prover} from "vlayer/Prover.sol";
import {Web, WebLib} from "vlayer/WebProof.sol";

contract JSONContainsFieldProof is Prover {
    using WebLib for Web;

    function main(Web memory web) public returns (Proof memory, string memory) {
        // Reverts if the field is missing, is not an integer, or is not 42
        require(web.jsonGetInt("deep.nested.field") == 42, "deep nested field is not 42");

        // If we return the provided JSON back, we will be able to pass it to the verifier
        // together with a proof that it contains the field
        return (proof(), web.body);
    }
}
In the example above, the function extracts the value of the field deep.nested.field from the JSON string below and checks if it equals 42.
{
    "deep": {
        "nested": {
            "field": 42
        }
    }
}
The functions will revert if the field does not exist or if the value is of the wrong type.
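The remaining getters can be used the same way. The following is a minimal sketch, not a definitive implementation: the contract name, the field names (price, active, symbol) and the response body are hypothetical, the import path for Proof is an assumption, and the getters are assumed to be callable on web in the same way as jsonGetInt in the example above.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

// Assumption: Proof is importable from vlayer/Proof.sol
import {Proof} from "vlayer/Proof.sol";
import {Prover} from "vlayer/Prover.sol";
import {Web, WebLib} from "vlayer/WebProof.sol";

// Hypothetical prover. Assumed response body:
// {"price": 1.234, "active": true, "symbol": "ETH"}
contract JSONTypedFieldsProof is Prover {
    using WebLib for Web;

    function main(Web memory web) public returns (Proof memory, string memory) {
        // 1.234 shifted by 2 decimal places and truncated gives 123
        int256 price = web.jsonGetFloatAsInt("price", 2);
        require(price == 123, "unexpected price");

        // Reverts if "active" is missing or is not a boolean
        require(web.jsonGetBool("active"), "asset is not active");

        // Strings cannot be compared with ==, so compare their hashes
        string memory symbol = web.jsonGetString("symbol");
        require(keccak256(bytes(symbol)) == keccak256("ETH"), "unexpected symbol");

        return (proof(), symbol);
    }
}
```

Choosing the precision up front matters: a verifier that receives 123 has to know it stands for 1.23, so the same precision should be used consistently on both sides.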
JMESPath
Field paths provided to jsonGet... functions are evaluated using JMESPath, a query language for JSON that allows powerful expressions beyond simple key access.
For example, to get the number of elements in an array:
int256 length = web.jsonGetInt("root.nested_level.field_array | length(@)");
require(length == 2, "Expected array of length 2");
To access a specific element from an array:
string memory value = web.jsonGetString("root.nested_level.field_array[1]");
require(keccak256(bytes(value)) == keccak256("val2"), "Unexpected array value");
You can also access fields within arrays of objects:
int256 value = web.jsonGetInt("root.nested_level.field_array_of_objects_with_numbers[1].key");
require(value == 2, "Expected value at index 1");
This makes it easy to work with complex JSON structures directly inside your prover logic, without needing preprocessing.
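For reference, the three queries above can be combined into one prover. This is a sketch under stated assumptions: the contract name is hypothetical, the Proof import path is assumed, and the JSON body in the comment is reconstructed from the queries (the value "val1" and the first object's key are made up for illustration).

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

// Assumption: Proof is importable from vlayer/Proof.sol
import {Proof} from "vlayer/Proof.sol";
import {Prover} from "vlayer/Prover.sol";
import {Web, WebLib} from "vlayer/WebProof.sol";

// Assumed response body (illustrative only):
// {
//   "root": {
//     "nested_level": {
//       "field_array": ["val1", "val2"],
//       "field_array_of_objects_with_numbers": [{"key": 1}, {"key": 2}]
//     }
//   }
// }
contract JMESPathQueriesProof is Prover {
    using WebLib for Web;

    function main(Web memory web) public returns (Proof memory, string memory) {
        // Pipe the array into the JMESPath length() function
        int256 length = web.jsonGetInt("root.nested_level.field_array | length(@)");
        require(length == 2, "Expected array of length 2");

        // Index into the array (zero-based)
        string memory second = web.jsonGetString("root.nested_level.field_array[1]");
        require(keccak256(bytes(second)) == keccak256("val2"), "Unexpected array value");

        // Index into an array of objects, then read a field
        int256 key = web.jsonGetInt("root.nested_level.field_array_of_objects_with_numbers[1].key");
        require(key == 2, "Expected value at index 1");

        return (proof(), web.body);
    }
}
```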
Regular Expressions
Regular expressions are a powerful tool for finding patterns in text.
We provide functions to match and capture a substring using regular expressions:
- matches: checks if a string matches a regular expression and returns true if a match is found;
- capture: checks if a string matches a regular expression and returns an array of strings. The first string is the whole matched text, followed by the captures.
Regex size optimization
Internally, the regular expression is compiled into a deterministic finite automaton (DFA).
The size of the DFA is determined by the regular expression itself, and it can get quite large even for seemingly simple patterns.
The DFA size corresponds to the number of cycles used in the ZK proof computation, so it is important to keep it as small as possible.
There is a hard limit on the DFA size, which should be enough for most use cases.
For example, the regex "\w" covers all letters, including those from Unicode, and as a result its DFA will be over 100x larger than that of a simple "[a-zA-Z0-9]" pattern.
In general, to bring the compiled regular expression size down, use more specific patterns.
import {Proof} from "vlayer/Proof.sol";
import {Prover} from "vlayer/Prover.sol";
import {RegexLib} from "vlayer/Regex.sol";

contract RegexMatchProof is Prover {
    using RegexLib for string;

    function main(string calldata text, string calldata hello_world) public returns (Proof memory, string memory) {
        // The regex pattern is passed as a string
        require(text.matches("^[a-zA-Z0-9]*$"), "text must be alphanumeric only");

        // Example for "hello world" string
        string[] memory captures = hello_world.capture("^hello(,)? (world)$");
        // Whole match plus one entry per capture group
        require(captures.length == 3, "expected 3 captures");
        require(keccak256(bytes(captures[0])) == keccak256("hello world"), "unexpected full match");
        // The optional comma group did not participate, so its capture is empty
        require(bytes(captures[1]).length == 0, "expected empty capture");
        require(keccak256(bytes(captures[2])) == keccak256("world"), "unexpected capture");

        // Return proof and provided text if it matches the pattern
        return (proof(), text);
    }
}
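The size advice above also applies when capturing. The following sketch extracts a number from an email subject using a narrow character class. The contract name and the subject format "Order 12345 confirmed" are hypothetical, the Proof import path is an assumption, and since this section does not specify what capture does on a non-matching input, the length check is purely defensive.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

// Assumption: Proof is importable from vlayer/Proof.sol
import {Proof} from "vlayer/Proof.sol";
import {Prover} from "vlayer/Prover.sol";
import {RegexLib} from "vlayer/Regex.sol";

// Hypothetical prover: assumes subjects of the form "Order 12345 confirmed"
contract OrderSubjectProof is Prover {
    using RegexLib for string;

    function main(string calldata subject) public returns (Proof memory, string memory) {
        // A literal prefix and the explicit class [0-9] keep the pattern specific
        string[] memory captures = subject.capture("^Order ([0-9]+) confirmed$");

        // captures[0] is the whole match, captures[1] is the order number
        require(captures.length == 2, "subject does not match");

        return (proof(), captures[1]);
    }
}
```

Anchoring the pattern with ^ and $ and spelling out the literal text means only the variable part is left to a character class, which follows the recommendation to prefer specific patterns over broad ones.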