Building a JSON validator with Sylver - Part3/3 : From queries to analyzer

In Part1 and Part2 of the series, we learned how to build a language spec and how to use Sylver's query language to explore the parse tree of our JSON documents.

While it can be insightful to explore a codebase interactively through source-code queries, it's not the most practical way to perform source code verification. In this tutorial, we'll learn how to package the queries we built in the last part into a ruleset to use them in a linter-like fashion.

If you have already installed sylver and sylver --version reports a version older than 0.1.4, please go to https://sylver.dev to download a fresh copy of the software.

Prelude

We'll reuse two files from the last tutorial:

  • json.syl
node JsonNode { }

node Null: JsonNode { }

node Bool: JsonNode { }

node Number: JsonNode { }

node String: JsonNode { }

node Array: JsonNode { 
    elems: List<JsonNode> 
}

node Object: JsonNode {
    members: List<Member>
}

node Member: JsonNode {
    key: String,
    value: JsonNode
}

term COMMA = ','
term COLON = ':'
term L_BRACE = '{'
term R_BRACE = '}'
term L_BRACKET = '['
term R_BRACKET = ']'
term NULL = 'null'

term BOOL_LIT = `true|false`
term NUMBER_LIT = `\-?(0|([1-9][0-9]*))(\.[0-9]+)?((e|E)(\+|-)?[0-9]+)?`
term STRING_LIT = `"([^"\\]|(\\[\\/bnfrt"])|(\\u[a-fA-F0-9]{4}))*"`


ignore term WHITESPACE = `\s`

rule string = String { STRING_LIT }

rule member = Member { key@string COLON value@main }

rule main =
    Null { NULL }
  | Number { NUMBER_LIT }
  | Bool { BOOL_LIT }
  | string
  | Array { L_BRACKET elems@sepBy(COMMA, main) R_BRACKET }
  | Object { L_BRACE members@sepBy(COMMA, member) R_BRACE }
  • invalid_config.json
{
    "variables": [
        {
            "name": "date of birth",
            "description": "Customer's date of birth",
            "type": "datetime"
        },
        {
            "name": "activity",
            "description": "A short text describing the customer's profession",
            "type": "string"
        },
        {
            "name": "country",
            "description": "Customer's country of residence",
            "type": "string",
            "values": ["us", "fr", "it" ]
        }
    ]
}

Stepping out of the REPL

Creating a ruleset

Packaging the rules from the previous tutorial into a reusable ruleset is as simple as creating the following YAML file:

id: 'JSON ruleset'
language: json.syl

rules: 
  - id: variable_length
    message: Variable description is too long
    category: style
    query: >
      match String desc when desc.text.length > 37 && desc.parent is {
        Member m when m.key.text == '"description"'
      }  

  - id: variable_format 
    message: Variable name isn't a lowercase word
    category: style
    query: >
      match String s when !s.text.matches(`"[a-z]+"`) && s.parent is {
        Member m when m.key.text == '"name"'
      }    

  - id: types_or_values
    message: Fields 'type' and 'values' are mutually exclusive    
    category: error
    note: The type can be deduced from the values list.
    query: >
      match Object n when
        any n.members.children match {  
            Member m when m.key.text == '"type"' 
        }
        && any n.members.children match { 
            Member m when m.key.text == '"values"' 
        }

Here, id is a human-readable description of the ruleset, and language refers to a language spec file.

The following properties describe the individual rules composing the ruleset:

  • id: unique and short name of the rule
  • message: a concise description of the issue
  • category: one of error, bug, smell, or style
  • query: inline query
  • note: optional additional information
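
As an illustration of these properties, here is a sketch of one more rule that could be appended to the rules list. The rule id name_length, its message, its note, and the 22-character threshold are all hypothetical; the query only recombines constructs already used in the rules above. Since s.text includes the surrounding double quotes, a threshold of 22 corresponds to a 20-character name.

```yaml
rules:
  # Hypothetical extra rule: flags "name" values longer than 20 characters.
  # (22 = 20 characters + the two double quotes that are part of s.text)
  - id: name_length
    message: Variable name is too long
    category: style
    # note is optional and can be left out entirely
    note: Short names are easier to reference.
    query: >
      match String s when s.text.length > 22 && s.parent is {
        Member m when m.key.text == '"name"'
      }
```

The comments sit outside the query on purpose: inside a folded (>) YAML scalar, a line starting with # would become part of the query text.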

Assuming that our ruleset file is called ruleset.yaml, we can run this ruleset on every .json file in the current directory by invoking the following command:

sylver ruleset run --files "*.json" --rulesets ruleset.yaml

Storing our project configuration

If we wish to validate our codebase against multiple rulesets, repeating the above command for every ruleset can be tedious. Instead, we can write a project configuration in a sylver.yaml file at the root of our project:

subprojects:
  - language: json.syl
    rulesets: ['ruleset.yaml']
    include:
      - './**/*.json'

The configuration contains a list of subprojects, each having a language, an optional list of rulesets, and a list of files to include.

Invoking sylver check will read the config from sylver.yaml and run the specified rulesets.
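
Since rulesets is a list, a subproject can reference several ruleset files at once. The sketch below assumes a second ruleset stored next to the first one; the file name naming_rules.yaml is hypothetical and only there to show the shape of the list.

```yaml
subprojects:
  - language: json.syl
    rulesets:
      - 'ruleset.yaml'
      - 'naming_rules.yaml'   # hypothetical second ruleset file
    include:
      - './**/*.json'
```

With a configuration like this, a single sylver check invocation should cover both rulesets, instead of one sylver ruleset run command per file.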

Git integration

Should you want to reuse your language specs or rulesets in several projects, copying your .syl and .yaml files into every project would be inconvenient. Luckily, rulesets and project configurations can refer to artifacts stored in a git repository.

The language spec and ruleset for this tutorial have been uploaded to the getting_started_json_tutorial repository on GitHub, so if we rewrite our sylver.yaml config file as:

subprojects:
  - language: 
      repo: https://github.com/geoffreycopin/getting_started_json_tutorial
      file: json.syl
    rulesets: 
      - repo: https://github.com/geoffreycopin/getting_started_json_tutorial
        file: 'ruleset.yaml'
    include:
      - './**/*.json'

the language spec and ruleset will be cloned automatically in the .sylver directory when running sylver check.
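
Rulesets can point at git-hosted artifacts as well. Below is a minimal sketch of the header of ruleset.yaml, assuming that the language field of a ruleset accepts the same repo/file form as in sylver.yaml. That form is an assumption made for illustration and should be checked against Sylver's documentation.

```yaml
id: 'JSON ruleset'
# Assumption: same repo/file form as the language entry in sylver.yaml
language:
  repo: https://github.com/geoffreycopin/getting_started_json_tutorial
  file: json.syl

rules:
  - id: variable_format
    message: Variable name isn't a lowercase word
    category: style
    query: >
      match String s when !s.text.matches(`"[a-z]+"`) && s.parent is {
        Member m when m.key.text == '"name"'
      }
```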

Conclusion

We now have a reusable linter for our JSON configuration files built from scratch using Sylver's DSL.

The next tutorial will use a pre-built Golang language spec to write a general-purpose Go linter.