Accurate values in comments

Values in comments

Take any JavaScript code example. Usually the inputs are given and the
expected output is shown in a comment. For example, the documentation for
Ramda's R.compose function is shown below.
The library gives two examples for compose. I find that the more examples
are given, the faster I understand the code.

var classyGreeting = (firstName, lastName) =>
  "The name's " + lastName + ", " + firstName + " " + lastName
var yellGreeting = R.compose(R.toUpper, classyGreeting);
yellGreeting('James', 'Bond'); //=> "THE NAME'S BOND, JAMES BOND"

R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 7

Similarly, the Lodash docs follow the same convention. Code examples in
blog posts, tutorial slides and even books use the same "code, result as a
comment" format.

Yet there is a problem. More complex examples, especially the ones that
evolve over time, require extra effort from the author to ensure that the
value comments alongside the code remain accurate. There is nothing worse
for the reader than to see an out-of-date comment!

R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 3
// oops, value "3" was produced when we had `R.add(5)`!

Inspiration

Recently, Chrome DevTools implemented live value previews. When the debugger is
paused, the values of the variables that are already computed are shown
right next to the code. Here is how the JavaScript value preview works
(source: https://developers.google.com/web/updates/2015/07/preview-javascript-values-inline-while-debugging)

Live Preview

This is an extremely nice feature. If only we could mark values in comments
as "live" during Node.js execution and ask the runtime to update them.
Wait a minute!

Code and comment instrumentation

Recording executed statements is the main feature of code coverage tools
like istanbul. It is implemented via on-the-fly code instrumentation: when a
JavaScript file is loaded by the Node.js module system, a user-supplied
callback function is called. This function can transform the loaded source
before it is evaluated. Thus we can do all sorts of interesting things.
For example, using my node-hook module to instrument the code, we can print
a message on file load.

var hook = require('node-hook');
function logLoadedFilename(source, filename) {
  return 'console.log("' + filename + '");\n' + source;
}
hook.hook('.js', logLoadedFilename);
// load the actual file
require('./dummy');
// prints the full dummy.js filename, then runs dummy.js

We do not have to include the above wrapper code in our application. Instead,
I prefer using Node's "preload" module feature (the -r option): place the
wrapper code into a separate module and load it before the first JavaScript
file.

printer.js
var hook = require('node-hook');
function logLoadedFilename(source, filename) {
  return 'console.log("' + filename + '");\n' + source;
}
hook.hook('.js', logLoadedFilename);

$ node -r ./printer.js dummy.js
/home/user/dummy.js

Thus we can create a wrapper for Node that is simple to use and can modify
any loaded JavaScript file to include any desired additional code.
For example, we could find all variables named in the comments and insert
additional statements into the loaded source to save the values of those
variables in one big data structure. When the program finishes its run, we
save this data structure and/or update the original file with the new values.
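As an aside, the load-time transform described in this section can be sketched without any helper library. The snippet below is only an illustration, not how node-hook itself is written: it assumes Node's deprecated but still available `require.extensions` API, and `installHook`, `injected` and the temporary `dummy.js` file are made-up names for the demo.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: installs a source transform for ".js" files.
// Assumes the deprecated (but still functional) require.extensions API.
function installHook (transform) {
  const original = require.extensions['.js']
  require.extensions['.js'] = function (module, filename) {
    const source = fs.readFileSync(filename, 'utf8')
    // module._compile evaluates the (possibly modified) source text
    module._compile(transform(source, filename), filename)
  }
  // return a function that restores the default loader
  return function restore () {
    require.extensions['.js'] = original
  }
}

// Demo: write a throwaway module, then load it through the hook.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hook-demo-'))
const dummy = path.join(dir, 'dummy.js')
fs.writeFileSync(dummy, 'module.exports = typeof injected')

const restore = installHook(function (source, filename) {
  // prepend one statement to whatever source was loaded
  return 'var injected = true;\n' + source
})
const loaded = require(dummy)
restore()
console.log(loaded) // "boolean"
```

The loaded module sees the `injected` variable even though its file on disk never declares it, which is the property that coverage and value-recording tools rely on.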

comment-value

This is how the comment-value tool was born. Its first goal is to update
variable values in specially formatted line comments that contain just the
variable name followed by a colon, like this: // name:.

In the simple example below we added 3 extra line comments (// a:, // b: and
// sum:), recording the argument variables a and b and the variable sum.
The comments are empty; we do not even bother writing the expected values
manually.

example.js
function add(a, b) {
  // a:
  // b:
  return a + b
}
const sum = add(10, 2)
// sum:

Install the tool comment-value and run it on the file example.js

$ npm i -g comment-value
$ values example.js
$ cat example.js
function add(a, b) {
  // a: 10
  // b: 2
  return a + b
}
const sum = add(10, 2)
// sum: 12

Now imagine that we have decided to use the simpler 2 + 3 to explain the above
addition. Just change the call to the add function to add(2, 3) and rerun
values example.js.

$ values example.js
$ cat example.js
function add(a, b) {
  // a: 2
  // b: 3
  return a + b
}
const sum = add(2, 3)
// sum: 5

All values have been recomputed and the comments have been updated. No need
to do this manually, and the reader can rest assured – the example is
correct and up to date.

The implementation is pretty simple. We look at each source line, finding
every variable name that matches the format // name:. Then we insert an array
at the beginning of the source file to hold the values, and a statement after
each comment line to record the value. The above example.js code
would look something like this when instrumented

example.js
const values = []
function add(a, b) {
  // a:
  values[0] = a
  // b:
  values[1] = b
  return a + b
}
const sum = add(10, 2)
// sum:
values[2] = sum
// saveUpdated puts the values back
// into example.js and saves it to disk
process.on('exit', saveUpdated)

Perfect.
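
To make the line-by-line idea concrete, here is a minimal sketch of both halves: inserting the recording statements and writing the values back into the comments. It is a simplification under stated assumptions, not the tool's actual code: `instrument`, `fillComments` and the `__values` array are hypothetical names, and the regular expression naively treats any `// word:` comment as a value comment.

```javascript
// Matches a line comment holding a variable name and a colon,
// for example "  // sum:" or "// sum: 12" (any old value is ignored).
const VALUE_COMMENT = /^(\s*)\/\/\s*([A-Za-z_$][\w$]*):.*$/

// Adds a recording statement after every "// name:" comment.
function instrument (source) {
  let index = 0
  return source.split('\n').map(function (line) {
    const match = VALUE_COMMENT.exec(line)
    if (!match) return line
    const statement = match[1] + '__values[' + index + '] = ' + match[2]
    index += 1
    return line + '\n' + statement
  }).join('\n')
}

// Writes the recorded values back into the comments, in the same order.
function fillComments (source, values) {
  let index = 0
  return source.split('\n').map(function (line) {
    const match = VALUE_COMMENT.exec(line)
    if (!match) return line
    const text = JSON.stringify(values[index])
    index += 1
    return match[1] + '// ' + match[2] + ': ' + text
  }).join('\n')
}

const example = [
  'function add(a, b) {',
  '  // a:',
  '  // b:',
  '  return a + b',
  '}',
  'const sum = add(10, 2)',
  '// sum:'
].join('\n')

// Run the instrumented source with a fresh "__values" array in scope.
const recorded = []
new Function('__values', instrument(example))(recorded)
console.log(recorded) // [ 10, 2, 12 ]
console.log(fillComments(example, recorded))
```

Here the source is evaluated with `new Function` purely to keep the sketch self-contained; the approach described above instead transforms the file as Node loads it and saves the result when the process exits.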

Taking it to the next level

While the comment-value tool is already useful, we can do better. Not only
do we want to show the result variable, we also want to easily show the
values of intermediate expressions. For example, the Ramda compose
example at the beginning of this blog post is written in an implicit
code style, without intermediate variables.

compose-example.js
var R = require('ramda')
R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 7

There is no result variable even! Can we somehow update the value in the
line comment //=> 7? Yes, but using a more complex transformation.

We cannot simply look at each line in isolation, finding a single variable name
and inserting a quick assignment statement.
Instead we need to understand the structure of the code to the left of
a "magical" line comment that starts with the //=> string. Luckily, the
raw JavaScript source can be parsed into an Abstract Syntax Tree (AST) using
off-the-shelf tools. I am using falafel, which takes a source string and a
callback function that visits every node in the tree.

In this tiny example, we discover a CallExpression and a "magic" comment
that are next to each other.

index.js
add(2, 3) //=> ?
---
add(2, 3): CallExpression, "add" is callee,
           2 and 3 are arguments
  add: Identifier
  2  : Literal
  3  : Literal

Each node in the AST has its location information and the source code.
For example the above CallExpression node would be processed like this:

const fs = require('fs')
const falafel = require('falafel')
const source = fs.readFileSync('./index.js', 'utf8')
const output = falafel(source, visitor)
function visitor (node) {
  // node.type is "CallExpression"
  // node.source() returns "add(2, 3)"
  // node.loc.end is {line: 0, column: 9}
  const c = findSpecialCommentToTheRight(node.loc.end)
  // c is {index: 0} pointing to "values" array
  const wrapped = wrap(node)
  node.update(wrapped)
}

If we call node.update() with new source code, it will replace the code
for this particular node. With falafel, there is no need to generate a
complex replacement AST node when wrapping a source fragment; we can
just return a new string.

We could do anything inside the wrapping logic, but we have to be careful not
to break the surrounding code. For a CallExpression node we want to
actually execute the function, record its result and then return it to the
outside code. For example, here are the input and the instrumented code.

index.js
const sum = add(2, 3) //=> ?

instrumented.js
const values = []
const sum = (function () {
  const result = add(2, 3)
  values[0] = result
  return result
}())
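
The wrapping step can be sketched as a plain string transformation. This is a simplified illustration with assumed names (`wrap` taking an expression string and an index, and a `__values` array), not the tool's real wrapper, which handles more edge cases.

```javascript
// Hypothetical helper: returns source text that evaluates "expression",
// stores the result in __values[index] and yields the same result.
function wrap (expression, index) {
  return '(function () {\n' +
    '  const result = ' + expression + '\n' +
    '  __values[' + index + '] = result\n' +
    '  return result\n' +
    '}())'
}

// Build the instrumented version of: const sum = add(2, 3) //=> ?
const instrumented = 'const sum = ' + wrap('add(2, 3)', 0) + '\nreturn sum'

// Evaluate it with "__values" and "add" supplied as parameters.
const values = []
const add = function (a, b) { return a + b }
const sum = new Function('__values', 'add', instrumented)(values, add)
console.log(sum, values) // 5 [ 5 ]
```

One caveat of this naive form: a regular function wrapper changes what `this` and `arguments` refer to inside the wrapped expression, which is one reason a production wrapper needs to be more elaborate.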

You can see the actual instrumented code by passing the -i option to the
values program. It will look a lot more complex because it has to handle some
edge cases.

In action

Let us take the compose example again and see how useful comments could be.
Rewrite the example to split the composed functions to one per line for
clarity.

compose-example.js
const R = require('ramda')
R.compose(
  Math.abs,
  R.add(1),
  R.multiply(2)
)(-4)

In a real example, we would build the composed function separately and then
print the result, right?

compose-example.js
const R = require('ramda')
const fn = R.compose(
  Math.abs,
  R.add(1),
  R.multiply(2)
)
console.log(fn(-4))

Let us insert "magic" comments, and we can insert them inside the
composition! We can use different strings to mark the comments; I prefer
the short //> (or // > to be compatible with the standard js linter).

compose-example.js
const R = require('ramda')
const fn = R.compose(
  Math.abs, //>
  R.add(1), //>
  R.multiply(2) //>
)
console.log(fn(-4))

Run the tool to compute the values: values compose-example.js

compose-example.js
const R = require('ramda')
const fn = R.compose(
  Math.abs, //> 7
  R.add(1), //> -7
  R.multiply(2) //> -8
)
console.log(fn(-4))
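
One way such per-function values could be captured is to wrap each function so that every call stores its return value. The sketch below is an assumption about the general technique, not a description of the tool's internals; `record` is a hypothetical helper, and `compose`, `add` and `multiply` are tiny stand-ins for the Ramda functions so the snippet runs without dependencies.

```javascript
const values = []

// Hypothetical helper: wraps "fn" so each call stores its return value.
function record (fn, index) {
  return function () {
    const result = fn.apply(this, arguments)
    values[index] = result
    return result
  }
}

// Tiny stand-ins for the Ramda functions used in the example.
const compose = function () {
  const fns = Array.from(arguments)
  return function (x) {
    // apply the functions right to left, like R.compose
    return fns.reduceRight(function (acc, fn) { return fn(acc) }, x)
  }
}
const add = function (a) { return function (b) { return a + b } }
const multiply = function (a) { return function (b) { return a * b } }

const fn = compose(
  record(Math.abs, 0),
  record(add(1), 1),
  record(multiply(2), 2)
)
console.log(fn(-4)) // 7
console.log(values) // [ 7, -7, -8 ]
```

The recorded array lines up with the comments above: the last function in the list runs first and produces -8, and the first one runs last and produces the final 7.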

Great! We can even wrap the call inside console.log (going the extra mile for
a common use case). Just put //> after console.log(fn(-4)) to extract
the value of the first argument.

compose-example.js
const R = require('ramda')
const fn = R.compose(
  Math.abs, //> 7
  R.add(1), //> -7
  R.multiply(2) //> -8
)
console.log(fn(-4)) //> 7

Finally, we can enable live update and let the users explore how the
intermediate values change as we keep changing the parameters. Just run
the tool in watch mode and keep editing the file. See this in action
in the clip below

Future work

The current comment-value tool
solves my problems, but I hope to extend it with several features.

  • testing – the tool could run in "comparison" mode. If a value comment is
    empty, then a new value will be filled. If there is a value there already
    the tool will compare the computed and the current value. If they are
    different it will raise an error. This can be used to test code and
    intermediate values given some specific inputs.
  • online mode – testing lots of code examples in my presentation
    slides. Maybe if I target GitHub gists …
  • type signatures – we could record the run time type signatures of
    intermediate expressions, instead of values. This would explain the code
    and allow its refactoring
  • data coverage during unit tests – we could collect all different data
    items for a given variable during unit tests. This would be helpful to
    find out if there are missing tests. For example, the following testing code
    achieves 100% code coverage during unit tests.

function isEmail(email) {
  return /^[\w\.]+@\w+\.\w+$/.test(email)
}
it('allows valid email', () => {
  console.assert(isEmail('foo@gmail.com'))
})
it('allows email with dots', () => {
  console.assert(isEmail('foo.bar@gmail.com'))
})

Yet this is a perfect example of full statement coverage that guarantees
neither code robustness nor correctness. But what if we could
collect all values of the input argument email during unit tests?

function isEmail(email) {
  // email:
  return /^[\w\.]+@\w+\.\w+$/.test(email)
}
// all unit tests

We would get a list back, probably as a JSON file. In our case, the entry for
the variable email would list all emails we have passed to isEmail, no matter
how they arrived, maybe even from other tests!

data-coverage.json
{
  "email": ["foo@gmail.com", "foo.bar@gmail.com"]
}

This will quickly give you an idea of more email "types" that you should
test. For example, there were no emails with other characters, like dashes!
We would quickly notice (or could even detect automatically) missing test
data classes and edge cases.
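
Collecting such data could look roughly like the sketch below. Nothing here exists in the tool yet: `collect`, `seen` and `report` are hypothetical names, and the explicit `collect` call stands in for whatever statement the instrumentation would insert next to a value comment.

```javascript
// Maps a variable name to the set of distinct values seen for it.
const seen = new Map()

// Hypothetical recording call that instrumentation would insert.
function collect (name, value) {
  if (!seen.has(name)) seen.set(name, new Set())
  seen.get(name).add(value)
  return value
}

function isEmail (email) {
  collect('email', email) // stands in for a value comment on the argument
  return /^[\w.]+@\w+\.\w+$/.test(email)
}

// the two "unit tests" from above
console.assert(isEmail('foo@gmail.com'))
console.assert(isEmail('foo.bar@gmail.com'))

// Convert the sets to arrays so the report can be saved as JSON.
const report = {}
seen.forEach(function (set, name) {
  report[name] = Array.from(set)
})
console.log(JSON.stringify(report, null, 2))
```

Looking at the two collected addresses makes the gap obvious: an address such as foo-bar@gmail.com was never tried, and this regular expression would in fact reject it.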

QA Engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 999999999 beers. Orders a lizard. Orders -1 beers. Orders a sfdeljknesv.

— Bill Sempf (@sempf) September 23, 2014

  • code documentation – we should stop using @example inside JavaDoc
    block comments. They are hard to format, a pain to write and a chore to
    maintain. Instead we could have little executable snippets with
    the "comment-value" tool executing them and updating the expected values.

Related: xplain generates documentation examples from unit tests, which makes
sure the code examples are accurate and in sync with the code, but approaches
this problem from the opposite direction.
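
Returning to the "testing" idea from the list above, a comparison mode could be sketched as follows. This is only a guess at how such a check might work, since the feature is future work: `compare` is a hypothetical function, and it assumes the freshly computed values are available as an object keyed by variable name.

```javascript
// Matches "// name: value" comments; the value part may be empty.
const VALUE_COMMENT = /^\s*\/\/\s*([A-Za-z_$][\w$]*):\s*(.*)$/

// Hypothetical check: "computed" maps variable names to fresh values.
// Empty comments are skipped; filled ones must match or we throw.
function compare (source, computed) {
  source.split('\n').forEach(function (line, i) {
    const match = VALUE_COMMENT.exec(line)
    if (!match || match[2] === '') return
    const name = match[1]
    const expected = match[2]
    const actual = JSON.stringify(computed[name])
    if (actual !== expected) {
      throw new Error('line ' + (i + 1) + ': ' + name + ' is ' + actual +
        ', comment says ' + expected)
    }
  })
  return true
}

const source = [
  'const sum = add(2, 3)',
  '// sum: 12'
].join('\n')

console.log(compare(source, { sum: 12 })) // true
try {
  compare(source, { sum: 5 })
} catch (err) {
  console.log(err.message) // line 2: sum is 5, comment says 12
}
```

Comparing the serialized text keeps the sketch short, but it also means that formatting differences in the comment would count as mismatches; a real implementation would probably parse the comment value first.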
